Add a thin virtual layer (sys/dev/hwpmc/hwpmc_pmu.{h,c}) on top of the
existing hwpmc row programming model, plus the multiplex rotation
back-end that the v0 series left as stubs.
Types (exposed as typedefs per review feedback):
  pmu_event_t            per-pmc state used to defer HW row binding
                         until commit/attach: state, leader flag,
                         pre-computed scheduling constraint,
                         time-enabled / time-running counters used
                         by the multiplex layer.
  pmu_group_t            leader + sibling list with all-or-none
                         commit semantics. Carries multiplex
                         bookkeeping (pg_running, pg_defer_ok,
                         pg_assigned, pg_used_rows_mask).
  pmc_sched_constraint_t allowed-row bitmask + popcount-as-weight
                         + FIXED/EXCLUSIVE/SHARED flags. Lower
                         weight means more constrained, which the
                         assigner uses for most-constrained-first
                         greedy placement.
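As a rough illustration of most-constrained-first placement, the
following userland sketch sorts nothing explicitly but always picks the
unplaced event with the smallest popcount weight next. It is not the
code in hwpmc_assign.c: struct sketch_constraint, sketch_popcount and
sketch_assign are hypothetical names, the FIXED/EXCLUSIVE/SHARED flags
are ignored, and a -1 return stands in for ENOSPC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	SKETCH_MAXROWS	32

/* Hypothetical stand-in for pmc_sched_constraint_t (fields assumed). */
struct sketch_constraint {
	uint32_t allowed_rows;	/* bit i set: event may use row i */
	int	 weight;	/* popcount of allowed_rows */
};

static int
sketch_popcount(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		x &= x - 1;	/* clear the lowest set bit */
		n++;
	}
	return (n);
}

/*
 * Place n events, lowest weight (most constrained) first, each on the
 * lowest free row it is allowed to use.  Returns 0 and fills rows[] on
 * success, -1 if some event has no free allowed row left.
 */
static int
sketch_assign(struct sketch_constraint *c, int n, int *rows)
{
	bool placed[SKETCH_MAXROWS] = { false };
	uint32_t used = 0, avail;
	int best, i, k, row;

	assert(n >= 0 && n <= SKETCH_MAXROWS);
	for (i = 0; i < n; i++)
		c[i].weight = sketch_popcount(c[i].allowed_rows);

	for (k = 0; k < n; k++) {
		/* Pick the unplaced event with the smallest weight. */
		best = -1;
		for (i = 0; i < n; i++)
			if (!placed[i] &&
			    (best < 0 || c[i].weight < c[best].weight))
				best = i;
		avail = c[best].allowed_rows & ~used;
		if (avail == 0)
			return (-1);
		for (row = 0; (avail & (UINT32_C(1) << row)) == 0; row++)
			;
		used |= UINT32_C(1) << row;
		rows[best] = row;
		placed[best] = true;
	}
	return (0);
}
```

With one event allowed on rows 0-3 and another allowed only on row 0,
in-order placement would hand row 0 to the first event and fail the
second; ordering by weight places the single-row event first. Greedy
ordering of this kind is a heuristic and is not guaranteed to find a
placement whenever one exists.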
Cross-architecture safety: the grouping/multiplex scheduler is only
implemented where a backend can describe per-event scheduling
constraints (x86/AMD today), so hwpmc_pmu.c and hwpmc_assign.c are
built on amd64/i386 only. hwpmc_pmu.h therefore gates the real entry
points behind HWPMC_PMU_GROUPS (defined for amd64 / i386) and
provides static-inline no-op stubs (pmu_group_on_allocate &c. return
EOPNOTSUPP, the csw/release hooks do nothing) for every other
architecture. The architecture-independent hwpmc_mod.c thus links
unchanged on arm64/arm/powerpc; behaviour there is identical to a
pre-grouping hwpmc. Extending support to another backend is a matter
of adding its constraint provider and defining HWPMC_PMU_GROUPS for it.
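The header gating described above might look roughly like the fragment
below. This is a sketch only: the argument lists are assumptions (the
real prototypes live in hwpmc_pmu.h), and only two of the hooks are
shown.

```c
#include <errno.h>
#include <stddef.h>

struct pmc;	/* opaque here; defined by the hwpmc headers */

#ifdef HWPMC_PMU_GROUPS
/* Real entry points, built from hwpmc_pmu.c on amd64/i386. */
int	pmu_group_on_allocate(struct pmc *pm);
void	pmu_group_on_release(struct pmc *pm);
#else
/* Every other architecture: no-op stubs, no extra objects to link. */
static inline int
pmu_group_on_allocate(struct pmc *pm)
{
	(void)pm;
	return (EOPNOTSUPP);
}

static inline void
pmu_group_on_release(struct pmc *pm)
{
	(void)pm;
}
#endif
```

Because the stubs are static inline, callers in hwpmc_mod.c need no
#ifdef of their own and no new symbols are required at link time.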
Group lifecycle: pmu_group_create / pmu_group_add / pmu_group_commit /
pmu_group_release / pmu_group_lookup. Per-PMC hooks:
pmu_group_on_allocate allocates the pmu_event; pmu_group_on_release
removes it from the group TAILQ before freeing it.
Rotation runtime:
  pmu_pp_schedule_in   atomic per-pp placement of a whole group:
                       pmu_assign_group, attach every sibling, flip
                       pm_state from STOPPED back to RUNNING
                       (rotation-evicted PMCs were otherwise stuck
                       STOPPED forever).
  pmu_pp_schedule_out  mirror image: detach, free rows, flip
                       pm_state to STOPPED, mark pg_assigned false.
  pmu_pp_kick_rotate   wake the per-pp rotation kthread.
  pmu_pp_rotate_thread one kthread per pmc_process; sleeps on
                       pp_pmu_rot_thread, ticks every
                       rotation_period_us microseconds.
  pmu_pp_rotate_one    cursor-based round-robin: evict every
                       currently scheduled group, then walk
                       pp_pmu_groups starting from
                       pp_pmu_rot_cursor and schedule_in until the
                       first ENOSPC, which pins the next tick's
                       cursor. This avoids the FIFO+greedy
                       starvation pattern (small groups repeatedly
                       winning the leftover slots).
  pmu_pp_release_all   drain the rotation kthread and sever every
                       group still hooked off pp before pp is
                       freed. Wired into pmc_process_exit by patch
                       0004.
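The cursor behaviour of pmu_pp_rotate_one can be modelled in userland
as below. This is a simplified sketch, not the kernel code: struct
sketch_pp and sketch_rotate_one are hypothetical, a plain count of free
rows replaces pmu_assign_group, and every group is assumed to fit on an
otherwise empty PMU.

```c
#include <assert.h>
#include <stdbool.h>

#define	SKETCH_NGROUPS	3

/* Hypothetical per-process state; field names are assumptions. */
struct sketch_pp {
	int	need[SKETCH_NGROUPS];		/* rows each group needs */
	bool	assigned[SKETCH_NGROUPS];	/* scheduled this tick? */
	int	nrows;				/* rows the PMU offers */
	int	cursor;				/* where the next tick starts */
};

/* One rotation tick: evict everything, refill from the cursor. */
static void
sketch_rotate_one(struct sketch_pp *pp)
{
	int free_rows = pp->nrows;
	int g, k;

	assert(pp->cursor >= 0 && pp->cursor < SKETCH_NGROUPS);

	/* Evict every currently scheduled group. */
	for (g = 0; g < SKETCH_NGROUPS; g++)
		pp->assigned[g] = false;

	/* Schedule in order; stop at the first group that does not fit. */
	for (k = 0; k < SKETCH_NGROUPS; k++) {
		g = (pp->cursor + k) % SKETCH_NGROUPS;
		if (pp->need[g] > free_rows) {
			pp->cursor = g;	/* this group goes first next tick */
			return;
		}
		free_rows -= pp->need[g];
		pp->assigned[g] = true;
	}
}
```

With groups needing 3, 3 and 1 rows on a 4-row PMU, the first tick
schedules only group 0 and pins the cursor on group 1, even though the
1-row group would have fit in the leftover slot. The second tick then
starts at group 1, so the large group is not starved by smaller ones
backfilling ahead of it.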
The PMC_F_GROUP_MUX commit fallback (over-subscription accepted instead
of returning ENOSPC) is wired up so userland can already opt in via
pmcstat -b.
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>