
hwpmc: add PMU assigner and AMD Zen constraint provider
Needs Review (Public)

Authored by raghavendra.kt_amd.com on Thu, Jun 18, 7:06 AM.

Details

Summary

Two pieces:

(1) sys/dev/hwpmc/hwpmc_assign.c

Most-constrained-first greedy assigner.  pmu_count_core_hw_slots
enumerates global rows that are class-compatible with the leader and
pass amd_can_assign_pmc; pmu_assign_group sorts events by weight and
binds them all-or-none.  Critical detail: the assigner operates in
per-class adjri space for pe_cons.pc_allowed_rows / *used_mask /
pc_fixed_row, but the framework numbers PMC rows globally.  The code
converts adjri <-> ri via hwpmc_ri_to_classdep() and pcd->pcd_ri so
that amd_can_assign_pmc and pcd_allocate_pmc are always called with
the class-relative index they expect.  Without this conversion AMD
trips its own KASSERT (illegal row index >= amd_npmcs) on Zen, where
K8 rows start at global ri=17.
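
The ordering and the index conversion described above can be sketched in
a few lines of standalone C.  This is a sketch under stated assumptions,
not the code in hwpmc_assign.c: every sketch_* name is hypothetical, the
weight is assumed to be the number of allowed rows, and the class size of
6 in the usage note is illustrative.  Only the starting global row of 17
comes from the summary.

```c
#include <stdint.h>

#define SKETCH_MAX_EVENTS 32

/* Hypothetical stand-ins; not the structures used by the patch. */
struct sketch_class {
	int first_ri;		/* global index of the class's first row */
	int nrows;		/* number of rows in the class (1..32) */
};

struct sketch_event {
	uint32_t allowed;	/* bit n set: class-relative row n is usable */
	int adjri;		/* out: class-relative row */
	int ri;			/* out: global row */
};

static int
sketch_popcount(uint32_t v)
{
	int n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return (n);
}

/*
 * Bind every event to a distinct row or bind none of them.
 * Returns 0 on success, -1 if the group does not fit.
 */
static int
sketch_assign_group(const struct sketch_class *pc, struct sketch_event *ev,
    int n)
{
	int order[SKETCH_MAX_EVENTS], picked[SKETCH_MAX_EVENTS];
	uint32_t class_mask, free_rows, used = 0;
	int adjri, i, j, w;

	if (n < 0 || n > SKETCH_MAX_EVENTS || pc->nrows < 1 || pc->nrows > 32)
		return (-1);
	class_mask = pc->nrows == 32 ? UINT32_MAX : (1u << pc->nrows) - 1;

	/* Most constrained first: insertion-sort indices by allowed-row count. */
	for (i = 0; i < n; i++) {
		w = sketch_popcount(ev[i].allowed & class_mask);
		for (j = i; j > 0 && sketch_popcount(
		    ev[order[j - 1]].allowed & class_mask) > w; j--)
			order[j] = order[j - 1];
		order[j] = i;
	}

	/* Pick rows in class-relative (adjri) space; write nothing yet. */
	for (i = 0; i < n; i++) {
		free_rows = ev[order[i]].allowed & class_mask & ~used;
		if (free_rows == 0)
			return (-1);	/* all-or-none: caller's data untouched */
		for (adjri = 0; (free_rows & (1u << adjri)) == 0; adjri++)
			;
		picked[order[i]] = adjri;
		used |= 1u << adjri;
	}

	/* Commit, converting adjri to the global row index. */
	for (i = 0; i < n; i++) {
		ev[i].adjri = picked[i];
		ev[i].ri = pc->first_ri + picked[i];
	}
	return (0);
}
```

With first_ri = 17 and masks 0x3F, 0x01, 0x03, the event restricted to
row 0 is placed first and the unrestricted event ends up on adjri 2
(global ri 19).  Placing the events in their original order would hand
row 0 to the unrestricted event and fail the group.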

(2) sys/dev/hwpmc/hwpmc_amd.c

Add amd_can_assign_pmc() (factored out of amd_allocate_pmc) and
amd_get_sched_constraint(), which emits a pmc_sched_constraint_t
covering every Zen sub-class (CORE / L3_CACHE / DATA_FABRIC).
pc_allowed_rows is built in the per-class adjri namespace as
documented in hwpmc_pmu.h.  The two new functions are prototyped in
hwpmc_pmu.h (alongside the rest of the PMU layer interface) so
hwpmc_amd.h needs no change.  Other backends (Intel/ARM) continue to
return EOPNOTSUPP so legacy non-grouped allocation is unaffected.
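
As a rough illustration of a constraint provider of this shape, the
sketch below builds an allowed-rows mask in class-relative (adjri) space
for each sub-class and returns EOPNOTSUPP for any other backend.  It is
a sketch under stated assumptions: the sketch_* types and the row ranges
(6 core, 6 L3, 4 data fabric, laid out back to back) are hypothetical
and are not taken from hwpmc_pmu.h or hwpmc_amd.c.

```c
#include <errno.h>
#include <stdint.h>

enum sketch_backend { SKETCH_BACKEND_AMD, SKETCH_BACKEND_OTHER };

enum sketch_subclass {
	SKETCH_CORE,
	SKETCH_L3_CACHE,
	SKETCH_DATA_FABRIC,
	SKETCH_NSUBCLASS
};

/* Hypothetical stand-in for a scheduling constraint. */
struct sketch_constraint {
	uint32_t allowed_rows;	/* class-relative (adjri) bitmask */
	int fixed_row;		/* class-relative row, or -1 if not pinned */
};

/* Assumed layout of class-relative rows; values are illustrative only. */
static const struct {
	int base;
	int count;
} sketch_ranges[SKETCH_NSUBCLASS] = {
	[SKETCH_CORE]        = { 0, 6 },
	[SKETCH_L3_CACHE]    = { 6, 6 },
	[SKETCH_DATA_FABRIC] = { 12, 4 },
};

static int
sketch_get_constraint(enum sketch_backend be, enum sketch_subclass sc,
    struct sketch_constraint *out)
{
	int i;

	/* Backends without a provider keep the legacy allocation path. */
	if (be != SKETCH_BACKEND_AMD)
		return (EOPNOTSUPP);
	if ((unsigned int)sc >= SKETCH_NSUBCLASS)
		return (EINVAL);

	out->allowed_rows = 0;
	for (i = 0; i < sketch_ranges[sc].count; i++)
		out->allowed_rows |= 1u << (sketch_ranges[sc].base + i);
	out->fixed_row = -1;
	return (0);
}
```

Keeping the mask class-relative means a consumer can test bit n against
row n of the class directly, and only needs the class's starting global
row when it talks to the framework.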

Sponsored by: AMD
Signed-off-by: Raghavendra K T <raghavendra.kt@amd.com>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 73960
Build 70843: arc lint + arc unit

Event Timeline

In the existing code, we have to allocate counters into specific indices in a particular order to check that they meet our constraints. There are two options:

  1. Keep this approach and just save and restore the counters to specific indices. This is nice because it checks that the group of counters can be satisfied, but it is not a great approach for multiplexing.
  2. Expose the constraint checking to userspace so we can validate the counters without allocating them.
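
To make the second option concrete, here is a minimal sketch of a check that answers "can this group be satisfied?" without touching any counter state. It is not the algorithm from this patch: it swaps in a small exhaustive search over per-event allowed-row bitmasks, and the sketch_* names and the bitmask representation are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if events i..n-1 can each take a distinct row that is not
 * already in `used`. Tries every candidate row and backtracks.
 */
static bool
sketch_fit(const uint32_t *allowed, int n, int i, uint32_t used)
{
	uint32_t cand, row;

	if (i == n)
		return (true);
	cand = allowed[i] & ~used;
	while (cand != 0) {
		row = cand & (~cand + 1);	/* lowest set bit */
		if (sketch_fit(allowed, n, i + 1, used | row))
			return (true);
		cand &= ~row;
	}
	return (false);
}

/* Side-effect-free group validation: nothing is allocated or modified. */
static bool
sketch_group_is_schedulable(const uint32_t *allowed, int n)
{
	return (sketch_fit(allowed, n, 0, 0));
}
```

For masks {0x03, 0x01} the first attempt puts event 0 on row 0, which leaves nothing for event 1; the search then backs up, moves event 0 to row 1, and reports the group as schedulable. The search is exponential in the worst case, which is tolerable only because a group holds at most a handful of events.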

See https://reviews.freebsd.org/D57637, pmcstat.c:1171.

sys/dev/hwpmc/hwpmc_amd.c:517

Leave in the comments, here and below, noting that everything below this line is older support. Hopefully @mhorne can land his diff to just remove this code.