Page MenuHomeFreeBSD

hwpmc_amd: fix amd_get_msr() MSR offset for newer counter bases
Needs ReviewPublic

Authored by paulo_nlink.com.br on Sun, Mar 1, 6:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Mar 2, 6:12 PM
Unknown Object (File)
Mon, Mar 2, 4:50 PM
Unknown Object (File)
Mon, Mar 2, 4:49 PM
Unknown Object (File)
Mon, Mar 2, 4:49 PM
Unknown Object (File)
Mon, Mar 2, 4:49 PM
Unknown Object (File)
Mon, Mar 2, 4:02 AM
Unknown Object (File)
Mon, Mar 2, 1:20 AM

Details

Summary

The previous code subtracted AMD_PMC_PERFCTR_0 (0xC0010004) from all
perfctr MSR addresses to compute a relative offset. This is incorrect
for counters using AMD_PMC_CORE_BASE (0xC0010200), AMD_PMC_L3_BASE
(0xC0010230), and AMD_PMC_DF_BASE (0xC0010240), producing wrong offsets.

Fix by promoting amd_core_npmcs, amd_l3_npmcs, and amd_df_npmcs to
static module-level variables and computing the correct flat RDPMC
index per AMD BKDG 24594 page 440:

ECX 0-5:   Core counters 0-5
ECX 6-9:   DF counters 0-3
ECX 10-15: L3 Cache counters 0-5
ECX 16-27: DF counters 4-15
ECX > 27:  Reserved, returns EINVAL

Verified on: AMD Ryzen 5600X (Family 19h, Zen 3), FreeBSD 16.0-CURRENT

  • 6 core counters (K8-0..K8-5) ENABLED
  • 6 L3 counters (K8-L3-0..K8-L3-5) ENABLED
  • 4 DF counters (K8-DF-0..K8-DF-3) ENABLED Total: 16 K8 PMCs correctly registered

Sponsored by: NLINK (https://nlink.com.br), Recife, Brazil

Test Plan

Loaded hwpmc.ko on AMD Ryzen 5600X (Family 19h, Zen 3), FreeBSD 16.0-CURRENT.
Both patches verified: pmccontrol -l shows correct counter state,
pmcstat -c 0 -s dram_channel_data_controller_0 returns valid counts.
No KASSERT failures, no kernel panics.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 71156
Build 68039: arc lint + arc unit

Event Timeline

If you want to fix your patch to the rdpmc problem we should land this and maybe wait for my other fix for the per-package change.

sys/dev/hwpmc/hwpmc_amd.c
661

Good catch this was an oversight, but your solution isn’t correct. RDPMC on AMD uses the counter index into including the core, df, l3 and umc counters with a remapping if you have too many DFs. See page 440 of https://docs.amd.com/v/u/en-US/24594_3.37

723

This bug has been there for some time, if you see my later patch I fix the flags to set per-package capability for these counters in https://cgit.freebsd.org/src/commit/sys/dev/hwpmc?id=00c0a1f0bf6c07e63384a389060dfc10924c0ed6.

I don’t think we should invent a new mechanism, when we already have a DOMWIDE capability flag. Unfortunately pmcstat query’s the flag from the class which doesn’t work when we have counters with different capabilities. Either it should read the flags returned by pmcallocate or it should honor the per-pmc flag in the JSON definitions. I had a patch to fix it that I’m going to submit.

This revision now requires changes to proceed.Sun, Mar 1, 9:34 PM
sys/dev/hwpmc/hwpmc_amd.c
658

This code predated my change and i realize it only worked for the first four counters, and broke on the remaining 2 core counters and DF/L3.

sys/dev/hwpmc/hwpmc_amd.c
723

Another reason to fix this the way other per-package counters are handled. Is it more generally solves the pmc flag problem, e.g. the PRECISE flag is only valid for a single core counter in Zen4/5.

Updated amd_get_msr() fix based on reviewer feedback.

Return ri directly as the flat RDPMC index instead of computing
per-subclass offsets. ri already encodes the correct flat index
since counters are registered sequentially in pmc_amd_initialize().

Withdrew the DF sharing patch deferring to reviewer's DOMWIDE fix.

Updated amd_get_msr() fix based on reviewer feedback from ali_mashtizadeh.

Return ri directly as the flat RDPMC index instead of computing
per-subclass offsets.

Withdrew the DF sharing patch - deferring to reviewer's DOMWIDE fix.

Updated diff: simplified fix to return ri directly as flat RDPMC index.

I pulled out my capability change, again this is reusing the existing code used by a few ARM64 platforms.

https://github.com/freebsd/freebsd-src/pull/2058

sys/dev/hwpmc/hwpmc_amd.c
660–689

Are you reading a different document? Ideally this should work for all supported counters and return an error if it’s not part of the mapped view.

0–3 Core performance counters 0–3 All processors
4–5 Core performance counters 4–5 Fn8000_0001_ECX[PerfCtrExtCore] = 1
6–9 Northbridge performance counters 0–3 Fn8000_0001_ECX[PerfCtrExtNB] = 1
10–15 L3 Cache performance counters 0–5 Fn8000_0001_ECX[PerfCtrExtLLC] = 1
16–27 Northbridge performance counters
4–15
Fn8000_0022_EBX[NumPerfCtrNB] >
Northbridge performance counter number
(4–15).

Thank you for the clarification. I now understand the RDPMC mapping
from AMD BKDG page 440, including the non-contiguous DF range
(ECX 6-9 and ECX 16-27).

Would you agree with promoting amd_core_npmcs and amd_l3_npmcs to
static module-level variables to implement the correct mapping in
amd_get_msr()?

sys/dev/hwpmc/hwpmc_amd.c
661

Understood! Returning ri directly is the correct fix since it already
represents the flat RDPMC index across all counter types. Will update.

723

Thank you for pointing to your DOMWIDE fix. I will abandon this part
and defer to your approach.

Thank you for the clarification. I now understand the RDPMC mapping
from AMD BKDG page 440, including the non-contiguous DF range
(ECX 6-9 and ECX 16-27).

Would you agree with promoting amd_core_npmcs and amd_l3_npmcs to
static module-level variables to implement the correct mapping in
amd_get_msr()?

Yes, lets promote all three counters amd_core/l3/df_count.

Also do you have a third party tool that tests rdpmc from the app? I recall there’s a library that did this was curious what APIs it’s currently using.

Thank you for pointing to the CMN-600 approach. I've looked at
hwpmc_cmn600.c and I think it will be a good approach for new
hwpmc_amd.c code.

I will wait for your patch before trying something. Still waiting
for your guidance on the correct RDPMC mapping for amd_get_msr().

Updated fix based on Ali Mashtizadeh's review feedback.

Promoted amd_core_npmcs, amd_l3_npmcs, amd_df_npmcs to static
module-level variables and implemented correct flat RDPMC index
mapping per AMD BKDG 24594 page 440.

adrian added inline comments.
sys/dev/hwpmc/hwpmc_amd.c
667

Looking at the comments in the file and generally having tinkered with hwpmc in the past, maybe think about adding a reference to the documentation (document name, chapter/section) explaining where this comes from?

That way people have some idea of what to search for when verifying / working on the code?

Thanks!

Added documentation references per Adrian Chadd's review feedback.
Added AMD BKDG 24594 rev 3.37 page 440 and AMD PPR 57930-A0
references to amd_get_msr() comments.