cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed
ClosedPublic
Actions

Authored by glebius on Sep 9 2025, 8:38 PM.

Details

Reviewers

chs
stas
kib
markj

Commits

rG797e92944d11: cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed
rG6683dcf61b3d: cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed

Summary

The CPUCTL_UPDATE is supposed to be applied only to the CPU the ioctl(2)
was performed on. This is true for Intel CPUs, but for AMD the SMP
rendezvouz of amd_ucode_wrmsr() effectively executed it on all CPUs.
Also, the update failure was not reported.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

glebius created this revision.Sep 9 2025, 8:38 PM

Herald added a subscriber: imp. · View Herald TranscriptSep 9 2025, 8:38 PM

glebius requested review of this revision.Sep 9 2025, 8:38 PM

Harbormaster completed remote builds in B66935: Diff 161836.Sep 9 2025, 8:38 PM

@avg said that the old behaviour was intentional: https://reviews.freebsd.org/D15560#328553

I guess it's not a problem on sufficiently new CPUs? That said, why is it useful to update ucode on cores one by one?

I somewhat doubt that the arguments by @avg were true. If indeed the ucode update could affect the infinity fabric protocols, it is up to the ucode itself to handle it. If you do software IPI to update all cores, it is still not synced in the strict sense, the update only happen in much rapid succession.

Both Intel and AMD describe how to update ucode on single CPU (or might be on single CPU out of HT pair). Since this is how it is documented, I do not see a reason to not expose the per-cpu update ability to userspace.

sys/dev/cpuctl/cpuctl.c
441	Broadcasting the update when you target a single CPU is excessive. Instead if checking cpuid in the handler, just specify the single-CPU map with smp_rendezvous_cpus().

In D52466#1198185, @kib wrote:

I somewhat doubt that the arguments by @avg were true. If indeed the ucode update could affect the infinity fabric protocols, it is up to the ucode itself to handle it. If you do software IPI to update all cores, it is still not synced in the strict sense, the update only happen in much rapid succession.

Well, the activity of the cores is quite limited during the update when using smp_rendezvous.

Both Intel and AMD describe how to update ucode on single CPU (or might be on single CPU out of HT pair). Since this is how it is documented, I do not see a reason to not expose the per-cpu update ability to userspace.

Ok. I'm just noting that this might not work on old CPUs (back when AMD did not document the ucode update procedures IIRC). I have no objection.

sys/dev/cpuctl/cpuctl.c
441	update_intel() does the same thing BTW, both should be fixed.

In D52466#1198185, @kib wrote:

I somewhat doubt that the arguments by @avg were true.

Maybe I misdiagnosed the problem back then, on the hardware that I had.
But I certainly did some investigation which led me towards trying to do the ucode update as closely in time as possible.
The system in question was from 10h family (Phenom II, I think).
I still have it somewhere, so I may try to test this change on that system if there is any interest in such a test and I find time.

Andriy didn't say there that the old behavior was intentional, rather the opposite - that old behavior causes a problems when updating two SMT cores simultaneously. My patch should also address the problems Andriy had with that old CPU.

I think smp_rendezvous() is still needed when we update a single core. We want any SMT sibling of it to spin executing only in non-CISC instructions. We potentially can create a map of CPUs and rendezvous only the SMT siblings of CPU being updated, instead of all CPUs.

In D52466#1198337, @glebius wrote:

Andriy didn't say there that the old behavior was intentional, rather the opposite - that old behavior causes a problems when updating two SMT cores simultaneously. My patch should also address the problems Andriy had with that old CPU.

That's not how I read it. "I changed the update code to what it is now (synchronized update), because on my Phenom II based systems the old update mode, core after core, driven by userland, caused hard hangs."

In particular, serialized updates caused hard hangs.

I think smp_rendezvous() is still needed when we update a single core. We want any SMT sibling of it to spin executing only in non-CISC instructions. We potentially can create a map of CPUs and rendezvous only the SMT siblings of CPU being updated, instead of all CPUs.

I think some comment needs to explain this, ideally with some reference to AMD documentation. In https://docs.amd.com/v/u/en-US/57930-A0-PUB_3.00 I see the microcode file header has a bit, "load control" which determines whether sibling threads should all be updated. In the ucode.c loader we currently always skip SMT siblings, see the check in ucode_load_ap().

Anyway, I do not object to the change.

Ok, guys, can you please approve this change, then? The rationale is that most recent microcode that is published in Linux and already committed to our ports fails to load on some CPUs and the tool right now doesn't report that. Reports success instead. The fact that on AMD the tool updates all cores, unlike what documentation says and unlike what the same command does on Intel, is also a bug.

markj accepted this revision.Sep 11 2025, 3:48 PM

This revision is now accepted and ready to land.Sep 11 2025, 3:48 PM

Closed by commit rG6683dcf61b3d: cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed (authored by glebius). · Explain WhySep 18 2025, 3:13 PM

This revision was automatically updated to reflect the committed changes.

glebius added a commit: rG6683dcf61b3d: cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed.

glebius added a commit: rG797e92944d11: cpuctl: run amd_ucode_wrmsr only on one CPU and report if it failed.Sep 22 2025, 4:12 PM