Details

Reviewers

emaste
jhb
sbruno
kib
imp

Commits

rS337715: Implement kernel support for early loading of Intel microcode updates.

Summary

The feature allows updates to be loaded as one of the first steps of CPU
initialization, in particular before CPU feature detection is performed
on the BSP. With this change we also automatically reload a selected
microcode update on CPUs following an ACPI resume.

The idea is to preload the update using standard loader(8)
functionality. The update file may consist of a single update (Intel
updates are per-<family,model,stepping> tuple), or of multiple updates
concatenated together. In the latter case, we use the algorithms
described in section 9.11 of the SDM to select the correct update for
the host CPU. Selection is performed on the BSP, followed by
application of the update. Once APs are started, ucode_load_ap()
references the update selected by the BSP. Somewhat later during boot,
ucode_release() allocates some kernel memory, copies the selected update
there, and frees memory backing the preloaded update file. This ensures
that we can distribute and load all microcode updates in a single file
without wasting much RAM.

Upon resume from ACPI suspend, each CPU invokes ucode_reload() to
re-apply the selected update, if any.

I did not implement AMD support in this revision since AMD updates are
not released nearly as frequently and are typically released by BIOS
vendors, and the AMD microcode update facilities are mostly
undocumented. In particular, none of my AMD laptops can be updated
using the microcode updates provided with devcpu-data, and I have been
unable to find testers.

Test Plan

I did some testing on a Haswell Xeon, and on a laptop with working
suspend/resume. I tested both i386 and amd64 on the laptop, but
do not own any 32-bit Intel CPUs. I plan to send a CFT.

To test the patch, we only need to tell the loader to load a file of type
"cpu_microcode". For instance, in loader.conf:

ucode_load="YES"
ucode_name="/boot/firmware/06-3f-02.6f"
ucode_type="cpu_microcode"

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision.Jul 20 2018, 8:45 PM

Harbormaster completed remote builds in B18216: Diff 45629.Jul 20 2018, 8:45 PM

Fix capitalization.

Harbormaster completed remote builds in B18217: Diff 45630.Jul 20 2018, 8:48 PM

markj edited the test plan for this revision. (Show Details)Jul 20 2018, 8:55 PM

markj added reviewers: emaste, jhb, sbruno, kib, imp.

emaste added a subscriber: bwidawsk.Jul 20 2018, 10:00 PM

emaste added a subscriber: scottph.

Don't log an error when no update is preloaded.

Harbormaster completed remote builds in B18239: Diff 45663.Jul 21 2018, 6:11 PM

markj added inline comments.Jul 24 2018, 3:34 PM

sys/i386/i386/machdep.c
2344 ↗	(On Diff #45663)	This isn't early enough on i386 - we call identify_cpu() from locore there.
sys/x86/x86/ucode.c
253 ↗	(On Diff #45663)	I will add a chicken switch for this.

Add a tunable to disable reclamation of memory and KVA used for preloaded microcode updates.
Call identify_cpu() a second time after loading microcode on the BSP on i386. Make the function callable from C.
Allocate physical memory for the selected microcode update and copy the update there before loading it. This means that we don't need to perform a copy in ucode_release(), and that we don't have to rely the loader to provide 16-byte alignment of the update file.

Harbormaster completed remote builds in B18381: Diff 45926.Jul 27 2018, 8:15 PM

markj mentioned this in D16354: Force 16 byte alignment of preloaded files on i386 and amd64..Jul 30 2018, 5:55 PM

cem added a subscriber: cem.Aug 2 2018, 1:36 AM

Suppose that we have some microcode blob loaded at boot time, and then a newer update is loaded with cpucontrol(8). Which blob would be tried on resume ?

sys/x86/acpica/acpi_wakeup.c
328 ↗	(On Diff #45926)	Did you considered applyin the ucode update in wakeup code. E.g. before we ever enter the long mode.
sys/x86/x86/ucode.c
84 ↗	(On Diff #45926)	I suggest to extract the hardware format definitions into a reusable header.
288 ↗	(On Diff #45926)	Why do you exclude the ht siblings ? I believe it is at least innocent to apply the update on each pair of threads. I suspect that it might be more useful to not exclude it, for some future bugs. Suppose that the initial update at the boot time failed. My reading of your code is that on resume, the same update would be tried. If this is true, wouldn't it be better to not retry the failing update on the resume ?

In D16370#351629, @kib wrote:

Suppose that we have some microcode blob loaded at boot time, and then a newer update is loaded with cpucontrol(8). Which blob would be tried on resume ?

We'd re-apply the old blob. I'm not sure if this is a bug or not - if no blob is loaded at boot time, then upon resume the new update will be lost anyway.

sys/x86/acpica/acpi_wakeup.c
328 ↗	(On Diff #45926)	I couldn't convince myself it was necessary, and it would require special code in ucode.c. The initial update is applied on the BSP only after we've entered long mode.
sys/x86/x86/ucode.c
84 ↗	(On Diff #45926)	Will do.
288 ↗	(On Diff #45926)	In section 9.11.6.3 of the Intel SDM it is stated that during MP initialization, updates to the same physical core must be serialized. Of course, we could use a mutex instead of the cpu_hyperthread check; I don't have a preference here. Yes, I think you're right. We should check the return value of ucode_intel_load() and set ucode_data conditionally.

In D16370#351674, @markj wrote:

We'd re-apply the old blob. I'm not sure if this is a bug or not - if no blob is loaded at boot time, then upon resume the new update will be lost anyway.

I think with this change we'd consider the userland cpucontrol method deprecated.

sys/x86/x86/ucode.c
288 ↗	(On Diff #45926)	Perhaps @bwidawsk or @scott.d.phillips_intel.com can comment on which approach is preferable.

In D16370#351679, @emaste wrote:

In D16370#351674, @markj wrote:

We'd re-apply the old blob. I'm not sure if this is a bug or not - if no blob is loaded at boot time, then upon resume the new update will be lost anyway.

I think with this change we'd consider the userland cpucontrol method deprecated.

I disagree. There is a value in both approaches. cpucontrol is useful to check a new blob, also it is useful when you do not want to apply specific microcode patch always, but need it sometime. E.g. HSW updates removed TRM. Or you might want to apply some costly workaround right now, but not persistently.

My point is that after cpucontrol loaded new blob, user needs to re-evaluate cpu_features, and this should persist. So that update better be honored by the suspend/resume code, otherwise what is the point of the feature ? It seems to be quite simple, cpucontrol update should replace the ucode_data.

In D16370#351703, @kib wrote:

In D16370#351679, @emaste wrote:

In D16370#351674, @markj wrote:

We'd re-apply the old blob. I'm not sure if this is a bug or not - if no blob is loaded at boot time, then upon resume the new update will be lost anyway.

I think with this change we'd consider the userland cpucontrol method deprecated.

I disagree. There is a value in both approaches. cpucontrol is useful to check a new blob, also it is useful when you do not want to apply specific microcode patch always, but need it sometime. E.g. HSW updates removed TRM. Or you might want to apply some costly workaround right now, but not persistently.

My point is that after cpucontrol loaded new blob, user needs to re-evaluate cpu_features, and this should persist. So that update better be honored by the suspend/resume code, otherwise what is the point of the feature ? It seems to be quite simple, cpucontrol update should replace the ucode_data.

Indeed, it's pretty straightforward. I will just add a function to replace ucode_data.

In D16370#351679, @emaste wrote:

I think with this change we'd consider the userland cpucontrol method deprecated.

No, having both methods still makes a lot of sense. Earlyboot ucode is necessary to unbreak CPUs enough to boot. But cpucontrol allows ucode updates at runtime. If a CPU errata fix or security mitigation can be applied at runtime instead of with a reboot, everyone is happier.

Additionally, the two sources have potentially different distribution channels with different latency. Admins may be able to fetch new ucode directly from a vendor faster than we can cut a new release or EN. This doesn't help them if the new ucode is needed to fix boot, but the vast majority of ucode patches seem to be ok to apply at runtime (we've gotten away with only runtime patching for this long...).

In D16370#351721, @cem wrote:

In D16370#351679, @emaste wrote:

I think with this change we'd consider the userland cpucontrol method deprecated.

No, having both methods still makes a lot of sense. Earlyboot ucode is necessary to unbreak CPUs enough to boot. But cpucontrol allows ucode updates at runtime. If a CPU errata fix or security mitigation can be applied at runtime instead of with a reboot, everyone is happier.

Well, runtime loading still has disadvantages. It is better than nothing for mission-critical servers that cannot tolerate downtime.

Additionally, the two sources have potentially different distribution channels with different latency.

How's that? The devcpu-data port supplies updates in a format that can be consumed by either method.

In D16370#351722, @markj wrote:

In D16370#351721, @cem wrote:

Well, runtime loading still has disadvantages. It is better than nothing for mission-critical servers that cannot tolerate downtime.

Sure.

Additionally, the two sources have potentially different distribution channels with different latency.

How's that? The devcpu-data port supplies updates in a format that can be consumed by either method.

Businesses may have early NDA access to microcode that the port does not have access to or cannot distribute.

Businesses may have early NDA access to microcode that the port does not have access to or cannot distribute.

But they could apply that microcode with either method.

Perhaps deprecated was the wrong word and I should say ucode-from-loader is the preferred method.

In D16370#351744, @emaste wrote:

Businesses may have early NDA access to microcode that the port does not have access to or cannot distribute.

But they could apply that microcode with either method.

Perhaps deprecated was the wrong word and I should say ucode-from-loader is the preferred method.

I think also "limited". To support cpu_features changing at runtime, various other parts of the kernel also have to start using dynamic checks. Most users of cpu_features do so during boot and then don't check again. The current spectre stuff only does so explicitly. Loading the blob from the loader (which is not persistent since you can just remove the package or edit loader.conf to disable it) is more future proof in that it works with places that only check feature flags during boot.

In D16370#351744, @emaste wrote:

Businesses may have early NDA access to microcode that the port does not have access to or cannot distribute.

But they could apply that microcode with either method.

Don't they have to somehow process it into something loader can load first?

Either way, seems like there are good caveats to add to cpucontrol.8 if they are not present already.

Updates based on feedback:

Add a mechanism for cpuctl(4) to replace the existing update.
Move ucode header struct definitions to ucode.h.
Only set ucode_data during boot if we successfully loaded microcode on the BSP.

Harbormaster completed remote builds in B18534: Diff 46211.Aug 2 2018, 7:40 PM

In D16370#351801, @cem wrote:

Sorry, I missed this.

In D16370#351744, @emaste wrote:

Businesses may have early NDA access to microcode that the port does not have access to or cannot distribute.

But they could apply that microcode with either method.

Don't they have to somehow process it into something loader can load first?

Not as far as I know. The updates are distributed as blobs following a format defined in the SDM, and both cpucontrol(8) and the kernel can load them.

Either way, seems like there are good caveats to add to cpucontrol.8 if they are not present already.

Yeah. I'm planning to do some man page updates as a separate change.

Were there any other comments or concerns regarding this change? There is still some time left to make substantial changes if needed, but not very much.

kib accepted this revision.Aug 10 2018, 3:07 PM

This revision is now accepted and ready to land.Aug 10 2018, 3:07 PM

Closed by commit rS337715: Implement kernel support for early loading of Intel microcode updates. (authored by markj). · Explain WhyAug 13 2018, 5:13 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: rpokala. · View Herald TranscriptAug 13 2018, 5:13 PM

markj added inline comments.Aug 13 2018, 8:14 PM

head/sys/x86/x86/ucode.c
344	This will break once memcpy() is an ifunc.

kib added inline comments.Aug 13 2018, 8:56 PM

head/sys/x86/x86/ucode.c
344	So it would become an explicit byte copy loop, perhaps with volatile qualifiers. link_elf_ireloc() already did that for bzero().

Implement early microcode loading for Intel i386 and amd64 platforms.
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 46632

head/sys/amd64/amd64/machdep.c

head/sys/amd64/amd64/mp_machdep.c

head/sys/conf/files.amd64

head/sys/conf/files.i386

head/sys/dev/cpuctl/cpuctl.c

head/sys/i386/i386/locore.s

head/sys/i386/i386/machdep.c

head/sys/i386/i386/mp_machdep.c

head/sys/x86/acpica/acpi_wakeup.c

head/sys/x86/include/ucode.h

head/sys/x86/x86/mp_x86.c

head/sys/x86/x86/ucode.c

Implement early microcode loading for Intel i386 and amd64 platforms.ClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 46632

head/sys/amd64/amd64/machdep.c

head/sys/amd64/amd64/mp_machdep.c

head/sys/conf/files.amd64

head/sys/conf/files.i386

head/sys/dev/cpuctl/cpuctl.c

head/sys/i386/i386/locore.s

head/sys/i386/i386/machdep.c

head/sys/i386/i386/mp_machdep.c

head/sys/x86/acpica/acpi_wakeup.c

head/sys/x86/include/ucode.h

head/sys/x86/x86/mp_x86.c

head/sys/x86/x86/ucode.c

Implement early microcode loading for Intel i386 and amd64 platforms.
ClosedPublic
Actions

Revision Contents
Changeset List