Page MenuHomeFreeBSD

Update L1TF workaround to sustain L1D pollution from NMI.
ClosedPublic

Authored by kib on Aug 18 2018, 2:27 PM.
Tags
None
Referenced Files
Unknown Object (File)
Nov 22 2024, 9:58 AM
Unknown Object (File)
Oct 10 2024, 11:44 AM
Unknown Object (File)
Oct 9 2024, 9:19 AM
Unknown Object (File)
Sep 29 2024, 1:16 AM
Unknown Object (File)
Sep 27 2024, 8:37 AM
Unknown Object (File)
Sep 24 2024, 11:00 PM
Unknown Object (File)
Sep 24 2024, 6:30 AM
Unknown Object (File)
Sep 23 2024, 7:39 AM
Subscribers

Details

Summary

Current mitigation for L1TF in bhyve flushes L1D either by an explicit WRMSR command, or by software reading enough uninteresting data to fully populate all lines of L1D. If NMI occurs after either of methods is completed, but before VM entry, L1D becomes polluted with the cache lines touched by NMI handlers. There is no interesting data which NMI accesses, but something sensitive might be co-located on the same cache line, and then L1TF exposes that to a rogue guest.

Use VM entry MSR load list to ensure atomicity of L1D cache and VM entry if updated microcode was loaded. If only software flush method is available, try to help the bhyve sw flusher by also flushing L1D on NMI exit to kernel mode.

Suggested by and discussed with: Andrew Cooper <andrew.cooper3@citrix.com>

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/amd64/amd64/support.S
1253 ↗(On Diff #46893)

bde@ would probably like a period at the end of the comment. :-P

sys/amd64/vmm/intel/vmx.c
814 ↗(On Diff #46893)

Perhaps move the stale comment here and adjust it slightly to say that if L1D flushing is enabled, use the MSR via the MSR load list if possible, otherwise fall back to software flushing during VM entry and return from IRET.

823 ↗(On Diff #46893)

I think I know what this is trying to do, but it probably warrants clarification (and perhaps commenting what the values for this variable are supposed to mean). Originally I thought the goal was to permit the user to set the tunable to some value to turn off the flushing. Instead, it seems that setting the tunable to 0 means "default, enable flushing if MSR isn't available when loading vmm.ko", 1 means "default, but flushing will happen for all NMIs unless you happen to load and then unload vmm.ko at which point it will then revert to '0'". Values > 1 seem to mean "always flush all the time regardless of vmm being loaded or unloaded".

sys/amd64/vmm/intel/vmx_support.S
181 ↗(On Diff #46893)

This comment is now stale.

kib marked 4 inline comments as done.

Rework comments.

Sorry to be so picky about the comments. The code part definitely looks fine to me.

sys/amd64/amd64/trap.c
164 ↗(On Diff #46896)

I do almost wonder if it would be clearer to have the tunable separate. That is you might have:

/*
  * Control L1D flush on return from NMI.  By default, L1D flush is only enabled when
  * vmm.ko is loaded and requires L1D flushes.  If set to a non-zero value, L1D is always
  * flushed.
  */
static int nmi_flush_l1d;
SYSCTL_INT(_machdep, OID_AUTO, nm_flush_l1d, CTLFLAG_RWTUN, ...);

int nmi_flush_l1d_sw;

Then you would have a SYSINIT or some such that sets 'nmi_flush_l1d_sw' during boot if nmi_flush_l1d was set by the user. In the vmm handlers you would set 'nmi_flush_l1d_sw' to either '1' (on load) or '0' (on unload) only if 'nmi_flush_l1d' was 0. This avoids the weird case where setting the value to '1' gets lost on vmm.ko unload, but it loses the ability to see the current effective setting via sysctl. I'm not sure which is better or worse. We will probably only need this if users need to use it as a workaround for a future SA though, so maybe it doesn't really matter. If you want to keep the current variable, maybe adjust the comment to something like:

/*
  * Control L1D flush on return from NMI.
  *
  * Tunable  can be set to the following values:
  * 0 - only enable flush on return from NMI if required by vmm.ko (default)
  * >1 - always flush on return from NMI.
  *
  * Post-boot, the sysctl indicates if flushing is currently enabled.
  */
This revision is now accepted and ready to land.Aug 18 2018, 4:36 PM
kib marked an inline comment as done.

jhb' wording for the comment.

This revision now requires review to proceed.Aug 18 2018, 5:25 PM
sys/amd64/amd64/trap.c
164 ↗(On Diff #46896)

Third option is to not document the tunable at all. It is mostly a backdoor to allow me to provide some emergency button in case this code causes problems. I.e. ideally it would be not a sysctl/tunable at all.

Or, I might add a hook in nmi handler return path, similar to pmc_hook, set by vmm.ko.

Anyway, I took your suggestion about the comment rewording. I do not want to add the complexity to the weird code.

This revision is now accepted and ready to land.Aug 19 2018, 5:34 PM
This revision was automatically updated to reflect the committed changes.