Page MenuHomeFreeBSD

kvmclock: Fix initialization when EARLY_AP_STARTUP is not defined
ClosedPublic

Authored by markj on Dec 14 2022, 9:23 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 19, 10:30 AM
Unknown Object (File)
Thu, Dec 12, 9:00 PM
Unknown Object (File)
Thu, Dec 12, 6:54 PM
Unknown Object (File)
Nov 15 2024, 6:54 AM
Unknown Object (File)
Nov 15 2024, 6:07 AM
Unknown Object (File)
Nov 15 2024, 5:57 AM
Unknown Object (File)
Nov 15 2024, 5:38 AM
Unknown Object (File)
Nov 15 2024, 3:48 AM
Subscribers

Details

Summary

To attach to the hypervisor, kvmclock needs to write a per-CPU MSR.
When EARLY_AP_STARTUP is not defined, device attach happens too early:
APs are not yet spun up, so smp_rendezvous only runs the callback on the
local CPU. As a result, the timecounter only gets initialized on the
BSP, and then timekeeping is broken.

Implement handling for !EARLY_AP_STARTUP kernels: keep track of the CPU
on which device attach ran, and then use a SI_SUB_SMP SYSINIT to
register the rest of the CPUs with the hypervisor.

Reported by: Shrikanth R Kamath <kshrikanth@juniper.net>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 48732
Build 45618: arc lint + arc unit

Event Timeline

markj requested review of this revision.Dec 14 2022, 9:23 PM

I'm actually of a mind to say that EARLY_AP_STARTUP is required for x86 for 14.0. I had hoped for it to be a temporary option while other platforms caught up and then planned to remove it as an option once all the platforms were caught up. However, other platforms have been slower to catch up than I'd hoped.

sys/dev/kvm_clock/kvm_clock.c
118

Maybe we can look up the devclass by name here instead of using a global and a module event handler?

In D37705#857464, @jhb wrote:

I'm actually of a mind to say that EARLY_AP_STARTUP is required for x86 for 14.0. I had hoped for it to be a temporary option while other platforms caught up and then planned to remove it as an option once all the platforms were caught up. However, other platforms have been slower to catch up than I'd hoped.

I do think this patch has relatively little value. It fixes a problem that Juniper hit in a kernel based on stable/12 which does not (yet) enable EARLY_AP_STARTUP, so it'd be nice to have this fixed in stable branches even if we end up making EARLY_AP_STARTUP unconditional on x86 in 14.0. Another weak motivation is that this code might end up getting copied into a driver that runs on x86 plus other platforms without EARLY_AP_STARTUP, though currently it's quite x86-specific. (It seems like it should really live under sys/x86.)

sys/dev/kvm_clock/kvm_clock.c
118

Oh, yes.

markj marked an inline comment as done.

Use devclass_find().

I think having something you can MFC is fine. I should maybe write up a diff that adds an #error for x86 if EARLY_AP_STARTUP isn't present might be something I should post for review separately though. That idea had kind of fallen off of my radar.

sys/dev/kvm_clock/kvm_clock.c
118

It's true that when you MFC this, you can probably use the existing kvm_devclass global for this instead. It has only been removed in my sweep to remove devclass variables in general in 14.

120

This is still using the global? I think what you probably want to do that will minimize your diff on MFC is add a local variable in this function with the same name as the global in older branches and initialize the local with devclass_lookup() before you call devclass_get_maxunit(). Then on MFC you just remove the local variable and the line calling devclass_lookup.

This comment has been deleted.
sys/dev/kvm_clock/kvm_clock.c
120

?

markj marked 2 inline comments as done.
  • Pass the right unit number to devclass_get_softc().
  • Fix devclass usage.
This revision is now accepted and ready to land.Dec 16 2022, 6:52 PM
sys/dev/kvm_clock/kvm_clock.c
121

Also, isn't there only ever 1 of these devices? We hardcode a unit number of 0 in the identify routine. Probably the loop based on maxunit can go away and you can instead just look for a softc for unit 0 and return if it is NULL.

193

FWIW, on x86 this will always be 0 in the !EARLY case, but this is fine. You could maybe hardcode that in the SYSINIT if you wished though and then you wouldn't have to add a member to the softc. The SYSINIT would just become (after my other suggestion):

sc = device_get_softc(kvm_clock_devclass, 0);
if (sc == NULL || mp_ncpus == 1)
   return;

cpus = all_cpus;
CPU_CLR(&cpus, 0);
kvm_clock_system_time_enable(sc, &cpus);

If we think it didn't hurt to set the MSR twice on the BSP, we could further simplify this by removing the cpuset argument to kvm_clock_system_time_enable().

sys/dev/kvm_clock/kvm_clock.c
193

It's not obvious to me that we can safely reinitialize the timer on the BSP, doing so might invalidate some timecounter state. At that point, with EARLY_AP_STARTUP disabled, the timecounter is already active. And even if it works with the Linux version that I'm testing with, it might not work with other versions, so I can't easily test.

I considered hard-coding the cpuid since yes, we should still be on the BSP here. It does seem a bit fragile to assume that though, so I'm just playing it safe.

sys/dev/kvm_clock/kvm_clock.c
193

I guess one thought is the kldload case (can this driver be kldloaded?) In that case we might not be on the BSP so you need the softc member.

sys/dev/kvm_clock/kvm_clock.c
193

Currently we don't build a KLD for this driver but I suppose it could be done. And indeed that's another reason to keep track of the attaching CPU.

markj marked 2 inline comments as done.

Assume there is only a single instance of the driver, I had missed that
DEVICE_IDENTIFY hard-codes a unit number.

This revision now requires review to proceed.Dec 20 2022, 7:13 PM
jhb added inline comments.
sys/dev/kvm_clock/kvm_clock.c
170

One nit: we don't actually need this I think. We always set it to a value in the EARLY case below. It might makes sense I guess to #ifdef the member in the softc so it is only present in the !EARLY case if only so when EARLY eventually gets removed as an option it's easier to do the cleanup. :)

This revision is now accepted and ready to land.Dec 20 2022, 7:35 PM
  • Remove an unneeded initialization.
  • Define the firstcpu member only when EARLY_AP_STARTUP is not defined.
This revision now requires review to proceed.Dec 21 2022, 4:12 PM
This revision was not accepted when it landed; it landed in state Needs Review.Jan 13 2023, 3:40 PM
This revision was automatically updated to reflect the committed changes.