x86: Speed up clock calibration
Closed · Public

Authored by cperciva on Jan 10 2022, 1:29 AM.

Details

Reviewers

imp
jhb
kib
markj
jrtc27

Commits

rGbaee6cc1814b: x86: Speed up clock calibration
rGc2705ceaeb09: x86: Speed up clock calibration

Summary

Prior to this commit, the TSC and local APIC frequencies were calibrated
at boot time by measuring the clocks before and after a one-second sleep.
This was simple and effective, but had the disadvantage of *requiring a
one-second sleep*.

Rather than making two clock measurements (before and after sleeping) we
now perform many measurements; and rather than simply subtracting the
starting count from the ending count, we calculate a best-fit regression
between the target clock and the reference clock (for which the current
best available timecounter is used). While we do this, we keep track
of an estimate of the uncertainty in the regression slope (i.e. the ratio
of clock speeds), and stop measuring when we believe the uncertainty is
less than 1 PPM.
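
As an illustration of the approach described above, here is a minimal userland sketch of an incremental least-squares fit with a stopping test on the relative uncertainty of the slope. It is not the code from `subr_clockcalib.c`; the names `calib_fit`, `calib_fit_add`, `calib_fit_slope` and `calib_fit_converged` are hypothetical, and the textbook standard-error formula for the slope is assumed. The convergence test compares squared quantities so that no square root is needed.

```c
/*
 * Sketch only: hypothetical names, not the committed kernel code.
 * x is the reference clock reading, y is the target clock reading.
 */
struct calib_fit {
	unsigned long n;	/* number of samples so far */
	double mean_x, mean_y;	/* running means */
	double sxx, syy, sxy;	/* centered sums of squares and products */
};

/* Fold one (reference, target) sample into the running statistics. */
static void
calib_fit_add(struct calib_fit *f, double x, double y)
{
	double dx, dy;

	f->n++;
	dx = x - f->mean_x;		/* deltas against the old means */
	dy = y - f->mean_y;
	f->mean_x += dx / (double)f->n;
	f->mean_y += dy / (double)f->n;
	/* Welford-style update: old delta times new delta. */
	f->sxx += dx * (x - f->mean_x);
	f->syy += dy * (y - f->mean_y);
	f->sxy += dx * (y - f->mean_y);
}

/* Best-fit slope, i.e. target ticks per reference tick. */
static double
calib_fit_slope(const struct calib_fit *f)
{
	return (f->sxy / f->sxx);
}

/*
 * Return nonzero once the estimated standard error of the slope, divided
 * by the slope, is below max_rel_unc (1e-6 for 1 PPM).
 */
static int
calib_fit_converged(const struct calib_fit *f, double max_rel_unc)
{
	double slope, resid, var_slope;

	if (f->n < 3 || f->sxx <= 0.0 || f->sxy <= 0.0)
		return (0);
	slope = f->sxy / f->sxx;
	resid = f->syy - slope * f->sxy;	/* residual sum of squares */
	if (resid < 0.0)
		resid = 0.0;			/* guard against rounding */
	var_slope = resid / ((double)(f->n - 2) * f->sxx);
	return (var_slope < max_rel_unc * max_rel_unc * slope * slope);
}
```

A caller would add samples in a loop and stop as soon as `calib_fit_converged(&f, 1e-6)` returns nonzero; the target frequency is then the slope multiplied by the reference clock frequency.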

In order to avoid the risk of aliasing resulting from the data-gathering
loop synchronizing with (a multiple of) the frequency of the reference
clock, we add some additional spinning depending upon the iteration number.
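
One way to picture the anti-aliasing step is a per-iteration spin whose length cycles through a small set of values, so that successive samples do not land at a fixed phase of the reference clock. The sketch below is only an assumption about how such a delay could be chosen; the constants `SPIN_BASE` and `SPIN_SPREAD` and both function names are hypothetical and are not taken from the commit.

```c
#include <stdint.h>

#define SPIN_BASE	16u	/* hypothetical minimum spin count */
#define SPIN_SPREAD	13u	/* hypothetical; prime, so the pattern has period 13 */

/* Number of spin iterations to insert before taking sample number n. */
static uint32_t
spin_iterations(uint64_t n)
{
	return (SPIN_BASE + (uint32_t)((n * 7u) % SPIN_SPREAD));
}

/* Burn roughly iters loop iterations; volatile keeps the loop alive. */
static void
spin_for(uint32_t iters)
{
	volatile uint32_t sink = 0;

	while (iters-- > 0)
		sink++;
}
```

Calling `spin_for(spin_iterations(n))` at the top of the sampling loop varies the loop period from one iteration to the next.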

For numerical stability and simplicity of implementation, we make use of
floating-point arithmetic for the statistical calculations.

This reduces the FreeBSD kernel boot time on x86 systems by between 1900
and 2000 ms.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 43800
Build 40688: arc lint + arc unit

Event Timeline

cperciva created this revision. Jan 10 2022, 1:29 AM

Herald added a subscriber: imp. Jan 10 2022, 1:29 AM

cperciva requested review of this revision. Jan 10 2022, 1:29 AM

Harbormaster completed remote builds in B43800: Diff 101173. Jan 10 2022, 1:29 AM

cperciva added reviewers: imp, jhb, kib, markj, jrtc27. Jan 10 2022, 1:30 AM

kib added inline comments. Jan 10 2022, 11:09 AM

sys/kern/subr_clockcalib.c
42	`char *clkname`
44	`*tc`
108	It seems you are using 64bit reads for TSC, which cannot wrap. But LAPIC timer is 32bit and can wrap.
179	Perhaps use ia32_pause() there instead of empty loop?
sys/sys/clockcalib.h
37	I do not see a need for a new header for a single prototype. timetc.h IMO is the right place for the clockcalib() prototype. Maybe also add 'tc' somewhere in the name, to indicate that the function works with the timecounter.

markj added inline comments. Jan 10 2022, 3:20 PM

sys/x86/x86/tsc.c
715	Can't we enter the FPU-enabled section in clockcalib(), rather than forcing all callers to do it?

kib added inline comments. Jan 10 2022, 3:34 PM

sys/x86/x86/tsc.c
715	This was discussed to death in a previous review. There is no compiler barrier that prevents the compiler from reordering FPU instructions out of the region. The separate compilation unit is used as a kind of barrier, and could break with LTO turned on, I suspect.

val_packett.cool added a subscriber: val_packett.cool. Jan 11 2022, 3:18 PM

val_packett.cool added inline comments.

sys/sys/clockcalib.h
37	In case the header stays, it should at least not break buildworld :) add `<sys/types.h>` to avoid: /usr/obj/usr/src/amd64.amd64/tmp/usr/include/sys/clockcalib.h:37:1: error: unknown type name 'uint64_t' […] make[3]: stopped in /usr/src/tools/build/test-includes

jhb added inline comments. Jan 11 2022, 8:16 PM

sys/kern/subr_clockcalib.c
179	Maybe spelled as `cpu_spinwait` since this file is not otherwise x86-specific?
sys/x86/x86/local_apic.c
1008	New functions don't have to have the gratuitous blank line here, but if you are matching the style of existing functions in this file, it is ok to stay.

cperciva added inline comments. Jan 12 2022, 4:15 AM

sys/kern/subr_clockcalib.c
44	Thanks, fixed.
108	True, but if we were going to handle this it should be done in `local_apic.c`. For our purposes it shouldn't matter -- we're counting down from APIC_TIMER_MAX_COUNT and should have finished calibrating long before we hit zero.
179	Probably doesn't matter at the moment considering that it's unlikely we'll have any other non-idle thread running at this point in the boot process -- but sure, might as well include it.
sys/sys/clockcalib.h
37	Ok, merging into `timetc.h`. I don't think it makes sense to have `tc` in the function name though, since it's calibrating results returned by a function pointer, not a timecounter.
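
For readers following the wrap discussion on line 108: if wrap handling were ever wanted, one common technique is to compute elapsed ticks with unsigned 32-bit subtraction and accumulate them into a 64-bit total. The sketch below is a userland illustration only. It assumes an idealized down-counter that decrements modulo 2^32 and is sampled at least once per wrap period, which may not match real LAPIC reload behavior; `countdown_elapsed32`, `struct countdown64` and `countdown64_update` are hypothetical names, not part of this change.

```c
#include <stdint.h>

/*
 * Ticks elapsed between two reads of a 32-bit down-counter.  Unsigned
 * subtraction is modular, so a single wrap between reads is absorbed.
 */
static uint32_t
countdown_elapsed32(uint32_t start, uint32_t now)
{
	return ((uint32_t)(start - now));
}

/* Hypothetical state for widening the down-counter to 64 bits. */
struct countdown64 {
	uint32_t last;		/* previous raw reading */
	uint64_t total;		/* ticks accumulated so far */
};

/* Fold a new raw reading into the 64-bit running total. */
static uint64_t
countdown64_update(struct countdown64 *c, uint32_t now)
{
	c->total += countdown_elapsed32(c->last, now);
	c->last = now;
	return (c->total);
}
```

The result is an increasing 64-bit count that could be fed to a calibration routine expecting a non-wrapping clock.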

Fixes per reviews.

Harbormaster completed remote builds in B43875: Diff 101344. Jan 12 2022, 4:17 AM

kib accepted this revision. Jan 12 2022, 5:08 AM

kib added inline comments.

sys/kern/subr_clockcalib.c
179	It is not about scheduling, you are in critical section anyway, so context switches are disabled. cpu_spinwait() reduces power consumption of this core.
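
To make the spin-wait suggestion concrete outside the kernel, here is a hedged userland sketch of a busy-wait loop that issues a CPU hint on each iteration. In the FreeBSD kernel the portable spelling is `cpu_spinwait()`; the `spin_hint()` macro, `spin_until()`, and the fake clock below are hypothetical stand-ins written with GCC/Clang inline assembly, not code from this review.

```c
#include <stdint.h>

/* Hypothetical userland stand-in for the kernel's cpu_spinwait(). */
#if defined(__i386__) || defined(__x86_64__)
#define	spin_hint()	__asm__ __volatile__("pause" ::: "memory")
#elif defined(__aarch64__)
#define	spin_hint()	__asm__ __volatile__("yield" ::: "memory")
#else
#define	spin_hint()	__asm__ __volatile__("" ::: "memory")
#endif

/* Busy-wait until clk() reaches deadline; return the reading that ended it. */
static uint64_t
spin_until(uint64_t (*clk)(void), uint64_t deadline)
{
	uint64_t now;

	while ((now = clk()) < deadline)
		spin_hint();
	return (now);
}

/* Fake clock for demonstration only: advances 3 ticks per read. */
static uint64_t fake_ticks;

static uint64_t
fake_clk(void)
{
	fake_ticks += 3;
	return (fake_ticks);
}
```

With `fake_ticks` reset to 0, `spin_until(fake_clk, 10)` reads 3, 6, 9 and then 12, issuing the hint after each of the first three reads.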

This revision is now accepted and ready to land. Jan 12 2022, 5:08 AM

cperciva added inline comments. Jan 12 2022, 6:12 AM

sys/kern/subr_clockcalib.c
179	Oh, I was thinking about the effect of PAUSE on hyperthreading, that it reduces the performance impact on the sibling thread. I didn't realize it also reduced power consumption. I doubt anyone cares about a one-time reduction in power consumption lasting a few microseconds, but there's no reason to not save a few mJ I guess...