Page MenuHomeFreeBSD

x86: Speed up clock calibration

Authored by cperciva on Jan 10 2022, 1:29 AM.



Prior to this commit, the TSC and local APIC frequencies were calibrated
at boot time by measuring the clocks before and after a one-second sleep.
This was simple and effective, but had the disadvantage of *requiring a
one-second sleep*.

Rather than making two clock measurements (before and after sleeping) we
now perform many measurements; and rather than simply subtracting the
starting count from the ending count, we calculate a best-fit regression
between the target clock and the reference clock (for which the current
best available timecounter is used). While we do this, we keep track
of an estimate of the uncertainty in the regression slope (aka. the ratio
of clock speeds), and stop measuring when we believe the uncertainty is
less than 1 PPM.

In order to avoid the risk of aliasing resulting from the data-gathering
loop synchronizing with (a multiple of) the frequency of the reference
clock, we add some additional spinning depending upon the iteration number.

For numerical stability and simplicity of implementation, we make use of
floating-point arithmetic for the statistical calculations.

This reduces the FreeBSD kernel boot time on x86 systems by between 1900
and 2000 ms.

Diff Detail

rS FreeBSD src repository - subversion
Lint OK
No Unit Test Coverage
Build Status
Buildable 43800
Build 40688: arc lint + arc unit

Event Timeline


char *clkname




It seems you are using 64bit reads for TSC, which cannot wrap. But LAPIC timer is 32bit and can wrap.


Perhaps use ia32_pause() there instead of empty loop?


I do not see a need of a new header for the whole single prototype. timetc.h IMO is the right place for clockcalib() proto. May be also add 'tc' somewhere in the name, to indicate that the function works with the timecounter.


Can't we enter the FPU-enabled section in clockcalib(), rather than forcing all callers to do it?


This was discussed to the death in some previous review. There is no compiler barrier that prevents it from reordering FPU instructions out of the region.

The separate compilation unit is used as a kind of barrier, and could break with LTO turned on, I suspect. added inline comments.

In case the header stays, it should at least not break buildworld :) add <sys/types.h> to avoid:

/usr/obj/usr/src/amd64.amd64/tmp/usr/include/sys/clockcalib.h:37:1: error: unknown type name 'uint64_t'
make[3]: stopped in /usr/src/tools/build/test-includes

Maybe spelled as cpu_spinwait since this file is not otherwise x86-specific?


New functions don't have to have the gratuitous blank line here, but if you are matching the style of existing functions in this file, it is ok to stay.


Thanks, fixed.


True, but if we were going to handle this is should be done in local_apic.c. For our purposes it shouldn't matter -- we're counting down from APIC_TIMER_MAX_COUNT and should have finished calibrating long before we hit zero.


Probably doesn't matter at the moment considering that it's unlikely we'll have any other non-idle thread running at this point in the boot process -- but sure, might as well include it.


Ok, merging into timetc.h. I don't think it makes sense to have tc in the function name though, since it's calibrating results returned by a function pointer, not a timecounter.

kib added inline comments.

It is not about scheduling, you are in critical section anyway, so context switches are disabled. cpu_spinwait() reduces power consumption of this core.

This revision is now accepted and ready to land.Jan 12 2022, 5:08 AM

Oh, I was thinking about the effect of PAUSE on hyperthreading, that it reduces the performance impact on the sibling thread. I didn't realize it also reduced power consumption.

I doubt anyone cares about a one-time reduction in power consumption lasting a few microseconds, but there's no reason to not save a few mJ I guess...

This revision was automatically updated to reflect the committed changes.