
eventtimer: Fix several races in the timer reload code
ClosedPublic

Authored by markj on Jul 6 2022, 6:33 PM.

Details

Summary

In handleevents(), lock the timer state before fetching the time for the
next event. A concurrent callout_cc_add() call might be changing the
next event time, and the race can cause handleevents() to program an
out-of-date time, causing the callout to run later (by an unbounded
period up to the idle hardclock timeout of 1s) than requested.
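
A minimal sketch of the reordering this describes: handleevents(), getnextcpuevent(), and the ET_HW lock all appear in this review, but the state argument and the nextevent field are assumptions about the surrounding code, not quotes from the diff.

/* Racy ordering: the deadline is computed before the lock is taken, so a
 * concurrent callout_cc_add() can install a nearer deadline that this
 * stale value then overwrites. */
t = getnextcpuevent(state, 0);
ET_HW_LOCK();
state->nextevent = t;
...

/* Fixed ordering: compute the deadline with the timer state locked. */
ET_HW_LOCK();
t = getnextcpuevent(state, 0);
state->nextevent = t;
...
ET_HW_UNLOCK();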

In cpu_idleclock(), call getnextcpuevent() with the timer state mutex
held, for similar reasons. In particular, cpu_idleclock() runs with
interrupts enabled, so an untimely timer interrupt can result in a stale
next event time being programmed. Further, an interrupt can cause
cpu_idleclock() to use a stale value for "now".
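
Sketched under the same assumptions (the idle and nextevent fields are inferred, not quoted from the diff), the fixed cpu_idleclock() reads "now" and the next event time together under the lock:

/* state points at this CPU's timer state, loaded earlier. */
ET_HW_LOCK();
now = sbinuptime();             /* an interrupt can no longer make this stale */
t = getnextcpuevent(state, 1);  /* next event time, computed under the lock */
state->idle = true;
state->nextevent = t;
ET_HW_UNLOCK();
return (MAX(t - now, 0));       /* time left until the next event */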

In cpu_activeclock(), disable interrupts before loading "now", so as to
avoid going backwards in time when calling handleevents(). It's ok to
leave interrupts enabled when checking "state->idle", since the race at
worst will cause handleevents() to be called unnecessarily. But use an
atomic load to indicate that the test is racy.
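
A sketch of that ordering in cpu_activeclock(); atomic_load_int() is the stock FreeBSD atomic load, and the exact shape of the idle check is an assumption:

if (atomic_load_int(&state->idle) == 0)
        return;         /* racy check: at worst handleevents() runs once unnecessarily */
spinlock_enter();       /* disable interrupts before reading "now"... */
now = sbinuptime();
handleevents(now, 1);   /* ...so the time handed to handleevents() cannot go backwards */
spinlock_exit();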

Modify getnextcpuevent() to take a pointer to the per-CPU timer state
structure, since the callers have already loaded it anyway.
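
The signature change, sketched ("struct pcpu_state" is an assumed name for the per-CPU timer state structure):

/* Old: the function re-derived the per-CPU state on each call. */
static sbintime_t getnextcpuevent(int idle);

/* New: callers pass in the state pointer they have already loaded. */
static sbintime_t getnextcpuevent(struct pcpu_state *state, int idle);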

PR: 264867

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

markj requested review of this revision. Jul 6 2022, 6:33 PM
This revision is now accepted and ready to land. Jul 6 2022, 6:41 PM

On the old systems for which this code was written, with a global eventtimer and a slow timecounter, ET_HW_LOCK was heavily contended; that is why I tried to move sbinuptime() and everything else I could out of it. It should be less critical these days.

sys/kern/kern_clocksource.c
686

The "now" variable no longer seems to be needed after this change.

markj marked an inline comment as done.

Purge an unneeded variable.

This revision now requires review to proceed. Jul 6 2022, 7:11 PM
This revision is now accepted and ready to land. Jul 6 2022, 7:13 PM
In D35735#810600, @mav wrote:

On the old systems for which this code was written, with a global eventtimer and a slow timecounter, ET_HW_LOCK was heavily contended; that is why I tried to move sbinuptime() and everything else I could out of it. It should be less critical these days.

Are there any popular platforms without a per-CPU eventtimer?

If lock contention is still a problem somehow, there are some measures we could take to reduce the et_hw lock scope. For instance, cpu_idleclock() can do this:

spinlock_enter();       /* block interrupts so "now" cannot go stale */
now = sbinuptime();     /* read the time outside of the ET_HW lock */
ET_HW_LOCK();
spinlock_exit();        /* the spinlock section was only needed to keep "now" fresh */
...

though this will hurt single-threaded performance. It may also be possible to split getnextcpuevent() so that only callout events are synchronized. And we should try to eliminate sources of frequent cross-CPU callout scheduling, which would reduce use of the et_hw mutex. I fixed a couple of these recently but did not attempt an exhaustive search; I just fixed what showed up in profiling.
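
A hypothetical shape for that split; the getnextcpuevent_nolock() helper and the nextcall field are invented names for illustration only:

t = getnextcpuevent_nolock(state, idle);  /* stat/prof/hardclock deadlines are
                                           * per-CPU and need no synchronization */
ET_HW_LOCK();
if (state->nextcall < t)
        t = state->nextcall;              /* only the callout deadline is written
                                           * by other CPUs, so lock just for it */
...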

Are there any popular platforms without a per-CPU eventtimer?

The last ones I remember were around Gen 2-3 Cores: LAPIC timers were still unreliable, so those systems fell back to the HPET, which often could not provide enough per-CPU vectors. These days I don't know of any. So I'd care about it mostly if it is cheap, just to avoid hurting somebody's relics too much. Though we'll probably keep support for global eventtimers in general for a while.

In D35735#810631, @mav wrote:

Are there any popular platforms without a per-CPU eventtimer?

The last ones I remember were around Gen 2-3 Cores: LAPIC timers were still unreliable, so those systems fell back to the HPET, which often could not provide enough per-CPU vectors. These days I don't know of any. So I'd care about it mostly if it is cheap, just to avoid hurting somebody's relics too much. Though we'll probably keep support for global eventtimers in general for a while.

I do not remember anything bad about Ivy Bridge (Core v2) or Haswell (Core v3). Indeed, Nehalem (the very first Cores) needs C3 disabled for the LAPIC timer to work as intended.