
eventtimer: Fix several races in the timer reload code
ClosedPublic

Authored by markj on Jul 6 2022, 6:33 PM.

Details

Summary

In handleevents(), lock the timer state before fetching the time for the
next event. A concurrent callout_cc_add() call might be changing the
next event time, and the race can cause handleevents() to program an
out-of-date time, causing the callout to run later (by an unbounded
period up to the idle hardclock timeout of 1s) than requested.
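
A minimal sketch of the reordering this describes: handleevents(), getnextcpuevent(), and the ET_HW lock all appear in this review, but the state argument and the nextevent field are assumptions about the surrounding code, not quotes from the diff.

/* Racy ordering: the deadline is computed before the lock is taken, so a
 * concurrent callout_cc_add() can install a nearer deadline that this
 * stale value then overwrites. */
t = getnextcpuevent(state, 0);
ET_HW_LOCK();
state->nextevent = t;
...

/* Fixed ordering: compute the deadline with the timer state locked. */
ET_HW_LOCK();
t = getnextcpuevent(state, 0);
state->nextevent = t;
...
ET_HW_UNLOCK();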

In cpu_idleclock(), call getnextcpuevent() with the timer state mutex
held, for similar reasons. In particular, cpu_idleclock() runs with
interrupts enabled, so an untimely timer interrupt can result in a stale
next event time being programmed. Further, an interrupt can cause
cpu_idleclock() to use a stale value for "now".
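
Sketched under the same assumptions (the idle and nextevent fields are inferred, not quoted from the diff), the fixed cpu_idleclock() reads "now" and the next event time together under the lock:

/* state points at this CPU's timer state, loaded earlier. */
ET_HW_LOCK();
now = sbinuptime();             /* an interrupt can no longer make this stale */
t = getnextcpuevent(state, 1);  /* next event time, computed under the lock */
state->idle = true;
state->nextevent = t;
ET_HW_UNLOCK();
return (MAX(t - now, 0));       /* time left until the next event */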

In cpu_activeclock(), disable interrupts before loading "now", so as to
avoid going backwards in time when calling handleevents(). It's ok to
leave interrupts enabled when checking "state->idle", since the race at
worst will cause handleevents() to be called unnecessarily. But use an
atomic load to indicate that the test is racy.
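
A sketch of that ordering in cpu_activeclock(); atomic_load_int() is the stock FreeBSD atomic load, and the exact shape of the idle check is an assumption:

if (atomic_load_int(&state->idle) == 0)
        return;         /* racy check: at worst handleevents() runs once unnecessarily */
spinlock_enter();       /* disable interrupts before reading "now"... */
now = sbinuptime();
handleevents(now, 1);   /* ...so the time handed to handleevents() cannot go backwards */
spinlock_exit();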

Modify getnextcpuevent() to take a pointer to the per-CPU timer state
structure, since the callers have already loaded it anyway.
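
The signature change, sketched ("struct pcpu_state" is an assumed name for the per-CPU timer state structure):

/* Old: the function re-derived the per-CPU state on each call. */
static sbintime_t getnextcpuevent(int idle);

/* New: callers pass in the state pointer they have already loaded. */
static sbintime_t getnextcpuevent(struct pcpu_state *state, int idle);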

PR: 264867

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

markj requested review of this revision. Jul 6 2022, 6:33 PM
This revision is now accepted and ready to land. Jul 6 2022, 6:41 PM

On the old systems for which this code was written, with a global eventtimer and a slow timecounter, ET_HW_LOCK was heavily contended; that is why I tried to move sbinuptime() and everything else I could out of it. It should be less critical these days.

sys/kern/kern_clocksource.c
686

The "now" variable no longer seems to be needed after this change.

markj marked an inline comment as done.

Purge an unneeded variable.

This revision now requires review to proceed. Jul 6 2022, 7:11 PM
This revision is now accepted and ready to land. Jul 6 2022, 7:13 PM
In D35735#810600, @mav wrote:

On the old systems for which this code was written, with a global eventtimer and a slow timecounter, ET_HW_LOCK was heavily contended; that is why I tried to move sbinuptime() and everything else I could out of it. It should be less critical these days.

Are there any popular platforms without a per-CPU eventtimer?

If lock contention is still a problem somehow, there are some measures we could take to reduce the et_hw lock scope. For instance, cpu_idleclock() can do this:

spinlock_enter();       /* block interrupts so "now" cannot go stale */
now = sbinuptime();     /* read the time outside of the ET_HW lock */
ET_HW_LOCK();
spinlock_exit();        /* the spinlock section was only needed to keep "now" fresh */
...

though this will hurt single-threaded performance. It may also be possible to split getnextcpuevent() so that only callout events are synchronized. And we should try to eliminate sources of frequent cross-CPU callout scheduling, which would reduce use of the et_hw mutex. I fixed a couple of these recently but did not attempt an exhaustive search; I just fixed what showed up in profiling.
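
A hypothetical shape for that split; the getnextcpuevent_nolock() helper and the nextcall field are invented names for illustration only:

t = getnextcpuevent_nolock(state, idle);  /* stat/prof/hardclock deadlines are
                                           * per-CPU and need no synchronization */
ET_HW_LOCK();
if (state->nextcall < t)
        t = state->nextcall;              /* only the callout deadline is written
                                           * by other CPUs, so lock just for it */
...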

Are there any popular platforms without a per-CPU eventtimer?

The last ones I remember were around Gen 2-3 Cores: LAPIC timers were still unreliable, so those systems fell back to the HPET, which often could not provide enough per-CPU vectors. These days I don't know of any. So I'd care about it mostly if it is cheap, just to avoid hurting somebody's relics too much. Though we'll probably keep support for global eventtimers in general for a while.

In D35735#810631, @mav wrote:

Are there any popular platforms without a per-CPU eventtimer?

The last ones I remember were around Gen 2-3 Cores: LAPIC timers were still unreliable, so those systems fell back to the HPET, which often could not provide enough per-CPU vectors. These days I don't know of any. So I'd care about it mostly if it is cheap, just to avoid hurting somebody's relics too much. Though we'll probably keep support for global eventtimers in general for a while.

I do not remember anything bad about Ivy Bridge (Core v2) or Haswell (Core v3). Indeed, Nehalem (the very first Cores) needs C3 disabled for the LAPIC timer to work as intended.