Page MenuHomeFreeBSD

Fix several races in ntptime code and provide fine-grained locking for ntptime and resettodr().
ClosedPublic

Authored by kib on Jun 12 2016, 2:16 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Dec 10, 10:04 AM
Unknown Object (File)
Mon, Dec 2, 7:21 AM
Unknown Object (File)
Sun, Dec 1, 8:33 PM
Unknown Object (File)
Mon, Nov 25, 7:09 PM
Unknown Object (File)
Nov 16 2024, 6:14 PM
Unknown Object (File)
Nov 12 2024, 1:18 PM
Unknown Object (File)
Nov 5 2024, 1:03 PM
Unknown Object (File)
Oct 26 2024, 4:59 PM
Subscribers

Details

Summary

Currently both ntptime code and resettodr() are Giant-locked. In particular, the Giant is supposed to protect against parallel ntp_adjtime(2) invocations. But, for instance, sys_ntp_adjtime() does copyout(9) under Giant and then examines time_status to return syscall result. Since copyout(9) could sleep, the syscall result might be
inconsistent.

Another and bigger issue is that if PPS is configured, hardpps() is executed without any protection against the parallel syscall invocation. Potentially, this may result in the inconsistent state of the ntptime state variables, but I cannot say how serious such distortion is. The non-functional splclock() call in sys_ntp_adjtime() protected against clock interrupts calling hardpps() in the pre-SMP era.

The patch modernizes the locking. A mutex protects ntptime data. Due to the hardpps() KPI legitimately serving from the interrupt filters (and e.g. uart(4) does call it from filter), the lock cannot be sleepable mutex if PPS_SYNC is defined.

The ntp_is_time_error() interface was changed to make the calls outside the ntpadj_lock at least self-consistent. Potentially callout could act upon stale state of the ntptime machinery still, but this is true for the current code in more ugly ways, and probably should be fine due to both low quality and long time of update for typical RTC.

For resettodr(), I just added an internal mutex and removed (non-enforced) requirement of owning Giant around the call. Since resettodr() is callable from the clock swi, MD CLOCK_SETTIME() implementations must not sleep anyway. The lock ensures that the callout to write-out the clock does not interfere with the machdep.adjkerntz sysctl and shutdown. I believe, but did not verified, that this is enough for all drivers (due to non-sleepable statement).

Test Plan

I do not have any PPS hardware so I only tested that the spinlock case does not cause immediate smoke-out for an adjtime(2) call.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kib retitled this revision from to Fix several races in ntptime code and provide fine-grained locking for ntptime and resettodr()..
kib updated this object.
kib edited the test plan for this revision. (Show Details)
kib added reviewers: ian, imp, lstewart.
kib set the repository for this revision to rS FreeBSD src repository - subversion.
kib added a subscriber: emaste.
sys/kern/kern_ntptime.c
489 ↗(On Diff #17538)

Is time_state a global protected by this lock? If so, you could just do this:

int retval;

if (ntp_is_time_error(time_status))
    retval = TIME_ERROR;
else
    retval = time_state;
NTPADJ_UNLOCK();

And then set td_retval[0] to retval if there is no error from copyout().

kib marked an inline comment as done.

Implement jhb' suggestion.

imp edited edge metadata.

Generally, I like this cleanup. Wondering why sometimes we spin and other times we don't. Is there a performance win for that? Since this lock is usually not contested, I'm skeptical there's a win. If there's not a win, I'd recommend always spinning. There's a couple of minor style nits, but I'm not going to block based on them. Use your best judgement.

sys/kern/kern_ntptime.c
179 ↗(On Diff #17945)

Why not always spin?

233 ↗(On Diff #17945)

time_status hides the global time status. Higher warning levels will complain.

822–823 ↗(On Diff #17945)

Given the number of unlocks, I'd be more inclined to have a goto out pattern that does the unlock and returns.

But it's a minor point.

965 ↗(On Diff #17945)

out would be here, see comment above.

This revision is now accepted and ready to land.Jun 27 2016, 8:11 PM
kib marked 3 inline comments as done.Jun 27 2016, 8:25 PM
kib added inline comments.
sys/kern/kern_ntptime.c
179 ↗(On Diff #17945)

Spinlocks add interrupt latency, as a generic policy in kernel we prefer not to spin if ever possible. I fixed several existing cases in kernel, like umtx, which used spinlocks without a justification. And there, if we can avoid spinning, we should IMO. Typical kernel does not have PPS_SYNC, so always using spinlock adds unneeded overhead.

kib edited edge metadata.

Rename argument to ntp_is_time_error().
Use goto out in hardpps().

This revision now requires review to proceed.Jun 27 2016, 8:33 PM
jhb edited edge metadata.
This revision is now accepted and ready to land.Jun 28 2016, 3:27 PM
This revision was automatically updated to reflect the committed changes.