Page MenuHomeFreeBSD

watchdog: Convert to using sbintime_t format
Needs ReviewPublic

Authored by jhibbits on Oct 28 2024, 3:44 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Feb 20, 2:20 AM
Unknown Object (File)
Mon, Feb 10, 12:03 PM
Unknown Object (File)
Fri, Jan 31, 6:57 PM
Unknown Object (File)
Mon, Jan 27, 9:24 AM
Unknown Object (File)
Fri, Jan 24, 1:20 AM
Unknown Object (File)
Jan 18 2025, 5:00 PM
Unknown Object (File)
Jan 17 2025, 1:35 AM
Unknown Object (File)
Dec 26 2024, 8:14 AM
Subscribers

Details

Reviewers
phk
imp
jhb
stevek
Summary

Some watchdogs are now based on a countdown timer instead of a bit
check. To deal with these, convert the watchdog framework to use
sbintime_t instead of power-of-2-nanoseconds. This allows more
precision, and more variety of watchdog timeouts. Keep the old method
as a compatibility layer, so that drivers can be migrated slowly, as
needed.

Still TODO: documentation updates, probably a lot of cleanup, too.

Test Plan

This was tested in a VM with SW_WATCHDOG configured, and various debug printf()s added, which I'll commit later.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 60222
Build 57106: arc lint + arc unit

Event Timeline

jhibbits created this revision.

As the original author of the watchdog interfaces, I have no objections to this change, but I have not reviewed the diff.

adrian added inline comments.
sys/sys/watchdog.h
45

is it ok for these to re-use the same ioctl numbers as the existing API?

sys/sys/watchdog.h
45

I believe so, because the type is also encoded into the final ioctl number. At least, there's no build errors, and the right cases are used when tested :)

sys/sys/watchdog.h
45

Only the size of the type goes into the behind-the-scenes magic number.

While the risk that u_int and sbintime_t ever have the same size is probably nonexistent, the numbers are not a particularly limited resource, so I see no reason to reuse 42.

But I am unaware of any "official guidance" on this point, so that is merely my opinion.

I seem to have watchdogs only in old 32 bit hardware, so I am unfortunately unable to test this patch.

So if I'm reading this correctly, there are a couple of changes here that aren't quite described in the commit log:

  1. In the existing interface, the "pat" operation (both ioctl and in-kernel) accepts a mask of a control operation and a timeout value. In the new interface, the pat interface only accepts the timeout and a new control operation handles things like resetting the watchdog. I actually think just splitting that change out into its own commit (i.e. adding the new function and ioctl) earlier in the series might be nicer (if it is indeed separable).
  1. Do we need to keep the old ioctls around for userspace to still use? If not, I'd rather not have them in the header at all, and not add "_SBT" suffixes. That is, I'd rather do
#define WDIOCPATPAT	_IOW('W', 42, sbintime_t)	/* pat the watchdog */

in the header, and only in the kernel code under a suitable #ifdef:

#ifdef COMPAT_FREEBSD14
#define WDIOCPATPAT_14 _IOW('W', 42, u_int)
...
#endif

The _SBT prefix won't age well, so I'd rather keep the names. It will also let us ensure any existing userspace software is fixed for 15.0. In terms of ioctl numbers, I do agree with phk@ that we have lots of numbers, so might as well err on the side of caution and pick new values.

sys/dev/watchdog/watchdog.c
71

Do we still need this as a separate value? Can we not compute it from the sbt value instead when needed? If so, then the sbt variant of the variable can just take on this name. I have the same question about wd_last_sbt vs wd_last_u. Having two copies of a value is usually a recipe for them getting out of sync vs having a single source of truth and computing alternative interpretations on demand.

147

error_sbt seems unused at this point in time. That is, I have no idea what this comment means. How would you print a warning here when error_sbt is always 0? Would you print the warning after it gets set after calling EVENTHANDLER_INVOKE? If so, maybe move the comment down to where you would do the check and print the warning?

In D47312#1110694, @jhb wrote:

So if I'm reading this correctly, there are a couple of changes here that aren't quite described in the commit log:

  1. In the existing interface, the "pat" operation (both ioctl and in-kernel) accepts a mask of a control operation and a timeout value. In the new interface, the pat interface only accepts the timeout and a new control operation handles things like resetting the watchdog. I actually think just splitting that change out into its own commit (i.e. adding the new function and ioctl) earlier in the series might be nicer (if it is indeed separable).

Good idea. I think it is easily separable, since the control is effectively just the control bits of the pat ioctl, separated out.

  1. Do we need to keep the old ioctls around for userspace to still use? If not, I'd rather not have them in the header at all, and not add "_SBT" suffixes. That is, I'd rather do
#define WDIOCPATPAT	_IOW('W', 42, sbintime_t)	/* pat the watchdog */

in the header, and only in the kernel code under a suitable #ifdef:

#ifdef COMPAT_FREEBSD14
#define WDIOCPATPAT_14 _IOW('W', 42, u_int)
...
#endif

The _SBT prefix won't age well, so I'd rather keep the names. It will also let us ensure any existing userspace software is fixed for 15.0. In terms of ioctl numbers, I do agree with phk@ that we have lots of numbers, so might as well err on the side of caution and pick new values.

I thought about this, and I'm glad you brought it up. I erred on the side of maintaining source compatibility, in case other watchdog daemons exist in the wild. However, if we're not guaranteeing source compatibility, only binary compatibility, then I'm content with migrating over, and dropping the _SBT. I will, at the least, change the ioctl numbers.

In D47312#1110676, @phk wrote:

I seem to have watchdogs only in old 32 bit hardware, so I am unfortunately unable to test this patch.

I don't quite understand this. sbintime_t is always 64 bits. I think just regression testing would be sufficient on any hardware (showing that it doesn't break), which should be fine with any hardware.

sys/dev/watchdog/watchdog.c
147

Yes, this comment belongs down below where it's actually set. I moved code around, and forgot to move the comment with it.

thanks for doing this!

Honestly, I'd just suggest making new ioctls, and it's up to you if you want to keep backwards compatibility.