Page MenuHomeFreeBSD

Add two new tunables / sysctls to controll reboot after panic
ClosedPublic

Authored by imp on Nov 11 2017, 5:59 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 8, 4:30 PM
Unknown Object (File)
Mon, Nov 4, 1:02 PM
Unknown Object (File)
Sat, Nov 2, 2:12 AM
Unknown Object (File)
Thu, Oct 31, 3:14 PM
Unknown Object (File)
Oct 9 2024, 11:32 AM
Unknown Object (File)
Oct 6 2024, 8:36 PM
Unknown Object (File)
Oct 2 2024, 12:18 PM
Unknown Object (File)
Oct 2 2024, 1:11 AM
Subscribers
None

Details

Summary

kern.poweroff_on_panic which, when enabled, instructs a system to
power off on a panic instead of a reboot.

kern.powercyle_on_panic which, when enabled, instructs a system to
power cycle, if possible, on a panic instead of a reboot.

Sponsored by: Netflix

Test Plan

Set them and force a panic :)

Why would we want this? We have a hardware problem that happens rarely with one of the drives we use. We've not been able to catch it misbehaving. So, we want to power off when we detect that it misbehaves (times out) so that the circular buffer in the drive doesn't fill with junk and obliterate the data the vendor is interested in. This is easy enough with devd + the timeout code we've done. However, there's a hitch: sometimes we panic due to the I/O error. So, when we're running the tests, we'll set kern.poweroff_on_panic and then we won't have the system reboot, but power off and entomb the data we need. We'll have the logs, do we'll know the offending drive (which has to be physically removed from the system and returned to the vendor), and a quick boot to get them if we have a console failure won't be 'too much' to overwhelm the log. OK, that's likely TMI, but should describe the problem I'm trying to solve well enough to judge the merits of this patch.

I added the powercycle one because that's the other way to reboot, and it's a tiny cost in code and might prove useful.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 12643
Build 12914: arc lint + arc unit

Event Timeline

Update to include power cycle code... <doh>

imp added reviewers: kib, scottl, jhb, rwatson, cem.
sys/kern/kern_shutdown.c
138

we have SYSCTL_BOOL now, might as well use it.

May be, add a sysctl/tunable which takes the RB_XXX flags directly, instead of the per-flag variable, to be acted upon on panic ?

In D13042#271028, @kib wrote:

May be, add a sysctl/tunable which takes the RB_XXX flags directly, instead of the per-flag variable, to be acted upon on panic ?

I thought of that but rejected it. (a) only a couple of flags make sense and (b) this is an easier interface to explain. RB_POWEROFF is 0x4000 and RB_POWERCYCLE is 0x400000 which would make setting it difficult. Was that 3 0's or 4? or 6?

sys/kern/kern_shutdown.c
138

Sure. With appropriate changes to the types.

This revision is now accepted and ready to land.Nov 13 2017, 5:21 PM

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

In D13042#271831, @jhb wrote:

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

As implemented, POWERCYCLE takes precedence over POWEROFF which takes precedence over HALT which takes precedence over AUTOBOOT, unless autoboot fails, in which case we halt. It isn't documented that this should be the case, and I only implemented it this way because I wanted the ability to have the thing that power cycles fail and then have it reboot or possibly power off. Though the system call can handle all these possibilities, I'm not sure the variations on halt/reboot/shutdown can specify this much nuance (though the fallback to reboot or halt is possible, not sure that you can power cycle with a fallback to power off.

Having said that, I'm cool with an enum thing, but didn't recall that SYSCTL_PROC settings actually could be tuneables. That would make it easier to use, perhaps (assuming we spell power-off and power-cycle either both with or both without - and allow both forms), but this is a rare enough case I worry about too much polish getting in the way. So I'm inclined to push back a little to see how important you think your proposed interface is....

In D13042#271835, @imp wrote:
In D13042#271831, @jhb wrote:

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

As implemented, POWERCYCLE takes precedence over POWEROFF which takes precedence over HALT which takes precedence over AUTOBOOT, unless autoboot fails, in which case we halt. It isn't documented that this should be the case, and I only implemented it this way because I wanted the ability to have the thing that power cycles fail and then have it reboot or possibly power off. Though the system call can handle all these possibilities, I'm not sure the variations on halt/reboot/shutdown can specify this much nuance (though the fallback to reboot or halt is possible, not sure that you can power cycle with a fallback to power off.

Having said that, I'm cool with an enum thing, but didn't recall that SYSCTL_PROC settings actually could be tuneables. That would make it easier to use, perhaps (assuming we spell power-off and power-cycle either both with or both without - and allow both forms), but this is a rare enough case I worry about too much polish getting in the way. So I'm inclined to push back a little to see how important you think your proposed interface is....

With the changes Hans made to invoke the sysctl node handler for CTLFLAG_TUN, SYSCTL_PROC should work fine. We also have other tri-state type nodes that accept an integer and define their meaning in the description available via 'sysctl -d' (e.g. vfs.timestamp_precision). I think an enum is probably simpler for an admin to wrap their head around.

OK. I'm going to push this in as-is, since I've tested it and an confident.
I'll also make it an enum in a future commit.
And do that via a sysctl API that lets one define ENUMs with fixed mapping like this, including tunable support.

This revision was automatically updated to reflect the committed changes.