Page MenuHomeFreeBSD

Add two new tunables / sysctls to controll reboot after panic
ClosedPublic

Authored by imp on Nov 11 2017, 5:59 AM.

Details

Summary

kern.poweroff_on_panic which, when enabled, instructs a system to
power off on a panic instead of a reboot.

kern.powercyle_on_panic which, when enabled, instructs a system to
power cycle, if possible, on a panic instead of a reboot.

Sponsored by: Netflix

Test Plan

Set them and force a panic :)

Why would we want this? We have a hardware problem that happens rarely with one of the drives we use. We've not been able to catch it misbehaving. So, we want to power off when we detect that it misbehaves (times out) so that the circular buffer in the drive doesn't fill with junk and obliterate the data the vendor is interested in. This is easy enough with devd + the timeout code we've done. However, there's a hitch: sometimes we panic due to the I/O error. So, when we're running the tests, we'll set kern.poweroff_on_panic and then we won't have the system reboot, but power off and entomb the data we need. We'll have the logs, do we'll know the offending drive (which has to be physically removed from the system and returned to the vendor), and a quick boot to get them if we have a console failure won't be 'too much' to overwhelm the log. OK, that's likely TMI, but should describe the problem I'm trying to solve well enough to judge the merits of this patch.

I added the powercycle one because that's the other way to reboot, and it's a tiny cost in code and might prove useful.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

imp created this revision.Nov 11 2017, 5:59 AM
imp updated this revision to Diff 35079.Nov 11 2017, 6:03 AM

Update to include power cycle code... <doh>

imp edited the test plan for this revision. (Show Details)Nov 11 2017, 6:11 AM
imp added reviewers: kib, scottl, jhb, rwatson, cem.
cem added inline comments.Nov 11 2017, 7:04 AM
sys/kern/kern_shutdown.c
138 ↗(On Diff #35079)

we have SYSCTL_BOOL now, might as well use it.

kib added a comment.Nov 11 2017, 9:17 AM

May be, add a sysctl/tunable which takes the RB_XXX flags directly, instead of the per-flag variable, to be acted upon on panic ?

imp added a comment.Nov 11 2017, 4:35 PM
In D13042#271028, @kib wrote:

May be, add a sysctl/tunable which takes the RB_XXX flags directly, instead of the per-flag variable, to be acted upon on panic ?

I thought of that but rejected it. (a) only a couple of flags make sense and (b) this is an easier interface to explain. RB_POWEROFF is 0x4000 and RB_POWERCYCLE is 0x400000 which would make setting it difficult. Was that 3 0's or 4? or 6?

imp added inline comments.Nov 11 2017, 4:37 PM
sys/kern/kern_shutdown.c
138 ↗(On Diff #35079)

Sure. With appropriate changes to the types.

scottl accepted this revision.Nov 13 2017, 5:21 PM
This revision is now accepted and ready to land.Nov 13 2017, 5:21 PM
jhb added a comment.Nov 13 2017, 6:53 PM

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

imp added a comment.Nov 13 2017, 7:06 PM
In D13042#271831, @jhb wrote:

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

As implemented, POWERCYCLE takes precedence over POWEROFF which takes precedence over HALT which takes precedence over AUTOBOOT, unless autoboot fails, in which case we halt. It isn't documented that this should be the case, and I only implemented it this way because I wanted the ability to have the thing that power cycles fail and then have it reboot or possibly power off. Though the system call can handle all these possibilities, I'm not sure the variations on halt/reboot/shutdown can specify this much nuance (though the fallback to reboot or halt is possible, not sure that you can power cycle with a fallback to power off.

Having said that, I'm cool with an enum thing, but didn't recall that SYSCTL_PROC settings actually could be tuneables. That would make it easier to use, perhaps (assuming we spell power-off and power-cycle either both with or both without - and allow both forms), but this is a rare enough case I worry about too much polish getting in the way. So I'm inclined to push back a little to see how important you think your proposed interface is....

jhb added a comment.Nov 13 2017, 9:17 PM
In D13042#271835, @imp wrote:
In D13042#271831, @jhb wrote:

This is probably fine. You could also perhaps have a variant of kib@'s suggestion that takes a string: kern.panic_shutdown = {"reboot", "poweroff", "power-cycle"}. Internally you would store it as a simple int / enum, but use a SYSCTL_PROC to map strings to the internal values. This also means you can't set both RB_POWEROFF and RB_POWERCYCLE (which the current patch lets you do) for which the behavior isn't clear (which one has precedence, etc.?)

As implemented, POWERCYCLE takes precedence over POWEROFF which takes precedence over HALT which takes precedence over AUTOBOOT, unless autoboot fails, in which case we halt. It isn't documented that this should be the case, and I only implemented it this way because I wanted the ability to have the thing that power cycles fail and then have it reboot or possibly power off. Though the system call can handle all these possibilities, I'm not sure the variations on halt/reboot/shutdown can specify this much nuance (though the fallback to reboot or halt is possible, not sure that you can power cycle with a fallback to power off.
Having said that, I'm cool with an enum thing, but didn't recall that SYSCTL_PROC settings actually could be tuneables. That would make it easier to use, perhaps (assuming we spell power-off and power-cycle either both with or both without - and allow both forms), but this is a rare enough case I worry about too much polish getting in the way. So I'm inclined to push back a little to see how important you think your proposed interface is....

With the changes Hans made to invoke the sysctl node handler for CTLFLAG_TUN, SYSCTL_PROC should work fine. We also have other tri-state type nodes that accept an integer and define their meaning in the description available via 'sysctl -d' (e.g. vfs.timestamp_precision). I think an enum is probably simpler for an admin to wrap their head around.

imp added a comment.Nov 13 2017, 10:59 PM

OK. I'm going to push this in as-is, since I've tested it and an confident.
I'll also make it an enum in a future commit.
And do that via a sysctl API that lets one define ENUMs with fixed mapping like this, including tunable support.

This revision was automatically updated to reflect the committed changes.