Page MenuHomeFreeBSD

shutdown: audit shutdown_post_sync event callbacks
ClosedPublic

Authored by mhorne on Oct 23 2023, 8:32 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, May 22, 7:51 PM
Unknown Object (File)
Fri, May 10, 11:26 PM
Unknown Object (File)
Wed, May 8, 9:23 AM
Unknown Object (File)
Wed, May 8, 9:23 AM
Unknown Object (File)
Wed, May 8, 9:22 AM
Unknown Object (File)
Wed, May 8, 6:53 AM
Unknown Object (File)
Apr 29 2024, 2:43 AM
Unknown Object (File)
Apr 22 2024, 2:33 AM
Subscribers

Details

Summary

Ensure they are all panic/debugger safe.

Most handlers for this event are for disk drivers/geom modules. There
are a mix of checks being used here (or not), so let's standardize on
checking the presence of the RB_NOSYNC flag.

This flag is set whenever:

  1. The kernel has panicked and kern.sync_on_panic=0*
  2. We reboot from within the kernel debugger (the "reset" command)
  3. Userspace requested it, e.g. by 'reboot -n'

Name the functions consistently.

*This sysctl is tuned to zero by default, but its existence means that
these handlers can be executed after a panic, at the user's discretion.
IMO this use-case is implicitly understood to be risky, and we'd be
better off eliminating it altogether.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

The changes that follow this are basically the result of a similar audit for the shutdown_pre_sync and shutdown_final handlers. Since they are numerous, I split them into individual changes.

I'm not too sure about the geom ones either, but didn't flag the ones where new RB_NOSYNC was added.
Might want to add phk to the list of reviewers.
But I think we'd still want to write the updated metadata for these raid systems so we don't come up with, for example, broken mirrors after a panic all the time.

sys/dev/pst/pst-raid.c
187 ↗(On Diff #129259)

I'm pretty sure this is wrong.
Shutting down the controller is different than doing a sync(2).
It's needed to prevent data loss I believe as well as put the hardware in a known good state for reboot (including things that might keep it from rebooting, though I don't know if that's true for this).

In D42337#966249, @imp wrote:

I'm not too sure about the geom ones either, but didn't flag the ones where new RB_NOSYNC was added.

We had to make similar changes for at least g_mirror, i.e., we need to avoid doing anything after a panic. In general we can't sync metadata to disk after a panic, since that usually requires a working scheduler.

Might want to add phk to the list of reviewers.
But I think we'd still want to write the updated metadata for these raid systems so we don't come up with, for example, broken mirrors after a panic all the time.

What exactly is the problematic scenario? It would still arise after, e.g., a triple fault.

sys/dev/pst/pst-raid.c
187 ↗(On Diff #129259)

iop_queue_wait_msg() will sleep in this case, so I'm not sure how this can work.

Echoing what Mark notes, in looking at each of these geom cases I observe that their code-paths rely on a functioning scheduler. Calls to wakeup() when the scheduler is stopped will panic recursively, and aside from this there are loops which would never terminate as they wait for the worker threads to exit.

See for example g_journal_destroy() or g_raid3_destroy().

This change at least makes the various geom classes behave consistently w.r.t. these shutdown hooks and RB_NOSYNC.

The geom ones are fine.
I'd be tempted to skip the wait for the threads code be skipped when the scheduler is stopped explicitly rather than the rb_nosync flag being set in boot howtoo. The latter seems a big magic... but that's likely my experience with nvme and the paths i took there... But I'm struggling to articulate which tests to use if we were to document this more explicitly somewhere.

What exactly is the problematic scenario? It would still arise after, e.g., a triple fault.

Yes... but I don't want to turn a simple panic when we have writes in flight for some but not all of the mirrors... though that likely requires more careful thought... and maybe adding a BIO_POLLED flag to tell the lower layers to complete the IO in an envelope without the scheduler... as a generalization of dump... but at $WORK we are moving away from gmirror so I'm not super fussy about this assuming we don't get an automatic broken mirror on panic...

Improve pst(4) handling. If the scheduler is stopped, mask all interrupts,
which will instruct iop_queue_wait_msg() to use a DELAY loop instead of a
sleep. This way we always perform device shutdown.

While here use the named constant for the relevant interrupt bit.

sys/dev/pst/pst-iop.c
448–461 ↗(On Diff #130396)

We will execute this block in the SCHEDULER_STOPPED() case.

Seemsgood?

markj added inline comments.
sys/dev/pst/pst-iop.c
448–461 ↗(On Diff #130396)

Yes, this looks like it'd work.

This revision is now accepted and ready to land.Nov 23 2023, 2:55 PM