Page MenuHomeFreeBSD

nvme: use config_intrhook_drain to avoid removable card races
ClosedPublic

Authored by imp on Mar 2 2021, 12:01 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Jan 22, 5:25 PM
Unknown Object (File)
Sat, Jan 18, 5:40 PM
Unknown Object (File)
Sat, Jan 18, 5:24 PM
Unknown Object (File)
Sat, Jan 18, 5:12 PM
Unknown Object (File)
Mon, Jan 13, 6:22 AM
Unknown Object (File)
Mon, Jan 13, 6:15 AM
Unknown Object (File)
Mon, Jan 13, 5:55 AM
Unknown Object (File)
Mon, Jan 13, 5:51 AM
Subscribers

Details

Summary

nvme drives are configured early in boot. However, a number of the configuration
steps takes which take a while, so we defer those to a config intrhook that runs
before the root filesystem is mounted. At the same time, the PCI hot plug wakes
up and tests the status of the card. It may decide that the card has gone away
and deletes the child. As part of that process nvme_detach is called. If this
call happens after the config_intrhook starts to run, but before it is finished,
there's a race where we can tear down the device's soft state while the
config_intrhook is still using it. Use the new config_intrhook_drain to
disestablish the hook. Either it will be removed w/o running, or the routine
will wait for it to finish. This closes the race and allows safe hotplug at any
time, even very early in boot.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 37671
Build 34560: arc lint + arc unit

Event Timeline

imp requested review of this revision.Mar 2 2021, 12:01 AM

Update to include DONE and man page

gbe added a subscriber: gbe.

LGTM for the manpage kind of things.

This revision is now accepted and ready to land.Mar 3 2021, 7:39 PM
mav added inline comments.
sys/kern/subr_autoconf.c
261 ↗(On Diff #84932)

I think alike to callout_drain() this function could return status in which it caught the hook: stopped by this call, completed naturally or already completed. It could be useful for proper cleanup. It may be not needed for present nvme_ctrlr_destruct(), but would not harm IMO.

Other than the commend above I think it may work.

This revision now requires review to proceed.Mar 4 2021, 12:03 AM
sys/kern/subr_autoconf.c
261 ↗(On Diff #84932)

You bet. Done and updated in D29005 some of which leaked through to D29006

This revision is now accepted and ready to land.Mar 4 2021, 2:34 PM
This revision now requires review to proceed.Mar 8 2021, 5:18 PM
This revision is now accepted and ready to land.Mar 11 2021, 2:51 PM