
nvme: fix a race between failing the controller and failing requests

Description

Part of nvme error recovery is resetting the card. Sometimes the reset
ends up failing the entire controller. When nda is in use, we then free
the sim, which sleeps until all outstanding I/O has completed. However,
with only one thread, the request fail task never runs once the reset
thread sleeps there, so the I/O never completes. Create two threads so
that I/O can keep failing until it is all processed and the reset task
can proceed.
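
The starvation described above can be illustrated with a small userland sketch. This is not the driver code: it assumes a worker pool standing in for the nvme task threads, and the names `run_recovery`, `reset_controller` and `fail_requests` are hypothetical. The timeout exists only so that the single-thread case terminates; the real reset thread would sleep indefinitely.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_recovery(num_threads, timeout=0.5):
    """Return True if the simulated reset task sees all I/O drained."""
    # Set once every outstanding request has been failed.
    drained = threading.Event()

    def fail_requests():
        # Stand-in for the request fail task: completes the I/O with an error.
        drained.set()

    def reset_controller():
        # Stand-in for the reset task: sleeps until the I/O has drained,
        # as freeing the sim does. The timeout is a demo-only escape hatch.
        return drained.wait(timeout)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        reset = pool.submit(reset_controller)  # occupies one worker
        pool.submit(fail_requests)             # needs a second worker to run
        return reset.result()


if __name__ == "__main__":
    print("1 thread :", run_recovery(1))
    print("2 threads:", run_recovery(2))
```

With one worker, the fail task sits in the queue behind the sleeping reset task, so the wait can only end by timing out. With two workers, the fail task runs alongside it and the reset task proceeds.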

This is a temporary kludge until I can work out the questions that arose
during the review, not least of which is what race queueing to a failure
task was meant to solve. The original commit is vague, and other error
paths in the same context fail requests directly. I'll investigate that
more completely before committing a change to direct failure. mav@ raised
this issue during the review, but didn't otherwise object.

Multiple threads, though, solve the problem in the meantime, until a
better fix can be worked out.

Reviewed by: jhb@
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D30366

Details

Provenance
imp Authored on May 29 2021, 5:01 AM
Differential Revision
D30366: nvme: fix a race between failing the controller and failing requests
Parents
rGe5f5b6a75c0a: .github: Attempt to un-break Clang 9 action