
Complete the failsafe outstanding commands to the device later in the process.
Abandoned (Public)

Authored by imp on Feb 19 2020, 3:33 PM.
Tags
None
Subscribers
None

Details

Reviewers
slm
scottl
ken
Summary

r355056 marked the outstanding commands busy before completing them. This
got around the panic caused by the state machine being incorrect. However, it led
to double completions of the commands. After we sent MPI2_SAS_OP_REMOVE_DEVICE,
the IOC would still send back the normal completion records for them, so we were
racing that process. The race produced later panics with the command in the wrong
state (usually while trying to allocate a busy command).

Instead, move the fail-safe completion of the outstanding commands from just
after we send MPI2_SAS_OP_REMOVE_DEVICE to that command's successful completion
routine. This eliminates the race.
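
To illustrate the ordering argument, here is a toy model in C. It is only a sketch and not driver code: the state names, struct cmd, and every function in it are hypothetical, and it assumes the IOC returns all of its normal completions before the REMOVE_DEVICE request itself completes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command states; these are not the driver's definitions. */
enum cm_state { CM_FREE, CM_INQUEUE };

struct cmd {
	enum cm_state state;	/* driver's view of the command */
	int ioc_holds;		/* IOC still owes a normal completion */
	int completions;	/* how many times the command was completed */
};

#define NCMDS 4

/* Complete one command and record that it happened. */
static void
complete_cmd(struct cmd *cm)
{
	cm->completions++;
	cm->state = CM_FREE;
}

/* Fail-safe pass: complete everything the driver still thinks is queued. */
static void
failsafe_complete(struct cmd *cmds, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (cmds[i].state == CM_INQUEUE)
			complete_cmd(&cmds[i]);
}

/* The IOC returns a normal completion for every command it still holds. */
static void
ioc_return_completions(struct cmd *cmds, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (cmds[i].ioc_holds) {
			cmds[i].ioc_holds = 0;
			complete_cmd(&cmds[i]);
		}
	}
}

/*
 * Run one device removal and return how many commands were completed
 * more than once. failsafe_early != 0 models running the fail-safe pass
 * right after the remove request is sent; 0 models running it from the
 * remove request's completion routine (assumed to run after the IOC has
 * returned its normal completions).
 */
int
count_double_completions(int failsafe_early)
{
	struct cmd cmds[NCMDS];
	int doubles = 0;

	for (size_t i = 0; i < NCMDS; i++)
		cmds[i] = (struct cmd){ CM_INQUEUE, 1, 0 };

	/* The remove-device request is sent at this point. */
	if (failsafe_early)
		failsafe_complete(cmds, NCMDS);
	ioc_return_completions(cmds, NCMDS);
	if (!failsafe_early)
		failsafe_complete(cmds, NCMDS);

	for (size_t i = 0; i < NCMDS; i++)
		if (cmds[i].completions > 1)
			doubles++;
	return (doubles);
}
```

In this model, count_double_completions(1) returns 4 (every command is completed twice) and count_double_completions(0) returns 0. The result depends entirely on the ordering assumption stated above; nothing in the sketch enforces it.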

With this change, I'm now able to turn off the PHY on very active disks without
it leading to a panic. Prior to this change, a moderate amount of traffic to the
device was all that was required to trigger a panic every time.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 29488
Build 27358: arc lint + arc unit

Event Timeline

Note: the manual for this card is a bit ambiguous on this point, but the implications are there...

Turns out, this is bogus.

Why? Because we need to just wait for the commands to finish. We also need to fail them in a way that makes sure they don't retry. We have to single-step things, and this change doesn't address that. It just moves the race.