
mpr/mps: Fix a race in diagnostic reset
ClosedPublic

Authored by imp on Jan 24 2022, 8:39 PM.
Tags
None
Referenced Files
F95183716: D34017.id.diff
Thu, Sep 19, 5:16 PM
Subscribers
None

Details

Summary

There's a small race in freezing the simq when performing a diagnostic
reset. During this time, a transaction can slip through and encounter
a target id of 0. If we're still in diagnostic reset when we detect
this, don't say the device isn't there. Instead, freeze the queue and
return a requeue status, similar to what we do when we're resetting
a target and a transaction gets here.
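
The decision described above can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the driver code:
every name here (model_target, model_queue, model_ccb, model_action_scsiio,
and the MODEL_* statuses) is hypothetical and merely stands in for the
driver's target state, the frozen queue, and the completion status.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for completion statuses; not the driver's real values. */
enum model_status {
	MODEL_STATUS_NONE = 0,
	MODEL_DEV_NOT_THERE,	/* device reported as absent */
	MODEL_REQUEUE,		/* request should be retried later */
	MODEL_SUBMITTED		/* request passed on to the controller */
};

/* Hypothetical per-target state. */
struct model_target {
	unsigned id;		/* 0 while the target has not been re-established */
	bool in_diag_reset;	/* true while a diagnostic reset is in progress */
};

/* Hypothetical queue with a freeze counter. */
struct model_queue {
	int freeze_count;
};

/* Hypothetical request; only carries the completion status. */
struct model_ccb {
	enum model_status status;
};

/*
 * Model of the check: a request that sees a target id of 0 during a
 * diagnostic reset freezes the queue and is completed with a requeue
 * status instead of being failed as "device not there".
 */
static enum model_status
model_action_scsiio(struct model_target *targ, struct model_queue *q,
    struct model_ccb *ccb)
{
	assert(targ != NULL && q != NULL && ccb != NULL);

	if (targ->id == 0) {
		if (targ->in_diag_reset) {
			/* Raced with the reset: freeze and ask for a retry. */
			q->freeze_count++;
			ccb->status = MODEL_REQUEUE;
		} else {
			/* No reset in progress: the device really is gone. */
			ccb->status = MODEL_DEV_NOT_THERE;
		}
		return (ccb->status);
	}

	/* Normal path: the target is known, so the request proceeds. */
	ccb->status = MODEL_SUBMITTED;
	return (ccb->status);
}
```

In this model, a request that arrives with id 0 while in_diag_reset is set
bumps freeze_count once and comes back as MODEL_REQUEUE, so the caller
retries it after the reset rather than treating the device as missing.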

Sponsored by: Netflix

Test Plan

This race would be hit in about 1-2% of diagnostic resets, though some scenarios are more likely than others.
A heavily loaded system with an lsiutil-induced diag reset would see it, but at a much lower rate.
Some IOC Fault-caused resets would see a hit rate of closer to 10%.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 44068
Build 40956: arc lint + arc unit

Event Timeline

imp requested review of this revision. Jan 24 2022, 8:39 PM
imp added reviewers: scottl, ken, mav.

Yes, unfortunately the separation of the SIM and queue locks created this race window.

This revision is now accepted and ready to land. Jan 25 2022, 2:10 AM
This revision was automatically updated to reflect the committed changes.