mpr/mps: Fix a race in diagnostic reset
Closed, Public

Authored by imp on Jan 24 2022, 8:39 PM.
Referenced Files
F132323555: D34017.diff

Details

Summary

There's a small race in freezing the simq when performing a diagnostic
reset. During this time, a transaction can slip through and encounter
the target id of 0. If we're still in diagnostic reset when we detect
this, don't say the device isn't there. Instead, freeze the queue and
return a requeue status, similar to what we do when we're resetting
a target and a transaction gets here.
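
The decision described above can be sketched as a small standalone model. This is not the driver code from the diff: the types, field names, and status values (struct target, struct io_request, dispatch, IO_REQUEUE, and so on) are hypothetical stand-ins, and keeping the "in diagnostic reset" state as a per-target flag is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical completion statuses; a real CAM SIM would use CAM_* codes. */
enum io_status {
	IO_SUBMITTED,		/* passed on to the controller */
	IO_DEV_NOT_THERE,	/* device reported as gone */
	IO_REQUEUE		/* caller should retry the request later */
};

/* Hypothetical per-target state (assumed layout, not the driver's). */
struct target {
	unsigned int id;	/* 0 is what a transaction sees mid-reset */
	bool in_diag_reset;	/* assumed flag: diagnostic reset in progress */
};

/* Hypothetical I/O request. */
struct io_request {
	bool queue_frozen;	/* set when this request froze the queue */
};

/*
 * Decide what to do with a request that reached the dispatch path.
 * An id of 0 during a diagnostic reset means the request slipped
 * through before the queue was frozen, so freeze and ask for a requeue
 * instead of reporting the device as missing.
 */
enum io_status
dispatch(const struct target *t, struct io_request *io)
{
	assert(t != NULL && io != NULL);

	if (t->id == 0) {
		if (t->in_diag_reset) {
			io->queue_frozen = true;
			return (IO_REQUEUE);
		}
		return (IO_DEV_NOT_THERE);
	}
	return (IO_SUBMITTED);
}
```

In this model, a request that sees an id of 0 outside of a diagnostic reset still gets the "device not there" result, so only the race window changes behavior.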

Sponsored by: Netflix

Test Plan

This race would be hit in about 1-2% of diagnostic resets, though some scenarios are more likely than others.
A heavily loaded system with an lsiutil-induced diag reset would see it, but at a much lower rate.
Some IOC Fault-caused resets would see a hit rate closer to 10%.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

imp requested review of this revision. Jan 24 2022, 8:39 PM
imp added reviewers: scottl, ken, mav.

Yes, unfortunately separation of SIM and queue locks created this race window.

This revision is now accepted and ready to land. Jan 25 2022, 2:10 AM
This revision was automatically updated to reflect the committed changes.