
mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors
Closed, Public

Authored by asomers on May 4 2016, 11:23 PM.
Tags
None
Referenced Files
F105001629: D6210.diff
Wed, Dec 11, 12:57 PM
Details

Summary

Change the mps(4) and mpr(4) drivers to decrement the CCB retry count when
they receive a "terminated ioc" type error.

Revision 218812 changed mps(4) to unconditionally retry these types of
errors, in the belief that they always reflected transient transport-related
errors. But some Seagate SMR drives cause both mpr and mps controllers to
persistently return these types of errors. Retrying indefinitely can keep the
probe from ever finishing and block the boot.

So, instead of returning CAM_REQUEUE_REQ, return CAM_REQ_CMP_ERR. This will
tell the CAM error recovery code to decrement the retry count, so the probe
will fail and move on. We're taking the risk of false drive failures when
we run into topology issues, but that's better than blocking the boot due to
a drive failure.

sys/dev/mpr/mpr_sas.c,
sys/dev/mps/mps_sas.c:
In mprsas_scsiio_complete() and mpssas_scsiio_complete(), return
CAM_REQ_CMP_ERR for LSI's status values
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and
MPI2_IOCSTATUS_SCSI_EXT_TERMINATED. Add comments explaining the
reasons and the tradeoff.
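
To illustrate the difference between the two return values, here is a minimal, self-contained sketch. It is not the driver code and not the real CAM error-recovery path: every name in it (`ioc_status`, `cam_result`, `map_status`, `attempts_until_done`, the `patched` flag, and the `max_attempts` cap) is a hypothetical stand-in invented for this example. It assumes a simplified model in which a requeue result retries without touching the retry budget, while an error result consumes one retry per attempt.

```c
#include <assert.h>

/*
 * Hypothetical stand-ins for illustration only. The real constants are
 * MPI2_IOCSTATUS_SCSI_IOC_TERMINATED / MPI2_IOCSTATUS_SCSI_EXT_TERMINATED
 * and CAM_REQUEUE_REQ / CAM_REQ_CMP_ERR; their values are not used here.
 */
enum ioc_status { IOC_OK, IOC_TERMINATED, EXT_TERMINATED };
enum cam_result { RES_CMP, RES_CMP_ERR, RES_REQUEUE };

/* Map a controller status to a CAM-style result, before or after the change. */
static enum cam_result
map_status(enum ioc_status st, int patched)
{
	switch (st) {
	case IOC_TERMINATED:
	case EXT_TERMINATED:
		/* Old behavior: always requeue. New behavior: report an error. */
		return (patched ? RES_CMP_ERR : RES_REQUEUE);
	default:
		return (RES_CMP);
	}
}

/*
 * Toy recovery loop (assumed model): count how many times a command is
 * issued when it completes with the same status every time. max_attempts
 * only exists to keep the unpatched case from looping forever in the demo.
 */
static int
attempts_until_done(enum ioc_status st, int patched, int retry_count,
    int max_attempts)
{
	int attempts = 0;

	assert(retry_count >= 0 && max_attempts > 0);
	while (attempts < max_attempts) {
		enum cam_result r;

		attempts++;
		r = map_status(st, patched);
		if (r == RES_CMP)
			break;		/* success: done */
		if (r == RES_REQUEUE)
			continue;	/* requeued: retry budget untouched */
		if (retry_count == 0)
			break;		/* budget exhausted: give up */
		retry_count--;		/* error: consume one retry */
	}
	return (attempts);
}
```

Under this model, a persistently failing command with a retry budget of 4 is issued 5 times when `patched` is nonzero, and runs until the artificial `max_attempts` cap when `patched` is zero, which mirrors the indefinite retry the summary describes.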

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

asomers retitled this revision to mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors.
asomers updated this object.
asomers edited the test plan for this revision. (Show Details)
asomers added a reviewer: slm.
asomers added a subscriber: ken.
slm edited edge metadata.
This revision is now accepted and ready to land. May 5 2016, 3:21 PM
This revision was automatically updated to reflect the committed changes.