mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors
Closed, Public

Authored by asomers on May 4 2016, 11:23 PM.

Details

Summary

Change the mps(4) and mpr(4) drivers to decrement the CCB retry count when
they receive a "terminated ioc" type error.

Revision 218812 changed mps(4) to unconditionally retry these types of
errors, in the belief that they always reflected transient transport-related
errors. But some Seagate SMR drives cause both mpr and mps controllers to
persistently return these types of errors. Retrying indefinitely can block
the boot.

So, instead of returning CAM_REQUEUE_REQ, return CAM_REQ_CMP_ERR. This will
tell the CAM error recovery code to decrement the retry count, so the probe
will fail and move on. We're taking the risk of false drive failures when
we run into topology issues, but that's better than blocking the boot due to
a drive failure.
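The difference between the two return values can be sketched with a toy
model. This is not the CAM error recovery code: the enum, the function name
toy_probe_attempts(), and the attempt cap are all hypothetical, and the model
assumes a drive that fails every attempt.

```c
/*
 * Local stand-ins for the two CAM statuses discussed above. The real
 * definitions live in the kernel's CAM headers; these are illustrative.
 */
enum toy_cam_status {
	TOY_CAM_REQUEUE_REQ,	/* requeue, retry count left untouched */
	TOY_CAM_REQ_CMP_ERR	/* completed with error, consumes one retry */
};

/*
 * Hypothetical model of a probe against a drive that fails every attempt.
 * Returns the number of attempts made before the probe gives up, or -1 if
 * attempt_cap was reached without giving up (the stand-in for "forever").
 */
static int
toy_probe_attempts(enum toy_cam_status status, int retry_count,
    int attempt_cap)
{
	int attempts = 0;

	while (attempts < attempt_cap) {
		attempts++;
		if (status == TOY_CAM_REQUEUE_REQ)
			continue;		/* retried, nothing decremented */
		if (retry_count == 0)
			return (attempts);	/* retries exhausted, probe fails */
		retry_count--;
	}
	return (-1);
}
```

In this model, a request with four retries that always completes with
TOY_CAM_REQ_CMP_ERR gives up on the fifth attempt, while the same request
with TOY_CAM_REQUEUE_REQ only stops at the artificial cap.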

sys/dev/mpr/mpr_sas.c,
sys/dev/mps/mps_sas.c:
In mprsas_scsiio_complete() and mpssas_scsiio_complete(), return
CAM_REQ_CMP_ERR for LSI's status values
MPI2_IOCSTATUS_SCSI_IOC_TERMINATED and
MPI2_IOCSTATUS_SCSI_EXT_TERMINATED. Add comments explaining the
reasons and the tradeoff.
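The shape of the status mapping described above might look roughly like the
following. This is a standalone sketch, not the committed diff: the SKETCH_
enums and sketch_map_ioc_status() are hypothetical stand-ins for the MPI2 and
CAM definitions in the kernel headers, and the default case is an assumption.

```c
/* Hypothetical stand-ins for a few MPI2_IOCSTATUS_* values. */
enum sketch_ioc_status {
	SKETCH_IOCSTATUS_SUCCESS,
	SKETCH_IOCSTATUS_SCSI_IOC_TERMINATED,
	SKETCH_IOCSTATUS_SCSI_EXT_TERMINATED
};

/* Hypothetical stand-ins for the CAM statuses involved. */
enum sketch_cam_status {
	SKETCH_CAM_REQ_CMP,
	SKETCH_CAM_REQ_CMP_ERR
};

/* Map a controller status to the status reported to CAM. */
static enum sketch_cam_status
sketch_map_ioc_status(enum sketch_ioc_status ioc_status)
{
	switch (ioc_status) {
	case SKETCH_IOCSTATUS_SUCCESS:
		return (SKETCH_CAM_REQ_CMP);
	case SKETCH_IOCSTATUS_SCSI_IOC_TERMINATED:
	case SKETCH_IOCSTATUS_SCSI_EXT_TERMINATED:
		/*
		 * These used to be requeued unconditionally. Some drives
		 * return them persistently, so report an error instead and
		 * let the retry count run down. The tradeoff is a possible
		 * false drive failure during a topology problem.
		 */
		return (SKETCH_CAM_REQ_CMP_ERR);
	default:
		/* Assumption for this sketch: treat anything else as an error. */
		return (SKETCH_CAM_REQ_CMP_ERR);
	}
}
```

Grouping the two terminated cases under one comment keeps the reasoning and
the tradeoff next to the code that depends on them.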

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

asomers updated this revision to Diff 15907. May 4 2016, 11:23 PM
asomers retitled this revision to mpr(4) and mps(4) shouldn't indefinitely retry for "terminated ioc" errors.
asomers updated this object.
asomers edited the test plan for this revision.
asomers added a reviewer: slm.
asomers added a subscriber: ken.
slm accepted this revision. May 5 2016, 3:21 PM
slm edited edge metadata.
This revision is now accepted and ready to land. May 5 2016, 3:21 PM
This revision was automatically updated to reflect the committed changes.