Fix error recovery behavior in the pass(4) driver.

Description

After FreeBSD SVN revision 236814, the pass(4) driver changed from
only doing error recovery when the CAM_PASS_ERR_RECOVER flag was
set on a CCB to sometimes doing error recovery if the passed in
retry count was non-zero.

Error recovery would happen if two conditions were met:

  1. The error recovery action was simply a retry. (Which is most cases.)
  2. The retry_count is non-zero. (Which happened a lot because of cut-and-pasted code.)
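
The two conditions above can be written as a small predicate. This is a
simplified userland model for illustration, not code from the driver: the
model_* names, the flag value, and the action enum are all hypothetical
stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CAM_PASS_ERR_RECOVER; not the kernel's value. */
#define MODEL_PASS_ERR_RECOVER 0x01u

/* Hypothetical classification of what the error handler decides to do. */
enum model_action {
	MODEL_ACTION_NONE,
	MODEL_ACTION_RETRY,
	MODEL_ACTION_OTHER
};

/* Only the two CCB fields this description talks about. */
struct model_ccb {
	uint32_t flags;
	uint32_t retry_count;
};

/*
 * Returns true when recovery happens even though the caller did not
 * ask for it: the flag is clear, the recovery action is a plain retry,
 * and the retry count is non-zero.
 */
bool
model_unrequested_recovery(const struct model_ccb *ccb,
    enum model_action action)
{
	assert(ccb != NULL);
	if ((ccb->flags & MODEL_PASS_ERR_RECOVER) != 0)
		return (false);	/* the caller asked for recovery */
	return (action == MODEL_ACTION_RETRY && ccb->retry_count != 0);
}
```

In this model, a CCB with the flag clear and a retry count of 1 gets
retried, while the same CCB with a retry count of 0 does not.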

This explains a bug I noticed with camcontrol:

camcontrol tur da34 -v

Unit is ready

camcontrol reset da34

Reset of 1:172:0 was successful

At this point, there should be a Unit Attention:

camcontrol tur da34 -v

Unit is ready

No Unit Attention.

Try it again:

camcontrol reset da34

Reset of 1:172:0 was successful

Now set the retry_count to 0 for the TUR:

camcontrol tur da34 -v -C 0

Unit is not ready
(pass42:mps1:0:172:0): TEST UNIT READY. CDB: 00 00 00 00 00 00
(pass42:mps1:0:172:0): CAM status: SCSI Status Error
(pass42:mps1:0:172:0): SCSI status: Check Condition
(pass42:mps1:0:172:0): SCSI sense: UNIT ATTENTION asc:29,2 (SCSI bus reset occurred)
(pass42:mps1:0:172:0): Field Replaceable Unit: 2

There is the unit attention. camcontrol(8) has a default
retry_count of 1, in case someone sets the -E flag without
setting -C.

The CAM_PASS_ERR_RECOVER behavior was only broken with the
CAMIOCOMMAND ioctl, which is the synchronous pass(4) API. It has
worked as intended (error recovery is only done when the flag
is set) in the asynchronous API (CAMIOQUEUE ioctl).

sys/cam/scsi/scsi_pass.c:
In passsendccb(), when calling cam_periph_runccb(), only
specify the error routine when CAM_PASS_ERR_RECOVER is set.
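
The shape of that change can be sketched as below. This is a standalone
illustration, not the code in scsi_pass.c: the sketch_* names and the flag
value are hypothetical, and it assumes that passing a NULL error routine
means no recovery is attempted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CAM_PASS_ERR_RECOVER; not the kernel's value. */
#define SKETCH_PASS_ERR_RECOVER 0x01u

/* Hypothetical signature for an error routine. */
typedef int (*sketch_error_fn)(uint32_t ccb_flags);

/* Placeholder for the driver's error routine; does nothing here. */
int
sketch_passerror(uint32_t ccb_flags)
{
	(void)ccb_flags;
	return (0);
}

/*
 * Choose the error routine to hand to the function that runs the CCB.
 * Only a caller that set the recovery flag gets an error routine;
 * everyone else gets NULL and sees the raw error.
 */
sketch_error_fn
sketch_pick_error_routine(uint32_t ccb_flags)
{
	return ((ccb_flags & SKETCH_PASS_ERR_RECOVER) != 0 ?
	    sketch_passerror : NULL);
}
```

With this selection, the retry count alone no longer influences whether
recovery runs.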

share/man/man4/pass.4:
Document that CAM_PASS_ERR_RECOVER is needed to enable
error recovery.

Reported by: Terry Kennedy <TERRY@glaver.org>
PR: kern/218572
MFC after: 1 week
Sponsored by: Spectra Logic

Details

Provenance
Author: ken
Parents
rS317774: Add the ability to rescan or reset devices specified by peripheral