HomeFreeBSD

MFC: 332385

Description

MFC: 332385

r332385:

hyperv/storvsc: storvsc_io_done(): do not use CAM_SEL_TIMEOUT

CAM_SEL_TIMEOUT was introduced in
https://reviews.freebsd.org/D7521 (r304251), which claimed:

"VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those
invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan
for LUN number higher than 7."

But it turns out this is not correct:

I think what really filters the invalid LUNs in r304251 is that:
before r304251, we could set the CAM_REQ_CMP without checking
vm_srb->srb_status at all:
ccb->ccb_h.status |= CAM_REQ_CMP.

r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly,
so the invalid LUNs are filtered.

I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT
with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be
filtered, and I successfully hot-added and hot-removed 8 disks to/from
the VM without any issue.

CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error():
For a selection timeout, we consider all of the LUNs on
the target to be gone. If the status is CAM_DEV_NOT_THERE,
then we only get rid of the device(s) specified by the
path in the original CCB.

This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires
3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns
CAM_SEL_TIMEOUT to the CAM layer, CAM will detech 3:0:0:0 as well: this
is the bug I reported recently:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583

PR:     [[ https://bugs.freebsd.org/226583 | 226583 ]]
Reviewed by:    mav
Sponsored by:   Microsoft
Differential Revision:  https://reviews.freebsd.org/D14690

Details

Provenance
dexuanAuthored on
Parents
rS332902: pwd_mkdb: default to network (big) endian hash order
Branches
Unknown
Tags
Unknown