922c675756
CAM_SEL_TIMEOUT was introduced in https://reviews.freebsd.org/D7521 (r304251), which claimed: "VM shall response to CAM layer with CAM_SEL_TIMEOUT to filter those invalid LUNs. Never use CAM_DEV_NOT_THERE which will block LUN scan for LUN number higher than 7." But it turns out this is not correct: I think what really filters the invalid LUNs in r304251 is that: before r304251, we could set the CAM_REQ_CMP without checking vm_srb->srb_status at all: ccb->ccb_h.status |= CAM_REQ_CMP. r304251 checks vm_srb->srb_status and sets ccb->ccb_h.status properly, so the invalid LUNs are filtered. I changed my code version to r304251 but replaced the CAM_SEL_TIMEOUT with CAM_DEV_NOT_THERE, and I confirmed the invalid LUNs can also be filtered, and I successfully hot-added and hot-removed 8 disks to/from the VM without any issue. CAM_SEL_TIMEOUT has an unwanted side effect -- see cam_periph_error(): For a selection timeout, we consider all of the LUNs on the target to be gone. If the status is CAM_DEV_NOT_THERE, then we only get rid of the device(s) specified by the path in the original CCB. This means: for a VM with a valid LUN on 3:0:0:0, when the VM inquires 3:0:0:1 and the host reports 3:0:0:1 doesn't exist and storvsc returns CAM_SEL_TIMEOUT to the CAM layer, CAM will detech 3:0:0:0 as well: this is the bug I reported recently: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=226583 PR: 226583 Reviewed by: mav MFC after: 1 week Sponsored by: Microsoft Differential Revision: https://reviews.freebsd.org/D14690