nvmf: don't set qpair->group to NULL.

The typical RDMA qpair disconnect path goes through the function
_nvmf_rdma_disconnect_retry. When this function was introduced, it was
discovered that we could receive a qpair disconnect event for a given
qpair before that qpair had been assigned to a poll group. To ensure
that the disconnect procedure completed properly, we waited on the
current thread in _nvmf_rdma_disconnect_retry for the qpair to be
assigned a poll group before we finally disconnected. See rdma.c:2250.

Since _nvmf_rdma_disconnect_retry was not necessarily called from the
poll group's thread, we relied upon the assumption that the group
variable would never be set back to NULL. See the comment at
rdma.c:2243.

However, in _spdk_nvmf_qpair_destroy we were setting the group back to
NULL. This can result in the following sequence of operations across
multiple threads, which prevents a qpair from ever being fully
destroyed.

1. thread 1: receive a disconnect event - call nvmf_rdma_disconnect.
2. thread 1: from nvmf_rdma_disconnect, call
   spdk_nvmf_rdma_qpair_inc_refcnt - setting rqpair->refcnt to 1.
3. thread 2: call spdk_nvmf_rdma_poller_poll.
4. thread 2: in spdk_nvmf_rdma_poller_poll, reap a completion with an
   error status, which causes us to call spdk_nvmf_qpair_disconnect -
   rdma.c:2846.
5. thread 2: spdk_nvmf_qpair_disconnect calls _spdk_nvmf_qpair_destroy,
   which sets qpair->group = NULL.
6. thread 1: from nvmf_rdma_disconnect we call
   _nvmf_rdma_disconnect_retry, which checks whether qpair->group ==
   NULL. If it is, we assume that the qpair has not been assigned a
   group yet and send ourselves a message to call
   _nvmf_rdma_disconnect_retry again. See rdma.c:2253.
7. thread 2: from _spdk_nvmf_qpair_destroy we call
   spdk_nvmf_transport_qpair_fini, which results in a call to
   spdk_nvmf_rdma_close_qpair, which sends dummy send and recv requests
   to the qpair.
8. thread 2: we call poller_poll and get completions for both the send
   and recv dummy requests. This results in a call to
   spdk_nvmf_rdma_qpair_destroy.
9. thread 2: spdk_nvmf_rdma_qpair_destroy checks rqpair->refcnt, and
   when it sees that it is not 0 (see step 2 above) it returns without
   freeing the resources. See rdma.c:629.
10. thread 1: we keep churning in _nvmf_rdma_disconnect_retry, sending
    ourselves messages, because rqpair->group is going to stay NULL.
    Thread 1 never reaches line 2257, where it sends a message to call
    _nvmf_rdma_qpair_disconnect. _nvmf_rdma_qpair_disconnect is the
    function that decreases rqpair->refcnt and allows us to make
    forward progress on destroying the qpair.

I encountered this issue while trying to disconnect from our target
using the kernel initiator with an x722 NIC. I think the timing for
this bug comes out with that specific configuration because some of
the calls in the disconnect path on thread 1 fail, causing it to take
longer and giving the second thread a chance to delete the qpair.

There are really two issues at play here. We don't have a single point
of entry for disconnecting RDMA qpairs, and we rely on the qpair->group
variable never being set back to NULL. This patch addresses the second
issue, and the next patch in the series addresses the first.

Change-Id: I65395d0bbb67edfa7bad2ddc70906606c3d83781
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>

parent 0b4da45dbc
commit ceb32abbd8

@@ -739,7 +739,6 @@ _spdk_nvmf_qpair_destroy(void *ctx, int status)
 	}
 
 	TAILQ_REMOVE(&qpair->group->qpairs, qpair, link);
-	qpair->group = NULL;
 
 	spdk_nvmf_transport_qpair_fini(qpair);
 
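
The interleaving described in the commit message can be reproduced with a small single-threaded model. This is only a sketch under stated assumptions, not SPDK code: the `model_*` types and functions are hypothetical stand-ins, the two threads are replaced by a fixed call order, the message-passing retry is replaced by a bounded loop, and the "later destroy attempt" after the reference is dropped is a simplification of whatever the real disconnect path does next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the SPDK structures. */
struct model_group {
	int id;
};

struct model_qpair {
	struct model_group *group; /* NULL until assigned to a poll group */
	int refcnt;                /* reference held by a pending disconnect */
	bool freed;                /* set once resources would be released */
};

/* Thread 1, steps 1-2: a disconnect event takes a reference. */
static void model_disconnect_event(struct model_qpair *qp)
{
	qp->refcnt++;
}

/* Thread 2, steps 5-9: the destroy path. With reset_group set it mimics
 * the old behavior of clearing the group pointer. It only frees when no
 * reference is outstanding. */
static void model_destroy(struct model_qpair *qp, bool reset_group)
{
	if (reset_group) {
		qp->group = NULL;
	}
	if (qp->refcnt == 0) {
		qp->freed = true;
	}
}

/* Thread 1, steps 6 and 10: one pass of the retry function. Returns
 * false when it would requeue itself, true when it dropped the
 * reference. */
static bool model_disconnect_retry(struct model_qpair *qp)
{
	if (qp->group == NULL) {
		return false; /* assumes "not assigned yet" and tries again later */
	}
	assert(qp->refcnt > 0);
	qp->refcnt--;
	return true;
}

/* Replays the interleaving with a bounded number of retries and reports
 * whether the qpair ends up freed. */
static bool model_run(bool reset_group, int max_retries)
{
	struct model_group group = { 0 };
	struct model_qpair qp = { &group, 0, false };

	model_disconnect_event(&qp);     /* steps 1-2 */
	model_destroy(&qp, reset_group); /* steps 5-9: refcnt != 0, nothing freed */

	for (int i = 0; i < max_retries; i++) { /* steps 6 and 10 */
		if (model_disconnect_retry(&qp)) {
			/* Assumed: a later destroy attempt now sees refcnt == 0. */
			model_destroy(&qp, reset_group);
			break;
		}
	}
	return qp.freed;
}
```

In this model, `model_run(true, n)` never frees the qpair for any `n`, because the retry keeps seeing a NULL group, while `model_run(false, n)` frees it on the first retry. That is the behavior the one-line removal in the diff is meant to restore.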