nvmf: Fix race condition when adding IO qpair

Similar issue was fixed in
813869d823
nvmf: Fix possible race condition when adding IO qpair

This patch fixes the same issue which occurs a bit later,
when a  message is delivered to another thread. This issue
occurred on CI, callstack is the following:

00:11:46.296  #6  0x00007f2705199f05 in __ubsan_handle_type_mismatch_v1 () from /lib64/libubsan.so.1
00:11:46.296  No symbol table info available.
00:11:46.296  #7  0x00007f27067ace6f in ctrlr_add_qpair_and_update_rsp (qpair=0x221edc0, ctrlr=0x1dc4ea0, rsp=0x2242918) at ctrlr.c:230
00:11:46.296          __PRETTY_FUNCTION__ = "ctrlr_add_qpair_and_update_rsp"
00:11:46.296          __func__ = "ctrlr_add_qpair_and_update_rsp"
00:11:46.296  #8  0x00007f27067b1d0b in nvmf_ctrlr_add_io_qpair (ctx=0x2242540) at ctrlr.c:534
00:11:46.296          req = 0x2242540
00:11:46.296          rsp = 0x2242918
00:11:46.296          qpair = 0x221edc0
00:11:46.296          ctrlr = 0x1dc4ea0
00:11:46.296          __func__ = "nvmf_ctrlr_add_io_qpair"
00:11:46.296  #9  0x00007f27062553ce in msg_queue_run_batch (thread=0x1cff540, max_msgs=8) at thread.c:553

where line 230 in ctrlr.c was
assert(ctrlr->admin_qpair->group->thread == spdk_get_thread());
That means that admin qpair was disconnected from the poll
group and controller is in the process of destruction

Change-Id: I818ba56adda5ed3488a8df78483c0b6839758192
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6364
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This commit is contained in:
Alexey Marchuk 2021-02-10 09:35:56 +03:00 committed by Tomasz Zawadzki
parent b3dae51b65
commit 92f62deefc

View File

@ -492,6 +492,7 @@ nvmf_ctrlr_add_io_qpair(void *ctx)
struct spdk_nvmf_fabric_connect_rsp *rsp = &req->rsp->connect_rsp;
struct spdk_nvmf_qpair *qpair = req->qpair;
struct spdk_nvmf_ctrlr *ctrlr = qpair->ctrlr;
struct spdk_nvmf_qpair *admin_qpair = ctrlr->admin_qpair;
/* Unit test will check qpair->ctrlr after calling spdk_nvmf_ctrlr_connect.
* For error case, the value should be NULL. So set it to NULL at first.
@ -531,6 +532,15 @@ nvmf_ctrlr_add_io_qpair(void *ctx)
goto end;
}
if (admin_qpair->state != SPDK_NVMF_QPAIR_ACTIVE || admin_qpair->group == NULL) {
/* There is a chance that admin qpair is being destroyed at this moment due to e.g.
* expired keep alive timer. Part of the qpair destruction process is change of qpair's
* state to DEACTIVATING and removing it from poll group */
SPDK_ERRLOG("Inactive admin qpair (state %d, group %p)\n", admin_qpair->state, admin_qpair->group);
SPDK_NVMF_INVALID_CONNECT_CMD(rsp, qid);
goto end;
}
ctrlr_add_qpair_and_update_rsp(qpair, ctrlr, rsp);
end:
spdk_nvmf_request_complete(req);