numam-spdk/scripts/rpc
Shuhei Matsumoto ae4e54fdc3 bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed
Previously reconnect retry was not controlled and was repeated indefinitely.

This patch adds two options, ctrlr_loss_timeout_sec and reconnect_delay_sec,
to nvme_ctrlr, and adds reset_start_tsc, reconnect_is_delayed, and
reconnect_delay_timer to nvme_ctrlr to control reconnect retry.

Both ctrlr_loss_timeout_sec and reconnect_delay_sec are initialized to
zero. This means reconnect is not throttled, matching the behavior before
this patch.
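
How the two options might interact can be sketched in Python as below. This is only an illustration under stated assumptions, not the bdev/nvme code: the RetryAction enum, the next_action() helper, and the elapsed_sec argument are hypothetical names, and treating an expired ctrlr_loss_timeout_sec as "give up" is an assumption about what happens once the timeout is detected.

```python
from enum import Enum


class RetryAction(Enum):
    """Hypothetical outcomes after a failed reset or reconnect attempt."""
    RETRY_NOW = "retry_now"                  # unthrottled, pre-patch behavior
    DELAYED_RECONNECT = "delayed_reconnect"  # wait reconnect_delay_sec first
    GIVE_UP = "give_up"                      # assumed outcome once the timeout expires


def next_action(elapsed_sec, ctrlr_loss_timeout_sec=0, reconnect_delay_sec=0):
    """Pick the next step after a failed attempt.

    elapsed_sec is the time since the reset started (hypothetical stand-in
    for a value derived from reset_start_tsc). Both options default to zero,
    which the commit message describes as "not throttled".
    """
    if reconnect_delay_sec == 0:
        # Assumption: a zero delay means no throttling at all.
        return RetryAction.RETRY_NOW
    if ctrlr_loss_timeout_sec > 0 and elapsed_sec >= ctrlr_loss_timeout_sec:
        # Assumption: an expired loss timeout stops further retries.
        return RetryAction.GIVE_UP
    return RetryAction.DELAYED_RECONNECT


if __name__ == "__main__":
    print(next_action(elapsed_sec=100))
    print(next_action(elapsed_sec=3, ctrlr_loss_timeout_sec=10, reconnect_delay_sec=2))
    print(next_action(elapsed_sec=10, ctrlr_loss_timeout_sec=10, reconnect_delay_sec=2))
```

With the defaults the sketch always retries immediately; with a non-zero delay it schedules a delayed reconnect until the assumed loss timeout is reached.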

A few more changes are added.

Change nvme_io_path_is_failed() to return false if reset is throttled,
even if nvme_ctrlr is resetting or is to be reconnected.

spdk_nvme_ctrlr_reconnect_poll_async() may keep returning -EAGAIN
indefinitely. Use ctrlr_loss_timeout_sec to detect this exceptional case.

Not only ctrlr reset but also non-multipath ctrlr failover is controlled,
so path failover must be included in ctrlr reconnect.

When the active path is removed and switched to one of the alternative paths,
if ctrlr reconnect is scheduled, connecting to the alternative path is left
to the scheduled reconnect.

If resetting or reconnecting the ctrlr fails and a retry is scheduled,
switch the active path to one of the alternative paths.

Restore unit test cases removed in the previous patches.

Change-Id: Idec636c4eced39eb47ff4ef6fde72d6fd9fe4f85
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10128
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
2022-01-17 14:25:15 +00:00
__init__.py net: Remove library 2021-07-13 08:57:58 +00:00
app.py spelling: scripts 2021-12-03 08:13:04 +00:00
bdev.py bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed 2022-01-17 14:25:15 +00:00
blobfs.py rpc/blobfs: add cache size setting rpc 2019-11-07 00:33:25 +00:00
client.py scripts/rpc: Make sure address argument is properly interpreted 2021-05-24 10:11:05 +00:00
cmd_parser.py rpc: add a command parser 2021-05-11 12:02:00 +00:00
env_dpdk.py env_dpdk/rpc: add rpc to get memory stats. 2019-12-13 11:05:57 +00:00
helpers.py rpc.py: add framework for detecting deprecated aliases 2019-05-09 04:37:08 +00:00
idxd.py idxd/rpc: Revise the rpc function to use kernel or user driver 2021-07-13 17:22:30 +00:00
ioat.py ioat: remove whitelist/blacklist functionality 2020-12-03 09:41:07 +00:00
iscsi.py spelling: scripts 2021-12-03 08:13:04 +00:00
log.py RPC: rename get_log_flags to log_get_flags 2019-09-24 16:42:41 +00:00
lvol.py RPC: rename rpc construct_lvol_store to bdev_lvol_create_lvstore 2019-08-30 16:40:44 +00:00
nbd.py rpc: Rename get_nbd_disks to nbd_get_disks 2019-09-19 20:56:35 +00:00
notify.py RPC: rename get_notifications to notify_get_notifications 2019-09-24 16:42:41 +00:00
nvme.py bdev/opal: Add rpc for init, revert and get info 2019-10-24 17:09:57 +00:00
nvmf.py nvmf: zero-copy enable flag in transport opts 2022-01-06 18:53:42 +00:00
pmem.py rpc: Rename delete_pmem_pool to bdev_pmem_delete_pool 2019-09-05 07:04:17 +00:00
sock.py socket: Remove deprecated enable_zerocopy_send 2021-07-23 10:30:25 +00:00
subsystem.py rpc: add method for listing PCI devices 2021-12-14 09:08:59 +00:00
trace.py trace_rpc.c: add support for enabling individual traces 2022-01-14 11:01:15 +00:00
vhost.py virtio-blk: add hotplug rpc 2021-04-16 19:21:13 +00:00
vmd.py event/subsystem/vmd: RPC support 2019-07-26 18:27:40 +00:00