numam-spdk

Shuhei Matsumoto ae4e54fdc3 bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed

Previously, reconnect retry was not controlled and was repeated indefinitely.

This patch adds two options, ctrlr_loss_timeout_sec and reconnect_delay_sec, to nvme_ctrlr, and adds reset_start_tsc, reconnect_is_delayed, and reconnect_delay_timer to nvme_ctrlr to control reconnect retry.

Both ctrlr_loss_timeout_sec and reconnect_delay_sec are initialized to zero, so reconnect remains unthrottled, as it was before this patch.

A few more changes are added.

Change nvme_io_path_is_failed() to return false if reset is throttled, even if nvme_ctrlr is resetting or is to be reconnected.

spdk_nvme_ctrlr_reconnect_poll_async() may continue returning -EAGAIN indefinitely. Use ctrlr_loss_timeout_sec to catch this exceptional case.

Not only ctrlr reset but also non-multipath ctrlr failover is controlled, so path failover needs to be included in ctrlr reconnect.

When the active path is removed and switched to one of the alternative paths, if a ctrlr reconnect is scheduled, connecting to the alternative path is left to the scheduled reconnect.

If a ctrlr reset or reconnect fails and the retry is scheduled, switch the active path to one of the alternative paths.

Restore unit test cases removed in the previous patches.

Change-Id: Idec636c4eced39eb47ff4ef6fde72d6fd9fe4f85
Signed-off-by: Shuhei Matsumoto <smatsumoto@nvidia.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/10128
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>

2022-01-17 14:25:15 +00:00
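
The retry policy described in the commit message can be illustrated with a minimal sketch. This is not the code from the patch: the type and function names (`retry_opts`, `retry_action`, `next_retry_action`) are hypothetical, time is expressed in whole seconds rather than TSC ticks, and treating an elapsed ctrlr_loss_timeout_sec as "stop retrying" is an assumption drawn from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not SPDK's actual definitions. */
enum retry_action {
	RETRY_IMMEDIATELY,	/* reconnect_delay_sec == 0: no throttling */
	RETRY_AFTER_DELAY,	/* wait reconnect_delay_sec, then reconnect */
	RETRY_GIVE_UP		/* assumed outcome once the loss timeout elapsed */
};

struct retry_opts {
	uint32_t ctrlr_loss_timeout_sec;	/* 0 means no upper bound */
	uint32_t reconnect_delay_sec;		/* 0 means retry without delay */
};

/*
 * Decide what to do after a failed reset or reconnect attempt.
 * reset_start_sec is when the first reset began; now_sec is the current time.
 */
static enum retry_action
next_retry_action(const struct retry_opts *opts, uint64_t reset_start_sec,
		  uint64_t now_sec)
{
	if (opts->reconnect_delay_sec == 0) {
		/* Both options default to zero, which keeps the old behavior. */
		return RETRY_IMMEDIATELY;
	}

	if (opts->ctrlr_loss_timeout_sec != 0 &&
	    now_sec - reset_start_sec >= opts->ctrlr_loss_timeout_sec) {
		/* Retries have gone on for too long. */
		return RETRY_GIVE_UP;
	}

	return RETRY_AFTER_DELAY;
}
```

With `reconnect_delay_sec = 5` and `ctrlr_loss_timeout_sec = 30`, a failure 10 seconds after the reset started schedules a delayed retry, while a failure 30 seconds or more after the start ends the retry sequence.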
accel	idxd: Add support for vectored crc32 + copy	2022-01-12 08:20:39 +00:00
bdev	bdev/nvme: Retry reconnecting ctrlr after seconds if reset failed	2022-01-17 14:25:15 +00:00
blob	blob: use uint64_t for unmap and write_zeroes lba count	2021-10-14 08:17:16 +00:00
blobfs	blobfs: check return value of strdup in blobfs_fuse_start()	2021-06-16 08:53:21 +00:00
env_dpdk	so_ver: increase all major versions	2021-02-05 14:43:47 +00:00
event	nvmf: remove accept poller from generic layer	2021-12-14 13:18:33 +00:00
scheduler	gscheduler: use current tsc for decision.	2021-12-31 09:21:27 +00:00
sock	sock: Fix SPDK_ZEROCOPY do not work for IPV6	2021-11-30 09:09:03 +00:00
Makefile	scheduler: create public API and subsystem for scheduler/governor	2021-09-07 07:33:03 +00:00