356 Commits

Author SHA1 Message Date
Seth Howell
825cac2720 rdma.c: Create a single point of entry for qpair disconnect
Since there are multiple events/conditions that can trigger a qpair
disconnection, we need to funnel them to a single point of entry. If
more than one of these events occurs, we can ignore all but the first
since once a disconnect starts, it can't be stopped.
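
A minimal sketch of the idea, not the code in rdma.c: the names `example_qpair` and `example_qpair_disconnect` are hypothetical. Every trigger calls the same function, and a flag turns every call after the first into a no-op.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the RDMA qpair; not the real SPDK struct. */
struct example_qpair {
    bool disconnect_started;
    int  disconnect_count; /* how many times teardown actually began */
};

/*
 * Single entry point: every event that can trigger a disconnect calls this.
 * Returns true only for the call that actually starts the disconnect.
 */
static bool
example_qpair_disconnect(struct example_qpair *qp)
{
    if (qp->disconnect_started) {
        return false; /* already in progress, ignore this trigger */
    }
    qp->disconnect_started = true;
    qp->disconnect_count++;
    return true;
}
```

Calling `example_qpair_disconnect` twice on the same qpair starts the teardown once; the second call simply returns false.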

Change-Id: I749c9087a25779fcd5e3fe6685583a610ad983d3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443305
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-12 20:39:44 +00:00
Seth Howell
b6b0a0ba59 rdma: adjust I/O unit based on device SGL support
For devices that support fewer SGE elements than our default values, we
need to adjust the I/O unit size so that we don't ever try to submit
more SGLs than we are allowed to.
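
One way to picture the adjustment, as a sketch under assumptions: the helper name and its parameters are hypothetical, and the real change may compute the value differently. The I/O unit is raised until a maximum-size transfer fits in the number of SGEs the device supports.

```c
#include <stdint.h>

/*
 * Hypothetical helper: return an I/O unit size large enough that a transfer
 * of max_io_size bytes never needs more than device_max_sge elements.
 */
static uint32_t
example_adjust_io_unit_size(uint32_t max_io_size, uint32_t io_unit_size,
                            uint32_t device_max_sge)
{
    uint32_t min_unit;

    if (device_max_sge == 0) {
        return io_unit_size; /* nothing sensible to adjust against */
    }
    /* Round up so that min_unit * device_max_sge >= max_io_size. */
    min_unit = (max_io_size + device_max_sge - 1) / device_max_sge;
    return io_unit_size < min_unit ? min_unit : io_unit_size;
}
```

With an assumed 128 KiB maximum transfer and an 8 KiB unit, a device limited to 4 SGEs would need the unit raised to 32 KiB, while a device with 16 or more SGEs keeps the 8 KiB unit.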

Change-Id: I316d88459380f28009cc8a3d9357e9c67b08e871
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442776
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:46:57 +00:00
Seth Howell
92f5548a91 rdma: properly account num_outstanding_data_wr
This value was not being decremented when we got SEND completions for
write operations because we were using the recv send to indicate when we
had completed all writes associated with the request. I also erroneously
made the assumption that spdk_nvmf_rdma_request_parse_sgl would properly
reset this value to zero for all requests. However, for requests that
return SPDK_NVME_DATA_NONE from spdk_nvmf_rdma_request_get_xfer, this
function is skipped and the value is never reset. This can cause a
coherency issue on admin queues when we request multiple log files. When
the keep_alive request is resent, it can pick up an old rdma_req which
reports the wrong number of outstanding_wrs and it will permanently
increment the qpair's curr_send_depth.

This change decrements num_outstanding_data_wr on writes, and also
resets that value when the request is freed to ensure that this problem
doesn't occur again.

Change-Id: I5866af97c946a0a58c30507499b43359fb6d0f64
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443811
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:43:44 +00:00
Seth Howell
41cd5ff4fb rdma: fix max_read_depth_definition.
max_read_depth should be based on max_qp_init_read_atomic, or the
maximum number of read values that the initiator will accept as
outstanding.

The device attributes object contains values for both the initiator
(remote side) and the target (local side). All attributes with the name
init in them are meant to correspond to the initiator. The
qp_read_atomic value represents the number of reads and atomic
operations that can have this device as the target. qp_init_read_atomic
represents how many read operations the initiator has said that we can
have outstanding that have the initiator's rdma device as the target.

Since this number represents how many outstanding reads we will send to
the initiator at once, we should use the qp_init_read_atomic value.
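
A small sketch of the selection described above. The struct is a hypothetical stand-in whose two fields are named after `max_qp_rd_atom` and `max_qp_init_rd_atom` in libibverbs' `struct ibv_device_attr`; clamping to the queue depth is an assumption added for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two relevant device attributes. */
struct example_device_attr {
    int max_qp_rd_atom;      /* reads/atomics that may target this device */
    int max_qp_init_rd_atom; /* reads/atomics this side may have outstanding */
};

/* Pick the read depth from the "init" attribute, as the message argues. */
static uint16_t
example_max_read_depth(const struct example_device_attr *attr,
                       uint16_t max_queue_depth)
{
    int limit = attr->max_qp_init_rd_atom;

    if (limit < 0) {
        limit = 0;
    }
    /* Assumption: never allow more reads than there are queue slots. */
    return (uint16_t)(limit < max_queue_depth ? limit : max_queue_depth);
}
```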

Change-Id: Iacc044e8321080de8accd9128ac3777bbb948afc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442409
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-05 18:04:04 +00:00
Ben Walker
9521d11bdb nvmf/rdma: Remove stray spdk_nvmf_rdma_wr
Wasn't used.

Change-Id: I5b440e18a0a6cbb9b6137b7074a0312e51f41b95
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441592
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 19:14:54 +00:00
Ben Walker
608d80a033 nvmf/rdma: Eliminate management channel
This is a holdover from before poll groups were introduced.
We just need a per-thread context for a set of connections,
so now that a poll group exists we can use that instead.

Change-Id: I1a91abf52dac6e77ea8505741519332548595c57
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442430
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 18:20:13 +00:00
Ben Walker
4e614b3127 nvmf/rdma: Capitalize SEND in code comment for consistency
The READ and ATOMIC in the comment above are capitalized, so
make this all caps too.

Change-Id: I49fae2ceb826b22953d9b26d42b95f17e2dac617
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442427
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-04 18:12:31 +00:00
Ben Walker
e1dd85a5b7 nvmf: Don't increment current_recv_depth for dummy RECV
When a connection goes to close and has no I/O outstanding,
the current_recv_depth was being decremented beyond 0 and rolling over.

If the poll group then finds a successful receive completion on the next
poll (for a command that arrived prior to starting the disconnect but
hadn't been processed yet), it would trip the max queue depth check
added recently and start another disconnect process. If only one command
arrives in this window, everything actually works out ok.

However, if there are two receive completions sitting in the completion
queue after the disconnect process is started, the first one does the
double disconnect and the second one does another disconnect which ends
up dereferencing a null pointer.

Since there is always a special reserved slot for the dummy recv, don't
do decrements or increments of the current_recv_depth for the dummy
recv. This allows the code to still enforce the actual max_queue_depth
on recvs without underflowing or overflowing the counter.
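
The accounting rule can be sketched as follows. All names are hypothetical and the real counter handling lives in the RDMA transport; the point is only that the dummy recv never touches the unsigned counter, so it can neither wrap below zero nor trip the depth check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the qpair's receive accounting. */
struct example_recv_qp {
    uint16_t current_recv_depth; /* unsigned, so 0 - 1 would wrap to 65535 */
    uint16_t max_queue_depth;
};

/* A RECV completed. Returns false if the real limit would be exceeded. */
static bool
example_recv_arrived(struct example_recv_qp *qp, bool is_dummy)
{
    if (is_dummy) {
        return true; /* the dummy recv has its own reserved slot */
    }
    if (qp->current_recv_depth >= qp->max_queue_depth) {
        return false;
    }
    qp->current_recv_depth++;
    return true;
}

/* A RECV was handed back to the queue. */
static void
example_recv_reposted(struct example_recv_qp *qp, bool is_dummy)
{
    if (is_dummy) {
        return; /* never counted, so never uncounted */
    }
    qp->current_recv_depth--;
}
```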

Change-Id: I56c95b2424e956a3b007b25c50cbf47262245b8f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/442642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-30 19:03:46 +00:00
zkhatami88
8e2f0cdb01 nvmf: Add mechanism to override nvmf pd/mr behavior
Change-Id: I8d3abfcd1934bbab5bf8dacae08e8a7f29992b93
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/433977
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
2019-01-30 19:03:35 +00:00
Seth Howell
1d0a8e1cec rdma: split PENDING_DATA_TRANSFER into two states.
Since we have different requirements for submitting RDMA read and write
operations, we should track them separately so that we don't block
writes when the device does not have enough resources for read
operations.

Change-Id: I5d6424c0e26f2f5362866d1bb21eb46700c245da
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-01-28 16:58:50 +00:00
Seth Howell
158dc9470d rdma: Make sure we don't submit too many WRs
Before, the number of WRs and the number of RDMA requests were linked by
a constant multiple. This is no longer the case so we need to make sure
that we don't overshoot the limit of WRs for the qpair.
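
A sketch of the kind of check this implies, with hypothetical names (`example_send_qp`, `example_can_submit_wrs`): before posting, compare the number of WRs a request needs against the room left on the send queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the qpair's send queue accounting. */
struct example_send_qp {
    uint32_t current_send_depth; /* WRs currently posted */
    uint32_t max_send_depth;     /* WR limit for the qpair */
};

/* True if num_wrs more work requests fit without exceeding the limit. */
static bool
example_can_submit_wrs(const struct example_send_qp *qp, uint32_t num_wrs)
{
    uint32_t room;

    assert(qp->current_send_depth <= qp->max_send_depth);
    room = qp->max_send_depth - qp->current_send_depth;
    return num_wrs <= room;
}
```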

Change-Id: I0eac75e96c25d78d0656e4b22747f15902acdab7
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-28 16:58:50 +00:00
Seth Howell
dfdd76cf21 rdma: track outstanding data work requests directly.
This gives us more realistic control over the number of requests we can
submit.

Change-Id: Ie717912685eaa56905c32d143c7887b636c1a9e9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441606
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-25 19:12:17 +00:00
Seth Howell
7289d370f7 rdma: fix rw_depth to read_depth:
rw_depth was a misinterpretation of the spec. It is based on the value
of max_qp_rd_atom which only governs the number of read and atomic
operations. However, we were using rw_depth to block both read and write
operations which is an unnecessary restriction. Write operations should
only be governed by the number of Work Requests posted to the send
queue. We currently guarantee that we will never overshoot the queue
depth for Work Requests since they are embedded in the requests and
limited to a size of max_queue_depth.
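
The distinction can be sketched like this, using hypothetical names: reads are gated by both the read depth and the send queue, while writes are gated by the send queue only.

```c
#include <stdbool.h>
#include <stdint.h>

enum example_xfer {
    EXAMPLE_XFER_READ,  /* RDMA READ: pull data from the initiator */
    EXAMPLE_XFER_WRITE  /* RDMA WRITE: push data to the initiator */
};

/* Hypothetical per-qpair counters. */
struct example_depths {
    uint16_t current_read_depth;
    uint16_t max_read_depth;
    uint16_t current_send_depth;
    uint16_t max_send_depth;
};

static bool
example_can_start_transfer(const struct example_depths *d, enum example_xfer xfer)
{
    /* Every transfer needs a slot on the send queue. */
    if (d->current_send_depth >= d->max_send_depth) {
        return false;
    }
    /* Only reads are additionally limited by the read depth. */
    if (xfer == EXAMPLE_XFER_READ &&
        d->current_read_depth >= d->max_read_depth) {
        return false;
    }
    return true;
}
```

With the read depth exhausted but send slots free, a write can still start while a read has to wait.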

Change-Id: Ib945ade4ef9a63420afce5af7e4852932345a460
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441165
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-25 19:12:17 +00:00
Seth Howell
5301be93cd rdma: set wr opcodes while parsing the SGL.
Change-Id: I88fdf0b48653997f790cf5de6774d1c16621a9c1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441605
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-01-25 19:12:17 +00:00
Seth Howell
1f9ac1179e rdma: add num_outstanding_data_wr tracker to req
This will be necessary later on when we need to throttle send and recv
requests in software.

Change-Id: Ifb25eaabd15e101fbfc2959a08a321f80857b280
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/441604
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-01-25 19:12:17 +00:00
Xiaodong Liu
db5c3ce362 nvmf/rdma: dynamically enlarge CQ size
The CQ size assigned at creation may be overrun under a
heavy workload with too many qpairs. Enlarging it dynamically
can prevent the IBV_EVENT_CQ_ERR caused by CQ overrun.
This patch fixes issue #498:
https://github.com/spdk/spdk/issues/498
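
As an illustration only, here is one possible growth policy; the doubling rule and the helper name are assumptions, not taken from the patch. The resulting size would be what gets passed to the verbs call `ibv_resize_cq`.

```c
/*
 * Hypothetical helper: choose a new CQ size that covers required_cqe by
 * doubling the current size, capped at the device limit.
 * Returns -1 if the device cannot provide enough entries.
 */
static int
example_next_cq_size(int current_cqe, int required_cqe, int device_max_cqe)
{
    int new_cqe = current_cqe > 0 ? current_cqe : 1;

    while (new_cqe < required_cqe && new_cqe < device_max_cqe) {
        if (new_cqe > device_max_cqe / 2) {
            new_cqe = device_max_cqe; /* doubling would exceed the cap */
            break;
        }
        new_cqe *= 2;
    }
    return new_cqe >= required_cqe ? new_cqe : -1;
}
```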

Change-Id: I6c2d7194d4147d812d49d4fe787fcba5c6bbede9
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440853
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2019-01-24 21:51:09 +00:00
Seth Howell
cf73fb2f1f nvmf/rdma: add a pool of request_data structs
This change is related to enabling multi-sgl element support in
the NVMe-oF target.

For single SGL use cases, there is a 1:1 relationship between
rdma_requests and ibv_wrs used to transfer the data associated with
the request. In the single SGL case that ibv_wr is embedded inside of
the spdk_nvmf_rdma_request structure as part of an rdma_request_data
structure.
However, with Multi-SGL element support, we require multiple
ibv_wrs per rdma_request. Instead of embedding these
structures inside of the rdma_request and bloating up that object, I
opted to leave the first one embedded in the object and create a pool
that requests can pull from in the Multi-SGL path.
By leaving the first request_data object embedded in the rdma_request
structure, we avoid adding the latency of requesting a mempool object
in the basic cases.
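
The layout can be sketched as below. The types and the fixed-size array pool are hypothetical simplifications (the real code uses a mempool); the sketch only shows that the first data object is embedded and the pool is touched only when more than one is needed.

```c
#include <stddef.h>

#define EXAMPLE_POOL_SIZE 4

/* Hypothetical stand-in for the per-WR data object. */
struct example_request_data {
    int in_use;
    struct example_request_data *next;
};

struct example_request {
    struct example_request_data data; /* first object, embedded */
};

struct example_pool {
    struct example_request_data entries[EXAMPLE_POOL_SIZE];
};

static struct example_request_data *
example_pool_get(struct example_pool *pool)
{
    for (size_t i = 0; i < EXAMPLE_POOL_SIZE; i++) {
        if (!pool->entries[i].in_use) {
            pool->entries[i].in_use = 1;
            pool->entries[i].next = NULL;
            return &pool->entries[i];
        }
    }
    return NULL;
}

/* Chain num_wrs data objects; only objects 2..n come from the pool. */
static int
example_request_get_data(struct example_request *req,
                         struct example_pool *pool, int num_wrs)
{
    struct example_request_data *tail = &req->data;

    tail->next = NULL;
    for (int i = 1; i < num_wrs; i++) {
        struct example_request_data *extra = example_pool_get(pool);

        if (extra == NULL) {
            return -1; /* pool exhausted; caller must release and retry */
        }
        tail->next = extra;
        tail = extra;
    }
    return 0;
}
```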

Change-Id: I7282242f1e34a32eb59b55f326a6c331d455625e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/428561
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2019-01-18 20:43:46 +00:00
Seth Howell
caa06154bd rdma: fix the poll_group_create error paths.
It was possible to leak pollers if we had multiple devices in the
transport. The new err_exit path fixes this.
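
The pattern is the usual single cleanup label. In this sketch every name is hypothetical, and `fail_at` exists only to simulate an allocation failure on a given device; the point is that a failure on a later device releases the pollers already created for earlier ones.

```c
#include <stdlib.h>

struct example_poller {
    int device_index;
};

struct example_group {
    struct example_poller **pollers;
    int num_pollers;
};

static int example_live_pollers; /* bookkeeping to make leaks visible */

static void
example_poll_group_destroy(struct example_group *group)
{
    for (int i = 0; i < group->num_pollers; i++) {
        free(group->pollers[i]);
        example_live_pollers--;
    }
    free(group->pollers);
    free(group);
}

/* fail_at: index of the device whose poller allocation fails, or -1. */
static struct example_group *
example_poll_group_create(int num_devices, int fail_at)
{
    struct example_group *group = calloc(1, sizeof(*group));

    if (group == NULL) {
        return NULL;
    }
    group->pollers = calloc((size_t)num_devices, sizeof(*group->pollers));
    if (group->pollers == NULL) {
        goto err_exit;
    }
    for (int i = 0; i < num_devices; i++) {
        struct example_poller *p = (i == fail_at) ? NULL : calloc(1, sizeof(*p));

        if (p == NULL) {
            goto err_exit; /* pollers 0..i-1 are released below */
        }
        p->device_index = i;
        group->pollers[i] = p;
        group->num_pollers++;
        example_live_pollers++;
    }
    return group;

err_exit:
    example_poll_group_destroy(group);
    return NULL;
}
```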

Change-Id: Iafd5643c67fae741113f10afe761af1988cb6a9b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-01-18 16:57:37 +00:00
Seth Howell
e6ddb7df3f rdma: use the new common poll group data buffer cache.
This change is aimed at addressing github issue #555

Change-Id: I5112ac38c59f2f0a17d0c560e7e2f640a11f58a9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440419
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-01-18 16:57:37 +00:00
Seth Howell
e28605f47a nvmf/transport: move buffer_pool to generic struct.
This is shared between all currently valid transports. Just move it up
to the generic structure. This will make implementing more shared
features on top of this a lot easier.

Change-Id: Ia896edcb7555903ba97adf862bc8d44228df2d36
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-17 19:42:01 +00:00
Seth Howell
e816c8fda8 nvmf: add a buffer_cache to transport opts
This patch series is geared at solving github issue 555.
Ultimately the goal of this series is to add a per-poll-group buffer
cache to prevent starvation.

Change-Id: I8ddaa47487665c2f9adce2109eb71b8fa71a7927
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-01-16 19:06:20 +00:00
Seth Howell
b17e0ae7db rdma: process pending reqs before destroying qp
This is an attempt to clean up requests sitting in the
waiting_for_buffer state before destroying the qpair for good.

Change-Id: I8ae047e4d7fd01f30419ae346e4da49355dc033d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440127
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-01-15 18:11:41 +00:00
Seth Howell
e0280b1100 rdma: add drain argument to process_pending
This allows us to drain all of the pending requests from the qpairs
before we destroy them, preventing them from being picked up on
subsequent process_pending polls.
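
A rough sketch of what a drain flag can mean, under stated assumptions: the counters and the "abort when no buffer is available" behavior are hypothetical simplifications, not the transport's actual state machine. Without `drain`, processing stops when resources run out; with it, the queue is always emptied.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a qpair's pending queue. */
struct example_pending {
    int num_pending;       /* requests waiting for a buffer */
    int buffers_available; /* buffers that can be handed out */
    int num_started;       /* requests that got a buffer */
    int num_aborted;       /* requests completed without running */
};

static void
example_process_pending(struct example_pending *q, bool drain)
{
    while (q->num_pending > 0) {
        if (q->buffers_available > 0) {
            q->buffers_available--;
            q->num_started++;
        } else if (drain) {
            q->num_aborted++; /* do not leave it for a later poll */
        } else {
            break; /* normal path: try again on the next poll */
        }
        q->num_pending--;
    }
}
```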

Change-Id: I149deff437b4c1764fabf542cdd25dd067a8713a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440428
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-01-15 18:11:41 +00:00
Ziye Yang
2b787d487e nvmf/rdma: remove the duplicated code in spdk_nvmf_rdma_request_free
The purpose of this patch is to remove the duplicated code
used in spdk_nvmf_rdma_request_free

Change-Id: I3f74466a7ec788000eff9c2a75c9ea2cacaf5cc2
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/439942
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-01-14 03:56:28 +00:00
Ziye Yang
cb1c3fae98 nvmf/rdma: fix the coredump issue when ctrlr + c target
When the host connects to the target and runs I/O,
pressing Ctrl+C on the target causes a crash. The issue
is that rqpair->qpair.group is NULL.

Change-Id: Id36cfac2be9abc707bf75a2e1ddb3f414610b6f1
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/437232
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-01-09 21:05:32 +00:00
Seth Howell
212fd2196f rdma: Complete rdma_req when RDMA_READ op fails
This operation is not attached to a send request so we need to put the
request into the completed state right away since there is no send
associated with it during the draining process.

Change-Id: I294f99950b00a584d8940bb4f93ac046c478d3b3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/439437
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-09 20:55:55 +00:00
JinYu
76675f6f60 rdma: check ibv state after rdma update it
We found that the ibv state value may be invalid, so we
validate it before using it. An invalid
state probably indicates a hardware issue, after which the process flow
becomes unpredictable.
Fix GitHub issue #508.

Change-Id: I213f4d684b103cce7bc072aecd591e2c491e0596
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436920
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-01-09 08:29:43 +00:00
Seth Howell
fa757dc96d rdma: dump outstanding requests from rqpairs
If this happens, we have something going seriously wrong and we need as
much debug information as we can get.

Change-Id: I305512790461443316b9f231fa2afeb69593af1b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438097
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-01-09 08:28:50 +00:00
Ziye Yang
0b20f2e552 nvmf/rdma: Remove data buffer num dependency on SPDK_NVMF_MAX_SGL_ENTRIES
The minimum number of data buffers only needs to be
large enough to complete one RDMA operation (read or write).

Change-Id: I44eb51db279fc055f687eb78b6a642dbb5cb23f3
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437808
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-12-25 01:35:43 +00:00
Ziye Yang
58f1624497 nvmf: add the transport shared buffer num configuration option.
Previously, we allocated the buffers according
to the MaxQueueDepth info. However, this is not
a good way for customers to configure it, so we should provide
a shared buffer number configuration option for the transport.

Change-Id: Ic6ff83076a65e77ec7376688ffb3737fd899057c
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/437450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-12-20 19:55:57 +00:00
Evgeniy Kochetov
7da9f8faba nvmf/rdma: Fix refcnt check on RDMA QP destroy
Check for QP reference counter in RDMA QP destroy function was wrong
and QP resources were never released.

Change-Id: I6ab0ce39452e8263f89589d138c90f749516ebb1
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-on: https://review.gerrithub.io/436974
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-12-14 16:15:12 +00:00
Seth Howell
a451c8385e NVMe-oF: Add explicit reports for MR-split buffers:
This is a failsafe for finding and reporting data buffers that span
multiple Memory Regions. These errors should never be triggered, but
finding and reporting them will help any debugging.

Change-Id: I3c61e3cc510f5a36039fc1815ff0de45fce794d5
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/436054
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-12-10 23:47:38 +00:00
Ziye Yang
d40f805d54 nvmf: fix the error path for shared data buffer free.
Since we use aligned buffers, I think the error handling
path here is incorrect: the address is wrong.

Change-Id: I5bcb7f050199496423f861fd6aea65e0fe48c804
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.gerrithub.io/435992
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2018-12-05 05:57:09 +00:00
Seth Howell
5aca5cd71b rdma: don't print a notice on QP state change.
This notice was scaring a lot of people because every time we disconnect
a qpair it tells the user that qpair is entering an error state. That is
part of the normal state flow of qpairs during disconnect, but makes it
seem like something is going wrong.

Change-Id: I776e71db2b24fa963113fee88b5cf02c0820f171
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/435555
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-12-03 09:44:06 +00:00
Jim Harris
942e02aa68 nvmf: add some instrumentation in error path
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1b5fad59c76fb4dbb6fcedf3f5a1e24af2064c4d
Reviewed-on: https://review.gerrithub.io/434271
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-11-30 19:58:14 +00:00
Liu Xiaodong
0e7ca66922 lib/trace: show specific usage of trace mask
Previously, the only way to find out which mask bit is used for a
specific trace group was to check the source code. Now the usage
message lists each trace group with its tpoint group mask
bit.

Change-Id: I7a85fe9c0885f1919f6ffbdc97dab81f1986fb07
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/435448
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-11-30 14:52:28 +00:00
Liu Xiaodong
73a3e13280 lib/nvmf: realign tab for TRACE_GROUP_NVMF_XXX
Change-Id: I7be0c7c417c84421e6abdbefb734cd0c05561194
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/435405
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-11-30 14:52:28 +00:00
Seth Howell
0e6a32deab nvme_rdma/nvmf: add cb_fns to check mr contiguity
This is necessary to confirm that a buffer that spans a 2_MB boundary is
still in a single MR.
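
For context, the boundary test itself can be sketched as below; the helper name is hypothetical and the MR contiguity callbacks added by this change are not shown. A buffer that crosses a 2 MB boundary is exactly the case where the extra MR check is needed.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_2MB (2ULL * 1024 * 1024)

/* True if [addr, addr + len) touches more than one 2 MB-aligned region. */
static bool
example_spans_2mb_boundary(uint64_t addr, uint64_t len)
{
    if (len == 0) {
        return false;
    }
    return (addr / EXAMPLE_2MB) != ((addr + len - 1) / EXAMPLE_2MB);
}
```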

Change-Id: If0d14e514ab2197a0d2e3af4f565f56d50591210
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/435179
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-11-29 15:16:13 +00:00
Seth Howell
1180bf8343 rdma: clean up SGE definitions and properly set values
We have historically conflated SPDK_NVMF_MAX_SGL_ENTRIES and the maximum
number of SGEs associated with a wr data object. For now these are the
same thing, but there should be nothing tying the number of NVMe request
SGL elements to the number of rdma request wr sgl elements.
Also, clarify the rx_sge and tx_sge enums to reflect the actual maximum
number of SGEs associated with either the send or receive queue.
This change doesn't actually modify these values, but sets us up to do
things like split the data in an NVMe SGE into multiple WR SGEs in case
the buffer associated with the NVMe SGE is not contained in a single
RDMA mr.
We also need to store these values in the qpair for later usage.

Change-Id: Iff3756fc72787a4b72a99b2bdf90bf486a8010fa
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/433196
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-16 15:18:41 +00:00
Seth Howell
1d9be84bfd nvmf/rdma: change the default buffer size.
Having the buffers be the same size as the maximum xfer size doesn't do
us any favors. Make these buffers a ratio of the maximum transfer size
and the number of supported nvmf SGLs.

Also configure the number of nvmf request iovs to correspond with this
new ratio.

Change-Id: I3147dcd86b599c74521ebfdf3bcdbcdee8871a3a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/428747
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-11-15 08:17:39 +00:00
Evgeniy Kochetov
90b4bd6cf9 nvmf/rdma: Fix QP shutdown procedure implementation
This patch implements the following QP shutdown flow:
1. Move the QP to ERR state
2. Post dummy work requests to send and receive queues
3. Poll CQ until it returns dummy work requests (with WR Flush Error status)
4. Call ibv_destroy_qp and release resources

In order to differentiate dummy and normal WRs, a new spdk_nvmf_rdma_wr
structure was introduced which contains the type of WR. From now on, the
wr_id field in ibv_recv/send_wr and ibv_wc is expected to always point
to this structure. Based on the WR type, wr_id can be safely cast to the
correct container structure. In case of unsuccessful work completions,
'opcode' cannot be used for this purpose because it may be
invalid (see "IB Architecture Specification Volume 1", ch. 11.4.2.1
"Poll for completion").

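The wr_id scheme described above can be sketched as follows. The type and function names are hypothetical stand-ins rather than the structures added by the patch: a small typed header is embedded in each container, wr_id carries its address, and the type (not the completion opcode) decides how to cast back.

```c
#include <stddef.h>
#include <stdint.h>

enum example_wr_type {
    EXAMPLE_WR_TYPE_RECV,
    EXAMPLE_WR_TYPE_SEND,
    EXAMPLE_WR_TYPE_DRAIN_RECV, /* dummy WR posted during shutdown */
    EXAMPLE_WR_TYPE_DRAIN_SEND
};

/* Typed header that wr_id always points to. */
struct example_wr {
    enum example_wr_type type;
};

/* One possible container; the header sits inside it. */
struct example_recv {
    int id;
    struct example_wr wr;
};

#define example_containerof(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* Recover the container from wr_id, or NULL if it is not a normal RECV. */
static struct example_recv *
example_wr_id_to_recv(uint64_t wr_id)
{
    struct example_wr *wr = (struct example_wr *)(uintptr_t)wr_id;

    if (wr->type != EXAMPLE_WR_TYPE_RECV) {
        return NULL;
    }
    return example_containerof(wr, struct example_recv, wr);
}
```

When posting, wr_id would be set to the address of the embedded header, for example `(uint64_t)(uintptr_t)&recv->wr`.
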
Change-Id: Ifb791e36114c619c71ad4d831f2c7972fe7cf13d
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-on: https://review.gerrithub.io/430754
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2018-11-08 21:20:25 +00:00
Ben Walker
5941ab0351 nvmf/rdma: Simplify code that casts wr_id field
We were previously doing lots of checks in debug mode
to verify the validity of this field. Now we understand
how it works, so these checks are never going to hit
and are just making the code harder to read.

Change-Id: Ic82d479ae34a8c7db06db62aee1cdf6e8bec126e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/430866
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2018-11-02 16:39:37 +00:00
Ben Walker
50a438d3bc nvmf/rdma: No longer rely on wr.opcode being valid on error
The specification states that opcode is not valid when the status
is not success. Instead, keep track of the operation type ourselves.

Change-Id: I60af4b35e761c46f5f296a61cedfca198836197f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Co-authored-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/430865
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2018-11-02 16:39:37 +00:00
Ben Walker
8e7295036b nvmf/rdma: Remove error recovery of RDMA qps
After some lengthy discussions with the RDMA experts, the only
way forward on an RDMA qp error is to disconnect it. The initiator
can create a new qp if it wants to later on.

Remove all of the error recovery code and disconnect the qp
any time an error is encountered.

Change-Id: I11e1df5aaeb8592a47ca01529cfd6a069828bd7f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/430389
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-11-02 16:39:37 +00:00
Ben Walker
d3fa0181e3 nvmf/rdma: Move cm event processing down near where it is referenced
Code movement only. No other changes.

Change-Id: I04cf179ecd57154172a9369926cbeaaa37e11a52
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/430505
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-31 21:56:31 +00:00
Ben Walker
039c8341e3 nvmf/rdma: Remove handling for LAST_WQE_REACHED
This event only occurs when using shared receive queues, which
the target does not currently support.

Change-Id: If155843610cf0e961b9783d4afd64b969b4316f4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/430388
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-31 21:56:31 +00:00
Seth Howell
e6dac39cb0 nvmf/rdma: rename SPDK_NVMF_RDMA_DEFAULT_IO_UNIT_SIZE
For the rdma transport at least, this value is tied very closely to the
size of the iover buffers. Changing the name makes it less confusing.

Change-Id: I8a703f023c37f794323b7280228340aa587243fe
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/428746
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-12 16:30:24 +00:00
Ben Walker
aaa691b0ce nvmf/rdma: Delay disconnect processing until connect processing is done
If a disconnect occurs before connect processing has completed,
delay handling the disconnect.

Change-Id: Ibf91d7dc1f389be452ac6be8948c51e5dd3b9614
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/425990
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Piotr Pelpliński <piotr.pelplinski@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-04 22:59:57 +00:00
Pawel Wodkowski
c4fee1e970 mk: don't use '-include spdk/config.h'
Each file that needs to check SPDK_CONFIG_* options needs to include
spdk/config.h explicitly.

Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/427221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-02 23:13:32 +00:00
Seth Howell
5d57386885 env_dpdk: spdk_mem_map_translate informs user of translation size.
This function will now check whether a memory region is
contiguous across 2MB map entries and return the total length of that
contiguous buffer up to the size specified by the user.
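
The behaviour described above can be illustrated with a toy model. This is
only a sketch under stated assumptions: toy_map and toy_translate are
hypothetical names, the map is a flat array with one translation per 2MB
page, and two neighbouring entries count as contiguous when the second
translation is exactly 2MB past the first. It is not the
spdk_mem_map_translate implementation or its signature.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SHIFT_2MB 21
#define TOY_SIZE_2MB  (1ULL << TOY_SHIFT_2MB)

/* Toy map: translations[i] is the translation for 2MB page index i. */
struct toy_map {
	const uint64_t *translations;
	size_t num_entries;
};

/*
 * Translate vaddr. On entry *size is the length the caller wants; on return
 * it is the length that is actually contiguous, capped at the requested size.
 * Returns UINT64_MAX and sets *size to 0 if vaddr is outside the map.
 */
static uint64_t toy_translate(const struct toy_map *map, uint64_t vaddr, uint64_t *size)
{
	size_t first = (size_t)(vaddr >> TOY_SHIFT_2MB);
	size_t idx = first;
	uint64_t offset = vaddr & (TOY_SIZE_2MB - 1);
	uint64_t requested = *size;
	uint64_t valid;

	if (first >= map->num_entries) {
		*size = 0;
		return UINT64_MAX;
	}

	/* Bytes left in the first 2MB entry. */
	valid = TOY_SIZE_2MB - offset;

	/* Extend across following entries while they remain contiguous. */
	while (valid < requested && idx + 1 < map->num_entries &&
	       map->translations[idx + 1] == map->translations[idx] + TOY_SIZE_2MB) {
		idx++;
		valid += TOY_SIZE_2MB;
	}

	*size = valid < requested ? valid : requested;
	return map->translations[first] + offset;
}
```

A caller that gets back a smaller size than it asked for knows the buffer
crosses a discontinuity and can split the I/O at that point.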

Also includes unit tests.

This series of changes is aimed at enabling spdk_mem_map_translate to
report back to the user the length of the valid mem_map up to the
function that requested the translation.
This will be useful when retrieving memory regions associated with I/O
buffers in NVMe-oF. For large I/O it will be possible that the buffer is
split over multiple MRs and the I/O will have to be split into multiple
SGLs.

Change-Id: I2ce582427d451be5a317808d0825c770e12e9a69
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425329
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-26 20:57:57 +00:00