Commit Graph

204 Commits

Author SHA1 Message Date
Seth Howell
7630daa204 nvme: move queueing requests to the generic layer
The tailq and the requests all belong to the generic layer, so we might
as well put the queueing code there for better encapsulation.

Change-Id: Id5f08f798121b50a21044cfc61856999c50ca227
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469758
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-09-30 21:17:47 +00:00
Seth Howell
13fb1b690e nvme_rdma: add a timeout for spinning on cm events.
Previously we would just sit forever, preventing us from properly
attempting reconnects and timing out.
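
A minimal sketch of the bounded wait this introduces (constant and field
names such as CM_EVENT_TIMEOUT_US and rctrlr->cm_channel are assumed, not
the literal patch):

    /* Bound how long we spin on the (non-blocking) CM event channel so a
     * dead connection cannot stall reconnect or timeout handling forever. */
    struct rdma_cm_event *event;
    uint64_t deadline = spdk_get_ticks() +
                        CM_EVENT_TIMEOUT_US * spdk_get_ticks_hz() / 1000000ULL;

    while (rdma_get_cm_event(rctrlr->cm_channel, &event) != 0) {
            if (errno != EAGAIN && errno != EWOULDBLOCK) {
                    return -1;            /* real failure on the channel */
            }
            if (spdk_get_ticks() > deadline) {
                    SPDK_ERRLOG("Timed out waiting for RDMA CM event\n");
                    return -ETIMEDOUT;
            }
    }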

Change-Id: Id7386ab95cf75fd9ac972b44afa2719aad412f49
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469021
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-09-30 21:17:47 +00:00
Seth Howell
5ac814e36c nvme_rdma: share the cm_event channel between qpairs.
This enables us to create a single file descriptor and a single event
channel to poll for completions. With that accomplished, we can easily
poll for events on the admin qpair each time we check it for
completions.
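
Roughly, the sharing looks like this (struct and handler names are
assumptions; only the rdma-cm calls are the real API):

    struct rdma_cm_event *event;

    /* One event channel per controller instead of one per qpair. */
    rctrlr->cm_channel = rdma_create_event_channel();

    /* Each qpair's cm_id is created against the shared channel, so all of
     * its events arrive on the same file descriptor. */
    rdma_create_id(rctrlr->cm_channel, &rqpair->cm_id, rqpair, RDMA_PS_TCP);

    /* When an event arrives, recover the owning qpair from the cm_id,
     * dispatch it, then acknowledge the event. */
    if (rdma_get_cm_event(rctrlr->cm_channel, &event) == 0) {
            struct nvme_rdma_qpair *owner = event->id->context;
            nvme_rdma_handle_cm_event(owner, event);   /* handler name assumed */
            rdma_ack_cm_event(event);
    }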

Change-Id: I8b901252510744a956bef12594d1e045715e002e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467549
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-30 21:17:47 +00:00
Seth Howell
f12e6bc041 nvme_rdma: in qp_disconnect, set resources to NULL
This prevents us from failing a reset and then trying to double put the
rqpair->cq, which ends up causing segfaults.
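
The defensive pattern described here, as a hedged sketch (the actual
field layout may differ):

    /* Release each resource exactly once and clear the pointer so that a
     * failed reset followed by another disconnect cannot free it twice. */
    if (rqpair->cq) {
            ibv_destroy_cq(rqpair->cq);
            rqpair->cq = NULL;
    }
    if (rqpair->cm_id) {
            rdma_destroy_id(rqpair->cm_id);
            rqpair->cm_id = NULL;
    }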

Change-Id: If3e14a3d039b4b19cc587a7482157f4b23f8ee32
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469609
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-09-30 21:17:47 +00:00
Seth Howell
579d44b0ee nvme_rdma: make handling of cm_events more robust
By consolidating all cm_event handling into a single function, we create
a single point of contact for cm_events, whether we want to process them
synchronously or asynchronously.

Change-Id: I053a850358605115362f424de55e66806a769320
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467546
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-09-18 22:19:37 +00:00
Seth Howell
ad7a01bde3 nvme_rdma: make cm_event fd asynchronous.
This is paving the way for additional changes to enable polling for
cm_events in the initiator.

For now, just present the same blocking API on top of the now polled
file descriptor. Later, we will change this API to be more useful.
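
For example, the fd behind the event channel can be switched to
non-blocking so it can be polled (a sketch of the idea, not the literal
change):

    /* Make the CM event channel non-blocking; rdma_get_cm_event() will then
     * return -1 with errno == EAGAIN when no event is pending. */
    int flags = fcntl(rctrlr->cm_channel->fd, F_GETFL);

    if (fcntl(rctrlr->cm_channel->fd, F_SETFL, flags | O_NONBLOCK) < 0) {
            SPDK_ERRLOG("Failed to set CM event channel non-blocking\n");
    }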

Change-Id: I174dac028720f95c30100f6dc2ed49b5bb2a7e40
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467545
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-09-18 22:19:37 +00:00
Ben Walker
647afdec44 Revert "nvme: small code cleanup for nvme_transport_ctrlr_scan"
This reverts commit 6129e78d26.

When the initiator sends the discovery log page, if the log page
exceeds the size of its data buffer, it will break it up into
multiple log page commands with appropriate offsets. However,
supporting offsets in log pages is an optional feature in NVMe
and reported by the EDLP bit in the identify data.

This commit changed the discovery process to no longer send an
identify command prior to doing the discovery log page command,
so the values in the identify data are always 0. If the discovery
log page exceeds the size of the data buffer (4k), it will then
fail to send the second log page with an offset because it
believes the controller does not support the feature.

Revert this change to fix it. An identify should always be sent
as part of the discovery process. A test case is included in a
follow-up patch that demonstrates the bug.
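
The dependency being restored, sketched (cdata.lpa.edlp is the EDLP bit
from the NVMe identify data; the surrounding gating logic is assumed):

    /* Offsets in Get Log Page are only legal when the controller sets the
     * EDLP bit in its identify data.  With the identify skipped, cdata is
     * all zeroes, so any discovery log larger than one buffer fails here. */
    if (offset > 0 && !ctrlr->cdata.lpa.edlp) {
            SPDK_ERRLOG("Controller does not support log page offsets\n");
            return -EINVAL;
    }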

Reported-by: Zahra Khatami <zahra.k.khatami@oracle.com>
Reported-by: Akshay Shah <akshay.shah@oracle.com>

Change-Id: Iefd512a7521e0fea90541b3eb547671cfa816ea6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466819
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-09-09 21:52:07 +00:00
Shuhei Matsumoto
8b539eb553 nvme: Set appropriate value to max_xfer_size and max_sge
The SPDK NVMe-oF initiator driver could not transfer I/O larger than
128 KiB, even when the NVMe-oF target allowed larger I/O, for both the
RDMA and TCP transports.

Some use cases need to transfer IO larger than 128KiB.

For the RDMA transport, the max_mr_size reported by ibv_query_device
indicates the maximum size of a single memory region; it is independent
of the actual I/O size and is very likely to be larger than 2 MiB, which
is the granularity at which we currently register memory regions.

In fact, some RDMA NICs return UINT64_MAX for max_mr_size from ibv_query_device.

Hence use UINT32_MAX and let the generic layer use the controller data
to moderate this value.

For the TCP transport, on the other hand, there is no limit on the
maximum I/O size, so use UINT32_MAX there as well.

Besides, for the RDMA transport, max_sges should be the minimum of the
max_sge obtained by querying the RDMA device and
NVME_RDMA_MAX_SGL_DESCRIPTORS. That change is included in this patch as
well.
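
A sketch of the max_sges clamp described above (variable names assumed;
spdk_min and the verbs calls are the real APIs):

    /* Cap the advertised SGE count at both what the RDMA device supports
     * and what this driver's SGL descriptor list can describe. */
    struct ibv_device_attr dev_attr;

    ibv_query_device(rqpair->cm_id->verbs, &dev_attr);
    ctrlr->max_sges = spdk_min(dev_attr.max_sge, NVME_RDMA_MAX_SGL_DESCRIPTORS);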

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idc813afd3e525bf5f370c0fcd2623f9c146a5528
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459218
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-05 06:35:41 +00:00
Shuhei Matsumoto
cf3c54bc03 nvme: Ensure max_sges not to exceed what controller supports in generic layer
Previously, comparing the transport-supported value against the target's
value was done in the RDMA transport layer. However, this comparison
should be done in the generic layer, as is already done for the maximum
I/O transfer size, so this patch moves the comparison into the generic
layer.

Additionally, for MSDBD the value 0 indicates no limit, but we had
mistakenly handled it as a maximum of 0 SGL entries. This patch fixes
that bug as well.
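
The MSDBD handling in the generic layer then looks roughly like this
(the helper name is assumed; the cdata field path follows the NVMe-oF
identify layout):

    /* MSDBD == 0 means the controller imposes no limit on SGL descriptors,
     * so only clamp when it reports a non-zero value. */
    uint32_t max_sges = nvme_transport_ctrlr_get_max_sges(ctrlr);  /* assumed helper */

    if (ctrlr->cdata.nvmf_specific.msdbd != 0) {
            max_sges = spdk_min(max_sges, ctrlr->cdata.nvmf_specific.msdbd);
    }
    ctrlr->max_sges = max_sges;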

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I54365cf114169b10180ec2c659f9c7302672674c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-07-05 06:35:41 +00:00
Jim Harris
ef1f844395 nvme: add qpair parameter to nvme_complete_request
In some cases we have the qpair already when calling
this function.  So pass the qpair to avoid having
to get it from the request.  This shows about a 3%
performance improvement for high-IOPS single-core
tests.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I22fcca560492f4e7cf5ffedd252e41a027d0dd79

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455286
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-22 14:51:01 +00:00
Jim Harris
79fad08a7e nvme: add transport qpair_disconnect function
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9918f8fed0e559be5d865702b647566dd1e2ed18

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453936
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-14 08:48:11 +00:00
Jim Harris
e7d8c05b5d nvme: break out rdma disconnect from destroy
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2ce5413635c68403edf532e53d8e15d04f0fd6c5

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453933
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-14 08:48:11 +00:00
Jim Harris
8986de8b98 nvme: rename transport reconnect function to just connect
The RDMA transport was the only one implementing this
function, and it only did a connect - not a disconnect
followed by a connect.

A later patch will add a matching disconnect function.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib68eb0ff2f8e59f437d6d8831bb37dfddf83e9a4

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-14 08:48:11 +00:00
Jim Harris
f0be163639 nvme: check is_enabled flag at common layer
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I85e8289d10b481d3ca1cd125f73bd5abc4d1bf16

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-14 08:48:11 +00:00
James Bergsten
740b2f5622 nvme: spdk_nvme_ctrlr_get_registers
This function returns a pointer to the PCIe I/O registers for a controller
or NULL if unsupported for this transport.

Used for PCIe only; other transports return NULL.

Use with caution.
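
Typical usage, as a hedged sketch (the return type is assumed to be the
mapped register struct):

    /* Returns the BAR0 register window for PCIe controllers, NULL otherwise. */
    volatile struct spdk_nvme_registers *regs = spdk_nvme_ctrlr_get_registers(ctrlr);

    if (regs != NULL) {
            /* e.g. peek at the controller status register */
            printf("CSTS.RDY = %u\n", regs->csts.bits.rdy);
    }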

Signed-off-by: James Bergsten <jamesx.bergsten@intel.com>
Change-Id: I849f9de9ad259a65b1eef9c1237345eb7195b9bf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452927
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-09 22:58:05 +00:00
Jim Harris
fabd7fbb41 nvme: remove qpair_disable
This transport function is a complete nop now, so
remove it.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5cc6ac75795a3cf5311f24e2ac293fb53d4b9f8c

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453487
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-05-08 01:44:20 +00:00
Jim Harris
783a2a20f1 nvme: add transport_qpair_abort_reqs
This will allow us to move more of the reset-related
functionality to the common layer, as part of enabling
resets for fabrics controllers.

The transport qpair_enable and qpair_fail functions
acted similarly - so those are both removed now and
replaced with this new qpair_abort_reqs function.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9486630ad5b807239b0b5bcde50e8cfd313695d3

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453486
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-08 01:44:20 +00:00
Jim Harris
5d431efd6d nvme: move is_enabled logic to common layer
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idd938f255226256d864f70921ecd70c54769b9b2

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453485
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-08 01:44:20 +00:00
Jim Harris
f366e261a6 nvme: abort aers at common layer
We submit AERs to all controllers - both PCIe and
fabrics.  But currently we only manually abort the
AERs when disabling the qpair for PCIe.  Make this
common instead by creating a new transport function
for aborting AERs.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1e926b61b8035488cdc6e8cb4336b373732f985e

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453482
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-05-08 01:44:20 +00:00
Jim Harris
14e67af3c5 nvme: rename reinit_io_qpair to reconnect_qpair
This better explains what the function is doing,
and makes the name more general so we can use it
for the adminq as well.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6b55761cb141a9a79cdef876be47995d8813b312

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453480
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-08 01:44:20 +00:00
Jim Harris
3a1b2ec262 nvme/rdma: alloc req/rsp during construct (not connect)
This moves us towards not freeing and reallocating
this memory if and when we reconnect the qpair.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic20d3c221442f6206d161760a8bfa7f9b8989d4c

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453479
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-08 01:44:20 +00:00
Jim Harris
6949c71dca nvme/rdma: separate req/rsp allocation from registration
This will simplify some upcoming changes to reconnect
a qpair.  In these cases we only need to re-register
the memory - we shouldn't have to allocate it again.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id8adff313f191fbf11d7502127a2b961f2ca2f6e

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453478
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-05-08 01:44:20 +00:00
Seth Howell
eb6006c242 nvme_rdma: don't send split sgl requests inline.
In order to truly support multi-sgl inline requests in the RDMA
transport, we would need to increase the size of the
spdk_nvme_rdma_req object dramatically. This is because we would need
enough ibv_sge objects in it to support up to the maximum number of SGEs
supported by the target (for SPDK that is up to 16). Instead of doing
that or creating a new pool of shared ibv_sge objects to support that
case, just send split multi-sgl requests through the regular sgl path.

Change-Id: I78313bd88f3ed1cea3b772d9476a00087f49a4dd
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452266
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-05-06 18:11:13 +00:00
Seth Howell
3f2553a3f7 nvme_rdma: Allow the SGL to be larger than the req_length
The upper levels of the stack allow for this, so we should follow that
pattern so I/O doesn't break here.

Change-Id: Ia862f14975a551b0675bafd7709fb7897d0d567e
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450685
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-04-19 17:29:45 +00:00
Ben Walker
d02950e6f5 nvme: Cache the cb_fn and cb_arg in the tracker
This avoids a data-dependent load to find which
callback to call in the completion path.
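
Illustrative sketch of the caching (tracker field names are assumed):

    /* At submit time, copy the callback into the tracker... */
    tr->cb_fn  = req->cb_fn;
    tr->cb_arg = req->cb_arg;

    /* ...so the completion path no longer has to load the request first
     * just to discover which callback to invoke. */
    if (tr->cb_fn) {
            tr->cb_fn(tr->cb_arg, cpl);
    }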

Change-Id: Ifa20790a7af3332a74bc45037e589668744af797
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450558
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-04-10 21:29:03 +00:00
Changpeng Liu
6129e78d26 nvme: small code cleanup for nvme_transport_ctrlr_scan
The identify data is only valid if spdk_nvme_connect()
was used with a Discovery Controller, so move this code
into the section where it belongs.

Change-Id: I1897f38277eafc192552a09556a568e9152bb72d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448500
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-03-21 18:22:28 +00:00
lorneli
815f82b17b nvme: mv submit_tick assignments to generic qpair code
Move req->submit_tick assignments from specific transports to generic
qpair code.

Check whether submit_tick has already been assigned before doing the
actual assignment, because a request may be submitted several times and
the original submit_tick shouldn't be overwritten.
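
In the generic qpair submit path this amounts to a guarded assignment
(a sketch):

    /* Only stamp the tick on the first submission; a request that is queued
     * and resubmitted keeps its original timestamp for timeout tracking. */
    if (req->submit_tick == 0) {
            req->submit_tick = spdk_get_ticks();
    }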

Change-Id: I2de8018dc21763eb5a19bb9d48dfbdef764b036e
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/444702
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-21 20:29:59 +00:00
Changpeng Liu
3306e49e24 nvme: introduce probe context data structure and API
The existing NVMe driver uses a global list, g_nvme_init_ctrlrs, to
track the controllers during initialization, and an internal function
starts each controller in the list one by one until the list is empty.
We introduce a probe context and move the global list into the context.
With the context we can enable an asynchronous probe API in the next
patch, and it also enables a parallel probe feature.

Change-Id: I538537abe8c1a4a82fb168ca8055de42caa6e4f9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/426304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-02-15 03:14:20 +00:00
Darek Stojaczyk
655d54f3f1 nvme: remaining changes related to nvme hooks
Change-Id: I07f3f403bef26a7c3e41b3c9f74e7ba4e378b2cc
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.gerrithub.io/c/443650
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-02-12 19:43:02 +00:00
Seth Howell
e7beb0d1fd nvme_rdma: don't put req until both send and recv have completed
This prevents us from overrunning the send queue.

Change-Id: I6afbd9e2ba0ff266eb8fee2ae0361ac89fad7f81
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/443476
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-02-12 18:45:11 +00:00
Changpeng Liu
d9e865a885 nvme: report SQ deletion code to outstanding admin requests
The RDMA transport reports the SPDK_NVME_SC_ABORTED_POWER_LOSS code
when failing the admin queue; however, SPDK_NVME_SC_ABORTED_SQ_DELETION
makes more sense here, because we know we are going to shut down
the controller.

Fix issue #568.
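
The completion that the aborted admin requests now receive, sketched
(the completion helper name is assumed; the status codes are the real
spdk/nvme_spec.h values):

    /* Complete outstanding admin requests with "aborted - SQ deletion"
     * rather than "aborted - power loss" during controller shutdown. */
    struct spdk_nvme_cpl cpl = {0};

    cpl.status.sct = SPDK_NVME_SCT_GENERIC;
    cpl.status.sc  = SPDK_NVME_SC_ABORTED_SQ_DELETION;
    nvme_complete_request(req, &cpl);   /* helper name assumed */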

Change-Id: I31da095ec92c06079511d89cc2743654ba2c001b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/440132
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-01-17 21:04:43 +00:00
Seth Howell
3018bf907b nvme_rdma: multi-element sgl support for inline reqs
Necessary to avoid erroring out in the edge case where we have an SGL
request sent with two buffers that fit within the in-capsule data size.

Change-Id: If51fb69c402482b564c737319584378cb03e7213
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/436062
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-01-15 17:25:35 +00:00
Seth Howell
8b4534e43e nvmf_rdma: When destroying a qpair, fail the I/O.
This helps us get rid of outstanding requests at the bdev layer.

Change-Id: I362c7c0c6641715fcd96e8eb465b308c368d34fc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/431844
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-12-23 00:59:42 +00:00
Seth Howell
a451c8385e NVMe-oF: Add explicit reports for MR-split buffers:
This is a failsafe for finding and reporting data buffers that span
multiple memory regions. These errors should never be triggered, but
finding and reporting them will aid debugging.
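
A sketch of the failsafe check being added (variable names and error
message wording assumed):

    /* Ask the memory map how far this translation stays valid; if that is
     * shorter than the buffer, the buffer spans multiple MRs - report it. */
    uint64_t requested = req->payload_size;
    uint64_t size = requested;

    spdk_mem_map_translate(map, (uint64_t)payload, &size);
    if (size < requested) {
            SPDK_ERRLOG("Data buffer split over multiple RDMA memory regions\n");
            return -EINVAL;
    }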

Change-Id: I3c61e3cc510f5a36039fc1815ff0de45fce794d5
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/436054
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-12-10 23:47:38 +00:00
Seth Howell
0e6a32deab nvme_rdma/nvmf: add cb_fns to check mr contiguity
This is necessary to confirm that a buffer that spans a 2 MB boundary is
still within a single MR.

Change-Id: If0d14e514ab2197a0d2e3af4f565f56d50591210
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/435179
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-11-29 15:16:13 +00:00
Seth Howell
e688d1ccf1 nvme_rdma: properly configure and store max_sges
The max_send_sge and max_recv_sge values can be set to any value from
0 to dev_attr->max_sge. When we actually set the attributes, we will
receive a qpair whose max_sge values are greater than or equal to what
we initially requested. We need to store the maximum number of SGEs for
later use when constructing work requests.
Previously we did not rely on these values, since we assumed that we
would always be able to get more SGEs than we asked for initially. This
may change as we try to allocate more SGEs to handle splitting buffers
across memory regions.
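
The pattern described above, sketched against the rdma-cm API (the
NVME_RDMA_DEFAULT_*_SGE constants and rqpair fields are assumptions):

    struct ibv_qp_init_attr attr = {0};

    /* Ask for what we want (cq, qp_type, queue depths etc. omitted)... */
    attr.cap.max_send_sge = NVME_RDMA_DEFAULT_TX_SGE;   /* assumed constant */
    attr.cap.max_recv_sge = NVME_RDMA_DEFAULT_RX_SGE;   /* assumed constant */
    rdma_create_qp(rqpair->cm_id, pd, &attr);

    /* ...then record what the provider actually granted, which may be more,
     * and use the stored values when building work requests later. */
    rqpair->max_send_sge = attr.cap.max_send_sge;
    rqpair->max_recv_sge = attr.cap.max_recv_sge;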

Change-Id: Ibbeae1908b86baa3a96d9c6cd2051401aaa2197b
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/433307
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2018-11-16 15:18:41 +00:00
Ben Walker
bf1a82cf5a nvme/rdma: Make RDMA memory map per-pd again
The RDMA memory map needs to be per-protection
domain, not per NVMe controller. Otherwise, when
an NVMe controller is removed, the memory map may
reference an invalid pointer to a detached
controller.

Change-Id: I0c5bd2172daee0c70efb40eab784839e0cde8bc4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/432590
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-12 15:31:34 +00:00
zkhatami88
9fb6947617 nvme: Add mechanism to override RDMA pd/mr behavior
Add a mechanism to modify the RDMA transport's behavior
when creating protection domains and registering memory.
This is entirely optional.

Change-Id: I7cd850e76a673bf5521ca4815b779c53ab9567e8
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.gerrithub.io/421415
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-11-05 17:22:30 +00:00
Ben Walker
438bae7966 nvme: Enable RNR Retry in RDMA transport
The initiator will now automatically retry sending
commands if the target is not ready.
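
In rdma-cm terms this is the rnr_retry_count in the connection
parameters (a sketch; surrounding setup omitted):

    struct rdma_conn_param param = {0};

    /* 7 is the special value meaning "retry indefinitely" when the target
     * responds Receiver-Not-Ready because no receive buffer is posted yet. */
    param.rnr_retry_count = 7;

    rdma_connect(rqpair->cm_id, &param);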

Change-Id: I13354283f77d9ccba9645e83c77061cac1b07b0f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/428732
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2018-10-12 16:29:16 +00:00
Ben Walker
e3cd058cb4 nvme: Don't alias sgl1 when building request
This makes this particular function consistent
with all of the other functions in this file, and
I feel it is slightly more readable.

Change-Id: I99ace5b9eb45b0f706ca85a64b155444f45c9815
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/428730
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-11 20:45:27 +00:00
Ben Walker
6b2fb186f6 nvme: Make it clearer which RDMA sgl element is being filled out
While more verbose, this makes it much more obvious that
an array of SGL elements is being filled out.

Change-Id: I98b8e5d46af32c5d7dbb990e267fdfd594942081
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/428729
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-11 20:45:27 +00:00
Ben Walker
3fe0db6ca3 nvme: Correctly set RDMA num_sge value for every command
Also add some comments.

Change-Id: I97c3a44f97aa3dadc114005c10bec83ae75994cf
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/428728
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-11 20:45:27 +00:00
Ben Walker
b99136081f nvme: Add clarifying comments when building RDMA SGLs
Change-Id: I9e0fc92e422de3fc65c5048a63f4c7dcc46f7324
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/428727
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
2018-10-11 20:45:27 +00:00
wuzhouhui
5d5181db70 nvme/rdma: fix a stack-buffer-overflow error
spdk_mem_map_translate() dereferences a uint64_t * to get an
8-byte integer, but nvme_rdma_build_sgl_request() passed only a
4-byte integer as the last parameter, causing a
stack-buffer-overflow error.
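
The bug class, illustrated (a sketch of the fix, not the literal diff):

    /* spdk_mem_map_translate() writes a full uint64_t through its size
     * pointer, so the local passed in must be 8 bytes wide. */
    uint64_t size = req->payload_size;   /* was a 4-byte integer on the stack */

    spdk_mem_map_translate(map, (uint64_t)payload, &size);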

Reported in 3ba5ea9087.1539172863/fedora-05/build.log

Change-Id: Id1cda22114fef466dbb930b502e3a68310331f0e
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/428693
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2018-10-11 01:58:48 +00:00
Seth Howell
935cdbe497 lib/nvme: Enable multi SGE support in the NVMe-oF host.
Change-Id: Icbbea0d586ae086314085e682fc13aa63fa3c167
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/427543
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-10-09 21:17:17 +00:00
Seth Howell
6820641a5d nvme_rdma: add checks for translation size to request funcs
This series of changes is aimed at enabling spdk_mem_map_translate to
report the length of the valid mapping back to the function that
requested the translation.
This will be useful when retrieving memory regions associated with I/O
buffers in NVMe-oF. For large I/O it is possible that the buffer is
split over multiple MRs and the I/O will have to be split into multiple
SGLs.
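
Roughly how a request body then gets split into one SGE per memory
region (a hedged sketch; variable names and the MR-pointer translation
value are assumptions):

    uint64_t vaddr = (uint64_t)payload;
    uint64_t remaining = req->payload_size;
    int num_sge = 0;

    while (remaining > 0 && num_sge < max_sge) {
            uint64_t size = remaining;
            /* The map reports how far this translation remains valid, so a
             * buffer crossing an MR boundary simply produces another SGE. */
            struct ibv_mr *mr = (struct ibv_mr *)spdk_mem_map_translate(map, vaddr, &size);

            sgl[num_sge].addr   = vaddr;
            sgl[num_sge].length = (uint32_t)spdk_min(size, remaining);
            sgl[num_sge].lkey   = mr->lkey;

            vaddr     += sgl[num_sge].length;
            remaining -= sgl[num_sge].length;
            num_sge++;
    }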

Change-Id: I830aba773e1d247ec571ff31eaba970ced0fd7a0
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425413
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-26 20:57:57 +00:00
Seth Howell
5d57386885 env_dpdk: spdk_mem_map_translate informs user of translation size.
This function now checks whether a memory region is contiguous across
2 MB map entries and returns the total length of that contiguous
buffer, up to the size specified by the user.

Also includes unit tests.

This series of changes is aimed at enabling spdk_mem_map_translate to
report the length of the valid mapping back to the function that
requested the translation.
This will be useful when retrieving memory regions associated with I/O
buffers in NVMe-oF. For large I/O it is possible that the buffer is
split over multiple MRs and the I/O will have to be split into multiple
SGLs.
Change-Id: I2ce582427d451be5a317808d0825c770e12e9a69
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425329
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-26 20:57:57 +00:00
Seth Howell
4e06bb5e6d env: pass an spdk_mem_map_ops structure to mem_map_alloc
This series of changes is aimed at enabling spdk_mem_map_translate to
report the length of the valid mapping back to the function that
requested the translation.
This will be useful when retrieving memory regions associated with I/O
buffers in NVMe-oF. For large I/O it is possible that the buffer is
split over multiple MRs and the I/O will have to be split into multiple
SGLs.

Change-Id: I90da6d4d31c669a3bf046f7721923dd743c5ef21
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2018-09-26 20:57:57 +00:00
Seth Howell
d288c41242 env_dpdk: change behavior of spdk_mem_map_translate
The function now takes a pointer as its last argument, and copies the
size of the memory region for which the translation is valid into that
pointer.
For now, that will always be 2 MB. However, that behavior can change in
the future.

This series of changes is aimed at enabling spdk_mem_map_translate to
report the length of the valid mapping back to the function that
requested the translation.
This will be useful when retrieving memory regions associated with I/O
buffers in NVMe-oF. For large I/O it is possible that the buffer is
split over multiple MRs and the I/O will have to be split into multiple
SGLs.

Change-Id: I8686c166ec956507f5ae55cf602341281482cb89
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/424888
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-09-15 00:28:23 +00:00
Jim Harris
6c391ec277 nvme/rdma: clarify icdoff usage
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9ae4a9abec949942a1df001931dded94396be01e

Reviewed-on: https://review.gerrithub.io/425325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Potnuri Bharat Teja <bharatppotnuri@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2018-09-12 17:00:20 +00:00