387 Commits

Author SHA1 Message Date
Cunyin Chang
b7349216b9 nvme: optimization for nvme probe and detachment.
Remove the unnecessary rte_eal_pci_probe_one() call in
spdk_pci_device_detach(). It could cause an error message when the
application terminates, and it makes no sense to probe a device right
after detaching it. When a specific device address is given, call
spdk_pci_nvme_device_attach() instead of spdk_pci_nvme_enumerate();
DPDK will then try to scan the device and add it back to the PCI device list.

Change-Id: I35f5bb412249bb20da57394f0531c10a49691906
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-03-15 09:00:25 -07:00
Daniel Verkamp
83e556534b nvme/rdma: eliminate bounce buffer copy
Register all spdk_malloc() memory regions as ibv_mr in a spdk_mem_map
so we can look up the RDMA key for the user's buffer and pass it in the SGL
directly, rather than copying through a pre-registered bounce buffer.

Change-Id: I7340bc2020b5256750c95dbd24ba67961404e5e7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-10 11:39:28 -07:00
Changpeng Liu
8482b47d53 nvme: initialize namespace capability flags with correct order
The extended LBA format flag should be initialized after the namespace
capability flags.

Change-Id: Iad479b454bb4e31120c17d40ae23937a099c6f8f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
2017-03-10 11:39:09 -07:00
Ben Walker
ac9b92c853 nvme/rdma: Consolidate send_cq and recv_cq
The send completions must be processed prior to the
recv completions. However, if the completion queues
are separate, this leaves a small window where
a send and a recv completion both arrive between polling
the send_cq and the recv_cq, resulting in the code
seeing the recv completion prior to the send
completion.

By combining the completion queues, this eliminates
any potential gap. The send completion will always
be processed before the recv completion.

Change-Id: I06bfef6af48559d0b9e00524ebc10f1a102e7387
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-03-09 15:08:18 -07:00
Daniel Verkamp
b55d837e7b nvme: only include x86intrin.h for x86 platforms
The actual uses of intrinsics are already guarded by feature-specific
ifdefs in nvme_pcie_copy_command(), but the header itself should also
only be included when it will actually be needed.

Change-Id: Ife65d6432b8dfd9d9db80fe4e385ab76491874c0
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-07 12:44:10 -07:00
Daniel Verkamp
cb0538aefb nvme: clamp io_queue_size to valid range
I/O queues must have at least 2 entries.

Change-Id: Ib11b08fab4b518501f8ce80d6b73d9e33c9777a6
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-06 15:30:04 -07:00
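Sketch (not part of the commit): the clamp described in cb0538aefb could look roughly like the following. The function and macro names are hypothetical, and the upper bound is assumed to be supplied by the caller; only the minimum of 2 entries comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* The minimum of 2 entries is stated in the commit message. */
#define EXAMPLE_IO_QUEUE_MIN_ENTRIES 2u

/*
 * Clamp a requested I/O queue size to [2, max_entries].
 * max_entries is a hypothetical caller-supplied upper limit.
 */
static uint32_t
example_clamp_io_queue_size(uint32_t requested, uint32_t max_entries)
{
    assert(max_entries >= EXAMPLE_IO_QUEUE_MIN_ENTRIES);

    if (requested < EXAMPLE_IO_QUEUE_MIN_ENTRIES) {
        return EXAMPLE_IO_QUEUE_MIN_ENTRIES;
    }
    if (requested > max_entries) {
        return max_entries;
    }
    return requested;
}
```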
Jim Harris
37ccb50c50 nvme: allow for deletion of I/O qpairs during their completion context
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibc6566e9248cd7004aa5d4374f32b519062ed6d9
2017-03-06 14:01:25 -07:00
Daniel Verkamp
6a03451228 nvme/rdma: use spdk_malloc() for DMA data buffers
These allocations need to be from memory registered with the SPDK env
library to allow future work on automatic ibverbs memory registration.

Change-Id: I6ec6999ecd6d6bf6ba4ab159630f7d01f3d46154
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-03 10:23:20 -07:00
Daniel Verkamp
b9ca539390 nvme: add transport ID comparison function
Change-Id: I4ab7cbad1c31ac57347bb470f3f545d3ca81de43
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-01 14:35:18 -07:00
Daniel Verkamp
a3a3d7dddf nvme: export transport ID trtype and adrfam parsers
Change-Id: I8d417c5baa68832c568c3df0927d84e2f44bb887
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-03-01 14:35:18 -07:00
Isaac Otsiabah
687e93c4a5 nvme: export abort command as public API
2017-02-27 10:44:17 -07:00
Daniel Verkamp
b6d6cd8e81 nvme: pass command ID to timeout callback
This can be used for issuing an abort for the timed-out command.

Change-Id: I3c5727fdddc156cd7c8f99afbc3e6da8e73bba56
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-02-23 17:49:01 -07:00
Ziye Yang
27cf11d04e nvme,rdma: optimize nvme_rdma_post_recv
Move the ibv_recv_wr initialization into
nvme_rdma_alloc_rsps to save some
CPU time.

Change-Id: Id449b2684290431f8b3ba97ec4058171d34038bf
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-02-22 19:15:41 -07:00
Ziye Yang
b89e758c8d nvme.rdma: move ibv_send_wr init while allocate rdma reqs
We do not need to set it on each submission since the contents
are the same.

Change-Id: I345094e2e8a858b318be73d28f09393566587d95
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-02-22 19:14:54 -07:00
Daniel Verkamp
5f2f2052c0 nvme: reorder spdk_nvme_timeout_cb arguments
Move cb_arg to the first argument to match the other NVMe callback
function signatures.

Change-Id: I4e699c8071dcb7ba4ce3cdb82ee985600208204c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-02-22 17:55:43 -07:00
Ben Walker
08c69c9cc5 nvme: Handle failing MMIO reads while resetting
This has been reported for a number of different device
types. We suspect these devices are technically out of
spec, but they work with most other available NVMe
drivers by accident.

Change-Id: I529cfc03fc314cbab2a1cd40620bf1dd5b54182d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-02-21 14:20:41 -07:00
Daniel Verkamp
59fc5ba613 nvme: fix extended LBA block size calculations
For namespaces with end-to-end protection information, metadata size
of exactly 8 bytes, and extended LBA configured, the NVMe driver would
calculate the size of the data block incorrectly.  The NVMe spec has a
special provision for this specific case (8-byte metadata only) and
PRACT = 1 that requires that the host does not send the metadata as part
of the host memory buffer.

To fix this, clean up the calculation of the per-block data transfer
size by adding a new extended_lba_size field in the namespace, which
represents the total size of data to be transferred per block based on
the namespace's configured metadata size and whether it transfers
metadata as part of the data buffer.  Then add the special case for
PRACT = 1 and PI configured and extended LBA in the R/W helper
functions.

Change-Id: I0b383a58c773cac06e6c018858b57129064c6059
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-02-10 10:24:26 -07:00
Daniel Verkamp
84d904841f util: move common helper functions to util.h
These were repeated a few different places, so pull them into a common
header file.

Change-Id: Id807fa2cfec0de2e0363aeb081510fb801781985
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-02-10 09:38:52 -07:00
Daniel Verkamp
ff0047816d nvme: track submit time instead of timeout time
This removes one addition from the submission path (negligible, but a
nice side effect), but also opens up the possibility of reporting the
total time an I/O took - since we are always tracking the submission
time anyway, there is no extra cost to report it in the completion
callback.

Change-Id: I7129e7c09d20da8082042a7622d045846461dd9c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-02-08 09:23:58 -07:00
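Sketch (not part of the commit): storing the submission tick rather than a precomputed deadline, as ff0047816d describes, moves the addition out of the submission path and makes the elapsed time available at completion. All names below are hypothetical, and ticks are assumed to come from a monotonic counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical request tracker: only the submission time is stored. */
struct example_req {
    uint64_t submit_tick;
};

/* Total ticks since submission; assumes a monotonic tick source. */
static uint64_t
example_req_elapsed(const struct example_req *req, uint64_t now)
{
    assert(now >= req->submit_tick);
    return now - req->submit_tick;
}

/* The timeout check is done at poll time, not at submission time. */
static bool
example_req_timed_out(const struct example_req *req, uint64_t now,
                      uint64_t timeout_ticks)
{
    return example_req_elapsed(req, now) >= timeout_ticks;
}
```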
Daniel Verkamp
d57f3db1b5 nvme: clarify nvme_ctrlr_set_state debug prints
For infinite timeout states, instead of printing UINT64_MAX as a
decimal number, print "no timeout".

Change-Id: I579f5857f96286734940ab5f493261e60354c4fe
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-31 18:06:09 -07:00
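Sketch (not part of the commit): one way to render the timeout for a debug print, treating UINT64_MAX as "no timeout". The function name, the message wording, and the millisecond unit are assumptions made for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a state timeout for a debug message.
 * Returns the snprintf() result (length the full text needs).
 */
static int
example_format_timeout(uint64_t timeout_ms, char *buf, size_t len)
{
    assert(buf != NULL);
    if (timeout_ms == UINT64_MAX) {
        return snprintf(buf, len, "no timeout");
    }
    return snprintf(buf, len, "timeout %" PRIu64 " ms", timeout_ms);
}
```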
Daniel Verkamp
018e6afff2 nvme: remove ctrlr_get_pci_id transport callback
The PCIe transport initializes the quirks directly, so the generic hook
to get PCI ID is no longer necessary.  This path was dead code.

Change-Id: I25bdaa598db53e4312a264d9d8356d1b416696e5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-31 14:50:40 -07:00
Daniel Verkamp
a384110d7a nvme: move nvme_qpair_fail() call to generic code
The logic to fail queue pairs when the controller is failed should be
handled in the generic code, not in the individual transports.

This also allows nvme_qpair_fail() to be private to nvme_qpair.c.

Change-Id: I6194576dceb35073b9af8847e59314900028637c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-31 14:50:40 -07:00
Ben Walker
eb2ec1b07b nvme/rdma: Reap multiple completions per poll
This is more CPU efficient than only grabbing one
completion per call to ibv_poll_cq.

Change-Id: I0c70d33639f0f345482d9e7c810f9c6723937058
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-30 14:16:05 -07:00
Daniel Verkamp
b809cdd3ab env: add spdk_mempool_create() socket_id parameter
Change-Id: I8ab0bb2c6e1fdf1681fbd049a096c1768e54dc27
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-30 13:05:30 -07:00
Ziye Yang
47ae3957eb nvme/rdma: Invoke rdma_ack_cm_event in exceptional cases
Previously, we did not ack the event in exceptional cases
after getting an event via rdma_get_cm_event.
As a result, the code could block in this statement:
rdma_destroy_id(rqpair->cm_id);
in some exceptional cases. This patch fixes the
issue.

Change-Id: Iddb6fb5356a5ee0ed04e261a040ba53042fca302
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-26 10:48:23 -07:00
Cunyin Chang
5f3e922a6a nvme: Fail the qpair if it is marked as failure when process the completion queue.
This makes sure that a qpair failure can be initiated from the upper-level application.

Change-Id: I7e04fe36929cc634ddf0078db96fbc40afb38f8c
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-25 16:31:52 -07:00
Ben Walker
d77c030172 nvmf: NVMe-oF 1.1 adds cntlid to RDMA private data
Change-Id: I44ec5264fc93fa85706750cb23bbd0ed0587db81
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Cunyin Chang
2d5087b305 nvme: Add assert for g_thread_mmio_ctrlr in sigbus error handler function.
g_thread_mmio_ctrlr should not be a NULL pointer when the handler
function is entered.

Change-Id: I45dba601c672b16e2c6feafd9059bafde0d8f1b4
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-23 16:10:09 -07:00
HaoZhiZhang
49daf72e0e nvme: support extended LBA without protection information (#101)
If a namespace was formatted with per-LBA metadata and with end-to-end protection
disabled, the host could not use the per-block extended LBA metadata area.

Signed-off-by: Zhihao Zhang <thomas.zzh@alibaba-inc.com>
2017-01-23 11:20:04 -07:00
Daniel Verkamp
d63a30e39d nvme/pcie: return 1 when PCI address doesn't match
If the user asked for a specific PCI address in spdk_nvme_probe(), we
need to return 1, not 0, for the other PCI addresses that don't match
when enumerating.  0 means to attach the PCI driver, whereas 1 means to
continue enumerating.

With the previous behavior of returning 0, all NVMe devices would be
attached to the DPDK PCI driver, even if the user did not request for
them to be probed, and further calls to spdk_nvme_probe() would not find
any devices.

Change-Id: Ifbbcd7d1abe8ab535b6957855172e66a3e69fbe4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-20 17:07:30 -07:00
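Sketch (not part of the commit): the return-value convention described in d63a30e39d, where 0 means "attach" and 1 means "skip and continue enumerating". The address struct and callback signature are hypothetical and simplified; they are not the DPDK or SPDK types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified PCI address. */
struct example_pci_addr {
    uint16_t domain;
    uint8_t bus;
    uint8_t dev;
    uint8_t func;
};

static int
example_pci_addr_equal(const struct example_pci_addr *a,
                       const struct example_pci_addr *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->dev == b->dev && a->func == b->func;
}

/*
 * Enumeration callback: wanted is NULL when the user did not ask for a
 * specific address. Returns 0 to attach, 1 to continue enumerating.
 */
static int
example_enum_cb(const struct example_pci_addr *wanted,
                const struct example_pci_addr *found)
{
    assert(found != NULL);
    if (wanted != NULL && !example_pci_addr_equal(wanted, found)) {
        return 1; /* not the requested device: leave it alone */
    }
    return 0; /* attach this device */
}
```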
Daniel Verkamp
be8a9d6966 nvme: add transport ID string parsing function
Change-Id: I33c15c8a56c25667567b373d21a117cca1f756c7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 14:05:57 -07:00
Ziye Yang
0a573526b6 nvme/pcie: Add the support to probe nvme by pci_addr
Currently we use the PCI functions provided by DPDK,
which identify the device by class ID related
info but not by PCI BDF info, so we add filtering
by pci_addr in the pcie_nvme_enum_cb function.

Change-Id: I5942e98853f00fc10fa6aae5c113517653d1b357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-18 15:30:45 -07:00
Jim Harris
86e8a920bf nvme: split non-compliant SGLs into multiple requests
Since nvme_ns_cmd.c now walks the SGL, some of the test code
needs to also be updated to initialize and return correct values
such as ctrlr->flags and sge_length.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I521213695def35d0897aabf57a0638a6c347632e
2017-01-17 07:51:09 -07:00
Ziye Yang
d61ddd3c93 nvme/rdma: Support directly connect via trid
Previously, we only supported probing the NVMf target
via discovery info; now we also support connecting
to it directly.

Change-Id: I08ce1d95de6744286357e68b48c97b773b902ac8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-13 10:57:03 -07:00
Ziye Yang
90f13aa634 nvme/rdma: Support sgl for readv/writev functions
(1) Add nvme_rdma_build_sgl_request function
(2) Merge nvme_rdma_pre/post_copy_mem to nvme_rdma_copy_mem

Change-Id: I86abab821b32b4da0aa9489a6b9f7dc430333159
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-11 12:36:52 -07:00
Daniel Verkamp
f80c0f4fdd nvme: remove transport ctrlr_attach callback
Now that the hotplug code is isolated in nvme_pcie.c, it can call the
PCIe transport attach function directly.

Change-Id: I2df3b9168473b537cc9b13367e06d3d3b6fa22be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-09 11:36:27 -07:00
Daniel Verkamp
df8129fb39 nvme: move num_entries to transport-specific qpairs
The spdk_nvme_qpair::num_entries value is never used in the common code,
so move it to the individual transport qpairs to make it clear that it
is a transport-specific implementation detail.

Change-Id: I5c8f0de4fcd808912ba6d248cf5cee816079fd32
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 15:49:09 -07:00
Daniel Verkamp
f1a9afa9fc assert.h: split UNREACHABLE into internal header
The definition of SPDK_UNREACHABLE uses the build-time DEBUG definition,
which is not available in the public API.

Change-Id: I1862c99fa5c85ccd3483f94e9c35de531da57f3c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00
Ben Walker
c2d38ee8da nvme/rdma: Improve error message when polling fails
Change-Id: Ia75d62669f3c1f923031f6a33e4797cc56ac1141
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Ben Walker
d9a3a8e50c nvme/rdma: nvme_rdma_recv now takes the response index
Instead of passing the work completion, just pass the
response index. This keeps the work completions localized
to the polling function.

Change-Id: I0e6a1d8564200b5ac3aa43dfd58ae152d439bbd8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Ben Walker
0ebf93e35c nvme/rdma: Pass qsize to nvme_rdma_ctrlr_create_qpair
This eliminates an if statement, since the two callers
of this function know the desired queue size.

Change-Id: I28fabac8613f7b8fc7d96cf95b085b6e4dcf985f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Ben Walker
c26c655d59 nvme/rdma: Eliminate nvme_rdma_ctrlr_construct_admin_qpair
Just call the regular qpair create function instead.

Change-Id: Ic35b1eb6fcdf0d82733ea573a493f583dd63d5bd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Ben Walker
3ad0c336c8 nvme/rdma: Eliminate max_queue_depth
Use the num_entries value in the generic qpair instead. These
values had to match anyway.

Change-Id: Ia6400fbaba97df3ef6db4dc07a2ab95af1e5143f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Ben Walker
10aed30149 nvme/rdma: Handle requested qsize not matching obtained qsize.
Change-Id: I44d9e01dba87c2f78ef13296b78532255d749432
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-05 09:24:02 -07:00
Daniel Verkamp
e1b514ce9a nvme/rdma: cache value of bb_mr->rkey
Avoid an extra level of pointer chasing when we are filling out the NVMe
SGL.

Change-Id: I1a40af16fda80f7480c419524876bfb1a1902eb8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-03 16:56:34 -07:00
Cunyin Chang
6152d5bf8f nvme: move hot-plug logic to nvme pcie layer.
Change-Id: I2042b34e4284a9c59aa3092ccd061c075748880c
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-03 16:56:13 -07:00
Jim Harris
4cc1cf888c nvme: break out request child creation code to separate function
This will allow it to be reused more easily by some future patches
enabling splitting of non-PRP-compliant SGL-based requests.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ica38fd6cf191f72baa524bcc4896b3c9939ab762
2017-01-03 15:16:05 -07:00
Yongseok Oh
ee5b26a2d2 nvme: LBA variable fixed to 64bits for _nvme_ns_cmd_setup_request() function (#96)
2017-01-03 09:35:37 -07:00
GangCao
5a3def3936 nvme: free allocated contig_buffer
Change-Id: I1ce45d4937ee7a170d74f381cfaed263e3420ccd
Signed-off-by: GangCao <gang.cao@intel.com>
2016-12-20 13:49:31 -07:00
Ben Walker
bf89b4da10 nvme/rdma: Eliminate _nvme_rdma_ctrlr_create_qpair
This intermediate function is no longer needed.

Change-Id: I3523cc6d8f3b290165a953d42cca8b76eda762c5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2016-12-19 12:51:50 -07:00