For namespaces with end-to-end protection information, metadata size
of exactly 8 bytes, and extended LBA configured, the NVMe driver would
calculate the size of the data block incorrectly. The NVMe spec has a
special provision for this specific case (8-byte metadata only) when
PRACT = 1: the host must not send the metadata as part of the host
memory buffer.
To fix this, clean up the calculation of the per-block data transfer
size by adding a new extended_lba_size field in the namespace, which
represents the total size of data to be transferred per block based on
the namespace's configured metadata size and whether it transfers
metadata as part of the data buffer. Then add the special case for
PRACT = 1 and PI configured and extended LBA in the R/W helper
functions.
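
As a rough illustration of the calculation described above, here is a
minimal sketch. The struct and function names (example_ns,
example_ns_init, example_rw_block_size) are hypothetical and are not
the driver's actual definitions; only the extended_lba_size field name
comes from this change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified namespace description (not the driver's struct). */
struct example_ns {
	uint32_t sector_size;       /* data bytes per logical block */
	uint32_t md_size;           /* metadata bytes per logical block */
	bool     extended_lba;      /* metadata is transferred inside the data buffer */
	bool     pi_enabled;        /* end-to-end protection information is configured */
	uint32_t extended_lba_size; /* bytes transferred per block in the data buffer */
};

/* Compute the per-block transfer size once, when the namespace is set up. */
void
example_ns_init(struct example_ns *ns)
{
	assert(ns != NULL);
	ns->extended_lba_size = ns->sector_size;
	if (ns->extended_lba) {
		ns->extended_lba_size += ns->md_size;
	}
}

/* Per-block host buffer size for a single read/write command. */
uint32_t
example_rw_block_size(const struct example_ns *ns, bool pract)
{
	/*
	 * Special case: PRACT = 1, PI configured, extended LBA, 8-byte
	 * metadata. The host does not send the metadata in this case.
	 */
	if (pract && ns->pi_enabled && ns->extended_lba && ns->md_size == 8) {
		return ns->sector_size;
	}
	return ns->extended_lba_size;
}
```

With a 512-byte sector and 8-byte metadata in extended LBA mode, this
sketch yields 520 bytes per block normally and 512 bytes per block when
PRACT = 1 and PI is configured.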
Change-Id: I0b383a58c773cac06e6c018858b57129064c6059
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
These were repeated in a few different places, so pull them into a
common header file.
Change-Id: Id807fa2cfec0de2e0363aeb081510fb801781985
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This removes one addition from the submission path (negligible, but a
nice side effect) and also opens up the possibility of reporting the
total time an I/O took: since we are always tracking the submission
time anyway, there is no extra cost to report it in the completion
callback.
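
One way to picture this is the sketch below. It assumes that the removed
addition was a "submit time + timeout" expiry computed per I/O; that
detail, and all of the names (example_tracker, example_tracker_submit,
and so on), are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tracker: only the submission time is stored per I/O. */
struct example_tracker {
	uint64_t submit_tick;
};

/* Submission path: record the current tick; no addition is needed here. */
void
example_tracker_submit(struct example_tracker *tr, uint64_t now_tick)
{
	assert(tr != NULL);
	tr->submit_tick = now_tick;
}

/* Timeout check: compare the elapsed ticks against the timeout when polling. */
bool
example_tracker_timed_out(const struct example_tracker *tr, uint64_t now_tick,
			  uint64_t timeout_ticks)
{
	return now_tick - tr->submit_tick >= timeout_ticks;
}

/* Completion path: the total I/O time follows from the stored submit time. */
uint64_t
example_tracker_io_ticks(const struct example_tracker *tr, uint64_t now_tick)
{
	return now_tick - tr->submit_tick;
}
```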
Change-Id: I7129e7c09d20da8082042a7622d045846461dd9c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
For infinite timeout states, print "no timeout" instead of printing
UINT64_MAX as a decimal number.
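
A minimal sketch of this kind of formatting follows. The helper name
example_format_timeout and the exact message text are hypothetical; the
real log message in the driver may differ.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a timeout for logging; UINT64_MAX is treated as "infinite". */
int
example_format_timeout(char *buf, size_t len, uint64_t timeout)
{
	assert(buf != NULL);
	if (timeout == UINT64_MAX) {
		return snprintf(buf, len, "no timeout");
	}
	return snprintf(buf, len, "timeout %" PRIu64, timeout);
}
```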
Change-Id: I579f5857f96286734940ab5f493261e60354c4fe
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The PCIe transport initializes the quirks directly, so the generic hook
to get PCI ID is no longer necessary. This path was dead code.
Change-Id: I25bdaa598db53e4312a264d9d8356d1b416696e5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The logic to fail queue pairs when the controller is failed should be
handled in the generic code, not in the individual transports.
This also allows nvme_qpair_fail() to be private to nvme_qpair.c.
Change-Id: I6194576dceb35073b9af8847e59314900028637c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is more CPU efficient than only grabbing one
completion per call to ibv_poll_cq.
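
The batching pattern can be sketched without libibverbs as shown below.
example_poll_cq is a hypothetical stand-in that mimics the shape of
ibv_poll_cq (return up to a maximum number of completions per call); the
batch size of 16 and all type names are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_CQ_DEPTH 64
#define EXAMPLE_BATCH    16

/* Hypothetical work completion and completion queue. */
struct example_wc {
	uint64_t wr_id;
};

struct example_cq {
	struct example_wc pending[EXAMPLE_CQ_DEPTH];
	int head;       /* index of the oldest pending completion */
	int count;      /* number of pending completions */
	int poll_calls; /* how many times the queue was polled */
};

/* Stand-in for ibv_poll_cq(): copy out up to max completions. */
int
example_poll_cq(struct example_cq *cq, int max, struct example_wc *out)
{
	int n = 0;

	assert(cq != NULL && out != NULL);
	cq->poll_calls++;
	while (n < max && cq->count > 0) {
		out[n++] = cq->pending[cq->head];
		cq->head = (cq->head + 1) % EXAMPLE_CQ_DEPTH;
		cq->count--;
	}
	return n;
}

/* Drain the queue in batches instead of one completion per poll call. */
int
example_process_completions(struct example_cq *cq)
{
	struct example_wc wc[EXAMPLE_BATCH];
	int total = 0;
	int n;

	do {
		n = example_poll_cq(cq, EXAMPLE_BATCH, wc);
		/* A real caller would handle wc[0] .. wc[n - 1] here. */
		total += n;
	} while (n == EXAMPLE_BATCH);

	return total;
}
```

In this sketch, 20 pending completions are drained with two poll calls
rather than 20.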
Change-Id: I0c70d33639f0f345482d9e7c810f9c6723937058
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Previously, we did not ack the event in exceptional cases after
getting an event via rdma_get_cm_event. As a result, the code could
block in this statement:
rdma_destroy_id(rqpair->cm_id);
This patch fixes that issue.
Change-Id: Iddb6fb5356a5ee0ed04e261a040ba53042fca302
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
This makes sure that qpair failure can be initiated from the upper-level application.
Change-Id: I7e04fe36929cc634ddf0078db96fbc40afb38f8c
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
g_thread_mmio_ctrlr should not be a NULL pointer when the handler
function is entered.
Change-Id: I45dba601c672b16e2c6feafd9059bafde0d8f1b4
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
If a namespace was formatted with the per-LBA metadata feature and end-to-end protection
was disabled, the host could not use the per-extended-LBA metadata area.
Signed-off-by: Zhihao Zhang <thomas.zzh@alibaba-inc.com>
If the user asked for a specific PCI address in spdk_nvme_probe(), we
need to return 1, not 0, for the other PCI addresses that don't match
when enumerating. 0 means to attach the PCI driver, whereas 1 means to
continue enumerating.
With the previous behavior of returning 0, all NVMe devices would be
attached to the DPDK PCI driver, even if the user did not request that
they be probed, and further calls to spdk_nvme_probe() would not find
any devices.
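
The return value convention can be sketched as follows. The types and
the callback name here are hypothetical and simplified; they are not the
DPDK or SPDK definitions, and only the meaning of 0 (attach) and 1
(continue enumerating) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical PCI address (domain:bus:dev.func). */
struct example_pci_addr {
	uint16_t domain;
	uint8_t  bus;
	uint8_t  dev;
	uint8_t  func;
};

/* Hypothetical enumeration context: an optional address filter. */
struct example_enum_ctx {
	bool has_addr;                /* true if the user asked for one device */
	struct example_pci_addr addr; /* the requested address, if any */
};

bool
example_pci_addr_equal(const struct example_pci_addr *a,
		       const struct example_pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->dev == b->dev && a->func == b->func;
}

/* Returns 0 to attach the device, 1 to skip it and continue enumerating. */
int
example_enum_cb(const struct example_enum_ctx *ctx,
		const struct example_pci_addr *dev_addr)
{
	assert(ctx != NULL && dev_addr != NULL);
	if (ctx->has_addr && !example_pci_addr_equal(&ctx->addr, dev_addr)) {
		return 1; /* not the requested device: leave it alone */
	}
	return 0;
}
```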
Change-Id: Ifbbcd7d1abe8ab535b6957855172e66a3e69fbe4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently we use the PCI functions provided by DPDK, which identify
devices by class ID information rather than by PCI BDF, so add
filtering by pci_addr in the pcie_nvme_enum_cb function.
Change-Id: I5942e98853f00fc10fa6aae5c113517653d1b357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Since nvme_ns_cmd.c now walks the SGL, some of the test code
needs to also be updated to initialize and return correct values
such as ctrlr->flags and sge_length.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I521213695def35d0897aabf57a0638a6c347632e
Previously, we only supported probing the NVMf target via discovery
info; now we can also connect to it directly.
Change-Id: I08ce1d95de6744286357e68b48c97b773b902ac8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
(1) Add nvme_rdma_build_sgl_request function
(2) Merge nvme_rdma_pre/post_copy_mem into nvme_rdma_copy_mem
Change-Id: I86abab821b32b4da0aa9489a6b9f7dc430333159
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Now that the hotplug code is isolated in nvme_pcie.c, it can call the
PCIe transport attach function directly.
Change-Id: I2df3b9168473b537cc9b13367e06d3d3b6fa22be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spdk_nvme_qpair::num_entries value is never used in the common code,
so move it to the individual transport qpairs to make it clear that it
is a transport-specific implementation detail.
Change-Id: I5c8f0de4fcd808912ba6d248cf5cee816079fd32
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The definition of SPDK_UNREACHABLE uses the build-time DEBUG definition,
which is not available in the public API.
Change-Id: I1862c99fa5c85ccd3483f94e9c35de531da57f3c
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Instead of passing the work completion, just pass the
response index. This keeps the work completions localized
to the polling function.
Change-Id: I0e6a1d8564200b5ac3aa43dfd58ae152d439bbd8
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This eliminates an if statement, since the two callers
of this function know the desired queue size.
Change-Id: I28fabac8613f7b8fc7d96cf95b085b6e4dcf985f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Just call the regular qpair create function instead.
Change-Id: Ic35b1eb6fcdf0d82733ea573a493f583dd63d5bd
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Use the num_entries value in the generic qpair instead. These
values had to match anyway.
Change-Id: Ia6400fbaba97df3ef6db4dc07a2ab95af1e5143f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Avoid an extra level of pointer chasing when we are filling out the NVMe
SGL.
Change-Id: I1a40af16fda80f7480c419524876bfb1a1902eb8
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This will allow it to be reused more easily by some future patches
enabling splitting of non-PRP-compliant SGL-based requests.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ica38fd6cf191f72baa524bcc4896b3c9939ab762
This intermediate function is no longer needed.
Change-Id: I3523cc6d8f3b290165a953d42cca8b76eda762c5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Sending the fabric connect command is part of establishing
a connection, so move it into the main connection-establishing
function.
Change-Id: I55e7ffdd16b576c81b51d7d3910203f9afc1f4c2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This function initializes the members of an existing
qpair struct. It doesn't construct one from scratch.
Change-Id: I0b9afac1ad25cfb217efd146702f693c74f5f697
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
No need to allocate all of the requests and responses until
we know a connection can be established.
Change-Id: I072a10aadfd7ced773634448f7d7e788622d0a4c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The code is clearer if this function is incorporated
into its only caller.
Change-Id: I33901cddf80ae27896b2acfd1b9e7d212f21f5f3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is resolving the address and route to the target, not
binding a socket to an address.
Change-Id: I80055481ed2e020410a1e186a4e7371b60faaee9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
They were very close to the same already, so finish the job.
Change-Id: Ifba9e3b2d11a3e70cbfbe46f57a67552db2757ed
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
We should be sending the bounce buffer's remote key to the target so it
can put it into an RDMA SGE on the remote side.
Change-Id: Icded155ad2292c67baa722f001c9c07178bc2754
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is no particular reason for this to be 127; make it 128 to at
least be consistent with the PCIe transport.
Change-Id: I60500e0044d3549ba6350e1f35f09d624848bd21
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This function was only called from one place and saved no
lines of code.
Change-Id: If5e653732df57c1f2c93e20cf4f286eac31df91c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This matches the behavior of nvme_pcie, which queues a request if no
tracker is available.
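
The queueing behavior can be sketched roughly as below. The structures
and function names are hypothetical and much simpler than the real
transports; the sketch only shows a request being parked when no tracker
is free and resubmitted once a completion releases one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request and queue pair. */
struct example_req {
	struct example_req *next;
	int submitted; /* set to 1 once a tracker has been assigned */
};

struct example_qpair {
	int free_trackers;               /* trackers currently available */
	struct example_req *queued_head; /* requests waiting for a tracker */
	struct example_req *queued_tail;
};

/* Submit a request, or queue it if no tracker is available. */
void
example_qpair_submit(struct example_qpair *q, struct example_req *req)
{
	assert(q != NULL && req != NULL);
	req->next = NULL;

	if (q->free_trackers == 0) {
		/* No tracker: append to the tail of the waiting list. */
		if (q->queued_tail != NULL) {
			q->queued_tail->next = req;
		} else {
			q->queued_head = req;
		}
		q->queued_tail = req;
		return;
	}

	q->free_trackers--;
	req->submitted = 1;
}

/* A completion frees one tracker and resubmits the oldest queued request. */
void
example_qpair_complete(struct example_qpair *q)
{
	struct example_req *req;

	q->free_trackers++;
	req = q->queued_head;
	if (req != NULL) {
		q->queued_head = req->next;
		if (q->queued_head == NULL) {
			q->queued_tail = NULL;
		}
		example_qpair_submit(q, req);
	}
}
```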
Change-Id: Idbf6c951c89451cfea22ec6bc553ff46f988f818
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>