64k sessions over the lifetime of a single target is something
that really could happen, so handle this case.
Change-Id: Iaed92b9ff6cd078fcd7c1efe88cf0c860c77c4ac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
For an iSCSI read/write command whose expected_data_xfer_len is 0,
dxfer_dir is set to SPDK_SCSI_DIR_NONE, but the SCSI layer can still
see a read/write op for that command.
This patch handles that case.
Change-Id: I950e163fffb06fefaf8a913d1f6de29c96a52264
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
g_thread_mmio_ctrlr should not be a NULL pointer when the handler
function is entered.
Change-Id: I45dba601c672b16e2c6feafd9059bafde0d8f1b4
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
If a namespace was formatted with per-LBA metadata and with end-to-end protection
disabled, the host could not use the extended-LBA metadata area.
Signed-off-by: Zhihao Zhang <thomas.zzh@alibaba-inc.com>
If the user asked for a specific PCI address in spdk_nvme_probe(), we
need to return 1, not 0, for the other PCI addresses that don't match
when enumerating. 0 means to attach the PCI driver, whereas 1 means to
continue enumerating.
With the previous behavior of returning 0, all NVMe devices would be
attached to the DPDK PCI driver, even if the user did not request that
they be probed, and further calls to spdk_nvme_probe() would not find
any devices.
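A minimal sketch of the return-value convention described above. The
struct pci_addr type and the enum_cb() signature are hypothetical
stand-ins for illustration, not the DPDK or SPDK definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI address; not the DPDK or SPDK type. */
struct pci_addr {
	uint16_t domain;
	uint8_t bus;
	uint8_t dev;
	uint8_t func;
};

static int
pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
	return a->domain == b->domain && a->bus == b->bus &&
	       a->dev == b->dev && a->func == b->func;
}

/*
 * Hypothetical enumeration callback.
 * Returns 0 to attach the PCI driver to this device, or 1 to skip it and
 * continue enumerating. A NULL 'wanted' means no specific address was
 * requested.
 */
static int
enum_cb(const struct pci_addr *found, const struct pci_addr *wanted)
{
	assert(found != NULL);

	if (wanted != NULL && !pci_addr_equal(found, wanted)) {
		return 1; /* not the requested device: leave it unattached */
	}
	return 0; /* requested device, or no filter: attach */
}
```

With a filter set, only the matching address yields 0; every other
device is skipped and stays available for a later probe.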
Change-Id: Ifbbcd7d1abe8ab535b6957855172e66a3e69fbe4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This is not actually optional - it contains required
information for setting up the connection.
Change-Id: I21136de12794a0f4f5c14c5d3e2e3f2306c5c102
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This isn't used anywhere yet, but it will be for
NVMe-oF 1.1.
Change-Id: Ieae0688e6ad5b7a44568e5760382b5716b02e6f0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
The code doesn't actually use this property of cntlid
for anything yet, but we will need it later.
Change-Id: I5fd514d75b903cc8769e7b9f196a4624e9cf876c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
This is necessary to process asynchronous events, as well as to support
keep-alive for NVMe over Fabrics connections.
Based on a patch by Edward Yang <eyang@us.fujitsu.com>
Change-Id: I3e81f3d5061f75b12b625fa1a06629c6dc3dc61b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
There is only a single device ID for all channels on the SKX
implementation of I/OAT.
Change-Id: I90ee79b1b673a199754f1ca4c9e38e934294e261
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This prevents the need for bdev users and modules to manipulate the
internal bdev_io error.nvme fields.
For now, all non-NVMe error types are treated as a generic device error,
but translation from SCSI to NVMe could be added in the future.
Change-Id: I4e831b26a2f41bf2f405c7576d5019bb898d4d1b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently we use the PCI functions provided by DPDK, which
identify devices by class ID information
rather than by PCI BDF (bus/device/function), so add filtering
by pci_addr in the pcie_nvme_enum_cb function.
Change-Id: I5942e98853f00fc10fa6aae5c113517653d1b357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Since nvme_ns_cmd.c now walks the SGL, some of the test code
needs to also be updated to initialize and return correct values
such as ctrlr->flags and sge_length.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I521213695def35d0897aabf57a0638a6c347632e
Convert the number parsing function into a linear sequence with a goto
label for each state, rather than a single loop with a state variable.
This makes the code easier to read and also improves speed (better
branch prediction and smaller inner loops for the common case).
On my test system, jsoncat citylots.json > /dev/null improves from
~1.7s to ~1.2s.
This changes behavior of some number parsing test cases: inputs matching
the number grammar as defined by JSON will be returned even if there is
trailing garbage, consistent with the rest of the parser. For example,
the input 01 will be parsed as a valid number 0 followed by trailing 1.
This only makes any difference when the full input is a single
number value, since if the value was nested in an object or array, the
trailing garbage will not match the expected syntax and the whole parse
will fail with SPDK_JSON_PARSE_INVALID (e.g. [00 will parse the first 0
as a number and then fail on the second 0, since only a comma or right
square bracket would be accepted).
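A rough sketch of the goto-per-state shape and the trailing-garbage
behavior described above. This is not the SPDK parser itself;
scan_json_number() is a hypothetical helper that returns the length of
the longest prefix matching the JSON number grammar (0 if none):

```c
#include <assert.h>
#include <stddef.h>

static int
is_digit(char c)
{
	return c >= '0' && c <= '9';
}

/* Hypothetical helper: length of the longest JSON-number prefix of s. */
static size_t
scan_json_number(const char *s)
{
	const char *p = s;
	const char *end = s; /* end of the longest valid number seen so far */

	assert(s != NULL);

	if (*p == '-') {
		p++;
	}
	if (*p == '0') {
		/* A leading 0 cannot be followed by more integer digits. */
		p++;
		end = p;
		goto after_int;
	}
	if (!is_digit(*p)) {
		goto done;
	}
	while (is_digit(*p)) {
		p++;
	}
	end = p;

after_int:
	if (*p != '.') {
		goto maybe_exp;
	}
	p++;
	if (!is_digit(*p)) {
		goto done; /* no fraction digits: keep the integer part only */
	}
	while (is_digit(*p)) {
		p++;
	}
	end = p;

maybe_exp:
	if (*p != 'e' && *p != 'E') {
		goto done;
	}
	p++;
	if (*p == '+' || *p == '-') {
		p++;
	}
	if (!is_digit(*p)) {
		goto done; /* incomplete exponent: keep what matched before it */
	}
	while (is_digit(*p)) {
		p++;
	}
	end = p;

done:
	return (size_t)(end - s);
}
```

For the input 01 this returns 1: the 0 is accepted as a number and the
1 is left as trailing data for the caller to reject or accept.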
Change-Id: Ifabfaed611219b3e0a06c8677190a28b87e8a13b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This improves output speed significantly, especially if the write
callback is expensive (e.g. issues a syscall or takes a lock).
On my test system, jsoncat citylots.json > /dev/null improves from
~2.8s to ~1.7s.
citylots.json: https://github.com/zemirco/sf-city-lots-json (~181 MiB)
Change-Id: I7d411ce92366712ed87ad5fc6e9b64828541db4d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
If a blockdev module calls spdk_bdev_io_complete() within its
submit_request function, and the user's completion callback issues a new
I/O, it is possible to cause infinite recursion, consuming all available
stack space.
To avoid this, track whether a bdev_io is being processed by
submit_request, and if io_complete() is called in this case, defer the
completion via an event.
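A simplified sketch of the idea, assuming a single in_submit flag per
I/O. All names here are hypothetical, and run_deferred() stands in for
the event that the real code would send:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct io;
typedef void (*io_cb)(struct io *io, void *ctx);

/* Hypothetical I/O descriptor; not the real bdev_io layout. */
struct io {
	bool in_submit; /* true while the module's submit function runs */
	bool deferred;  /* completion was requested during submit */
	io_cb cb;
	void *cb_ctx;
};

static void
io_complete(struct io *io)
{
	assert(io->cb != NULL);

	if (io->in_submit) {
		/* Called from inside submit: defer instead of recursing. */
		io->deferred = true;
		return;
	}
	io->cb(io, io->cb_ctx);
}

static void
io_submit(struct io *io, void (*submit_fn)(struct io *io))
{
	io->in_submit = true;
	io->deferred = false;
	submit_fn(io);
	io->in_submit = false;
}

/* Stand-in for the event handler that runs a deferred completion. */
static void
run_deferred(struct io *io)
{
	if (io->deferred) {
		io->deferred = false;
		io->cb(io, io->cb_ctx);
	}
}

/* Example module submit function that completes the I/O inline. */
static void
module_submit_inline(struct io *io)
{
	io_complete(io);
}

/* Example user callback: counts how many times it was invoked. */
static void
count_cb(struct io *io, void *ctx)
{
	(void)io;
	(*(int *)ctx)++;
}
```

Because the user callback only runs after submit has returned, a
callback that issues new I/O starts from a shallow stack each time.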
Change-Id: I6ccdb8ed4ee0d5738e6c9840d35431de52bd5fa2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, we only supported probing the NVMf target
via discovery info; now we also support connecting
to it directly.
Change-Id: I08ce1d95de6744286357e68b48c97b773b902ac8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
I do not see any reason to skip this channel. If there is one,
it should be documented in a comment in the file; otherwise we need to add the channel.
Change-Id: I56ad491c67a23831befc8c761ad0a02e721a15a4
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Because of the addition of io_channel support to the bdev layer, there
is no longer a need to re-run a completed I/O through the submission
event pipeline; it can be freed directly.
Change-Id: I2b9163c87293345acf0e85f6d0c1032f30209659
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Include timer-based pollers in the active/idle check that uses
last_action to determine when a reactor last executed an action.
Change-Id: Ib8f1253675b57aeb59206d099c6257f6d07f5acf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
One microsecond is not really long enough to detect an idle condition
where calling the OS usleep() makes sense. Increase the minimum time
spent spin-waiting on events and pollers from one microsecond to one
millisecond.
Change-Id: I678118e357330f133251f4cfada8ff27e10158a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
When a connection enters full-feature phase and is assigned to an lcore,
we need to increment the counter for the new lcore, not the connection's
existing lcore.
Change-Id: Idced4090b6e8ac35a767fd223fbd81ba824615d3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
Claim the block devices used by iSCSI LUNs and NVMe-oF subsystems so
they can't accidentally be reused.
This will also be used by virtual block devices to allow layering of
bdevs.
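The claim semantics can be sketched roughly as follows. The struct and
function names are hypothetical, not the actual bdev API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block device record; not the real spdk_bdev layout. */
struct bdev {
	const char *name;
	bool claimed; /* true once a consumer owns this device */
};

/* Returns true if the claim succeeded, false if someone already owns it. */
static bool
bdev_claim(struct bdev *b)
{
	assert(b != NULL);

	if (b->claimed) {
		return false;
	}
	b->claimed = true;
	return true;
}

/* Releases a claim so that another consumer can take the device. */
static void
bdev_unclaim(struct bdev *b)
{
	assert(b != NULL);
	b->claimed = false;
}
```

A second consumer that tries to claim the same device gets a failure
instead of silently sharing it.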
Change-Id: I5384923fbf24f13f4ce720a797c5a628053d49f4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
(1) Add nvme_rdma_build_sgl_request function
(2) Merge nvme_rdma_pre/post_copy_mem into nvme_rdma_copy_mem
Change-Id: I86abab821b32b4da0aa9489a6b9f7dc430333159
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Use a plain function pointer + callback context for the bdev I/O
completion callback. This is possible now because each I/O channel will
be polled on the core that submitted the I/O.
Change-Id: I29ee8e4a3430df11c74845adab840395b9bc5010
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
An old prototype SPDK AHCI driver would return
TASK_SET_FULL if all NCQ slots were full on a given
disk. This would kick the SCSI task back to the LUN
to be retried later. Since then, we have pushed
responsibility onto the bdev modules themselves
to handle this kind of queueing/retry logic.
Removing this logic allows us to make some additional
changes that enable tasks to get completed inline without
an extra event callback to handle completion. We also
no longer need to worry about checking if pending tasks
need to be executed in the complete_task() routine, since
the execute() routine will now always exhaust the pending_task
list.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If2dc3ab017e0dbc225c8f627e1f87c5a8e9b1e3e
Now that the hotplug code is isolated in nvme_pcie.c, it can call the
PCIe transport attach function directly.
Change-Id: I2df3b9168473b537cc9b13367e06d3d3b6fa22be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The reactor structures are allocated in a contiguous array, and each
reactor is accessed from a different core, so align the reactor
structure to avoid false sharing.
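A small sketch of this kind of alignment using C11 alignas. The
64-byte cache line size and the struct fields are assumptions made for
illustration, not the real reactor definition:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed; typical on x86-64 */
#define MAX_REACTORS 4     /* arbitrary size for this sketch */

/* Hypothetical reactor; aligning the first member aligns the struct. */
struct reactor {
	alignas(CACHE_LINE_SIZE) uint32_t lcore;
	uint64_t last_action;
};

/* The size is padded to a multiple of the alignment, so array elements
 * never share a cache line. */
static_assert(sizeof(struct reactor) % CACHE_LINE_SIZE == 0,
	      "reactor must occupy whole cache lines");

static struct reactor g_reactors[MAX_REACTORS];

static struct reactor *
reactor_get(size_t i)
{
	assert(i < MAX_REACTORS);
	return &g_reactors[i];
}
```

Each element of the contiguous array then starts on its own cache
line, so a write by one core does not invalidate a line that another
core's reactor lives in.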
Change-Id: I95162620ccb58fae060b2d95e47a38621dfbd140
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
It is private to lib/event/reactor.c and does not need to be exposed in
the global namespace.
Change-Id: Idfff0365a0afdd90a0567825d520adf61d99fd2b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Previously, we did not track the reference count for the LUN.
Change-Id: If2b7bc7d129e7efd994a7987ae2c421048969acb
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
An SGE could describe a payload that is larger than the NVMe
device's MDTS (e.g. 128KB), but that SGE may not be aligned
on a sector-size boundary. We can safely assume that each
iov is individually physically contiguous - the DPDK
mempools for example guarantee this.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8143ed01814c3154d0a06b8bbc548484437c1e88
The spdk_nvme_qpair::num_entries value is never used in the common code,
so move it to the individual transport qpairs to make it clear that it
is a transport-specific implementation detail.
Change-Id: I5c8f0de4fcd808912ba6d248cf5cee816079fd32
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The 'next' event pointer was never used in the entire code base (always
NULL).
Change-Id: I75f999d3a2e10512d86edec1a5a46ef263e2635b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Use 'struct spdk_event *' directly for consistency with the rest of the
API.
Change-Id: Ib41a9bf47f5b18f4aebf5f4dee055455cb12ef7d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
This allows the elimination of the spdk_event_get_arg1() and
spdk_event_get_arg2() macros, which accessed the event structure
directly; this was preventing the event structure definition from being
moved out of the public API header.
Change-Id: I74eced799ad7df61ff0b1390c63fb533e3fae8eb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The public API user is supposed to retrieve the defaults via the
spdk_app_opts_init() function.
Change-Id: Ie2bd6e809b2d47dbd5d62d396e8715f89f4052d9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
The spdk_poller_register() function provides a way to pass an event to
call once the poller is registered, but it is always NULL in the current
code base.
Change-Id: I459bf40ae4d050589577d113b7984f1563aaa9cc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>