Commit Graph

1048 Commits

Author SHA1 Message Date
Ben Walker
3cc3f2646a nvmf: Move trace point declarations to bottom of nvmf_internal.h
Change-Id: I805d5e150feb18bc62156b592d4052c9dbdd6f89
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Ben Walker
dc42663305 nvmf: Remove duplicated transport init
This just appears to be a bug.

Change-Id: Icd888fec47a392def646b388a61a1003a7b2aaac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Ben Walker
06b9c46561 nvmf: Add utility functions to create/destroy listen addresses.
Change-Id: I58c21caa8f7f0b564c6d8684fe6c7501e810dfa0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Ben Walker
ec38ec127c nvmf: Handle wrap-around for global cntlids
64k sessions over the lifetime of a single target is something
that really could happen, so handle this case.

Change-Id: Iaed92b9ff6cd078fcd7c1efe88cf0c860c77c4ac
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Ben Walker
d77c030172 nvmf: NVMe-oF 1.1 adds cntlid to RDMA private data
Change-Id: I44ec5264fc93fa85706750cb23bbd0ed0587db81
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-23 16:46:55 -07:00
Ziye Yang
4133842d36 scsi: fix the scsi read write direction issue.
For iscsi read/write, expected_data_xfer_len
is 0, dxfer_dir is set to SPDK_SCSI_DIR_NONE.
But we can still have read/write op in SCSI layer.
This patch solves this issue.

Change-Id: I950e163fffb06fefaf8a913d1f6de29c96a52264
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-23 16:12:27 -07:00
Cunyin Chang
2d5087b305 nvme: Add assert for g_thread_mmio_ctrlr in sigbus error handler function.
The g_thread_mmio_ctrlr should be not NULL pointer when it enter the
handler function.

Change-Id: I45dba601c672b16e2c6feafd9059bafde0d8f1b4
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-23 16:10:09 -07:00
Ziye Yang
4a5a24d537 ioat: cleanup logic in spdk_ioat_submit_copy
Change-Id: I90614b93e1eb6b7b09ca7a21efe5a782f08a9da6
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-23 16:09:11 -07:00
Tsuyoshi Uchida
5ee4728d0c log: define prioritynames[] (#102) 2017-01-23 16:07:29 -07:00
HaoZhiZhang
49daf72e0e nvme: support extended LBA without protection information (#101)
If namespace is formatted with per lba metadata feature and also disable end-to-end protection
feature, host couldn't use per extended-lba metadata area.

Signed-off-by: Zhihao Zhang <thomas.zzh@alibaba-inc.com>
2017-01-23 11:20:04 -07:00
Daniel Verkamp
d63a30e39d nvme/pcie: return 1 when PCI address doesn't match
If the user asked for a specific PCI address in spdk_nvme_probe(), we
need to return 1, not 0, for the other PCI addresses that don't match
when enumerating.  0 means to attach the PCI driver, whereas 1 means to
continue enumerating.

With the previous behavior of returning 0, all NVMe devices would be
attached to the DPDK PCI driver, even if the user did not request for
them to be probed, and further calls to spdk_nvme_probe() would not find
any devices.

Change-Id: Ifbbcd7d1abe8ab535b6957855172e66a3e69fbe4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-20 17:07:30 -07:00
Cunyin Chang
4f6cc16e2f bdev/nvme: initialize the adminq_timer_poller as NULL.
Change-Id: I86ef6d42e98a39ed765de88f74b826edc8f2c904
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-20 10:08:59 -07:00
Ben Walker
765173a7ca nvmf: Make RDMA private data required.
This is not actually optional - it contains required
information for setting up the connection.

Change-Id: I21136de12794a0f4f5c14c5d3e2e3f2306c5c102
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-20 10:02:32 -07:00
Ben Walker
4ef419305e nvmf: Add function to get subsystem by id
This isn't used anywhere yet, but it will be for
NVMe-oF 1.1.

Change-Id: Ieae0688e6ad5b7a44568e5760382b5716b02e6f0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-20 10:02:32 -07:00
Ben Walker
1cbbfb86fa nvmf: Make cntlid globally unique.
The code doesn't actually use this property of cntlid
for anything yet, but we will need it later.

Change-Id: I5fd514d75b903cc8769e7b9f196a4624e9cf876c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
2017-01-20 10:02:32 -07:00
GangCao
929fb087e3 free allocated spdk_conf in case of failure
Change-Id: I1c7b1ea12e535da83fc47f449ccb6fb02a231047
Signed-off-by: GangCao <gang.cao@intel.com>
2017-01-19 15:01:13 -07:00
Daniel Verkamp
be8a9d6966 nvme: add transport ID string parsing function
Change-Id: I33c15c8a56c25667567b373d21a117cca1f756c7
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 14:05:57 -07:00
Daniel Verkamp
5de35015b9 bdev/nvme: add timer-based admin queue poller
This is necessary to process asynchronous events, as well as keep-alive
support for NVMe over Fabrics connections.

Based on a patch by Edward Yang <eyang@us.fujitsu.com>

Change-Id: I3e81f3d5061f75b12b625fa1a06629c6dc3dc61b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 10:54:33 -07:00
Daniel Verkamp
5eacff59cd ioat: add Skylake Xeon device ID
There is only a single device ID for all channels on the SKX
implementation of I/OAT.

Change-Id: I90ee79b1b673a199754f1ca4c9e38e934294e261
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 09:39:59 -07:00
Daniel Verkamp
6aabf494dc scsi: only generate sense data for Check Condition
Change-Id: Ia8bc43f045f367c12a8da818bd8496e45b8ac930
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 09:39:26 -07:00
Daniel Verkamp
a53f617423 bdev: add API to translate to and from NVMe status
This prevents the need for bdev users and modules to manipulate the
internal bdev_io error.nvme fields.

For now, all non-NVMe error types are treated as a generic device error,
but translation from SCSI to NVMe could be added in the future.

Change-Id: I4e831b26a2f41bf2f405c7576d5019bb898d4d1b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-19 09:39:26 -07:00
Ziye Yang
0a573526b6 nvme/pcie: Add the support to probe nvme by pci_addr
Currently we use the pci functions provided by DPDK,
it identifies the device by class id related
info but not by pci bdf info, so we can add the filering
by pci_addr in pcie_nvme_enum_cb function.

Change-Id: I5942e98853f00fc10fa6aae5c113517653d1b357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-18 15:30:45 -07:00
Tsuyoshi Uchida
950b48de61 log: use facilitynames to set/get log facility (#81)
* log: use facilitynames to set/get log facility

Define our own facilitynames[] instead of defining SYSLOG_NAMES
2017-01-17 11:20:34 -07:00
Jim Harris
86e8a920bf nvme: split non-compliant SGLs into multiple requests
Since nvme_ns_cmd.c now walks the SGL, some of the test code
needs to also be updated to initialize and return correct values
such as ctrlr->flags and sge_length.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I521213695def35d0897aabf57a0638a6c347632e
2017-01-17 07:51:09 -07:00
Daniel Verkamp
38c09e5eed json/parse: rewrite and simplify number parsing
Convert the number parsing function into a linear sequence with a goto
label for each state, rather than a single loop with a state variable.

This makes the code easier to read and also improves speed (better
branch prediction and smaller inner loops for the common case).

On my test system, jsoncat citylots.json > /dev/null improves from
~1.7s to ~1.2s.

This changes behavior of some number parsing test cases: inputs matching
the number grammar as defined by JSON will be returned even if there is
trailing garbage, consistent with the rest of the parser.  For example,
the input 01 will be parsed as a valid number 0 followed by trailing 1.
This only makes any difference when the full input is a single
number value, since if the value was nested in an object or array, the
trailing garbage will not match the expected syntax and the whole parse
will fail with SPDK_JSON_PARSE_INVALID (e.g. [00 will parse the first 0
as a number and then fail on the second 0, since only a comma or right
square bracket would be accepted).

Change-Id: Ifabfaed611219b3e0a06c8677190a28b87e8a13b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-13 13:18:50 -07:00
Daniel Verkamp
a509ddeb24 json/write: add an output buffer
This improves output speed significantly, especially if the write
callback is expensive (e.g. issues a syscall or takes a lock).

On my test system, jsoncat citylots.json > /dev/null improves from
~2.8s to ~1.7s.

citylots.json: https://github.com/zemirco/sf-city-lots-json (~181 MiB)

Change-Id: I7d411ce92366712ed87ad5fc6e9b64828541db4d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-13 13:18:50 -07:00
Daniel Verkamp
2138676573 bdev: defer completions from within submit_request
If a blockdev module calls spdk_bdev_io_complete() within its
submit_request function, and the user's completion callback issues a new
I/O, it is possible to cause infinite recursion, consuming all available
stack space.

To avoid this, track whether a bdev_io is being processed by
submit_request, and if io_complete() is called in this case, defer the
completion via an event.

Change-Id: I6ccdb8ed4ee0d5738e6c9840d35431de52bd5fa2
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-13 12:37:54 -07:00
Ziye Yang
d61ddd3c93 nvme/rdma: Support directly connect via trid
Preivously, we only supports probe the NVMf target
via discovery info, now we can support to directly
to connect it.

Change-Id: I08ce1d95de6744286357e68b48c97b773b902ac8
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-13 10:57:03 -07:00
Ziye Yang
e1b607d07b ioat: add missing Haswell channel 1 device ID
I do not see any reason to ignore using this channel. If that,
we should give comments in the file, otherwise we need to add it.

Change-Id: I56ad491c67a23831befc8c761ad0a02e721a15a4
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-13 10:33:49 -07:00
Daniel Verkamp
4600aaf68f bdev: simplify spdk_bdev_free_io() flow
Because of the addition of io_channel support to the bdev layer, there
is no longer a need to re-run a completed I/O through the submission
event pipeline; it can be freed directly.

Change-Id: I2b9163c87293345acf0e85f6d0c1032f30209659
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-12 14:25:35 -07:00
Daniel Verkamp
4a95a81e69 event/reactor: update last_action for timer pollers
Include timer-based pollers in the active/idle check that uses
last_action to determine when a reactor last executed an action.

Change-Id: Ib8f1253675b57aeb59206d099c6257f6d07f5acf
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-12 14:04:01 -07:00
Daniel Verkamp
d2c0feac8a event/reactor: increase spin time from 1us to 1ms
One microsecond is not really long enough to detect an idle condition
where calling the OS usleep() makes sense.  Increase the minimum time
spent spin-waiting on events and pollers from one microsecond to one
millisecond.

Change-Id: I678118e357330f133251f4cfada8ff27e10158a5
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-12 14:01:26 -07:00
Cunyin Chang
683c7d05eb iscsi: increment the correct lcore's g_num_connections in FFP transition
When a connection enters full-feature phase and is assigned to an lcore,
we need to increment the counter for the new lcore, not the connection's
existing lcore.

Change-Id: Idced4090b6e8ac35a767fd223fbd81ba824615d3
Signed-off-by: Cunyin Chang <cunyin.chang@intel.com>
2017-01-12 10:15:05 -07:00
Daniel Verkamp
249a68e92b bdev: add API to claim block devices
Claim the block devices used by iSCSI LUNs and NVMe-oF subsystems so
they can't accidentally be reused.

This will also be used by virtual block devices to allow layering of
bdevs.

Change-Id: I5384923fbf24f13f4ce720a797c5a628053d49f4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-11 16:49:39 -07:00
Ziye Yang
143692d18f iscsi: handle the corner case while partial read is failed
Change-Id: If2ba687d49bd5e282c3a5f8516760859376dc658
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-11 13:15:21 -07:00
Ziye Yang
2a3154bd87 SCSI: Fix SCSI R/W error status when lba and its range is not valid
Change-Id: Ibdf3941991d552e67b69c28eacacd5384570145a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-11 12:39:31 -07:00
Ziye Yang
90f13aa634 nvme/rdma: Support sgl for readv/writev functions
(1) Add nvme_rdma_build_sgl_request function
(2) Merge nvme_rdma_pre/post_copy_mem to nvme_rdma_copy_mem

Change-Id: I86abab821b32b4da0aa9489a6b9f7dc430333159
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-11 12:36:52 -07:00
Daniel Verkamp
a96dc2592e bdev: remove event dependency from I/O callback
Use a plain function pointer + callback context for the bdev I/O
completion callback.  This is possible now because each I/O channel will
be polled on the core that submitted the I/O.

Change-Id: I29ee8e4a3430df11c74845adab840395b9bc5010
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-09 12:09:36 -07:00
Jim Harris
292c9c42aa scsi: simplify lun task execution
An old prototype SPDK AHCI driver would return
TASK_SET_FULL if all NCQ slots were full on a given
disk.  This would kick the SCSI task back to the LUN
to be retried later.  Since then, we have pushed
responsibility onto the bdev modules themselves
to handle this kind of queueing/retry logic.

Removing this logic allows us to make some additional
changes that enable tasks to get completed inline without
an extra event callback to handle completion.  We also
no longer need to worry about checking if pending tasks
need to be executed in the complete_task() routine, since
the execute() routine will now always exhaust the pending_task
list.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If2dc3ab017e0dbc225c8f627e1f87c5a8e9b1e3e
2017-01-09 11:37:25 -07:00
Daniel Verkamp
f80c0f4fdd nvme: remove transport ctrlr_attach callback
Now that the hotplug code is isolated in nvme_pcie.c, it can call the
PCIe transport attach function directly.

Change-Id: I2df3b9168473b537cc9b13367e06d3d3b6fa22be
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-09 11:36:27 -07:00
Daniel Verkamp
c000c930b2 event: align spdk_reactor to cache line boundary
The reactor structures are allocated in a contiguous array, and each
reactor is accessed from a different core, so align the reactor
structure to avoid false sharing.

Change-Id: I95162620ccb58fae060b2d95e47a38621dfbd140
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-06 09:50:35 -07:00
Daniel Verkamp
98568f1dab event: make g_spdk_event_mempool static
It is private to lib/event/reactor.c and does not need to be exposed in
the global namespace.

Change-Id: Idfff0365a0afdd90a0567825d520adf61d99fd2b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-06 09:50:35 -07:00
Ziye Yang
ae07bdf125 scsi: make the io channel of scsi lun free correct
Previously, we did not calculate the ref for the LUN.

Change-Id: If2b7bc7d129e7efd994a7987ae2c421048969acb
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
2017-01-06 09:50:22 -07:00
Jim Harris
6d4ce17380 bdev/nvme: do not split SGE callbacks on 2MB boundaries
An SGE could be for a payload that is greater than the NVMe
devices MDTS (i.e. 128KB), but that SGE may not be aligned
on a sector-size boundary.  We can safely assume that each
iov is individually physically contiguous - the DPDK
mempools for example guarantee this.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8143ed01814c3154d0a06b8bbc548484437c1e88
2017-01-05 15:51:04 -07:00
Daniel Verkamp
df8129fb39 nvme: move num_entries to transport-specific qpairs
The spdk_nvme_qpair::num_entries value is never used in the common code,
so move it to the individual transport qpairs to make it clear that it
is a transport-specific implementation detail.

Change-Id: I5c8f0de4fcd808912ba6d248cf5cee816079fd32
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 15:49:09 -07:00
Daniel Verkamp
7ac9a4ecbb event: remove spdk_event_allocate() next parameter
The 'next' event pointer was never used in the entire code base (always
NULL).

Change-Id: I75f999d3a2e10512d86edec1a5a46ef263e2635b
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00
Daniel Verkamp
c3ede774c7 event: remove spdk_event_t typedef
Use 'struct spdk_event *' directly for consistency with the rest of the
API.

Change-Id: Ib41a9bf47f5b18f4aebf5f4dee055455cb12ef7d
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00
Daniel Verkamp
44ef085bed event: pass arg1 and arg2 directly to event fn
This allows the elimination of the spdk_event_get_arg1() and
spdk_event_get_arg2() macros, which accessed the event structure
directly; this was preventing the event structure definition from being
moved out of the public API header.

Change-Id: I74eced799ad7df61ff0b1390c63fb533e3fae8eb
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00
Daniel Verkamp
3d528833d5 event: move default opt values out of public API
The public API user is supposed to retrieve the defaults via the
spdk_app_opts_init() function.

Change-Id: Ie2bd6e809b2d47dbd5d62d396e8715f89f4052d9
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00
Daniel Verkamp
2931a3efef event: remove 'complete' parameter from poller_register
The spdk_poller_register() function provides a way to pass an event to
call once the poller is registered, but it is always NULL in the current
code base.

Change-Id: I459bf40ae4d050589577d113b7984f1563aaa9cc
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
2017-01-05 11:57:18 -07:00