Compare commits

...

69 Commits

Author SHA1 Message Date
Shuhei Matsumoto
3d1bbb273b ut/nvme_pcie: Fix a few assert conditions that used = instead of ==
The compiler warned about these mistakes.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie9772910b6a3cc9d6e45cfae1c19048179d16189
(cherry picked from commit 7641283387)
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5527
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-12-17 15:41:22 +00:00
Alexey Marchuk
43a94514af make/dpdk: Correct compiler type detection
This commit fixes compiler type detection to suppress
warnings specific to gcc 10.

Change-Id: I66264451792ff84a53001badc7c2f8a452d732af
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
(cherry picked from commit 1415e38411)
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5525
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: <dongx.yi@intel.com>
2020-12-17 15:41:22 +00:00
Alexey Marchuk
f71ccc5691 make/dpdk: Suppress GCC 10 warnings
Suppress the following warnings, which cause compilation errors:
1. gcc 10 complains about operations on zero-size arrays in rte_cryptodev.c.
Suppress this warning by adding the -Wno-stringop-overflow compilation flag.
2. gcc 10 disables -fcommon by default and complains about multiple definitions of
the aesni_mb_logtype_driver symbol, which is defined in a header file and present in several
translation units. Add the -fcommon compilation flag.

Fixes issue #1493

Change-Id: I9241bf1fd78e86df6a6eb46b4ff787b2f7027b7d
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
(cherry picked from commit 970c6d099e)
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5526
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-12-17 15:41:22 +00:00
Seth Howell
74c9fd40fd sock: keep track of removed sockets during call to poll
We have been intermittently hitting the assert where
we check sock->cb_fn != NULL in spdk_sock_group_impl_poll_count.

The only way we could be hitting this specific error is if we
were removing a socket from a sock group after receiving
an event for it.

Specifically, we are seeing this error on the NVMe-oF TCP target
which relies on posix sockets using epoll.

The man page for epoll states the following:

 If you use an event cache or store all the file descriptors
 returned from epoll_wait(2), then make sure to provide
 a  way  to  mark its closure dynamically (i.e., caused by
 a previous event's processing).  Suppose you receive 100 events
 from epoll_wait(2), and in event #47 a condition causes event
 #13 to be closed.  If you remove  the  structure  and close(2)
 the file descriptor for event #13, then your event cache might
 still say there are events waiting for that file descriptor
 causing confusion.

 One solution for this is to call, during the processing
 of  event  47,  epoll_ctl(EPOLL_CTL_DEL)  to  delete  file
 descriptor  13 and close(2), then mark its associated data
 structure as removed and link it to a cleanup list.  If
 you find another event for file descriptor 13 in your batch
 processing, you will discover the file descriptor  had
 been previously removed and there will be no confusion.

Since we do store all of the file descriptors returned from
epoll_wait, we need to implement the tracking mentioned above.

Fixes issue #1294
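
Below is a minimal sketch of the tracking pattern the man page recommends, assuming a single-threaded poller; the names (poll_batch, handle_event) are illustrative, not the actual SPDK implementation.

    #include <stdbool.h>
    #include <sys/epoll.h>
    #include <unistd.h>

    #define MAX_EVENTS 32

    int handle_event(int fd);   /* application callback (assumed) */

    static int removed_fds[MAX_EVENTS];
    static int num_removed;

    static bool fd_was_removed(int fd)
    {
        for (int i = 0; i < num_removed; i++) {
            if (removed_fds[i] == fd) {
                return true;
            }
        }
        return false;
    }

    void poll_batch(int epfd)
    {
        struct epoll_event events[MAX_EVENTS];
        int n = epoll_wait(epfd, events, MAX_EVENTS, 0);

        num_removed = 0;
        for (int i = 0; i < n; i++) {
            int fd = events[i].data.fd;

            if (fd_was_removed(fd)) {
                continue;   /* closed by an earlier event in this batch */
            }
            if (handle_event(fd) < 0) {
                /* The callback dropped the socket: delete it, close it,
                 * and remember the fd so later events in this batch are
                 * skipped instead of touching a stale socket. */
                epoll_ctl(epfd, EPOLL_CTL_DEL, fd, NULL);
                close(fd);
                removed_fds[num_removed++] = fd;
            }
        }
    }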

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib592ce19e3f0b691e3a825d02ebb42d7338e3ceb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1589
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
(cherry picked from commit e71e81b631)
2020-07-05 14:53:47 +09:00
Tomasz Zawadzki
e46860f591 version: 20.01.3 pre
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I698c84ce3fff46612a67a1418551039e7f433a9d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2699
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-06-01 18:04:29 +00:00
Tomasz Zawadzki
b2808069e3 SPDK 20.01.2
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Id838a19f76bd5e7f6c771a3a4f673e85e4a1f92b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2698
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-06-01 17:50:18 +00:00
Darek Stojaczyk
9b83caf89f dpdkbuild: add support for DPDK 20.05
EAL got a new dependency in 20.05: rte_telemetry.

Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>

(cherry picked from commit 3b99a376d5595f4d6a458e8221bfd0b6b6f07b83)
Change-Id: I43df7afe9a84e88f034a7f87fc6a299f0bbd8bac
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2705
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-06-01 17:50:18 +00:00
Tomasz Zawadzki
fb56c3214e CHANGELOG: updated with changes backported to 20.01.x
- bdev_rbd_resize was added to an incorrect section
- added info on spdk_mem_reserve

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I32f9b0ddf4d87243e3eab2b47fd35debc135d4e9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2697
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-06-01 17:50:18 +00:00
Jim Harris
f8d26843fc nvme: create netlink socket during nvme_driver_init
This helps ensure thread safety on creation of the
netlink socket, when probe is called from multiple
threads at once.  It is also a lot cleaner - we just
create it once, rather than checking every time probe
is called to see if it has to be created.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2681 (master)

(cherry picked from commit 89e47f6014)
Change-Id: I528cedc3ff44de6ea8ecaf6d2389226502ba408e
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2696
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-06-01 17:50:18 +00:00
Jim Harris
239eae6000 nvme: add mutex to nvme_driver_init
This will allow spdk_nvme_probe and variants to be
called from multiple threads in parallel.
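
A minimal sketch of the mutex-guarded one-time initialization this describes; the names are illustrative, not the actual SPDK code.

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t g_init_mtx = PTHREAD_MUTEX_INITIALIZER;
    static bool g_driver_initialized;

    int driver_init(void)
    {
        pthread_mutex_lock(&g_init_mtx);
        if (!g_driver_initialized) {
            /* One-time setup (e.g. the netlink socket from the previous
             * commit) runs exactly once, even when probe is entered by
             * several threads concurrently. */
            g_driver_initialized = true;
        }
        pthread_mutex_unlock(&g_init_mtx);
        return 0;
    }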

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2680 (master)

(cherry picked from commit 18f79f2449)
Change-Id: I534db605c9e192b943afe973981b7b503d8b7e34
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2695
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-06-01 17:50:18 +00:00
Tomasz Zawadzki
8bfa974cc2 dpdk: update submodule to include fix for vhost from CVE-2020-10722 to 10726
Updated submodule from branch containing DPDK 19.11 to DPDK 19.11.2

That includes fixes for DPDK vulnerabilities:
- CVE-2020-10722
- CVE-2020-10723
- CVE-2020-10724
- CVE-2020-10725
- CVE-2020-10726
Along with other fixes done between those DPDK maintenance releases.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I33e14eba54568a2313bb0020bad9be3fdfc6836b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2564
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-06-01 08:55:55 +00:00
zkhatami88
ed016fdbfb nvme/rdma: Using hooks in reg mr
Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1905 (master)

(cherry picked from commit fe3fab26bf)
Change-Id: I9493fe82b5b758c0092d20ef18b79d652fefed85
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2610
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-06-01 08:55:55 +00:00
zkhatami88
ce4da6c39a nvme/rdma: When RDMA hooks exist, prefer spdk_zmalloc for internal
allocations

Signed-off-by: zkhatami88 <z.khatami88@gmail.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1593 (master)

(cherry picked from commit 58a8fe2eee)
Change-Id: I7f810ee78fecca7eb8a4387f6d63e1a952966e57
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2609
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 09:08:44 +00:00
Seth Howell
ae161cdec6 nvme/rdma: make sure we free resources in error path.
Not sure how we missed this.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1122 (master)

(cherry picked from commit 2248e52150)
Change-Id: If920cb3a7708c33032e1da28c564d4c28ddafdf4
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2608
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:08:44 +00:00
Tomasz Zawadzki
7f9ea53d35 lib/nvme: assign NULL to external_io_msgs ring after free
Multiple nvme_io_msg producers on the ctrlr share the same ring.
After freeing it, it should be set to NULL in order to prevent
either nvme_io_msg_ctrlr_detach() or spdk_nvme_io_msg_process()
from operating on freed memory.

The above happened when resolving issues in later patches.
After their respective fixes, there is no scenario that
reproduces this failure on its own, so no tests were added in this
patch.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1917 (master)

(cherry picked from commit 251a551aa3)
Change-Id: I72b695d995b63bd002cc03e60cd4bdc82cfbe8ae
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2162
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Tomasz Zawadzki
6fe32e3e17 lib/nvme: free io buffer for nvme_io_msg
This buffer was not released after failure to enqueue.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1916 (master)

(cherry picked from commit f955c75ef4)
Change-Id: If84317c67626a3193851c90be056b8550a5fccee
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2161
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Tomasz Zawadzki
b8446bb66d nvme: do not allow the same nvme_io_msg_producer to register twice
Prior to this change it was possible to register the
same nvme_io_msg_producer twice. This kind of functionality does
not make sense in the current scope, as each message to/from an
io_msg_producer has no identifier other than this pointer.

In the case of nvme_cuse, this allowed creation of multiple /dev/spdk/nvme*
devices and caused an infinite loop when detaching an nvme controller.

This patch disallows that and adds a test for nvme_cuse.
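
A sketch of the duplicate-registration guard, using simplified stand-in structures rather than the SPDK internals.

    #include <errno.h>
    #include <sys/queue.h>

    struct producer {
        STAILQ_ENTRY(producer) link;
    };

    STAILQ_HEAD(producer_list, producer);

    int producer_register(struct producer_list *list, struct producer *p)
    {
        struct producer *it;

        STAILQ_FOREACH(it, list, link) {
            if (it == p) {
                /* The pointer is the only identifier a producer has,
                 * so a second registration of the same one is refused. */
                return -EEXIST;
            }
        }
        STAILQ_INSERT_TAIL(list, p, link);
        return 0;
    }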

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1938 (master)

(cherry picked from commit 7fbdeacc9e)
Change-Id: I5f56548d1bce878417323c12909d6970416d2020
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2160
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Tomasz Zawadzki
3f5f09db46 lib/cuse: provide proper error codes up to RPC
This patch adjusts several return codes to provide
more than just -1.

It also fixes the JSON RPC error print,
where a negative error code was passed to spdk_strerror(),
resulting in an unknown error being reported.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1915 (master)

(cherry picked from commit ef6ffb39d6)
Change-Id: I254f6d716d0ce587f88cc658163ba049378f3b2f
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2159
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Ben Walker
4faf9fc37b nvme: Make spdk_nvme_cuse_register thread safe
There is no indication right now that this function couldn't be called
by multiple threads on different controllers. However, internally it is
using two globals that can become corrupted if the user were to do this.
Put a lock around them so it is safe.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1903 (master)

(cherry picked from commit 5340d17823)
Change-Id: I59361f510eb1659c2346f1fd33c375add1dc9c81
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2158
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Tomasz Zawadzki
cbb1d099ff cuse: fix nvme_cuse unregister segfault
Unregistering nvme_cuse when the device did not exist
resulted in a segfault within nvme_io_msg_ctrlr_unregister().

To prevent that, when no nvme_cuse is registered for the
ctrlr, do not unregister nvme_io_msg_producer.

RPC and spdk_nvme_cuse_unregister() now return an error.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1921 (master)

For backporting to 20.01.x, API breaking changes were removed.
Only part that could cause the segfault remained.

(cherry picked from commit d9a11fd5b1)
Change-Id: Id77cebe23ff91023a24cfe091f5f62a76a9175fd
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2156
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Tomasz Zawadzki
d42b332ae6 cuse: refactor retrieving cuse_device to separate function
This patch adds nvme_cuse_get_cuse_ctrlr_device() and
nvme_cuse_get_cuse_ns_device(), which return the
struct cuse_device of a given nvme controller or namespace.

Similar iteration was used in two places, so both were
replaced accordingly.
The next patch will add a third.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1918 (master)

(cherry picked from commit 15a5018067)
Change-Id: I25ada843a59c632fe330263a65456d25c5ccf4cc
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2155
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 09:07:55 +00:00
Alexey Marchuk
930d91f479 nvme: Abort queued reqs when destroying qpair
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1791 (master)

(cherry picked from commit 4279766935)
Change-Id: Idef1b88cf47cf9f82b1f4499ef836dfa741c0c7f
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2606
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 08:29:33 +00:00
Alexey Marchuk
0acac18cfa nvme/rdma: Clean pointer to nvme_request
This is done to make sure that the scenario described in GitHub
issue #1292 won't happen.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1771 (master)

(cherry picked from commit f11989385e)
Change-Id: Ie2ad001da701e25ef984ae57da850fb84d51b734
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2641
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 08:29:33 +00:00
Alexey Marchuk
2381516ecc nvme/rdma: Wait for completions of both RDMA RECV and SEND
In some situations we may get a completion of RDMA_RECV before the
completion of RDMA_SEND, which can lead to the bug described in #1292.
To avoid such situations we must complete an nvme_request only when
we have received both RDMA_RECV and RDMA_SEND completions.
Add a new field to spdk_nvme_rdma_req to store the response idx -
it is used to complete the nvme request when RDMA_RECV completes
before RDMA_SEND.
Repost RDMA_RECV when both RDMA_SEND and RDMA_RECV are completed.
Side changes: change the type of spdk_nvme_rdma_req::id to uint16_t,
and repack struct nvme_rdma_qpair.

Fixes #1292
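
A sketch of the completion-tracking idea with illustrative field names; the actual change stores the response idx on spdk_nvme_rdma_req.

    #include <stdbool.h>
    #include <stdint.h>

    struct rdma_req {
        uint16_t id;         /* narrowed to uint16_t by this commit */
        uint16_t rsp_idx;    /* response idx saved on an early RECV */
        bool send_done;
        bool recv_done;
    };

    static void try_complete(struct rdma_req *req)
    {
        if (req->send_done && req->recv_done) {
            /* ... complete the nvme_request and repost the RDMA_RECV ... */
        }
    }

    void on_send_completion(struct rdma_req *req)
    {
        req->send_done = true;
        try_complete(req);
    }

    void on_recv_completion(struct rdma_req *req, uint16_t rsp_idx)
    {
        req->rsp_idx = rsp_idx;   /* RECV may arrive before SEND */
        req->recv_done = true;
        try_complete(req);
    }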

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1770 (master)

(cherry picked from commit 581e1bb576)
Change-Id: Ie51fbbba425acf37c306c5af031479bc9de08955
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2640
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 08:29:33 +00:00
Tomasz Zawadzki
90501268d6 lib/blob: merge EP of a clone when deleting a snapshot
In general it is not possible to delete a snapshot when
there are clones on top of it.
There is a special case when there is just a single clone
on top of that snapshot.

In such a case the clone is 'merged' with the snapshot.
Unallocated clusters in the clone are filled with the ones
in the snapshot (if allocated there).

Similar behavior should have occurred for extent pages.

This patch adds the implementation for moving EPs from
snapshot to clone, along with UT.

The UT exposes the issue by allowing delete_blob
to proceed beyond just the unrecoverable snapshot blob.

Fixes #1291

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1163 (master)

Removed changes in UT, since they require a couple of UT refactoring
changes before them.

(cherry picked from commit 0f5157377f)
Change-Id: Ib2824c5737021f8e8d9b533a4cd245c12e6fe9fa
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2599
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 08:27:18 +00:00
Liang Yan
ae0db495fb bdev/rbd: increase the segment in flush operation
Signed-off-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2490 (master)

(cherry picked from commit f2ede6b486)
Change-Id: Ibde0f924c1b78c9a8f0f440e944c7eb81631ed1b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2597
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
2020-05-29 08:27:09 +00:00
Michael Haeuptle
9bcc0ea8e8 ENV_DPDK/VFIO: Increase PCI tear down timeout
When removing a large number of devices (>8) in parallel,
the 20ms timeout is not long enough.

As part of spdk_detach_cb, DPDK calls into the VFIO driver,
which may get delayed due to multiple hot removes being
processed by the pciehp driver (the pciehp IRQ thread function
handles the actual removal of a device in parallel, but
all of the IRQ thread functions compete for a global mutex,
increasing processing time and race conditions).

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1588 (master)

(cherry picked from commit 55df83ceb6)
Change-Id: I470fbbee92dac9677082c873781efe41e2941cd5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2598
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
2020-05-29 08:27:00 +00:00
Ben Walker
fab97f2aac Revert "env: Use rte_malloc in spdk_mem_register code path when possible"
This reverts commit 6d6052ac96.

This approach is no longer necessary given the patch immediately
preceding this one.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2512 (master)

(cherry picked from commit 76aed8e4ff)
Change-Id: I5aab14346fa5a14dbf33c94ffcf88b045cdb4999
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2601
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-29 08:26:42 +00:00
Ben Walker
d635d6d297 env: Add spdk_mem_reserve
The spdk_mem_reserve() function reserves a memory region in SPDK's
memory maps. This pre-allocates all of the required data structures
to hold memory address translations for that region without actually
populating the region.

After a region is reserved, calls to spdk_mem_register() for
addresses in that range will not require any internal memory
allocations. This is useful when overlaying a custom memory allocator
on top of SPDK's hugepage memory, such as tcmalloc.
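
A hedged usage sketch of the reserve-then-register flow described above; the wrapper function is hypothetical.

    #include "spdk/env.h"

    int overlay_allocator_init(void *base, size_t len)
    {
        /* Pre-allocate translation structures for [base, base + len)
         * without populating the region. */
        int rc = spdk_mem_reserve(base, len);
        if (rc != 0) {
            return rc;
        }

        /* Subsequent registrations inside the reserved range no longer
         * need internal memory-map allocations. */
        return spdk_mem_register(base, len);
    }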

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2511 (master)

This backport requires increasing the SO_MINOR version, since it adds
new API. Version 2.1 does not conflict with any other, since on master
the SO_VER was increased from 2 to 3, see:
(229ef16b) lib/env_dpdk: add map file and rev so major version.

(cherry picked from commit cf450c0d7c)
Change-Id: Ia4e8a770e8b5c956814aa90e9119013356dfab46
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2600
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 08:26:42 +00:00
Tomasz Zawadzki
e8d8cef0fd make: allow individual SO version for each library
Based on patch:
(19392783) make: rev SO versions individually for libraries.

It allows each library to update its own version separately
from SO_SUFFIX_ALL==2.0.

This will allow increasing the SO_MINOR version when needed.

Change-Id: Ic381a848e5f0e5af4b7f68725eb45138e00ca65b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2593
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-05-29 08:26:42 +00:00
Tomasz Zawadzki
062da7a08a nvme/pcie: reduce physically contiguous memory for CQ/SQ
The following patch made sure that CQ/SQ are allocated in a
physically contiguous manner:
(64db67) nvme/pcie: make sure sq and cq are physically contiguous

Using MAX_IO_QUEUE_ENTRIES is enough to make sure that neither
queue spans multiple hugepages.

Yet the patch also made sure that a whole page is occupied only
by the queue, which unnecessarily increases memory consumption
by up to two hugepages per qpair.

This patch changes it so that each queue's alignment is limited
to its size.

Changes in hugepages consumed when allocating an io_qpair in the
hello_world application:
io_queue_size		Without patch	With patch
256			8MiB		0MiB
1024			12MiB		4MiB
4096			24MiB		16MiB
Note: 0MiB means no new hugepages were required and the qpair fits into
previously allocated hugepages (see all steps before io_qpair
allocation in hello_world).

An interesting result of this patch is that since we required alignment
up to the hugepage size, this resulted in reserving two 2MiB
hugepages to account for the DPDK internal malloc trailing element.
See alloc_sz in try_expand_heap_primary() within malloc_heap.c.

This patch not only reduces the overall memory reserved for the
queues, but also reduces the increase in heap consumption on the DPDK side.
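
A sketch of the allocation change, assuming num_entries keeps the size a power of two (spdk_zmalloc requires a power-of-two alignment); the helper name is hypothetical.

    #include "spdk/env.h"
    #include "spdk/nvme_spec.h"

    void *alloc_cq(uint32_t num_entries)
    {
        size_t sz = num_entries * sizeof(struct spdk_nvme_cpl);

        /* Aligning to the queue's own size keeps it physically
         * contiguous without reserving whole extra hugepages. */
        return spdk_zmalloc(sz, sz, NULL, SPDK_ENV_SOCKET_ID_ANY,
                            SPDK_MALLOC_DMA | SPDK_MALLOC_SHARE);
    }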

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2244 (master)

(cherry picked from commit d3cf561199)
Change-Id: I75bf86e93674b4822d8204df3fb99458dec61e9c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2510
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-05-28 08:55:44 +00:00
GangCao
a7f7b1955e bdev/rbd: add ceph rbd resize function
This backports the change below to the SPDK v20.01.2 LTS release:
6a29c6a906

Change-Id: I9b7ed97f2a376af71578ccb5556231832863b255
Signed-off-by: Liang Yan <liang.z.yan@intel.com>
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2262
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-05-27 13:36:19 +00:00
GangCao
d4d3e76aed vhost: Fix the issue of virtual machine device parameter max_segments always equal to 1
Solve the problem that /sys/block/vd../max_segments is always 1 in the virtual
machine, and avoid the low sequential read and write performance caused
by this limitation in the generic block device layer of some older kernels.

Backport this fix (9c6d4649eb)
to the SPDK LTS 20.01.2 release.

Change-Id: I30f6201bbfbb7885379b1b0ae19b64a1673e487f
Signed-off-by: suhua <suhua1@kingsoft.com>
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2261
Tested-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
2020-05-25 15:42:26 +00:00
Tomasz Zawadzki
09377fc41f version: 20.01.2 pre
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If458a3c6571f9a9beaf6ba5202d0fd29e623dc1f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1376
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-20 19:03:49 +00:00
Tomasz Zawadzki
b90630a465 SPDK 20.01.1
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I519f07a157f361141d3c2d9f4cf49af646af0901
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1375
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-20 19:03:49 +00:00
yidong0635
1ffa3c3f08 lib/nvme: Fix scanbuild issue about uninitialized value.
Issue:
nvme.c:766:2: warning: 4th function call argument is an uninitialized value
        snprintf(trid->trstring, SPDK_NVMF_TRSTRING_MAX_LEN, "%s", trstring);

Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1314 (master)

(cherry picked from commit 4a1ec34d3b)
Change-Id: I4b0ae106ef8e4e72e80ec96d10010fddf8173144
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1371
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-20 19:03:49 +00:00
Shuhei Matsumoto
136c0771ad lib/iscsi: Return when connection state is already exited at login completion
The iSCSI target got a segmentation fault if the connection was being exited
between the execution of spdk_iscsi_conn_write_pdu() and its callback
iscsi_conn_login_pdu_success_complete().

This was caused by the recent asynchronous socket write feature.

Fixes issue #1278.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1275 (master)

(cherry picked from commit 628dc9c162)
Change-Id: Idffd90cd6ee8e6cb4298fe3f1363d8d5c5a3c49d
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1355
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
paul luse
7976cae3b4 module/crypto: increase the number of queue pairs for AESNI_MB
The default was 8, which meant a max of 8 bdevs.  Bump it up to 64.

Fixes issue #1232

Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1063 (master)

(cherry picked from commit 302f7aa6e4)
Change-Id: I966e90de5c27910df0e4da0d1062d9d1665f8de6
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1306
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-03-19 08:09:53 +00:00
Jim Harris
01a942dc6b bdev/nvme: do not destruct ctrlr if reset is in progress
The adminq poller could get a failure if the ctrlr has
already been hot removed, which starts a reset.

But while the for_each_channel is running for the reset,
the hotplug poller could run and start the destruct
process.  If the ctrlr is deleted before the for_each_channel
completes, we will try to call spdk_nvme_ctrlr_reset() on
a deleted controller.

While here, also add a check to skip the reset if the
controller is already in the process of being removed.

Fixes #1273.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1253 (master)

(cherry picked from commit ba7b55de87)
Change-Id: I20286814d904b8d5a9c5209bbb53663683a4e6b0
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1305
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-03-19 08:09:53 +00:00
Jim Harris
b030befb1d bdev/nvme: use mutex to protect 'resetting' member
This isn't in the performance path, so using the mutex
here makes it a bit more consistent with other ctrlr
members such as 'destruct'.

This prepares for a future patch which will defer
ctrlr destruction on removal if a reset is in progress.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1252 (master)

(cherry picked from commit 2571cbd807)
Change-Id: Ica019cd90dc3b46ef6a13dd311054dbdc95855aa
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1304
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Jim Harris
77a53c2c00 dpdk: move submodule to commit 3fcb1dd
This adds recent commit:
  contigmem: cleanup properly when load fails

Fixes issue #1262.

Signed-off-by: Jim Harris <james.r.harris@intel.com>

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1217 (master)

(cherry picked from commit 328a221299)
Change-Id: I4d873af280803c3cc6c146439a0bbc7af4c7296c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1303
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-03-19 08:09:53 +00:00
Vitaliy Mysak
cfc2cd611c env_dpdk: don't treat NULL as error in spdk_map_bar_rte()
We use `spdk_map_bar_rte()` to read mapped addresses
from PCI BARs.
This function currently checks each mapped address for NULL.
But in PCI memory, some registers can be left unused,
in which case they are set to 0.
As a result, we may read some NULL pointers from BARs,
which is OK.
To check whether a given address is indeed invalid, we should
first check whether it is used.
So it is best to delegate such checks to the
user of this function.
In fact, users already do the NULL check where it is needed
(e.g., virtio_pci.c:390, nvme_pcie.c:589),
so this patch just removes the checks from `spdk_map_bar_rte()`.

This solves GitHub issue #1206.
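
A sketch of the caller-side check that remains after this patch; the wrapper function is hypothetical.

    #include "spdk/env.h"

    int setup_bar0(struct spdk_pci_device *dev)
    {
        void *addr;
        uint64_t phys, size;

        if (spdk_pci_device_map_bar(dev, 0, &addr, &phys, &size) != 0) {
            return -1;   /* the map call itself failed */
        }
        if (addr == NULL) {
            return -1;   /* unused BAR: this caller treats it as invalid */
        }
        /* ... use the mapped registers ... */
        return 0;
    }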

Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1129 (master)

(cherry picked from commit d4653a31e0)
Change-Id: I88021ceca1b9e9d503b224f790819999cd16da01
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1302
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-03-19 08:09:53 +00:00
yidong0635
4e4bb7f822 rdma: Fix segmentation fault when there is insufficient memory for the RDMA queue
Fix a segmentation fault on the target side.
Issue:
rdma.c:2752:spdk_nvmf_rdma_listen: *NOTICE*: *** NVMe/RDMA Target Listening on 192.168.35.11 port 4420 ***
rdma.c: 789:nvmf_rdma_resources_create: *ERROR*: Unable to allocate sufficient memory for RDMA queue.
rdma.c:3385:spdk_nvmf_rdma_poll_group_create: *ERROR*: Unable to allocate resources for shared receive queue.
Segmentation fault (core dumped)

GDB:
Program terminated with signal 11, Segmentation fault.
736             if (resources->cmds_mr) {
(gdb) bt
736             if (resources->cmds_mr) {
(gdb) bt
0  nvmf_rdma_resources_destroy (resources=0x0) at rdma.c:736
1  0x0000000000497516 in spdk_nvmf_rdma_poll_group_destroy (group=group@entry=0x2fe1300) at rdma.c:3489
2  0x00000000004978bb in spdk_nvmf_rdma_poll_group_create (transport=0x2fe11d0) at rdma.c:3371
3  0x000000000048df70 in spdk_nvmf_transport_poll_group_create (transport=0x2fe11d0) at transport.c:267
4  0x000000000048a450 in spdk_nvmf_poll_group_add_transport (group=0x2f49af0, transport=<optimized out>) at nvmf.c:941
5  0x000000000048a6cb in spdk_nvmf_tgt_create_poll_group (io_device=0x2fce600, ctx_buf=0x2f49af0) at nvmf.c:122
6  0x00000000004a0492 in spdk_get_io_channel (io_device=0x2fce600) at thread.c:1324
7  0x000000000048a0e9 in spdk_nvmf_poll_group_create (tgt=<optimized out>) at nvmf.c:723
8  0x000000000047f230 in nvmf_tgt_create_poll_group (ctx=<optimized out>) at nvmf_tgt.c:356
9  0x000000000049f92b in spdk_on_thread (ctx=0x2f81b20) at thread.c:1065
10 0x000000000049f17d in _spdk_msg_queue_run_batch (max_msgs=<optimized out>, thread=0x1e67e90) at thread.c:554
11 spdk_thread_poll (thread=thread@entry=0x1e67e90, max_msgs=max_msgs@entry=0, now=now@entry=947267017376702) at thread.c:623
12 0x000000000049af86 in _spdk_reactor_run (arg=0x1e678c0) at reactor.c:342
13 0x000000000049b3a9 in spdk_reactors_start () at reactor.c:448
14 0x0000000000499a00 in spdk_app_start (opts=opts@entry=0x7ffc2a5e0ce0, start_fn=start_fn@entry=0x40aa80 <nvmf_tgt_started>,
						arg1=arg1@entry=0x0) at app.c:690
15 0x0000000000408237 in main (argc=5, argv=0x7ffc2a5e0e98) at nvmf_main.c:75
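
A sketch of the defensive fix the trace points at, with simplified names: the destroy path must tolerate a resources pointer that was never allocated.

    #include <stddef.h>

    struct rdma_resources;   /* opaque stand-in */

    static void rdma_resources_destroy(struct rdma_resources *resources)
    {
        if (resources == NULL) {
            return;   /* allocation failed earlier; nothing to free */
        }
        /* ... free cmds_mr and the rest of the resources ... */
    }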

Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1073 (master)

(cherry picked from commit 9d93c08234)
Change-Id: Id9bf081964d0cf3575757e80fc7582b80776d554
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1301
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-03-19 08:09:53 +00:00
Seth Howell
a763a7263a mk: bump the shared object major version to 2.
This is to indicate the ABI breakage in the bdev library. A function's
argument list was changed, which breaks both backwards and forwards
compatibility.

Going forward, all backwards compatibility breaking changes should be
marked with a rev of the SO major version for that library. All forwards
compatibility breaking changes should be marked with a rev of the SO
minor version.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1066 (master)

(cherry picked from commit c5911f0224)
Change-Id: I35e45c102c5c6de3c684919a10e5116f8f2c375f
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1300
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
2020-03-19 08:09:53 +00:00
Changpeng Liu
52c7d46a3c nvme: set transport string before the probe based on transport type
Users may only set the transport type, but for the actual probe
process the trstring field is mandatory, so set the trstring
based on the transport type first.  Also remove the unnecessary
spdk_nvme_trid_populate_transport() call from each transport
module.
Fix #1228.
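
A sketch of the central fixup using the public spdk_nvme_trid_populate_transport() helper named above; the wrapper is hypothetical.

    #include "spdk/nvme.h"

    void prepare_trid(struct spdk_nvme_transport_id *trid)
    {
        if (trid->trstring[0] == '\0') {
            /* Fill trstring ("PCIE", "RDMA", "TCP", ...) from trtype
             * before the probe, so transports can match on it. */
            spdk_nvme_trid_populate_transport(trid, trid->trtype);
        }
    }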

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1001 (master)

(cherry picked from commit 8d6f48fbf8)
Change-Id: I2378065945cf725df4b1997293a737c101969e69
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1299
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Alexey Marchuk
776d45b0e3 nvme: Fix potential use of non-initialized variable
The trstring variable in spdk_nvme_trid_populate_transport is not
initialized, which can lead to snprintf() writing garbage to
trid->trstring if the user passes the SPDK_NVME_TRANSPORT_CUSTOM trtype.
Add a return statement and an assert to the CUSTOM/default switch cases.

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483469 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 3424def90a)
Change-Id: I6c6c37f9aa74d61b346f7be27fb890c7a34e9229
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1318
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Changpeng Liu
83edd2f716 nvme: detach the controller in STUB and flush the admin active requests at last
In the autotest, when calling the kill_stub() function, there is an error log
like this: "Device 0000:83:00.0 is still attached at shutdown!", so it's
better to detach the controller when exiting the stub process.

But after calling spdk_nvme_detach() in the stub process, there is another issue:
1. The NVMe stub runs as the primary process, and it will send 4 AERs.
2. The NVMe reset tool runs as the secondary process.

When doing an NVMe reset from the secondary process, it will abort all the
outstanding requests, so the 4 AERs from the primary process
will be added to the active_proc->active_reqs list.

When calling spdk_nvme_detach() to detach a controller, there is an
assertion in nvme_ctrlr_free_processes() at the end to check the
active requests list of this active process data structure.

We can add a check before destructing the controller to poll the
completion queue, so that the active requests list can be flushed.
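
A sketch of the drain-then-detach sequence described above, using public API calls; the function name is hypothetical.

    #include "spdk/nvme.h"

    void stub_shutdown(struct spdk_nvme_ctrlr *ctrlr)
    {
        /* Poll the admin queue until nothing is pending, so aborted
         * AERs leave the active requests list before the assertion
         * in the detach path can see them. */
        while (spdk_nvme_ctrlr_process_admin_completions(ctrlr) > 0) {
        }
        spdk_nvme_detach(ctrlr);
    }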

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/977 (master)

(cherry picked from commit bad2c8e86c)
Change-Id: I0c473e935333a28d16f4c9fb443341fc47c5c24f
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1298
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Jacek Kalwas
374d2a2f64 nvme: fix command specific status code
The given enum was not aligned with the spec. This status can be reported when
size equals 0.

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/928 (master)

(cherry picked from commit a7a0d02d8b)
Change-Id: If51f6b051c13880c1fd4e6bb0a02f134b28b5a88
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1297
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
2020-03-19 08:09:53 +00:00
Alexey Marchuk
1a4dec353a nvmf/rpc: Destroy subsystem if spdk_rpc_nvmf_create_subsystem fails
Destroy the subsystem if spdk_nvmf_subsystem_set_sn or spdk_nvmf_subsystem_set_mn
fails. Check the status in the spdk_rpc_nvmf_subsystem_started callback;
on failure, destroy the subsystem and report an error.

Fixes #1192

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/832 (master)

(cherry picked from commit c29247e1fe)
Change-Id: Id6bdfe4705b5f4677118f94e04652c2457a3fdcc
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1296
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Alexey Marchuk
c55906a30d rdma: Correct handling of RDMA_CM_EVENT_DEVICE_REMOVAL
This event can occur for either a qpair or a listening device. The
current implementation assumes that every event refers to a qpair,
which is wrong. Fix: check whether the event refers to a device; if so,
disconnect all qpairs associated with the device and stop all
listeners.

Update spdk_nvmf_process_cm_event - break iteration if
rdma_get_cm_event returns a nonzero value, to reduce the
indentation depth.

Fixes #1184
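
A sketch of the dispatch check; how a listener id is recognized here (a NULL context) is an assumption for illustration, not necessarily what the patch does.

    #include <stddef.h>
    #include <rdma/rdma_cma.h>

    void on_cm_event(struct rdma_cm_event *ev)
    {
        if (ev->event == RDMA_CM_EVENT_DEVICE_REMOVAL &&
            ev->id->context == NULL) {
            /* Assumed listener id: stop the listener and disconnect
             * every qpair associated with this device. */
            return;
        }
        /* ... per-qpair handling ... */
    }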

Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/574 (master)

(cherry picked from commit 804b066929)
Change-Id: I8c4244d030109ab33223057513674af69dcf2be2
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1295
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Kulasek
400316351f test/vpp: fix error handling in vppctl non-interactive mode
On Fedora 30 we have noticed VPP 19.04 related issues:

  1) Error values returned by vppctl in non-interactive mode
     are not relevant to the success/failure of the command.
     Vppctl ALWAYS returns 0, so the "-e" bash option is unable
     to detect any errors.
  2) We have intermittent pipefail errors (error 141) returned
     by vppctl on disconnect from vpp, even though commands are
     executed successfully.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1214 (master)

(cherry picked from commit 7b7e97604b)
Change-Id: Ie22ea24f7e81017089b899111724d338eeb81113
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1287
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Kulasek
1dfd2cf594 sock/vpp: fix compilation with gcc9
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1084 (master)

(cherry picked from commit b61e2479f5)
Change-Id: Ia48a59807047ea2ab5103638fb49bfea9446f854
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1285
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Zawadzki
3fcc5a9ec1 lib/blob: queue up blob persists when one already is ongoing
It is possible for multiple blob persists to affect one another,
either via blob->state changes or the blob mutable data.
A safe way to prevent that is to queue up the persists.

The next persist will be executed only after the previous one completes.

Fixes #1170
Fixes #960
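
A sketch of the queuing scheme, with simplified stand-in structures for the blobstore internals.

    #include <stdbool.h>
    #include <sys/queue.h>

    struct persist_ctx {
        TAILQ_ENTRY(persist_ctx) link;
    };

    TAILQ_HEAD(persist_q, persist_ctx);

    struct blob_state {
        struct persist_q pending;
        bool persist_in_progress;
    };

    void persist_start(struct blob_state *b, struct persist_ctx *ctx)
    {
        TAILQ_INSERT_TAIL(&b->pending, ctx, link);
        if (b->persist_in_progress) {
            return;   /* runs after the current persist completes */
        }
        b->persist_in_progress = true;
        /* ... kick off this persist ... */
    }

    void persist_complete(struct blob_state *b, struct persist_ctx *done)
    {
        TAILQ_REMOVE(&b->pending, done, link);
        if (TAILQ_EMPTY(&b->pending)) {
            b->persist_in_progress = false;
            return;
        }
        /* ... start the next queued persist ... */
    }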

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/776 (master)

(cherry picked from commit 030be573f3)
Change-Id: Iaf95d9238510100b629050bc0d5c2c96c982a60c
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1308
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Zawadzki
f7730adbf0 lib/blob: move starting persist to separate function
The _spdk_blob_persist_check_dirty() function will be
called in a subsequent patch at the end of persist,
in _spdk_blob_persist_complete(), to proceed
with any queued-up persists.
Please see the following patch for this.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/872 (master)

(cherry picked from commit dd80edb2b4)
Change-Id: Ieeb334e23cde329743647f728e70dd60333c224a
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1307
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-03-19 08:09:53 +00:00
Seth Howell
88b5d6d183 test/nvmf: add verify_backlog to fio SGL tests.
On newer versions of FIO, there is an issue with heavy verify workloads
where one of the headers (rand_seed) gets incorrectly generated by fio
during verify. This can be circumvented by using the verify_backlog
flag.

This is needed because it will enable testing this workload on the tcp
transport using fio in the SPDK test pool.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/988 (master)

(cherry picked from commit 2b43f6353f)
Change-Id: I028be3fdb72a76733b4226a37b6332cd45d0f774
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1294
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Kulasek
fad91fb911 test/nvme: fix controller names in nvme-cli cuse test
Fixes issue #1223

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/981 (master)

(cherry picked from commit 03842fd950)
Change-Id: I16bf739d9be54249600e135a07fdeb554c77f4cf
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1324
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Karol Latecki
afc6fb5e1a autorun_post: skip confirming executed tests
Allow skipping confirmPerPatchTests if needed.

Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483016 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit ac26fec9c6)
Change-Id: I8741d80de5cac9954e3429b951a71dc065c40bb5
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1317
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-03-19 08:09:53 +00:00
Darek Stojaczyk
02dda9731a test/rocksdb: fix db_bench build with gcc9
GCC9 complains:
./db/version_edit.h:134:71: error: implicitly-declared "constexpr
rocksdb::FileDescriptor::FileDescriptor(const
rocksdb::FileDescriptor&)" is deprecated [-Werror=deprecated-copy]

From what I see this can be fixed by explicitly
defining some constructors and assignment operators,
even setting them to `= default;`. I didn't dig into
this further, so just ignore the warning for now.

Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1082 (master)

(cherry picked from commit a5bcbbefcb)
Change-Id: Ia0ee0cc5fc1dce36f7098959d383b08855a825df
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/1286
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-03-19 08:09:53 +00:00
Tomasz Zawadzki
cc02904e82 version: 20.01.1 pre
Change-Id: I703ff74a236b0a3c6254f332e57995311b2b082b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483389
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Tomasz Zawadzki
5ffffe9d96 SPDK 20.01
Change-Id: I5ad326fcd246e3f2cf8231b53105a5fe70edc9c7
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483388
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2020-01-31 09:42:34 +00:00
Ziye Yang
6edcf515d6 sock/posix: Change the return type of function _sock_check_zcopy
Purpose: The function spdk_sock_request_put may
return an error code and close the socket, so we should change the
return type of _sock_check_zcopy.

If the return value of _sock_check_zcopy is not zero,
we should not handle the EPOLLIN event.

Fixes #1169

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483311 (master)

(cherry picked from commit 9587017902)
Change-Id: Ie6fbd7ebff54749da8fa48836cc631eea09c4ab8
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483411
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Tomasz Zawadzki
a0f982bb0a lib/blob: add invalid flag for extent table
With the recent changes to the extent on-disk metadata format,
the new format (Extent Pages) is not backwards compatible.
Meanwhile the old format (Extent RLE) is backwards
compatible with older SPDK applications.

Summing up:
A blobstore created pre SPDK 20.01 can only use Extent RLE.
A blobstore created starting with SPDK 20.01 can use both
Extent Pages and Extent RLE, specified by the use_extent_table opts.

When use_extent_table is set to true, the invalid flag for it is set.
An SPDK application pre 20.01 will not load such a blob.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483220 (master)

(cherry picked from commit dc24539c40)
Change-Id: If14ebd03f19eb581d71dcb46191e099336655189
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483395
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Seth Howell
a1e730b460 test/nvmf: disable bdevperf tests on soft-roce
Github issue 1165 details some issues we have with soft-roce and these
tests. Right now we are disabling them for build stability.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483300 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit e0f63b969d)
Change-Id: I3a9e28ff3cc1c6ac7d9aa91d93541e295514bb7b
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483407
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Tomasz Zawadzki
2ad2b27149 lib/blob: document use_extent_table
This patch adds documentation and a CHANGELOG update
for the newly added Extent Table/Page path.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483247 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 353252b1b4)
Change-Id: I86f6c5680084a92d50bd9ca39b68d68a9908ecf8
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483381
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Ben Walker
7c06ec7247 sock/posix: Block recursive calls to spdk_sock_flush
Don't allow calling spdk_sock_flush while the socket is
closed.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483148 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit d0f4a51fdc)
Change-Id: I9020a49ab8906b0f343e3f48f8b96bd38308ab17
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483380
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-31 09:42:34 +00:00
Seth Howell
06e7d22c06 CHANGELOG: Alphabetize the 20.01 changelog sections
We should probably be consistent about this going forward.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482911 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 967fa2d707)
Change-Id: I6893ac991a0e506edad737db72986d82d6f1734e
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483258
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-30 16:57:01 +00:00
Seth Howell
54714eae1a CHANGELOG: update changelog for the 20.01 release.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482448 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit 64021521f7)
Change-Id: Ie1760d1d65d8f8266c80327c853720f4299594ce
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483257
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-30 16:57:01 +00:00
Seth Howell
cc9c0e6922 env_dpdk: keep a memmap refcount of physical addresses
This allows us to avoid trying to map the same physical address to the
IOMMU in physical mode while still making sure that we don't
accidentally unmap that physical address before we are done referencing
it.
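A toy sketch of the refcounting idea (the real spdk_mem_map-based code appears in the memory.c hunk later in this diff):

~~~
#include <assert.h>
#include <stdint.h>

#define NUM_PAGES 1024
static uint64_t g_phys_ref[NUM_PAGES]; /* refcount per physical 2 MB page */

static void
ref_page(uint64_t page)
{
	assert(page < NUM_PAGES);
	if (g_phys_ref[page]++ == 0) {
		/* first reference: program the IOMMU mapping here */
	}
}

static void
unref_page(uint64_t page)
{
	assert(page < NUM_PAGES && g_phys_ref[page] > 0);
	if (--g_phys_ref[page] == 0) {
		/* last reference: tear the IOMMU mapping down here */
	}
}
~~~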

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483133 (master)
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>

(cherry picked from commit f4a63bb8b3)
Change-Id: I947408411538b921bdc5a89ce8d5e40fd826e971
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483256
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-30 16:57:01 +00:00
Seth Howell
9cfd844f5f lib/nvmf: properly validate fuse command fields.
The fuse command value is a two-bit value, but we were only checking to
see if the fuse value was equal to SPDK_NVME_CMD_FUSE_FIRST or
SPDK_NVME_CMD_FUSE_SECOND in spdk_nvmf_ctrlr_process_io_fused_cmd. If a
haywire initiator sent a command with a fused value equal to
SPDK_NVME_CMD_FUSE_MASK, that would result in us skipping all checks and
dereferencing a null pointer in
spdk_nvmf_bdev_ctrlr_compare_and_write_cmd.

To fix this, add an extra condition to validate the fuse field.
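A hedged sketch of such a check, using the fused values from `spdk/nvme_spec.h` (the function name and placement are illustrative, not the actual patch):

~~~
#include "spdk/nvme_spec.h"
#include <stdbool.h>

/* Only the defined fused values are acceptable; a raw
 * SPDK_NVME_CMD_FUSE_MASK (0x3) from a haywire initiator is rejected
 * instead of falling through the FIRST/SECOND checks. */
static bool
nvmf_fuse_value_is_valid(const struct spdk_nvme_cmd *cmd)
{
	return cmd->fuse == SPDK_NVME_CMD_FUSE_NONE ||
	       cmd->fuse == SPDK_NVME_CMD_FUSE_FIRST ||
	       cmd->fuse == SPDK_NVME_CMD_FUSE_SECOND;
}
~~~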

Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483123 (master)
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>

(cherry picked from commit f0ca01e102)
Change-Id: I1ec4169ff5637562effd694f7046c6e3389627f1
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/483255
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-30 16:57:01 +00:00
59 changed files with 1244 additions and 447 deletions


@ -1,6 +1,74 @@
# Changelog
## v20.01: (Upcoming Release)
## v20.01.3: (Upcoming Release)
## v20.01.2:
### dpdk
Updated DPDK submodule to DPDK 19.11.2, which includes fixes for DPDK vulnerabilities:
CVE-2020-10722, CVE-2020-10723, CVE-2020-10724, CVE-2020-10725, CVE-2020-10726.
### env_dpdk
A new function, `spdk_mem_reserve`, has been added to reserve a memory region in SPDK's
memory maps. It pre-allocates data structures to hold memory address translations
without populating the region.
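For illustration, a minimal usage sketch; the address and length below are arbitrary 2 MB-aligned placeholders:

~~~
#include "spdk/env.h"

static int
reserve_region(void)
{
	void *vaddr = (void *)0x200000000000; /* arbitrary, 2 MB aligned */
	size_t len = 4 * 1024 * 1024;         /* must be a multiple of 2 MB */

	/* Allocates map bookkeeping now, so later spdk_mem_register()
	 * calls on this range need no internal allocations. */
	return spdk_mem_reserve(vaddr, len);  /* 0, -EINVAL or -EBUSY */
}
~~~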
### rpc
A new RPC, `bdev_rbd_resize`, has been added to resize the Ceph RBD bdev.
## v20.01.1:
## v20.01:
### bdev
A new function, `spdk_bdev_set_timeout`, has been added to set per descriptor I/O timeouts.
A new class of functions, `spdk_bdev_compare*`, has been added to allow native bdev support
of block comparisons and compare-and-write.
A new class of bdev events, `SPDK_BDEV_EVENT_MEDIA_MANAGEMENT`, has been added to allow bdevs
which expose raw media to alert all I/O channels of pending media management events.
A new API, `spdk_bdev_io_get_aux_buf`, was added allowing the caller to request
an auxiliary buffer for its own private use. The API is used in the same manner that
`spdk_bdev_io_get_buf` is used and the length of the buffer is always the same as the
bdev_io primary buffer. `spdk_bdev_io_put_aux_buf` frees the allocated auxiliary
buffer.
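A short sketch of the intended request/release flow (the callback body is illustrative):

~~~
#include "spdk/bdev.h"

/* The aux buffer has the same length as the bdev_io primary buffer. */
static void
aux_buf_cb(struct spdk_io_channel *ch, struct spdk_bdev_io *bdev_io, void *aux_buf)
{
	/* ... use aux_buf as private scratch space ... */
	spdk_bdev_io_put_aux_buf(bdev_io, aux_buf);
}

static void
request_scratch(struct spdk_bdev_io *bdev_io)
{
	spdk_bdev_io_get_aux_buf(bdev_io, aux_buf_cb);
}
~~~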
### blobfs
Added boolean return value for function spdk_fs_set_cache_size to indicate its operation result.
Added `blobfs_set_cache_size` RPC method to set cache size for blobstore filesystem.
### blobstore
Added new `use_extent_table` option to `spdk_blob_opts` for creating blobs with the Extent Table descriptor.
Using this metadata format dramatically decreases the number of writes required to persist each cluster allocation
for thin provisioned blobs. The Extent Table descriptor is enabled by default.
See the [Blobstore Programmer's Guide](https://spdk.io/doc/blob.html#blob_pg_cluster_layout) for more details.
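A minimal creation sketch; disabling the option is only needed when the blobstore must remain loadable by pre-20.01 applications:

~~~
#include "spdk/blob.h"

static void
create_compat_blob(struct spdk_blob_store *bs,
		   spdk_blob_op_with_id_complete cb_fn, void *cb_arg)
{
	struct spdk_blob_opts opts;

	spdk_blob_opts_init(&opts);
	opts.use_extent_table = false; /* fall back to Extent RLE */
	spdk_bs_create_blob_ext(bs, &opts, cb_fn, cb_arg);
}
~~~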
### dpdk
Updated DPDK submodule to DPDK 19.11.
### env_dpdk
`spdk_env_dpdk_post_init` now takes a boolean, `legacy_mem`, as an argument.
A new function, `spdk_env_dpdk_dump_mem_stats`, prints information about the memory consumed by DPDK to a file specified by
the user. A new utility, `scripts/dpdk_mem_info.py`, wraps this function and prints the output in an easy to read way.
### event
The functions `spdk_reactor_enable_framework_monitor_context_switch()` and
`spdk_reactor_framework_monitor_context_switch_enabled()` have been changed to
`spdk_framework_enable_context_switch_monitor()` and
`spdk_framework_context_switch_monitor_enabled()`, respectively.
### ftl
@ -18,43 +86,6 @@ parameter.
`spdk_ftl_punit_range` and `ftl_module_init_opts` structures were removed.
### nvmf
Support for custom NVMe admin command handlers and admin command passthru
in the NVMF subsystem.
It is now possible to set a custom handler for a specific NVMe admin command.
For example, vendor specific admin commands can now be intercepted by implementing
a function handling the command.
Further NVMe admin commands can be forwarded straight to an underlying NVMe bdev.
The functions `spdk_nvmf_set_custom_admin_cmd_hdlr` and `spdk_nvmf_set_passthru_admin_cmd`
in `spdk_internal/nvmf.h` expose this functionality. There is an example custom admin handler
for the NVMe IDENTIFY CTRLR in `lib/nvmf/custom_cmd_hdlr.c`. This handler gets the SN, MN, FR, IEEE, FGUID
attributes from the first NVMe drive in the NVMF subsystem and returns it to the NVMF initiator (sn and mn attributes
specified during NVMF subsystem creation RPC will be overwritten).
This handler can be enabled via the `nvmf_set_config` RPC.
Note: In a future version of SPDK, this handler will be enabled by default.
### bdev
A new API was added `spdk_bdev_io_get_aux_buf` allowing the caller to request
an auxiliary buffer for its own private use. The API is used in the same manner that
`spdk_bdev_io_get_buf` is used and the length of the buffer is always the same as the
bdev_io primary buffer. 'spdk_bdev_io_put_aux_buf' frees the allocated auxiliary
buffer.
### sock
Added spdk_sock_writev_async for performing asynchronous writes to sockets. This call will
never return EAGAIN, instead queueing internally until the data has all been sent. This can
simplify many code flows that create pollers to continue attempting to flush writes
on sockets.
Added `impl_name` parameter in spdk_sock_listen and spdk_sock_connect functions. Users may now
specify the sock layer implementation they'd prefer to use. Valid implementations are currently
"vpp" and "posix" and NULL, where NULL results in the previous behavior of the functions.
### isa-l
Updated ISA-L submodule to commit f3993f5c0b6911 which includes implementation and
@ -62,16 +93,30 @@ optimization for aarch64.
Enabled ISA-L on aarch64 by default in addition to x86.
### thread
`spdk_thread_send_msg` now returns int indicating if the message was successfully
sent.
### blobfs
Added boolean return value for function spdk_fs_set_cache_size to indicate its operation result.
Added `blobfs_set_cache_size` RPC method to set cache size for blobstore filesystem.
### nvme
`delayed_pcie_doorbell` parameter in `spdk_nvme_io_qpair_opts` was renamed to `delay_cmd_submit`
to allow reuse in other transports.
Added RDMA WR batching to NVMf RDMA initiator. Send and receive WRs are chained together
and posted with a single call to ibv_post_send(receive) in the next call to qpair completion
processing function. Batching is controlled by 'delay_cmd_submit' qpair option.
The NVMe-oF initiator now supports plugging out of tree NVMe-oF transports. In order
to facilitate this feature, several small API changes have been made:
The `spdk_nvme_transport_id` struct now contains a trstring member used to identify the transport.
A new function, `spdk_nvme_transport_available_by_name`, has been added.
A function table, `spdk_nvme_transport_ops`, and macro, `SPDK_NVME_TRANSPORT_REGISTER`, have been added which
enable registering out of tree transports.
A new function, `spdk_nvme_ns_supports_compare`, allows a user to check whether a given namespace supports the compare
operation.
A new family of functions, `spdk_nvme_ns_compare*`, give the user access to submitting compare commands to NVMe namespaces.
A new function, `spdk_nvme_ctrlr_cmd_get_log_page_ext`, gives users more granular control over the command dwords sent in
log page requests.
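For the out-of-tree transport hooks, a heavily abridged registration sketch; the real ops table requires a full set of controller and qpair callbacks, elided here, and the field names should be treated as assumptions checked against this release:

~~~
#include "spdk/nvme.h"

/* Abridged sketch: a real spdk_nvme_transport_ops must fill in all
 * ctrlr/qpair callbacks, omitted here for brevity. */
static const struct spdk_nvme_transport_ops my_transport_ops = {
	.name = "MYTRANSPORT",
	.type = SPDK_NVME_TRANSPORT_CUSTOM,
	/* .ctrlr_construct = ..., .ctrlr_destruct = ...,
	 * .qpair_submit_request = ..., etc. */
};

SPDK_NVME_TRANSPORT_REGISTER(my_transport, &my_transport_ops);
~~~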
### nvmf
@ -91,45 +136,71 @@ Add `spdk_nvmf_tgt_stop_listen()` that can be used to stop listening for
incoming connections for specified target and trid. Listener is not stopped
implicitly upon destruction of a subsystem any more.
A custom NVMe admin command handler has been added which allows the user to use the real drive
attributes from one of the target NVMe drives when reporting drive attributes to the initiator.
This handler can be enabled via the `nvmf_set_config` RPC.
Note: In a future version of SPDK, this handler will be enabled by default.
The SPDK target and initiator both now include compare-and-write functionality with one caveat. If using the RDMA transport,
the target expects the initiator to send both the compare command and write command either with, or without inline data. The
SPDK initiator currently respects this requirement, but this note is included as a flag for other initiators attempting
compatibility with this version of SPDK.
### rpc
A new RPC, `bdev_zone_block_create`, enables creating an emulated zoned bdev on top of a standard block device.
A new RPC, `bdev_ocssd_create`, enables creating an emulated zoned bdev on top of an Open Channel SSD.
A new RPC, `blobfs_set_cache_size`, enables managing blobfs cache size.
A new RPC, `env_dpdk_get_mem_stats`, has been added to facilitate reading DPDK related memory
consumption stats. Please see the env_dpdk section above for more details.
A new RPC, `framework_get_reactors`, has been added to retrieve a list of all reactors.
`bdev_ftl_create` now takes a `base_bdev` argument in lieu of `trtype`, `traddr`, and `punits`.
`bdev_nvme_set_options` now allows users to disable I/O submission batching with the `-d` flag.
`bdev_nvme_cuse_register` now accepts a `name` parameter.
`bdev_uring_create` now takes arguments for `bdev_name` and `block_size`.
`nvmf_set_config` now takes an argument to enable passthru of identify commands to base NVMe devices.
Please see the nvmf section above for more details.
### scsi
`spdk_scsi_lun_get_dif_ctx` now takes an additional argument of type `spdk_scsi_task`.
### sock
Added spdk_sock_writev_async for performing asynchronous writes to sockets. This call will
never return EAGAIN, instead queueing internally until the data has all been sent. This can
simplify many code flows that create pollers to continue attempting to flush writes
on sockets.
Added `impl_name` parameter in spdk_sock_listen and spdk_sock_connect functions. Users may now
specify the sock layer implementation they'd prefer to use. Valid implementations are currently
"vpp" and "posix" and NULL, where NULL results in the previous behavior of the functions.
### thread
`spdk_thread_send_msg` now returns int indicating if the message was successfully
sent.
A new function `spdk_thread_send_critical_msg`, has been added to support sending a single message from
a context that may be interrupted, e.g. a signal handler.
Two new functions, `spdk_poller_pause`, and `spdk_poller_resume`, have been added to give greater control
of pollers to the application owner.
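A small sketch of checking the new `spdk_thread_send_msg` return value (the message function is illustrative):

~~~
#include "spdk/thread.h"
#include <stdio.h>

static void
hello_fn(void *ctx)
{
	printf("running on thread %s\n", spdk_thread_get_name(spdk_get_thread()));
}

static void
send_hello(struct spdk_thread *target)
{
	if (spdk_thread_send_msg(target, hello_fn, NULL) != 0) {
		fprintf(stderr, "failed to queue message\n");
	}
}
~~~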
### util
`spdk_pipe`, a new utility for buffering data from sockets or files for parsing,
has been added. The public API is available at `include/spdk/pipe.h`.
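A condensed usage sketch, with error handling and the actual socket I/O elided:

~~~
#include "spdk/pipe.h"
#include <sys/uio.h>

static void
pipe_roundtrip(void)
{
	static uint8_t mem[4096];
	struct spdk_pipe *pipe = spdk_pipe_create(mem, sizeof(mem));
	struct iovec iovs[2];

	/* Producer: obtain writable space, fill it, then commit. */
	spdk_pipe_writer_get_buffer(pipe, 1024, iovs);
	/* ... recv() socket data into iovs here ... */
	spdk_pipe_writer_advance(pipe, 1024);

	/* Consumer: view buffered bytes, parse, then release. */
	spdk_pipe_reader_get_buffer(pipe, 1024, iovs);
	/* ... parse iovs here ... */
	spdk_pipe_reader_advance(pipe, 1024);

	spdk_pipe_destroy(pipe);
}
~~~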
### nvme
`delayed_pcie_doorbell` parameter in `spdk_nvme_io_qpair_opts` was renamed to `delay_cmd_submit`
to allow reuse in other transports.
Added RDMA WR batching to NVMf RDMA initiator. Send and receive WRs are chained together
and posted with a single call to ibv_post_send(receive) in the next call to qpair completion
processing function. Batching is controlled by 'delay_cmd_submit' qpair option.
The NVMe-oF initiator now supports plugging out of tree NVMe-oF transports. In order
to facilitate this feature, several small API changes have been made:
The `spdk_nvme_transport_id` struct now contains a trstring member used to identify the transport.
A new function, `spdk_nvme_transport_available_by_name`, has been added.
A function table, `spdk_nvme_transport_ops`, and macro, `SPDK_NVME_TRANSPORT_REGISTER`, have been added which
enable registering out of tree transports.
### rpc
Added optional 'delay_cmd_submit' parameter to 'bdev_nvme_set_options' RPC method.
A new RPC, `framework_get_reactors`, has been added to retrieve a list of all reactors.
### dpdk
Updated DPDK submodule to DPDK 19.11.
### event
The functions `spdk_reactor_enable_framework_monitor_context_switch()` and
`spdk_reactor_framework_monitor_context_switch_enabled()` have been changed to
`spdk_framework_enable_context_switch_monitor()` and
`spdk_framework_context_switch_monitor_enabled()`, respectively.
### bdev
Added spdk_bdev_io_get_nvme_fused_status function for translating bdev_io status to NVMe status


@ -144,7 +144,7 @@ def confirmPerPatchTests(test_list, skiplist):
exit(1)
def aggregateCompletedTests(output_dir, repo_dir):
def aggregateCompletedTests(output_dir, repo_dir, skip_confirm=False):
test_list = {}
test_completion_table = []
@ -172,14 +172,15 @@ def aggregateCompletedTests(output_dir, repo_dir):
printListInformation("Tests", test_list)
generateTestCompletionTables(output_dir, test_completion_table)
skipped_tests = getSkippedTests(repo_dir)
confirmPerPatchTests(test_list, skipped_tests)
if not skip_confirm:
confirmPerPatchTests(test_list, skipped_tests)
def main(output_dir, repo_dir):
def main(output_dir, repo_dir, skip_confirm=False):
generateCoverageReport(output_dir, repo_dir)
collectOne(output_dir, 'doc')
collectOne(output_dir, 'ut_coverage')
aggregateCompletedTests(output_dir, repo_dir)
aggregateCompletedTests(output_dir, repo_dir, skip_confirm)
if __name__ == "__main__":
@ -188,5 +189,7 @@ if __name__ == "__main__":
help="The location of your build's output directory")
parser.add_argument("-r", "--repo_directory", type=str, required=True,
help="The location of your spdk repository")
parser.add_argument("-s", "--skip_confirm", required=False, action="store_true",
help="Do not check if all autotest.sh tests were executed.")
args = parser.parse_args()
main(args.directory_location, args.repo_directory)
main(args.directory_location, args.repo_directory, args.skip_confirm)


@ -119,6 +119,12 @@ To remove a block device representation use the bdev_rbd_delete command.
`rpc.py bdev_rbd_delete Rbd0`
To resize a bdev use the bdev_rbd_resize command.
`rpc.py bdev_rbd_resize Rbd0 4096`
This command will resize the Rbd0 bdev to 4096 MiB.
# Compression Virtual Bdev Module {#bdev_config_compress}
The compression bdev module can be configured to provide compression/decompression


@ -318,6 +318,24 @@ form a linked list. The first page in the list will be written in place on updat
be written to fresh locations. This requires the backing device to support an atomic write size greater than
or equal to the page size to guarantee that the operation is atomic. See the section on atomicity for details.
### Blob cluster layout {#blob_pg_cluster_layout}
Each blob is an ordered list of clusters, where the starting LBA of a cluster is called an extent. A blob can be
thin provisioned, resulting in no extent for some of the clusters. When the first write operation occurs
to an unallocated cluster, a new extent is chosen. This information is stored in RAM and on-disk.
There are two extent representations on-disk, dependent on the `use_extent_table` (default: true) option used
when creating a blob.
* **use_extent_table=true**: The EXTENT_PAGE descriptor is not part of the linked list of pages. It contains extents
that are not run-length encoded. Each extent page is referenced by an EXTENT_TABLE descriptor, which is serialized
as part of the linked list of pages. The extent table run-length encodes all unallocated extent pages.
Every new cluster allocation updates a single extent page when that extent page was previously allocated;
otherwise it additionally incurs serializing the whole linked list of pages for the blob.
* **use_extent_table=false**: The EXTENT_RLE descriptor is serialized as part of the linked list of pages.
Extents pointing to contiguous LBAs are run-length encoded, including unallocated extents represented by 0.
Every new cluster allocation incurs serializing the whole linked list of pages for the blob.
### Sequences and Batches
Internally Blobstore uses the concepts of sequences and batches to submit IO to the underlying device in either


@ -1853,6 +1853,49 @@ Example response:
}
~~~
## bdev_rbd_resize {#rpc_bdev_rbd_resize}
Resize @ref bdev_config_rbd bdev
This method is available only if SPDK was built with Ceph RBD support.
### Result
`true` if bdev with provided name was resized or `false` otherwise.
### Parameters
Name | Optional | Type | Description
----------------------- | -------- | ----------- | -----------
name | Required | string | Bdev name
new_size | Required | int | New bdev size for resize operation in MiB
### Example
Example request:
~~~
{
  "params": {
    "name": "Rbd0",
    "new_size": 4096
  },
  "jsonrpc": "2.0",
  "method": "bdev_rbd_resize",
  "id": 1
}
~~~
Example response:
~~~
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": true
}
~~~
## bdev_delay_create {#rpc_bdev_delay_create}
Create delay bdev. This bdev type redirects all IO to its base bdev and inserts a delay on the completion

dpdk

@ -1 +1 @@
Subproject commit fdb511332624e28631f553a226abb1dc0b35b28a
Subproject commit ef71bfaface10cc19b75e45d3158ab71a788e3a9


@ -140,6 +140,18 @@ endif
# Allow users to specify EXTRA_DPDK_CFLAGS if they want to build DPDK using unsupported compiler versions
DPDK_CFLAGS += $(EXTRA_DPDK_CFLAGS)
ifeq ($(CC_TYPE),gcc)
GCC_MAJOR = $(shell echo __GNUC__ | $(CC) -E -x c - | tail -n 1)
ifeq ($(shell test $(GCC_MAJOR) -ge 10 && echo 1), 1)
#1. gcc 10 complains on operations with zero size arrays in rte_cryptodev.c, so
#disable this warning
#2. gcc 10 disables fcommon by default and complains on multiple definition of
#aesni_mb_logtype_driver symbol which is defined in header file and presented in several
#translation units
DPDK_CFLAGS += -Wno-stringop-overflow -fcommon
endif
endif
$(SPDK_ROOT_DIR)/dpdk/build: $(SPDK_ROOT_DIR)/mk/cc.mk $(SPDK_ROOT_DIR)/include/spdk/config.h
$(Q)rm -rf $(SPDK_ROOT_DIR)/dpdk/build
$(Q)$(MAKE) -C $(SPDK_ROOT_DIR)/dpdk config T=$(DPDK_CONFIG) $(DPDK_OPTS)


@ -11,6 +11,7 @@ iodepth=128
rw=randrw
bs=16k
verify=md5
verify_backlog=32
[test]
numjobs=1


@ -1241,6 +1241,20 @@ int spdk_mem_register(void *vaddr, size_t len);
*/
int spdk_mem_unregister(void *vaddr, size_t len);
/**
* Reserve the address space specified in all memory maps.
*
* This pre-allocates the necessary space in the memory maps such that
* future calls to spdk_mem_register() on that region require no
* internal memory allocations.
*
* \param vaddr Virtual address to reserve
* \param len Length in bytes of vaddr
*
* \return 0 on success, negated errno on failure.
*/
int spdk_mem_reserve(void *vaddr, size_t len);
#ifdef __cplusplus
}
#endif


@ -2781,6 +2781,14 @@ struct spdk_nvme_rdma_hooks {
* \return Infiniband remote key (rkey) for this buf
*/
uint64_t (*get_rkey)(struct ibv_pd *pd, void *buf, size_t size);
/**
* \brief Put back a key obtained from get_rkey.
*
* \param key The Infiniband remote key (rkey) obtained from get_rkey
*
*/
void (*put_rkey)(uint64_t key);
};
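For context, a hedged sketch of supplying both hooks together; the key value is purely illustrative, and the table would then be installed via the existing `spdk_nvme_rdma_init_hooks()` installer:

~~~
#include "spdk/nvme.h"

static uint64_t
my_get_rkey(struct ibv_pd *pd, void *buf, size_t size)
{
	/* Illustrative only: a real hook would return the rkey of the
	 * pre-registered region covering buf. */
	return 0x1234;
}

static void
my_put_rkey(uint64_t key)
{
	/* Release whatever bookkeeping my_get_rkey performed. */
}

static struct spdk_nvme_rdma_hooks g_my_hooks = {
	.get_rkey = my_get_rkey,
	.put_rkey = my_put_rkey,
};
~~~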
/**


@ -1156,7 +1156,7 @@ enum spdk_nvme_generic_command_status_code {
enum spdk_nvme_command_specific_status_code {
SPDK_NVME_SC_COMPLETION_QUEUE_INVALID = 0x00,
SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER = 0x01,
SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED = 0x02,
SPDK_NVME_SC_INVALID_QUEUE_SIZE = 0x02,
SPDK_NVME_SC_ABORT_COMMAND_LIMIT_EXCEEDED = 0x03,
/* 0x04 - reserved */
SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED = 0x05,


@ -54,7 +54,7 @@
* Patch level is incremented on maintenance branch releases and reset to 0 for each
* new major.minor release.
*/
#define SPDK_VERSION_PATCH 0
#define SPDK_VERSION_PATCH 3
/**
* Version string suffix.


@ -77,6 +77,12 @@ struct spdk_sock_group_impl {
struct spdk_net_impl *net_impl;
TAILQ_HEAD(, spdk_sock) socks;
STAILQ_ENTRY(spdk_sock_group_impl) link;
/* List of removed sockets, refreshed each time we poll the sock group. */
int num_removed_socks;
/* Unfortunately, we can't just keep a tailq of the sockets in case they are freed
* or added to another poll group later.
*/
uintptr_t removed_socks[MAX_EVENTS_PER_POLL];
};
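A sketched use of these fields during event dispatch, simplified from the posix implementation:

~~~
#include <stdbool.h>

static bool
sock_group_sock_removed(struct spdk_sock_group_impl *group, struct spdk_sock *sock)
{
	int i;

	for (i = 0; i < group->num_removed_socks; i++) {
		if (group->removed_socks[i] == (uintptr_t)sock) {
			return true; /* cached event for a since-removed socket */
		}
	}
	return false;
}
~~~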
struct spdk_net_impl {


@ -165,7 +165,7 @@ spdk_scsi_nvme_translate(const struct spdk_bdev_io *bdev_io, int *sc, int *sk,
*ascq = SPDK_SCSI_ASCQ_CAUSE_NOT_REPORTABLE;
break;
case SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER:
case SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED:
case SPDK_NVME_SC_INVALID_QUEUE_SIZE:
case SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED:
case SPDK_NVME_SC_INVALID_FIRMWARE_SLOT:
case SPDK_NVME_SC_INVALID_FIRMWARE_IMAGE:


@ -247,6 +247,7 @@ _spdk_blob_alloc(struct spdk_blob_store *bs, spdk_blob_id id)
TAILQ_INIT(&blob->xattrs);
TAILQ_INIT(&blob->xattrs_internal);
TAILQ_INIT(&blob->pending_persists);
return blob;
}
@ -268,6 +269,7 @@ static void
_spdk_blob_free(struct spdk_blob *blob)
{
assert(blob != NULL);
assert(TAILQ_EMPTY(&blob->pending_persists));
free(blob->active.extent_pages);
free(blob->clean.extent_pages);
@ -1520,6 +1522,7 @@ struct spdk_blob_persist_ctx {
spdk_bs_sequence_t *seq;
spdk_bs_sequence_cpl cb_fn;
void *cb_arg;
TAILQ_ENTRY(spdk_blob_persist_ctx) link;
};
static void
@ -1540,22 +1543,34 @@ spdk_bs_batch_clear_dev(struct spdk_blob_persist_ctx *ctx, spdk_bs_batch_t *batc
}
}
static void _spdk_blob_persist_check_dirty(struct spdk_blob_persist_ctx *ctx);
static void
_spdk_blob_persist_complete(spdk_bs_sequence_t *seq, void *cb_arg, int bserrno)
{
struct spdk_blob_persist_ctx *ctx = cb_arg;
struct spdk_blob_persist_ctx *next_persist;
struct spdk_blob *blob = ctx->blob;
if (bserrno == 0) {
_spdk_blob_mark_clean(blob);
}
assert(ctx == TAILQ_FIRST(&blob->pending_persists));
TAILQ_REMOVE(&blob->pending_persists, ctx, link);
next_persist = TAILQ_FIRST(&blob->pending_persists);
/* Call user callback */
ctx->cb_fn(seq, ctx->cb_arg, bserrno);
/* Free the memory */
spdk_free(ctx->pages);
free(ctx);
if (next_persist != NULL) {
_spdk_blob_persist_check_dirty(next_persist);
}
}
static void
@ -2060,6 +2075,25 @@ _spdk_blob_persist_dirty(spdk_bs_sequence_t *seq, void *cb_arg, int bserrno)
_spdk_bs_write_super(seq, ctx->blob->bs, ctx->super, _spdk_blob_persist_dirty_cpl, ctx);
}
static void
_spdk_blob_persist_check_dirty(struct spdk_blob_persist_ctx *ctx)
{
if (ctx->blob->bs->clean) {
ctx->super = spdk_zmalloc(sizeof(*ctx->super), 0x1000, NULL,
SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
if (!ctx->super) {
ctx->cb_fn(ctx->seq, ctx->cb_arg, -ENOMEM);
free(ctx);
return;
}
spdk_bs_sequence_read_dev(ctx->seq, ctx->super, _spdk_bs_page_to_lba(ctx->blob->bs, 0),
_spdk_bs_byte_to_lba(ctx->blob->bs, sizeof(*ctx->super)),
_spdk_blob_persist_dirty, ctx);
} else {
_spdk_blob_persist_start(ctx);
}
}
/* Write a blob to disk */
static void
@ -2070,7 +2104,7 @@ _spdk_blob_persist(spdk_bs_sequence_t *seq, struct spdk_blob *blob,
_spdk_blob_verify_md_op(blob);
if (blob->state == SPDK_BLOB_STATE_CLEAN) {
if (blob->state == SPDK_BLOB_STATE_CLEAN && TAILQ_EMPTY(&blob->pending_persists)) {
cb_fn(seq, cb_arg, 0);
return;
}
@ -2086,21 +2120,15 @@ _spdk_blob_persist(spdk_bs_sequence_t *seq, struct spdk_blob *blob,
ctx->cb_arg = cb_arg;
ctx->next_extent_page = 0;
if (blob->bs->clean) {
ctx->super = spdk_zmalloc(sizeof(*ctx->super), 0x1000, NULL,
SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
if (!ctx->super) {
cb_fn(seq, cb_arg, -ENOMEM);
free(ctx);
return;
}
spdk_bs_sequence_read_dev(seq, ctx->super, _spdk_bs_page_to_lba(blob->bs, 0),
_spdk_bs_byte_to_lba(blob->bs, sizeof(*ctx->super)),
_spdk_blob_persist_dirty, ctx);
} else {
_spdk_blob_persist_start(ctx);
/* Multiple blob persists can affect one another, via blob->state or
* blob mutable data changes. To prevent it, queue up the persists. */
if (!TAILQ_EMPTY(&blob->pending_persists)) {
TAILQ_INSERT_TAIL(&blob->pending_persists, ctx, link);
return;
}
TAILQ_INSERT_HEAD(&blob->pending_persists, ctx, link);
_spdk_blob_persist_check_dirty(ctx);
}
struct spdk_blob_copy_cluster_ctx {
@ -5129,6 +5157,9 @@ _spdk_bs_create_blob(struct spdk_blob_store *bs,
}
blob->use_extent_table = opts->use_extent_table;
if (blob->use_extent_table) {
blob->invalid_flags |= SPDK_BLOB_EXTENT_TABLE;
}
if (!internal_xattrs) {
_spdk_blob_xattrs_init(&internal_xattrs_default);
@ -6179,7 +6210,14 @@ _spdk_delete_snapshot_sync_clone_cpl(void *cb_arg, int bserrno)
ctx->snapshot->active.clusters[i] = 0;
}
}
for (i = 0; i < ctx->snapshot->active.num_extent_pages &&
i < ctx->clone->active.num_extent_pages; i++) {
if (ctx->clone->active.extent_pages[i] == ctx->snapshot->active.extent_pages[i]) {
ctx->snapshot->active.extent_pages[i] = 0;
}
}
_spdk_blob_set_thin_provision(ctx->snapshot);
ctx->snapshot->state = SPDK_BLOB_STATE_DIRTY;
if (ctx->parent_snapshot_entry != NULL) {
@ -6212,6 +6250,12 @@ _spdk_delete_snapshot_sync_snapshot_xattr_cpl(void *cb_arg, int bserrno)
ctx->clone->active.clusters[i] = ctx->snapshot->active.clusters[i];
}
}
for (i = 0; i < ctx->snapshot->active.num_extent_pages &&
i < ctx->clone->active.num_extent_pages; i++) {
if (ctx->clone->active.extent_pages[i] == 0) {
ctx->clone->active.extent_pages[i] = ctx->snapshot->active.extent_pages[i];
}
}
/* Delete old backing bs_dev from clone (related to snapshot that will be removed) */
ctx->clone->back_bs_dev->destroy(ctx->clone->back_bs_dev);


@ -166,6 +166,9 @@ struct spdk_blob {
bool extent_table_found;
bool use_extent_table;
/* A list of pending metadata persist operations (pending_persists) */
TAILQ_HEAD(, spdk_blob_persist_ctx) pending_persists;
/* Number of data clusters retrieved from extent table,
* that many have to be read from extent pages. */
uint64_t remaining_clusters_in_et;
@ -331,7 +334,8 @@ struct spdk_blob_md_descriptor_extent_page {
#define SPDK_BLOB_THIN_PROV (1ULL << 0)
#define SPDK_BLOB_INTERNAL_XATTR (1ULL << 1)
#define SPDK_BLOB_INVALID_FLAGS_MASK (SPDK_BLOB_THIN_PROV | SPDK_BLOB_INTERNAL_XATTR)
#define SPDK_BLOB_EXTENT_TABLE (1ULL << 2)
#define SPDK_BLOB_INVALID_FLAGS_MASK (SPDK_BLOB_THIN_PROV | SPDK_BLOB_INTERNAL_XATTR | SPDK_BLOB_EXTENT_TABLE)
#define SPDK_BLOB_READ_ONLY (1ULL << 0)
#define SPDK_BLOB_DATA_RO_FLAGS_MASK SPDK_BLOB_READ_ONLY


@ -34,6 +34,10 @@
SPDK_ROOT_DIR := $(abspath $(CURDIR)/../..)
include $(SPDK_ROOT_DIR)/mk/spdk.common.mk
SO_VER := 2
SO_MINOR := 1
SO_SUFFIX := $(SO_VER).$(SO_MINOR)
CFLAGS += $(ENV_CFLAGS)
C_SRCS = env.c memory.c pci.c init.c threads.c
C_SRCS += pci_nvme.c pci_ioat.c pci_virtio.c pci_vmd.c


@ -78,6 +78,11 @@ ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_bus_pci.*))
DPDK_LIB_LIST += rte_bus_pci
endif
# DPDK 20.05 eal dependency
ifneq (, $(wildcard $(DPDK_ABS_DIR)/lib/librte_telemetry.*))
DPDK_LIB_LIST += rte_telemetry
endif
# There are some complex dependencies when using crypto, reduce or both so
# here we add the feature specific ones and set a flag to add the common
# ones after that.


@ -36,7 +36,6 @@
#include "env_internal.h"
#include <rte_config.h>
#include <rte_malloc.h>
#include <rte_memory.h>
#include <rte_eal_memconfig.h>
@ -343,11 +342,7 @@ spdk_mem_map_free(struct spdk_mem_map **pmap)
}
for (i = 0; i < sizeof(map->map_256tb.map) / sizeof(map->map_256tb.map[0]); i++) {
if (g_legacy_mem) {
rte_free(map->map_256tb.map[i]);
} else {
free(map->map_256tb.map[i]);
}
free(map->map_256tb.map[i]);
}
pthread_mutex_destroy(&map->mutex);
@ -508,6 +503,57 @@ spdk_mem_unregister(void *vaddr, size_t len)
return 0;
}
int
spdk_mem_reserve(void *vaddr, size_t len)
{
struct spdk_mem_map *map;
void *seg_vaddr;
size_t seg_len;
uint64_t reg;
if ((uintptr_t)vaddr & ~MASK_256TB) {
DEBUG_PRINT("invalid usermode virtual address %p\n", vaddr);
return -EINVAL;
}
if (((uintptr_t)vaddr & MASK_2MB) || (len & MASK_2MB)) {
DEBUG_PRINT("invalid %s parameters, vaddr=%p len=%ju\n",
__func__, vaddr, len);
return -EINVAL;
}
if (len == 0) {
return 0;
}
pthread_mutex_lock(&g_spdk_mem_map_mutex);
/* Check if any part of this range is already registered */
seg_vaddr = vaddr;
seg_len = len;
while (seg_len > 0) {
reg = spdk_mem_map_translate(g_mem_reg_map, (uint64_t)seg_vaddr, NULL);
if (reg & REG_MAP_REGISTERED) {
pthread_mutex_unlock(&g_spdk_mem_map_mutex);
return -EBUSY;
}
seg_vaddr += VALUE_2MB;
seg_len -= VALUE_2MB;
}
/* Simply set the translation to the memory map's default. This allocates the space in the
* map but does not provide a valid translation. */
spdk_mem_map_set_translation(g_mem_reg_map, (uint64_t)vaddr, len,
g_mem_reg_map->default_translation);
TAILQ_FOREACH(map, &g_spdk_mem_maps, tailq) {
spdk_mem_map_set_translation(map, (uint64_t)vaddr, len, map->default_translation);
}
pthread_mutex_unlock(&g_spdk_mem_map_mutex);
return 0;
}
static struct map_1gb *
spdk_mem_map_get_map_1gb(struct spdk_mem_map *map, uint64_t vfn_2mb)
{
@ -527,23 +573,7 @@ spdk_mem_map_get_map_1gb(struct spdk_mem_map *map, uint64_t vfn_2mb)
/* Recheck to make sure nobody else got the mutex first. */
map_1gb = map->map_256tb.map[idx_256tb];
if (!map_1gb) {
/* Some of the existing apps use TCMalloc hugepage
* allocator and register this tcmalloc allocated
* hugepage memory with SPDK in the mmap hook. Since
* this function is called in the spdk_mem_register
* code path we can't do a malloc here otherwise that
* would cause a livelock. So we use the dpdk provided
* allocator instead, which avoids this cyclic
* dependency. Note this is only guaranteed to work when
* DPDK dynamic memory allocation is disabled (--legacy-mem),
* which then is a requirement for anyone using TCMalloc in
* this way.
*/
if (g_legacy_mem) {
map_1gb = rte_malloc(NULL, sizeof(struct map_1gb), 0);
} else {
map_1gb = malloc(sizeof(struct map_1gb));
}
map_1gb = malloc(sizeof(struct map_1gb));
if (map_1gb) {
/* initialize all entries to default translation */
for (i = 0; i < SPDK_COUNTOF(map_1gb->map); i++) {
@ -778,14 +808,23 @@ static TAILQ_HEAD(, spdk_vtophys_pci_device) g_vtophys_pci_devices =
TAILQ_HEAD_INITIALIZER(g_vtophys_pci_devices);
static struct spdk_mem_map *g_vtophys_map;
static struct spdk_mem_map *g_phys_ref_map;
#if SPDK_VFIO_ENABLED
static int
vtophys_iommu_map_dma(uint64_t vaddr, uint64_t iova, uint64_t size)
{
struct spdk_vfio_dma_map *dma_map;
uint64_t refcount;
int ret;
refcount = spdk_mem_map_translate(g_phys_ref_map, iova, NULL);
assert(refcount < UINT64_MAX);
if (refcount > 0) {
spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount + 1);
return 0;
}
dma_map = calloc(1, sizeof(*dma_map));
if (dma_map == NULL) {
return -ENOMEM;
@ -832,6 +871,7 @@ vtophys_iommu_map_dma(uint64_t vaddr, uint64_t iova, uint64_t size)
out_insert:
TAILQ_INSERT_TAIL(&g_vfio.maps, dma_map, tailq);
pthread_mutex_unlock(&g_vfio.mutex);
spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount + 1);
return 0;
}
@ -839,6 +879,7 @@ static int
vtophys_iommu_unmap_dma(uint64_t iova, uint64_t size)
{
struct spdk_vfio_dma_map *dma_map;
uint64_t refcount;
int ret;
pthread_mutex_lock(&g_vfio.mutex);
@ -854,6 +895,18 @@ vtophys_iommu_unmap_dma(uint64_t iova, uint64_t size)
return -ENXIO;
}
refcount = spdk_mem_map_translate(g_phys_ref_map, iova, NULL);
assert(refcount < UINT64_MAX);
if (refcount > 0) {
spdk_mem_map_set_translation(g_phys_ref_map, iova, size, refcount - 1);
}
/* We still have outstanding references, don't clear it. */
if (refcount > 1) {
pthread_mutex_unlock(&g_vfio.mutex);
return 0;
}
/** don't support partial or multiple-page unmap for now */
assert(dma_map->map.size == size);
@ -1383,10 +1436,21 @@ spdk_vtophys_init(void)
.are_contiguous = vtophys_check_contiguous_entries,
};
const struct spdk_mem_map_ops phys_ref_map_ops = {
.notify_cb = NULL,
.are_contiguous = NULL,
};
#if SPDK_VFIO_ENABLED
spdk_vtophys_iommu_init();
#endif
g_phys_ref_map = spdk_mem_map_alloc(0, &phys_ref_map_ops, NULL);
if (g_phys_ref_map == NULL) {
DEBUG_PRINT("phys_ref map allocation failed.\n");
return -ENOMEM;
}
g_vtophys_map = spdk_mem_map_alloc(SPDK_VTOPHYS_ERROR, &vtophys_map_ops, NULL);
if (g_vtophys_map == NULL) {
DEBUG_PRINT("vtophys map allocation failed\n");


@ -62,15 +62,7 @@ spdk_map_bar_rte(struct spdk_pci_device *device, uint32_t bar,
struct rte_pci_device *dev = device->dev_handle;
*mapped_addr = dev->mem_resource[bar].addr;
if (*mapped_addr == NULL) {
return -1;
}
*phys_addr = (uint64_t)dev->mem_resource[bar].phys_addr;
if (*phys_addr == 0) {
return -1;
}
*size = (uint64_t)dev->mem_resource[bar].len;
return 0;
@ -141,8 +133,8 @@ spdk_detach_rte(struct spdk_pci_device *dev)
dev->internal.pending_removal = true;
if (spdk_process_is_primary() && !pthread_equal(g_dpdk_tid, pthread_self())) {
rte_eal_alarm_set(1, spdk_detach_rte_cb, rte_dev);
/* wait up to 20ms for the cb to start executing */
for (i = 20; i > 0; i--) {
/* wait up to 2s for the cb to finish executing */
for (i = 2000; i > 0; i--) {
spdk_delay_us(1000);
pthread_mutex_lock(&g_pci_mutex);
@ -157,7 +149,7 @@ spdk_detach_rte(struct spdk_pci_device *dev)
/* besides checking the removed flag, we also need to wait
* for the dpdk detach function to unwind, as it's doing some
* operations even after calling our detach callback. Simply
* cancell the alarm - if it started executing already, this
* cancel the alarm - if it started executing already, this
* call will block and wait for it to finish.
*/
rte_eal_alarm_cancel(spdk_detach_rte_cb, rte_dev);
@ -171,6 +163,8 @@ spdk_detach_rte(struct spdk_pci_device *dev)
if (!removed) {
fprintf(stderr, "Timeout waiting for DPDK to remove PCI device %s.\n",
rte_dev->name);
/* If we reach this state, then the device couldn't be removed and most likely
a subsequent hot add of a device in the same BDF will fail */
}
} else {
spdk_detach_rte_cb(rte_dev);


@ -1138,6 +1138,11 @@ iscsi_conn_login_pdu_success_complete(void *arg)
{
struct spdk_iscsi_conn *conn = arg;
if (conn->state >= ISCSI_CONN_STATE_EXITING) {
/* Connection is being exited before this callback is executed. */
SPDK_DEBUGLOG(SPDK_LOG_ISCSI, "Connection is already exited.\n");
return;
}
if (conn->full_feature) {
if (iscsi_conn_params_update(conn) != 0) {
return;


@ -34,6 +34,7 @@
#include "spdk/nvmf_spec.h"
#include "nvme_internal.h"
#include "nvme_io_msg.h"
#include "nvme_uevent.h"
#define SPDK_NVME_DRIVER_NAME "spdk_nvme_driver"
@ -350,10 +351,19 @@ nvme_robust_mutex_init_shared(pthread_mutex_t *mtx)
int
nvme_driver_init(void)
{
static pthread_mutex_t g_init_mutex = PTHREAD_MUTEX_INITIALIZER;
int ret = 0;
/* Any socket ID */
int socket_id = -1;
/* Use a special process-private mutex to ensure the global
* nvme driver object (g_spdk_nvme_driver) gets initialized by
* only one thread. Once that object is established and its
* mutex is initialized, we can unlock this mutex and use that
* one instead.
*/
pthread_mutex_lock(&g_init_mutex);
/* Each process needs its own pid. */
g_spdk_nvme_pid = getpid();
@ -366,6 +376,7 @@ nvme_driver_init(void)
if (spdk_process_is_primary()) {
/* The unique named memzone already reserved. */
if (g_spdk_nvme_driver != NULL) {
pthread_mutex_unlock(&g_init_mutex);
return 0;
} else {
g_spdk_nvme_driver = spdk_memzone_reserve(SPDK_NVME_DRIVER_NAME,
@ -375,7 +386,7 @@ nvme_driver_init(void)
if (g_spdk_nvme_driver == NULL) {
SPDK_ERRLOG("primary process failed to reserve memory\n");
pthread_mutex_unlock(&g_init_mutex);
return -1;
}
} else {
@ -393,15 +404,16 @@ nvme_driver_init(void)
}
if (g_spdk_nvme_driver->initialized == false) {
SPDK_ERRLOG("timeout waiting for primary process to init\n");
pthread_mutex_unlock(&g_init_mutex);
return -1;
}
} else {
SPDK_ERRLOG("primary process is not started yet\n");
pthread_mutex_unlock(&g_init_mutex);
return -1;
}
pthread_mutex_unlock(&g_init_mutex);
return 0;
}
@ -415,12 +427,21 @@ nvme_driver_init(void)
if (ret != 0) {
SPDK_ERRLOG("failed to initialize mutex\n");
spdk_memzone_free(SPDK_NVME_DRIVER_NAME);
pthread_mutex_unlock(&g_init_mutex);
return ret;
}
/* The lock in the shared g_spdk_nvme_driver object is now ready to
* be used - so we can unlock the g_init_mutex here.
*/
pthread_mutex_unlock(&g_init_mutex);
nvme_robust_mutex_lock(&g_spdk_nvme_driver->lock);
g_spdk_nvme_driver->initialized = false;
g_spdk_nvme_driver->hotplug_fd = spdk_uevent_connect();
if (g_spdk_nvme_driver->hotplug_fd < 0) {
SPDK_DEBUGLOG(SPDK_LOG_NVME, "Failed to open uevent netlink socket\n");
}
TAILQ_INIT(&g_spdk_nvme_driver->shared_attached_ctrlrs);
@ -594,6 +615,7 @@ spdk_nvme_probe_internal(struct spdk_nvme_probe_ctx *probe_ctx,
int rc;
struct spdk_nvme_ctrlr *ctrlr, *ctrlr_tmp;
spdk_nvme_trid_populate_transport(&probe_ctx->trid, probe_ctx->trid.trtype);
if (!spdk_nvme_transport_available_by_name(probe_ctx->trid.trstring)) {
SPDK_ERRLOG("NVMe trtype %u not available\n", probe_ctx->trid.trtype);
return -1;
@ -741,7 +763,7 @@ void
spdk_nvme_trid_populate_transport(struct spdk_nvme_transport_id *trid,
enum spdk_nvme_transport_type trtype)
{
const char *trstring;
const char *trstring = "";
trid->trtype = trtype;
switch (trtype) {
@ -760,7 +782,8 @@ spdk_nvme_trid_populate_transport(struct spdk_nvme_transport_id *trid,
case SPDK_NVME_TRANSPORT_CUSTOM:
default:
SPDK_ERRLOG("don't use this for custom transports\n");
break;
assert(0);
return;
}
snprintf(trid->trstring, SPDK_NVMF_TRSTRING_MAX_LEN, "%s", trstring);
}


@ -2618,6 +2618,7 @@ nvme_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)
SPDK_DEBUGLOG(SPDK_LOG_NVME, "Prepare to destruct SSD: %s\n", ctrlr->trid.traddr);
spdk_nvme_qpair_process_completions(ctrlr->adminq, 0);
nvme_transport_admin_qpair_abort_aers(ctrlr->adminq);
TAILQ_FOREACH_SAFE(qpair, &ctrlr->active_io_qpairs, tailq, tmp) {


@ -60,6 +60,7 @@ struct cuse_device {
TAILQ_ENTRY(cuse_device) tailq;
};
static pthread_mutex_t g_cuse_mtx = PTHREAD_MUTEX_INITIALIZER;
static TAILQ_HEAD(, cuse_device) g_ctrlr_ctx_head = TAILQ_HEAD_INITIALIZER(g_ctrlr_ctx_head);
static struct spdk_bit_array *g_ctrlr_started;
@ -700,13 +701,14 @@ cuse_nvme_ns_start(struct cuse_device *ctrlr_device, uint32_t nsid, const char *
if (rv < 0) {
SPDK_ERRLOG("Device name too long.\n");
free(ns_device);
return -1;
return -ENAMETOOLONG;
}
if (pthread_create(&ns_device->tid, NULL, cuse_thread, ns_device)) {
rv = pthread_create(&ns_device->tid, NULL, cuse_thread, ns_device);
if (rv != 0) {
SPDK_ERRLOG("pthread_create failed\n");
free(ns_device);
return -1;
return -rv;
}
TAILQ_INSERT_TAIL(&ctrlr_device->ns_devices, ns_device, tailq);
@ -811,7 +813,7 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
g_ctrlr_started = spdk_bit_array_create(128);
if (g_ctrlr_started == NULL) {
SPDK_ERRLOG("Cannot create bit array\n");
return -1;
return -ENOMEM;
}
}
@ -843,9 +845,10 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
snprintf(ctrlr_device->dev_name, sizeof(ctrlr_device->dev_name), "spdk/nvme%d",
ctrlr_device->index);
if (pthread_create(&ctrlr_device->tid, NULL, cuse_thread, ctrlr_device)) {
rv = pthread_create(&ctrlr_device->tid, NULL, cuse_thread, ctrlr_device);
if (rv != 0) {
SPDK_ERRLOG("pthread_create failed\n");
rv = -1;
rv = -rv;
goto err3;
}
TAILQ_INSERT_TAIL(&g_ctrlr_ctx_head, ctrlr_device, tailq);
@ -857,10 +860,10 @@ nvme_cuse_start(struct spdk_nvme_ctrlr *ctrlr)
continue;
}
if (cuse_nvme_ns_start(ctrlr_device, nsid, ctrlr_device->dev_name) < 0) {
rv = cuse_nvme_ns_start(ctrlr_device, nsid, ctrlr_device->dev_name);
if (rv < 0) {
SPDK_ERRLOG("Cannot start CUSE namespace device.");
cuse_nvme_ctrlr_stop(ctrlr_device);
rv = -1;
goto err3;
}
}
@ -877,10 +880,10 @@ err2:
return rv;
}
static void
nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
static struct cuse_device *
nvme_cuse_get_cuse_ctrlr_device(struct spdk_nvme_ctrlr *ctrlr)
{
struct cuse_device *ctrlr_device;
struct cuse_device *ctrlr_device = NULL;
TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
if (ctrlr_device->ctrlr == ctrlr) {
@ -888,12 +891,46 @@ nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
}
}
return ctrlr_device;
}
static struct cuse_device *
nvme_cuse_get_cuse_ns_device(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid)
{
struct cuse_device *ctrlr_device = NULL;
struct cuse_device *ns_device = NULL;
ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
if (!ctrlr_device) {
return NULL;
}
TAILQ_FOREACH(ns_device, &ctrlr_device->ns_devices, tailq) {
if (ns_device->nsid == nsid) {
break;
}
}
return ns_device;
}
static void
nvme_cuse_stop(struct spdk_nvme_ctrlr *ctrlr)
{
struct cuse_device *ctrlr_device;
pthread_mutex_lock(&g_cuse_mtx);
ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
if (!ctrlr_device) {
SPDK_ERRLOG("Cannot find associated CUSE device\n");
pthread_mutex_unlock(&g_cuse_mtx);
return;
}
cuse_nvme_ctrlr_stop(ctrlr_device);
pthread_mutex_unlock(&g_cuse_mtx);
}
static struct nvme_io_msg_producer cuse_nvme_io_msg_producer = {
@ -911,18 +948,35 @@ spdk_nvme_cuse_register(struct spdk_nvme_ctrlr *ctrlr)
return rc;
}
pthread_mutex_lock(&g_cuse_mtx);
rc = nvme_cuse_start(ctrlr);
if (rc) {
nvme_io_msg_ctrlr_unregister(ctrlr, &cuse_nvme_io_msg_producer);
}
pthread_mutex_unlock(&g_cuse_mtx);
return rc;
}
void
spdk_nvme_cuse_unregister(struct spdk_nvme_ctrlr *ctrlr)
{
nvme_cuse_stop(ctrlr);
struct cuse_device *ctrlr_device;
pthread_mutex_lock(&g_cuse_mtx);
ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
if (!ctrlr_device) {
SPDK_ERRLOG("Cannot find associated CUSE device\n");
pthread_mutex_unlock(&g_cuse_mtx);
return;
}
cuse_nvme_ctrlr_stop(ctrlr_device);
pthread_mutex_unlock(&g_cuse_mtx);
nvme_io_msg_ctrlr_unregister(ctrlr, &cuse_nvme_io_msg_producer);
}
@ -932,20 +986,15 @@ spdk_nvme_cuse_get_ctrlr_name(struct spdk_nvme_ctrlr *ctrlr)
{
struct cuse_device *ctrlr_device;
if (TAILQ_EMPTY(&g_ctrlr_ctx_head)) {
return NULL;
}
TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
if (ctrlr_device->ctrlr == ctrlr) {
break;
}
}
pthread_mutex_lock(&g_cuse_mtx);
ctrlr_device = nvme_cuse_get_cuse_ctrlr_device(ctrlr);
if (!ctrlr_device) {
pthread_mutex_unlock(&g_cuse_mtx);
return NULL;
}
pthread_mutex_unlock(&g_cuse_mtx);
return ctrlr_device->dev_name;
}
@ -953,31 +1002,15 @@ char *
spdk_nvme_cuse_get_ns_name(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid)
{
struct cuse_device *ns_device;
struct cuse_device *ctrlr_device;
if (TAILQ_EMPTY(&g_ctrlr_ctx_head)) {
return NULL;
}
TAILQ_FOREACH(ctrlr_device, &g_ctrlr_ctx_head, tailq) {
if (ctrlr_device->ctrlr == ctrlr) {
break;
}
}
if (!ctrlr_device) {
return NULL;
}
TAILQ_FOREACH(ns_device, &ctrlr_device->ns_devices, tailq) {
if (ns_device->nsid == nsid) {
break;
}
}
pthread_mutex_lock(&g_cuse_mtx);
ns_device = nvme_cuse_get_cuse_ns_device(ctrlr, nsid);
if (!ns_device) {
pthread_mutex_unlock(&g_cuse_mtx);
return NULL;
}
pthread_mutex_unlock(&g_cuse_mtx);
return ns_device->dev_name;
}


@ -767,6 +767,9 @@ struct nvme_driver {
bool initialized;
struct spdk_uuid default_extended_host_id;
/** netlink socket fd for hotplug messages */
int hotplug_fd;
};
extern struct nvme_driver *g_spdk_nvme_driver;


@ -64,6 +64,7 @@ nvme_io_msg_send(struct spdk_nvme_ctrlr *ctrlr, uint32_t nsid, spdk_nvme_io_msg_
rc = spdk_ring_enqueue(ctrlr->external_io_msgs, (void **)&io, 1, NULL);
if (rc != 1) {
assert(false);
free(io);
pthread_mutex_unlock(&ctrlr->external_io_msgs_lock);
return -ENOMEM;
}
@ -106,6 +107,20 @@ spdk_nvme_io_msg_process(struct spdk_nvme_ctrlr *ctrlr)
return count;
}
static bool
nvme_io_msg_is_producer_registered(struct spdk_nvme_ctrlr *ctrlr,
struct nvme_io_msg_producer *io_msg_producer)
{
struct nvme_io_msg_producer *tmp;
STAILQ_FOREACH(tmp, &ctrlr->io_producers, link) {
if (tmp == io_msg_producer) {
return true;
}
}
return false;
}
int
nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
struct nvme_io_msg_producer *io_msg_producer)
@ -115,6 +130,10 @@ nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
return -EINVAL;
}
if (nvme_io_msg_is_producer_registered(ctrlr, io_msg_producer)) {
return -EEXIST;
}
if (!STAILQ_EMPTY(&ctrlr->io_producers) || ctrlr->is_resetting) {
/* There are registered producers - IO messaging already started */
STAILQ_INSERT_TAIL(&ctrlr->io_producers, io_msg_producer, link);
@ -136,7 +155,8 @@ nvme_io_msg_ctrlr_register(struct spdk_nvme_ctrlr *ctrlr,
if (ctrlr->external_io_msgs_qpair == NULL) {
SPDK_ERRLOG("spdk_nvme_ctrlr_alloc_io_qpair() failed\n");
spdk_ring_free(ctrlr->external_io_msgs);
return -1;
ctrlr->external_io_msgs = NULL;
return -ENOMEM;
}
STAILQ_INSERT_TAIL(&ctrlr->io_producers, io_msg_producer, link);
@ -157,6 +177,7 @@ nvme_io_msg_ctrlr_detach(struct spdk_nvme_ctrlr *ctrlr)
if (ctrlr->external_io_msgs) {
spdk_ring_free(ctrlr->external_io_msgs);
ctrlr->external_io_msgs = NULL;
}
if (ctrlr->external_io_msgs_qpair) {
@ -173,6 +194,10 @@ nvme_io_msg_ctrlr_unregister(struct spdk_nvme_ctrlr *ctrlr,
{
assert(io_msg_producer != NULL);
if (!nvme_io_msg_is_producer_registered(ctrlr, io_msg_producer)) {
return;
}
STAILQ_REMOVE(&ctrlr->io_producers, io_msg_producer, nvme_io_msg_producer, link);
if (STAILQ_EMPTY(&ctrlr->io_producers)) {
nvme_io_msg_ctrlr_detach(ctrlr);


@ -212,7 +212,6 @@ static int nvme_pcie_qpair_destroy(struct spdk_nvme_qpair *qpair);
__thread struct nvme_pcie_ctrlr *g_thread_mmio_ctrlr = NULL;
static uint16_t g_signal_lock;
static bool g_sigset = false;
static int g_hotplug_fd = -1;
static void
nvme_sigbus_fault_sighandler(int signum, siginfo_t *info, void *ctx)
@ -271,7 +270,11 @@ _nvme_pcie_hotplug_monitor(struct spdk_nvme_probe_ctx *probe_ctx)
union spdk_nvme_csts_register csts;
struct spdk_nvme_ctrlr_process *proc;
while (spdk_get_uevent(g_hotplug_fd, &event) > 0) {
if (g_spdk_nvme_driver->hotplug_fd < 0) {
return 0;
}
while (spdk_get_uevent(g_spdk_nvme_driver->hotplug_fd, &event) > 0) {
if (event.subsystem == SPDK_NVME_UEVENT_SUBSYSTEM_UIO ||
event.subsystem == SPDK_NVME_UEVENT_SUBSYSTEM_VFIO) {
if (event.action == SPDK_NVME_UEVENT_ADD) {
@ -768,14 +771,7 @@ nvme_pcie_ctrlr_scan(struct spdk_nvme_probe_ctx *probe_ctx,
/* Only the primary process can monitor hotplug. */
if (spdk_process_is_primary()) {
if (g_hotplug_fd < 0) {
g_hotplug_fd = spdk_uevent_connect();
if (g_hotplug_fd < 0) {
SPDK_DEBUGLOG(SPDK_LOG_NVME, "Failed to open uevent netlink socket\n");
}
} else {
_nvme_pcie_hotplug_monitor(probe_ctx);
}
_nvme_pcie_hotplug_monitor(probe_ctx);
}
if (enum_ctx.has_pci_addr == false) {
@ -828,7 +824,6 @@ struct spdk_nvme_ctrlr *nvme_pcie_ctrlr_construct(const struct spdk_nvme_transpo
pctrlr->is_remapped = false;
pctrlr->ctrlr.is_removed = false;
spdk_nvme_trid_populate_transport(&pctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_PCIE);
pctrlr->devhandle = devhandle;
pctrlr->ctrlr.opts = *opts;
memcpy(&pctrlr->ctrlr.trid, trid, sizeof(pctrlr->ctrlr.trid));
@ -997,7 +992,8 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
volatile uint32_t *doorbell_base;
uint64_t offset;
uint16_t num_trackers;
size_t page_align = VALUE_2MB;
size_t page_align = sysconf(_SC_PAGESIZE);
size_t queue_align, queue_len;
uint32_t flags = SPDK_MALLOC_DMA;
uint64_t sq_paddr = 0;
uint64_t cq_paddr = 0;
@ -1035,7 +1031,7 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
/* cmd and cpl rings must be aligned on page size boundaries. */
if (ctrlr->opts.use_cmb_sqs) {
if (nvme_pcie_ctrlr_alloc_cmb(ctrlr, pqpair->num_entries * sizeof(struct spdk_nvme_cmd),
sysconf(_SC_PAGESIZE), &offset) == 0) {
page_align, &offset) == 0) {
pqpair->cmd = pctrlr->cmb_bar_virt_addr + offset;
pqpair->cmd_bus_addr = pctrlr->cmb_bar_phys_addr + offset;
pqpair->sq_in_cmb = true;
@ -1049,9 +1045,9 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
/* To ensure physical address contiguity we make each ring occupy
* a single hugepage only. See MAX_IO_QUEUE_ENTRIES.
*/
pqpair->cmd = spdk_zmalloc(pqpair->num_entries * sizeof(struct spdk_nvme_cmd),
page_align, NULL,
SPDK_ENV_SOCKET_ID_ANY, flags);
queue_len = pqpair->num_entries * sizeof(struct spdk_nvme_cmd);
queue_align = spdk_max(spdk_align32pow2(queue_len), page_align);
pqpair->cmd = spdk_zmalloc(queue_len, queue_align, NULL, SPDK_ENV_SOCKET_ID_ANY, flags);
if (pqpair->cmd == NULL) {
SPDK_ERRLOG("alloc qpair_cmd failed\n");
return -ENOMEM;
@ -1072,9 +1068,9 @@ nvme_pcie_qpair_construct(struct spdk_nvme_qpair *qpair,
if (pqpair->cq_vaddr) {
pqpair->cpl = pqpair->cq_vaddr;
} else {
pqpair->cpl = spdk_zmalloc(pqpair->num_entries * sizeof(struct spdk_nvme_cpl),
page_align, NULL,
SPDK_ENV_SOCKET_ID_ANY, flags);
queue_len = pqpair->num_entries * sizeof(struct spdk_nvme_cpl);
queue_align = spdk_max(spdk_align32pow2(queue_len), page_align);
pqpair->cpl = spdk_zmalloc(queue_len, queue_align, NULL, SPDK_ENV_SOCKET_ID_ANY, flags);
if (pqpair->cpl == NULL) {
SPDK_ERRLOG("alloc qpair_cpl failed\n");
return -ENOMEM;


@ -207,7 +207,7 @@ static const struct nvme_string generic_status[] = {
static const struct nvme_string command_specific_status[] = {
{ SPDK_NVME_SC_COMPLETION_QUEUE_INVALID, "INVALID COMPLETION QUEUE" },
{ SPDK_NVME_SC_INVALID_QUEUE_IDENTIFIER, "INVALID QUEUE IDENTIFIER" },
{ SPDK_NVME_SC_MAXIMUM_QUEUE_SIZE_EXCEEDED, "MAX QUEUE SIZE EXCEEDED" },
{ SPDK_NVME_SC_INVALID_QUEUE_SIZE, "INVALID QUEUE SIZE" },
{ SPDK_NVME_SC_ABORT_COMMAND_LIMIT_EXCEEDED, "ABORT CMD LIMIT EXCEEDED" },
{ SPDK_NVME_SC_ASYNC_EVENT_REQUEST_LIMIT_EXCEEDED, "ASYNC LIMIT EXCEEDED" },
{ SPDK_NVME_SC_INVALID_FIRMWARE_SLOT, "INVALID FIRMWARE SLOT" },
@ -575,6 +575,7 @@ nvme_qpair_deinit(struct spdk_nvme_qpair *qpair)
{
struct nvme_error_cmd *cmd, *entry;
nvme_qpair_abort_queued_reqs(qpair, 1);
nvme_qpair_complete_error_reqs(qpair);
TAILQ_FOREACH_SAFE(cmd, &qpair->err_cmd_head, link, entry) {


@ -127,6 +127,12 @@ struct spdk_nvme_recv_wr_list {
struct ibv_recv_wr *last;
};
/* Memory regions */
union nvme_rdma_mr {
struct ibv_mr *mr;
uint64_t key;
};
/* NVMe RDMA qpair extensions for spdk_nvme_qpair */
struct nvme_rdma_qpair {
struct spdk_nvme_qpair qpair;
@ -143,18 +149,19 @@ struct nvme_rdma_qpair {
uint16_t num_entries;
bool delay_cmd_submit;
/* Parallel arrays of response buffers + response SGLs of size num_entries */
struct ibv_sge *rsp_sgls;
struct spdk_nvme_cpl *rsps;
struct ibv_recv_wr *rsp_recv_wrs;
bool delay_cmd_submit;
struct spdk_nvme_send_wr_list sends_to_post;
struct spdk_nvme_recv_wr_list recvs_to_post;
/* Memory region describing all rsps for this qpair */
struct ibv_mr *rsp_mr;
union nvme_rdma_mr rsp_mr;
/*
* Array of num_entries NVMe commands registered as RDMA message buffers.
@ -163,7 +170,7 @@ struct nvme_rdma_qpair {
struct spdk_nvmf_cmd *cmds;
/* Memory region describing all cmds for this qpair */
struct ibv_mr *cmd_mr;
union nvme_rdma_mr cmd_mr;
struct spdk_nvme_rdma_mr_map *mr_map;
@ -174,8 +181,19 @@ struct nvme_rdma_qpair {
struct rdma_cm_event *evt;
};
enum NVME_RDMA_COMPLETION_FLAGS {
NVME_RDMA_SEND_COMPLETED = 1u << 0,
NVME_RDMA_RECV_COMPLETED = 1u << 1,
};
struct spdk_nvme_rdma_req {
int id;
uint16_t id;
uint16_t completion_flags: 2;
uint16_t reserved: 14;
/* if completion of RDMA_RECV received before RDMA_SEND, we will complete nvme request
* during processing of RDMA_SEND. To complete the request we must know the index
* of nvme_cpl received in RDMA_RECV, so store it in this field */
uint16_t rsp_idx;
struct ibv_send_wr send_wr;
@ -184,8 +202,6 @@ struct spdk_nvme_rdma_req {
struct ibv_sge send_sgl[NVME_RDMA_DEFAULT_TX_SGE];
TAILQ_ENTRY(spdk_nvme_rdma_req) link;
bool request_ready_to_put;
};
static const char *rdma_cm_event_str[] = {
@ -210,6 +226,26 @@ static const char *rdma_cm_event_str[] = {
static LIST_HEAD(, spdk_nvme_rdma_mr_map) g_rdma_mr_maps = LIST_HEAD_INITIALIZER(&g_rdma_mr_maps);
static pthread_mutex_t g_rdma_mr_maps_mutex = PTHREAD_MUTEX_INITIALIZER;
static inline void *
nvme_rdma_calloc(size_t nmemb, size_t size)
{
if (!g_nvme_hooks.get_rkey) {
return calloc(nmemb, size);
} else {
return spdk_zmalloc(nmemb * size, 0, NULL, SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
}
}
static inline void
nvme_rdma_free(void *buf)
{
if (!g_nvme_hooks.get_rkey) {
free(buf);
} else {
spdk_free(buf);
}
}
int nvme_rdma_ctrlr_delete_io_qpair(struct spdk_nvme_ctrlr *ctrlr,
struct spdk_nvme_qpair *qpair);
@ -244,7 +280,8 @@ nvme_rdma_req_get(struct nvme_rdma_qpair *rqpair)
static void
nvme_rdma_req_put(struct nvme_rdma_qpair *rqpair, struct spdk_nvme_rdma_req *rdma_req)
{
rdma_req->request_ready_to_put = false;
rdma_req->completion_flags = 0;
rdma_req->req = NULL;
TAILQ_REMOVE(&rqpair->outstanding_reqs, rdma_req, link);
TAILQ_INSERT_HEAD(&rqpair->free_reqs, rdma_req, link);
}
@ -614,23 +651,66 @@ nvme_rdma_post_recv(struct nvme_rdma_qpair *rqpair, uint16_t rsp_idx)
return nvme_rdma_qpair_queue_recv_wr(rqpair, wr);
}
static int
nvme_rdma_reg_mr(struct rdma_cm_id *cm_id, union nvme_rdma_mr *mr, void *mem, size_t length)
{
if (!g_nvme_hooks.get_rkey) {
mr->mr = rdma_reg_msgs(cm_id, mem, length);
if (mr->mr == NULL) {
SPDK_ERRLOG("Unable to register mr: %s (%d)\n",
spdk_strerror(errno), errno);
return -1;
}
} else {
mr->key = g_nvme_hooks.get_rkey(cm_id->pd, mem, length);
}
return 0;
}
static void
nvme_rdma_dereg_mr(union nvme_rdma_mr *mr)
{
if (!g_nvme_hooks.get_rkey) {
if (mr->mr && rdma_dereg_mr(mr->mr)) {
SPDK_ERRLOG("Unable to de-register mr\n");
}
} else {
if (mr->key) {
g_nvme_hooks.put_rkey(mr->key);
}
}
memset(mr, 0, sizeof(*mr));
}
static uint32_t
nvme_rdma_mr_get_lkey(union nvme_rdma_mr *mr)
{
uint32_t lkey;
if (!g_nvme_hooks.get_rkey) {
lkey = mr->mr->lkey;
} else {
lkey = *((uint64_t *) mr->key);
}
return lkey;
}
static void
nvme_rdma_unregister_rsps(struct nvme_rdma_qpair *rqpair)
{
if (rqpair->rsp_mr && rdma_dereg_mr(rqpair->rsp_mr)) {
SPDK_ERRLOG("Unable to de-register rsp_mr\n");
}
rqpair->rsp_mr = NULL;
nvme_rdma_dereg_mr(&rqpair->rsp_mr);
}
static void
nvme_rdma_free_rsps(struct nvme_rdma_qpair *rqpair)
{
free(rqpair->rsps);
nvme_rdma_free(rqpair->rsps);
rqpair->rsps = NULL;
free(rqpair->rsp_sgls);
nvme_rdma_free(rqpair->rsp_sgls);
rqpair->rsp_sgls = NULL;
free(rqpair->rsp_recv_wrs);
nvme_rdma_free(rqpair->rsp_recv_wrs);
rqpair->rsp_recv_wrs = NULL;
}
@ -640,20 +720,19 @@ nvme_rdma_alloc_rsps(struct nvme_rdma_qpair *rqpair)
rqpair->rsps = NULL;
rqpair->rsp_recv_wrs = NULL;
rqpair->rsp_sgls = calloc(rqpair->num_entries, sizeof(*rqpair->rsp_sgls));
rqpair->rsp_sgls = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsp_sgls));
if (!rqpair->rsp_sgls) {
SPDK_ERRLOG("Failed to allocate rsp_sgls\n");
goto fail;
}
rqpair->rsp_recv_wrs = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsp_recv_wrs));
if (!rqpair->rsp_recv_wrs) {
SPDK_ERRLOG("Failed to allocate rsp_recv_wrs\n");
goto fail;
}
rqpair->rsps = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->rsps));
if (!rqpair->rsps) {
SPDK_ERRLOG("can not allocate rdma rsps\n");
goto fail;
@ -668,22 +747,25 @@ fail:
static int
nvme_rdma_register_rsps(struct nvme_rdma_qpair *rqpair)
{
uint16_t i;
int rc;
uint32_t lkey;
rc = nvme_rdma_reg_mr(rqpair->cm_id, &rqpair->rsp_mr,
		      rqpair->rsps, rqpair->num_entries * sizeof(*rqpair->rsps));
if (rc < 0) {
	goto fail;
}
lkey = nvme_rdma_mr_get_lkey(&rqpair->rsp_mr);
for (i = 0; i < rqpair->num_entries; i++) {
struct ibv_sge *rsp_sgl = &rqpair->rsp_sgls[i];
rsp_sgl->addr = (uint64_t)&rqpair->rsps[i];
rsp_sgl->length = sizeof(rqpair->rsps[i]);
rsp_sgl->lkey = lkey;
rqpair->rsp_recv_wrs[i].wr_id = i;
rqpair->rsp_recv_wrs[i].next = NULL;
@ -711,10 +793,7 @@ fail:
static void
nvme_rdma_unregister_reqs(struct nvme_rdma_qpair *rqpair)
{
nvme_rdma_dereg_mr(&rqpair->cmd_mr);
}
static void
@ -724,25 +803,25 @@ nvme_rdma_free_reqs(struct nvme_rdma_qpair *rqpair)
return;
}
nvme_rdma_free(rqpair->cmds);
rqpair->cmds = NULL;
nvme_rdma_free(rqpair->rdma_reqs);
rqpair->rdma_reqs = NULL;
}
static int
nvme_rdma_alloc_reqs(struct nvme_rdma_qpair *rqpair)
{
uint16_t i;
rqpair->rdma_reqs = nvme_rdma_calloc(rqpair->num_entries, sizeof(struct spdk_nvme_rdma_req));
if (rqpair->rdma_reqs == NULL) {
SPDK_ERRLOG("Failed to allocate rdma_reqs\n");
goto fail;
}
rqpair->cmds = nvme_rdma_calloc(rqpair->num_entries, sizeof(*rqpair->cmds));
if (!rqpair->cmds) {
SPDK_ERRLOG("Failed to allocate RDMA cmds\n");
goto fail;
@ -785,16 +864,20 @@ static int
nvme_rdma_register_reqs(struct nvme_rdma_qpair *rqpair)
{
int i;
int rc;
uint32_t lkey;
rc = nvme_rdma_reg_mr(rqpair->cm_id, &rqpair->cmd_mr,
		      rqpair->cmds, rqpair->num_entries * sizeof(*rqpair->cmds));
if (rc < 0) {
	goto fail;
}
lkey = nvme_rdma_mr_get_lkey(&rqpair->cmd_mr);
for (i = 0; i < rqpair->num_entries; i++) {
rqpair->rdma_reqs[i].send_sgl[0].lkey = lkey;
}
return 0;
@ -804,35 +887,6 @@ fail:
return -ENOMEM;
}
static int
nvme_rdma_resolve_addr(struct nvme_rdma_qpair *rqpair,
struct sockaddr *src_addr,
@ -1023,9 +1077,9 @@ nvme_rdma_register_mem(struct nvme_rdma_qpair *rqpair)
}
}
mr_map = nvme_rdma_calloc(1, sizeof(*mr_map));
if (mr_map == NULL) {
SPDK_ERRLOG("Failed to allocate mr_map\n");
pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
return -1;
}
@ -1035,7 +1089,8 @@ nvme_rdma_register_mem(struct nvme_rdma_qpair *rqpair)
mr_map->map = spdk_mem_map_alloc((uint64_t)NULL, &nvme_rdma_map_ops, pd);
if (mr_map->map == NULL) {
SPDK_ERRLOG("spdk_mem_map_alloc() failed\n");
nvme_rdma_free(mr_map);
pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
return -1;
}
@ -1067,7 +1122,7 @@ nvme_rdma_unregister_mem(struct nvme_rdma_qpair *rqpair)
if (mr_map->ref == 0) {
LIST_REMOVE(mr_map, link);
spdk_mem_map_free(&mr_map->map);
nvme_rdma_free(mr_map);
}
pthread_mutex_unlock(&g_rdma_mr_maps_mutex);
@ -1517,6 +1572,7 @@ nvme_rdma_req_init(struct nvme_rdma_qpair *rqpair, struct nvme_request *req,
struct spdk_nvme_ctrlr *ctrlr = rqpair->qpair.ctrlr;
int rc;
assert(rdma_req->req == NULL);
rdma_req->req = req;
req->cmd.cid = rdma_req->id;
@ -1569,7 +1625,7 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
struct spdk_nvme_qpair *qpair;
int rc, retry_count = 0;
rqpair = nvme_rdma_calloc(1, sizeof(struct nvme_rdma_qpair));
if (!rqpair) {
SPDK_ERRLOG("failed to get create rqpair\n");
return NULL;
@ -1587,6 +1643,7 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
SPDK_DEBUGLOG(SPDK_LOG_NVME, "rc =%d\n", rc);
if (rc) {
SPDK_ERRLOG("Unable to allocate rqpair RDMA requests\n");
nvme_rdma_free(rqpair);
return NULL;
}
SPDK_DEBUGLOG(SPDK_LOG_NVME, "RDMA requests allocated\n");
@ -1595,6 +1652,8 @@ nvme_rdma_ctrlr_create_qpair(struct spdk_nvme_ctrlr *ctrlr,
SPDK_DEBUGLOG(SPDK_LOG_NVME, "rc =%d\n", rc);
if (rc < 0) {
SPDK_ERRLOG("Unable to allocate rqpair RDMA responses\n");
nvme_rdma_free_reqs(rqpair);
nvme_rdma_free(rqpair);
return NULL;
}
SPDK_DEBUGLOG(SPDK_LOG_NVME, "RDMA responses allocated\n");
@ -1686,7 +1745,7 @@ nvme_rdma_ctrlr_delete_io_qpair(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_
nvme_rdma_free_reqs(rqpair);
nvme_rdma_free_rsps(rqpair);
nvme_rdma_free(rqpair);
return 0;
}
@ -1718,20 +1777,19 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
struct ibv_device_attr dev_attr;
int i, flag, rc;
rctrlr = nvme_rdma_calloc(1, sizeof(struct nvme_rdma_ctrlr));
if (rctrlr == NULL) {
SPDK_ERRLOG("could not allocate ctrlr\n");
return NULL;
}
spdk_nvme_trid_populate_transport(&rctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_RDMA);
rctrlr->ctrlr.opts = *opts;
memcpy(&rctrlr->ctrlr.trid, trid, sizeof(rctrlr->ctrlr.trid));
contexts = rdma_get_devices(NULL);
if (contexts == NULL) {
SPDK_ERRLOG("rdma_get_devices() failed: %s (%d)\n", spdk_strerror(errno), errno);
nvme_rdma_free(rctrlr);
return NULL;
}
@ -1743,7 +1801,7 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
if (rc < 0) {
SPDK_ERRLOG("Failed to query RDMA device attributes.\n");
rdma_free_devices(contexts);
nvme_rdma_free(rctrlr);
return NULL;
}
rctrlr->max_sge = spdk_min(rctrlr->max_sge, (uint16_t)dev_attr.max_sge);
@ -1754,13 +1812,13 @@ struct spdk_nvme_ctrlr *nvme_rdma_ctrlr_construct(const struct spdk_nvme_transpo
rc = nvme_ctrlr_construct(&rctrlr->ctrlr);
if (rc != 0) {
nvme_rdma_free(rctrlr);
return NULL;
}
STAILQ_INIT(&rctrlr->pending_cm_events);
STAILQ_INIT(&rctrlr->free_cm_events);
rctrlr->cm_events = nvme_rdma_calloc(NVME_RDMA_NUM_CM_EVENTS, sizeof(*rctrlr->cm_events));
if (rctrlr->cm_events == NULL) {
SPDK_ERRLOG("unable to allocat buffers to hold CM events.\n");
nvme_rdma_ctrlr_destruct(&rctrlr->ctrlr);
@ -1834,7 +1892,7 @@ nvme_rdma_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)
STAILQ_INIT(&rctrlr->free_cm_events);
STAILQ_INIT(&rctrlr->pending_cm_events);
nvme_rdma_free(rctrlr->cm_events);
if (rctrlr->cm_channel) {
rdma_destroy_event_channel(rctrlr->cm_channel);
@ -1843,7 +1901,7 @@ nvme_rdma_ctrlr_destruct(struct spdk_nvme_ctrlr *ctrlr)
nvme_ctrlr_destruct_finish(ctrlr);
nvme_rdma_free(rctrlr);
return 0;
}
@ -1945,6 +2003,14 @@ nvme_rdma_qpair_check_timeout(struct spdk_nvme_qpair *qpair)
}
}
static inline int
nvme_rdma_request_ready(struct nvme_rdma_qpair *rqpair, struct spdk_nvme_rdma_req *rdma_req)
{
nvme_rdma_req_complete(rdma_req->req, &rqpair->rsps[rdma_req->rsp_idx]);
nvme_rdma_req_put(rqpair, rdma_req);
return nvme_rdma_post_recv(rqpair, rdma_req->rsp_idx);
}
#define MAX_COMPLETIONS_PER_POLL 128
int
@ -1954,10 +2020,12 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
struct nvme_rdma_qpair *rqpair = nvme_rdma_qpair(qpair);
struct ibv_wc wc[MAX_COMPLETIONS_PER_POLL];
int i, rc = 0, batch_size;
uint32_t reaped = 0;
uint16_t rsp_idx;
struct ibv_cq *cq;
struct spdk_nvme_rdma_req *rdma_req;
struct nvme_rdma_ctrlr *rctrlr;
struct spdk_nvme_cpl *rsp;
if (spdk_unlikely(nvme_rdma_qpair_submit_sends(rqpair) ||
nvme_rdma_qpair_submit_recvs(rqpair))) {
@ -1982,7 +2050,6 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
cq = rqpair->cq;
do {
batch_size = spdk_min((max_completions - reaped),
MAX_COMPLETIONS_PER_POLL);
@ -2012,20 +2079,32 @@ nvme_rdma_qpair_process_completions(struct spdk_nvme_qpair *qpair,
goto fail;
}
assert(wc[i].wr_id < rqpair->num_entries);
rsp_idx = (uint16_t)wc[i].wr_id;
rsp = &rqpair->rsps[rsp_idx];
rdma_req = &rqpair->rdma_reqs[rsp->cid];
rdma_req->completion_flags |= NVME_RDMA_RECV_COMPLETED;
rdma_req->rsp_idx = rsp_idx;
if ((rdma_req->completion_flags & NVME_RDMA_SEND_COMPLETED) != 0) {
	if (spdk_unlikely(nvme_rdma_request_ready(rqpair, rdma_req))) {
		SPDK_ERRLOG("Unable to re-post rx descriptor\n");
		goto fail;
	}
	reaped++;
}
break;
case IBV_WC_SEND:
rdma_req = (struct spdk_nvme_rdma_req *)wc[i].wr_id;
rdma_req->completion_flags |= NVME_RDMA_SEND_COMPLETED;
if ((rdma_req->completion_flags & NVME_RDMA_RECV_COMPLETED) != 0) {
	if (spdk_unlikely(nvme_rdma_request_ready(rqpair, rdma_req))) {
		SPDK_ERRLOG("Unable to re-post rx descriptor\n");
		goto fail;
	}
	reaped++;
}
break;

View File

@ -236,6 +236,11 @@ nvme_tcp_ctrlr_disconnect_qpair(struct spdk_nvme_ctrlr *ctrlr, struct spdk_nvme_
struct nvme_tcp_qpair *tqpair = nvme_tcp_qpair(qpair);
struct nvme_tcp_pdu *pdu;
if (nvme_qpair_get_state(qpair) == NVME_QPAIR_DISABLED) {
/* Already disconnecting */
return;
}
nvme_qpair_set_state(qpair, NVME_QPAIR_DISABLED);
spdk_sock_close(&tqpair->sock);
@ -1620,7 +1625,6 @@ struct spdk_nvme_ctrlr *nvme_tcp_ctrlr_construct(const struct spdk_nvme_transpor
tctrlr->ctrlr.opts = *opts;
tctrlr->ctrlr.trid = *trid;
spdk_nvme_trid_populate_transport(&tctrlr->ctrlr.trid, SPDK_NVME_TRANSPORT_TCP);
rc = nvme_ctrlr_construct(&tctrlr->ctrlr);
if (rc != 0) {

View File

@ -2496,6 +2496,11 @@ spdk_nvmf_ctrlr_process_io_fused_cmd(struct spdk_nvmf_request *req, struct spdk_
/* save request of first command to generate response later */
req->first_fused_req = first_fused_req;
req->qpair->first_fused_req = NULL;
} else {
SPDK_ERRLOG("Invalid fused command fuse field.\n");
rsp->status.sct = SPDK_NVME_SCT_GENERIC;
rsp->status.sc = SPDK_NVME_SC_INVALID_FIELD;
return SPDK_NVMF_REQUEST_EXEC_STATUS_COMPLETE;
}
rc = spdk_nvmf_bdev_ctrlr_compare_and_write_cmd(bdev, desc, ch, req->first_fused_req, req);
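For context on the branch above: a fused compare-and-write arrives as two commands whose fuse fields mark the first and second halves of the pair; any other value lands in the else branch. A rough sketch of the dispatch, using the spdk_nvme_cmd_fuse values from the NVMe spec headers:

/* Sketch only; the real handler also validates opcodes and queues the
 * first half until its companion arrives. */
if (cmd->fuse == SPDK_NVME_CMD_FUSE_FIRST) {
	/* compare half: stash the request on the qpair */
} else if (cmd->fuse == SPDK_NVME_CMD_FUSE_SECOND) {
	/* write half: pair it with the saved compare and submit both */
} else {
	/* invalid fuse field -> SPDK_NVME_SC_INVALID_FIELD, as above */
}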

View File

@ -2,7 +2,7 @@
* BSD LICENSE
*
* Copyright (c) Intel Corporation. All rights reserved.
* Copyright (c) 2018-2020 Mellanox Technologies LTD. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@ -359,11 +359,17 @@ spdk_rpc_nvmf_subsystem_started(struct spdk_nvmf_subsystem *subsystem,
void *cb_arg, int status)
{
struct spdk_jsonrpc_request *request = cb_arg;
if (!status) {
struct spdk_json_write_ctx *w = spdk_jsonrpc_begin_result(request);
spdk_json_write_bool(w, true);
spdk_jsonrpc_end_result(request, w);
} else {
spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
"Subsystem %s start failed",
subsystem->subnqn);
spdk_nvmf_subsystem_destroy(subsystem);
}
}
static void
@ -371,72 +377,77 @@ spdk_rpc_nvmf_create_subsystem(struct spdk_jsonrpc_request *request,
const struct spdk_json_val *params)
{
struct rpc_subsystem_create *req;
struct spdk_nvmf_subsystem *subsystem = NULL;
struct spdk_nvmf_tgt *tgt;
int rc = -1;
req = calloc(1, sizeof(*req));
if (!req) {
SPDK_ERRLOG("Memory allocation failed\n");
spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
"Memory allocation failed");
return;
}
if (spdk_json_decode_object(params, rpc_subsystem_create_decoders,
SPDK_COUNTOF(rpc_subsystem_create_decoders),
req)) {
SPDK_ERRLOG("spdk_json_decode_object failed\n");
spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS, "Invalid parameters");
goto cleanup;
}
tgt = spdk_nvmf_get_tgt(req->tgt_name);
if (!tgt) {
SPDK_ERRLOG("Unable to find target %s\n", req->tgt_name);
spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
"Unable to find target %s", req->tgt_name);
goto cleanup;
}
subsystem = spdk_nvmf_subsystem_create(tgt, req->nqn, SPDK_NVMF_SUBTYPE_NVME,
req->max_namespaces);
if (!subsystem) {
SPDK_ERRLOG("Unable to create subsystem %s\n", req->nqn);
spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
"Unable to create subsystem %s", req->nqn);
goto cleanup;
}
if (req->serial_number) {
if (spdk_nvmf_subsystem_set_sn(subsystem, req->serial_number)) {
SPDK_ERRLOG("Subsystem %s: invalid serial number '%s'\n", req->nqn, req->serial_number);
spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS,
"Invalid SN %s", req->serial_number);
goto cleanup;
}
}
if (req->model_number) {
if (spdk_nvmf_subsystem_set_mn(subsystem, req->model_number)) {
SPDK_ERRLOG("Subsystem %s: invalid model number '%s'\n", req->nqn, req->model_number);
spdk_jsonrpc_send_error_response_fmt(request, SPDK_JSONRPC_ERROR_INVALID_PARAMS,
"Invalid MN %s", req->model_number);
goto cleanup;
}
}
spdk_nvmf_subsystem_set_allow_any_host(subsystem, req->allow_any_host);
rc = spdk_nvmf_subsystem_start(subsystem,
spdk_rpc_nvmf_subsystem_started,
request);
cleanup:
free(req->nqn);
free(req->tgt_name);
free(req->serial_number);
free(req->model_number);
free(req);
if (rc && subsystem) {
	spdk_nvmf_subsystem_destroy(subsystem);
}
}
SPDK_RPC_REGISTER("nvmf_create_subsystem", spdk_rpc_nvmf_create_subsystem, SPDK_RPC_RUNTIME)
SPDK_RPC_REGISTER_ALIAS_DEPRECATED(nvmf_create_subsystem, nvmf_subsystem_create)

View File

@ -2,7 +2,7 @@
* BSD LICENSE
*
* Copyright (c) Intel Corporation. All rights reserved.
* Copyright (c) 2019, 2020 Mellanox Technologies LTD. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
@ -2961,14 +2961,30 @@ static const char *CM_EVENT_STR[] = {
};
#endif /* DEBUG */
static void
nvmf_rdma_disconnect_qpairs_on_port(struct spdk_nvmf_rdma_transport *rtransport,
struct spdk_nvmf_rdma_port *port)
{
struct spdk_nvmf_rdma_poll_group *rgroup;
struct spdk_nvmf_rdma_poller *rpoller;
struct spdk_nvmf_rdma_qpair *rqpair;
TAILQ_FOREACH(rgroup, &rtransport->poll_groups, link) {
TAILQ_FOREACH(rpoller, &rgroup->pollers, link) {
TAILQ_FOREACH(rqpair, &rpoller->qpairs, link) {
if (rqpair->listen_id == port->id) {
spdk_nvmf_rdma_start_disconnect(rqpair);
}
}
}
}
}
static bool
nvmf_rdma_handle_cm_event_addr_change(struct spdk_nvmf_transport *transport,
struct rdma_cm_event *event)
{
struct spdk_nvme_transport_id trid;
struct spdk_nvmf_rdma_port *port;
struct spdk_nvmf_rdma_transport *rtransport;
uint32_t ref, i;
@ -2986,27 +3002,41 @@ nvmf_rdma_handle_cm_event_addr_change(struct spdk_nvmf_transport *transport,
}
}
if (event_acked) {
nvmf_rdma_disconnect_qpairs_on_port(rtransport, port);
for (i = 0; i < ref; i++) {
spdk_nvmf_rdma_stop_listen(transport, &trid);
}
while (ref > 0) {
	spdk_nvmf_rdma_listen(transport, &trid, NULL, NULL);
	ref--;
}
return event_acked;
}
static void
nvmf_rdma_handle_cm_event_port_removal(struct spdk_nvmf_transport *transport,
struct rdma_cm_event *event)
{
struct spdk_nvmf_rdma_port *port;
struct spdk_nvmf_rdma_transport *rtransport;
uint32_t ref, i;
port = event->id->context;
rtransport = SPDK_CONTAINEROF(transport, struct spdk_nvmf_rdma_transport, transport);
ref = port->ref;
SPDK_NOTICELOG("Port %s:%s is being removed\n", port->trid.traddr, port->trid.trsvcid);
nvmf_rdma_disconnect_qpairs_on_port(rtransport, port);
rdma_ack_cm_event(event);
for (i = 0; i < ref; i++) {
spdk_nvmf_rdma_stop_listen(transport, &port->trid);
}
}
static void
spdk_nvmf_process_cm_event(struct spdk_nvmf_transport *transport, new_qpair_fn cb_fn, void *cb_arg)
{
@ -3024,68 +3054,87 @@ spdk_nvmf_process_cm_event(struct spdk_nvmf_transport *transport, new_qpair_fn c
while (1) {
event_acked = false;
rc = rdma_get_cm_event(rtransport->event_channel, &event);
if (rc) {
	if (errno != EAGAIN && errno != EWOULDBLOCK) {
		SPDK_ERRLOG("Acceptor Event Error: %s\n", spdk_strerror(errno));
	}
	break;
}

SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Acceptor Event: %s\n", CM_EVENT_STR[event->event]);

spdk_trace_record(TRACE_RDMA_CM_ASYNC_EVENT, 0, 0, 0, event->event);

switch (event->event) {
case RDMA_CM_EVENT_ADDR_RESOLVED:
case RDMA_CM_EVENT_ADDR_ERROR:
case RDMA_CM_EVENT_ROUTE_RESOLVED:
case RDMA_CM_EVENT_ROUTE_ERROR:
	/* No action required. The target never attempts to resolve routes. */
	break;
case RDMA_CM_EVENT_CONNECT_REQUEST:
	rc = nvmf_rdma_connect(transport, event, cb_fn, cb_arg);
	if (rc < 0) {
		SPDK_ERRLOG("Unable to process connect event. rc: %d\n", rc);
		break;
	}
	break;
case RDMA_CM_EVENT_CONNECT_RESPONSE:
	/* The target never initiates a new connection. So this will not occur. */
	break;
case RDMA_CM_EVENT_CONNECT_ERROR:
	/* Can this happen? The docs say it can, but not sure what causes it. */
	break;
case RDMA_CM_EVENT_UNREACHABLE:
case RDMA_CM_EVENT_REJECTED:
	/* These only occur on the client side. */
	break;
case RDMA_CM_EVENT_ESTABLISHED:
	/* TODO: Should we be waiting for this event anywhere? */
	break;
case RDMA_CM_EVENT_DISCONNECTED:
	rc = nvmf_rdma_disconnect(event);
	if (rc < 0) {
		SPDK_ERRLOG("Unable to process disconnect event. rc: %d\n", rc);
		break;
	}
	break;
case RDMA_CM_EVENT_DEVICE_REMOVAL:
	/* In case of device removal, kernel IB part triggers IBV_EVENT_DEVICE_FATAL
	 * which triggers RDMA_CM_EVENT_DEVICE_REMOVAL on all cma_ids.
	 * Once these events are sent to SPDK, we should release all IB resources and
	 * don't make attempts to call any ibv_query/modify/create functions. We can only call
	 * ibv_destroy* functions to release user space memory allocated by IB. All kernel
	 * resources are already cleaned. */
	if (event->id->qp) {
		/* If rdma_cm event has a valid `qp` pointer then the event refers to the
		 * corresponding qpair. Otherwise the event refers to a listening device */
		rc = nvmf_rdma_disconnect(event);
		if (rc < 0) {
			SPDK_ERRLOG("Unable to process disconnect event. rc: %d\n", rc);
			break;
		}
	} else {
		nvmf_rdma_handle_cm_event_port_removal(transport, event);
		event_acked = true;
	}
	break;
case RDMA_CM_EVENT_MULTICAST_JOIN:
case RDMA_CM_EVENT_MULTICAST_ERROR:
	/* Multicast is not used */
	break;
case RDMA_CM_EVENT_ADDR_CHANGE:
	event_acked = nvmf_rdma_handle_cm_event_addr_change(transport, event);
	break;
case RDMA_CM_EVENT_TIMEWAIT_EXIT:
	/* For now, do nothing. The target never re-uses queue pairs. */
	break;
default:
	SPDK_ERRLOG("Unexpected Acceptor Event [%d]\n", event->event);
	break;
}
if (!event_acked) {
	rdma_ack_cm_event(event);
}
}
}
@ -3450,7 +3499,9 @@ spdk_nvmf_rdma_poll_group_destroy(struct spdk_nvmf_transport_poll_group *group)
}
if (poller->srq) {
if (poller->resources) {
	nvmf_rdma_resources_destroy(poller->resources);
}
ibv_destroy_srq(poller->srq);
SPDK_DEBUGLOG(SPDK_LOG_RDMA, "Destroyed RDMA shared queue %p\n", poller->srq);
}

View File

@ -332,6 +332,14 @@ spdk_sock_writev_async(struct spdk_sock *sock, struct spdk_sock_request *req)
int
spdk_sock_flush(struct spdk_sock *sock)
{
if (sock == NULL) {
return -EBADF;
}
if (sock->flags.closed) {
return -EBADF;
}
return sock->net_impl->flush(sock);
}
@ -396,6 +404,7 @@ spdk_sock_group_create(void *ctx)
if (group_impl != NULL) {
STAILQ_INSERT_TAIL(&group->group_impls, group_impl, link);
TAILQ_INIT(&group_impl->socks);
group_impl->num_removed_socks = 0;
group_impl->net_impl = impl;
}
}
@ -492,6 +501,9 @@ spdk_sock_group_remove_sock(struct spdk_sock_group *group, struct spdk_sock *soc
rc = group_impl->net_impl->group_impl_remove_sock(group_impl, sock);
if (rc == 0) {
TAILQ_REMOVE(&group_impl->socks, sock, link);
assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL);
group_impl->removed_socks[group_impl->num_removed_socks] = (uintptr_t)sock;
group_impl->num_removed_socks++;
sock->group_impl = NULL;
sock->cb_fn = NULL;
sock->cb_arg = NULL;
@ -518,6 +530,9 @@ spdk_sock_group_impl_poll_count(struct spdk_sock_group_impl *group_impl,
return 0;
}
/* The number of removed sockets should be reset for each call to poll. */
group_impl->num_removed_socks = 0;
num_events = group_impl->net_impl->group_impl_poll(group_impl, max_events, socks);
if (num_events == -1) {
return -1;
@ -525,10 +540,21 @@ spdk_sock_group_impl_poll_count(struct spdk_sock_group_impl *group_impl,
for (i = 0; i < num_events; i++) {
struct spdk_sock *sock = socks[i];
int j;
bool valid = true;
for (j = 0; j < group_impl->num_removed_socks; j++) {
if ((uintptr_t)sock == group_impl->removed_socks[j]) {
valid = false;
break;
}
}
if (valid) {
assert(sock->cb_fn != NULL);
sock->cb_fn(sock->cb_arg, group, sock);
}
}
return num_events;
}
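The removed_socks list exists because a callback invoked for event i may remove a different socket whose event is still pending later in the same batch; without the check, that socket's cb_fn would fire after removal. A hypothetical callback showing how the hazard arises (my_conn and connection_is_stale are illustrative, not SPDK APIs):

static void
my_read_cb(void *arg, struct spdk_sock_group *group, struct spdk_sock *sock)
{
	struct my_conn *conn = arg;	/* hypothetical per-connection state */

	if (connection_is_stale(conn->peer)) {
		/* The peer's socket may also be in this poll batch. Removing it
		 * here is what the removed_socks bookkeeping makes safe. */
		spdk_sock_group_remove_sock(group, conn->peer->sock);
		spdk_sock_close(&conn->peer->sock);
	}
}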

View File

@ -919,6 +919,7 @@ vhost_blk_get_config(struct spdk_vhost_dev *vdev, uint8_t *config,
uint32_t blk_size;
uint64_t blkcnt;
memset(&blkcfg, 0, sizeof(blkcfg));
bvdev = to_blk_dev(vdev);
assert(bvdev != NULL);
bdev = bvdev->bdev;
@ -949,7 +950,6 @@ vhost_blk_get_config(struct spdk_vhost_dev *vdev, uint8_t *config,
}
}
blkcfg.blk_size = blk_size;
/* minimum I/O size in blocks */
blkcfg.min_io_size = 1;

View File

@ -266,7 +266,7 @@ LINK_CXX=\
#
# Variables to use for versioning shared libs
#
SO_VER := 1
SO_VER := 2
SO_MINOR := 0
SO_SUFFIX_ALL := $(SO_VER).$(SO_MINOR)

View File

@ -37,7 +37,11 @@ include $(SPDK_ROOT_DIR)/mk/spdk.lib_deps.mk
SPDK_MAP_FILE = $(SPDK_ROOT_DIR)/shared_lib/spdk.map
LIB := $(call spdk_lib_list_to_static_libs,$(LIBNAME))
SHARED_LINKED_LIB := $(subst .a,.so,$(LIB))
ifdef SO_SUFFIX
SHARED_REALNAME_LIB := $(subst .so,.so.$(SO_SUFFIX),$(SHARED_LINKED_LIB))
else
SHARED_REALNAME_LIB := $(subst .so,.so.$(SO_SUFFIX_ALL),$(SHARED_LINKED_LIB))
endif
ifeq ($(CONFIG_SHARED),y)
DEP := $(SHARED_LINKED_LIB)

View File

@ -131,6 +131,7 @@ uint8_t g_number_of_claimed_volumes = 0;
/* Specific to AES_CBC. */
#define AES_CBC_IV_LENGTH 16
#define AES_CBC_KEY_LENGTH 16
#define AESNI_MB_NUM_QP 64
/* Common for supported devices. */
#define IV_OFFSET (sizeof(struct rte_crypto_op) + \
@ -368,6 +369,7 @@ vbdev_crypto_init_crypto_drivers(void)
struct device_qp *dev_qp;
unsigned int max_sess_size = 0, sess_size;
uint16_t num_lcores = rte_lcore_count();
char aesni_args[32];
/* Only the first call, via RPC or module init should init the crypto drivers. */
if (g_session_mp != NULL) {
@ -375,7 +377,8 @@ vbdev_crypto_init_crypto_drivers(void)
}
/* We always init AESNI_MB */
snprintf(aesni_args, sizeof(aesni_args), "max_nb_queue_pairs=%d", AESNI_MB_NUM_QP);
rc = rte_vdev_init(AESNI_MB, aesni_args);
if (rc) {
SPDK_ERRLOG("error creating virtual PMD %s\n", AESNI_MB);
return -EINVAL;

View File

@ -328,7 +328,9 @@ _bdev_nvme_reset_complete(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, int rc)
SPDK_NOTICELOG("Resetting controller successful.\n");
}
pthread_mutex_lock(&g_bdev_nvme_mutex);
nvme_bdev_ctrlr->resetting = false;
pthread_mutex_unlock(&g_bdev_nvme_mutex);
/* Make sure we clear any pending resets before returning. */
spdk_for_each_channel(nvme_bdev_ctrlr,
_bdev_nvme_complete_pending_resets,
@ -425,7 +427,20 @@ bdev_nvme_reset(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, struct nvme_bdev_io *bi
struct spdk_io_channel *ch;
struct nvme_io_channel *nvme_ch;
pthread_mutex_lock(&g_bdev_nvme_mutex);
if (nvme_bdev_ctrlr->destruct) {
/* Don't bother resetting if the controller is in the process of being destructed. */
if (bio) {
spdk_bdev_io_complete(spdk_bdev_io_from_ctx(bio), SPDK_BDEV_IO_STATUS_FAILED);
}
pthread_mutex_unlock(&g_bdev_nvme_mutex);
return 0;
}
if (!nvme_bdev_ctrlr->resetting) {
nvme_bdev_ctrlr->resetting = true;
} else {
pthread_mutex_unlock(&g_bdev_nvme_mutex);
SPDK_NOTICELOG("Unable to perform reset, already in progress.\n");
/*
* The internal reset calls won't be queued. This is on purpose so that we don't
@ -442,6 +457,7 @@ bdev_nvme_reset(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr, struct nvme_bdev_io *bi
return 0;
}
pthread_mutex_unlock(&g_bdev_nvme_mutex);
/* First, delete all NVMe I/O queue pairs. */
spdk_for_each_channel(nvme_bdev_ctrlr,
_bdev_nvme_reset_destroy_qpair,

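Taken together, the reset entry path above becomes one critical section that both claims the resetting flag and observes destruct, which the old lock-free __atomic_test_and_set could not do atomically with the destruct check. Condensed, the claim logic reads (a restatement of this diff, not new behavior):

pthread_mutex_lock(&g_bdev_nvme_mutex);
if (nvme_bdev_ctrlr->destruct) {
	pthread_mutex_unlock(&g_bdev_nvme_mutex);	/* controller going away: fail the I/O */
} else if (!nvme_bdev_ctrlr->resetting) {
	nvme_bdev_ctrlr->resetting = true;		/* claimed; cleared in _bdev_nvme_reset_complete() */
	pthread_mutex_unlock(&g_bdev_nvme_mutex);
} else {
	pthread_mutex_unlock(&g_bdev_nvme_mutex);	/* reset already in flight: queue or fail */
}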
View File

@ -83,8 +83,8 @@ spdk_rpc_nvme_cuse_register(struct spdk_jsonrpc_request *request,
rc = spdk_nvme_cuse_register(bdev_ctrlr->ctrlr);
if (rc) {
SPDK_ERRLOG("Failed to register CUSE devices\n");
spdk_jsonrpc_send_error_response(request, -rc, spdk_strerror(rc));
SPDK_ERRLOG("Failed to register CUSE devices: %s\n", spdk_strerror(-rc));
spdk_jsonrpc_send_error_response(request, rc, spdk_strerror(-rc));
goto cleanup;
}

View File

@ -130,10 +130,20 @@ nvme_bdev_unregister_cb(void *io_device)
free(nvme_bdev_ctrlr);
}
int
nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr)
{
assert(nvme_bdev_ctrlr->destruct);
pthread_mutex_lock(&g_bdev_nvme_mutex);
if (nvme_bdev_ctrlr->resetting) {
nvme_bdev_ctrlr->destruct_poller =
spdk_poller_register((spdk_poller_fn)nvme_bdev_ctrlr_destruct, nvme_bdev_ctrlr, 1000);
pthread_mutex_unlock(&g_bdev_nvme_mutex);
return 1;
}
pthread_mutex_unlock(&g_bdev_nvme_mutex);
spdk_poller_unregister(&nvme_bdev_ctrlr->destruct_poller);
if (nvme_bdev_ctrlr->opal_dev) {
if (nvme_bdev_ctrlr->opal_poller != NULL) {
spdk_poller_unregister(&nvme_bdev_ctrlr->opal_poller);
@ -149,6 +159,7 @@ nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr)
}
spdk_io_device_unregister(nvme_bdev_ctrlr, nvme_bdev_unregister_cb);
return 1;
}
void

View File

@ -94,6 +94,7 @@ struct nvme_bdev_ctrlr {
struct spdk_poller *opal_poller;
struct spdk_poller *adminq_timer_poller;
struct spdk_poller *destruct_poller;
struct ocssd_bdev_ctrlr *ocssd_ctrlr;
@ -150,7 +151,7 @@ struct nvme_bdev_ctrlr *nvme_bdev_next_ctrlr(struct nvme_bdev_ctrlr *prev);
void nvme_bdev_dump_trid_json(struct spdk_nvme_transport_id *trid,
struct spdk_json_write_ctx *w);
int nvme_bdev_ctrlr_destruct(struct nvme_bdev_ctrlr *nvme_bdev_ctrlr);
void nvme_bdev_attach_bdev_to_ns(struct nvme_bdev_ns *nvme_ns, struct nvme_bdev *nvme_disk);
void nvme_bdev_detach_bdev_from_ns(struct nvme_bdev *nvme_disk);

View File

@ -328,7 +328,9 @@ bdev_rbd_flush(struct bdev_rbd *disk, struct spdk_io_channel *ch,
struct spdk_bdev_io *bdev_io, uint64_t offset, uint64_t nbytes)
{
struct bdev_rbd_io_channel *rbdio_ch = spdk_io_channel_get_ctx(ch);
struct bdev_rbd_io *rbd_io = (struct bdev_rbd_io *)bdev_io->driver_ctx;
rbd_io->num_segments++;
return bdev_rbd_start_aio(rbdio_ch->image, bdev_io, NULL, offset, nbytes);
}
@ -783,6 +785,44 @@ spdk_bdev_rbd_delete(struct spdk_bdev *bdev, spdk_delete_rbd_complete cb_fn, voi
spdk_bdev_unregister(bdev, cb_fn, cb_arg);
}
int
spdk_bdev_rbd_resize(struct spdk_bdev *bdev, const uint64_t new_size_in_mb)
{
struct spdk_io_channel *ch;
struct bdev_rbd_io_channel *rbd_io_ch;
int rc;
uint64_t new_size_in_byte;
uint64_t current_size_in_mb;
if (bdev->module != &rbd_if) {
return -EINVAL;
}
current_size_in_mb = bdev->blocklen * bdev->blockcnt / (1024 * 1024);
if (current_size_in_mb > new_size_in_mb) {
SPDK_ERRLOG("The new bdev size must be lager than current bdev size.\n");
return -EINVAL;
}
ch = bdev_rbd_get_io_channel(bdev);
rbd_io_ch = spdk_io_channel_get_ctx(ch);
new_size_in_byte = new_size_in_mb * 1024 * 1024;
rc = rbd_resize(rbd_io_ch->image, new_size_in_byte);
if (rc != 0) {
SPDK_ERRLOG("failed to resize the ceph bdev.\n");
return rc;
}
rc = spdk_bdev_notify_blockcnt_change(bdev, new_size_in_byte / bdev->blocklen);
if (rc != 0) {
SPDK_ERRLOG("failed to notify block cnt change.\n");
return rc;
}
return rc;
}
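The size check above converts the current size to MiB with integer division, so a bdev whose byte size is not MiB-aligned rounds down before the comparison. A worked example of the arithmetic with hypothetical values:

uint64_t blocklen = 4096;					/* bdev->blocklen */
uint64_t blockcnt = 512000;					/* bdev->blockcnt */
uint64_t current_size_in_mb = blocklen * blockcnt / (1024 * 1024);	/* 2000 MiB */
uint64_t new_size_in_mb = 4000;					/* RPC argument */
uint64_t new_size_in_byte = new_size_in_mb * 1024 * 1024;	/* passed to rbd_resize() */
uint64_t new_blockcnt = new_size_in_byte / blocklen;		/* passed to spdk_bdev_notify_blockcnt_change() */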
static int
bdev_rbd_library_init(void)
{

View File

@ -57,4 +57,12 @@ int spdk_bdev_rbd_create(struct spdk_bdev **bdev, const char *name, const char *
void spdk_bdev_rbd_delete(struct spdk_bdev *bdev, spdk_delete_rbd_complete cb_fn,
void *cb_arg);
/**
* Resize rbd bdev.
*
* \param bdev Pointer to rbd bdev.
* \param new_size_in_mb The new size in MiB for this bdev.
*/
int spdk_bdev_rbd_resize(struct spdk_bdev *bdev, const uint64_t new_size_in_mb);
#endif /* SPDK_BDEV_RBD_H */

View File

@ -197,3 +197,56 @@ cleanup:
}
SPDK_RPC_REGISTER("bdev_rbd_delete", spdk_rpc_bdev_rbd_delete, SPDK_RPC_RUNTIME)
SPDK_RPC_REGISTER_ALIAS_DEPRECATED(bdev_rbd_delete, delete_rbd_bdev)
struct rpc_bdev_rbd_resize {
char *name;
uint64_t new_size;
};
static const struct spdk_json_object_decoder rpc_bdev_rbd_resize_decoders[] = {
{"name", offsetof(struct rpc_bdev_rbd_resize, name), spdk_json_decode_string},
{"new_size", offsetof(struct rpc_bdev_rbd_resize, new_size), spdk_json_decode_uint64}
};
static void
free_rpc_bdev_rbd_resize(struct rpc_bdev_rbd_resize *req)
{
free(req->name);
}
static void
spdk_rpc_bdev_rbd_resize(struct spdk_jsonrpc_request *request,
const struct spdk_json_val *params)
{
struct rpc_bdev_rbd_resize req = {};
struct spdk_bdev *bdev;
struct spdk_json_write_ctx *w;
int rc;
if (spdk_json_decode_object(params, rpc_bdev_rbd_resize_decoders,
SPDK_COUNTOF(rpc_bdev_rbd_resize_decoders),
&req)) {
spdk_jsonrpc_send_error_response(request, SPDK_JSONRPC_ERROR_INTERNAL_ERROR,
"spdk_json_decode_object failed");
goto cleanup;
}
bdev = spdk_bdev_get_by_name(req.name);
if (bdev == NULL) {
spdk_jsonrpc_send_error_response(request, -ENODEV, spdk_strerror(ENODEV));
goto cleanup;
}
rc = spdk_bdev_rbd_resize(bdev, req.new_size);
if (rc) {
spdk_jsonrpc_send_error_response(request, rc, spdk_strerror(-rc));
goto cleanup;
}
w = spdk_jsonrpc_begin_result(request);
spdk_json_write_bool(w, true);
spdk_jsonrpc_end_result(request, w);
cleanup:
free_rpc_bdev_rbd_resize(&req);
}
SPDK_RPC_REGISTER("bdev_rbd_resize", spdk_rpc_bdev_rbd_resize, SPDK_RPC_RUNTIME)

View File

@ -462,7 +462,7 @@ spdk_posix_sock_close(struct spdk_sock *_sock)
}
#ifdef SPDK_ZEROCOPY
static int
_sock_check_zcopy(struct spdk_sock *sock)
{
struct spdk_posix_sock *psock = __posix_sock(sock);
@ -483,7 +483,7 @@ _sock_check_zcopy(struct spdk_sock *sock)
if (rc < 0) {
if (errno == EWOULDBLOCK || errno == EAGAIN) {
return 0;
}
if (!TAILQ_EMPTY(&sock->pending_reqs)) {
@ -491,19 +491,19 @@ _sock_check_zcopy(struct spdk_sock *sock)
} else {
SPDK_WARNLOG("Recvmsg yielded an error!\n");
}
return 0;
}
cm = CMSG_FIRSTHDR(&msgh);
if (cm->cmsg_level != SOL_IP || cm->cmsg_type != IP_RECVERR) {
SPDK_WARNLOG("Unexpected cmsg level or type!\n");
return 0;
}
serr = (struct sock_extended_err *)CMSG_DATA(cm);
if (serr->ee_errno != 0 || serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) {
SPDK_WARNLOG("Unexpected extended error origin\n");
return 0;
}
/* Most of the time, the pending_reqs array is in the exact
@ -521,7 +521,7 @@ _sock_check_zcopy(struct spdk_sock *sock)
rc = spdk_sock_request_put(sock, req, 0);
if (rc < 0) {
return rc;
}
} else if (found) {
@ -531,6 +531,8 @@ _sock_check_zcopy(struct spdk_sock *sock)
}
}
return 0;
}
#endif
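For reference, _sock_check_zcopy() consumes the kernel's MSG_ZEROCOPY completion protocol: each zerocopy send is numbered, and completions arrive on the socket error queue as a sock_extended_err carrying an inclusive [ee_info, ee_data] range. A self-contained sketch of reading one such notification (names local to the sketch):

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/errqueue.h>

static int
drain_zcopy_acks(int fd)
{
	char control[CMSG_SPACE(sizeof(struct sock_extended_err))] = {};
	struct msghdr msgh = { .msg_control = control, .msg_controllen = sizeof(control) };
	struct cmsghdr *cm;
	struct sock_extended_err *serr;

	if (recvmsg(fd, &msgh, MSG_ERRQUEUE) < 0) {
		return -1;	/* EAGAIN: nothing to reap */
	}
	cm = CMSG_FIRSTHDR(&msgh);
	if (cm == NULL || cm->cmsg_level != SOL_IP || cm->cmsg_type != IP_RECVERR) {
		return -1;
	}
	serr = (struct sock_extended_err *)CMSG_DATA(cm);
	if (serr->ee_errno == 0 && serr->ee_origin == SO_EE_ORIGIN_ZEROCOPY) {
		printf("sends %u..%u completed\n", serr->ee_info, serr->ee_data);
	}
	return 0;
}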
@ -959,14 +961,22 @@ spdk_posix_sock_group_impl_poll(struct spdk_sock_group_impl *_group, int max_eve
for (i = 0, j = 0; i < num_events; i++) {
#if defined(__linux__)
sock = events[i].data.ptr;
#ifdef SPDK_ZEROCOPY
if (events[i].events & EPOLLERR) {
rc = _sock_check_zcopy(sock);
/* If the socket was closed or removed from
* the group in response to a send ack, don't
* add it to the array here. */
if (rc || sock->cb_fn == NULL) {
continue;
}
}
#endif
if (events[i].events & EPOLLIN) {
socks[j++] = sock;
}
#elif defined(__FreeBSD__)

View File

@ -38,6 +38,13 @@ C_SRCS += vpp.c
CFLAGS += -Wno-sign-compare -Wno-error=old-style-definition
CFLAGS += -Wno-error=strict-prototypes -Wno-error=ignored-qualifiers
GCC_VERSION=$(shell $(CC) -dumpversion | cut -d. -f1)
# disable packed member unalign warnings
ifeq ($(shell test $(GCC_VERSION) -ge 9 && echo 1), 1)
CFLAGS += -Wno-error=address-of-packed-member
endif
LIBNAME = sock_vpp
include $(SPDK_ROOT_DIR)/mk/spdk.lib.mk

View File

@ -2,12 +2,12 @@
%bcond_with doc
Name: spdk
Version: 20.01.x
Release: 0%{?dist}
Epoch: 0
URL: http://spdk.io
Source: https://github.com/spdk/spdk/archive/v20.01.x.tar.gz
Summary: Set of libraries and utilities for high performance user-mode storage
%define package_version %{epoch}:%{version}-%{release}

View File

@ -541,6 +541,20 @@ if __name__ == "__main__":
p.add_argument('name', help='rbd bdev name')
p.set_defaults(func=bdev_rbd_delete)
def bdev_rbd_resize(args):
print_json(rpc.bdev.bdev_rbd_resize(args.client,
                                    name=args.name,
                                    new_size=int(args.new_size)))
p = subparsers.add_parser('bdev_rbd_resize',
help='Resize a rbd bdev')
p.add_argument('name', help='rbd bdev name')
p.add_argument('new_size', help='new bdev size for resize operation. The unit is MiB')
p.set_defaults(func=bdev_rbd_resize)
def bdev_delay_create(args):
print_json(rpc.bdev.bdev_delay_create(args.client,
base_bdev_name=args.base_bdev_name,

View File

@ -585,6 +585,20 @@ def bdev_rbd_delete(client, name):
return client.call('bdev_rbd_delete', params)
def bdev_rbd_resize(client, name, new_size):
"""Resize rbd bdev in the system.
Args:
name: name of rbd bdev to resize
new_size: new bdev size of resize operation. The unit is MiB
"""
params = {
'name': name,
'new_size': new_size,
}
return client.call('bdev_rbd_resize', params)
@deprecated_alias('construct_error_bdev')
def bdev_error_create(client, base_name):
"""Construct an error injection block device.

View File

@ -41,6 +41,27 @@
static char g_path[256];
static struct spdk_poller *g_poller;
struct ctrlr_entry {
struct spdk_nvme_ctrlr *ctrlr;
struct ctrlr_entry *next;
};
static struct ctrlr_entry *g_controllers = NULL;
static void
cleanup(void)
{
struct ctrlr_entry *ctrlr_entry = g_controllers;
while (ctrlr_entry) {
struct ctrlr_entry *next = ctrlr_entry->next;
spdk_nvme_detach(ctrlr_entry->ctrlr);
free(ctrlr_entry);
ctrlr_entry = next;
}
}
static void
usage(char *executable_name)
{
@ -70,6 +91,17 @@ static void
attach_cb(void *cb_ctx, const struct spdk_nvme_transport_id *trid,
struct spdk_nvme_ctrlr *ctrlr, const struct spdk_nvme_ctrlr_opts *opts)
{
struct ctrlr_entry *entry;
entry = malloc(sizeof(struct ctrlr_entry));
if (entry == NULL) {
fprintf(stderr, "Malloc error\n");
exit(1);
}
entry->ctrlr = ctrlr;
entry->next = g_controllers;
g_controllers = entry;
}
static int
@ -163,6 +195,8 @@ main(int argc, char **argv)
opts.shutdown_cb = stub_shutdown;
ch = spdk_app_start(&opts, stub_start, (void *)(intptr_t)opts.shm_id);
cleanup();
spdk_app_fini();
return ch;

View File

@ -45,7 +45,14 @@ pushd $DB_BENCH_DIR
if [ -z "$SKIP_GIT_CLEAN" ]; then
git clean -x -f -d
fi
EXTRA_CXXFLAGS=""
GCC_VERSION=$(cc -dumpversion | cut -d. -f1)
if (( GCC_VERSION >= 9 )); then
EXTRA_CXXFLAGS+="-Wno-deprecated-copy -Wno-pessimizing-move"
fi
$MAKE db_bench $MAKEFLAGS $MAKECONFIG DEBUG_LEVEL=0 SPDK_DIR=$rootdir EXTRA_CXXFLAGS="$EXTRA_CXXFLAGS"
popd
timing_exit db_bench_build

View File

@ -52,20 +52,6 @@ DEFINE_STUB(rte_mem_event_callback_register, int,
(const char *name, rte_mem_event_callback_t clb, void *arg), 0);
DEFINE_STUB(rte_mem_virt2iova, rte_iova_t, (const void *virtaddr), 0);
static int
test_mem_map_notify(void *cb_ctx, struct spdk_mem_map *map,
enum spdk_mem_map_notify_action action,

View File

@ -107,6 +107,7 @@ function start_vpp() {
# On the VPP side the maximal TCP MTU is 1460 and the tests do not
# work reliably with larger packets
MTU=1460
MTU_W_HEADER=$((MTU+20))
ip link set dev $INITIATOR_INTERFACE mtu $MTU
ethtool -K $INITIATOR_INTERFACE tso off
ethtool -k $INITIATOR_INTERFACE
@ -131,7 +132,7 @@ function start_vpp() {
xtrace_disable
counter=40
while [ $counter -gt 0 ] ; do
vppctl show version | grep -E "vpp v[0-9]+\.[0-9]+" && break
counter=$(( counter - 1 ))
sleep 0.5
done
@ -140,37 +141,47 @@ function start_vpp() {
return 1
fi
# Below VPP commands are masked with "|| true" for the sake of
# running the test in the CI system. For reasons unknown, when run
# via CI these commands exit with code 141 (pipefail) despite
# producing valid output.
# Using "|| true" does not impact the "-e" flag used in test scripts
# because vppctl cli commands always return with 0, even if
# there was an error.
# As a result - grep checks on command outputs must be used to
# verify vpp configuration and connectivity.
# Setup host interface
vppctl create host-interface name $TARGET_INTERFACE || true
VPP_TGT_INT="host-$TARGET_INTERFACE"
vppctl set interface state $VPP_TGT_INT up || true
vppctl set interface ip address $VPP_TGT_INT $TARGET_IP/24 || true
vppctl set interface mtu $MTU $VPP_TGT_INT || true
vppctl show interface | tr -s " " | grep -E "host-$TARGET_INTERFACE [0-9]+ up $MTU/0/0/0"
# Disable session layer
# NOTE: VPP net framework should enable it itself.
vppctl session disable || true
# Verify connectivity
vppctl show int addr | grep -E "$TARGET_IP/24"
ip addr show $INITIATOR_INTERFACE
ip netns exec $TARGET_NAMESPACE ip addr show $TARGET_INTERFACE
sleep 3
# SC1010: ping -M do - in this case do is an option not bash special word
# shellcheck disable=SC1010
ping -c 1 $TARGET_IP -s $(( MTU - 28 )) -M do
vppctl ping $INITIATOR_IP repeat 1 size $(( MTU - (28 + 8) )) verbose | grep -E "$MTU_W_HEADER bytes from $INITIATOR_IP"
}
function kill_vpp() {
vppctl delete host-interface name $TARGET_INTERFACE || true
# Dump VPP configuration before kill
vppctl show api clients || true
vppctl show session || true
vppctl show errors || true
killprocess $vpp_pid
}

View File

@ -40,6 +40,15 @@ $rpc_py iscsi_create_portal_group $PORTAL_TAG $TARGET_IP:$ISCSI_PORT
$rpc_py iscsi_create_initiator_group $INITIATOR_TAG $INITIATOR_NAME $NETMASK
rbd_bdev="$($rpc_py bdev_rbd_create $RBD_POOL $RBD_NAME 4096)"
$rpc_py bdev_get_bdevs
$rpc_py bdev_rbd_resize $rbd_bdev 2000
num_block=$($rpc_py bdev_get_bdevs|grep num_blocks|sed 's/[^[:digit:]]//g')
# get the bdev size in MiB.
total_size=$((num_block * 4096 / 1048576))
if [ $total_size != 2000 ]; then
echo "resize failed."
exit 1
fi
# "Ceph0:0" ==> use Ceph0 blockdev for LUN0
# "1:2" ==> map PortalGroup1 to InitiatorGroup2
# "64" ==> iSCSI queue depth 64

View File

@ -93,6 +93,7 @@ set -e
for i in {1..10}; do
if [ -f "${KERNEL_OUT}.${i}" ] && [ -f "${CUSE_OUT}.${i}" ]; then
sed -i "s/${nvme_name}/nvme0/g" ${KERNEL_OUT}.${i}
diff --suppress-common-lines ${KERNEL_OUT}.${i} ${CUSE_OUT}.${i}
fi
done

View File

@ -22,6 +22,13 @@ function tgt_init()
}
nvmftestinit
# There is an intermittent error relating to this test and Soft-RoCE. For now, just
# skip this test if we are using rxe. TODO: get to the bottom of GitHub issue #1165
if [ $TEST_TRANSPORT == "rdma" ] && check_ip_is_soft_roce $NVMF_FIRST_TARGET_IP; then
echo "Using software RDMA, skipping the host bdevperf tests."
exit 0
fi
tgt_init

View File

@ -63,6 +63,7 @@ DEFINE_STUB(nvme_transport_ctrlr_construct, struct spdk_nvme_ctrlr *,
DEFINE_STUB_V(nvme_io_msg_ctrlr_detach, (struct spdk_nvme_ctrlr *ctrlr));
DEFINE_STUB(spdk_nvme_transport_available, bool,
(enum spdk_nvme_transport_type trtype), true);
DEFINE_STUB(spdk_uevent_connect, int, (void), 1);
static bool ut_destruct_called = false;

View File

@ -462,11 +462,11 @@ test_build_contig_hw_sgl_request(void)
CU_ASSERT(req.cmd.dptr.sgl1.address == tr.prp_sgl_bus_addr);
CU_ASSERT(req.cmd.dptr.sgl1.unkeyed.length == 2 * sizeof(struct spdk_nvme_sgl_descriptor));
CU_ASSERT(tr.u.sgl[0].unkeyed.type == SPDK_NVME_SGL_TYPE_DATA_BLOCK);
CU_ASSERT(tr.u.sgl[0].unkeyed.length == 60);
CU_ASSERT(tr.u.sgl[0].address == 0xDEADBEEF);
CU_ASSERT(tr.u.sgl[1].unkeyed.type == SPDK_NVME_SGL_TYPE_DATA_BLOCK);
CU_ASSERT(tr.u.sgl[1].unkeyed.length == 40);
CU_ASSERT(tr.u.sgl[1].address == 0xDEADBEEF);
MOCK_CLEAR(spdk_vtophys);
g_vtophys_size = 0;
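Spelled out, the bug the four corrected asserts above had: CU_ASSERT() only evaluates its expression for truth, so an accidental assignment both overwrites the value under test and always passes:

CU_ASSERT(tr.u.sgl[0].unkeyed.length = 60);	/* assigns 60, then tests "60": always true */
CU_ASSERT(tr.u.sgl[0].unkeyed.length == 60);	/* compares, as intended */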