Commit Graph

1971 Commits

Author SHA1 Message Date
Ben Walker
f046750c32 event: Add return code to spdk_rpc_initialize
This is an internal API used in several places. The call can fail, so
make sure it can report that correctly.

Change-Id: Iac0ed2c8299c9dd3d2556070278a2224c3807b7b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-03-12 08:38:10 +00:00
Tomasz Zawadzki
13f868977e module/bdev: remove deprecated API
spdk_vbdev_register() was deprecated in SPDK 19.04.
config_text field in spdk_bdev_module was deprecated in SPDK 20.10.
spdk_bdev_part_base_construct() was deprecated in SPDK 20.10.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib795ccdf61154c168032ccf8b81ea77e5e663851
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6628
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-03-09 08:54:53 +00:00
Tomasz Zawadzki
a6b1e2c57d lib/bdev: remove deprecated spdk_bdev_open()
spdk_bdev_open() was deprecated in SPDK 19.10.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I12f51a237ddc43219d35b5c647dc60d10461d429
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6626
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-03-09 08:54:53 +00:00
Alexey Marchuk
3caf2e710f nvmf/rdma: Add new statistic to count idle polls
This statistic is incremented when we don't reap
anything from the CQ. Together with the total number
of polls it can be useful to estimate idle percentage.

Change-Id: I61b51d049b0bc506fb8a896e225187e46e75a564
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6295
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-03-09 08:54:12 +00:00
Alexey Marchuk
36ac75b94c nvmf/rdma: Use new RDMA statistics
Change-Id: I11c7793301be9dbe9d4c92e482a97fc6146cc355
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6294
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-03-09 08:54:12 +00:00
Richael Zhuang
201aa63471 sock: introduce SO_INCOMING_CPU to get placement_id
Leverage SO_INCOMING_CPU to get the CPU affinity of connections
(sockets). And allocate the connections to specific poll groups,
which aims to utilize cache locality.

From our test:
6 P4600 NVMe on target,target uses 8 cores, NIC irqs are bound to
these 8 cores, and initiator side uses 24 and 32 cores,
we can get 11%~17% randwrite performance boost for posix, and 8%~12%
for uring.

Change-Id: I011e0a21502c85adcccd4a14fbe9838b43f54976
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5748
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-03-09 08:53:52 +00:00
Tomasz Zawadzki
3e15b356f3 module/blob: remove deprecated spdk_bdev_create_bs_dev/from_desc()
spdk_bdev_create_bs_dev was deprecated in SPDK 19.10
spdk_bdev_create_bs_dev_from_desc was deprecated in SPDK 20.10

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1290958923f7833579d098a693454e7ab7656307
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6624
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-03-05 08:33:45 +00:00
Tomasz Zawadzki
a38d241cd6 lib/event: remove deprecated fields from spdk_app_opts
max_delay_us was deprecated in SPDK 19.04.
config_file was deprecated in SPDK 20.10.
master_core/pci_blacklist/pci_whitelist were deprecated in SPDK 21.01.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie8be4c347de58044a7c3d5b1329d96e47ce084b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-03-05 08:33:45 +00:00
Tomasz Zawadzki
4fb0b89625 lib/env: remove deprecated fields from spdk_env_opts
The fields were deprecated in SPDK 21.01.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If6b946024bf1ce4c106cdf493bcb5662a3b21b13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6592
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-03-05 08:33:45 +00:00
Alexey Marchuk
1eae58ff6d rdma: Add statistics per qpair
These statistics can help to estimate efficiency of
Work Requests batching and show busy/idle polls ratio.

Send: the doorbell updates statistics for verbs
provider are incremented per each ibv_post_send call,
for mlx5_dv per each ibv_wr_complete call.

Recv: the doorbell updates statistics for both
providers are updated when either ibv_post_recv
or ibv_post_srq_recv functions are called.

Each qpair on initialization accepts an optional
pointer to shared statistics (nvmf/nvme poll groups).
If the pointer to statistics is not provided then
qpair allocates its own structure. That is done
to support cases when NVME RDMA initiator doesn't
use poll groups, so we can avoid checks that qpair
has statistics in IO path

Change-Id: I07dea603cb870b85ea23c42e8e2c4520b1c66252
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6293
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-03-01 10:17:13 +00:00
Ziye Yang
be57e2a6ec lib/nvmf: Use the accelerated engine to compute the data digest.
This patch is used to leverage accelerated engine to compute
the data digest in the following case:

1 DIF is not used.
2 The data to compute is aligned with size 4, i.e, %4 = 0.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I51fb6e3ab04391062b244cba6e249c8e20d3180f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6014
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-02-25 10:25:49 +00:00
Ziye Yang
88754353c0 lib/accel: Add the real chained crc32 support with the example.
This patch added the chained crc32 support API for both batched
and non batched mode usage. And also update the accel_perf
program in order to use the revised accelerated crc32 function.

For example, you can use the following command:

./build/examples/accel_perf -C 4 -q 128 -o 4096 -t 5 -w crc32c -y

In this command, "-C 4" means that caculate the chained
crc32 for an iov array.
(even if you do not have the accelerated DSA hardware)

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifede26f9040980b5791da8e5afef41177eede9f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6457
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2021-02-25 10:25:49 +00:00
Ziye Yang
d5cd0b13b6 sock: Fix the "sock remove assert bug" in spdk_sock_group_remove_sock
The statement causes this issue is:

assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL);

The call trace is:

The previous solution is:
commitid with: e71e81b631

But with this solution, it will always add the sock
into the removed_socks list even if it is not under polling
context by sock_group_impl_poll_count. So it will exceed the size of
removed_socks array if sock_group_impl_poll_count function will not be
called. And we should not use a large array, because it is just a workaround,
it just hides the bug.

So our current solution is:

1 Remove the code in sock layer, i.e., rollback the commit
 e71e81b631. This patch is
not the right fix. The sock->cb_fn's NULL pointer case is
caused by the cb_fn of write operation (if the
spdk_sock_group_remove_sock is inside the cb_fn). And it is not
caused by the epoll related cache issue described in commit
"e7181.." commit, but caused by the following situation:

(1)The socket's cb_fn is set to NULL which is caused by
spdk_sock_group_remove_sock by the socket itself
inside a call back function from a write operation.

(2) And the socket is already in the pending_recv list. It is
not caused by the epoll event issue, e.g., socket A changes Socket B's
cb_fn. By the way, A socket A should never remove a socket B from a polling group.
If it really does it, it should use spdk_thread_sendmsg to make sure
it happens in the next round.

2 Add the code check in each posix, uring implementation module.
If sock->cb_fn is NULL, we will not return the socket to the active socks list.
And this is enough to address the issue.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I79187f2f1301c819c46a5c3bdd84372f75534f2f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6472
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-02-24 13:06:50 +00:00
Niklas Cassel
9d79d27e49 nvme: add function to get ZNS max active zones
Add a function to get the number of max active zones for a zoned
namespace.

The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia09e3db157ca0afadbd3ca4032eedd7bcd88248c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6443
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-02-24 13:06:32 +00:00
Niklas Cassel
bb5330c06d nvme: add function to get ZNS max open zones
Add a function to get the number of max open zones for a zoned
namespace.

The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I4223146bc1ddf90486892a0af5fe5ce006dc5fd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6442
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-02-24 13:06:32 +00:00
Ziye Yang
90c56d965d lib/accel: Two crc32c APIs are added to provide the chained crc32 operation support
This patch provides two new accelerated crc32c function interface.
And the next patch will be used to add the real support of chained crc32c feature.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I3f8dd55c3da636e29e5fb02fc229b51f05653cd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6456
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2021-02-23 08:52:48 +00:00
Alexey Marchuk
9451030761 rdma: Add functions to queue and flush recv WRs
Change-Id: I77e286efd48330fcb6c1c85d1515656e49587983
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6290
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
2021-02-22 10:05:27 +00:00
paul luse
9a70b6375d idxd: update low level lib and poller to indicate busy/idle
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I0acb74bd679e924977fb058a29ee947734e7aa83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6287
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-02-19 11:29:53 +00:00
Alexey Marchuk
27d8fd9f4c rdma: Add functions to work with SRQ
Add 4 new functions to create/delete SRQ, queue and flush
recv Work Requests.

Change-Id: I55401d22724cf1ce24f2cca588f2bd1384cdbcd0
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6289
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-02-17 10:19:28 +00:00
Alexey Marchuk
4f2a837238 rdma: Fix doxygen comments
Change-Id: I6a9e9d173aad96c1d6c162396c1e3023b756498b
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-02-17 10:19:28 +00:00
Jim Harris
5e37d3b2d8 nvme: add spdk_nvme_qpair_get_id
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I550d42806a84eb94effd1e1c4e26d9493cda7703

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6395
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-02-17 10:17:43 +00:00
Niklas Cassel
014baeb8ef nvme: add support for ZNS zone append vector variant
We already have support for spdk_nvme_zns_zone_append(),
add support for spdk_nvme_zns_zone_appendv() (zone append with
NVME_PAYLOAD_TYPE_SGL).

_nvme_ns_cmd_rw() currently performs verification of the SGL,
if the parameter check_sgl is set. This parameter is set for all
calls with payload of type NVME_PAYLOAD_TYPE_SGL.

In order to be able to perform the same check_sgl verfication on
zone append vectors, we need to refactor _nvme_ns_cmd_rw() a bit.

Setting check_sgl ensures that _nvme_ns_cmd_split_request_sgl() or
_nvme_ns_cmd_split_request_prp() gets called.

These functions will split an oversized I/O into several different
requests. However, they also iterate the SGE entries, verifies that
the total payload size, total SGE entries is not too many, and that
buffers are properly aligned. A proper request will not get split.

For zone append, splitting a request into several is not allowed,
however, we still want the verification part to be done, such that
(e.g.) a non first/last SGE which is not page aligned, will cause
the whole request to be rejected.
(In the case of spdk_nvme_ns_cmd_write(), a non first/last SGE which
is not page aligned will instead cause the request to be split.)

An alternative would be to try to rip out the verification part from
_nvme_ns_cmd_split_request_sgl() and _nvme_ns_cmd_split_request_prp().
However, that is non-trivial, and would most likely end up with a lot
of duplicated code, which would easily get out of sync.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2728acdcadeb70b1f0ed628704df19e75d14dcca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2021-02-17 10:17:22 +00:00
Niklas Cassel
9a800c38f5 nvme: add spdk_nvme_zns_ns_get_zone_size_sectors()
Sometimes it is more optimal to get the zone size in number
of sectors, instead of in number of bytes.

Therefore, add a new spdk_nvme_zns_ns_get_zone_size_sectors()
function to get zone size in number of sectors.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I0fe67e00a3d74dd27acfc895ae97448d995b89a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-02-17 10:17:22 +00:00
Niklas Cassel
8f5b48f8fd nvme: make enum spdk_nvme_ns_flags more readable
Make enum spdk_nvme_ns_flags more readable.
Other enums in spdk, e.g. enum spdk_nvme_ctrlr_flags,
and e.g. enum NVME_RDMA_COMPLETION_FLAGS, already define
the enum values using bit shifts.

Do the same for enum spdk_nvme_ns_flags.

No functional change intended.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia78ab0bab6fa37d700178d5b2669feb5a3003871
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6383
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-02-17 10:17:22 +00:00
Wenhua Liu
e7f6ff2db6 Fix incorrect implementation of HPDA/CPDA in NVMe/TCP target code.
The current implementation treats HPDA/CPDA as the absolute offset
to the beginning of the PDU where the payload data starts. This is
incorrect. The HPDA/CPDA actually specify where the payload data
should start such that the starting location is a multiple of HPDA
(for C2H PDU) or CPDA (for H2C PDU or CapsuleCmd PDU).

The other issue fixed is that the current implementation calculates
padding only when header digest is enabled. This is also incorrect.

Signed-off-by: Wenhua Liu <liuw@vmware.com>
Change-Id: If7a3896a4c1d73f6d062bd3dbe6a912d31771180
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6256
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-02-16 08:40:14 +00:00
Krishna Kanth Reddy
48aed8a578 lib/nvme: Adding support for PMR Identification
Identify application prints the PMR details if it is supported

Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Change-Id: Iaba4c15e18e1402035b11a34b2defe8078855751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6209
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-02-11 08:40:48 +00:00
Nick Connolly
68ff34bc66 include/nvme_spec.h: improve portability
Aspects of bit fields are 'implementation defined'.  On some platforms
alignment will occur if two adjacent fields are of different types. This
occurs in spdk_nvme_feat_async_event_configutation after the crit_warn
member which is effectively an int8_t, followed by an int16_t. There
isn't a generic way of changing the compiler's behaviour, so the best
options are:

- Change crit_warn to a uint32_t bit field and copy the value to/from
  a spdk_nvme_critical_warning_state variable to use it. This requires
  changes to code using the field.

- Adjust the structure definition to use smaller types to avoid the
  problem. This preserves existing semantics, but the field order will
  need to be reviewed if big-endian support is ever added (other places
  in nvme_spec.h will need similar attention). A second reserved field
  is required.

Use smaller types which seems the most straightforward option. Adjust
the use of the spdk_nvme_feat_async_event_configuration reserved fields
in lib/nvmf/ctrlr.c.

The new structure is binary compatible and the fields behave in the same
way, with the exception of an additional reserved field, so updating
CHANGELOG.md probably isn't necessary.

Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I7d8163c84b4f410fc95a5b7064506ad7b4b62c6c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6340
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2021-02-10 08:44:10 +00:00
Niklas Cassel
c56fb6d3ee nvme: fix typo in zone append header documentation
While zone append is very similar to write, we should refer
to the proper I/O command.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I85dc2634eefb502dd38351ca02d9b09a15c5620f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6245
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
2021-02-09 11:30:51 +00:00
Jim Harris
e329ec45a6 nvme: add set_nvme_pcie_set_hotplug_filter
This function allows applications to specify whether
they wish to allow probing a newly attached NVMe
PCIe SSD.

The env layer will only even probe devices that have
been allowed.  By default, this is all devices, but
if the user has specified some list of
allowed PCI addresses (via spdk_env_opts pci_allowed)
then newly attached PCIe devices are implicitly not
allowed.  This API allows applications to add
device addresses to the allowed list after the
application has started.

This API will be useful for use cases where multiple
SPDK processes are running on one server, and assignment
of PCIe SSDs to those processes are based on some function
of the SSD's PCIe address.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I919bc267f2ad9130ab5c875ff760a301028b047e

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6184
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
2021-02-09 11:27:19 +00:00
Jim Harris
207e05831f env: add spdk_pci_device_allow() API
The env layer has a pci_allowed list, which specifies
that only a subset of PCI devices may be attached
by the associated process.

But that doesn't cover PCI devices that are hot-inserted
after the application starts, which is common for
storage/NVMe.

So add a new spdk_pci_device_allow() API which allows
an application to add new devices to the allowed list.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7bd5ff428d84480d46bc236698daadd019b20b8e

Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6183
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
2021-02-09 11:27:19 +00:00
paul luse
8a1a845053 idxd: simplification in re-balancing channels
For flow control reasons we have to resize the bit arrays we
use to manage flow as channels come and go.  However since
channels are assigned to devices, until the channel count
reaches the device count there's no sharing so no resize of
the array is needed. So, when we use a device for the first
time there's no need to run through the rest of the channels
and re-balance.

Same thing is done on destruction. The code to free idxd
specific resources was moved from the rebalance function to
the idxd put channel function which is a much more logical
place for it as well.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ib4df163286906f413dd6429dc6833af7b68e208c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5846
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-02-05 13:42:32 +00:00
Ziye Yang
e749b5d3ec nvme: Add the interface to get the optimal polling group
This patch is used to add spdk_nvme_poll_group_get_optimal
public API.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iee34c89e0e1ff1f81167b18e198c144ca28f71de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3311
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2021-02-04 08:30:54 +00:00
Tomasz Zawadzki
19dddb3e32 version: SPDK 21.04 pre
This is first commit that should go into latest SPDK
after the code freeze for SPDK 21.01.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ieeef261f4a83183946358916a084fef886339d83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6052
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2021-01-26 17:44:02 +00:00
Ben Walker
312a9d603d nvmf: Subsystem pauses only pause admin queues
Additionally, the user can specify a namespace to also pause during the
operation.

This allows for the management of hosts, listeners, and the addition of
namespaces all while I/O to other namespaces is occurring. Pausing a
specific namespace also allows for the removal of that namespace without
impacting I/O to other namespaces in the subsystem.

Change-Id: I364336df16df92fe2069114674cb7a68076de6fb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4997
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-01-26 17:38:54 +00:00
Jim Harris
7665710cd4 event: wait to shut down subsystems if scheduling in progress
We want to avoid an active for_each_reactor operation
while the reactors are being shut down.

Fixes issue #1766

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I30bc258c5b22545320080d269a1ed8cb0b4e12f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6104
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-26 17:33:57 +00:00
Maciej Szwed
9cde15093f schedulers: Add interrupt mode information to cores map
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I0d143a1aa05f27c4d0b4bd6212b361eafa966de9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5452
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-25 20:37:50 +00:00
Liu Xiaodong
a20745541e reactor: add spdk_reactor_set_interrupt_mode func
spdk_reactor_set_interrupt_mode will send event to set
reactor into interrupt mode or poll mode, also set every
notify_cpuset on all reactors for consistency.
It can be used by RPC method or scheduler to set
reactor to interrupt mode while workload is lightweight.

Currently, this function is limited that the specific
reactor should have no attached spdk_thread.

Change-Id: I7e8f449bff1184b9a7948f80b9572066a19da60f
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5853
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-25 20:37:50 +00:00
Liu Xiaodong
eff5b149ce reactor: add cpuset to indicate intr mode
Each bit of the cpuset indicates whether a reactor
is going to be in interrupt mode.
Each spdk_cpuset is allocated to each reactor. So it
can only be touched by its reactor.

Change-Id: Ic186de341588b701d7471bf09336309d28b1bf4e
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5850
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2021-01-25 16:32:33 +00:00
Krzysztof Karas
abf52d7d7f spdk_top: add rpc for scheduler and governor
Added rpc to get name and period of currently set spdk scheduler
and name of currently set spdk governor.

Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Change-Id: I5562a81a7f9e4879bd48a765c9467f70b43f73ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5917
Community-CI: Broadcom CI
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-22 18:33:13 +00:00
Maciej Szwed
bfca56ffc9 scheduler: Track threads count per core during rebalancing
This will be helpful in upcoming patches for changing
interrupt mode (when reactor have no threads).

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I4902d0bd26f27db2b98dc4ca4fd4df934f59d9a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2021-01-22 18:31:25 +00:00
Maciej Szwed
80a13be9f5 event: Save thread stats to last_stats field in spdk_lw_thread structure
This field will be used in upcoming patches
by scheduler to keep a history of threads
statistics. Base on those scheduler will
be able to calculate how busy a thread is.

Also we have to save the value of this field
while rescheduling thread.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Iffbd1f946e1d159ec7277a39bc30161e71722456
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5972
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2021-01-22 18:31:25 +00:00
Niklas Cassel
aa6767fb14 nvme: add support for ZNS zone append command
The Zone Append command is an optional command in the Zoned Namespace
Command Set.

Zone Append differs from a regular write, in that the command is not
given an exact LBA of where to write the data.

Instead the user has to set the zslba field to the start of a zone,
and the data will be appended to that zone.

The actual LBA where the data was stored is returned in the
spdk_nvme_cpl, where Dword0 contains 31:00 of the ALBA field,
and Dword1 contains bits 63:32 of the ALBA field.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Iabae1b3456bfbb62c07b63d79afe9a14e460fe83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6013
Community-CI: Broadcom CI
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-22 08:15:55 +00:00
Niklas Cassel
ffd18d43e3 nvme/spec: add zone append specific PIREMAP define
PIREMAP is defined as bit 25 in Command Dword12 for the zone append
command.
Bit 25 in Command Dword12 is still reserved for regular reads/writes.

PIREMAP determines the contents of the reference tag written to the
media.

If cleared to 0, then the controller writes the reference tag
per the NVMe Base specification without modification.
If set to 1, Reference Tag[0] = ILBRT + (ALBA - ZSLBA).
Media Reference Tag[n+1] = Media Reference Tag[n] + 1.

As defined in Zoned Namespaces Command Set Specification,
section 4.5.1.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I984ca00146baf6af61208e3598c833f1ec4708bb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6012
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2021-01-22 08:15:55 +00:00
Niklas Cassel
628f9adb39 nvme/spec: completion queue entry Dword1 is now command specific
The ratified NVME TP4056 changes the completion queue entry Dword1
from reserved to command specific.

Update spdk_nvme_cpl to reflect this.

Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I6a21eb1355f6ca69770ea7eb4ccc8fe1c5a0d737
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6011
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-22 08:15:55 +00:00
Alexey Marchuk
74542bae77 tcp: Rename readv_offset to rw_offset in nvme_tcp_pdu
In the next patch this member will be used to track
both read and write offsets

Change-Id: I852125ff35257f9821ddf4a641d96afb29ebf0a0
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5924
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-21 09:55:53 +00:00
Changpeng
3ff9c13614 NVMe/vfio-user: add initial version vfio-user transport to NVMe driver
When NVMf target linked with vfio-user library, we can use
vfio-user client library to connect to the target.

Here is the three examples that can work with target:

identify -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g
perf -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g -q 1 -o 4096 -w read -t 10
reconnect -r 'trtype:VFIOUSER traddr:/var/run/muser/domain/muser0/8' -g -q 32 -o 4096 -w randrw \
-M 50 -t 10 -c 0xE

You can run the following test script test/nvmf/target/nvmf_vfio_user.sh to have a quick test,
currently enabled with NVMe Identify,Perf,Reconnect tools.

Change-Id: Ieb9842b2f372184fffbf7f23e4aad26feb47c350
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3839
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-01-21 05:00:18 +00:00
Changpeng Liu
6308a24f56 nvme: add a new custom transport SPDK_NVME_TRANSPORT_VFIOUSER
The new custom transport can enable NVMe driver running with
NVMe over vfio-user target.

Change-Id: I5f90e8516eaca08fc3eab658b29b760a03326ff7
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5996
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2021-01-21 05:00:18 +00:00
sunshihao520
5fba455d3e lib/nvme: add two async API for Directive Send and Directive Receive.
Add two async API for Directive Send and Directive Receive.
	spdk_nvme_ctrlr_cmd_directive_send;
	spdk_nvme_ctrlr_cmd_directive_receive;

Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Icb6974f74902df1512a5ffa9835188132634291b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5803
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-20 16:27:34 +00:00
sunshihao520
e69af13dea nvme: add directive definitions to nvme_spec.h
Add data structure needed by directives to nvme_spec.h

Signed-off-by: sunshihao <sunshihao@huawei.com>
Signed-off-by: Weifeng Su <suweifeng1@huawei.com>
Signed-off-by: xiehuiming <xiehuiming@huawei.com>
Change-Id: I0cbc53fb4f7ca448ab0566d826a7ae8800bbf504
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5683
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-20 16:27:34 +00:00
sunshihao520
1572882a7f lib/nvme: add function spdk_nvme_bytes_to_numd to calculate number of dwords
According to kernel, use an inline function spdk_nvme_bytes_to_numd
to transfer paload_size form bytes to numer of dwords.

Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: I8b9ded122bbf4a3c8e46988993ea52404783c0b0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5926
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2021-01-20 08:45:43 +00:00