nvme_ctrlr will be registered as io_device even when multipath is
supported. Hence while spdk_for_each_channel() is executed in reset
processing, we can get nvme_ctrlr both spdk_io_channel_iter_get_ctx()
and spdk_io_channel_iter_get_io_device(). This duplication is not
necessary. Use spdk_io_channel_iter_get_io_device() and pass NULL
to the context parameter of spdk_for_each_channel() for clarification.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifdbd0af4274081c4be7ab0735eb8bf9ae10e3493
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8588
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The current nvme_ctrlr will be registered as io_device even when
multipath is supported. Then we do not have to differentiate completion
processing between reset from bdev_io and internal reset. Hence
inline bdev_nvme_reset_io_complete() into bdev_nvme_reset_complete().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ife2c4c93d423da3953174ac860485a6e095a28bd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8587
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When we support multipath, reset_io will hold the controller currently
being reset to reset all underlying controllers sequentially.
bdev_nvme_submit_request() basically passes nvme_bdev_io to each I/O
type, and we have a convenient helper function bdev_nvme_io_complete()
which has nvme_bdev_io as a parametetr.
So revert the previous change to bring nvme_bdev_io as context
for reset processing.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I19697e8252505bab519a42889d1a88d967932f22
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8586
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reset requests from the upper layer will reset the underlying
ctrlrs of a bdev ctrlr but internal reset requests will reset only
the specified ctrlr.
To clarify such difference, rename bdev_nvme_reset() by
bdev_nvme_reset_io() and remove the underscore prefix from
_bdev_nvme_reset() and related functions.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9a2d124f6e2039bfecfdd6599827354d6c373f2e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8492
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvme_bdev_first_ctrlr() and nvme_bdev_next_ctrlr() were not possible
to hold mutex correctly, and nvme_ctrlr_get() and nvme_ctrlr_get_by_name()
had not held mutex.
nvme_bdev_first_ctrlr() and nvme_bdev_next_ctrlr() were replaced by
nvme_ctrlr_for_each() in the last patch.
In this patch, add mutex to three helper functions, nvme_ctrlr_get(),
nvme_ctrlr_get_by_name(), and nvme_ctrlr_for_each().
Add mutex to nvme_ctrlr_create() but it will be removed in the
following patches because nvme_ctrlr will be added to not global
ctrlr list but ctrlr list per nvme_bdev_ctrlr.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ife27066d2dcac82db0616b0afeaf68e5705d7da1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8722
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Replace two helper functions, nvme_bdev_first_ctrlr() and
nvme_bdev_next_ctrlr() by an new helper function nvme_ctrlr_for_each().
This will make us easier to guard data structure correctly in the
following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ibd81286e454fd6127fd150a7d48d8381bd1b89d3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8721
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This object aggregates multiple I/O qpairs for their completion
operations and may be a higher layer object. However, the
aggregation is only to poll completions efficiently. Hence if we
follow the new naming rule, nvme_poll_group is better than
nvme_ctrlr_poll_group and nvme_bdev_poll_group, and rename
nvme_bdev_poll_group by nvme_poll_group.
Besides, many functions in NVMe bdev module have a naming rule,
bdev_nvme + verb + objective
Follow this rule for a few functions related with nvme_poll_group.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5e5cb6001d4a862c2121b7265cbbffe0c2109785
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8720
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This object is per I/O path and will be aggregated by an new upper
layer object.
Hence rename nvme_bdev_ctrlr by nvme_ctrlr. Then the following patches
will add nvme_bdev_ctrlr as a different upper layer object.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ieed634447785cc98140b3d49c52a2c753988ece7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8381
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This object is used for failover and per I/O path. A controller may
have multiple of this object. A controller is per path and may be
aggregated by an new object. Hence this object is a lower layer
object.
Based on the new naming rule, rename nvme_bdev_ctrlr_trid by
nvme_ctrlr_trid.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d5e5812560a6947a0c25af05dea168e8745130e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8380
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This object will be aggregated by the upper layer object nvme_bdev.
Hence based on the new naming rule, rename nvme_bdev_ns by nvme_ns.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I96a70213b29fb53437acd080a0787ec9f5a6759a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8379
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We will name the lower level objects starting with nvme_* and the
upper level objects starting with nvme_bdev_*.
This object is a channel per ctrlr and another new channel will be
added on top of this object.
Rename nvme_io_path by nvme_ctrlr_channel based on the new naming rule.
nvme_io_path will be used for a new object which is used to find an
optimal I/O path and to reset multiple ctrlrs sequentially when
multipath is supported.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1d4fa6d4625de3413d629a1ff412e00de12dfaf4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8378
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
NVMe specification defines namespace identification descriptors i.e.
EUI64, NGUID, UUID.
BDEV abstracts NVMe specific details that is why only UUID is exposed,
however if NGUID is supported it is prefered to identify namespace
with NGUID over UUID.
If NGUID is not supported by NVMe Controller then fallback to UUID.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: If51889a3664c0daa7cbe983048231793e3c502e0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8627
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_nvme_ctrlr_free_io_qpair now always returns 0, so
update the code to account for that.
Fixes issue #2012.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I61c78459472573adbfeb28052ae3379d7880567c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8660
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When entering the if case to order the list, there is bug should be
fixed. The original code does not address this.
The way this happens is when there is a connection left in the socks_with_data list
between polls and there are enough new events detected that it would exceed the
maximal number of events. A connection is left on this list between polls if it isn't
fully drained via reads by the upper layer on each poll loop.
Currently, the maximal socket event num is 32. Then we did not hit this issue
in our normal test cases. But when you use NVMe-oF tcp target to test which is
described in #2105, there are more than 32 active sockets, and it exceeeds
the maximal num of events of polling (32), so we will trigger this issue.
Fixes issue #2015
Change-Id: I9384476fdba8826f5fe55a5d2594e3f4ed3832ba
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8541
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Without it multiple threads can race and end up sharing a device
when the intention is sharing only after full round robin.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I29b854ff837d56078bc033802d3df244728a29aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8187
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Recent work identified race conditions having to do with the
dynamic flow control mechanism for the idxd engine. In order
to both address the issue and simplify the code a new scheme
is now in place. Essentially every DSA device will be allowed
to accomodate 8 channels and each channel will get a fixed 1/8
the number of work queue entries regardless of how many
channels there are. Assignment of channels to devices is round
robin and if/when no more channels can be accommodated the get
channel request will fail.
The performance tests also revealed another issue that was
masked before, it's a one-line so is in this patch for convenience.
In the idxd poller we limit the number of completions allowed
during one run to avoid the poller thread from starving other
threads since as operations complete on this thread they are
immediately replaced up to the limit for the channel.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I913e809a934b562feb495815a9b9c605d622285c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8171
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In uring_sock_create(), we loops through all the addresses available.
If something is wrong, we should close(fd) and set fd to -1, and
try the next address. Only, when one fd satisfies all conditions,
we will break the loop with the useful fd.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I22eada5437776fe90a6b57ab42cbad6dc4b0585c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8311
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The reference count of nvme_bdev_ctrlr will be used to update ANA log
page safely, and nvme_bdev_ctrlr_destruct() can be used to decrement
reference count after completing ANA log page update.
However, nvme_bdev_ctrlr_destruct() is not a good name for this case.
Furthermore, nvme_bdev_ctrlr_destruct() does not set the destruct flag
to true, and the next patch will need nvme_bdev_ctrlr_acquire().
Hence rename nvme_bdev_ctrlr_destruct() by nvme_bdev_ctrlr_release().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I285b7ab0963d0f4ea4a7a9fd29bd026d37ba8460
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8334
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Factor out registering nvme_bdev_ctrlr as io_device and populating
namespaces after creating nvme_bdev_ctrlr into a helper function.
We extract spdk_io_device_register() from nvme_bdev_ctrlr_create()
because free(NULL) is correct but spdk_io_device_unregister(NULL) is
not allowed, and hence it is very simple if we call spdk_io_device_register()
only after nvme_bdev_ctrlr is successfully created.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia4d85ccf96f3ef62e51db9d08ec606d4100c7ebd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8317
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reorder a few operations and increment nvme_bdev_ctrlr->num_ns
after allocating nvme_bdev_ctrlr->namespaces[i] successfully.
Then unify the goto label for error cases to err and the err label
simply calls nvme_bdev_ctrlr_delete().
There is one noticeable change in this patch. Previously the
controller had not been detached when creating nvme_bdev_ctrlr failed.
However, after this patch, the controller will be detached when creating
nvme_bdev_ctrlr failed. This will be reasonable change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifd8c4649036f1c5e5cd688f89727b2bd2e982735
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8316
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Consolidate populate_namespaces_cb() calls for error cases into
connect_attach_cb(). Then remove ctx parameter from
bdev_nvme_add_secondary_trid() because it is not necessary now.
The next patch will inline _nvme_bdev_ctrlr_create() into
nvme_bdev_ctrlr_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia94f456df160c1cc874acac4c70aad27102cb0b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8314
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In posix_sock_create(), we loops through all the addresses available.
If something is wrong, we should close(fd) and set fd to -1, and
try the next address. Only, when one fd satisfies all conditions,
we will break the loop with the useful fd.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: Icbfc10246c92b95cacd6eb058e6e46cf8924fc4c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8310
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
In blobfs_fuse_start(), bfuse->bdev_name and bfuse->mountpoint
are allocated by calling strdup(), which may return NULL.
Here, we will go to err if strdup() returns NULL.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Change-Id: I0599254b3436a310ddd26732312281f07a4972ec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8303
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
After reading the code in detail, I think that we should
not set pipe_has_data= true and socket_has_data at the same time.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I8f9f96b16f4f0e0c585877a0dd687a240252a7cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8283
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Set nvme bdev physical block size based value of NPWG and NAWUPF namespace
field.
The logic to set bdev phys_blocklen is based on how Linux nvme block driver
sets it. If the underlying nvme namespace supports NPWG/NAWUPF then bdev
phys_blocklen is set to min(npwg, nawupf)
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I6d254a9e730dccc230b9db4d1217bf7ab2f39b6c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8224
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Upcoming patches will add support for accel_perf tool. Also
following will come vectored support and batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I43bd11b8efe40e6df0e2c8bd2995b9a9341f6457
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8142
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In order to avoid latency imbalances, the user can specify a cpu mask
on which the poll groups should run. This code update added data structures
to control set of CPU cores.
Change-Id: Iaf69d75da2fc6fed350d97d11027ce09e9432210
Signed-off-by: Yuri <yuriy.kirichok@hpe.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5610
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Also add scripts/bpf/nvmf.bt to enable and log these
probes.
This patch also adds a script that can generate
a bpftrace script snippet with string maps for
needed enumerations (currently nvmf_tgt_state and
spdk_nvmf_subsystem_state). This allows us to
dynamically generate this from the source code, and
can be extended for other enums we may want to
add in the future.
Thanks to Michal Berger for converting my original
gen_enums.py script into gen_enums.sh!
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Michal Berger <michalx.berger@intel.com>
Change-Id: Iff34a6218aef40055ac14932eea5fc00e1c8bcf5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7194
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Adding iov to the spdk_bdev_zcopy_start function enable spdk_bdev_zcopy_start to
be used by transport layers as the iov is owned by the transport command
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: I6d2be7f49566048bf25b7711ada8d2fb49fea6ee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6816
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We will print a notice log if the drive can support SECURITY
SEND/RECEIVE commands but not OPAL, so remove the error logs.
Change-Id: Ib26aa727ad1e703d53c387af8507b920606ea9c6
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8055
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
By the next patch, nvme_io_channel will be used as an I/O channel
to a single nvme_bdev. This channel is created to a single
nvme_bdev_ctrlr and has a corresponding I/O qpair. nvme_io_path
will be a better name especially when we support multipath.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic34162f3c383676c5249396a09173329fc6febce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8095
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously only a single thread could submit abort commands for admin
commands and it was the thread of the corresponding controller.
When we support multipath, we need to traverse the list of controllers
to which the target admin command is submitted. Threads of controllers
may be different.
On the other hand, the previous implementation made the I/O flow very
clean, but the I/O flow will not be clean if there are many controllers
and the subsystem does not have its thread.
This patch changes the policy so that any SPDK thread can submit abort
commands for admin commands.
Then when multipath is supported, we will be able to traverse the
list of controllers simply on the current thread to abort either I/O
command or admin command.
We already are able to submit any admin command on any thread anytime
including abort command. Hence this will not cause any issue.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib69de33f2e84b03861c7d95ce060035bdb589e4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8121
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
It is unlikely that managing namespaces by nvme_bdev is complicated.
Hence we do not need the helper function nvme_bdev_to_bdev_ns().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I77b4dcd12b2f2a219f58e5bc7b7e51dd10635da4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8118
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
We can hold bdev_io directly in nvme_bdev_ctrlr as an outstanding reset.
We can put spdk_bdev_io_from_ctx(bio) into a parameter for a few
functions because it is used only once in a function.
Passing not spdk_bdev_io but nvme_bdev_io to bdev_nvme_verify_pi_error()
remove unnecessary substitution.
This is a little more efficient and simplifies the implementation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If49ad9fa42abf27decf3afcd8c994f55faa3bc70
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8094
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Following Linux NVMe host, add UUID and EUI64 comparison to
bdev_nvme_compare_ns().
Besides, previously the return value of memcmp() had been used as
the return value of bdev_nvme_compare_ns() and this was wrong.
Fix it in this patch together.
Add unit test cases for bdev_nvme_compare_ns().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I069ab53e77741d6348b847d51e84a9338e2f3787
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7755
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use the macros for red black tree provided by Free BSD to speed up bdev
name lookup in spdk_bdev_get_by_name().
In the bdev_multi_allocation test, we can get 3x ~ 5x speed up when
creating multiple bdevs for various bdev nums.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I49a2fbcccf06d4c36cbd445ce59e0b0dd4ada31d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7837
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Purpose: Let the users know the current available registered Rados
cluster and the related info.
Change-Id: I115c129ae6e4b0372579aad168fd88f8be136357
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7990
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is useful for applications even if they elect not to use the SPDK
event framework.
This doesn't shift everything in one go - just the subsystem
initialization logic. Configuration file loading also needs to move
in a separate patch later.
Change-Id: Id419df1045442d416650ed90e5ee78adfdd623d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6641
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Checked the definition of strncmp. If substring s1
is found in strnmp(s1, s2, len), then it will return 0.
For the len value, it is better to use strcmp. Otherwise,
if s1=cluster1, s2=cluster & len=strlen(s2),
strncmp will return 0. But they are two different strings. For
cluster names, they are different.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I15a06184d834cd1567b329d0322cd6bdea6fee4b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7991
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
bdev_nvme_compare_ns() will be used to check if all namespaces of
one nvme_bdev are identical, and this is the convenient location.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I3fa6072c1cceec53268e53bf398fa1e8f069035e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7169
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This will be helpful to simplify the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I66939f2953c66582bfcb79cfe187814280e89680
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7324
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When multipath is supported, a subsystem has multiple controllers and
bdev_nvme_create_cb() will create a channel per nvme_bdev_ctrlr by
iterating the list of nvme_bdev_ctrlrs, and will hold lock while
doing it.
If the code to access nvme_bdev_ctrlr is put in a place, the
following patches will be easier and smaller.
Hence reorder the code of bdev_nvme_create_cb() as a preparation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2f2e66758c3374c678cc44bbb0116f4611c6753a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7754
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Factor out spdk_bdev_io_complete() calls into a helper function
bdev_nvme_io_complete().
This simplifies the code a little and will be helpful for the following
patches to retry I/Os. These are not performance critical but we
specify inline explicitly by following bdev_nvme_io_complete_nvme_status().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9fafacfd8571c037c3bc34382c251317309da334
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7497
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Factor out spdk_bdev_io_complete_nvme_status() calls into a helper
function bdev_nvme_io_complete_nvme_status().
This simplifies the code a little and will be helpful for the following
patches to retry I/Os. Specify inline explicitly to avoid performance
regression.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1ac451486e1c6a4401842490411e986fac191d59
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7484
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
bdev_nvme_submit_request() calls spdk_bdev_io_complete() with failed
if bdev_nvme_reset() returns negated rc other than -ENOMEM.
So let bdev_nvme_submit_request() process it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2569634ff0f18fb433cb685de1366e43abf5a9fe
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7524
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is called only in a place. Inlining this into bdev_nvme_submit_request()
will simplify the following patches by removing unnecessary cast.
Besides, use -ENXIO if I/O path is not found. This will be better than -1.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib6c38f89db1c1e651941aad18d31dd0891f380de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7871
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
I/Os will be retried if spdk_bdev_io_complete() is called with
SPDK_BDEV_IO_STATUS_FAILED. To do it easier, consolidate exit paths
of bdev_nvme_get_buf_cb().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0a67b88a107d616c5a5b0fc5ff963ad1402f5651
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7487
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Move from a single flag indicating that the socket is on the
pending_events list to two flags - pipe_has_data and socket_has_data. If
either flag is true, the socket is on the socks_with_data list.
This is necessary to track enough state to avoid doing extra recv()
system calls.
Change-Id: I65e5701dccb0a5bade19f266f164f26706b110d4
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7595
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Revise bdev_rbd_create rpc call to add an optional
parameter "--cluster-name", e.g., "--cluster-name Rados".
Then users can create a rbd bdev with registered
Rados Cluster. This shared strategy can be used to
remove the thread creation overhead if multiple rbds
are connected to the same Ceph cluster.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ide5800f8fc6b2074805272a59731c666fe279b9a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7584
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch is used to add two rpc calls:
bdev_rbd_register_cluster
bdev_rbd_unregister_cluster
Then in the next patch, rbd bdev constructed on the same cluster object
can share the common Rados_t structure in order to remove the thread creation
overhead and improve the scalability.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I898cc4ffabb8e6721ba5bef099cbf948c64d2c98
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7551
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The bdev layer can do the unmap split now based on the backend
device. For now we only use 1 unmap descriptor, the bdev layer
can help us to do the split.
Fix issue #1888
Change-Id: Iaf740bafd4f2bb4b108133fee2aafd2f53da9b2b
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7519
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also free scsi_task data structure for the asynchronous
libiscsi APIs.
Change-Id: I0bff706bfb795e51a4b10c357913ae66493dca5d
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7513
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If no ioat devices are registered, we should return.
Change-Id: I03435946716ef653b230515da32e8ccbdf5a188a
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7834
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The coredump info can be viewed in
https://github.com/spdk/spdk/issues/1935
We face this issue because no idxd device is attached, but we
still register the hw engine by spdk_accel_hw_engine_register
in accel_engine_idxd_init.
Fixes#1935
Change-Id: I537f06e2b2923faac7f2cd6a28903e77f1f6aaa5
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7832
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This makes use of newly added spdk_bdev_wait_for_examine(),
to only respond to RPC when bdev was fully examined.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If82cd913ab6653e8cc0da38c639b384b6c0303ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5482
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This makes use of newly added spdk_bdev_wait_for_examine(),
to only respond to RPC when bdev was fully examined.
Fixes#1760
Issue above was triggered in DD tests where application
finished before the examine had a chance to fully finish.
This patch addresses it by making sure that nvme attach
RPC waits for completion of the examine.
Later patch in series adds the bdev_wait_for_examine RPC
to multiple static configuration files. Making sure similar
issues do no occur for bdev modules which do not have changes
in their RPC as here.
The issue does not occur for JSON configs generated from apps,
see patch:
(e57bb1af)lib/bdev: build bdev_wait_for_examine into subsystem
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie3ca2933af97a40ae01ecc3eefe2161d2d34c602
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5483
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch is design to use the single Rados Cluster
object in the same RBD if there are multiple I/O channels created.
And this patch will be prepared for the next patch to share
the same cluster among different RBD bdevs.
Change-Id: I1509f29a9c1088da308a3f88980f0c7fed26476f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7601
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This module only requires the spdk_bdev_module
APIs, so use this #define to avoid having to
bump so version when spdk_bdev is modified.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I292c38e60cbf21d0d5a6583cb2a2b097c524f3d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7811
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Michal Berger <michalx.berger@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
- Use consistent cache line size units in KiB across RPC calls
and config files. The KiB units are much easier to use then
the bytes units and are more human readable.
- Properly handle cache start when cache line size is incorrect.
- Add test to check if cache line size value is reported correctly.
- Add cache line size info to JSON RPC documentation.
Fixes#1858
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Change-Id: Iec9ede85f6884b64605d2d112947b3f175cbd938
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7614
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Recently, checked the spdk_io_device_unregister function,
it will have deferred free behaviour, and the io_device
will possibly be freed in put_io_channel function.
And this means that it is not safe to directly call:
spdk_io_device_unregister (io_device, NULL);
Then free io_device relately resource.
Because there will be channel to use the resources associated
with io_device. Then we will possibly cause a NULL pointer access.
I found this issue in bdev rbd module, and I think that the
same issue could happen in other modules. So it is better to put the
resource free function as the call back function.
Change-Id: Icc1f86d72b672faefb3b7f416030b818a8cf45ce
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7646
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Passing cpu_mask hints that match only single core were
usefull to prevent any accidents when doing round-robin
in case of 'static' scheduler.
In practice this is not required in case of 'static' scheduler,
the threads will be spread out over all reactors anyway.
This hinders other schedulers which try to respect the cpu_mask
hints, as they would not move the thread to any reactor.
Preventing bunching up less used threads on single reactor.
Drawback of this patch is that poll group names will not match
the cores they are on.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5fb308362dd045228ea9fcca24f988388854c054
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7028
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Purpose: We will also support the kernel idxd driver, so we do not
need export this feature in the module file.
Change-Id: I965e031497920f527962ba187bccd81de6977b8f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There's no good reason to reduce the capacity by aligning it to the
number of optimal open zones. If such alignment is required by the
users of the zone block bdev, it should be done on their own layer.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic8614a82715e9f064619aa8fdb75d1a0b851490c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7656
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We cannot rely on DSM/DEALLOCATE as a write zeroes
alternative, even if DLFEAT reports that deallocated
blocks will be read as all zeroes. DEALLOCATE is
advisory, meaning that blocks may not actually be
deallocated. In cases where they are not deallocated,
they will not be read back later as zeroes.
QEMU 6.0 started reporting DLFEAT as returning zeroes
for deallocated blocks but for some of our write
zeroes tests, blocks aren't actually deallocated.
We may be able to add quirks in the future if we know
that a controller reliably deallocates blocks, but
for now we need to revert this completely.
Note that since bdev/nvme module now does not support
write zeroes in any cases, we need to disable the
write zeroes call in the unit tests.
Fixes issue #1932.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I79f0673774b621a9ffcc46891728cc7719e34cdb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7723
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
(Note: this patch was previously applied as b32cfc46 and then reverted
as 63642bef.)
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7739
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This reverts commit b32cfc467b.
This commit fails the ABI checks and only got through because the checks
were disabled until 21.04 hit.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id26b8f8ba551193d99b1ccbd31b35378b4095a20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7731
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7310
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When zcero copy send is enabled and used by initiator,
it could significantly increase latency in some payloads.
To enable more fine graing configuration of zero copy
send feature, add new parameters enable_zerocopy_send_server
and enable_zerocopy_send_client to spdk_sock_impl_opts to
enable/disable zcopy for specific type of sockets.
Exisiting enable_zerocopy_send parameter affects all types
of sockets.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I111c75608f8826980a56e210c076ab8ff16ddbdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7457
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
On host side the connections are created and then added to thread's
poll group. Those connections could use different NIC queues underneath.
To route all connections of poll group through single queue a unique
placement id is chosen as group_placement_id and each socket of poll
group is marked with group_placment_id using getsockopt(SO_MARK) option.
The driver could use so_mark value of skb to determine the queue to use.
Change-Id: I06bda777fe07a62133b80b2491fa7772150b3b5d
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6160
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
We do not want to do any further work on adding
the sock to the group if the epoll_ctl (or kevent)
fails.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I44b6dc86ce5676aa1b8d6c50b86f22758e4e37fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7594
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
suppression
When all of the following conditions are met:
- non-blocking socket
- zero copy is enabled
- interrupts are suppressed (i.e. busy polling)
- NIC tx queue is full at the time sendmsg() is called
- epoll_wait sees there is already an EPOLLIN event
then we can get into a situation where data we've sent is queued
up in the kernel network stack, but interrupts have been suppressed
because other traffic is flowing. This makes the kernel miss the
signal to flush the software tx queue. If there wasn't also already
a pending EPOLLIN event, then epoll_wait would have been sufficient
to kick the system out of this state. But when all of this aligns,
it hangs.
We deal with this by detecting the scenario and calling poll(), which
will force the kernel to issue the pending transmits.
Change-Id: Ifb247159b7de16c8fc72a90f0333f5b421c8bd07
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6750
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Only passing rbd pointer is enough.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I652581879c9ba228f9af0aa3f5353915a286c6c8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7548
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is used to fix the chained crc32c computing when users pass
a vector. Since we use a union in spdk_accel_task structure to differentiate
the usage on "src" and "the vector info" (iovs and iovcnt). So we cannot
directly write the src field while users pass a vector.
And I verified it in the hardware platform.
Change-Id: I85d6e86fa689b261782f80a2f89d908a5d4db84f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7471
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The original code does not free the I/O channel
in bdev_rbd_resize.
Change-Id: I8e48616bea37ba7cece5a179c0f1ca03b3f5cb53
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7466
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add the missing rados_shutdown call in
bdev_rbd_init function.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I0222d2453d1547cfd735f15237eca124cbc23479
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7430
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Busy pollng using recv() is dependent on kernel socket buffer being
empty. Instead poll() function busy polls hw queues with no such dependency.
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Change-Id: I1cb101848d51f7778cdf3d4c015d2d03201bdb37
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7014
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Since the maps are unique to modules, they can store the group_impls
directly.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7f11db558e38e940267fdf6eaacbe515334391c2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This allows for different policies per module, as well as overlapped
placement_id values.
Change-Id: I0a9c83e68d22733d81f005eb054a4c5f236f88d9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There was a fix for this that went into the posix layer, but the
underlying problem is the logic in the nvme/tcp transport. Attempt to
fix that instead.
Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This list will hold any socket that has some event pending and needs to
be part of the set returned during polling of the group.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I5acf01677e59c1026f93671c7b7b3dc458075bf7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6748
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Instead of iterating the list, we can just manipulate the list
in a single step.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I0172cdbce9af35a62d62dbccfac573e5d723f43a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6747
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Change-Id: I4ed583d91ae9e820be1ee6f4553f29d6650c4922
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5791
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It can divide to two parts:
1, UIO driver - sigbus error handling and uevent
process.
2, VFIO - request notify handling.
sigbus error process is in previous patch.
Change-Id: Idc09754b83ae9ddcaea1f2afcbc13e528ead9863
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5768
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Instead, move it down to the modules. This allows modules
to potentially change the value, if they are able.
Change-Id: I08f5fbadf5d1e96b489ddaaca72aa051ce2cb85c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7212
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This value is already available in the options structure.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I140dc79da1fa5f155a39f1f9e2f54f46d93b6c1c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7211
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Updated to represent default value of number of errors injected.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Change-Id: If3f7a5f4a222fa1180635beb671d9479eede0435
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7246
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Check if qpair has a poll group during the connect process,
use poll group's statistics or allocate own structure per
qpair. That is done due to not all applications use poll
groups and we want to avoid "if (qpair->group)"
conditions in data path.
Admin qpair always allocates its own statistics
structure but the statistics are not reported
since this qpair is not attached to a poll group.
Statistics are reported by spdk_nvme_perf tool
if --transport-stats and in bdev_nvme_transport_statistics
RPC method.
Change-Id: I58765be161491fe394968ea65ea22db1478b219a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6304
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Factor out the operation to get ctrlr pointer to submit an admin
command into a helper function bdev_nvme_find_admin_path().
This will make the following changes transparent.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idabfa5874ad95414f24d414702205b0cdbc98ddb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7105
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
reset_io is stored into nvme_bdev_ctrlr->reset_bio now. Hence we can
merge completing reset_io into _bdev_nvme_reset_complete() easily.
i
As a few minor changes, to reduce the size of the following patches,
clear reset_bio before calling spdk_bdev_io_complete(), and call
spdk_bdev_io_complete() after completing ctrlr reset.
The following patches will retry pending reset_ios if reset is internal,
abort reset_ios if reset is external and fails, or succeed reset_ios
if reset is external and succeeds.
This clean-up will be helpful for such enhancement.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5e0b4e04b19f4f4f3b21b2db4dc6a2b948b29c67
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7241
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_bdev_ctrlr->reset_bio is accessed only by the thread which called
spdk_for_each_channel() and the callback to spdk_for_each_channel()
is called after unwinding stack via message.
Hence bdev_nvme_reset() can call _bdev_nvme_reset() rather than
_bdev_nvme_reset_start() and store bio into nvme_bdev_ctrlr->reset_bio
after _bdev_nvme_reset() returns zero.
Then inline _bdev_nvme_reset_start() into _bdev_nvme_reset() because
_bdev_nvme_reset_start() has only a single caller now.
The following patches will introduce subsystem and bdev_nvme_reset()
will reset all controllers of a subsystem sequentially. These further
clean-ups will be helpful for such enhancement.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I25abaa5be511c967ce20c92e4caa7dfeb3e09dd1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7240
Community-CI: Broadcom CI
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These are not performance critical and we can get the current thread
simply by spdk_get_thread().
This change will make the following changes a little simpler.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ide79b1275ec0122f7f9812fe53c08a83d0aafeda
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7162
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This change will make the the upcoming changes a little easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I51776c8ab91a0dfd9150f17e567f8be795b9f4b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7046
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Deleting OPAL device and OCSSD ctrlr after completing unregistration
of io_device will be safer and be helpful for the upcoming patches
to introduce subsystem.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7eb11d5c19bf50903496a7fb936c77aa05089046
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6889
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
As said in the previous patches, nvme_ns->ref is 2 at most, and
first is for populating namespace and second is for nvme_bdev.
On the other hand, nvme_ns->populated is for populating namespace
and nvme_ns->bdev is for nvme_bdev.
Preparation was done by the preceding patches. Let's remove nvme_ns->ref
and use nvme_ns->populated and nvme_ns->bdev instead.
We have unit tests for both normal case and shutdown case now.
So regression will be avoided.
These changes will be helpful for the following patches to support
multipath.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I279ca21a41d6ee2c07bbbeb62866e28423fcc6d0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7097
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Merge setting nvme_ns->populated to false into
nvme_ctrlr_depopulate_namespace_done(). This will simplify the
further changes.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I4bb8347b085ec5855690f6dbde243269cedee706
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7136
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following patches will remove nvme_ns->ref and use nvme_ns->populated
and nvme_ns->bdev instead because nvms_ns->ref is two at most and
each count of nvme_ns->ref corresponds to nvme_ns->populated or
nvme_ns->bdev.
To do that, we need to ensure nvme_ns->populated is cleared after
spdk_bdev_unregister() is called, otherwise nvme_bdev_ctrlr_destruct()
is called twice.
However OCSSD namespace had used nvme_ns->populated to free resource
after getting log page completes.
To keep such deferral, add the depopulate_pending flag to struct bdev_ocssd_ns
and use it. Then clear nvme_ns->populated in bdev_ocssd_free_namespace()
after spdk_bdev_unregister().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I93199dbd504145bd1e5ea59f5914422c6c5bc938
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7102
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
A few patches including this patch will remove nvme_ns->ref by
nvme_ns->populated and nvme_ns->bdev and remove nvme_ns->ref.
In this patch, set nvme_ns->populated to true when nvme_ns->ref is
incremented, i.e., nvme_ns is available. nvme_ctrlr_populate_namespace_done()
clears nvme_ns->populated if creating bdev failed by memset().
And add assert to nvme_ctrlr_depopulate_namespace_done() to ensure
nvme_ns->populated is false when nvme_ctrlr_depopulated_namespace_done()
is called.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I536933abb516c0a29231149ab564dc8b84245b10
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7096
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The nvme_ns->ref was added to nvme_bdev_ns to support both multipath
and OCSSD range. OCSSD range was to split a namespace into multiple
partitions.
However, OCSSD range was not used and removed to simplify multipath.
When namespace is not split, nvme_ns->ref is 2 at the maximum. First is
for populating namespace and second is for nvme_bdev. nvme_ns->populated
is for populating namespace and nvme_ns->bdev is for nvme_bdev.
Hence we can use nvme_ns->populated and nvme_ns->bdev directly instead of
nvme_ns->ref.
A few patches from this remove nvme_ns->ref and use nvme_ns->populated
and nvme_ns->bdev instead.
dThis patch inlines nvme_bdev_ns_detach() into the callers and remove it.
The following patches will adjust the locations to update nvme_ns->populated
and nvme_ns->bdev and then remove nvme_ns->ref.
Removing nvme_ns->ref will be helpful to associate multiple namespaces
into a single nvme_bdev for multipath.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I87581d39769681adc5e1fdf0b33680e31d958e47
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7095
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Moving nvme_ctrlr_depopulate_namespace_done() to common.c removes the
stub from bdev_ocssd_ut.c and will simplify the upcoming changes.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia94a88ca77518c7df08b52aa9298cdea5997ca1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7135
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
nvme_bdev_ns_to_bdev() was introduced because nvme_ns->bdev may be
removed to support multipath. However, it is unlikely now.
Hence it is easier to read to stop using nvme_bdev_ns_to_bdev() and
access nvme_ns->bdev directly.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iaf35455218761bafd03060a21110ad1c4098b1db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7098
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Factor out deleting secondary trid from bdev_nvme_delete() into a
helper function bdev_nvme_delete_secondary_trid().
This will make the following changes simpler.
Besides, fix a typo, the case should be not 1B but 2B.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iba21efa0d8036ed15d2743a2548df05e866089d6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7123
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This change separates failover case more clearly, and will make
the following changes simpler.
Merge spdk_nvme_detach() and populate_namespaces_cb() into
bdev_nvme_add_secondary_trid(). Then change the return type of
bdev_nvme_add_secondary_trid() to void and move the comment to the
head of it.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0eb706728cf1b9ad7031eb66e11880a47e2ba767
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7044
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Clean up bdev_nvme_add_trid() by factoring out comparison of namespaces
and comparison of trids into helper functions, bdev_nvme_compare_namespaces()
and bdev_nvme_compare_trids(), respectively.
Then rename bdev_nvme_add_trid() by bdev_nvme_add_secondary_trid().
Rename is for clarification and clean-up is for the next patch.
The next patch will merge spdk_nvme_detach() and populate_namespaces_cb()
into the renamed bdev_nvme_secondary_trid(). Clean-up makes the
next patch simpler.
One note is that checking if the type of trid is not PCIe is done
by holding mutex now to prioritize clean up.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idc8652329dd2c721d101a724ec1a57a66c4174a7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7094
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following changes will require nvme_bdev_ctrlr_create() to be
asynchronous. As a preparation, merge nvme_ctrlr_populate_namespaces()
and populate_namespaces_cb() into nvme_bdev_ctrlr_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I623f927da072d746b4bd83314c0f5bcb9cc34ef0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7043
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will reduce the size of the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I81d7c90d8676a42aff3ffe73932d5612642a4114
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7042
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Factor out the operation to destroy qpair into a helper function
bdev_nvme_destroy_qpair().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5ac9215407e0b4f4368a68bc58d8e3daac029393
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6783
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Pass nvme_bdev_ctrlr to callback to each spdk_for_each_channel() call
in reset processing.
The following patches will register nvme_bdev_subsystem instead of
nvme_bdev_ctrlr as io_device. Hence we need a different way to pass
nvme_bdev_ctrlr to the completion functions of spdk_for_each_channel().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ief31dbbc145b5850e4e5b2e4c9ad7287128ce870
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6950
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Hold the bdev_io which submits the reset request into the nvme_bdev_ctrlr.
as reset_bio. Then differentiate the callback function between
_bdev_nvme_complete_pending_resets() and _bdev_nvme_abort_pending_resets()
to spdk_for_each_channel() in _bdev_nvme_reset_complete().
The next patch will pass nvme_bdev_ctrlr to for_each_channel() instead.
The following patches will register nvme_bdev_subsystem instead of
nvme_bdev_ctrlr as io_device. Hence we need a different way to pass
nvme_bdev_ctrlr to the completion functions of spdk_for_each_channel().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If7a3386815429e8ed44f4e9e5365a21bd97e7fb6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6913
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The API spdk_nvme_ctrlr_free_io_qpair() returns immediately if the
passed qpair is NULL, but calling spdk_nvme_ctrlr_free_io_qpair()
with NULL should be avoided.
This patch cleans up the code to ensure that nvme_ch->qpair is NULL if
disconnected and spdk_nvme_ctrlr_free_io_qpair() is called only if
nvme_ch->qpair is not NULL.
Then add a test scenario that two reset requests were submitted
simultaneously and the first reset request failed and then the second
reset request also failed. This verifies the refactoring done in the
next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iae461f7f826b0e1a4607a17e528c04a642242d6e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7041
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add support for zoned namespaces in the bdev/nvme module.
Query the namespace type, and if a zoned namespace is found, set the
bdev struct members needed by the bdev_zone.h APIs.
Note that while vbdev_zone_block and bdev_ocssd have zone locks,
bdev/nvme does not need any kind of zone lock, because NVMe ZNS
supports append natively.
The whole point of the zone append command is that you don't need any
locks, even when having multiple writers, since the drive will return
back the Assigned LBA where the blocks were actually written.
If a SPDK application will utilize writes instead of zone appends,
the SPDK application itself will need a per zone lock in order to
make sure that there is only one outstanding write.
Adding zone locks in bdev/nvme would reduce the performance of zone
appends, for no reason at all.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ibacdb9b6e839624aba869c9949231343db5914be
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6911
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add support in bdev_zone.h for getting the maximum zone append data
transfer size.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I61203e64d51601232c6578a090fa52975364c1f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6910
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Currently there is possibility of adding a sock to pending_recv
list again if sock->pending_recv is true. Check if flag is false
before adding to the list.
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Change-Id: Ie23e1e8dbe1aa5594d9ddea30e7f235e3bf8ddad
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6381
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This seems to be cleaning up the pending_recv list to account for the
missed cases in the previous patches in this series. Now that we're
correctly cleaning up the list, don't do this.
Note that if an EPOLLIN event is received but the application never does
a read/recv, the socket will remain in the pending recv list. The next
poll will get another EPOLLIN event, but the logic already handles that
case.
Additionally, left a TODO for a performance optimization.
Change-Id: I1cdde500a5c76554401a89de766d35b7a486b207
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6746
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If there was an EPOLLIN event the socket gets adding to the pending_recv
list. But if the application then does a very large read, it will bypass
the logic that clears the socket from the pending_recv list. Fix this.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ia0ba86012f7c6dfd14eb43ba6eeed94dbbce90ce
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6744
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
from pending_recv list
If the upper layer performs a read/recv, it should still remove the
socket from the pending_recv list.
Change-Id: I32ca8ecccbfe1e53ecc7d6f57343c2727e84b851
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6743
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a check and only free g_hotplug_probe_ctx once.
This is to fix below issue:
https://github.com/spdk/spdk/issues/1849
Change-Id: I7387495890830badd06a7341d9ac4d149ca1c9f4
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7002
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This patch really uses the accelrated engine.
Currently, we only offload the crc32c caculation,
but it can be extended.
Change-Id: If0e4c6a44b6e1e10e03f7eca355bed418d67326b
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6760
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Separate bdev_nvme_reset() and _bdev_nvme_reset() by making
bdev_nvme_reset() call _bdev_nvme_reset_start(), and then
remove the ctx parameter from _bdev_nvme_reset().
This clarifies the next patch and reduces the size of the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I76b0f2f5b83445845a313203e594dca0be150bc3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6949
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We need to pass bio only when the reset request came from the upper
layer. Previously, bdev_nvme_reset() called _bdev_nvme_reset() and
_bdev_nvme_reset() had the second parameter, ctx, to pass bio to
for_each_channel(). However, it was not clean.
Let's separate bdev_nvme_reset() which processes a reset request
came from the upper layer and _bdev_nvme_reset() which proesses
an internal reset request by adding a helper function
_bdev_nvme_reset_start().
The next patch will remove the second parameter, ctx, from
_bdev_nvme_reset() for clarification and further refactorings.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I2bd00e3abb56fe8d540b8af55743ee92ca9cd526
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6948
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The recent refactoring removed the destruct poller and change the
reset processing to destruct ctrlr after its completion by conditionally
sending message.
But differentiating callback function is difficult if we reset multiple
ctrlrs.
If nvme_bdev_ctrlr->destruct is set, any new reset cannot start. So we can
use an new variable and always execute the callback function.
Add an new variable pending_destruct to struct nvme_bdev_ctrlr, and set
pending_destruct if ctrlr->ref is zero and ctrlr->destruct is true, and
then start destruct ctrlr if ctrlr->destruct_after_reset is set after
clearing pending resets.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9f34c42a40c5a5da54611e7871aef8c58117a56a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6714
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following patches will cache and use spdk_nvme_ns pointer rather than
nvme_bdev_ns pointer in I/O paths. As a preparation, change a few cases
to get spdk_nvme_ctrlr pointer from spdk_nvme_ns.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ief5f01f85f46e2ad822af5b5abb13a14f28a0c6a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6834
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will make the following changes a little easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I97ce4fb03d02679f5be152b04c33650bf5164f2a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6833
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Purpose: To setup an accelerated function callback
for created spdk_nvme_poll_group. In this patch,
we just create the interface. The real usage of this
call back will be provided in the other patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I0d936aa4eba4dbfcc0137942156b9f2919eb5b78
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6758
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously, not nvme_bdev_ctrlr but spdk_nvme_ctrlr pointer had been
passed to spdk_get_io_channel() in nvme_rpc_io_cmd_bdev_nvme().
There was no unit test for bdev_nvme_get_io_qpair(), and so add
unit test for it to guard us from degradation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I00fdc14ea6467162c5fb90d04389883d7c409a17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6920
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
OCF now includes on its own only small part of env, that is crucial for
compiling headers for casadm - an OCL management tool. The idea behind
this is to reduce the amount of included stuff when it is not needed.
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Change-Id: I58fef2be3ccb1cce743093ed973ba54cdbb605e1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6413
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This is the only reason the event.h header is included here
and it isn't necessary.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I4c0a9828cfede1ccd1c9263ad66b354ba167f434
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6919
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I191ad5e3b153fb563256eba1aa695716f66db788
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6377
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Purpose: Make the code clean and remove duplicated code.
Change-Id: I6cc8b94af24f89301531bd38d766afb53898e0ed
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6811
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
The next patch will add nvme_bdev_channel and we will not be able to
io_channel from nvme_ch anymore.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8ba93e66657d21da58df63f3ad6dde4b155fc174
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6612
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add mutex per nvme_bdev_ctrlr and replace g_bdev_nvme_mutex by it
for possible cases. These are just drop-in-replacement.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib1d5eebd65e514cfe96ad60a9004d9bace7b4f6b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6693
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This will make us easier to maintain the operation to delete
nvme_bdev_ctrlr and its trids. The added unit test cases guard us
from degradation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I400d4092020e89bacaebc7be045a456b8760ed8d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6688
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
By adding the second parameter, hotplug, factor out the common part
of remove_cb() and bdev_nvme_delete() into a helper function
_bdev_nvme_delete().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5becbc7a7a94e852510e6a2745afb9c533d21713
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6692
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
spdk_bdev_part_base_construct() is deprecated so this patch
moves remaining instances to spdk_bdev_part_base_construct_ext().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Iccd4b15933bec41885adfaf4fc4ce2abedd2d6c7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6627
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>