bdev_nvme_add_trid() is used once in the same file.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I738b7e959d13d419e090a019da6e7f1f09aa7109
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4388
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
PCIe failover is not supported yet and it is not possible that the
controller is included in g_skipped_nvme_ctrlrs for failover cases.
We can consolidate the code to create failover path in bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ide456707a0b66a4b26eaf66392b9b53c3e659d9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4387
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This simpifies the code a little without any degradation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6006266e250f2d083aa12424809d04fac6a9f8f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
_bdev_nvme_submit_request() already has a pointer to nvme_io_channel.
Passing nvme_io_channel simply to I/O APIs will be reasonable
simplification.
Two admin APIs, bdev_nvme_abort() and bdev_nvme_admin_passthru()
get the associated thread to the spdk_io_channel and set it to
nvme_bdev_io. These APIs get spdk_io_channal from nvme_io_channel using
the container_of macro. These APIs are not performance sensitive.
Using the contaier_of macro will be better than extracting the operation
from these APIs because we can keep the changes local to these APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I67985075d70551e4fe3bb84dfee15f1e4e882451
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4384
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When NVMe bdev module supports multipath, nvme_bdev will have multiple
nvme_bdev_ns, and I/O APIs need only nvme_bdev_ns. So change the parameter
from nvme_bdev to nvme_bdev_ns for I/O APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1e71abbab856112fdd6a333a89ba5f02f4e63ccd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4382
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a preparation to pass nvme_bdev_ns instead of nvme_bdev
to I/O APIs.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I095478bc3ad305ba940fca101b74e74c7e8e74ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4381
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Struct nvme_bdev has a pointer to nvme_bdev_ctrlr and a pointer to
nvme_bdev_ns now, but nvme_bdev_ctrlr can be accessed via
nvme_bdev->nvme_ns->ctrlr, and so nvme_bdev->nvme_bdev_ctrlr is
duplicated.
The upcoming multipath feature will need struct nvme_bdev to have
multiple nvme_bdev_ns pointers.
Hence remove nvme_bdev_ctrlr from struct nvme_bdev and get a pointer
to nvme_bdev_ctrlr from nvme_bdev->nvme_ns->ctrlr.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie5b785cb26e535c7186fe630c1f8b7d598b0ba46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
LOCAL_SYS_LIBS is meant to define *direct* system
library dependencies for a given library. libuuid
is directly used by the SPDK util library and then
other SPDK libraries use uuid indirectly through
util.
So only the util library should include uuid in
LOCAL_SYS_LIBS.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0d2d63f48e6f89891164cf2f9dc4c7a6476d4e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We have some RPCs defined in the bdev library itself,
others in a separate bdev_rpc library. There's no need
for the separate library - just move them all into the
bdev library.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I298eedb88924197e64eb315369efb10f402903a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4364
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The bdev layer nicely handles the case where we call this
function with the buffers already present - it just
immediately calls the get_buf_cb. But this adds extra
overhead in the case where the buffer is already present.
Since nvme has no alignment restrictions, we can just
check the iovs directly and avoid the extra call to
spdk_bdev_io_get_buf when possible.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I66df0fde574a35e995a3432999d75bdbf9b27212
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
write instead of writev
Call spdk_nvme_ns_cmd_write_with_md if there is only one iov element.
Use spdk_nvme_ns_cmd_writev_with_md only if there is more than one.
This is about a 15% improvement in I/Ops.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1a99b5507d37cde04e81feada65b14554ad01a17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
readv API
It's cheaper to call spdk_nvme_ns_cmd_read_with_md than it is to call
spdk_nvme_ns_cmd_readv_with_md, so do a quick check of the iovcnt and
use the best one for the job.
This is about a 15% improvement in I/Ops.
Change-Id: I82e6677d1ac47abf9919f95e651e7a7595c5e9a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4316
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It aims to avoid bdev starvation by setting separate
io_context to each bdev_aio io channel and link them
into group channel.
Previous, each ch of aio_bdev on a same spdk_thread shares
same io_context. If one of them submits too many IO to the
io_context, then other channels will be starved.
Moreover, if the first bdev_io gets NOMEM error, then the
following IO of this bdev have no chance to get processed.
The bdev_aio starvation can be reproduced by bdevperf in
v20.04, by:
./bdevperf -q 100 -w randwrite -t 5 -c hdd.conf -o 1048576
(hdd.conf defines 2 aio_bdevs of HDD)
Change-Id: Ic709323f5baeb2f8f7250b75ad872ec4156b5e78
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3808
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Broadcom CI
Not sure how this one was missed back during the big rename party.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia00819a2c2e8c81ad0499f4081667dde04fa726e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4140
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In most cases, there is single element for iov array,
so using rbd_aio_read/write is enough.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Id3ccbb4a4ff10be9d4546c8e89e6dc596d0f3a72
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4122
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The timer poller could add the latency since we poll it every 50ns.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia3d0ec56658b9365fdd34b67663a3365eb738451
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
It is a good enhancement for the GPT vbdev module to support
interleaved metadata but interleaved metadata is not critical and
the priority is low.
However the following error log is always collected when there is
any bdev with interleaved metadata.
vbdev_gpt.c: 552:vbdev_gpt_examine: *ERROR*: GPT module does not
support block size 4224 for bdev Null0
This does not cause any failure but confuses user. Hence, change
the corresponding print from ERRLOG to DEBUGLOG.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id58507cbdc830d2b139fb34faa29e08d73c16141
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4058
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
When using --dif-type option --md-size should be
required as well.
Update & improve bdev_null_create rpc.py help
messages as well.
Change-Id: I6588a97aef6c8792bab7a41ece17c0461bb36844
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3949
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In function's error processing, the io_channel should
also be put back.
Change-Id: Ibd3a616244a77fcc29a5ec5d1d1ca53f5412e8c6
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3859
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch is used to implement the group polling
policy instead of each rbd has one poller.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ieb975e656240bcdaf2657410f010d72b156639ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3698
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can use rbd_aio_readv and rbd_aio_writev to simplify the code.
Since it rbd_aio_readv/rbd_aio_writev are supported more than 2
years, so can replace rbd_aio_read/write without using "ifdef".
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I924a039110ceed59836c1b904b6e7f8db271ad62
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3697
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will allow us to use this RPC to detach only specific
paths from controllers.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib52e38aa7d4ea096418a6dc0328481c2e8db6c54
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3070
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This RPC was using snprintf which could potentially
truncate a bad argument silently without alerting the
caller.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I1795b2e4086cbaa45390bd11b189755818b5c9e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3613
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
According to customers's report, there still exists
the Rados related thread contention with SPDK's reactor thread.
So also meove bdev_rados_context_init into bdev_rbd_handle.
With this patch, QEMU + SPDK vhost + Bdev RBD 's performance
can be improved.
Change-Id: Ie26a140f8d2cfe1a98cd1737f4ff70e9319cdc11
Signed-off-by: Yaowei Bai <baiyaowei_yewu@cmss.chinamobile.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3699
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow us to further develop the ecosystem for multipath
failover support.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I24a8cf13e60e6cc0d5b6374da33c8a4e5b6c499a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3069
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This allows us to avoid creating a separate rpc just for multipath TRIDs.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4e83167eaf16e50a72efbd513333a4d09c52be61
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2884
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When we fail to process admin completions on a controller
attempt to failover to a previously registered trid
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I547bd010f4b339b2af7f2b33027cddad4b4926bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3045
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We are about to instrument some checks in this function
that will access the nvme_bdev_ctrlr struct directly so
pass that directly since the ctrlr is still accessible from
the nvme_bdev_ctrlr struct.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4e0d0d6c433149368e84275426362f8283903027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3340
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This is part of a larger series enabling failover at the bdev
layer for NVMe.
Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic5c128244699c1a47275145ca7e41aa5f1366259
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3044
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch paves the way for introducing a tailq containing
multiple alternative paths to the same controller.
Change-Id: I13d30c12b8e0ce38eae687f9e76740be1d11e4d1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3259
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Because of the Linux kernel has limitation, IORING_SETUP_IOPOLL is only
used for local devices (e.g., local files, pcie NVMe SSDs etc.). However,
it does not work for devices atttached from the remote. So in order to
make bdev uring generic, Let's do not use IORING_SETUP_IOPOLL to create the
uring.
Change-Id: I6aea1ff222a8a0d67ab040ada75aa0ef6730e725
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3587
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
During virtio_pci_dev_probe, if enum_cb fails, hw needs
to be released. But in bdev_virtio, if vdev fails after
initialization, it will enter the bdev destruction process
which call the modern_destruct_dev function and hw will
be released during the process. So we will encounter the
problem of hw being released twice.
Change-Id: I1e8116283cfd810dfb050f8928f4ecd4bb2d815b
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3566
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
During virtio_pci_dev_probe, if enum_cb fails, hw needs
to be released. But in bdev_virtio, if vdev fails after
initialization, it will enter the bdev destruction process
which call the modern_destruct_dev function and hw will
be released during the process. So we will encounter the
problem of hw being released twice.
Change-Id: Ifba35284c072355ba0e10428b597a1894d32d59e
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3564
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Fixes#1498
When shutting down the application, it was possible to
reference stale ocf_cache pointer. This was the case
when two or more vbdev_ocf devices were based on top
of single cache bdev.
This issue did not occur outside of the shutdown case,
since RPC only allows deletion of the vbdev_ocf.
This erases on disk metadata and next run of the application,
would not detect such vbdev_ocf.
Shutdown meanwhile works different, by first stopping
the instance of running "ocf_mngt_cache" and later detaching
"core" devices (the ones being cached). This prevented
erasing the on disk metadata and allowed for restarted
application to detect vbdev_ocf.
See patch (1292ef2) for details.
Since references to ocf_cache are copied between vbdev_ocf
[see start_cache()], the reference count inside ocf_cache
was limited to original ocf_mngt_cache_start() and
management queue creation. First call into ocf_mngt_cache_stop()
released all references to ocf_cache. Leaving other
vbdev_ocfs pointing to released memory.
This patch works around this issue by increasing ref cnt
on ocf_cache for each vbdev based on top of it.
It allows to call into ocf_mngt_cache_stop(), but not
release the memory for ocf_cache until last vbdev.
Note:
A proper redesign here is in order:
- either rearranging structures to be based around single ocf_cache,
rather than multiple vbdev_ocf instances
- better use of OCF API to reduce book keeping logic in vbdev
There are plans to implement detach/attach in RPC,
so it should be a focus during that effort.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I560a7fbb1c052bf53970e655bdb60803c561a252
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There are additional conditions which SPDK tracks,
that are known before issuing call to OCF.
Two main ones are:
- if vbdev->ocf_cache was not yet created [start_cache()]
- if the cache bdev was opened [attach_base()]
Both happen for the first cache bdev once. Then for
consecutive vbdev_ocf on same cache bdev, reference
will be copied.
This call will simplify checking both conditions.
Calling into OCF with NULL or stale ocf_cache pointer,
rightly so will cause issues with ASAN.
Related #1498
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib202c15bda4cbbffa1516c69168e8bfb80370047
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
To specify the desired logical block size. Must be 4K or 512.
If no block size is provided a default of 0 means to use the
underlying bdev block size. For cases where something other
than 4K or 512 is desired, format the underlying device
accordingly and don't specify a logical block size on creation
of the compress vol.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I58b71e210cfa77b3237c0c454585c734e2e22aea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3177
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This was sort of a clunky interface requiring a couple of inline
functions in every app that wants to use the accel_fw moving
forward. By having the accel_fw public API accept a callback arg
instead of an accel_task combined with adding a pool of accel_tasks
in the accel_fw engine we can eliminate this.
After changing the parm to a cb_arg, changes were made to all accel_fw
interfaces to put cb_fn and cb_arg as the last parms in public and
private function calls.
Related bdev_malloc changes need to be in this patch in order to pass CI.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I2b75764e534562d91484a094c3352266156d8425
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3209
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
With this patch, spdk_app_start/stop can be repeatedly
called by users based on their upper level application's
requirement.
Changes are:
* Add reinit ability inside spdk_env_init and related functions
* Clear g_shutdown_sig_received in spdk_app_setup_signal_handlers
* Clear malloc_disk_count in bdev_malloc_initialize
Change-Id: I2d7be52b0e4aac2cb6734cc1237ce72d33b6de0c
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2260
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Currently we provide a warning to users that the specified block
size doen not match auto-detected for AIO device, e.g.: the AIO
device uses 512B as the block size while users input 4096B when
creating SPDK AIO block device. However, the data alignment
required by kernel AIO device can use the auto-detected one,
we are over-committed for the alignment requirement, so here
just use the auto-detected block size.
Fix issue #1478.
Change-Id: Ia5d02bb66980cbe55ea7039488189d409315552e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Bdev underneath lvs has a certain value set as block size,
then lvolstore inherits that as `io_unit_size`,
which ends up as block size for the lvol bdev presents.
Before this patch lvs on device with block size of 512,
would create lvol bdev with block size of 512.
Meanwhile reporting in bdev_lvol_get_lvstores(),
that block_size is 4k instead.
This was result of addition of io_unit_size and
forgetting to update the value here. Previously
only I/O of page_size was allowed.
This patch adjusts the value reported to
io_unit_size.
Alternatively it could have been removed,
or name changed to "io_unit_size".
Yet for API compatibility reason this name
remains.
Info on actual page size which is always 4k,
is not useful to bdev users. No need to add
another field for that.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I19701c408c389f64e25c027c7bba789294bbed94
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3289
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Use logical block size instead of base bdev block size in
vbdev_compress_claim. This is relevant to enable the compress bdev to
present a logical block size that is different from the block size of
the base bdev (e.g. 512B logical on 4KB base bdev).
However, this is just an internal fix/preparation and there is
currently not yet a way for users to specify a different
logical block size, e.g. through an rpc.py parameter.
Signed-off-by: Sven Breuner <sven@excelero.com>
Change-Id: Id7562f0da234ce11242b7ee78837a317ee51c211
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2858
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change NVMe bdev module to enable abort as IO type.
Change _bdev_nvme_submit_request() to process abort request when the
IO type is abort.
The current thread tries aborting I/O command in the I/O qpair first.
If no I/O command to abort was found, send message to the thread which
is registered when creating controller. The controller thread tries
aborting admin command in the admin qpair next. If no admin command
to abort was found, complete the abort request with failure.
spdk_nvme_ctrlr_cmd_abort_ext() is used to try aborting command whose
cb_arg matches. qpair is set to NULL when trying to abort admin command.
Before calling spdk_nvme_ctrlr_cmd_abort_ext(), save the current
thread to process admin command completion correctly.
spdk_bdev_abort() supports any bdev module other than NVMe bdev
module and does not check CDW0 but checks only if the completion
status is success or failure. So add bdev_nvme_abort_done() and
converts the NVMe completion status to the bdev completion status.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If6aebae0ba2f6c5834ee926e161af9c4d825f341
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2040
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Add thread pointer to struct nvme_bdev_ctrlr. The thread which
created nvme_bdev_ctrlr is set to the pointer.
The thread pointer will be used to limit only one thread to submit
admin abort.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia39d5cbc7a13b0e0022c0d5591069ea8776ef774
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3244
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>