Commit Graph

425 Commits

Author SHA1 Message Date
Shuhei Matsumoto
96dbd5e427 bdev/nvme: Make bdev_nvme_add_trid() static function
bdev_nvme_add_trid() is used once in the same file.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I738b7e959d13d419e090a019da6e7f1f09aa7109
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4388
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
dde21d3bbb bdev/nvme: Consolidate the code to create failover path in bdev_nvme_create()
PCIe failover is not supported yet and it is not possible that the
controller is included in g_skipped_nvme_ctrlrs for failover cases.

We can consolidate the code to create failover path in bdev_nvme_create().

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ide456707a0b66a4b26eaf66392b9b53c3e659d9e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4387
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
cced29b595 bdev/nvme: Allocate context at the head of bdev_nvme_create()
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id4f8703dbfdd0f12263eef82e51b7df3d8e15d81
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4386
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
66f74eb9f7 bdev/nvme: Check if a controller with the same TRID exists first in bdev_nvme_create()
This simplifies the code a little without any degradation.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6006266e250f2d083aa12424809d04fac6a9f8f0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4385
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
7b9b1eefc0 bdev/nvme: Pass nvme_io_channel instead of spdk_io_channel as a parameter to I/O APIs
_bdev_nvme_submit_request() already has a pointer to nvme_io_channel.
Simply passing nvme_io_channel to I/O APIs is a reasonable
simplification.

Two admin APIs, bdev_nvme_abort() and bdev_nvme_admin_passthru(),
get the thread associated with the spdk_io_channel and set it in
nvme_bdev_io. These APIs get spdk_io_channel from nvme_io_channel using
the container_of macro. These APIs are not performance sensitive.
Using the container_of macro is better than extracting the operation
from these APIs because we can keep the changes local to these APIs.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I67985075d70551e4fe3bb84dfee15f1e4e882451
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4384
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-09-30 07:25:54 +00:00
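
To illustrate the container_of technique that commit 7b9b1eefc0 mentions, here is a minimal sketch that recovers an enclosing structure from a pointer to one of its members. The macro name `my_container_of` and the structs `inner_ctx` and `outer_channel` are hypothetical stand-ins chosen for this example; they are not SPDK's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: recover a pointer to the enclosing struct
 * from a pointer to one of its members. */
#define my_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for a module-specific channel context. */
struct inner_ctx {
	int queue_depth;
};

/* Hypothetical stand-in for a generic channel that embeds the context. */
struct outer_channel {
	int thread_id;
	struct inner_ctx ctx;
};

/* Given only the embedded context, return the owning thread id. */
static int
owner_thread_id(struct inner_ctx *ctx)
{
	struct outer_channel *ch;

	assert(ctx != NULL);
	ch = my_container_of(ctx, struct outer_channel, ctx);
	return ch->thread_id;
}
```

The lookup is a constant pointer subtraction, so a caller that only holds the inner pointer does not need the outer one passed in separately.
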
Shuhei Matsumoto
6e82c8f796 bdev/nvme: Pass nvme_bdev_ns instead of nvme_bdev as a parameter to I/O APIs
When NVMe bdev module supports multipath, nvme_bdev will have multiple
nvme_bdev_ns, and I/O APIs need only nvme_bdev_ns. So change the parameter
from nvme_bdev to nvme_bdev_ns for I/O APIs.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1e71abbab856112fdd6a333a89ba5f02f4e63ccd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4382
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
2b4896fe3b bdev/nvme: Pass DIF check flags as a parameter to I/O APIs
This is a preparation to pass nvme_bdev_ns instead of nvme_bdev
to I/O APIs.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I095478bc3ad305ba940fca101b74e74c7e8e74ba
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4381
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Shuhei Matsumoto
b123fb15fa bdev/nvme: nvme_bdev points only to nvme_bdev_ns and accesses nvme_bdev_ctrlr via nvme_bdev_ns
Struct nvme_bdev has a pointer to nvme_bdev_ctrlr and a pointer to
nvme_bdev_ns now, but nvme_bdev_ctrlr can be accessed via
nvme_bdev->nvme_ns->ctrlr, and so nvme_bdev->nvme_bdev_ctrlr is
duplicated.

The upcoming multipath feature will need struct nvme_bdev to have
multiple nvme_bdev_ns pointers.

Hence remove nvme_bdev_ctrlr from struct nvme_bdev and get a pointer
to nvme_bdev_ctrlr from nvme_bdev->nvme_ns->ctrlr.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie5b785cb26e535c7186fe630c1f8b7d598b0ba46
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4288
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-30 07:25:54 +00:00
Jim Harris
ed071dd9f9 build: cleanup LOCAL_SYS_LIBS for uuid
LOCAL_SYS_LIBS is meant to define *direct* system
library dependencies for a given library.  libuuid
is directly used by the SPDK util library and then
other SPDK libraries use uuid indirectly through
util.

So only the util library should include uuid in
LOCAL_SYS_LIBS.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia0d2d63f48e6f89891164cf2f9dc4c7a6476d4e3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4366
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-09-29 09:36:25 +00:00
Jim Harris
ebc227d9b2 bdev: move bdev_rpc library contents
We have some RPCs defined in the bdev library itself,
others in a separate bdev_rpc library.  There's no need
for the separate library - just move them all into the
bdev library.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I298eedb88924197e64eb315369efb10f402903a5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4364
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-25 11:43:42 +00:00
Jim Harris
acfb968aa8 bdev/nvme: avoid spdk_bdev_io_get_buf call when possible
The bdev layer nicely handles the case where we call this
function with the buffers already present - it just
immediately calls the get_buf_cb.  But this adds extra
overhead in the case where the buffer is already present.
Since nvme has no alignment restrictions, we can just
check the iovs directly and avoid the extra call to
spdk_bdev_io_get_buf when possible.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I66df0fde574a35e995a3432999d75bdbf9b27212
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4317
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
2020-09-23 08:12:23 +00:00
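
The check described in commit acfb968aa8 can be sketched roughly as follows. This assumes that "buffer already present" means the first iovec has a non-NULL base pointer; the helper name `buffer_already_present` is hypothetical and is not taken from the SPDK source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical helper: true when the caller already supplied a data
 * buffer, so no buffer needs to be requested from a pool. */
static bool
buffer_already_present(const struct iovec *iovs, int iovcnt)
{
	assert(iovcnt >= 0);
	return iovs != NULL && iovcnt > 0 && iovs[0].iov_base != NULL;
}
```

A submit path could branch on this result and request a buffer only when it returns false.
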
Ben Walker
7deedab66a bdev/nvme: For writes, if there is only one iov element call write instead of writev

Call spdk_nvme_ns_cmd_write_with_md if there is only one iov element.
Use spdk_nvme_ns_cmd_writev_with_md only if there is more than one.

This is about a 15% improvement in I/Ops.

Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I1a99b5507d37cde04e81feada65b14554ad01a17
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4319
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-09-22 11:41:30 +00:00
Ben Walker
3870092dd1 bdev/nvme: For reads, if there is only one element in the iov don't use readv API

It's cheaper to call spdk_nvme_ns_cmd_read_with_md than it is to call
spdk_nvme_ns_cmd_readv_with_md, so do a quick check of the iovcnt and
use the best one for the job.

This is about a 15% improvement in I/Ops.

Change-Id: I82e6677d1ac47abf9919f95e651e7a7595c5e9a3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4316
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-09-22 11:41:30 +00:00
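
The iovcnt check in commits 7deedab66a and 3870092dd1 follows a simple dispatch pattern, sketched below. The functions `submit_contig` and `submit_sgl` are hypothetical stubs standing in for the single-buffer and scatter-gather NVMe submit calls; they only count invocations and do not reproduce the real SPDK signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Counters so the sketch can show which path was taken. */
static int g_contig_calls;
static int g_sgl_calls;

/* Hypothetical stub for a single-buffer submit. */
static int
submit_contig(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	g_contig_calls++;
	return 0;
}

/* Hypothetical stub for a scatter-gather submit. */
static int
submit_sgl(const struct iovec *iov, int iovcnt)
{
	(void)iov;
	(void)iovcnt;
	g_sgl_calls++;
	return 0;
}

/* Use the cheaper single-buffer path when there is exactly one element. */
static int
submit_io(const struct iovec *iov, int iovcnt)
{
	assert(iov != NULL && iovcnt >= 1);
	if (iovcnt == 1) {
		return submit_contig(iov[0].iov_base, iov[0].iov_len);
	}
	return submit_sgl(iov, iovcnt);
}
```
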
Xiaodong Liu
195fb4e40e bdev/aio: assign new io-ctx to each io-ch
This aims to avoid bdev starvation by giving each bdev_aio
I/O channel its own io_context and linking the channels
into the group channel.

Previously, all channels of an aio_bdev on the same spdk_thread
shared the same io_context. If one of them submitted too many I/Os
to the io_context, the other channels were starved.
Moreover, if the first bdev_io got a NOMEM error, the
following I/Os of that bdev had no chance to be processed.
The bdev_aio starvation can be reproduced by bdevperf in
v20.04, by:
./bdevperf -q 100 -w randwrite -t 5 -c hdd.conf -o 1048576
(hdd.conf defines 2 aio_bdevs of HDD)

Change-Id: Ic709323f5baeb2f8f7250b75ad872ec4156b5e78
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3808
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Broadcom CI
2020-09-15 07:14:18 +00:00
paul luse
2b29a2b16c bdev/compress: rename RPC to match standard naming
Not sure how this one was missed back during the big rename party.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ia00819a2c2e8c81ad0499f4081667dde04fa726e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4140
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-09-14 09:19:19 +00:00
Ziye Yang
e699355434 bdev/rbd: Still use rbd_aio_read/write if iovcnt=1
In most cases the iov array has a single element,
so using rbd_aio_read/write is enough.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Id3ccbb4a4ff10be9d4546c8e89e6dc596d0f3a72
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4122
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-09-11 07:29:20 +00:00
Ziye Yang
a9c0fdd77d bdev/rbd: Use the non-timer poller.
The timer poller can add latency since we poll it every 50ns.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ia3d0ec56658b9365fdd34b67663a3365eb738451
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4081
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
2020-09-09 07:20:04 +00:00
Shuhei Matsumoto
aeb693ca07 bdev/gpt: Change ERRLOG to DEBUGLOG to show interleaved metadata is not supported
It is a good enhancement for the GPT vbdev module to support
interleaved metadata but interleaved metadata is not critical and
the priority is low.

However the following error log is always collected when there is
any bdev with interleaved metadata.

  vbdev_gpt.c: 552:vbdev_gpt_examine: *ERROR*: GPT module does not
    support block size 4224 for bdev Null0

This does not cause any failure but confuses users. Hence, change
the corresponding print from ERRLOG to DEBUGLOG.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id58507cbdc830d2b139fb34faa29e08d73c16141
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4058
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
2020-09-04 15:12:48 +00:00
Karol Latecki
7ff4e2af79 bdev/null: make md_size inclusive for dif_type
When using --dif-type option --md-size should be
required as well.
Update & improve bdev_null_create rpc.py help
messages as well.

Change-Id: I6588a97aef6c8792bab7a41ece17c0461bb36844
Signed-off-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3949
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-08-31 07:47:48 +00:00
Xiaodong Liu
e7a0a45e8a bdev/rbd: put back io_channel in create_cb
On the function's error path, the io_channel should
also be put back.

Change-Id: Ibd3a616244a77fcc29a5ec5d1d1ca53f5412e8c6
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3859
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-08-26 09:46:57 +00:00
WANGHAILIANG
af4fa14891 module/bdev_raid: remove an ERROR log which occurs every time
This ERROR log is: bdev_raid_rpc.c: 276:rpc_bdev_raid_create: *ERROR*:
the rpc param strip_size is deprecated.
raid_bdev_write_config_json() generates a bdev.json:
    {
      "method": "bdev_raid_create",
      "params": {
        "name": "raid0",
        "strip_size": 64,
        "raid_level": "raid0",
        "base_bdevs": [
          "Malloc4",
          "Malloc5"
        ]
      }
    },

rpc_bdev_raid_create() checks req.strip_size,
so it always prints this ERROR log:
else if (req.strip_size > 0 && req.strip_size_kb == 0) {
		SPDK_ERRLOG("the rpc param strip_size is deprecated.\n");

So I replace "strip_size" with "strip_size_kb" here.

Change-Id: Ic9570c88dd336a755ccd53690962eddbaac22e1b
Signed-off-by: WANGHAILIANG <hailiangx.e.wang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3885
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
2020-08-25 08:02:37 +00:00
rafalste
4d91b4ef01 bdev/ocf: add cache line size
Define cache line size when creating cache device.

Signed-off-by: rafalste <rafal.stefanowski@intel.com>
Change-Id: Ica536a9bda35c355b4dd4e406a241defc1453f53
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3449
Community-CI: Mellanox Build Bot
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-08-12 10:39:56 +00:00
Ziye Yang
d4229b4dd8 bdev/rbd: Remove bdev_rbd_rw and bdev_rbd_flush
Purpose: To simplify the code.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I1ab1ceb5367b7a47f72fe1d0d03c607e8c46cb48
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3723
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-08-12 07:49:04 +00:00
Ziye Yang
6cbbc68296 bdev/rbd: Implement the group polling policy.
This patch implements the group polling
policy instead of each rbd having its own poller.

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ieb975e656240bcdaf2657410f010d72b156639ed
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3698
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-08-12 07:49:04 +00:00
Ziye Yang
5d097aa7fd bdev/rbd: Simplify the read/write.
We can use rbd_aio_readv and rbd_aio_writev to simplify the code.
Since rbd_aio_readv/rbd_aio_writev have been supported for more than 2
years, we can replace rbd_aio_read/write without using "ifdef".

Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I924a039110ceed59836c1b904b6e7f8db271ad62
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3697
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-08-12 07:49:04 +00:00
Seth Howell
8bff48f63c module/bdev_rpc: modify detach_ctrlr rpc to accept trid.
This will allow us to use this RPC to detach only specific
paths from controllers.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ib52e38aa7d4ea096418a6dc0328481c2e8db6c54
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3070
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-08-11 08:29:14 +00:00
Seth Howell
80dcb258e0 lib/bdev_nvme: fix attach_controller RPC.
This RPC was using snprintf which could potentially
truncate a bad argument silently without alerting the
caller.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I1795b2e4086cbaa45390bd11b189755818b5c9e7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3613
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-08-11 08:29:14 +00:00
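
Commit 80dcb258e0 notes that snprintf can silently truncate an argument. One common way to detect that, shown here as a sketch and not necessarily the approach the patch took, is to compare snprintf's return value with the destination size. The helper name `copy_checked` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst.
 * Returns 0 on success, -1 if src (plus its terminating NUL)
 * does not fit in dst_len bytes or on an output error. */
static int
copy_checked(char *dst, size_t dst_len, const char *src)
{
	int rc;

	assert(dst != NULL && dst_len > 0 && src != NULL);
	/* snprintf returns the length the full string would have had. */
	rc = snprintf(dst, dst_len, "%s", src);
	if (rc < 0 || (size_t)rc >= dst_len) {
		return -1;
	}
	return 0;
}
```

Returning an error lets the caller reject an oversized argument instead of continuing with a shortened value.
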
Ziye Yang
65945c0296 bdev/rbd: Move bdev_rados_context_init to bdev_rbd_handle
According to customers' reports, there is still
Rados-related thread contention with SPDK's reactor thread.
So also move bdev_rados_context_init into bdev_rbd_handle.

With this patch, the performance of QEMU + SPDK vhost + bdev RBD
can be improved.

Change-Id: Ie26a140f8d2cfe1a98cd1737f4ff70e9319cdc11
Signed-off-by: Yaowei Bai <baiyaowei_yewu@cmss.chinamobile.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3699
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-08-11 08:28:39 +00:00
Seth Howell
0b81c11ab8 module/bdev: add a function to remove trid from ctrlr.
This will allow us to further develop the ecosystem for multipath
failover support.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I24a8cf13e60e6cc0d5b6374da33c8a4e5b6c499a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3069
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-08-07 15:06:01 +00:00
Seth Howell
1973d10b59 bdev/nvme: modify attach_controller rpc to also add multipath trids.
This allows us to avoid creating a separate rpc just for multipath TRIDs.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4e83167eaf16e50a72efbd513333a4d09c52be61
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2884
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-08-07 15:06:01 +00:00
Seth Howell
e4cc49bc95 bdev/nvme: add failover option to bdev_nvme_reset
When we fail to process admin completions on a controller,
attempt to fail over to a previously registered trid.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I547bd010f4b339b2af7f2b33027cddad4b4926bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3045
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-08-07 15:06:01 +00:00
Seth Howell
eafae9ae89 bdev/nvme: pass nvme_bdev_ctrlr struct to adminq.
We are about to instrument some checks in this function
that will access the nvme_bdev_ctrlr struct directly so
pass that directly since the ctrlr is still accessible from
the nvme_bdev_ctrlr struct.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: I4e0d0d6c433149368e84275426362f8283903027
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3340
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-08-07 15:06:01 +00:00
Seth Howell
0c6979e29b bdev/nvme: add a function for specifying a multipath trid.
This is part of a larger series enabling failover at the bdev
layer for NVMe.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Ic5c128244699c1a47275145ca7e41aa5f1366259
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3044
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-08-07 15:06:01 +00:00
Seth Howell
d88915f5ff bdev/nvme: rename ctrlr trid parameter to connected_trid
This patch paves the way for introducing a tailq containing
multiple alternative paths to the same controller.

Change-Id: I13d30c12b8e0ce38eae687f9e76740be1d11e4d1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3259
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-08-07 15:06:01 +00:00
Ziye Yang
cca62c633f bdev/uring: Do not use IORING_SETUP_IOPOLL.
Because of a Linux kernel limitation, IORING_SETUP_IOPOLL only
works for local devices (e.g., local files, PCIe NVMe SSDs, etc.).
It does not work for devices attached remotely. So in order to
make bdev uring generic, do not use IORING_SETUP_IOPOLL to create the
uring.

Change-Id: I6aea1ff222a8a0d67ab040ada75aa0ef6730e725
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3587
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-08-04 18:27:56 +00:00
Jin Yu
7ef6d8dd63 virtio: fix scsi double free issue
During virtio_pci_dev_probe, if enum_cb fails, hw needs
to be released. But in bdev_virtio, if vdev fails after
initialization, it will enter the bdev destruction process
which calls the modern_destruct_dev function and hw will
be released during the process. So we will encounter the
problem of hw being released twice.

Change-Id: I1e8116283cfd810dfb050f8928f4ecd4bb2d815b
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3566
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-07-30 08:43:23 +00:00
Jin Yu
79c7744efb virtio: fix virtio hw double free issue
During virtio_pci_dev_probe, if enum_cb fails, hw needs
to be released. But in bdev_virtio, if vdev fails after
initialization, it will enter the bdev destruction process
which calls the modern_destruct_dev function and hw will
be released during the process. So we will encounter the
problem of hw being released twice.

Change-Id: Ifba35284c072355ba0e10428b597a1894d32d59e
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3564
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2020-07-30 08:43:23 +00:00
Tomasz Zawadzki
1350922d09 bdev/ocf: take additional reference for ocf_cache
Fixes #1498

When shutting down the application, it was possible to
reference a stale ocf_cache pointer. This was the case
when two or more vbdev_ocf devices were based on top
of a single cache bdev.

This issue did not occur outside of the shutdown case,
since RPC only allows deletion of the vbdev_ocf.
This erases the on-disk metadata, so the next run of the
application would not detect such a vbdev_ocf.

Shutdown, meanwhile, works differently: it first stops
the running instance of "ocf_mngt_cache" and later detaches
the "core" devices (the ones being cached). This prevented
erasing the on-disk metadata and allowed the restarted
application to detect the vbdev_ocf.
See patch (1292ef2) for details.

Since references to ocf_cache are copied between vbdev_ocf
[see start_cache()], the reference count inside ocf_cache
was limited to the original ocf_mngt_cache_start() and
management queue creation. The first call into ocf_mngt_cache_stop()
released all references to ocf_cache, leaving the other
vbdev_ocfs pointing to released memory.

This patch works around this issue by increasing the ref cnt
on ocf_cache for each vbdev based on top of it.
This allows calling into ocf_mngt_cache_stop() without
releasing the memory for ocf_cache until the last vbdev is gone.

Note:
A proper redesign here is in order:
- either rearranging structures to be based around single ocf_cache,
rather than multiple vbdev_ocf instances
- better use of OCF API to reduce book keeping logic in vbdev

There are plans to implement detach/attach in RPC,
so it should be a focus during that effort.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I560a7fbb1c052bf53970e655bdb60803c561a252
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3574
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-07-30 04:02:08 +00:00
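
The workaround in commit 1350922d09 is a form of reference counting: each user of a shared object holds its own reference, and the object is released only when the last reference is dropped. A minimal sketch of that idea follows. The struct `shared_cache` and the functions `cache_get` and `cache_put` are hypothetical names for illustration and are not the OCF API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared object with an embedded reference count. */
struct shared_cache {
	int refcnt;
	bool released;
};

/* Take an additional reference on behalf of a new user. */
static void
cache_get(struct shared_cache *c)
{
	c->refcnt++;
}

/* Drop one reference; release the object when none remain. */
static void
cache_put(struct shared_cache *c)
{
	assert(c->refcnt > 0);
	if (--c->refcnt == 0) {
		/* Real code would free the underlying resources here. */
		c->released = true;
	}
}
```

With one reference held by the creator and one per dependent device, a stop request from the creator only drops its own reference, so the remaining devices never see freed memory.
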
Tomasz Zawadzki
868ba17780 bdev/ocf: simplify check for running cache instance
There are additional conditions which SPDK tracks,
that are known before issuing call to OCF.

Two main ones are:
- if vbdev->ocf_cache was not yet created [start_cache()]
- if the cache bdev was opened [attach_base()]

Both happen once, for the first cache bdev. For
subsequent vbdev_ocf on the same cache bdev, the reference
is copied.

This call simplifies checking both conditions.
Calling into OCF with a NULL or stale ocf_cache pointer
rightly causes issues with ASAN.

Related #1498

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib202c15bda4cbbffa1516c69168e8bfb80370047
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-07-30 04:02:08 +00:00
paul luse
62b3b171cb module/compress: add new parm to RPC for create compress vol
To specify the desired logical block size. Must be 4K or 512.
If no block size is provided a default of 0 means to use the
underlying bdev block size. For cases where something other
than 4K or 512 is desired, format the underlying device
accordingly and don't specify a logical block size on creation
of the compress vol.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I58b71e210cfa77b3237c0c454585c734e2e22aea
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3177
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-27 22:39:29 +00:00
paul luse
ee7e31f9ae lib/accel: remove the need for the app to allocate an accel_task
This was sort of a clunky interface requiring a couple of inline
functions in every app that wants to use the accel_fw moving
forward. By having the accel_fw public API accept a callback arg
instead of an accel_task combined with adding a pool of accel_tasks
in the accel_fw engine we can eliminate this.

After changing the parm to a cb_arg, changes were made to all accel_fw
interfaces to put cb_fn and cb_arg as the last parms in public and
private function calls.

Related bdev_malloc changes need to be in this patch in order to pass CI.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I2b75764e534562d91484a094c3352266156d8425
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3209
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-07-24 19:38:46 +00:00
Xiaodong Liu
ef7c128a8a lib/event: enable repeated spdk_app_start/stop
With this patch, spdk_app_start/stop can be repeatedly
called by users based on their upper level application's
requirement.
Changes are:
* Add reinit ability inside spdk_env_init and related functions
* Clear g_shutdown_sig_received in spdk_app_setup_signal_handlers
* Clear malloc_disk_count in bdev_malloc_initialize

Change-Id: I2d7be52b0e4aac2cb6734cc1237ce72d33b6de0c
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2260
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-07-24 15:25:57 +00:00
Changpeng Liu
29bbb9ba9c bdev/aio: use detected block size as AIO block device data alignment
Currently we warn users when the specified block size does not match
the auto-detected one for an AIO device, e.g. the AIO device uses 512B
as the block size while the user specifies 4096B when creating the
SPDK AIO block device. However, the data alignment required by the
kernel AIO device only needs to match the auto-detected block size.
Using the user-specified size over-commits the alignment requirement,
so just use the auto-detected block size here.
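
As a rough illustration of the 512B versus 4096B example above, the
sketch below derives an alignment exponent from the detected block
size. The function names are hypothetical, and both the power-of-two
exponent representation and the fallback to the user value when
detection yields 0 are assumptions of this sketch, not taken from the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: floor(log2(x)) for a non-zero 32-bit value. */
static uint32_t demo_u32log2(uint32_t x)
{
	uint32_t r = 0;

	assert(x != 0);
	while (x >>= 1) {
		r++;
	}
	return r;
}

/*
 * Hypothetical helper: pick the data alignment (as a power-of-two
 * exponent) from the auto-detected block size rather than from the
 * user-specified one. Falling back to the user value when nothing was
 * detected is an assumption made for this sketch.
 */
static uint32_t demo_aio_alignment_shift(uint32_t user_block_size,
					  uint32_t detected_block_size)
{
	uint32_t basis = detected_block_size != 0 ?
			 detected_block_size : user_block_size;

	return demo_u32log2(basis);
}
```

With a user-specified size of 4096 and a detected size of 512, this
yields an exponent of 9 (512B alignment) instead of 12.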

Fix issue #1478.

Change-Id: Ia5d02bb66980cbe55ea7039488189d409315552e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3495
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-24 09:41:30 +00:00
Tomasz Zawadzki
78f44434a6 lvol: report io_unit_size as block_size instead of page size
The bdev underneath an lvs has a certain value set as its block size.
The lvolstore inherits that value as `io_unit_size`,
which ends up as the block size that the lvol bdev presents.

Before this patch, an lvs on a device with a block size of 512
would create an lvol bdev with a block size of 512,
while bdev_lvol_get_lvstores() reported
a block_size of 4k instead.

This was the result of adding io_unit_size and
forgetting to update the value here. Previously
only I/O of page_size was allowed.

This patch adjusts the value reported to
io_unit_size.

Alternatively the field could have been removed,
or renamed to "io_unit_size".
For API compatibility, however, the name
remains.

The actual page size, which is always 4k,
is not useful to bdev users, so there is no need
to add another field for it.

Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I19701c408c389f64e25c027c7bba789294bbed94
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3289
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-17 07:20:14 +00:00
Sven Breuner
0190e71eb6 module/bdev_compress: Fix use logical block size in vbdev_compress_claim
Use logical block size instead of base bdev block size in
vbdev_compress_claim. This is relevant to enable the compress bdev to
present a logical block size that is different from the block size of
the base bdev (e.g. 512B logical on 4KB base bdev).
However, this is just an internal fix/preparation and there is
currently not yet a way for users to specify a different
logical block size, e.g. through an rpc.py parameter.

Signed-off-by: Sven Breuner <sven@excelero.com>
Change-Id: Id7562f0da234ce11242b7ee78837a317ee51c211
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2858
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-07-15 07:09:40 +00:00
Shuhei Matsumoto
47b0d4275c bdev/nvme: Support abort IO by using spdk_nvme_ctrlr_cmd_abort_ext()
Change NVMe bdev module to enable abort as IO type.

Change _bdev_nvme_submit_request() to process abort request when the
IO type is abort.

The current thread first tries to abort an I/O command in the I/O qpair.
If no I/O command to abort is found, it sends a message to the thread
that was registered when the controller was created. The controller
thread then tries to abort an admin command in the admin qpair. If no
admin command to abort is found, the abort request completes with failure.

spdk_nvme_ctrlr_cmd_abort_ext() is used to try aborting the command whose
cb_arg matches. qpair is set to NULL when trying to abort an admin command.

Before calling spdk_nvme_ctrlr_cmd_abort_ext(), save the current
thread so that the admin command completion is processed correctly.

spdk_bdev_abort() also supports bdev modules other than the NVMe bdev
module, so it does not check CDW0 and checks only whether the completion
status is success or failure. So add bdev_nvme_abort_done() to
convert the NVMe completion status to the bdev completion status.
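
The two-stage lookup described above can be sketched as follows. This
is a simplified, single-threaded model with hypothetical names
(demo_queue, demo_queue_abort, demo_abort); it does not call the NVMe
driver, and the hand-off to the controller thread is replaced by a
direct function call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the result reported to the bdev layer. */
enum demo_bdev_status { DEMO_BDEV_SUCCESS, DEMO_BDEV_FAILED };

#define DEMO_QUEUE_DEPTH 4

/* Hypothetical stand-in for a qpair: outstanding commands by cb_arg. */
struct demo_queue {
	const void *pending[DEMO_QUEUE_DEPTH];
	int count;
};

/* Remove the command whose cb_arg matches; report whether one was found. */
static bool demo_queue_abort(struct demo_queue *q, const void *cb_arg)
{
	int i;

	assert(q->count >= 0 && q->count <= DEMO_QUEUE_DEPTH);
	for (i = 0; i < q->count; i++) {
		if (q->pending[i] == cb_arg) {
			q->pending[i] = q->pending[q->count - 1];
			q->count--;
			return true;
		}
	}
	return false;
}

/*
 * Try the I/O queue first, then the admin queue, else fail. Only
 * success or failure is reported, mirroring the status conversion
 * described in the commit message.
 */
static enum demo_bdev_status demo_abort(struct demo_queue *io_q,
					struct demo_queue *admin_q,
					const void *cb_arg)
{
	if (demo_queue_abort(io_q, cb_arg)) {
		return DEMO_BDEV_SUCCESS;
	}
	/* In the real flow this step runs on the controller thread. */
	if (demo_queue_abort(admin_q, cb_arg)) {
		return DEMO_BDEV_SUCCESS;
	}
	return DEMO_BDEV_FAILED;
}
```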

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If6aebae0ba2f6c5834ee926e161af9c4d825f341
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2040
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-14 07:54:42 +00:00
Shuhei Matsumoto
613c9386ff bdev/nvme: Add thread pointer to struct nvme_bdev_ctrlr
Add a thread pointer to struct nvme_bdev_ctrlr. The pointer is set to
the thread that created the nvme_bdev_ctrlr.

The thread pointer will be used to restrict admin abort submission
to a single thread.

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia39d5cbc7a13b0e0022c0d5591069ea8776ef774
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3244
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-14 07:54:42 +00:00
Shuhei Matsumoto
9d6bb6decc bdev/nvme: Use SPDK_POLLER_REGISTER to use function name as poller name
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I30da42413501d03433ef4550327675471185df3e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3243
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-14 07:54:42 +00:00
Seth Howell
f4d0ab7111 Revert "bdev/nvme: add a function for specifying a multipath trid."
This reverts commit 5f13e0f67f.

Reverting this change to allow for a secondary review.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Change-Id: Iad208603034bd892f3f248836459005d3c2b5297
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3041
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-08 07:57:11 +00:00
Seth Howell
0344ba4064 Revert "bdev/nvme: add failover option to bdev_nvme_reset"
This reverts commit 62ded32fd7.

Reverting this change to allow for a secondary review.

Change-Id: I85a059c434f6e634e014e680ae228e95fffdee92
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3040
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
2020-07-08 07:57:11 +00:00