bdev_nvme_find_admin_path() is used only in a place and it's role
is to find a non-failed ctrlr even after multipath is supported.
Inline it into bdev_nvme_admin_passthru() will be better.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If58507c49b43d047e1f3ef25bbdfb571c36a1956
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9194
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Previously bdev_nvme_reset() returned -EBUSY if ctrlr is being
destructed and returned -EAGAIN if ctrlr is being reset.
These did not match what spdk_nvme_ctrlr_reset() returned.
Reset operation will be more important than current when multipath
is supported and reset operation is made asynchronous.
Hence change bdev_nvme_reset() to follow spdk_nvme_ctrlr_reset().
bdev_nvme_reset() returns -ENXIO if ctrlr is being destructed and
returns -EBUSY if ctrlr is being reset.
Additionally change the return value of bdev_nvme_failover()
accordingly. After the change bdev_nvme_failover() returns -ENXIO
if being destructed and returns -EBUSY if ctrlr is being reset.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ie2c6f8601050b1043d83de9cf01490751784e4e5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8859
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Following the last patch, include hostid into ctrlr_opts rather than
passing it as a parameter for bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I0d04db1c5767ec76a9a7cd255c3a8d56b0b8f583
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9344
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
nvme_ctrlr_populate_namespace_done() needs nvme_ns, probe_ctx, and completion
status.
When multipath is supported, nvme_ctrlr_populate_namespace_done() will
be called from the spdk_for_each_channel() callback. The spdk_for_each_channel()
callback can have one context and one completion status.
Hence nvme_ns or probe_ctx need to have another.
If an nvme_ctrlr has multiple namespaces, a probe_ctx is shared by
multiple nvme_ns.
So let nvme_ns has probe_ctx.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Iedaaab80616d34d01935f4ebc31e1f3b84e09e32
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9047
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For multipath, when an new nvme_ns is added to an existing nvme_bdev,
we need to add the corresponding qpair to all existing nvme_bdev_channels
dynamically. The RPC bdev_nvme_attach_controller has to return after that.
Hence populating namespace needs to be asynchronous.
On the other hand, when an nvme_ns is removed from an existing nvme_bdev,
we need to remove the corresponding qpair from all existing nvme_bdev_channels
dynamically. Hence depopulating namespace needs to be asynchronous.
By the recent refactoring, two callbacks, nvme_ctrlr_populate_namespace_done()
and nvme_ctrlr_depopulate_namespace_done() were removed accidentally.
We need to restore these.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I37ea2420588cef3a18648dec053f8bd2b884e86b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9392
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
A few functions can be made private in bdev_nvme.c. Hence delete
these function declarations from bdev_nvme.h.
Besides, bdev_nvme_remove_trid() is only a function declaration.
Hence delete it together in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7bf9ae6da332c6426f6eb40926e7c4130d9acd8a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9444
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
An error might occur after succesful transport creation
when the new transport is added to nvmf poll groups, e.g.
in nvmf_transport_poll_group_create. In that case
transport is not detroyed and poll groups are not fully
functional. To correct this behaviour, destroy transport if
spdk_nvmf_tgt_add_transport fails. Also update nvmf_tgt
initialization step to check that all poll groups were
created.
Change-Id: I116e6944729d846c1755c2844c77825f65db8c12
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9255
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This only existed to share code between OCSSD and regular NVM
namespaces. Now OCSSD is gone, so just merge the files into bdev_nvme.
Change-Id: Idb73cc05d67144de5dd20af8db24c8f6974d10a7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9337
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
nvme_ctrlr_populate_namespaces
Avoid relying on this number. Different targets have interpreted its
meaning in different ways and it cannot be used anymore in practice. It
may also be very, very large.
Change-Id: I94e8eae49d6ccdbd8be302b30a120d89242b6d39
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9316
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Try to use these accessors instead of directly using the namespaces
array. This will make changing the data structure easier later on.
Change-Id: I3367d0e0065894f3aa199ed1698d27976b4cbbb5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9315
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
If the number of namespaces is very large, this can cause excessive
memory allocation. This is especially true because when the number of
namespaces is large, it is almost always very sparsely populated.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I27d94956c222ae3c49c6a7422164ae3a8ec8d963
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9302
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Do simple validity checks first, then check for duplicate controllers.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ic21d64574b37a3d9148e5cd6d5a7599449ea8fe1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9341
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
It is more flexible now as it is possible to get nvme ns handle to do
additional management or queries, however if nvme ctrlr handle is needed
there is already public nvme API for that with nvme ns as an input.
Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Change-Id: I5493168ad31cc95687962288d57fb5457f2d7dd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9357
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When preparing for a reset, use this new call to tell
the driver to avoid sending DELETE_CQ/SQ commands to a
PCIe controller when they aren't needed.
Fixes issue #2073.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9ebb7d5c3f7cbb1c3192f162f32edbbea41acde1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9250
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Matt Dumm <matt.dumm@hpe.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Added writing out JSON configuration for the scheduler
subsystem.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I51b1f94b3f56d0bfb8a87127163c8e248d6846b6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7119
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch moves schedueler and governor related API from
the internal event.h to public scheduler.h.
With this it is possible to create subsystem responsible
for handling the schedulers.
Three schedulers and a governor were moved to scheduler modules
from event framework.
This will allow next patch to add JSON RPC configuration
to the whole subsystem.
Along with easier addition of other schedulers.
Removed debug logs from gscheduler, as they serve little purpose.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I98ca1ea4fb281beb71941656444267842a8875b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6995
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When we rotate the socket in the list, we did not check whether the uc pointer
is NULL, then we cause the coredump when using uc pointer.
When we add a new socket (with pollin event) into the pending_recv list,
we add it into the end of the pending_recv list, then it delays the execution
of the newly socket, and it can cause the timeout especially when a socket
is in a connection phase.
So the purpose of this patch is:
1 Revise the rotation logic to handle the two cases, i.e., (1)sock is in the
beginning of the list; (2)sock is in the end of list. The purpose is
to avoid NULL pointer access, and efficently handle the exceptional case.
2 When there is new pollin event of the socket, we should add socket in the beginning
of the list. And this can avoid the new socket handling starvation.
Since max poll event num is 32 from upper layer and if we always put the new socket
in the end of the list, then starvation will occur if there are many socket connection events.
Because if we add the new socket into the end of the pending list, we will always handle the
existing socks first, then the later coming socket(with relatively pollin event) will always be
handled late. Then in the sock connection initialization phase, it will consume a relatively
long time, then the upper layer connection based on this socket will cause timeout,.e.g.,
ctrlr.c: 185:nvmf_ctrlr_keep_alive_poll: *NOTICE*: Disconnecting host nqn.2014-08.org.nvmexpress:
uuid:af56cce7-2008-408c-a8a0-9c710857febf from subsystem nqn.2019-02.io.spdk:cnode0 due to
keep alive timeout.
[2021-08-25 20:13:42.201139] ctrlr.c: 579:_nvmf_ctrlr_add_io_qpair:
*ERROR*: Unknown controller ID 0x1
Fixes#2097
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I171b83ffd800539e86660c7607538e120fbe1a91
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9223
Reviewed-by: John Kariuki <John.K.Kariuki@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
There's only one type now.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I8fbf330797e772b1c45a04970c95bf4894c26639
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9348
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There's only one kind of namespace now that OCSSD is gone, so simplify
those whole path.
Change-Id: I721de11c3e7be8b3a13ada25b6d6163a67c6659f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9329
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
As far as we're aware, this is not in use by anyone. OCSSD has largely
been replaced by ZNS and no OCSSD drives made it to the market.
Change-Id: I020ee277da5292f8c4777f224acafd87586f8238
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9328
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The values returned weren't actually used, and instead
make the code unclear especially in the error cases.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iee2661545a6831f6151a183d4ac723c835e84d13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9349
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Shouldn't really ever happen but is somehow things get out of whack
and we get a busy back from the low level lib we don't want to
increment our flow control counter.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I49176e8ad2bf6f658ac970efea5db89eebc747f1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9232
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
bdev_nvme_create() is called only by a single caller and hostnqn is
just copied to ctrlr_opts even if it is passed separately.
Hence include hostnqn into ctrlr_opts rather than passing it as a
parameter for bdev_nvme_create().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I75b640bcecefa94950b0c19936fab0571c428125
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9332
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The NVMe bdev module will have two similar features, multipath and
failover when it supports multipath.
Take a case that we add two different trids with the same name by the
bdev_nvme_attach_controller RPC as an example.
The failover adds secondary trid to an existing nvme_ctrlr. The multipath
feature creates another nvme_ctrlr and adds it to the same nvme_bdev_ctrlr
which has an existing nvme_ctrlr.
We want to use bdev_nvme_attach_controller for both failover and multipath.
To do it cleanly, separate callback to spdk_nvme_connect_async() between
creating ctrlr and setting failover.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id9bc175af6201cdd74e12d4903fc81afe4f91189
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9225
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We get ext_opts by getting spdk_bdev_io from nvme_bdev_io in
bdev_nvme_readv() or bdev_nvme_writev() now. But this is not aligned
with the current design pattern and not so efficient.
We pass contents of bdev_io via parameters to bdev_nvme_readv() or
bdev_nvme_writev().
Let's follow it about ext_opts.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I8a43f31934a36fa2d43800ec8bf17916edfca477
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9292
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@gmail.com>
When an abort command completes successfully,
cdw0 bit 0 may be set to indicate that the controller
was unable to abort the specified cid. In that case
the bdev nvme abort completion callback needs
to reset the controller.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I42a5e96e19e113c38dec67c2d8575a11f39d41a9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9320
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Converted log message to debug that is called too many times
during a hot remove, filling up the log file.
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Change-Id: I08f7ff7c36c6388270878291df8a1e83646e8aee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9258
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When a bdev is being unregistered, after all
channels have been closed, the bdev layer calls
the module's destruct callback for the bdev before
calling the bdev unregister callback.
For the rbd module, the destruct callback is
bdev_rbd_destruct. This callback unregisters the
rbd io_device which is an asynchronous operation.
We need to return >0 from bdev_rbd_destruct to
inform the bdev layer that this is an asynchronous
operation, so that it does not immediately call
the bdev unregister callback. Once the rbd io_device
is unregistered, we can call spdk_bdev_destruct_done()
which will trigger the bdev layer to finally call
the bdev unregister callback.
Without this fix, deleting an rbd bdev would
complete before the backing cluster reference
had been released. This meant that even
if you had deleted all rbd bdevs, there might still
be cluster references in place for a short period of
time. It's better to wait to complete the delete
operation until the cluster reference has been
released to avoid this issue (which this patch
now does).
Fixes issue #2069.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8ac156c89d3e235a95ef196308cc349e6078bfd7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9115
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
The bdev layer doesn't call the destruct callback until
all channels have been released, but because the channel
delete callback passes message to the main thread, we can
end up with a complicated race condition. Currently we
have a deferred_free code path to handle this race, but
we can handle this a bit more cleanly by doing the
construct operation on the main_td as well.
This also simplifies the next patch which will
asynchronously destruct the bdev to fix an RPC bug.
Here's the race:
1) first channel was created on thread A, so disk->main_td = thread A
2) second channel was created on thread B
3) first channel is freed (but disk->main_td is still thread A)
4) spdk_bdev_unregister is called on thread C
5) bdev layer gives callback on thread B to upper layer
6) upper layer on thread B frees channel
7) bdev_rbd_destroy_cb runs on thread B and has to send msg to thread A
for processing
8) bdev layer calls bdev_rbd_destruct on thread C (since step #4 was on
thread C)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I25ede2dc56e24dac0919aed05b9def2560823ee7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9158
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Klocwork filtered it out that there is a dereference
of pointer 'lvol' before its NULL check.
Change-Id: I83a026e8762d1e004cf1a351585f48fb1e24ae41
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9208
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch makes use of async_fini_start flag to
make fini_start asynchronous.
During this time all lvol stores which have no open
lvols are unloaded. This is required, since lvs
holds claim on the underlying bdev.
Fixes#1630
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If443cb087324d08a4a70df71c7afd930ab654f90
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9095
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Callback for bdev modules is called 'module_fini',
meanwhile after its execution bdev modules were to call
'spdk_bdev_module_finish_done()'.
This function carries incorrect name, so it was deprecated
and replaced with 'spdk_bdev_module_fini_done()'.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I9a12dff746ea8b4b1570a3794470f7b24e29003e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9148
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
fini_start() is called for each bdev module before
iterating over all unclaimed bdevs to unregister them.
This allows bdev modules to behave differently during
each such unregister. Ex. unloading lvol store when
all lvol bdevs on it are unregistered.
Another use of this callback is to unclaim all bdevs
that can be at that point. Especially ones that will
not receive callback due to no bdev registered.
Ex. offline raid bdev, when some underlying bdevs are missing.
fini_start() being synchronous does not help in cases
where to release claim on the bdev, an asynchronous operation
is required. Ex. lvol store with no bdevs present, requires
async lvs unload to be called.
This patch adds async_fini_start flag for the bdev modules,
to be used when async fini_start is required. When done,
bdev module has to call spdk_bdev_module_finish_start_done().
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I63438b325d4cc53fd236bf9ff143abf6bdd81c49
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9094
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When removing a socket from a sock_group, the recv_task should be
cancelled last, because it can be sent out while cancelling other tasks
(if POLLERR is received). Otherwise, we could end up with outstanding
recv requests from a socket removed from a group.
Fixes#2112.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ic8e24c210541390dd8bdffe8d3bc4e7dd746d4b7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9239
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Once bdev finish starts, bdev unregister is called on all
unclaimed bdevs. This means that for lvs with at least one
lvol present, there will be a corresponding bdev unregister.
Yet the vbdev_lvol module does not attempt to unload the lvs,
once last lvol from that lvs is unregistered. Leaving
the base bdev for lvs claimed.
This patch fixes that by using fini_start callback from
bdev_module to mark when shutdown begins. After that
last lvol unregistered on lvs will unload it.
Expanded struct lvol_bdev to contain lvol_store_bdev.
Closing the lvol will free spdk_lvol, so lvol->lvol_store cannot
be accessed.
Changed ut_lvol_destroy UT to ut_bdev_finish. Previous UT didn't
really test vbdev_lvol_destroy, but 'hotremove' of the lvol bdev.
In effect there is no hotremove of the lvol bdevs (only lvs bdev).
spdk_bdev_unregister() can only be called from within vbdev_lvol,
or during bdev module finish.
This UT will now check the bdev module finish.
Note that at this point lvs with no lvols will not trigger
lvs unload. Next patches in series will introduce async fini_start,
to allow for the unload.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I8f51e8c1fcfdc55a5d090a3bc84ccefda813aef8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9093
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
OCF creates vbdev with block size equal in size with a core device.
We need to ensure that cache's bdev block size is not bigger than
core's bdev block size, so there are no IO errors due to IO length
smaller than cache device's block size.
The reason why this is implemented late in the cache start and not
as soon as we want to construct OCF vbdev is that cache or core
device can be added later after OCF vbdev creation and only then
we are certain to have both devices to compare.
Fixes#2088
Signed-off-by: Rafal Stefanowski <rafal.stefanowski@intel.com>
Change-Id: I536c783ca71b52f212217c597b7997f2d2e89491
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9229
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The new API is used if bdev ext_opts is not NULL.
Change-Id: I414b5d19bff54114d6708efed89ba19b5955f56a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6271
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Added struct lvol_bdev wrapping around spdk_bdev.
This will help with passing around any context.
For this patch only spdk_lvol is still used.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I23f5be5edda526ad607ea8b45274c945a42d90db
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9147
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Same iterator was used in two places already, and new one
will be added in next patch.
Replaced _SAFE variant since entries cannot be removed
while iterating in this loop.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5ee911a5b653cfbdf0b4a39d283bd5eee66c49d8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9092
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
async_mode option is currently supported in PCIe transport layer
to create io qpair asynchronously. User polls the io_qpair for
completions, after create cq and sq completes in order, pqpair
is set to READY state. I/O submitted before the qpair is ready
is queued internally. Currently other transports only support
synchronous io qpair creation.
Signed-off-by: Monica Kenguva <monica.kenguva@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ib2f9043872bd5602274e2508cf1fe9ff4211cabb
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8911
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
In order to not affect the loopback test.
Also create a sock_common.c file which can be used by posix/uring
implementation. We do not put such code in sock.c. Because sock.c
is the general layer. Other users may include their own user space
sock impelmentations. So put those common code in sock_common.c instead.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I983ec2313119539e6eed2d9f11ba1488c0ed6560
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8769
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch tries to add MSG zero copy feature.
Though io_uring supports buffer registration, it only
support io_uring_prep_write_fixed. It means only one
registered buffer can be used. It does not satisfy our
current usage mode.
According to this situation, we still use the MSG_ZEROCOPY
flags in io_uring_prep_sendmsg.
Furthermore, this new feature is dependent on the kernel
version, The currently verified version is
kernel 5.12 rc3. So it is not enabled in the default manner.
For example, if you want to use it on the target side, you can
use the following rpc to configure:
./scripts/rpc.py sock_impl_set_options -i uring --enable-zerocopy-send-server
Change-Id: Ie7bb828f466362add94891989ddf0950dccd9e80
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/957
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Sending the unmap in zone reset is optional (it's only a hint), so if
the base bdev doesn't support it, the reset is completed immediately.
Fixes#2064.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: If2c57eadc20d352a71853d7023599503330e1252
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9154
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>