If NN is very large this saves a lot of memory. This lookup is
not generally used in the I/O path anyway.
Change-Id: I98e190006843ad5d0bac8483bf9feb800d4a665a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9884
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
bdev_nvme_find_io_path() selects an io_path whose qpair is connected
and ANA state is optimized or non-optimized.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I79c978795562b606ee27aa43020684d8bcbf50c5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9405
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reset all controllers of a bdev controller sequentially. When resetting
a controller is completed, check if there is next controller, and
start resetting the controller.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I169a84b931c6b03b36bb971d73d5a05caabf8e65
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7274
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously the NVMe bdev module had completed the outstanding reset and
then canceled pending resets. This was complex.
On the other hand, the generic bdev layer cancels pending resets
and then completes the outstanding reset.
Following the generic bdev layer simplifies the code and makes us easier
to control retry reset, delay retry reset by a few seconds, or stop retry
after repeated failures and then delete ctrlr.
Update unit tests accordingly.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9a68422918ebcb052b3a281316ffba9b3450ecd4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9816
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previous patches (5363eb3c) tried to work around the
32-bit unmap and write_zeroes LBA counts by breaking
up larger operations into smaller chunks of max size
UINT32_MAX lba chunks.
But some SSDs may just ignore unmap operations that
are not aligned to full physical block boundaries -
and a UINT32_MAX lba unmap on a 512B logical /
4KiB physical SSD would not be aligned. If the SSD
decided to ignore the unmap/deallocate (which it is
allowed to do according to NVMe spec), we could end
up with not unmapping *any* blocks. Probably SSDs
should always be trying hard to unmap as many
blocks as possible, but let's not try to depend on
that in blobstore.
So one option would be to break them into chunks
close to UINT32_MAX which are still aligned to
4KiB boundaries. But the better fix is to just
change the unmap and write_zeroes APIs to take
64-bit arguments, and then we can avoid the
chunking altogether.
Fixes issue #2190.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I23998e493a764d466927c3520c7a8c7f943000a6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9737
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Dong Yi <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Restore the previous nice idea and unify canceling pending resets
into bdev_nvme_complete_pending_resets().
cb_arg of spdk_for_each_channel() was reserved for a different
purpose but it was gone.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1c01051ae9d8eccc7fc3ec485dd344ff9f55087b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9815
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
The all setters of the reset_cb_fn use boolean as the result eventually.
Using boolean as the result earlier makes the code simpler and the
following patches easier.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9ec4f670ed7a000a824ab7e261bab07a3dc21f43
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9814
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
bdev_nvme_admin_passthru() chooses the first ctrlr which is not failed.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If41a1d1e1bde4bddfa92e5a385509daa3f0ce4de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9525
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
bdev_nvme_admin_passthrough(), bdev_nvme_reset_io(), and
bdev_nvme_abort() do not use io_path. So simply fall through even
if the optimal io_path is not found for these, and clear the
cached io_path for these.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib26fcbf99c95bbfb6e825c1b7c6455241c198d92
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9675
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Even if the NVMe bdev module supports I/O retry, it will not retry reset or abort.
For clarification, bdev_nvme_reset_io() and bdev_nvme_abort() call
bdev_nvme_io_complete() themselves for error cases.
For bdev_nvme_abort(), we do not need to differentiate error processing
among return values. Simply complete the request with failure.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id5d51cbba47c6360a6177efd7d5f2e978c48ee9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9674
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Even when I/O retry is supported, reset will not be retried. However,
bdev_nvme_io_complete() will process I/O retry. Hence inline
bdev_nvme_io_complete() into bdev_nvme_reset_io_complete() to exclude
reset from I/O retry. The result of reset is success or failure, so omit
the -ENOMEM case.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I667e74cbbac4a13cefb6896f898476ba48bcd0fa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9687
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Save the io_path to which the current I/O is submitted into struct
nvme_bdev_io to know which io_path caused I/O error, or to reuse it
to avoid repeated traversal.
Besides, add a simple helper function nvme_io_path_is_available() to
check if the io_path can be reused.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idadd3a3188d9b333b335ea79e20a244aa0ce0bdd
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9296
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We have io_path structure now and returning io_path rather than
ns and qpair match the function name. The following patches will
cache the returned io_path into nvme_bdev_io.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5d773da18591fc324667f6b5c489a38f497bf3d8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9295
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch removes the critical limitation that ctrlrs which are
aggregated need to have no namespace. After this patch, we can
add multiple namespaces into a single nvme_bdev.
The conditions that such namespaces satisfy are,
- they are in the same NVM subsystem,
- they are in different ctrlrs,
- they are identical.
Additionally, if we add one or more namespaces to an existing
nvme_bdev and there are active nvme_bdev_channels, the corresponding
I/O paths are added to these nvme_bdev_channels.
Even after this patch, ANA state is not utilized in I/O paths yet.
It will be done in the following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I15db35451e640d4beb99b138a4762243bee0d0f4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8131
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Pointer 'ctx->req.multipath' returned from call to
function 'strdup' may be NULL. Reported by Klocwork.
Change-Id: Id4a188ec5340f02c9bd0643db0acb03409dd5829
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9843
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
_find_optimal_core was always consolidating idle threads to g_main_lcore.
Meanwhile for active threads lower lcore id were prioritized over the
higher ones.
So long as g_main_lcore can fit the active thread, it should be prioritized
over any other. Regardless of the lcore id.
Fix#2080
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I30b3a7353bcf243d4362b4db9dde5446c435d675
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9243
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It can be much simpler when inlined.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Ifad89d7b9557a45eee601fc2004066a0fd9289c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9826
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Specifying only a transport id is not enough. We need to be able to
describe the host parameters too.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Iadbea553aee4b38e7cacab0b486e7e5746d0d1ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9825
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is the currently active path identifier in a failover scenario. The
path is defined by more than just the transport identifier, so fix the
name.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I682c6f4c54f75307e2615bf80e70358180d99fe2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9576
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This defines a unique path between a host and a target.
Change-Id: Ia3d24c1b34199a8b596aaf17900ca9694a9da77d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9505
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The new path must differ in some way.
Change-Id: I98fd2b2cb3220b482efe0a19bfce94ec4e72bec2
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9418
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This parameter may have the values "disable" or "failover". The default
is failover to match existing behavior. In the future we expect to
change the default behavior to disable.
Further, we expect to add an "enable" option soon to do full multipath.
Change-Id: Iebbdc9b95f23101f18d64e085933463498e627be
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9343
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
We know that the librbd's read/write operations will be handled by a
non spdk thread, so we can get rid of the epoll based group based
polling and directly use the async completion. This makes the code
is simple and easy to maintain.
And we still need to keep the io_device registration for this module,
because the I/Os are async. We need the channel reference on "rbd_if"
in order to know which rbd disks are still active.
Change-Id: I1c140a4b286dbfe113ed2a67bd2875de605e8f24
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9335
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Prior a regular round robin could result in strange performance
if an idxd device from another socket was used.
Signed-off-by: peluse <peluse@localhost.localdomain>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Id863c79067beabe73ef89d92b3fb3c436821b97a
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9367
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Call rte_cryptodev_close() to free qpair memory instead of using
an internal function.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I1bd7f0dd86de83f278f6be3263cdf3fbd8e1c77f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9720
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
There's no need to keep polling with iscsi_bdev_conn_poll() when the request
list is empty, as new requests already restart the poller when needed.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Iec5553f8c14d32e894989c9f7a448b6817a821b1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9375
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
It's not providing a lot of value, while being pretty problematic, as
the read is blocking and cannot be easily changed to be non-blocking, as
dump_info_json is a synchronous interface.
Now that dump_info_json isn't using any synchronous interfaces from the
NVMe driver, we can send a bdev_get_bdevs call in the async_init.sh
test to verify that.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ida31c8d1000a52b0782f698afe46b071ed4e41df
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9488
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This patch replaces the synchronous `spdk_nvme_detach()` calls with its
asynchronous counterparts in the controller unregister path.
An additional poller is introduced to periodically poll the NVMe driver
for detach completion. Once the detach is completed, the poller is
unregistered and the nvme_ctrlr is destroyed. The poller uses the same
period (1ms) as the async probe poller.
Since reset and detach cannot happen at the same time, reset_poller was
renamed to reset_detach_poller and it can now store the pointer either
to the reset or detach poller, depending on the circumstances.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I5eb2dd6383d98d25d1f9748af08c1a13d18acb0e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8729
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This is done in preparation for using the non-blocking versions of the
spdk_nvme_detach API, which will delay controller's delation until the
detach is completed asynchronously.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: Ia785408c9a94427e60bf239e6036a5e89d589f61
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8727
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
It can match by any provided parameter to remove paths.
Change-Id: I5e7a87342bbb90943dc97fb52f142814fcf0acfa
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9453
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Instead of storing an spdk_nvme_transport_id, store the object that
contains it. This will make a few later patches easier.
Change-Id: I36b74889fe39af3b7ab2b900fb3ea4b3f39e1f83
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9484
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Rename it to _is_core_at_limit(). This function
currently returns true if the core is at the limit
(instead of over the limit) which is really the semantics
that we want - so just change the name of the function
to make it more precise.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idf815f67c71463c3b98bc00211aafdc291abdbd2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9582
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
We have _is_core_over_limit() which determines if a core is
currently over its busy:total tsc ratio. We use this to determine
if we need to move threads off of a core that is too busy.
But when we pick a core to move a thread *to* we were allowing the
dst core to fill to 100%, rather than the SCHEDULER_CORE_LIMIT.
This patch fixes that, which has the nice effect of keeping
thread-to-core assignments much more stable when running
I/O workloads.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id98b08803939d2a25104082e6436bb8d4727d7c2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9578
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will lead the scheduler to be quicker to move
threads to an unused core - favoring performance over
power savings.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibaa5edc61a4bdca5550bd23a562c3645fded25e9
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9551
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
If a core has a very high busy percentage, we should
not assume that moving a thread will gain that
thread's busy tsc as newly idle cycles for the
current core.
So if the current core's percentage is above
SCHEDULER_CORE_BUSY (95%), do not adjust the
current core's busy/idle tsc when moving a thread
off of it. If moving the thread does actually
result in some newly idle tsc, it will get adjusted
next scheduling period.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I26a0282cd8f8e821809289b80c979cf94335353d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9581
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
For the src thread, add the busy_tsc of the thread
we are moving to the idle_tsc of the current core.
This is consistent with how are accounting for the
cycles in the target core too.
We will disable the load_balancing.sh script for now.
We will reenable it later in this patch set once
a few other changes are made, along with some updates
to the load_balancing.sh script based on the changes
made in this patch set.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I8af82610804e97dabf62ccd90f75a0e6e37d276f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9550
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will be useful in some upcoming patches where we will
be calculating these percentages in more places.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If7d84c00fe1b666988fe06537836ba7b9cb161aa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9580
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Purpose: Better to put this varable into the bdev_rbd_io structure
instead of on the stack. And we will use this variable in next patch,
when the callback function is completed, so we should not put it on the stack.
Change-Id: I11ff46ef07908084012bc1ce040eceb667334a40
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9334
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The purpose is that we will remove the group reaping of
rbd_io later, so we need to know the original thread info of the
rbd_io.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I69f60261447fdac0b0885fdb213e92c246439047
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9585
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Allow to return more than one memory domain.
This change aligns bdev and nvme API and provides
more flexibility for custom transports.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ica9b12ad8463c361be6cb62ee2c0513eec0b486d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9546
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Enable dump of transport stats in functional test.
Update unit tests to support the new statistics
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I815aeea7d07bd33a915f19537d60611ba7101361
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8885
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch enables us to aggrete multiple ctrlrs in the same NVM
subsystem into a single bdev ctrlr to create multipath.
This patch has a critical limitation that ctrlrs which are aggregated
need to have no namespace. Hence any nvme bdev is not created.
However it will be removed in the next patch.
The design is as follows.
A nvme_bdev_ctrlr is created to aggregate multiple nvme_ctrlrs in
the same NVM subsystem. The name of the nvme_ctrlr is changed to be
the name of the nvme_bdev_ctrlr.
NVMe bdev module has both the failover feature and the multipath
feature now. To choose which of failover or multipath to use, add an new
parameter multipath to the RPC bdev_nvme_attach_controller.
When we attach a new trid to the existing nvme_bdev_ctrlr, we use the failover
feature if multipath is false, we use the multipath feature if multipath is
false.
nvme_bdev_ctrlr has a list for nvme_ctrlr and it is guarded by the
global mutex. Callers can query nvme_ctrlrs from a nvme_bdev_ctrlr via
trid as a key. nvme_bdev_ctrlr is not registered as io_device.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I20571bf89a65d53a00fb77236ad1b193e88b8153
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8119
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Previously, if an I/O qpair is disconnected, we tried reconnecting
the qpair. However, this reconnect operation was very likely to fail
and will not match the upcoming asynchronous connect/reconnect
operation. We need an extra callback to make this reconnect operation
asynchronous, but we do not want to have it.
Hence if an I/O qpair is disconnected, we free the I/O qpair and then
reset the corresponding nvme_ctrlr immediately. If the admin qpair is
also disconnected, the nvme_ctrlr is reset immediately. However this
event may never happen. So we do not wait for the error of the admin
qpair.
The NVMf host may disconnect connections by itself intentionally.
In this case, resetting the nvme_ctrlr will surely fail. But resetting
the nvme_ctrlr frees all I/O qpairs of the nvme_ctrlr and these I/O
qpairs are not created again until resetting the nvme_ctrlr succeeds.
Resetting the nvme_ctrlr once at most is more efficient than repeating
reconnecting the I/O qpair. So this change is valuable even for such
intentional disconnection. However, it is helpful to know the event that
I/O qpair is disconnected. Hence change DEBUGLOG to NOTICELOG in the
disconnected callback. The disconnected callback is not repeated, and
we do not need to worry about NOTICELOG flooding.
Refine the unit test case to verify this change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I376b749c2f55d010692bf916370e8bb4249b795f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9515
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is similar to how we name other module library
directories.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iadaf59231323180b48b5d0cf2e6acb3d8bfc9807
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9549
Community-CI: Mellanox Build Bot
Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>