Recent work identified race conditions having to do with the
dynamic flow control mechanism for the idxd engine. In order
to both address the issue and simplify the code a new scheme
is now in place. Essentially every DSA device will be allowed
to accomodate 8 channels and each channel will get a fixed 1/8
the number of work queue entries regardless of how many
channels there are. Assignment of channels to devices is round
robin and if/when no more channels can be accommodated the get
channel request will fail.
The performance tests also revealed another issue that was
masked before, it's a one-line so is in this patch for convenience.
In the idxd poller we limit the number of completions allowed
during one run to avoid the poller thread from starving other
threads since as operations complete on this thread they are
immediately replaced up to the limit for the channel.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I913e809a934b562feb495815a9b9c605d622285c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8171
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will allow to match and compare the DTrace probes with SPDK traces
as they'll be using the same source for the timestamps.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I686c3351428c75f9b618a1a909836504fccad828
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8101
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
This patch allows tracepoint to record a variable number of arugments.
An additional function has been added,
`spdk_trace_register_description_ext()`, which allows the user to
register definitions for tracepoints specifying all the arugments that
they accept. Users can also call `spdk_trace_register_description()` to
register tpoints with a single argument (or none).
Currently, all of the tracepoint arguments need to be passed as
uint64_t.
The trace record functions use variable arguments and rely on tracepoint
description to know the order and the format of the arguments passed.
That means that the user needs to take care that they're always in sync.
Moreover, this patch extends the tracepoint entry size from 32B to 64B,
meaning that there are 40B that can be utilized for passing arguments,
which in turn means that there can be up to 5 arguments per tracepoint.
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Change-Id: I9993eabb2663078052439320e6d2f6ae607a47ff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7956
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Move the definition of structure spdk_io_channel into
lib/thread/thread_internal.h, so we don't have to update SO_VER for
other libraries in future when we need to change the internal details on
the structure.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I3d2ca7a8737972e0b33ce92e464da42c48f89dec
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8189
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
1. Update with latest vfio-user specification changes.
2. The new libvfio-user will not expose dma_sg_t data structure
any more, SPDK should use pointer and allocate memory for it.
Change-Id: I619b0c0828cbe3b050c628bff4c4ce7ee840510f
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8377
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Thanos Makatos <thanos.makatos@nutanix.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Community-CI: Mellanox Build Bot
Then we can enable the split for write zeroes request
in bdev layer.
Change-Id: I1341512994b061728cf8c94c0e4e7f86dbc53ce3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7873
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
We face the checksum error when using idxd hardware in vector mode because
we put fields of internal function in wrong union.
When use vector crc32c operation, seed and chained functions and cb's will
be used in the same time, so we should not put them in the same union.
With this patch, we can fix the crc32c computing while using
the vector mode with iov_count > 1.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I7097b47294ffd5dcac7d7a83d583321a73dc50d2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8213
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
siginfo_t is a GNU extension. SPDK (and DPDK) have
direct dependencies on GNU extensions, but it's a bit
nicer if external modules don't also need to define
_GNU_SOURCE. Currently siginfo_t parameter in the
spdk_pci_error_handler is the only thing that violates
this.
Note that DPDK also supports registering sigbus handlers,
but they take the failing address as a parameter instead
of the full siginfo_t structure. Let's adopt the same
for SPDK.
While here, remove an extra semicolon that was just after
the virtio sigbus handler function signature that was
updated in this patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I07faf11a3ac3589c637cb2196581c102286b1e68
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8333
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
'typeof' is a GNU extension - let's use __typeof__
instead which is ISO C compliant.
Allows building SPDK header files with -std=c11 as
long as _GNU_SOURCE is also defined.
Next patch will enable omitting _GNU_SOURCE as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibd6c984627b553d6f87f302800abc52157fe9b1e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8332
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
This field was only used to keep track of number of threads
that will be present on a core after scheduler moves.
It was used only internally within scheduler_dynamic.
Event framework has no need to keep such field in core_info.
Instead added field in cores_stats internal to scheduler_dynamic.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3ce74d4a25eac81e58da8705a1c4553730fc1e57
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8049
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch expands spdk_scheduler_core_info with two new
fields that will contain core stats only from last scheduling
period.
This will make sure that schedulers do not have to keep track
and calculate this value on their own.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I3aa7dfa6a60c1d14d95a0e684e84c2e83f0a4496
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8048
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Renamed core_busy_tsc and core_idle_tsc to better
describe that they contain particular core stats for
its whole lifetime.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f16b2b0a162aad8fbaf18f549fc50a2372b920b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8047
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There is no need to keep new_lcore field.
lcore value is enough to determine the target core.
Meanwhile _threads_reschedule() can see if the target
core matches the one from core_info.
Removed _spdk_lw_thread_set_core() since it did not
serve much purpose.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I82c7cfebf1107b4a55b2af9b891052084a788907
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8046
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
So far the schedulers had to calculate the diff of
current_stats - last_stats on their own to get tsc
from last scheduling period.
Renamed the current_stats to total_stats, but kept the meaning
as stats describing tsc for lifetime of a thread.
Instead change the meaning of the last_stats to describe
the tsc of only last scheduling period and change its name
to current_stats.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I1a165ff7c1afe659b432c3127a351a96878d1f3d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7843
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
QEMU 6.0 by default uses a RedHat dev/vendor ID rather
than the Intel one that has always been used to date.
We need the NVME_QUIRK_MAXIMUM_PCI_ACCESS_WIDTH quirk
so that we do not use wide instructions to copy SQEs
to a virtualized CMB, since QEMU does not support
that.
The NVME_INTEL_QUIRK_NO_LOG_PAGES quirk is only needed
for devices with SPDK_PCI_VID_INTEL, so we do not need
to carry this one over to the new REDHAT entry.
Fixes issue #1986.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3d339b3525e7c6ceb792eb9d143e7a922c19344d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8226
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Upcoming patches will add accel_fw support for batching this cmd
and then vectored versions later along with accel_perf to exercise
them.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I59f577f4365fbf063d7419cc6052e10c998b58bc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8143
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Was using reserved field. Similar fix to what was done earlier
for direct submission of crc32c operation.
fixes#1972
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie9867e72f60c7f38aa1af0273a036f34580ed4c6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8145
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Community-CI: Mellanox Build Bot
Upcoming patches will add support for the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I7223517a844525ad52ed49d65627b04c3cd9fe7c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8141
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Upcoming patches will add support to the accel fw, the idxd engine and
the accel_perf tool. Also following will come vectored support and
batch versions.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ie0fbd4b8da9f727426000898c0b511587adac65b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8139
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Each time the following file
"/sys/kernel/config/nvmet/subsystems/nqn_name/namespaces/ns_id/enable"
on the target side was changed, the SPDK initiator should receive an
async event (type: SPDK_NVME_ASYNC_EVENT_TYPE_NOTICE, info:
SPDK_NVME_ASYNC_EVENT_NS_ATTR_CHANGED).
But actually not.
Since for SPDK, when target sent the non-first event, the condition
"nvmet_aen_bit_disabled(ctrl, NVME_AEN_BIT_NS_ATTR)" that prevents
target from sending event was matched.
This commit fix this issue by issuing a get_log_page cmd for each async
event received, just as the kernel initiator does.
Fixes#1825.
Signed-off-by: tyler.sun <tyler.sun@dell.com>
Change-Id: I2973470a81893456ca12e86ac390ea1de0eed62c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7107
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
In examples/bdev/fio_plugin/fio_plugin, it will include fio.h which
defines the RB_ROOT macro. To workaround the RB_ROOT redefined error,
rename RB_ROOT to _RB_ROOT.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: Ied4f835e4d1657ca5a0a80c13c72845c774618d8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8044
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Adding iov to the spdk_bdev_zcopy_start function enable spdk_bdev_zcopy_start to
be used by transport layers as the iov is owned by the transport command
Signed-off-by: matthewb <matthew.burbridge@hpe.com>
Change-Id: I6d2be7f49566048bf25b7711ada8d2fb49fea6ee
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6816
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Michael Haeuptle <michaelhaeuptle@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
1. struct pxdcap.per is changed to struct pxdcap.rer
Which matches the name in the nvme spec.
2. use new API return value.
3. update specification changes.
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Change-Id: Ida421c4cffd1c65d550e83011ab123b321ea9dff
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8088
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The purpose is to reduce the duplicated code in nvmf and iscsi
layer.
Change-Id: I7e96f0d5bb1ba4b81378addca3cdd929056384e9
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8132
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
We map the SPDK_NVME_TRANSPORT_* values directly to
the NVMe-oF trtype values. Since PCIe isn't
Fabrics, we choose 256 which is outside of the
8-bit trtype range of values.
So we can just check if trtype >= 256 to determine
if the trid is for fabrics or not. This is
preferable to checking PCIE || VFIOUSER in case
additional non-fabrics transport types are added
in the future.
I considered taking a trid as the parameter instead,
but went this route since it is consistent with
the existing spdk_nvme_ctrlr_is_discovery().
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib62ff4d30549b2324486c81f2dce67f0f1741e9b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8077
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
According to NVMe-oF 1.1 spec, it is not a fatal error.
So according to Figure 126 in NVMe Base specification,
we should return "Transient Transport Error".
Change-Id: I601304ae2bb24508882fb1ec8c7e53ec587ab515
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7795
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
spdk_thread keeps track of tsc from its whole lifetime,
those can be requested with spdk_thread_get_stats() at any time.
spdk_lw_thread uses stats from above and keeps track of two points in time:
- current_stats reflecting stats at the time of gather_metrics stage
- last_stats reflecting stats from previous gather_metrics stage
1)
Before this patch current_stats were duplicated in snapshot_stats.
There is no need for that so now they are removed.
2)
Removed _spdk_lw_thread_get_current_stats() since it would be copying
current_stats to current_stats, thus not perform any action.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I5e5d4039cd0f7cc10ba150a3d915b90ec96589d7
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7842
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
In struct spdk_nvmf_poll_group_stat, there are statistics of cumulative IO and
admin queue pair counts. But current qpair counts are not reflected. Use
this patch to add current admin and io qpair counts for a poll group.
Signed-off-by: Rui Chang <rui.chang@arm.com>
Change-Id: I7d40aed8b3fb09f9d34e5b5232380d162b97882b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7969
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Eugene Kochetov <evgeniik@nvidia.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Use the macros for red black tree provided by Free BSD to speed up bdev
name lookup in spdk_bdev_get_by_name().
In the bdev_multi_allocation test, we can get 3x ~ 5x speed up when
creating multiple bdevs for various bdev nums.
Signed-off-by: Jiewei Ke <jiewei@smartx.com>
Change-Id: I49a2fbcccf06d4c36cbd445ce59e0b0dd4ada31d
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7837
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Loading subsystems and restoring state from a JSON config file is useful
outside of the SPDK application framework, so move it to lib/init.
Change-Id: I7dd3ceace2e7b1b28eef83c91ce6a4eedc85740e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6645
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
I'm not sure whether this should go into lib/init or to lib/rpc
directly, but I've chosen lib/init for now.
This is to support applications that want to run the SPDK JSON
RPC server, but aren't using the SPDK application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I79ca39aa0ca6e1a3a6905b0bf73e6cc99b086e55
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6644
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The functions to initialize the SPDK subsystems or tear them down
was previously an internal-only API. Make it public for use by
applications that aren't leverage SPDK's application framework.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I2ebfd020e6fa4c1947fa1c1a2ac509ce9b0242f8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6643
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
There were some functions in the internal header that can be entirely
private to the init library. Move them over.
Also, remove the support for including the header from a C++ file
because these headers are internal to SPDK which is pure C.
Change-Id: Ic4323b2b8664e70106a57b3ca8acbc7c2efe621d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6642
Reviewed-by: Tom Nabarro <tom.nabarro@outlook.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The header size is very small, which does not have too much value to
offload such calculation by hardware.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iaa82f39312df7eef3282325a33677ea41ab735ab
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8011
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is useful for applications even if they elect not to use the SPDK
event framework.
This doesn't shift everything in one go - just the subsystem
initialization logic. Configuration file loading also needs to move
in a separate patch later.
Change-Id: Id419df1045442d416650ed90e5ee78adfdd623d7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6641
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Get all of the hot stuff to the first cache line.
* Shrink the xfer enum to one byte (it only has 3 values).
* Pull out the dif enabled flag form the dif structure so it
can be access separately
* Rearrange the members
Change-Id: Id4a2fe90a49c055a4672642faac0028671ebfae9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7827
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
For now, we will keep this disabled by default,
enable with --with-usdt option to the configure
script. Long-term we will want to enable this by
default, and only disable via configure.
Modules can include spdk_internal/usdt.h and add
probes such as:
SPDK_DTRACE_PROBE2(probe_name, ptr, val);
When USDT is enabled, these will translate to
DTRACE_PROBE2(spdk, probe_name, ptr, val). When
USDT is disabled, these will translate to nothing.
Later patches will add some probe points to the
nvmf target, some bpftrace scripts, and instructions
for how to successfully capture data with these
probe points and scripts.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id168e2c800fa5522815a175026386319014cfdaa
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7173
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Update libvfio-user to the current version, updating the client for the relevant
changes.
Signed-off-by: John Levon <john.levon@nutanix.com>
Change-Id: Ic64ace08ac0c7e9676f04f8d1f47a9c0388a2652
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7983
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
__builtin_clzll(0) is technically undefined, but
returns 64 on all currently tested architectures,
which is the desired value. So remove the
assert(datapoint != 0) and instead just set clz=64
for that case so that we aren't depending on
undefined behavior.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibb05f756e07f20a250d24f0c5adecc4dfbc5a056
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7939
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
zipf is a power law probability distribution. When
applied to performance testing of block devices, it
will select blocks over the full range of LBAs, but
will more frequently select lower-numbered LBAs.
The theta parameter governs the distribution - higher
values of theta will concentrate the distribution on
a smaller number of LBAs.
Note that fio supports zipf, so adding it to SPDK
will enable our perf tools (bdevperf, nvme-perf) to
provide similar functionality.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7df129c9d61996a2070188c6cd9f1fde631ac208
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7779
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Copy Free BSD's sys/sys/tree.h to SPDK's include/spdk/tree.h by
changing as little as possible.
A few changes are as follows:
- Remove a few lines located at the head of the file.
- Change the name of ifdef from _SYS_TREE_H_ to SPDK_TREE_H.
- Change the type __uintptr_t to uintptr_t.
- Change the attribute __unused to __attribute__((unused))
- Insert a space to the front and the end of CONSTCOND.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I26488d8fd61f80508cf7d96fbeb6007f12080a01
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7721
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Monica Kenguva <monica.kenguva@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
This is the same effort as spdk_poller.
The following patches will move the definition of struct spdk_thread and
enum spdk_thread_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I7f7bdfdd7a7b1b834d16d79638a4fd2d63e9daf6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7800
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Most accesses to the struct spdk_poller outside lib/thread have been
done via functions but a few direct accesses remain.
Change these to indirect accesses by addinng a few helper functions
as SPDK internal APIs.
Add spdk_poller_get_name() to get the name of the poller.
Remove spdk_poller_state_str() and add spdk_poller_get_state_str().
Exposing enum spdk_poller_state outside lib/thread is not really
necessary.
This removal requires us to update major SO version.
Add spdk_poller_get_period_ticks() to get the period ticks of the
poller.
Add struct spdk_poller_stats and spdk_poller_get_stats() to get
the stats of the poller.
The next patch will move the definition of struct spdk_poller and
enum spdk_poller_state from include/spdk_internal/thread.h to
lib/thread/thread.c.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id597dae074a15fcd8af09fd9d416a22ce2f403c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7798
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
The following patches will introduce red black tree to manage
timed pollers efficiently but it will be based on macros available only
in lib/thread/thread.c. Hence then it will be difficult to expose the
internal of timed pollers tree outside the file. On the other hand,
we do not want to include JSON into the file.
Hence add a few SPDK internal APIs to iterate pollers list transparently.
For spdk_thread_get_next_active/timed/pause_poller(), we omit the parameter
thread and get it internally from poller->thread even if the names include
the term "thread". This will be slightly cleaner.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I000801a2e4dc42fa79801a2fd6f2b06e1b769c88
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7717
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add {min,max}_cntlid to spdk_nvmf_subsystem, defaulting to 1 and
0xFFEF, respectively, and add nvmf_subsystem_set_cntlid_range() to
allow the controller range to be configured in the range [min_cntlid,
max_cntlid].
Also add {min,max}_cntlid to the nvmf_create_subsystem RPC to allow
the controller ID range to be specified when creating an nvmf
subsystem.
Signed-off-by: Jonathan Teh <jonathan.teh@mayadata.io>
Change-Id: I936db3bb0c9a38569063a6fd3c11df262dfad776
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7322
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
When testing vfio-user target with QEMU, there is following error log:
ctrlr.c:2143:31: runtime error: member access within misaligned address 0x7fe526a73d98 for type 'struct spdk_nvme_ctrlr_data', which requires 16 byte alignment
For vfio-user transport, the memory buffer is allocated by clients, so
we can't assume the memory is always aligned, just remove the
aligned attribute.
Change-Id: Ie80530415013ebd7bcb8cdabca97d7d0e34857f9
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7797
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This update will stop using `struct vfio_device_info` from
<linux/vfio.h>.
Fix issue #1922.
Change-Id: Ia7ad745db8d7ed8f5248ca13e3188ebd540b0e40
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7831
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Purpose: We will also support the kernel idxd driver, so we do not
need export this feature in the module file.
Change-Id: I965e031497920f527962ba187bccd81de6977b8f
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Also add a macro to test a given log page ID.
Signed-off-by: Scott Peterson <scott.d.peterson@intel.com>
Change-Id: Id4d00f5c3d8af76b24ce010d6f922025a31312b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4115
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The backend device module can report such capabilities
to the bdev layer, and we can split unmap request into
multiple children requests in the bdev layer.
Change-Id: I81daf7e58b797a2673ef102f2a037de20771092e
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7515
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
(Note: this patch was previously applied as b32cfc46 and then reverted
as 63642bef.)
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75c0
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7739
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We have some cases where libraries need the
spdk_bdev_module structure definition and a couple
of related APIs, but not everything else (i.e.
spdk_bdev, spdk_bdev_io), for purposes of avoiding
abidiff errors.
For example, nvmf creates a dummy spdk_bdev_module,
and then uses it with the spdk_bdev_module_claim_bdev
API to ensure multiple subsystems cannot add the same
bdev as a namespace.
But when nvmf includes bdev_module.h, it pulls in the
spdk_bdev structure definition as well. This means
when the spdk_bdev structure is modified, it requires
a major version bump since abidiff detects the
difference in the debug info.
Alternatives considered:
* We could add a specific suppression into our abidiff
script for nvmf and struct spdk_bdev, but it would be
a risk (albeit very very small one) that we could
add a real dependency on struct spdk_bdev in the future,
and the suppression would hide the difference.
* We could also break out bdev_module.h into multiple
header files, but the ways of doing that either result
in odd file naming, or modifying every bdev module to
include a new header.
* We could add a public bdev API to expose what the
bdev library needs, but that seemed even more intrusive
than this change. nvmf is kind of abusing the bdev_module
API here, and I'd prefer to not promote that kind of
usage by adding something to the bdev API.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie8fdef8ea294d005b9ae7934dde49c62748420d1
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7737
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Move some of the bdev_module APIs immediately after
the bdev_module structure definition. This prepares
for an upcoming patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I78d534ba047048ec27d8d3159666584a52211de3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7736
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This reverts commit b32cfc467b.
This commit fails the ABI checks and only got through because the checks
were disabled until 21.04 hit.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: Id26b8f8ba551193d99b1ccbd31b35378b4095a20
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7731
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Today the in-guest nvme device shows physical_block_size=512 even though
the backend iSCSI bdev supports physical_block_size=4K
iSCSI targets exposes physical block size using
logical_block_per_physical_block_exponent in READ_CAPACITY_16
NPWG is one of the way to let Linux nvme driver set
physical_block_size of the nvme block device.
This patch adds spdk_bdev.phys_blocklen which is updated if the iSCSI
backend exposes physical_block_size.
Later phys_blocklen is used in nvmf to set NPWG and NAWUPF to report
back during NS identity.
Linux driver uses min(nawupf, npwg) to set physical_block_size.
Similarly in scsi_bdev fill lbppbe in READ_CAP16 response
based on spdk_bdev.phys_blocklen.
Fixes#1884
Signed-off-by: Swapnil Ingle <swapnil.ingle@nutanix.com>
Change-Id: I0b6c81f1937e346d448f49c927eda8c79d2d75cf
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7310
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is first commit that should go into latest SPDK
after the code freeze for SPDK 21.04.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I515772a0c90a362bd4a23a7717281a8d7a9e4393
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7615
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
When zcero copy send is enabled and used by initiator,
it could significantly increase latency in some payloads.
To enable more fine graing configuration of zero copy
send feature, add new parameters enable_zerocopy_send_server
and enable_zerocopy_send_client to spdk_sock_impl_opts to
enable/disable zcopy for specific type of sockets.
Exisiting enable_zerocopy_send parameter affects all types
of sockets.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I111c75608f8826980a56e210c076ab8ff16ddbdc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7457
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There are three modules implementing the bdev-zone API:
bdev_nvme, bdev_ocssd, and vbdev_zone_block.
For all three modules, the number of zones can be calculated using:
block_count / zone_size.
To avoid this calculation being performed everywhere, create a helper
function in bdev_zone.h, together with the other zone APIs, such that
a user can easily get the number of zones.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2967b15a604ab8bf4420588e7510b9820762f925
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7451
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
On host side the connections are created and then added to thread's
poll group. Those connections could use different NIC queues underneath.
To route all connections of poll group through single queue a unique
placement id is chosen as group_placement_id and each socket of poll
group is marked with group_placment_id using getsockopt(SO_MARK) option.
The driver could use so_mark value of skb to determine the queue to use.
Change-Id: I06bda777fe07a62133b80b2491fa7772150b3b5d
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6160
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
In next patch, if poller doesn't have a period, eventfd
will be created which's always busy automatically.
This eventfd can be combined with timerfd. So rename
timerfd to interruptfd.
Change-Id: Ibffa30ecfcaa73e55f47e97fac854641b74f2dfb
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7546
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Defined callback for spdk_poller to adapt itself to
set interrupt or poll mode. The callback can
be registered to spdk_poller by new function
`spdk_poller_register_interrupt`
Interrupt callback operations for period poller are implemented,
so period pollers now are interruptable.
Change-Id: I2aa6ebfdd75f76b85a70af7e42530be4131ddc8a
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5752
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
spdk_thread_set_interrupt_mode can get spdk_thread run
between intr and poll mode. It is only valid when thread
interrupt facility is enabled by
spdk_interrupt_mode_enable(). Currently, this function
is limited that no poller is registered to the spdk_thread.
Change-Id: Iba54accd5976beb6f6e155014903928ce2858e36
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Modifies RPC "framework_get_reactors" to get core frequency for current
core and insert it into JSON response.
Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com>
Change-Id: Ibb9c25e6e1d28ddb4cde42baa20a7e9808652ae8
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6582
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Mellanox Build Bot
This function finds a placement_id that does not have a group
associated with it.
Change-Id: I1306690e980fd4661f46dba9fb283f048a962eba
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7223
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The API `spdk_nvme_map_prps` is used in nvmf/vfio-user to
remap VM's NVMe command data buffer to local virtual address,
and for command using PRP, there maybe multiple pages, when
parsing the PRP list to local IOVs, we need a parameter to check
that the maximum number of vectors can't exceed the IOVs, this API
can't meet the requirement, while here, we add a new API `spdk_nvme_map_cmd`
and with a new parameter `max_iovcnt` to fix this case, and it can
also cover the command using SGL in the coming patches.
Change-Id: I71063524bed16ee3434103867a556d3741e55326
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7278
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Since the maps are unique to modules, they can store the group_impls
directly.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I7f11db558e38e940267fdf6eaacbe515334391c2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7222
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This allows for different policies per module, as well as overlapped
placement_id values.
Change-Id: I0a9c83e68d22733d81f005eb054a4c5f236f88d9
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
It can divide to two parts:
1, UIO driver - sigbus error handling and uevent
process.
2, VFIO - request notify handling.
sigbus error process is in previous patch.
Change-Id: Idc09754b83ae9ddcaea1f2afcbc13e528ead9863
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5768
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This makes it possible to traverse from the group_impl to
the group. It hasn't been necessary so far but will be in an
upcoming change.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I2bf119461bfd5ac5c8a63a3f1f4560d32e695c75
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7218
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Individual modules will need to mantain their own placement maps for
this to work correctly, especially if modules have different algorithms.
This is a step toward allowing them to do that.
Change-Id: Ie798baa50b94f1e99d6690adb606b936c7b30da0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7217
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Instead, move it down to the modules. This allows modules
to potentially change the value, if they are able.
Change-Id: I08f5fbadf5d1e96b489ddaaca72aa051ce2cb85c
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7212
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
For NVMe PRP list command, if the first PRP entry wasn't page
aligned, the number of IOV should add one more entry.
For VFIOUSER transport, a 128KiB PRP command may use 33 entries,
so we add one more entry here to address this case.
Change-Id: I9e30b301ef3ab62d7667adf336f38be9df699929
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7263
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Host Behavior Support is defined in 5.21.1.22 of the NVME spec. This
patch adds the related data structures.
Signed-off-by: Peng Yu <yupeng0921@gmail.com>
Change-Id: I99b2c1e55e23cadcd967b7adf5f8e75a2d799519
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7338
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This include isn't needed in queue_extras.h itself.
There were a few places that were implicitly
depending on this include, so fix those to include
util.h explicitly.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia962ae5a4403ee8ae15f3106d0d5e7d7412a4535
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7172
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Check if qpair has a poll group during the connect process,
use poll group's statistics or allocate own structure per
qpair. That is done due to not all applications use poll
groups and we want to avoid "if (qpair->group)"
conditions in data path.
Admin qpair always allocates its own statistics
structure but the statistics are not reported
since this qpair is not attached to a poll group.
Statistics are reported by spdk_nvme_perf tool
if --transport-stats and in bdev_nvme_transport_statistics
RPC method.
Change-Id: I58765be161491fe394968ea65ea22db1478b219a
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6304
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
These are interface functions that can be used by
an application e.g. spdk_nvme_perf or bdev_nvme
library. The next patches will add usage of these
functions.
Change-Id: I33b88e0e713c2ea5967f9241885e3257c5070577
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6300
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The new 2 API function allow to get and free stats
per poll group. New function to get transport name
have been added to report not only transport type but
also the name.
For now only RDMA transport reports statistics,
other transports will be added later.
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I2824cb474fde5fa859cf8196dabac2c48c05709c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6299
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
As a start of combining interrupt ability into poller,
it aims to get spdk_thread & spdk_poller runnable between
poll mode and interrupt mode with dynamic switching.
spdk_interrupt_mode_is_enabled() indicate whether interrupt
mode is enabled and dynamic switching is permitted. So
spdk_interrupt_mode_is_enabled==true leads to set up
interrupt mode related resources;
in_interrupt flag indicates whether one spdk_thread now
is running in intr mode.
It is possible that spdk_interrupt_mode_is_enabled==true
but in_interrupt==false. this means spdk_thread & spdk_poller
switched to poll mode from interrupt mode due to heavy
workload coming.
To align with spdk_reactor, use "in_interrupt" to
indicate whether one spdk_thread now runs in intr.
Change-Id: I2cd806bf4dec9969f3df88fac7f6b0c0b716d907
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6540
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: Mellanox Build Bot
The hotplug lib can be used for pcie devices
such as nvme, virtio_blk and virtio scsi.
For the sigbus handler, there is only one in a
process and it should handle all the devices.
And align nvme to the hotplug lib
Add the ADD uevent support for allowing the
device hotplug.
Change-Id: I82cd3b4af38ca24cee8b041a215a85c4a69e60f7
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5653
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will be useful as the same purpose as
spdk_io_channel_iter_get_io_device() and will be used in the
following patches.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id45f5980c65543703b91df2afeb47448232fe503
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7237
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
The max_io_size transport option should be a power of 2 and be >= 8KB.
Max data tranfer size is defined in NVMe-oF spec as 2^(mdts cmd field) * 4KB.
Mdts cmd field is calculated as spdk_u32log2(transport->opts.max_io_size / 4096),
so max_io_size < 8KB results in mdts=0, which means no size limit (according to spec).
User can set max_io_size = 0 explicitly to allow no size limit.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: Id88a77efce5f217e1fc7750f61c0bd330aaa3791
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6384
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jacek Kalwas <jacek.kalwas@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Add support in bdev_zone.h for getting the maximum zone append data
transfer size.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I61203e64d51601232c6578a090fa52975364c1f3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6910
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
bdev_zone.h already has support for offline in enum spdk_bdev_zone_state.
Therefore, a user can call spdk_bdev_get_zone_info() and see that a
zone is in state offline, but the user has no way of putting a zone
in that state.
Add SPDK_BDEV_ZONE_OFFLINE to enum spdk_bdev_zone_action, so that a
user can call spdk_bdev_zone_management() to put a zone in zone
state offline.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I733a815949d7db0fdce293ba0d762e75a545ba76
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6909
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The NVMe Zoned Namespace Command Set Specification has, in addition to a
Max Open Resources limit, a Max Active Resources limit.
An active resource is defined as zone being in zone state implicit open,
explicit open, or closed.
Create a function spdk_bdev_get_max_active_zones() in the generic SPDK
zone layer, so that this limit can be exposed to the user.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I6f61fc45e1dc38689dc54d5649c35fa9b91dbdfc
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6908
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add a new zone state to represent an explicit open zone.
Many zoned specifications like ZBC/ZAC/ZNS have two different
zone states to represent an open zone: explicit open and
implicit open.
In e.g. ZNS, a zone is transitioned to explicit open when a
Zone Management Send is sent with a zone send action of open zone.
In ZNS, writing to e.g. an empty or closed zone, without first
sending a zone send action of open zone, will instead transition
the zone to implicit open.
The OCSSD specification only has a single open zone state.
In OCSSD, you can only transition to the open state by doing a write.
There is no separate function call to transition a zone to the open
state. Therefore, the OCSSD open state is most similar to the ZNS
implicit open state.
Since we cannot remove the SPDK_BDEV_ZONE_STATE_OPEN identifier,
for backwards compatibility reasons, make the SPDK_BDEV_ZONE_STATE_OPEN
identifier an alias to the new SPDK_BDEV_ZONE_STATE_IMP_OPEN identifier.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I98f3a280cd9e595100155568a3c0332c667a834b
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6907
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Assign values to enum spdk_bdev_zone_state.
This change will simplify a follow-up patch that will add new
identifiers.
If the first enumerator has no =, the value of its enumeration constant
is 0.
No functional change intended.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I1dc1cbcf262a8c6384a7133042f864074e1b992c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6906
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This change refactors the way nvmf_get_stats RPC works.
The RPC layer passes JSON write context to custom dump function defined within transport ops.
The RPC layer no longer needs to know the structure of transport poll group statictics.
Functions and structures used in the previous flow have been deprecated and will be removed.
JSON returned for RDMA transport should be the same as before this change.
Signed-off-by: Maciej Szulik <maciej.szulik@intel.com>
Change-Id: I03308c45be120793d316bf79814a1295afd9fb95
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6681
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
This API has been deprecated since SPDK 20.07,
see commit (b2947f52).
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Idb45906c81ea5682c6a67def0265910266d861b5
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6629
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Purpose: To setup an accelerated function callback
for created spdk_nvme_poll_group. In this patch,
we just create the interface. The real usage of this
call back will be provided in the other patch.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I0d936aa4eba4dbfcc0137942156b9f2919eb5b78
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6758
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
A very heinous error case is when a delete_io_sq or
delete_io_cq command fails when freeing a PCIe IO
qpair. In that case we return an error to the caller,
and the resources are not freed.
Instead of trying to 'fix' this error case, let's
just tell the user that if freeing an IO qpair fails,
the user should reset the controller and try again.
Resetting the controller will do one of two things:
1) The subsequent create_cq/create_sq will succeed,
at which point, trying to delete again should work
fine. (If it doesn't, it indicates some kind of
horrendous SSD.)
2) The subsequent create_cq/create_sq will fail,
at which point the entire ctrlr would fail and
need to be detached. This could result in some
leaked memory, but seems unavoidable.
Closes issue #716.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I218564080abec4d6b5ac0b861869ebfc7d99f2c3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6918
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: <dongx.yi@intel.com>
When ns set OPTPERF to 1, use NPWG and NPWA to set sectors_per_stripe.
Signed-off-by: sunshihao <sunshihao@huawei.com>
Change-Id: Iaf72ba5a7f21b33c357fba8deced9616e0bfaca6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6613
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The space should not be there. The doxygen
will not generate proper docs with it.
Change-Id: Id7e6fb2228abf1717e7e4097a9454c7820884655
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6881
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is an internal API used in several places. The call can fail, so
make sure it can report that correctly.
Change-Id: Iac0ed2c8299c9dd3d2556070278a2224c3807b7b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6640
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <dongx.yi@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_vbdev_register() was deprecated in SPDK 19.04.
config_text field in spdk_bdev_module was deprecated in SPDK 20.10.
spdk_bdev_part_base_construct() was deprecated in SPDK 20.10.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ib795ccdf61154c168032ccf8b81ea77e5e663851
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6628
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
This statistic is incremented when we don't reap
anything from the CQ. Together with the total number
of polls it can be useful to estimate idle percentage.
Change-Id: I61b51d049b0bc506fb8a896e225187e46e75a564
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6295
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Leverage SO_INCOMING_CPU to get the CPU affinity of connections
(sockets). And allocate the connections to specific poll groups,
which aims to utilize cache locality.
From our test:
6 P4600 NVMe on target,target uses 8 cores, NIC irqs are bound to
these 8 cores, and initiator side uses 24 and 32 cores,
we can get 11%~17% randwrite performance boost for posix, and 8%~12%
for uring.
Change-Id: I011e0a21502c85adcccd4a14fbe9838b43f54976
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5748
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
max_delay_us was deprecated in SPDK 19.04.
config_file was deprecated in SPDK 20.10.
master_core/pci_blacklist/pci_whitelist were deprecated in SPDK 21.01.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ie8be4c347de58044a7c3d5b1329d96e47ce084b4
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6594
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The fields were deprecated in SPDK 21.01.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: If6b946024bf1ce4c106cdf493bcb5662a3b21b13
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6592
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These statistics can help to estimate efficiency of
Work Requests batching and show busy/idle polls ratio.
Send: the doorbell updates statistics for verbs
provider are incremented per each ibv_post_send call,
for mlx5_dv per each ibv_wr_complete call.
Recv: the doorbell updates statistics for both
providers are updated when either ibv_post_recv
or ibv_post_srq_recv functions are called.
Each qpair on initialization accepts an optional
pointer to shared statistics (nvmf/nvme poll groups).
If the pointer to statistics is not provided then
qpair allocates its own structure. That is done
to support cases when NVME RDMA initiator doesn't
use poll groups, so we can avoid checks that qpair
has statistics in IO path
Change-Id: I07dea603cb870b85ea23c42e8e2c4520b1c66252
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6293
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is used to leverage accelerated engine to compute
the data digest in the following case:
1 DIF is not used.
2 The data to compute is aligned with size 4, i.e, %4 = 0.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I51fb6e3ab04391062b244cba6e249c8e20d3180f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6014
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch added the chained crc32 support API for both batched
and non batched mode usage. And also update the accel_perf
program in order to use the revised accelerated crc32 function.
For example, you can use the following command:
./build/examples/accel_perf -C 4 -q 128 -o 4096 -t 5 -w crc32c -y
In this command, "-C 4" means that caculate the chained
crc32 for an iov array.
(even if you do not have the accelerated DSA hardware)
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Ifede26f9040980b5791da8e5afef41177eede9f6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6457
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The statement causes this issue is:
assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL);
The call trace is:
The previous solution is:
commitid with: e71e81b631
But with this solution, it will always add the sock
into the removed_socks list even if it is not under polling
context by sock_group_impl_poll_count. So it will exceed the size of
removed_socks array if sock_group_impl_poll_count function will not be
called. And we should not use a large array, because it is just a workaround,
it just hides the bug.
So our current solution is:
1 Remove the code in sock layer, i.e., rollback the commit
e71e81b631. This patch is
not the right fix. The sock->cb_fn's NULL pointer case is
caused by the cb_fn of write operation (if the
spdk_sock_group_remove_sock is inside the cb_fn). And it is not
caused by the epoll related cache issue described in commit
"e7181.." commit, but caused by the following situation:
(1)The socket's cb_fn is set to NULL which is caused by
spdk_sock_group_remove_sock by the socket itself
inside a call back function from a write operation.
(2) And the socket is already in the pending_recv list. It is
not caused by the epoll event issue, e.g., socket A changes Socket B's
cb_fn. By the way, A socket A should never remove a socket B from a polling group.
If it really does it, it should use spdk_thread_sendmsg to make sure
it happens in the next round.
2 Add the code check in each posix, uring implementation module.
If sock->cb_fn is NULL, we will not return the socket to the active socks list.
And this is enough to address the issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I79187f2f1301c819c46a5c3bdd84372f75534f2f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6472
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add a function to get the number of max active zones for a zoned
namespace.
The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia09e3db157ca0afadbd3ca4032eedd7bcd88248c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6443
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Add a function to get the number of max open zones for a zoned
namespace.
The value inside the identify namespace struct is a 0's based value,
where 0xffffffff means unlimited.
If unlimited, the addition will overflow and return 0,
which is the intended value to represent unlimited for this API.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I4223146bc1ddf90486892a0af5fe5ce006dc5fd3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6442
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: sunshihao <sunshihao@huawei.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch provides two new accelerated crc32c function interface.
And the next patch will be used to add the real support of chained crc32c feature.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: I3f8dd55c3da636e29e5fb02fc229b51f05653cd6
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6456
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
We already have support for spdk_nvme_zns_zone_append(),
add support for spdk_nvme_zns_zone_appendv() (zone append with
NVME_PAYLOAD_TYPE_SGL).
_nvme_ns_cmd_rw() currently performs verification of the SGL,
if the parameter check_sgl is set. This parameter is set for all
calls with payload of type NVME_PAYLOAD_TYPE_SGL.
In order to be able to perform the same check_sgl verfication on
zone append vectors, we need to refactor _nvme_ns_cmd_rw() a bit.
Setting check_sgl ensures that _nvme_ns_cmd_split_request_sgl() or
_nvme_ns_cmd_split_request_prp() gets called.
These functions will split an oversized I/O into several different
requests. However, they also iterate the SGE entries, verifies that
the total payload size, total SGE entries is not too many, and that
buffers are properly aligned. A proper request will not get split.
For zone append, splitting a request into several is not allowed,
however, we still want the verification part to be done, such that
(e.g.) a non first/last SGE which is not page aligned, will cause
the whole request to be rejected.
(In the case of spdk_nvme_ns_cmd_write(), a non first/last SGE which
is not page aligned will instead cause the request to be split.)
An alternative would be to try to rip out the verification part from
_nvme_ns_cmd_split_request_sgl() and _nvme_ns_cmd_split_request_prp().
However, that is non-trivial, and would most likely end up with a lot
of duplicated code, which would easily get out of sync.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I2728acdcadeb70b1f0ed628704df19e75d14dcca
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Sometimes it is more optimal to get the zone size in number
of sectors, instead of in number of bytes.
Therefore, add a new spdk_nvme_zns_ns_get_zone_size_sectors()
function to get zone size in number of sectors.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I0fe67e00a3d74dd27acfc895ae97448d995b89a3
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6416
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Mellanox Build Bot
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Make enum spdk_nvme_ns_flags more readable.
Other enums in spdk, e.g. enum spdk_nvme_ctrlr_flags,
and e.g. enum NVME_RDMA_COMPLETION_FLAGS, already define
the enum values using bit shifts.
Do the same for enum spdk_nvme_ns_flags.
No functional change intended.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: Ia78ab0bab6fa37d700178d5b2669feb5a3003871
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6383
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The current implementation treats HPDA/CPDA as the absolute offset
to the beginning of the PDU where the payload data starts. This is
incorrect. The HPDA/CPDA actually specify where the payload data
should start such that the starting location is a multiple of HPDA
(for C2H PDU) or CPDA (for H2C PDU or CapsuleCmd PDU).
The other issue fixed is that the current implementation calculates
padding only when header digest is enabled. This is also incorrect.
Signed-off-by: Wenhua Liu <liuw@vmware.com>
Change-Id: If7a3896a4c1d73f6d062bd3dbe6a912d31771180
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6256
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Identify application prints the PMR details if it is supported
Signed-off-by: Krishna Kanth Reddy <krish.reddy@samsung.com>
Change-Id: Iaba4c15e18e1402035b11a34b2defe8078855751
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6209
Community-CI: Broadcom CI
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Aspects of bit fields are 'implementation defined'. On some platforms
alignment will occur if two adjacent fields are of different types. This
occurs in spdk_nvme_feat_async_event_configutation after the crit_warn
member which is effectively an int8_t, followed by an int16_t. There
isn't a generic way of changing the compiler's behaviour, so the best
options are:
- Change crit_warn to a uint32_t bit field and copy the value to/from
a spdk_nvme_critical_warning_state variable to use it. This requires
changes to code using the field.
- Adjust the structure definition to use smaller types to avoid the
problem. This preserves existing semantics, but the field order will
need to be reviewed if big-endian support is ever added (other places
in nvme_spec.h will need similar attention). A second reserved field
is required.
Use smaller types which seems the most straightforward option. Adjust
the use of the spdk_nvme_feat_async_event_configuration reserved fields
in lib/nvmf/ctrlr.c.
The new structure is binary compatible and the fields behave in the same
way, with the exception of an additional reserved field, so updating
CHANGELOG.md probably isn't necessary.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Change-Id: I7d8163c84b4f410fc95a5b7064506ad7b4b62c6c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6340
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
While zone append is very similar to write, we should refer
to the proper I/O command.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Change-Id: I85dc2634eefb502dd38351ca02d9b09a15c5620f
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6245
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Mellanox Build Bot
This function allows applications to specify whether
they wish to allow probing a newly attached NVMe
PCIe SSD.
The env layer will only even probe devices that have
been allowed. By default, this is all devices, but
if the user has specified some list of
allowed PCI addresses (via spdk_env_opts pci_allowed)
then newly attached PCIe devices are implicitly not
allowed. This API allows applications to add
device addresses to the allowed list after the
application has started.
This API will be useful for use cases where multiple
SPDK processes are running on one server, and assignment
of PCIe SSDs to those processes are based on some function
of the SSD's PCIe address.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I919bc267f2ad9130ab5c875ff760a301028b047e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6184
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
The env layer has a pci_allowed list, which specifies
that only a subset of PCI devices may be attached
by the associated process.
But that doesn't cover PCI devices that are hot-inserted
after the application starts, which is common for
storage/NVMe.
So add a new spdk_pci_device_allow() API which allows
an application to add new devices to the allowed list.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7bd5ff428d84480d46bc236698daadd019b20b8e
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6183
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: <dongx.yi@intel.com>
For flow control reasons we have to resize the bit arrays we
use to manage flow as channels come and go. However since
channels are assigned to devices, until the channel count
reaches the device count there's no sharing so no resize of
the array is needed. So, when we use a device for the first
time there's no need to run through the rest of the channels
and re-balance.
Same thing is done on destruction. The code to free idxd
specific resources was moved from the rebalance function to
the idxd put channel function which is a much more logical
place for it as well.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Ib4df163286906f413dd6429dc6833af7b68e208c
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5846
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This patch is used to add spdk_nvme_poll_group_get_optimal
public API.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Iee34c89e0e1ff1f81167b18e198c144ca28f71de
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3311
Community-CI: Broadcom CI
Community-CI: Mellanox Build Bot
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is first commit that should go into latest SPDK
after the code freeze for SPDK 21.01.
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: Ieeef261f4a83183946358916a084fef886339d83
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6052
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Additionally, the user can specify a namespace to also pause during the
operation.
This allows for the management of hosts, listeners, and the addition of
namespaces all while I/O to other namespaces is occurring. Pausing a
specific namespace also allows for the removal of that namespace without
impacting I/O to other namespaces in the subsystem.
Change-Id: I364336df16df92fe2069114674cb7a68076de6fb
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4997
Community-CI: Broadcom CI
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
We want to avoid an active for_each_reactor operation
while the reactors are being shut down.
Fixes issue #1766
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I30bc258c5b22545320080d269a1ed8cb0b4e12f2
Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6104
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>