_spdk_bdev_io_submit uses the bdev_io->internal.in_submit_request
flag to ensure we unwind in cases where the I/O is completed
inline (e.g. malloc or null bdevs). But when an I/O was queued
for QoS and we later iterated through the queued I/O in
_spdk_bdev_qos_io_submit(), this flag was not set when those
I/O were submitted to the underlying bdev. This allowed
_spdk_bdev_qos_io_submit() to recurse, resulting in various
kinds of memory corruption.
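The unwind pattern described above can be sketched roughly as follows. This is a toy model, not SPDK code: toy_io, toy_submit, toy_complete and toy_run_deferred are hypothetical names, and the "deferred" flag stands in for whatever mechanism finishes the completion after the stack has unwound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bdev_io; not the SPDK structure. */
struct toy_io {
	bool in_submit_request;	/* set while the backend submit call is on the stack */
	bool deferred;		/* completion postponed until the stack unwinds */
	bool completed;		/* completion has actually been processed */
};

/* Complete an I/O. While submit is still on the stack, defer instead of
 * finishing inline, so that completion cannot re-enter the submit path. */
static void
toy_complete(struct toy_io *io)
{
	assert(io != 0);
	if (io->in_submit_request) {
		io->deferred = true;
		return;
	}
	io->completed = true;
}

/* Backend that completes inline, as a malloc or null bdev would. */
static void
toy_backend_submit(struct toy_io *io)
{
	toy_complete(io);
}

/* Every path that hands an I/O to the backend must set the flag,
 * including a path that drains previously queued I/O. */
static void
toy_submit(struct toy_io *io)
{
	io->in_submit_request = true;
	toy_backend_submit(io);
	io->in_submit_request = false;
}

/* Called later, outside of the submit path, to finish deferred completions. */
static void
toy_run_deferred(struct toy_io *io)
{
	if (io->deferred) {
		io->deferred = false;
		toy_complete(io);
	}
}
```

In this model, a submit path that forgets to set the flag would let toy_complete() finish inline, which is the kind of re-entry the fix prevents.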
Fixes #613.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I29263f4e7b2ead60f08b60474d210defa803348c
Reviewed-on: https://review.gerrithub.io/c/442127
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
This patch adds support for configuring separate read and write
bandwidth rate limits in the configuration file.
Below is an example (in MiB) of the configuration section:
[QoS]
Limit_Read_BPS Malloc0 100
Limit_Write_BPS Nvme0n1 200
Change-Id: I0221516ce70c3fbb07b9e80c1c814ed5ba271c88
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/416672
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds two new function pointers (queue_io,
update_io) for QoS operations such as IOPS and
bandwidth rate limits.
Change-Id: I2ffd67c5f1c421eab448fd5e95f809da55805fcd
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Some patches to the Linux kernel separate
out discard I/Os in the statistics written to /proc/diskstats and
the various /sys/block/*/stat files. The following additional
fields were added to these files:
Field 12 -- # of discards completed
Field 13 -- # of discards merged
Field 14 -- # of sectors discarded
Field 15 -- # of milliseconds spent discarding
SPDK can provide this raw information in bdev_io_stat. Users
can use this information to calculate more statistics for a block device.
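As an illustration of the field layout listed above, here is a minimal sketch that pulls fields 12 to 15 out of one line in the /sys/block/*/stat layout. The struct and function names are hypothetical and are not part of SPDK or the kernel; the sketch assumes the line contains only whitespace-separated unsigned numbers, as the stat files do.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the four discard fields. */
struct discard_stats {
	unsigned long long completed;	/* field 12: # of discards completed */
	unsigned long long merged;	/* field 13: # of discards merged */
	unsigned long long sectors;	/* field 14: # of sectors discarded */
	unsigned long long ms;		/* field 15: # of milliseconds spent discarding */
};

/* Parse one stat line. Returns 0 on success, or -1 if the line has fewer
 * than 15 numeric fields (for example on a kernel without discard stats). */
static int
parse_discard_stats(const char *line, struct discard_stats *out)
{
	unsigned long long field[15];
	const char *p = line;
	char *end;
	int i;

	assert(line != NULL && out != NULL);
	for (i = 0; i < 15; i++) {
		field[i] = strtoull(p, &end, 10);	/* skips leading whitespace */
		if (end == p) {
			return -1;	/* no more numbers on the line */
		}
		p = end;
	}

	out->completed = field[11];
	out->merged = field[12];
	out->sectors = field[13];
	out->ms = field[14];
	return 0;
}
```

Any fields after the fifteenth are ignored by this sketch.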
Signed-off-by: Yanbo Zhou <yanbo.zhou@intel.com>
Change-Id: I517d67f0ff0159baf04e24732a8fd0ccefcb9c46
Reviewed-on: https://review.gerrithub.io/c/439057
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The underlying io_channel created by a bdev module
could never be freed in various cases due to missing
spdk_put_io_channel() calls.
Change-Id: If1aa736f67f0354c4f9dcdf61d3f1e6b859b4c48
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/438803
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Since bdev->internal.mutex protects variables in bdev->internal,
there's no need to hold the mutex while initializing the fields of the
spdk_bdev_desc struct, or while freeing the struct when claim_module exists.
Shrinking the critical section helps reduce lock contention
caused by opening a bdev concurrently.
Change-Id: Ie6b2a63e44a97cfed914fa6d8adef2c5ec3ac300
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/c/438514
Reviewed-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When a channel is created, the fields that _spdk_bdev_channel_destroy_resource()
checks are always set to non-NULL. Removing these unnecessary if
statements exposes issues more easily if something goes wrong.
Change-Id: I2d505c87176d4d49eb1528a258e4bea6477e0fe6
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/c/438799
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Since division is more expensive than a right shift, use a
right shift instead of division in spdk_bdev_bytes_to_blocks
when the blocklen of the bdev is a power of two.
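A minimal sketch of the shift-versus-division idea, assuming a hypothetical helper named bytes_to_blocks (this is not the actual spdk_bdev_bytes_to_blocks signature). For brevity the shift is recomputed on every call here; an implementation aiming for speed would presumably compute it once and cache it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool
is_power_of_two(uint32_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* log2 of a value known to be a power of two. */
static uint32_t
log2_pow2(uint32_t x)
{
	uint32_t shift = 0;

	while (x > 1) {
		x >>= 1;
		shift++;
	}
	return shift;
}

/* Convert a byte count to a block count. Returns 0 if bytes is a multiple
 * of blocklen, nonzero otherwise. */
static int
bytes_to_blocks(uint32_t blocklen, uint64_t bytes, uint64_t *blocks)
{
	assert(blocklen != 0 && blocks != 0);

	if (is_power_of_two(blocklen)) {
		uint32_t shift = log2_pow2(blocklen);

		*blocks = bytes >> shift;
		return (bytes - (*blocks << shift)) != 0;
	}

	/* Fall back to division for block sizes such as 520. */
	*blocks = bytes / blocklen;
	return (bytes % blocklen) != 0;
}
```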
Change-Id: Ib3dbc792e86582bba30b3dc028efbd12c69075ba
Signed-off-by: lorneli <lorneli@163.com>
Reviewed-on: https://review.gerrithub.io/438318
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If the QoS thread has not been properly initialized yet, we
need to go through the regular QoS enabling process to notify
all the channels and also disable QoS properly. The channel
and poller related state also needs to be handled together
with the thread.
Change-Id: Ifc2b2cdfb1181aa6418ad1d43ae5905c0c317549
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/437519
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Previously, the only way to find out which mask bit is used for
a specific trace group was to check the source code. Now list
each trace group with its tpoint group mask bit in the
usage message.
Change-Id: I7a85fe9c0885f1919f6ffbdc97dab81f1986fb07
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/435448
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be required in following histogram patches.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I2eee6629243b7a4838a80dc1de33ae485c58081e
Reviewed-on: https://review.gerrithub.io/433874
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There are some use cases such as multipath and RAID expansion where a
vbdev could have been registered before one of its base bdevs.
Currently we unregister bdevs at shutdown in reverse order of their
registration. Continue to do that in general, but skip any bdev that
is still claimed. Any bdevs skipped in this way will eventually be
unregistered once any bdevs that have claimed it have completed
unregistration.
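The shutdown order described above can be modelled with a small sketch. It is a simplification under stated assumptions: toy_bdev and toy_shutdown are hypothetical names, each bdev is claimed by at most one other bdev, and unregistration completes synchronously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bdev; array index equals registration order. */
struct toy_bdev {
	int claimed_by;		/* index of the bdev that claimed this one, or -1 */
	bool unregistered;
};

/* Unregister every bdev, walking in reverse registration order and skipping
 * any bdev that is still claimed. order[] receives the indices in the order
 * they were unregistered. Returns the number of bdevs unregistered. */
static size_t
toy_shutdown(struct toy_bdev *bdevs, size_t n, size_t *order)
{
	size_t done = 0;
	bool progress = true;

	assert(bdevs != NULL && order != NULL);
	while (progress) {
		progress = false;
		for (size_t i = n; i-- > 0;) {
			if (bdevs[i].unregistered || bdevs[i].claimed_by >= 0) {
				continue;	/* already gone, or still claimed */
			}
			bdevs[i].unregistered = true;
			order[done++] = i;
			/* Unregistering i releases its claims on base bdevs. */
			for (size_t j = 0; j < n; j++) {
				if (bdevs[j].claimed_by == (int)i) {
					bdevs[j].claimed_by = -1;
				}
			}
			progress = true;
			break;	/* restart from the most recently registered bdev */
		}
	}
	return done;
}
```

With a vbdev at index 0 that was registered before its base bdev at index 1, the base is skipped on the first pass and unregistered only after the vbdev has released its claim.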
Change-Id: Iafde9558430bc5ce56e8608ef50bcb2b5fbfbf71
Signed-off-by: Andrey Kuzmin <akuzmin@jetstreamsoft.com>
Reviewed-on: https://review.gerrithub.io/432136
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Now that _spdk_bdev_io_get_buf offers aligned buffer allocation,
add the ability to store the original buffer and replace it with an
aligned one for the duration of the I/O.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: If0ed306175631613c0f9310dccaae6615364fb49
Reviewed-on: https://review.gerrithub.io/429754
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a requirement for the following patch. Requests that
request a bounce buffer can only allocate a buffer of limited size.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I850b614305d66065733381ceb7bd67d4b1cad6b3
Reviewed-on: https://review.gerrithub.io/430783
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch changes the name of the field. Following patches
will introduce logic that guarantees that buffers
provided to the bdev module are aligned to the value
specified in this field.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I5329b9fe26ef2417bc7beae86518cc643b263f97
Reviewed-on: https://review.gerrithub.io/430782
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
SPDK poller functions now return different values for
different statuses. Update the QoS poller for this change.
Change-Id: Ia384bf00a23713df663c317b1997ead441c5adcb
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/428573
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Initialize fields later assumed to be NULL
Change-Id: I61e054dd275c6c04fb3f826adc445e56f0add331
Signed-off-by: shahar salzman <shahar.salzman@kaminario.com>
Reviewed-on: https://review.gerrithub.io/428304
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96b5410a92f176aef11e00829fdebd36910ac2d4
Signed-off-by: shahar salzman <shahar.salzman@kaminario.com>
Reviewed-on: https://review.gerrithub.io/428302
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Large read I/O will be typical in some use cases such as
web streaming services. Large write I/O may be less
typical but is still sufficiently probable.
Currently, when a large I/O is submitted to the RAID bdev,
the I/O is divided by the RAID bdev's strip size and the
divided I/Os are then submitted sequentially.
This patch tries to improve the performance of the RAID bdev
for large I/Os. Besides, when the RAID bdev supports higher
levels of RAID (such as RAID5), it should issue multiple
I/Os to multiple base bdevs in batches during the parity
update. Having experience with batched I/O will be helpful
in that future case too.
In this patch, split I/Os are submitted in batches until all child IOVs
are consumed or all data is submitted. If all child IOVs are
consumed before all data is submitted, wait until all batched
split I/Os complete and then submit again.
Test code is also added in this patch.
Change-Id: If6cd81cc0c306e3875a93c39dbe4288723b78937
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424770
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When destroying QoS, spdk_bdev_qos_destroy() allocates a new qos and swaps
it with the old one. After spdk_bdev_unregister() frees the new qos, the old
qos poller may still reference the new qos via bdev->internal.qos. Fix this
error by using the old qos in _spdk_bdev_qos_io_submit().
Reported in 72aac51430.1539054028/ubuntu16.04/build.log
Change-Id: Id1bce6c8b1cefae604dd2c69e8f3482ec34b1b54
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/428444
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds an RPC method to enable, adjust, and
disable the bandwidth rate limit on a bdev.
It can work together with the existing IOPS rate limit.
The RPC method has been consolidated to support both IOPS
and bandwidth rate limits as below:
usage:
  rpc.py set_bdev_qos_limit [-h]
                            [--rw_ios_per_sec RW_IOS_PER_SEC]
                            [--rw_mbytes_per_sec RW_MBYTES_PER_SEC]
                            name
positional arguments:
  name                  Blockdev name to set QoS. Example: Malloc0
optional arguments:
  -h, --help            show this help message and exit
  --rw_ios_per_sec RW_IOS_PER_SEC
                        R/W IOs per second limit (>=10000, example: 20000).
                        0 means unlimited.
  --rw_mbytes_per_sec RW_MBYTES_PER_SEC
                        R/W megabytes per second limit (>=10, example: 100).
                        0 means unlimited.
Change-Id: I9c03cd635280add01801a81c6a6c02f0cf85bee1
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/416511
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
scan-build requests a check on the size of the parent iovec, since
the unit tests use an artificially large LBA.
Fix the error by using a position instead of a pointer and checking
that the position is less than the count of the parent iovec.
Change-Id: I74c4f6d1b68ecfca93e9247acc5ac6bd5412a960
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427965
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Each file that needs to check SPDK_CONFIG_* options needs to include
spdk/config.h explicitly.
Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/427221
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This patch introduces a dedicated QoS structure
and an enumeration for the types of QoS rate limits, so that
new types of QoS rate limits can be supported easily later.
Change-Id: Idb8d2e7627fd145bf2b0ddb296c968b6b068f48c
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/424459
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Currently, when a write to the bdev fails in _spdk_bdev_write_zero_buffer_next
for a reason other than -ENOMEM, assert is called.
The RAID bdev using this feature is generally available now, so it is
OK to remove this assert and return an error instead.
Additionally, applying the factored-out function to _bdev_write_zero_buffer_next
will improve readability slightly.
Both changes are made in this patch.
Change-Id: I462630a71e57e2e5146b085b215d62a378ea9402
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427186
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch factors out that operation into a function and adds
error handling to that operation to improve readability slightly.
Change-Id: Ic24df0c0a9abbebc38d30fc17779dc5a5f6138a6
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427026
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Currently, when a read/write to the bdev fails in _spdk_bdev_io_split_with_payload
for a reason other than -ENOMEM, assert is called.
The RAID bdev that utilizes the split I/O feature is generally available now,
so it is OK to remove this assert and return an error instead.
Change-Id: I6ea6fd45b94bff0ea84e498e0c4dfd1dd31e0260
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/427025
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When a bdev IO is split, if the number of iovecs in a strip is more than 32,
the IO will fail.
Remove the limitation by splitting the split IO further.
Change-Id: I962ad86dfe63ea1fcd86ffa52ead7452fb80e53d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/425876
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
To achieve its goal, this patch changes the order
in which bdev modules are finished. All modules
that examine bdevs (e.g. lvol, split, ...) will now be
finished last. This should not cause any issues though,
since all bdevs are already removed by the time
any module finish is called.
Fixes #387
Change-Id: Id60c375eb5c3d7306b69cdce86bded77354868d8
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/421158
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The type of the second parameter `success' of spdk_bdev_io_completion_cb
is bool. Hence change the code to pass bool values for success or failure.
Change-Id: I9e93f4ccbb085e8e184f209e706915dcd34aa966
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/426648
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
We cannot split an iov if a buffer hasn't been
allocated yet. So always call spdk_bdev_io_get_buf
on reads before trying to split.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113
Reviewed-on: https://review.gerrithub.io/424879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
spdk_bdev_io_put_buf() is responsible for reclaiming
bdev-allocated buffers from a bdev_io. If there are
bdev_ios waiting for one of these buffers, it calls
spdk_bdev_io_set_buf() on the next bdev_io in the queue.
This will set the iov_base and iov_len on the bdev_io
to point to the bdev-allocated buffer.
But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf()
on the just completed bdev_io, not the next bdev_io in the
queue. So fix that.
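The intended hand-off can be sketched with a toy model. The names toy_io, toy_waiters and toy_put_buf are hypothetical, and returning the buffer to a pool when nobody is waiting is left out; the only point is that the reclaimed buffer goes to the next waiter, not back to the I/O that just completed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bdev_io that may own a data buffer. */
struct toy_io {
	void *buf;			/* buffer currently assigned, or NULL */
	struct toy_io *next_waiting;	/* link in the wait queue */
};

/* Queue of I/Os waiting for a buffer. */
struct toy_waiters {
	struct toy_io *head;
};

/* Reclaim the buffer from a completed I/O and, if any I/O is waiting,
 * assign it to the NEXT waiter in the queue. */
static void
toy_put_buf(struct toy_waiters *w, struct toy_io *done)
{
	void *buf;

	assert(w != NULL && done != NULL);
	buf = done->buf;
	done->buf = NULL;

	if (w->head != NULL) {
		struct toy_io *next = w->head;

		w->head = next->next_waiting;
		next->next_waiting = NULL;
		next->buf = buf;	/* the bug assigned this to 'done' instead */
	}
	/* Otherwise the buffer would go back to its pool (omitted here). */
}
```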
Fixes: 844aedf8 ("bdev: Simplify get/set/put buf functions")
Reported-by: Alan Tu
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638
Reviewed-on: https://review.gerrithub.io/424880
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When a new bdev is created, spdk_bdev_module::examine_disk()
may open and close the bdev. On the other hand, if something goes wrong,
the creation procedure may unregister the newly created bdev, so a race
condition appears between _remove_notify() and spdk_bdev_close().
Add the new fields "closed" and "remove_notified" to struct spdk_bdev_desc,
so that _remove_notify() and spdk_bdev_close() know how to deal with this
situation.
Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/423369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_bdev_close() must be called on the same thread as
spdk_bdev_open(). Further, the remove callback on the
descriptor will also be run on the same thread as
spdk_bdev_open().
Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424738
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When emulating write_zeroes commands on devices that
don't natively support them, we submit a write with
a zeroed buffer. We used to just reuse the original
bdev_io, but that was recently changed due to other
splitting code added for iovs. But when making those
changes, we forgot to free the bdev_io for the
write that was sent down to the device.
Fixes: 183f37e8 (bdev: do not reuse bdev_io when...)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b
Reviewed-on: https://review.gerrithub.io/424733
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This RPC does not work for a lot of bdev types. For
example, NVMe namespaces and virtio scsi LUNs are not
explicitly constructed by an RPC - they are indirectly
constructed by an RPC associated with an NVMe controller
or virtio-scsi controller.
While here, remove spdk_bdev_config_json. It was
only created to facilitate this get_bdevs_config RPC.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I537166d8f91ab458bd2000859d74f7254bfc9c0a
Reviewed-on: https://review.gerrithub.io/424584
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This isn't a valid RPC so it needs to be removed. Bdev modules were
working around this issue by defining empty write_config_json
methods.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3c4c20249eadfcfb4103430f5801190b14897249
Reviewed-on: https://review.gerrithub.io/424582
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This was added a long time back for tracking an rte_mbuf
whose buffer was a different rte_mbuf - all related to
a userspace TCP stack that is no longer in development.
The concept isn't useful now, so remove it to reduce
the complexity of the tracing code.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I310e492eba7f55df242bb29d82fb19f6daee1f51
Reviewed-on: https://review.gerrithub.io/424565
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is a string name used for debugging only.
Change-Id: I9827f0e6c83be7bc13951c7b5f0951ce6c2a1ece
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424127
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
There could be cases (especially in virtualized and/or test
environments) where we could accumulate significant skew in
the timeslice frequency. Rather than depend on the application
framework to try to guarantee the rate of timeslice poller
callbacks, keep track internally of the last time the poller
was invoked. If/when we accumulate and detect skew equivalent
to one or more timeslices, increase the allowed IO and bandwidth
of the next timeslice to accommodate.
Since bdev poller now calls spdk_get_ticks() to do accounting,
this patch also fixes up the increment_time() unit test function
and the test env layer to properly increment the fake TSC.
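A rough sketch of the skew accounting described above, assuming a hypothetical toy_qos structure and only an I/O count budget (a bandwidth budget would be credited the same way). The field and function names are illustrative and are not taken from the SPDK source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical QoS state; 'now' values stand in for spdk_get_ticks(). */
struct toy_qos {
	uint64_t last_timeslice;	/* tick value at which the current timeslice began */
	uint64_t timeslice_ticks;	/* length of one timeslice in ticks */
	uint64_t max_ios_per_timeslice;	/* I/O budget for a single timeslice */
	uint64_t remaining_ios;		/* I/O still allowed in the current timeslice */
};

/* Poller body. Returns the number of whole timeslices that elapsed since
 * the last accounting point; 0 means the current timeslice is still running.
 * If the poller was late by several timeslices, the next timeslice is
 * credited with the budget of all of them. */
static uint64_t
toy_qos_poll(struct toy_qos *qos, uint64_t now)
{
	uint64_t elapsed;

	assert(qos != 0 && qos->timeslice_ticks != 0);
	if (now < qos->last_timeslice + qos->timeslice_ticks) {
		return 0;
	}

	elapsed = (now - qos->last_timeslice) / qos->timeslice_ticks;
	qos->last_timeslice += elapsed * qos->timeslice_ticks;
	qos->remaining_ios = elapsed * qos->max_ios_per_timeslice;
	return elapsed;
}
```

Advancing last_timeslice by whole timeslices, rather than setting it to now, keeps any fractional remainder so it counts toward the next timeslice.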
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba301ddc0fb3d02042106a8bf6e4a6a9a84dc263
Reviewed-on: https://review.gerrithub.io/423580
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Splitting a 1TB unmap into individual 64KB unmap commands
(for a RAID volume with 64KB strip size) would be awful -
the RAID module can be much smarter about this.
So back out the changes for splitting I/O without payload.
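To put a number on the example above, a small worked calculation, assuming binary units (1TB = 2^40 bytes, 64KB = 2^16 bytes) and an I/O that starts on a strip boundary. The helper name split_count is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of child commands needed when an I/O that starts on a strip
 * boundary is split into strip-sized pieces (rounded up). */
static uint64_t
split_count(uint64_t io_bytes, uint64_t strip_bytes)
{
	assert(strip_bytes != 0);
	return (io_bytes + strip_bytes - 1) / strip_bytes;
}
```

Under these assumptions, split_count(1ULL << 40, 64 * 1024) is 2^24 = 16777216 individual unmap commands.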
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I24fe6d911f4e3c9db4b2cb5d66c7236a5596e0d9
Reviewed-on: https://review.gerrithub.io/424103
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Now that we split on I/O boundaries, that code needs to
be able to use the bdev_io split* members to track
what is left to submit. This means that the write_zeroes
code cannot submit the parent bdev_io as the child bdev_io,
since the I/O boundary code will overwrite the write_zeroes
split accounting.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9316b59267508f60799766fc4f1ea05a4b3e5d9e
Reviewed-on: https://review.gerrithub.io/423404
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This will simplify some future patches which will
account for missed timeslice timers by allowing
additional IO/BW in the following timeslice.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9dd46a768c98ce267c733a9f9719a2d3d2c3c915
Reviewed-on: https://review.gerrithub.io/423579
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was defined in two places, so consolidate
the definitions.
Change-Id: I0bbb262b97e90d1064bcc50ee201928f6ca9518a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/423182
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
A number of modules (RAID, logical volumes) have logical
"stripes" that require splitting an I/O into several
child I/O. For example, on a RAID-0 with 128KB strip size,
an I/O that spans a 128KB boundary will require sending
one I/O for the portion that comes before the boundary to
one member disk, and another I/O for the portion that comes
after the boundary to another member disk. Logical volumes
are similar - data is allocated in clusters, so an I/O that
spans a cluster boundary may need to be split since the
clusters may not be contiguous on disk.
Putting the splitting logic in the common bdev layer ensures
bdev module authors don't have to always do this themselves.
This is especially helpful for cases like splitting an I/O
described by many iovs - we can simplify this a lot by
handling it in the common bdev layer.
Note that currently we will only submit one child I/O
at a time. This could be improved later to submit multiple
child I/O in parallel, but the complexity in the iov splitting
code also increases a lot.
Note: Some Intel NVMe SSDs have a similar characteristic.
We will not use this bdev stripe feature for NVMe though -
we want to primarily use the splitting functionality inside
of the NVMe driver itself to ensure it remains fully
functional. Many SPDK users use the NVMe driver without
the bdev layer.
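The boundary arithmetic behind the RAID-0 example above can be sketched as follows. This only computes child ranges for a contiguous offset and length; it does not model iov splitting, and the names toy_child and split_on_boundary are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One child I/O: a contiguous range that does not cross a boundary. */
struct toy_child {
	uint64_t offset;
	uint64_t length;
};

/* Split the range [offset, offset + length) at every multiple of boundary.
 * Writes at most max children into out[] and returns how many were written. */
static size_t
split_on_boundary(uint64_t offset, uint64_t length, uint64_t boundary,
		  struct toy_child *out, size_t max)
{
	size_t n = 0;

	assert(boundary != 0 && out != NULL);
	while (length > 0 && n < max) {
		/* Distance from the current offset to the next boundary. */
		uint64_t to_boundary = boundary - (offset % boundary);
		uint64_t len = length < to_boundary ? length : to_boundary;

		out[n].offset = offset;
		out[n].length = len;
		n++;

		offset += len;
		length -= len;
	}
	return n;
}
```

For a 64KB I/O starting at offset 96KB with a 128KB boundary, this yields two children of 32KB each, one on either side of the 128KB boundary.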
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife804ecc56f6b2b55345a0d0ae9fda9e68632b3b
Reviewed-on: https://review.gerrithub.io/423024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This value is used to calculate the disk utilization of a given bdev.
Change-Id: I4bf101c524b92bdd21573941e17f61db59c5c6b8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/423017
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be needed for using this same descriptor when
splitting an I/O.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idec759df7ab27f8de567d3c8a4214e25dbe173f5
Reviewed-on: https://review.gerrithub.io/423022
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This way we don't need to allocate memory (which may fail) just to free other
memory.
Change-Id: I2c83f6acc2aa6ed79455bff90f952a2e70b44d59
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422203
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The intent of these arrays was to keep track in the
bdev layer of all base<->virtual bdev relationships -
i.e. which member disk bdevs make up a RAID bdev,
which logical volume bdevs are associated with a
bdev that contains an lvolstore, etc.
Currently none of this is used however. And trying
to keep track in the bdev layer instead of asking
the bdev modules for the relationships has a number
of complications. Early on, we tried to do this
with TAILQs - but that doesn't work since this can't
be done with a single TAILQ_ENTRY in the bdev
structures. So we moved to arrays - that works a bit
better, but then the pointer arrays have to be
realloc'd which isn't ideal.
The biggest problem though with these arrays is that
they held bdev pointers - not bdev descriptor pointers.
It's not really valid to access bdevs without a
descriptor - the descriptors are what make sure active
references are accounted for when a bdev is hotplugged.
Of course the bdev layer knows when a bdev is getting
removed and could go and do the updates to these
arrays separately - but that just seems very convoluted.
So for now just remove these arrays completely. If
there is a future need for the bdev layer to
understand relationships between bdevs, we can add
module APIs so that the generic layer can ask
the modules about the relationships.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99ef1068240bff1262f64f234260cf2fb44df51d
Reviewed-on: https://review.gerrithub.io/420932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
This ensures (for example) that a RAID volume is
unregistered before its member disks.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7a7c16acc351f2d5d4218b64b370e2c77c6e2b5e
Reviewed-on: https://review.gerrithub.io/420812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
When an SPDK application shuts down, the bdev layer will
automatically unregister all of the bdevs to ensure they
are properly quiesced and cleaned up.
Some modules may want to perform different operations when
a bdev is destructed during normal runtime vs. shutdown.
For example, for lvol, when the last lvol is cleaned up,
it should unload the lvolstore, release and close the bdev
that contains the lvolstore. You never want to do this
during normal runtime though - it is perfectly valid to
have an lvolstore that contains no lvols. RAID and future
bdev modules such as multipath have similar use cases.
So add a new bdev module callback named "fini_start".
If a module specifies a function pointer for this callback,
the bdev layer will call it before it starts the bdev
unregistrations.
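A minimal sketch of the optional-callback pattern described above. The struct and function names here are hypothetical and only model the idea; they are not the bdev module definitions themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced module descriptor with an optional fini_start hook. */
struct toy_bdev_module {
	const char *name;
	void (*fini_start)(void);	/* may be NULL */
};

static int g_lvol_fini_start_calls;

/* Example hook: a module would note here that shutdown has begun. */
static void
toy_lvol_fini_start(void)
{
	g_lvol_fini_start_calls++;
}

/*
 * Call fini_start on every module that provides it. This is meant to run
 * before any bdev is unregistered. Returns the number of hooks invoked.
 */
static int
toy_bdev_fini_start(const struct toy_bdev_module *mods, size_t n)
{
	int called = 0;

	assert(mods != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (mods[i].fini_start != NULL) {
			mods[i].fini_start();
			called++;
		}
	}

	return called;
}
```

Modules that leave the pointer NULL are simply skipped, so existing modules would not need to change.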
This enables some future patches to the bdev layer such
that it will always unregister block devices that are not
claimed (i.e. logical volumes) before block devices that
are claimed (i.e. the bdev containing an lvolstore).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6e87f5c2b27f16731ea5def858f26e882a29495a
Reviewed-on: https://review.gerrithub.io/421175
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Claim the block device when adding it to a new Namespace,
and prevent the block device from being added twice by other
modules and Namespaces. Also remove the test that uses the
same block device across different Namespaces.
Fix issue #371.
Change-Id: Ib7ce18e9fde4a15c0f19ce9e28e69145e54570e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420472
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Currently, the bdev layer iterates over all of the
existing channels of a bdev to collect I/O
statistics. But this ignores statistics for
channels that are deleted.
Fix that by keeping an io_stat structure in the
bdev which accumulates statistics for deleted
channels. Use the bdev mutex to protect these
accumulations.
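The accumulation scheme can be sketched like this. It is a simplified model under assumptions: the toy_* names and the two-field stat structure are invented for illustration, and a plain pthread mutex stands in for the bdev mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced statistics structure. */
struct toy_io_stat {
	uint64_t bytes_read;
	uint64_t num_read_ops;
};

struct toy_bdev {
	pthread_mutex_t mutex;
	struct toy_io_stat stat;	/* totals from channels already deleted */
};

static void
toy_bdev_init(struct toy_bdev *bdev)
{
	pthread_mutex_init(&bdev->mutex, NULL);
	bdev->stat.bytes_read = 0;
	bdev->stat.num_read_ops = 0;
}

static void
toy_stat_add(struct toy_io_stat *total, const struct toy_io_stat *add)
{
	total->bytes_read += add->bytes_read;
	total->num_read_ops += add->num_read_ops;
}

/* On channel deletion, fold the channel's counters into the bdev under the mutex. */
static void
toy_channel_destroy(struct toy_bdev *bdev, const struct toy_io_stat *ch_stat)
{
	pthread_mutex_lock(&bdev->mutex);
	toy_stat_add(&bdev->stat, ch_stat);
	pthread_mutex_unlock(&bdev->mutex);
}

/* Totals = accumulated stats of deleted channels + stats of live channels. */
static struct toy_io_stat
toy_bdev_get_stat(struct toy_bdev *bdev, const struct toy_io_stat *live, size_t n)
{
	struct toy_io_stat total;

	assert(live != NULL || n == 0);

	pthread_mutex_lock(&bdev->mutex);
	total = bdev->stat;
	pthread_mutex_unlock(&bdev->mutex);

	for (size_t i = 0; i < n; i++) {
		toy_stat_add(&total, &live[i]);
	}

	return total;
}
```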
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3103c0b8b55973c827d977765f47e5b9e7f58e5f
Reviewed-on: https://review.gerrithub.io/421029
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be used further in an additional patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1126e5adfc24f17e5cdf33b0d3e04c78470771c8
Reviewed-on: https://review.gerrithub.io/421027
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
These were left out previously.
Change-Id: I4e97068d2f13ca1dd1cfae1b25564641cee794ef
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/419601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Two extra fields are added to the iostat rpc.
1. io_time. The amount of time since queue depth tracking was
enabled that has been spent on I/O processing.
2. weighted_io_time. Incremented each time this bdev's queue depth is
polled by the amount of time spent processing I/O since the last polling
event times the measured queue depth.
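One way to read the two definitions above, as a sketch. It assumes that a polling period counts as time spent on I/O whenever the measured queue depth is non-zero; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bdev counters, in microseconds. */
struct toy_qd_stats {
	uint64_t period_us;		/* queue depth polling period */
	uint64_t io_time_us;
	uint64_t weighted_io_time_us;
};

/* Called once per polling period with the queue depth measured at that poll. */
static void
toy_qd_stats_poll(struct toy_qd_stats *s, uint64_t measured_queue_depth)
{
	assert(s != NULL);

	if (measured_queue_depth > 0) {
		/* Assumption: the whole period was spent processing I/O. */
		s->io_time_us += s->period_us;
		s->weighted_io_time_us += s->period_us * measured_queue_depth;
	}
}
```

Under this reading, weighted_io_time divided by io_time gives the average queue depth over the busy periods.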
Change-Id: Ie70489ec24dee83f3eeac8f4f813ec7074ff458f
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/419031
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Daisuke Aoyama originally contributed to istgt, the
iSCSI target in FreeBSD. The SPDK iSCSI code is
originally derived from that. Due to copy and paste,
some incorrect copyright attributions have been added
to other files that do not derive from istgt, so
this patch removes those.
It is doubtful, at this point, that there is any code
whatsoever that remains from the original istgt, but
we can revisit that at a later time.
Change-Id: I207e1e6b99d271e2f12690be90a96f7d0c113af7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/420679
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
If a module fails during init, github issue #363 reported ASAN
errors due to leaks from not cleaning up modules that may have
completed init before the failure. An earlier partial fix was
added; this is a more correct fix that addresses the specific
scenario reported as well as the simplified one used during
early debug...
Change-Id: Ia43476faf33d8e31b581529de3b6d75caed09096
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/420118
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Also make the iov_len always set to the used length, not
the total length of the buffer.
Change-Id: I7ebb5b63c6ca7570369f4af0131a23c520c1f7b0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/419025
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In the event that one bdev module failed, we'd leak a bunch
of stuff from any that init'd correctly beforehand. Also
added a guard around the calling of module init_done routines
so that it's not done if module init didn't work.
Change-Id: I4e6170e1eee67b131252ed30d0d20124d2c5ff35
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/419446
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This function always returns 0, so change its return type
to void.
Change-Id: I715cf82c74619a2707b8e5a453710a992489f2c8
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/419045
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When spdk_put_io_channel is called, if it's the last reference,
we defer actual destruction of the channel, so that code
in the same context which may be referring to the channel
doesn't crash.
But it is possible that an io_channel for that same io_device
could be requested before the deferred message is processed.
This would result in a second io_channel being created for
that device on the same thread.
To avoid this case, don't immediately remove the channel from
the list when the last reference is put. When the deferred
message is processed, if additional references were allocated
in the meantime, don't destroy the channel.
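The sequence above can be modeled with a small sketch. This is a toy state machine with hypothetical names, not the thread library code: one flag stands in for the channel's presence on the per-thread list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one thread's channel for one io_device. */
struct toy_thread {
	bool has_channel;	/* channel is still on the thread's list */
	int ref;
	int create_count;
	int destroy_count;
};

static void
toy_get_channel(struct toy_thread *t)
{
	if (!t->has_channel) {
		t->has_channel = true;
		t->ref = 0;
		t->create_count++;
	}
	t->ref++;
}

/* Returns true if the caller should send the deferred destroy message. */
static bool
toy_put_channel(struct toy_thread *t)
{
	assert(t->has_channel && t->ref > 0);
	t->ref--;
	/* The channel stays on the list even when ref reaches 0. */
	return t->ref == 0;
}

/* Runs later, when the deferred message is processed. */
static void
toy_deferred_destroy(struct toy_thread *t)
{
	if (t->ref > 0) {
		/* Someone took a new reference in the meantime: keep the channel. */
		return;
	}
	t->has_channel = false;
	t->destroy_count++;
}
```

A get that arrives between the last put and the deferred message reuses the existing channel, so no second channel is created.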
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Change-Id: Idb8d4705fda0eb9c338e4960430e04edbe537e05
Reviewed-on: https://review.gerrithub.io/418878
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Since I/O sizes vary from a few bytes to several megabytes, the
current timeslice may send more bytes than it is allowed.
Compensate for this by allowing correspondingly fewer bytes in
the next timeslice.
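A sketch of one way to implement this compensation, using a signed per-timeslice budget that is allowed to go negative. The names are hypothetical, and the rule that unused budget is not carried forward is an assumption of this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical byte budget for a bandwidth rate limit. */
struct toy_byte_limit {
	int64_t max_per_timeslice;
	int64_t remaining;	/* may go negative after a large I/O */
};

/* Start a new timeslice: an overdraft carries over, unused budget does not. */
static void
toy_limit_start_timeslice(struct toy_byte_limit *l)
{
	if (l->remaining > 0) {
		l->remaining = 0;
	}
	l->remaining += l->max_per_timeslice;
}

/* Returns true if the I/O may be submitted now, false if it must be queued. */
static bool
toy_limit_try_submit(struct toy_byte_limit *l, int64_t io_bytes)
{
	assert(io_bytes >= 0);

	if (l->remaining <= 0) {
		return false;
	}
	l->remaining -= io_bytes;
	return true;
}
```

A single large I/O is still admitted whenever any budget is left, and the timeslices that follow pay back the overdraft before anything else is admitted.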
Change-Id: I885f0bb21001bd90879aa8622e2b34e3bf78cf6e
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/417829
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This change includes a function to enable this feature on a per-bdev
basis. The new information stored in the bdev includes the following:
measured_queue_depth: The aggregate of the outstanding operations from
each channel associated with this bdev.
period: The period at which this bdev's measured_queue_depth is being
updated.
With this information, one could calculate the average queue depth and
the disk utilization of the device.
Change-Id: Ie0623ee4796e33b125504fb0965d5ef348cbff7d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/418102
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Rather than adding the full alignment size (512), add one less than the
alignment so that already-aligned buffers don't get rounded up again.
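The difference between the two roundings can be shown with a short sketch. The function names are hypothetical, and the "overshoot" variant is a reconstruction of the old behavior from the description above, not a copy of the removed code.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a power-of-two alignment; aligned inputs are unchanged. */
static uintptr_t
toy_align_up(uintptr_t addr, uintptr_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (addr + (alignment - 1)) & ~(alignment - 1);
}

/* Adds the full alignment first, so an aligned input is bumped a whole step. */
static uintptr_t
toy_align_up_overshoot(uintptr_t addr, uintptr_t alignment)
{
	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	return (addr + alignment) & ~(alignment - 1);
}
```

For an already-aligned address such as 1024 with 512-byte alignment, the first form returns 1024 while the second returns 1536, wasting 512 bytes of the buffer.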
Change-Id: I96323b848bfb90f2aa1774b869e2b8a81d253077
Reported-by: Shuhei Matsumoto
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/418879
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
During spdk_bdev_init, examine_config is called.
This call can claim a bdev synchronously, based on
configuration. On spdk_bdev_start, if no module
claimed the bdev, examine_disk is called and can
perform I/O before claiming the bdev.
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Change-Id: I1448dd368cf3a24a5daccab387d7af7c3d231127
Reviewed-on: https://review.gerrithub.io/413913
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
There's no reason to abort IO that have been queued
due to QoS limits, when QoS is switched from enabled
to disabled. Submit them to the bdev instead.
Fixes issue #357.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If5eafc53418ac686120e1d6a1da884b42cef845e
Reviewed-on: https://review.gerrithub.io/418128
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Also make it correctly account for alignment and automatically
use the internal iov element if necessary.
Change-Id: I0b33ef9444f0693c2d6b0cdaf221c4a5b0ad2cc3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/416870
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This is potentially useful for more types of commands.
Change-Id: Ifbde7ae35294f581b8360891579836fd6f9573a6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/416869
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Within the bdev layer, we want to know specifically what I/O
types the module supports. However, the bdev module may elect
to emulate some commands and report additional support via
the public API.
The bdev layer already emulates WRITE_ZEROES, so correctly
report that fact.
Change-Id: I79bfb1aee1b3e6048f951bb1b2c7d4f7c9ef184d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/416464
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Newer versions of scan-build complain about cb_fn potentially being null
in spdk_bdev_unregister.
Change-Id: Ib5607234557f2104ee30398a620fa595389a33e9
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/417064
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
I missed this one in the initial series.
Change-Id: Id4dc7574a04cd964455852f1a00084b65ab989b3
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/416253
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This should be set from bdev.c and does not need to be accessed further
from bdev modules.
Change-Id: I2174ed2378d986cec291e7f29e64fe13a5f7df6d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/416060
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
These submission related variables are not accessed from any of our
current bdev modules.
Change-Id: I69e21eea736273183dfeb48922890a4dc9a244cc
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/416058
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Move the status field into the internal structure.
Change-Id: Icf96436925dd829ee89d2491ef55e337823be6fb
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/416057
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The members of spdk_bdev_io which are associated with the data buffer
should only be modified by calling functions in bdev.c
Change-Id: Icacb7f7387d626cf6834480b572e2f31b48666e1
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/416054
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function is intended to be used when an spdk_bdev
I/O operation (such as spdk_bdev_write or spdk_bdev_write_blocks)
fails due to spdk_bdev_io buffer exhaustion. The caller
can queue an spdk_bdev_io_wait structure on the calling thread
which will be invoked when an spdk_bdev_io buffer is available.
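The wait mechanism can be modeled with a small sketch. This is a toy model with hypothetical names and a hand-rolled FIFO; it does not use the SPDK API, and the one-waiter-per-freed-buffer policy is an assumption of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_io_wait_cb)(void *arg);

/* Hypothetical wait entry, owned by the caller. */
struct toy_io_wait_entry {
	toy_io_wait_cb cb_fn;
	void *cb_arg;
	struct toy_io_wait_entry *next;
};

/* Hypothetical per-thread view of the buffer pool plus its FIFO of waiters. */
struct toy_io_pool {
	int free_count;
	struct toy_io_wait_entry *head;
	struct toy_io_wait_entry *tail;
};

/* Returns false on exhaustion, the case where a real submit call would fail. */
static bool
toy_get_io(struct toy_io_pool *p)
{
	if (p->free_count == 0) {
		return false;
	}
	p->free_count--;
	return true;
}

static void
toy_queue_io_wait(struct toy_io_pool *p, struct toy_io_wait_entry *e)
{
	assert(e->cb_fn != NULL);
	e->next = NULL;
	if (p->tail != NULL) {
		p->tail->next = e;
	} else {
		p->head = e;
	}
	p->tail = e;
}

/* Return a buffer to the pool and wake the oldest waiter, if any. */
static void
toy_put_io(struct toy_io_pool *p)
{
	p->free_count++;
	if (p->head != NULL) {
		struct toy_io_wait_entry *e = p->head;

		p->head = e->next;
		if (p->head == NULL) {
			p->tail = NULL;
		}
		e->cb_fn(e->cb_arg);	/* the callback would typically retry the I/O */
	}
}

/* Example callback: counts how many times it was woken. */
static void
toy_count_cb(void *arg)
{
	(*(int *)arg)++;
}
```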
While here, turn off error messages in bdev.c related to
spdk_bdev_io pool exhaustion, since we now have an API designed
to gracefully recover from it.
Also modify bdevperf as an example of how to use this new API.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia55f6582dc5a8d6d8bcc74689fd846d742324510
Reviewed-on: https://review.gerrithub.io/415074
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change the checks to asserts instead. No callers ever
checked the return codes, and if they did, there's
nothing they can do to recover from an error status.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I96d6804d61dfbf6030b3cc78ea59981301417421
Reviewed-on: https://review.gerrithub.io/415539
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Normally, the put_io_channel is deferred - but for unit tests
messages are called inline and not as events. This results
in spdk_bdev_mgmt_channel_destroy() complaining that
the shared_resource list isn't empty.
So just remove it from the TAILQ before putting the io_channel -
that's more correct anyways.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifd3418f208c906ac09822cb9202068baf0fe211a
Reviewed-on: https://review.gerrithub.io/415529
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Deferred descriptor removal invocation under bdev_unregister() does
not account for the possibility of bdev_unregister being entered
multiple times for the same bdev (which is possible thanks to multiple
paths to unregistration - consider bdev hotremove callback and
_spdk_bdev_finish_unregister_bdevs_iter iterator - being present).
Therefore, currently nothing prevents _remove_notify for the same bdev
descriptor from being scheduled multiple times.
This commit adds boolean remove_scheduled field to struct spdk_bdev_desc.
The value is set when remove_notify for the descriptor is being
scheduled for the first time, and checked on subsequent attempts.
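The guard can be sketched as below. The descriptor is reduced to the new flag plus a counter added purely for illustration; the names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced descriptor. */
struct toy_bdev_desc {
	bool remove_scheduled;
	int notify_scheduled_count;	/* illustration only */
};

/* May be reached from several unregister paths for the same descriptor. */
static void
toy_schedule_remove_notify(struct toy_bdev_desc *desc)
{
	assert(desc != NULL);

	if (desc->remove_scheduled) {
		/* Already scheduled by an earlier unregister attempt. */
		return;
	}
	desc->remove_scheduled = true;
	desc->notify_scheduled_count++;	/* stands in for sending the message */
}
```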
Change-Id: If2c5a365c05c4123c50edf5a2db164be9dd26f8e
Signed-off-by: Andrey Kuzmin <andrey.v.kuzmin@gmail.com>
Reviewed-on: https://review.gerrithub.io/415319
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This helps prevent starvation in the case where a thread
is started but remains idle, while other threads consume
all of the spdk_bdev_io buffers in the global pool.
This starvation issue is fairly theoretical at this point,
but future patches will be adding the ability for callers
to be notified when an spdk_bdev_io becomes available if
the pool is exhausted. We will add tests to stress
pool exhaustion at which point this patch will become
much more important.
While here, increase the minimum bdev_io_pool_size to
account for the mgmt_ch getting destroyed and then
immediately created again on the master core. In this
case there is a window where both channels exist at
once - the one being destroyed won't free its cached
spdk_bdev_ios until the deferred spdk_put_io_channel
event executes.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3a2fc80bc2bfd78b098bcbfce456d7a433cd64e9
Reviewed-on: https://review.gerrithub.io/415039
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
We should not allow write_zeroes commands if the descriptor
was not opened for writing.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I56bc729d12089df90637b202dfe25f5505e9b712
Reviewed-on: https://review.gerrithub.io/414896
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
spdk_bdev_io_valid_blocks() already takes care of this.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia7e5ede8d65d0b336ea676d5e25d07a3389573a5
Reviewed-on: https://review.gerrithub.io/414895
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Currently this just includes the overall size of the
spdk_bdev_io pool and the size of the per-thread
spdk_bdev_io caches.
Later patches will allow configuring these via
INI config file and JSON-RPC.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7e235ee6d2d7123d8460eeacde999c7b51017c43
Reviewed-on: https://review.gerrithub.io/414710
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will return the number of currently allocated threads.
Modify the bdev_io caching code to use this new API since these
caches are really per-thread, not per-core. SPDK does not support
dynamic threading yet, but once it does, we will want callers to be
using functions from the thread API - not counting the number of cores
allocated to the application.
spdk_env_get_core_count may still be useful as a helper function, so
it is still kept and not deprecated. For example, app.c uses it to
print the number of cores allocated to the application. bdevperf should
eventually be modified to use spdk_thread_get_count, but holding off on
that for now until spdk_event_allocate() uses threads instead of a reactor
lcore to specify where the event should be executed.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5a30e3e825e6821da87d3927a2443768dfd740f4
Reviewed-on: https://review.gerrithub.io/414709
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Madhu Pai <mpai@netapp.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds the core control policy for the
bandwidth rate limit (max bytes per second). It renames
the existing functions to use common names and specially
handles the case when IOPS and bandwidth rate limiting
are both enabled.
Change-Id: I9f4565958d472559ef6d8bea52b1fe2a5f3c8969
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/413821
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
To support bandwidth rate limiting in addition to IOPS rate
limiting, this patch adds support for parsing the limit from the
configuration file. The format is as follows, here setting a limit
of 10 (10M) on the Malloc0 bdev. The default unit is MB.
Limit_BWPS Malloc0 10
Change-Id: I62d70391ccad7804e6673ec56a3ed1cb0a4fbbd4
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/413652
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I6babd4cf990bf19b510db88bdfb0ca81e29d9252
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/414700
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Madhu Pai <mpai@netapp.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This begins the process of hiding data members in
spdk_bdev_io that don't need to be accessed from
within bdev modules.
One strategy would be to implement accessors for
every data member in the structure. However, that
may have negative performance effects. Instead,
create a new internal structure within the old
structure. This new structure will still be visible
for now, but at least makes clear which members
are accessible and which are not.
This patch shifts one data member to the new structure
as an example.
Change-Id: I68525db995325732fe9f5fc3f45b06920948309b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/412298
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Message processing adds some overhead. Instead of
always sending a message to the QoS thread, check
whether the current thread is the QoS thread. If so,
queue the I/O immediately based on the QoS logic.
Change-Id: I9c1f93aeaf68c9b1a0282c3b690614413949d901
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/413028
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This will become the public interface for implementing
bdev modules. Right now the file exposes too much of
the guts of the bdev layer to modules, so it needs
to be stripped down.
Change-Id: Ie8b8c3271d51fdb8d0c24a80244b3f3e510c8790
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/412297
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will allow multiple independent callers to request I/O statistics
without stepping on each other (previously the stats would be reset on
each request, which only works with a single caller).
This also means that we can now allow requesting stats while the VTune
integration is enabled.
Change-Id: Ia9d4d6fd37fa66b3671cd33b3183c90524f955bd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/412257
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a new function and its RPC caller. It can be used to
get the statistics of all bdevs or of a specified bdev.
With this patch, the open source tool 'sysstat/iostat'
can support SPDK. The 'iostat' tool can call this function to
get the statistics of all SPDK-managed devices via the RPC
interface.
Change-Id: I135a7bbd49d923014bdf93720f78dd5a588d7afa
Signed-off-by: Yanbo Zhou <yanbo.zhou@intel.com>
Reviewed-on: https://review.gerrithub.io/393130
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This makes a number of things much simpler and eliminates
the need for upcoming proposed reset handling.
Change-Id: I23a6badd0873f6dcf38ba1e55bf18d846c2843df
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407357
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
It doesn't represent an io_channel, so let's
rename it to something less confusing.
Change-Id: I730eb0ff0eb7737a59c190812a25b327ccd8fc14
Suggested-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/409998
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The bdev layer has recently been refactored to share
internal module channels for *all* bdevs of the
same bdev module. This makes us return ENOMEM for I/O
of one bdev if another bdev of the same module (nvme, split, etc)
is entirely saturated. This is not the behavior we want,
as these bdevs may not have anything in common.
This partially reverts commit
e433001 (bdev: Put 3 types of channels in a hierarchy)
Change-Id: Ice0570f92fcaffa6301a282c53eeec8215f354fe
Reported-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/409996
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Minimizes a future diff.
Change-Id: Ibc68588f3da2a169863d61a3aa20f384fa33e3dc
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/409747
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Future changes will rely on the ability to atomically
swap this data structure out.
Change-Id: I77b93252df63155692271b5665cc040cb83a33d0
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/409224
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ic980f3f2f04db61e3ff7ce305817eede7a09cb5a
Signed-off-by: John Meneghini <johnm@netapp.com>
Signed-off-by: Ed rodriguez <edwinr@netapp.com>
Reviewed-on: https://review.gerrithub.io/410056
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
These won't cover race conditions across threads, but
at least we have something to test the behavior.
Change-Id: I8e620d2076fe7a3d95df668fda4bee49b6d0afa7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/409343
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
In the case below, where QoS IOPS rate limiting is not enabled
on the bdev, just report an error and return.
set_bdev_qos_limit_iops Malloc0 0
Change-Id: I1514dfd80f417a94d8c5147d7c4e891fc91a29fd
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/409243
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
The related QoS channel resource needs to be freed
after disabling the QoS through the RPC call.
Change-Id: If757b5febdb909367c9b940de6e6b5e9528c0103
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/409125
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I4a86ba226870eef67a30634c9f5ad592eb41cd52
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/408937
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Unlimited I/O per second is equivalent to disabling QoS.
Change-Id: I03a489dd18b5d1a9a42f7853248911e97b7a211e
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/408291
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
For bdev I/O that gets passed to another thread for handling,
such as in the QoS case, make sure the I/O stats are tracked
on the original submitting thread.
Change-Id: Ieaba168dde5eb24314e5cf64bb063faaca2b5be7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/408399
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The QoS parameters are only used from the QoS thread,
so gather them together in a struct pointed to by the
bdev.
Change-Id: I91005f1d00205217a189d1ebacdb55782814f247
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407356
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic20993bdf5ae4e5ed7fbed7d630a0024462f7fc6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/408534
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Recently added a new call to bdev modules to inform them of when
bdev subsystem init is complete. The intention is for modules like
RAID or crypto that need to take some action after the initial bdev
subsystem init such as 'build whatever RAID volumes you can, all
initial examine callbacks are done' or in the case of crypto 'OK,
construct all of your crypto structures for HW and SW PMDs and
associate each to configured vbdevs.'
Now that I've implemented the crypto case I see that the call was
misplaced and will execute following any examine_done call including
subsequent bdev registrations. Those cases, for both RAID and crypto,
need to follow a different path (examine) since all of the activities
associated with the new init_complete() callback are intended just for
the very first initialization of SPDK.
So, long story short, moving this call so that it's only executed once
when subsystem init is done and not on subsequent bdev registrations.
Change-Id: I952b90e33fc67d48d86c0a7de2ad47c74a98e839
Signed-off-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/408209
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
This patch adds a new RPC method to configure QoS on bdev
at runtime.
For example:
set_bdev_qos_limit_iops Malloc0 20000 --> Enable QoS on this
block device with 20000 IOPS rate limiting.
Change-Id: I1ee8b313b769fb5a664820f4ba827e0230be4b5d
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/393255
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Change-Id: I6bf491bd9661d247a70d7ee323c14397cd577636
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407367
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This value was recalculated on each time slice. Instead, just
set it up right away.
Right now QoS is statically configured at creation time for bdevs,
which means we can get away with this. Once dynamic
configuration is added, events will need to be sent to the QoS
thread.
Change-Id: I8bde4170f2d882d2c0f5e57a6aa7713d045beb7d
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407355
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
There might be a couple of mgmt channels still
trying to use the mempool from their destroy
callback (spdk_bdev_mgmt_channel_destroy).
Change-Id: I59247bfa283dc0481923fdd4eaf8e1726e1267ce
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/408192
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
When testing whether QoS is enabled, the code previously
checked mutable values in the bdev itself. Instead, it needs
to check the flag in the channel.
Right now, QoS can only be configured statically when the
bdev is created. This means that no channels will exist
prior to QoS being turned on, which simplifies setting
the per-channel flag (only need to set it when a channel
is created).
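A minimal sketch of the per-channel check, assuming QoS is fixed before any channel exists. The flag name `CH_QOS_ENABLED` and the `toy_*` types are hypothetical and only illustrate the idea; they are not the actual SPDK definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit stored in each channel. */
#define CH_QOS_ENABLED (1u << 1)

struct toy_bdev {
    uint64_t iops_limit;    /* mutable bdev-level setting; 0 means no limit */
};

struct toy_channel {
    uint32_t flags;
};

/* The flag is decided once, when the channel is created. */
static void
toy_channel_create(struct toy_channel *ch, const struct toy_bdev *bdev)
{
    ch->flags = 0;
    if (bdev->iops_limit > 0) {
        ch->flags |= CH_QOS_ENABLED;
    }
}

/* The I/O path tests the channel flag, not the mutable bdev fields. */
static bool
toy_channel_qos_enabled(const struct toy_channel *ch)
{
    return (ch->flags & CH_QOS_ENABLED) != 0;
}
```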
Change-Id: I59e56c64c18c262cc2a7f71a6dde8329edb35db7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407354
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Add new optional bdev module interface function, init_complete, to notify bdev modules
when the bdev subsystem initialization is complete. Useful for virtual bdevs that require
notification that the set of initialization examine() calls is complete.
Change-Id: I0997fb5749d430f2fd3a40172ec8a1d5caa96964
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/407222
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The bdev layer has three types of channels:
* bdev channels - one per bdev per thread
* module channels - one per bdev module per thread
* management channels - one per thread
Arrange this in a hierarchy where bdev channels point to
module channels which point to management channels. Rely
on the channel mechanism to reference count and release
resources at the correct time.
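The hierarchy can be sketched with plain reference counts. This is an illustration under assumed names (`mgmt_channel`, `module_channel`, `bdev_channel` and their get/put helpers are hypothetical); in SPDK the io_channel mechanism does the counting.

```c
#include <stddef.h>

struct mgmt_channel {
    int ref;
};

struct module_channel {
    int ref;
    struct mgmt_channel *mgmt;    /* parent in the hierarchy */
};

struct bdev_channel {
    struct module_channel *module;    /* parent in the hierarchy */
};

/* The first reference on a module channel also pins the management channel. */
static struct module_channel *
module_get(struct module_channel *mod, struct mgmt_channel *mgmt)
{
    if (mod->ref++ == 0) {
        mgmt->ref++;
        mod->mgmt = mgmt;
    }
    return mod;
}

/* Dropping the last reference releases the management channel in turn. */
static void
module_put(struct module_channel *mod)
{
    if (--mod->ref == 0) {
        mod->mgmt->ref--;
        mod->mgmt = NULL;
    }
}

static void
bdev_channel_create(struct bdev_channel *ch, struct module_channel *mod,
                    struct mgmt_channel *mgmt)
{
    ch->module = module_get(mod, mgmt);
}

static void
bdev_channel_destroy(struct bdev_channel *ch)
{
    module_put(ch->module);
    ch->module = NULL;
}
```

Resources are released bottom-up: the management channel is only dropped after the last bdev channel using the module channel has gone away.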
Change-Id: I9341068f95ec2a3897c5a1d897895a4f5cc49299
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/406854
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This is code movement only. This minimizes an upcoming
diff.
Change-Id: If2af99fdd86ca3c8ce672b9d8f5bc44a2f30cc8e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/407218
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Change-Id: Ibc43e3ee65d85a83d78d6e15457ae57992a1188a
Reviewed-on: https://review.gerrithub.io/395059
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This is a one line static function called from one place.
Change-Id: Ie00bbedfabf54d40a2f571bbc8d30f0ac8cbbdd6
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/406853
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Move this definition between the management channel and
the bdev channel struct since that's where it fits in
the logical hierarchy.
This is just code movement. No other changes.
Change-Id: I80b9909b14cad473a7768780ac9ab74a2f1309e5
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/406852
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We'll need the bdev channel inside this function in
an upcoming change.
Change-Id: Iadb07700ba57b23d37b7c373d0e9c5063aaed57b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/406851
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
It was called module_ch and shared_ch throughout the code.
The name module_ch is better, so use that everywhere.
Change-Id: Id030f72c878f44ea4dc8ca14b936719e26e891e3
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/406825
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
By placing the destroy function in the callback of the
io_device_unregister, we ensure that all channels associated with a bdev
will be freed before the bdev is destroyed. This eliminates the
possibility of an io_channel callback referencing a destroyed bdev.
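The ordering guarantee can be sketched with a toy device that defers its unregister callback until the last channel is released. All names here (`toy_io_device`, `toy_device_unregister`, `toy_channel_put`, `toy_bdev`) are hypothetical stand-ins, not the SPDK functions themselves.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*toy_unregister_cb)(void *ctx);

struct toy_io_device {
    int open_channels;
    bool unregistered;
    toy_unregister_cb cb;
    void *cb_ctx;
};

/* Run the callback once the device is unregistered and no channels remain. */
static void
toy_device_maybe_finish(struct toy_io_device *dev)
{
    if (dev->unregistered && dev->open_channels == 0 && dev->cb != NULL) {
        toy_unregister_cb cb = dev->cb;

        dev->cb = NULL;
        cb(dev->cb_ctx);
    }
}

static void
toy_device_unregister(struct toy_io_device *dev, toy_unregister_cb cb, void *ctx)
{
    dev->unregistered = true;
    dev->cb = cb;
    dev->cb_ctx = ctx;
    toy_device_maybe_finish(dev);
}

static void
toy_channel_put(struct toy_io_device *dev)
{
    dev->open_channels--;
    toy_device_maybe_finish(dev);
}

struct toy_bdev {
    struct toy_io_device dev;
    bool destroyed;
};

/* The bdev is torn down only from the unregister callback. */
static void
toy_bdev_destroy_cb(void *ctx)
{
    struct toy_bdev *bdev = ctx;

    bdev->destroyed = true;
}
```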
Change-Id: I0dd6f53dcfa9c9c5a3c6e98a7e2ad8687da17c3f
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/406248
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
For applications such as vhost/iSCSI target/NVMe-oF target/etc.,
a new section like the one below can be added for QoS rate limiting.
[QoS]
Limit_IOPS Malloc0 100000
Limit_IOPS Nvme0n1 500000
Also added a sample change at test/lib/bdev/bdev.conf.in
Change-Id: I7c7d951fbe1352ca2571f135c657bc4fa43b56c7
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/393221
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
As there is no I/O channel associated with the QoS
bdev channel, add the specific functions to handle
the reset operation on the QoS bdev channel. The
reset operation will be conducted on the QoS thread.
Related UT code on QoS with reset operation was also
included.
Change-Id: Ibba68ddb132fa926fec6327829157b43ac806713
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/393181
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Instead of accessing the qos_channel pointer on the bdev,
always have the QoS parameter on the allocated memory of
qos_channel structure.
The bdev->qos_channel is set to NULL in the destroy operation,
and the destroy goes through an async message, so the poller
function that accesses bdev->qos_channel could still run
afterwards (and thus dereference a NULL pointer).
In this case, pass the memory address of the allocated
qos_channel to the function as its parameter to solve
this issue.
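A minimal sketch of the idea, assuming the allocated object stays valid until its poller is unregistered: the poller receives the QoS object as its argument and never reads the pointer stored in the bdev. The `toy_qos` and `toy_bdev` types and the `toy_qos_poll` function are hypothetical.

```c
#include <stddef.h>

struct toy_qos {
    int io_submitted_this_timeslice;
};

struct toy_bdev {
    struct toy_qos *qos;    /* may be cleared to NULL by an async destroy */
};

/*
 * The poller works on the object it was handed at registration time,
 * so a concurrent "bdev->qos = NULL" cannot make it dereference NULL.
 */
static int
toy_qos_poll(void *arg)
{
    struct toy_qos *qos = arg;

    qos->io_submitted_this_timeslice = 0;
    return 0;
}
```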
Change-Id: I2fdb53cb0a6a29fb41ab28362f8e068d21ee5d12
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/405438
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
There is a call to spdk_bdev_close() during spdk_bdev_part_base_free().
This will trigger spdk_bdev_module->module_fini() if this is the
last descriptor to the last bdev. In module_fini(), resources might be
freed. But then, after spdk_bdev_close(), base_free_fn() is called,
which may try to free the same resources again.
To avoid this, call spdk_bdev_module_finish_iter() after
_spdk_bdev_finish_unregister_bdevs_iter() returns.
Change-Id: Ia6eb210504d121772221da18651a90ffc89dbbc4
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/405322
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
An SPDK base bdev might be part of multiple vbdevs. The same is true in
the reverse direction. So consider the following scenario:
   bdev3  bdev4  bdev5
     |      |      |
   +-+--+   +   +--+--+
  /      \  |  /       \
bdev0     bdev1     bdev2
In the current implementation, bdev0/1/2 will appear as base for
bdev3/4/5, which is obviously wrong.
This patch tries to address this issue.
Change-Id: Ic99c13c8656ceb597aba7e41ccb2fa8090b4f13b
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/405104
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Write information needed to recreate each bdev.
Change-Id: I3d2b24fd4aaa8b98ec558d864cc28b55a899452f
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-on: https://review.gerrithub.io/401217
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Patch 583a24a4 [1] moved io_outstanding field to the
bdev_module_channel as it was needed there. This patch
re-adds another io_outstanding field to the bdev_channel.
We *do* want to count outstanding I/O for particular bdevs
- mostly for debug purposes.
[1] 583a24a4 ("bdev: share nomem_io data between bdevs built on the
same device")
Change-Id: I723e72c36447f9f778f88fc1769619676cbbaf71
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/396871
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Simplify the handling of I/O completions that need to be sent as a
message so that they are handled in a single place in
_spdk_bdev_io_complete().
Change-Id: Ic94354ad947d0f1f11241f728aa194a105ff96c4
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/404184
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Flip the status != NOMEM check around so that the logic is easier to
follow.
Change-Id: I607e8b5f0cfaa3598a623d0fc3642b11bd42decd
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/404175
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Move the commonly-accessed fields to the front so they end up in the
same cache line where possible.
Also tweak the types of type, status, error.nvme.sct, error.nvme.sc,
error.scsi.sc, and error.scsi.sk (they can fit in 8 bits), and move the
Write Zeroes splitting variables into u.bdev.
This reduces sizeof(struct spdk_bdev_io) from 272 to 224, in addition to
the better cache line usage.
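The effect of narrowing and regrouping fields can be illustrated with two toy layouts. These structs are hypothetical and much smaller than `struct spdk_bdev_io`; the exact sizes depend on the ABI, so the sketch only claims that the compact layout is smaller.

```c
#include <stdint.h>

/* Wide fields interleaved with pointers: padding tends to follow each int. */
struct io_loose {
    int type;
    void *bdev;
    int status;
    void *ctx;
    int sct;
    int sc;
};

/* Pointers first, then the small fields narrowed to 8 bits and packed together. */
struct io_compact {
    void *bdev;
    void *ctx;
    uint8_t type;
    int8_t status;    /* signed, so negative status codes still fit */
    uint8_t sct;
    uint8_t sc;
};
```

Comparing `sizeof(struct io_loose)` with `sizeof(struct io_compact)` on a given target shows how much padding the reordering removes.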
Change-Id: I4a91fd07f252e7add4a2db179df9c53268672198
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/404053
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We must hold bdev->mutex around all QoS channel manipulations, not just
channel_count; otherwise, there are race conditions.
Change-Id: I6183aef83f4d5789bded426a1832e3faaa688363
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/403367
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The helper functions were only needed when this was sent as a message,
but they are only called directly in the current code, so just move the
contents of the functions inline to their call sites.
The unregister path can also be simplified to pass the poller directly
to spdk_poller_unregister(); it already handles the case of a NULL
poller.
Change-Id: I509cf8922b53e1e616c9e976610b20061bb50066
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/403364
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be used to track time used in pollers - each poller can now
indicate if it found any work to do or not.
For cases where it was obvious and the infrastructure was already in
place, existing pollers have been modified to return 0 or a positive
value to indicate whether work was done. Other pollers have been
modified to return -1 by default, indicating that the poller isn't
indicating anything about whether work was performed. This will allow
us to find un-annotated pollers easily in the future and fix them
incrementally.
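A minimal sketch of the return-value convention described above, with a hypothetical framework loop that classifies each call. `toy_queue`, `queue_poll`, `legacy_poll`, `poll_counters` and `run_poller` are illustrative names, not SPDK APIs.

```c
#include <stddef.h>

typedef int (*poller_fn)(void *ctx);

struct toy_queue {
    int pending;
};

/* Annotated poller: returns the number of items processed (0 means idle). */
static int
queue_poll(void *ctx)
{
    struct toy_queue *q = ctx;
    int done = q->pending;

    q->pending = 0;
    return done;
}

/* Un-annotated poller: returns -1, saying nothing about whether work was done. */
static int
legacy_poll(void *ctx)
{
    (void)ctx;
    return -1;
}

struct poll_counters {
    unsigned busy;
    unsigned idle;
    unsigned unknown;
};

/* Classify one poller invocation by its return value. */
static void
run_poller(poller_fn fn, void *ctx, struct poll_counters *c)
{
    int rc = fn(ctx);

    if (rc > 0) {
        c->busy++;
    } else if (rc == 0) {
        c->idle++;
    } else {
        c->unknown++;
    }
}
```

A nonzero `unknown` count points at pollers that still need to be annotated.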
Change-Id: Ifebfa56604a38434fac5c76ba7263267574ff199
Signed-off-by: Roman Sudarikov <roman.sudarikov@intel.com>
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Reviewed-on: https://review.gerrithub.io/391042
Tested-by: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>