numam-spdk

Author	SHA1	Message	Date
wuzhouhui	3785a4d83b	bdev/qos: fix a heap-use-after-free error When destroying qos, spdk_bdev_qos_destroy() allocates a new qos and swaps it in for the old one. After spdk_bdev_unregister() frees the new qos, the old qos poller may still reference the new qos via bdev->internal.qos. Fix this error by using the old qos in _spdk_bdev_qos_io_submit(). Reported in `72aac51430`.1539054028/ubuntu16.04/build.log Change-Id: Id1bce6c8b1cefae604dd2c69e8f3482ec34b1b54 Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com> Reviewed-on: https://review.gerrithub.io/428444 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-10-09 22:17:03 +00:00
GangCao	868c28cd13	QoS/Bdev: add the RPC support for the bandwidth rate limit This patch added the support of RPC method to enable, adjust, disable the bandwidth rate limit on the bdev. And it can work together with the existing IOPS rate limit. The RPC method has been consolidated to support both IOPS and bandwidth rate limits as below: usage: rpc.py set_bdev_qos_limit [-h] [--rw_ios_per_sec RW_IOS_PER_SEC] [--rw_mbytes_per_sec RW_MBYTES_PER_SEC] name positional arguments: name Blockdev name to set QoS. Example: Malloc0 optional arguments: -h, --help show this help message and exit --rw_ios_per_sec RW_IOS_PER_SEC R/W IOs per second limit (>=10000, example: 20000). 0 means unlimited. --rw_mbytes_per_sec RW_MBYTES_PER_SEC R/W megabytes per second limit (>=10, example: 100). 0 means unlimited. Change-Id: I9c03cd635280add01801a81c6a6c02f0cf85bee1 Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.gerrithub.io/416511 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <optimistyzy@gmail.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-10-09 18:26:44 +00:00
Shuhei Matsumoto	8a295129e1	bdev: Fix scan build error by checking the limit of parent iovec scan-build requests to check the size of parent iovec by using artificially large LBA in unit tests. Fix the error by using not pointer but position and checking if position is less than count of parent iovec. Change-Id: I74c4f6d1b68ecfca93e9247acc5ac6bd5412a960 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427965 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-10-05 21:57:34 +00:00
Pawel Wodkowski	c4fee1e970	mk: don't use '-include spdk/config.h' Each file that needs to check SPDK_CONFIG_* options needs to include spdk/config.h explicitly. Change-Id: If9f2a91ac4c2b1a300dcf88ec3e2a12714ad344a Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-on: https://review.gerrithub.io/427221 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-10-02 23:13:32 +00:00
GangCao	7d030ef7fc	QoS/Bdev: add the QoS related structure and enumeration This patch is to introduce the specific QoS related structure and the enumeration for types of QoS rate limits. Later new types of QoS rate limits can be supported easily. Change-Id: Idb8d2e7627fd145bf2b0ddb296c968b6b068f48c Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.gerrithub.io/424459 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-10-02 22:10:13 +00:00
Shuhei Matsumoto	6fa7e38667	bdev: Avoid assert and factor out queue IO operation in _bdev_write_zero_buffer_next Currently, when a write to the bdev fails in _spdk_bdev_write_zero_buffer_next for a reason other than -ENOMEM, assert is called. RAID bdev, which uses this feature, is generally available now, so it is OK to remove this assert and return an error instead. Additionally, applying the factored function to _bdev_write_zero_buffer_next will improve readability slightly. These two changes are done in this patch. Change-Id: I462630a71e57e2e5146b085b215d62a378ea9402 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427186 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-09-29 03:12:24 +00:00
Shuhei Matsumoto	9d4708f35e	bdev: Factor out queueing IO operation in _spdk_bdev_io_split This patch factors out that operation into a function and adds error handling to that operation to improve readability slightly. Change-Id: Ic24df0c0a9abbebc38d30fc17779dc5a5f6138a6 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427026 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2018-09-29 03:12:24 +00:00
Shuhei Matsumoto	9872b99206	bdev: Avoid assert when read/write to bdev fails in _spdk_bdev_io_split Currently, when a read/write to the bdev fails in _spdk_bdev_io_split_with_payload for a reason other than -ENOMEM, assert is called. RAID bdev, which utilizes the split IO feature, is generally available now, so it is OK to remove this assert and return an error instead. Change-Id: I6ea6fd45b94bff0ea84e498e0c4dfd1dd31e0260 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/427025 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-09-29 03:12:24 +00:00
Shuhei Matsumoto	0df515a842	bdev: Remove limitation of child iov size in bdev_io_split_with_payload() When a bdev IO is split, if the iovec size in a strip is more than 32, the IO will fail. Remove the limitation by splitting the split IO further. Change-Id: I962ad86dfe63ea1fcd86ffa52ead7452fb80e53d Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/425876 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2018-09-29 03:12:24 +00:00
Dariusz Stojaczyk	1e4f9974a7	bdev: do not finish uninitialized modules To achieve its goal, this patch changes the order in which bdev modules are finished. All modules that examine bdevs (e.g. lvol, split, ...) will now be finished last. It should not cause any issues though, since all bdevs are already removed by the time any module finish is called. Fixes #387 Change-Id: Id60c375eb5c3d7306b69cdce86bded77354868d8 Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com> Reviewed-on: https://review.gerrithub.io/421158 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-09-26 21:11:51 +00:00
Shuhei Matsumoto	bc3dfe3043	bdev: Fix the second parameter `success` of spdk_bdev_io_completion_cb The type of the second parameter `success' of spdk_bdev_io_completion_cb is bool. Hence change the code to use bool type success or failure. Change-Id: I9e93f4ccbb085e8e184f209e706915dcd34aa966 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/426648 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-26 09:03:15 +00:00
Jim Harris	01e7c02e15	bdev: call spdk_bdev_io_get_buf before splitting We cannot split an iov if a buffer hasn't been allocated yet. So always call spdk_bdev_io_get_buf on reads before trying to split. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113 Reviewed-on: https://review.gerrithub.io/424879 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-07 14:59:21 +00:00
Jim Harris	de4d961996	bdev: set iovs on correct bdev_io in spdk_bdev_io_put_buf spdk_bdev_io_put_buf() is responsible for reclaiming bdev-allocated buffers from a bdev_io. If there are bdev_ios waiting for one of these buffers, it calls spdk_bdev_io_set_buf() on the next bdev_io in the queue. This will set the iov_base and iov_len on the bdev_io to point to the bdev-allocated buffer. But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf() on the just completed bdev_io, not the next bdev_io in the queue. So fix that. Fixes: `844aedf8` ("bdev: Simplify get/set/put buf functions") Reported-by: Alan Tu Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638 Reviewed-on: https://review.gerrithub.io/424880 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-09-06 23:42:32 +00:00
wuzhouhui	f118de60af	bdev: fix race condition between spdk_bdev_close and _remove_notify When a new bdev is created, struct spdk_bdev_module::examine_disk() may open and close the bdev. On the other hand, if something goes wrong, the creation procedure may unregister the newly created bdev, so a race condition can occur between _remove_notify() and spdk_bdev_close(). Add the new fields "closed" and "remove_notified" to struct spdk_bdev_desc, so _remove_notify() and spdk_bdev_close() know how to deal with this situation. Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com> Reviewed-on: https://review.gerrithub.io/423369 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-09-06 20:59:05 +00:00
Ben Walker	898739fbac	bdev: Enforce that spdk_bdev_close() is called on same thread as open() spdk_bdev_close() must be called on the same thread as spdk_bdev_open(). Further, the remove callback on the descriptor will also be run on the same thread as spdk_bdev_open(). Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/424738 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-09-06 20:59:05 +00:00
Jim Harris	73c8b61cac	bdev: free bdev_io in write_zeroes emulation When emulating write_zeroes commands on devices that don't natively support them, we submit a write with a zeroed buffer. We used to just reuse the original bdev_io, but that was recently changed due to other splitting code added for iovs. But when making those changes, we forgot to free the bdev_io for the write that was sent down to the device. Fixes: `183f37e8` (bdev: do not reuse bdev_io when...) Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b Reviewed-on: https://review.gerrithub.io/424733 Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-06 20:42:47 +00:00
Jim Harris	ba38785b7e	bdev: remove get_bdevs_config RPC This RPC does not work for a lot of bdev types. For example, NVMe namespaces and virtio scsi LUNs are not explicitly constructed by an RPC - they are indirectly constructed by an RPC associated with an NVMe controller or virtio-scsi controller. While here, remove spdk_bdev_config_json. It was only created to facilitate this get_bdevs_config RPC. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I537166d8f91ab458bd2000859d74f7254bfc9c0a Reviewed-on: https://review.gerrithub.io/424584 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-09-05 19:46:17 +00:00
Jim Harris	30dc6a1893	bdev: don't output "name" when write_config_json not specified This isn't valid RPC so it needs to be removed. Bdev modules were working around this issue by defining empty write_config_json methods. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I3c4c20249eadfcfb4103430f5801190b14897249 Reviewed-on: https://review.gerrithub.io/424582 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-09-05 19:46:17 +00:00
Jim Harris	82c3c30f44	trace: remove alias concept This was added a long time back for tracking an rte_mbuf whose buffer was a different rte_mbuf - all related to a userspace TCP stack that is no longer in development. The concept isn't useful now, so remove it to reduce the complexity of the tracing code. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I310e492eba7f55df242bb29d82fb19f6daee1f51 Reviewed-on: https://review.gerrithub.io/424565 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-05 18:03:43 +00:00
Ben Walker	c94020001a	thread: Add a name parameter to spdk_register_io_device This is a string name used for debugging only. Change-Id: I9827f0e6c83be7bc13951c7b5f0951ce6c2a1ece Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/424127 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-05 16:00:54 +00:00
Jim Harris	afaabcce23	bdev: add trace points Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I37e7f2fb19fecfe5933b4815d24240954b74b62b Reviewed-on: https://review.gerrithub.io/424278 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-09-04 17:09:25 +00:00
Jim Harris	86d77e2eb6	bdev: account for missed qos timeslice timeouts There could be cases (especially in virtualized and/or test environments) where we could accumulate significant skew in the timeslice frequency. Rather than depend on the application framework to try to guarantee the rate of timeslice poller callbacks, keep track internally of the last time the poller was invoked. If/when we accumulate and detect skew equivalent to one or more timeslices, increase the allowed IO and bandwidth of the next timeslice to accommodate. Since bdev poller now calls spdk_get_ticks() to do accounting, this patch also fixes up the increment_time() unit test function and the test env layer to properly increment the fake TSC. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Iba301ddc0fb3d02042106a8bf6e4a6a9a84dc263 Reviewed-on: https://review.gerrithub.io/423580 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: GangCao <gang.cao@intel.com>	2018-09-02 23:59:27 +00:00
Jim Harris	4f860d7e40	bdev: only apply split_on_optimal_io_boundary to R/W Splitting a 1TB unmap into individual 64KB unmap commands (for a RAID volume with 64KB strip size) would be awful - the RAID module can be much smarter about this. So back out the changes for splitting I/O without payload. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I24fe6d911f4e3c9db4b2cb5d66c7236a5596e0d9 Reviewed-on: https://review.gerrithub.io/424103 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-08-31 17:53:42 +00:00
Jim Harris	183f37e8ad	bdev: do not reuse bdev_io when splitting write_zeroes Now that we split on I/O boundaries, that code needs to be able to use the bdev_io split* members to track what is left to submit. This means that the write_zeroes code cannot submit the parent bdev_io as the child bdev_io, since the I/O boundary code will overwrite the write_zeroes split accounting. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I9316b59267508f60799766fc4f1ea05a4b3e5d9e Reviewed-on: https://review.gerrithub.io/423404 Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-08-29 16:42:53 +00:00
Jim Harris	5443e0aed3	bdev: count down for qos tracking This will simplify some future patches which will account for missed timeslice timers by allowing additional IO/BW in the following timeslice. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I9dd46a768c98ce267c733a9f9719a2d3d2c3c915 Reviewed-on: https://review.gerrithub.io/423579 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-29 16:42:39 +00:00
Ben Walker	e6bbe23277	util: Move definition of SPDK_SEC_TO_USEC to util.h This was defined in two places, so consolidate the definitions. Change-Id: I0bbb262b97e90d1064bcc50ee201928f6ca9518a Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/423182 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-08-27 18:44:51 +00:00
Chen Wang	6fa48bbf62	lib: fix typos in the lib directory Change-Id: Idcb60b79d2902bb316facc6f60e0a81e5cf847ed Signed-off-by: Chen Wang <chenx.wang@intel.com> Reviewed-on: https://review.gerrithub.io/423372 Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2018-08-24 17:15:12 +00:00
Jim Harris	4bd9762165	bdev: add split_on_optimal_io_boundary A number of modules (RAID, logical volumes) have logical "stripes" that require splitting an I/O into several child I/O. For example, on a RAID-0 with 128KB strip size, an I/O that spans a 128KB boundary will require sending one I/O for the portion that comes before the boundary to one member disk, and another I/O for the portion that comes after the boundary to another member disk. Logical volumes are similar - data is allocated in clusters, so an I/O that spans a cluster boundary may need to be split since the clusters may not be contiguous on disk. Putting the splitting logic in the common bdev layer ensures bdev module authors don't have to always do this themselves. This is especially helpful for cases like splitting an I/O described by many iovs - we can simplify this a lot by handling it in the common bdev layer. Note that currently we will only submit one child I/O at a time. This could be improved later to submit multiple child I/O in parallel, but the complexity in the iov splitting code also increases a lot. Note: Some Intel NVMe SSDs have a similar characteristic. We will not use this bdev stripe feature for NVMe though - we want to primarily use the splitting functionality inside of the NVMe driver itself to ensure it remains fully functional. Many SPDK users use the NVMe driver without the bdev layer. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ife804ecc56f6b2b55345a0d0ae9fda9e68632b3b Reviewed-on: https://review.gerrithub.io/423024 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-08-23 16:08:21 +00:00
Seth Howell	b7d9caf2e6	bdev: increment io_time if queue depth > 0 This value is used to calculate the disk utilization of a given bdev. Change-Id: I4bf101c524b92bdd21573941e17f61db59c5c6b8 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/423017 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-22 20:34:09 +00:00
Jim Harris	9f2dd0c4f8	bdev: save the bdev_desc specified when submitting the I/O This will be needed for using this same descriptor when splitting an I/O. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Idec759df7ab27f8de567d3c8a4214e25dbe173f5 Reviewed-on: https://review.gerrithub.io/423022 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-22 16:29:58 +00:00
wuzhouhui	6deac3e660	bdev/lvol: using spdk_bdev_alias_del_all() to delete all aliases on destroy So we don't need to allocate memory (which may fail) just to free other memory. Change-Id: I2c83f6acc2aa6ed79455bff90f952a2e70b44d59 Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com> Reviewed-on: https://review.gerrithub.io/422203 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2018-08-21 00:53:31 +00:00
Jim Harris	671b77e5cd	bdev: remove vbdevs and base_bdevs arrays The intent of these arrays was to keep track in the bdev layer of all base<->virtual bdev relationships - i.e. which member disk bdevs make up a RAID bdev, which logical volume bdevs are associated with a bdev that contains an lvolstore, etc. Currently none of this is used however. And trying to keep track in the bdev layer instead of asking the bdev modules for the relationships has a number of complications. Early on, we tried to do this with TAILQs - but that doesn't work since this can't be done with a single TAILQ_ENTRY in the bdev structures. So we moved to arrays - that works a bit better, but then the pointer arrays have to be realloc'd which isn't ideal. The biggest problem though with these arrays is that they held bdev pointers - not bdev descriptor pointers. It's not really valid to access bdevs without a descriptor - the descriptors are what make sure active references are accounted for when a bdev is hotplugged. Of course the bdev layer knows when a bdev is getting removed and could go and do the updates to these arrays separately - but that just seems very convoluted. So for now just remove these arrays completely. If there is a future need for the bdev layer to understand relationships between bdevs, we can add module APIs so that the generic layer can ask the modules about the relationships. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I99ef1068240bff1262f64f234260cf2fb44df51d Reviewed-on: https://review.gerrithub.io/420932 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: GangCao <gang.cao@intel.com>	2018-08-17 00:11:03 +00:00
Jim Harris	01035cd49f	bdev: cleanup registered bdevs in reverse order This ensures (for example) that a RAID volume is unregistered before its member disks. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I7a7c16acc351f2d5d4218b64b370e2c77c6e2b5e Reviewed-on: https://review.gerrithub.io/420812 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-08-17 00:11:03 +00:00
Jim Harris	c899854d03	bdev: add new fini_start notification callback for modules When an SPDK application shuts down, the bdev layer will automatically unregister all of the bdevs to ensure they are properly quiesced and cleaned up. Some modules may want to perform different operations when a bdev is destructed during normal runtime vs. shutdown. For example, for lvol, when the last lvol is cleaned up, it should unload the lvolstore, release and close the bdev that contains the lvolstore. You never want to do this during normal runtime though - it is perfectly valid to have an lvolstore that contains no lvols. RAID and future bdev modules such as multipath have similar use cases. So add a new bdev module callback named "fini_start". If a module specifies a function pointer for this callback, the bdev layer will call it before it starts the bdev unregistrations. This enables some future patches to the bdev layer such that it will always unregister block devices that are not claimed (i.e. logical volumes) before block devices that are claimed (i.e. the bdev containing an lvolstore). Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I6e87f5c2b27f16731ea5def858f26e882a29495a Reviewed-on: https://review.gerrithub.io/421175 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-15 23:32:04 +00:00
Changpeng Liu	ff458be850	nvmf: claim each bdev when constructing new Namespace Claim the block device when adding it to a new Namespace, and prevent the block device from being added twice by other modules and Namespaces. Also remove the test that uses the same block device across different Namespaces. Fix issue #371. Change-Id: Ib7ce18e9fde4a15c0f19ce9e28e69145e54570e0 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/420472 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-08-06 21:14:37 +00:00
Jim Harris	27635c770f	bdev: track io_stats from deleted channels Currently, the bdev layer iterates over all of the existing channels of a bdev to collect I/O statistics. But this ignores statistics for channels that are deleted. Fix that by keeping an io_stat structure in the bdev which accumulates statistics for deleted channels. Use the bdev mutex to protect these accumulations. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I3103c0b8b55973c827d977765f47e5b9e7f58e5f Reviewed-on: https://review.gerrithub.io/421029 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-03 19:47:15 +00:00
Jim Harris	d18d51b047	bdev: add _spdk_bdev_io_stat_add helper function This will be used further in an additional patch. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I1126e5adfc24f17e5cdf33b0d3e04c78470771c8 Reviewed-on: https://review.gerrithub.io/421027 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-08-03 19:47:15 +00:00
Seth Howell	63528ebc76	bdev: Add read and write latency ticks to stats These were left out previously. Change-Id: I4e97068d2f13ca1dd1cfae1b25564641cee794ef Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/419601 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2018-07-31 23:26:07 +00:00
Seth Howell	dd9cd4352d	bdev: add tracking for time spent processing I/O. Two extra fields are added to the iostat rpc. 1. io_time. The amount of time since queue depth tracking was enabled that has been spent on I/O processing. 2. weighted_io_time. Incremented each time this bdev's queue depth is polled by the amount of time spent processing I/O since the last polling event times the measured queue depth. Change-Id: Ie70489ec24dee83f3eeac8f4f813ec7074ff458f Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/419031 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-07-31 23:26:07 +00:00
Ben Walker	a1ec258ac7	Clean up some incorrect Copyright statements Daisuke Aoyama originally contributed to istgt, the iSCSI target in FreeBSD. The SPDK iSCSI code is originally derived from that. Due to copy and paste, some incorrect copyright attributions have been added to other files that do not derive from istgt, so this patch removes those. It is doubtful, at this point, that there is any code whatsoever that remains from the original istgt, but we can revisit that at a later time. Change-Id: I207e1e6b99d271e2f12690be90a96f7d0c113af7 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/420679 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-07-30 16:29:38 +00:00
Paul Luse	18f534465c	bdev: defer call of spdk_bdev_init_failed() when a module fails If a module fails during init, github issue #363 reported ASAN errors due to leaks from not cleaning up modules that may have completed init before the failure. An earlier partial fix was added, this is a more correct fix that addresses the specific scenario reported as well as the simplified one used during early debug... Change-Id: Ia43476faf33d8e31b581529de3b6d75caed09096 Signed-off-by: Paul Luse <paul.e.luse@intel.com> Reviewed-on: https://review.gerrithub.io/420118 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2018-07-26 16:58:40 +00:00
Ben Walker	844aedf8ad	bdev: Simplify get/set/put buf functions Also make the iov_len always set to the used length, not the total length of the buffer. Change-Id: I7ebb5b63c6ca7570369f4af0131a23c520c1f7b0 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/419025 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-07-26 05:37:26 +00:00
Paul Luse	4d5bedcebc	bdev: cleanup if bdev init doesn't complete all the way In the event that one bdev module failed, we'd leak a bunch of stuff from any that init'd correctly beforehand. Also added a guard around the calling of module init_done routines so that it's not done if module init didn't work. Change-Id: I4e6170e1eee67b131252ed30d0d20124d2c5ff35 Signed-off-by: Paul Luse <paul.e.luse@intel.com> Reviewed-on: https://review.gerrithub.io/419446 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>	2018-07-19 01:33:02 +00:00
Ziye Yang	1f5c8233da	bdev: change return type of _spdk_bdev_enable_qos to void This function always returns 0, so change its return type to void. Change-Id: I715cf82c74619a2707b8e5a453710a992489f2c8 Signed-off-by: Ziye Yang <optimistyzy@gmail.com> Reviewed-on: https://review.gerrithub.io/419045 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-07-18 03:46:12 +00:00
Jim Harris	9ddf643831	thread: don't immediately remove channel from list when put When spdk_put_io_channel is called, if it's the last reference, we defer actual destruction of the channel, so that code in the same context which may be referring to the channel doesn't crash. But it is possible that an io_channel for that same io_device could be requested before the deferred message is processed. This would result in a second io_channel being created for that device on the same thread. To avoid this case, don't immediately remove the channel from the list when the last reference is put. When the deferred message is processed, if additional references were allocated in the meantime, don't destroy the channel. Signed-off-by: Jim Harris <james.r.harris@intel.com> Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Idb8d4705fda0eb9c338e4960430e04edbe537e05 Reviewed-on: https://review.gerrithub.io/418878 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: GangCao <gang.cao@intel.com> Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-07-17 01:28:16 +00:00
GangCao	aa2789e926	QoS/Bdev: update the Byte rate limit timeslice control Because I/O sizes vary from a few bytes to megabytes, compensate by allowing fewer bytes in the next timeslice if the current timeslice sends more bytes down. Change-Id: I885f0bb21001bd90879aa8622e2b34e3bf78cf6e Signed-off-by: GangCao <gang.cao@intel.com> Reviewed-on: https://review.gerrithub.io/417829 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2018-07-16 06:08:15 +00:00
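The timeslice compensation described in aa2789e926 can be sketched roughly as follows. This is an illustration only, not the SPDK code: the struct and function names are hypothetical, and dropping unused budget at the start of a new timeslice is an assumption the commit message does not state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limiter state; names are illustrative, not SPDK's. */
struct byte_limiter {
	int64_t quota_per_slice; /* bytes granted at the start of each timeslice */
	int64_t remaining;       /* may go negative after a large I/O */
};

/* Admit an I/O while any budget is left; a large I/O may overdraw it. */
static bool
limiter_try_submit(struct byte_limiter *l, int64_t io_bytes)
{
	if (l->remaining <= 0) {
		return false; /* caller would queue the I/O for a later timeslice */
	}
	l->remaining -= io_bytes;
	return true;
}

/*
 * Start a new timeslice. An overdraft from the previous timeslice is carried
 * over, so the new timeslice allows fewer bytes. Unused budget is dropped
 * (an assumption made for this sketch).
 */
static void
limiter_new_timeslice(struct byte_limiter *l)
{
	if (l->remaining > 0) {
		l->remaining = 0;
	}
	l->remaining += l->quota_per_slice;
}
```

With a quota of 1000 bytes per timeslice, a single 1500-byte I/O is admitted and leaves the budget at -500, so the next timeslice starts with only 500 bytes available.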
Seth Howell	760b868aa8	bdev: add ability to track bdev queue depth. This change includes a function to enable this feature on a per-bdev basis. The new information stored in the bdev includes the following: measured_queue_depth: The aggregate of the outstanding operations from each channel associated with this bdev. period: The period at which this bdev's measured_queue_depth is being updated. With this information, one could calculate the average queue depth and the disk utilization of the device. Change-Id: Ie0623ee4796e33b125504fb0965d5ef348cbff7d Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/418102 Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2018-07-12 23:27:27 +00:00
Daniel Verkamp	d53545eff7	bdev: fix alignment math in spdk_bdev_io_set_buf() Rather than adding the full alignment size (512), add one less than the alignment so that already-aligned buffers don't get rounded up again. Change-Id: I96323b848bfb90f2aa1774b869e2b8a81d253077 Reported-by: Shuhei Matsumoto Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-on: https://review.gerrithub.io/418879 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2018-07-10 22:46:28 +00:00
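The alignment fix in d53545eff7 can be illustrated with a minimal sketch. The helper names below are hypothetical and the code is not taken from spdk_bdev_io_set_buf(); it only contrasts adding the full alignment with adding one less than the alignment before masking, assuming a power-of-two alignment.

```c
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	/* Adding align - 1 leaves an already-aligned address unchanged. */
	return (addr + (align - 1)) & ~(align - 1);
}

/* Variant that adds the full alignment, as the commit describes the old math. */
static uintptr_t
align_up_full(uintptr_t addr, uintptr_t align)
{
	/* An already-aligned address is pushed to the next boundary. */
	return (addr + align) & ~(align - 1);
}
```

For a 512-byte alignment, `align_up(1024, 512)` stays at 1024, while `align_up_full(1024, 512)` moves to 1536 and wastes 512 bytes of the buffer.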
Piotr Pelplinski	9d258c75af	bdev: split examine into two parts During spdk_bdev_init, examine_config is called. This call can claim the bdev synchronously, based on configuration. On spdk_bdev_start, if no module has claimed the bdev, examine_disk is called and can perform I/O before claiming the bdev. Signed-off-by: Piotr Pelplinski <piotr.pelplinski@intel.com> Change-Id: I1448dd368cf3a24a5daccab387d7af7c3d231127 Reviewed-on: https://review.gerrithub.io/413913 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-07-06 21:12:53 +00:00
Jim Harris	140eaaa084	bdev: submit queued IO after disabling QoS There's no reason to abort IO that have been queued due to QoS limits, when QoS is switched from enabled to disabled. Submit them to the bdev instead. Fixes issue #357. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: If5eafc53418ac686120e1d6a1da884b42cef845e Reviewed-on: https://review.gerrithub.io/418128 Tested-by: SPDK Automated Test System <sys_sgsw@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Daniel Verkamp <daniel.verkamp@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2018-07-06 18:34:45 +00:00

328 Commits