This patch makes sure we're on the thread that requested creation /
deletion of the device when calling the notification callback.
Change-Id: Ia11a8054692874f6b57d4ebe3e3cb290c58e83b6
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Added ftl_dev_has_nv_cache to check whether the FTL is configured to
use a non-volatile cache. It makes these checks a bit more readable.
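A minimal sketch of what such a helper could look like. The struct
layout and the `_sketch` names are hypothetical stand-ins for
illustration, not the actual FTL definitions; the assumption is that a
missing cache is represented by a NULL descriptor.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real FTL structures. */
struct nv_cache_sketch {
	void *bdev_desc;	/* assumed to be NULL when no cache bdev is configured */
};

struct ftl_dev_sketch {
	struct nv_cache_sketch nv_cache;
};

/* Single place to ask "is the non-volatile cache in use?" */
static inline bool
ftl_dev_has_nv_cache_sketch(const struct ftl_dev_sketch *dev)
{
	return dev->nv_cache.bdev_desc != NULL;
}
```

Call sites can then test `ftl_dev_has_nv_cache_sketch(dev)` instead of
open-coding the pointer comparison.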
Change-Id: I0140df184d89a675e40bd5056718cd64301c553e
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
When creating the nbd, we wait until it's ready, but we didn't do
that when loading the configuration from JSON, which resulted in
sporadic IO failures, as the device hasn't been initialized yet. This
patch adds waitfornbd after each load_config call.
Change-Id: Id350ae7b1afab11f5f3fbd131d938dbd65a8cb15
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459616
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Wait until all user writes are completed before writing the band's
metadata. Otherwise, in case of power loss, user data might not get
written while the metadata does, which would result in data loss.
Change-Id: I419862960c072e38265b91d0d0498ff0c6f9f29e
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459615
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Use the data placed on the non-volatile cache to perform recovery in
case the device wasn't shut down cleanly. The write phase ranges are
read and their data is copied onto the OC device.
The code added in this patch will correctly copy the data from
overlapping ranges, however it won't do anything about these overlapping
areas, so subsequent power loss happening quickly after recovery might
result in data loss.
Change-Id: Ib4c66092cee858496ec66f789fcfb1e7e32f5c20
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458105
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Scan the cache to find ranges of blocks written with the same phase.
This prepares the structures needed to perform data recovery from the
non-volatile cache.
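The scan can be sketched roughly as follows, assuming the per-block
phases have already been read into an array. The `phase_range` struct
and `find_phase_ranges` are hypothetical names used only for
illustration; blocks with phase 0 are treated as never written and
skipped.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a run of blocks sharing one phase. */
struct phase_range {
	size_t start;		/* index of the first block in the range */
	size_t num_blocks;	/* number of consecutive blocks in the range */
	uint8_t phase;		/* phase shared by all blocks in the range */
};

/*
 * Walk the per-block phases and record each run of consecutive blocks
 * with the same non-zero phase. Returns the number of ranges stored in
 * `out` (at most `max_ranges`).
 */
static size_t
find_phase_ranges(const uint8_t *phases, size_t num_blocks,
		  struct phase_range *out, size_t max_ranges)
{
	size_t count = 0, i = 0;

	while (i < num_blocks && count < max_ranges) {
		uint8_t phase = phases[i];
		size_t start = i;

		/* Advance to the end of the current run. */
		while (i < num_blocks && phases[i] == phase) {
			i++;
		}

		if (phase != 0) {
			out[count].start = start;
			out[count].num_blocks = i - start;
			out[count].phase = phase;
			count++;
		}
	}

	return count;
}
```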
Change-Id: I0c901d010d6ca76feabca13116d831c1d9931833
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458103
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
The structures in this module had no comments, so it was a bit hard to
understand what they're used for.
Change-Id: I439c8a792f02b929006c60933e6b272751b1a675
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458102
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Moving data from one band to the other doesn't need to be stored on the
non-volatile cache. Not only does it add unnecessary traffic to the
cache (wearing it out and reducing its throughput), but it requires us
to synchronize it with user writes to the same LBAs.
To avoid all that, this patch adds the FTL_IO_BYPASS_CACHE flag to all
writes coming from the reloc module. However, to be sure that the moved
data is stored on disk and can be restored in case of power loss, we
need to make sure that each free band has all of its data moved to a
closed band before it can be erased. It's done by keeping track of the
number of outstanding IOs moving data from a particular band
(num_reloc_blocks), as well as the number of open bands that contain
data from this band (num_reloc_bands). Only when both of these are at
zero and the band has zero valid blocks can it be erased.
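The erase condition described above can be sketched as a single
predicate. The counter names num_reloc_blocks and num_reloc_bands come
from this message; the struct itself and the `num_valid_blocks` field
are hypothetical simplifications.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced view of a band with only the relevant counters. */
struct band_sketch {
	size_t num_valid_blocks;	/* valid blocks still stored on this band */
	size_t num_reloc_blocks;	/* outstanding IOs moving data off this band */
	size_t num_reloc_bands;		/* open bands holding data moved from this band */
};

/* A band may be erased only once nothing depends on its contents. */
static bool
band_can_be_erased(const struct band_sketch *band)
{
	return band->num_valid_blocks == 0 &&
	       band->num_reloc_blocks == 0 &&
	       band->num_reloc_bands == 0;
}
```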
Change-Id: I7c106011ffc9685eb8e5ff497919237a305e4478
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458101
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Some of the writes don't need to go through the non-volatile cache
(e.g. relocations, data recovery from the cache). This patch adds an IO
flag to indicate that the write shouldn't be stored on the non-volatile
cache.
Change-Id: I3d485fe14cf25b3074832f26491ba0cb12ff0e58
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458100
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Initialize child IOs with the appropriate LBA of their parent when
allocating internal IOs.
Change-Id: I191ad741b9d88d7f18cae05982e0a06a8f371f78
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458099
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
This patch adds tracking of the phase of the writes to the non-volatile
cache. The phase is changed each time the whole buffer is filled. Along
with every block's LBA, the current phase is stored in its metadata.
This allows for replaying the sequence of writes when recovering the
data from the cache after an (unclean) shutdown.
Since there are only three possible phases to be stored on the device at
a time, phase is defined as a 2-bit counter cycling through 1 -> 2 -> 3
-> 1, with 0 marking blocks that were never written.
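The phase cycle can be sketched as a one-line helper.
`nv_cache_next_phase_sketch` is a hypothetical name for illustration,
and the caller is assumed to pass a valid phase in the range 1..3.

```c
/*
 * Advance the phase through 1 -> 2 -> 3 -> 1. The value 0 is never
 * produced, so it stays reserved for blocks that were never written.
 */
static inline unsigned int
nv_cache_next_phase_sketch(unsigned int phase)
{
	return phase % 3 + 1;
}
```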
Change-Id: Id47880367934027fd102c32f183110acc9d4c62a
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458098
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
After filling the whole non-volatile cache, block all further writes until
the header with metadata is written. This means that metadata stored on
the device will always be up-to-date with the most recent write
sequence.
Change-Id: I15b724b52814289622374ce77e5c3b23173a75c6
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458097
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Check the type of DIF used by the bdev specified as the non-volatile
write cache. If it's anything other than SPDK_DIF_DISABLE, fail the
initialization, as we don't support any other type yet.
Change-Id: Ie8bc1729558e055989d7925bc55f6307ee738f0e
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458096
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
When restoring the device, read the first block of the non-volatile
cache containing its metadata header and verify that it's indeed a
device that was used as write cache.
Change-Id: Idf113a9e8eb73160a2d9e6e882c9e026d3fafb3e
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458095
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When creating an FTL device using a non-volatile cache, zero out the
non-volatile cache and store metadata (device's UUID, size of the cache)
in the first block.
Change-Id: Id8f212aef756e86e8a215582ab7c32a635e18938
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458094
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Mateusz Kozlowski <mateusz.kozlowski@intel.com>
For https://github.com/spdk/spdk/issues/849
The iscsiadm installed in Fedora 29, version 6.2.0.876-1, doesn't
work well with the DataDigest parameter, even though the DataDigest
parameter is listed in the printed records.
Change-Id: I9c45ced7c13827e13a9273a4b5a4768ff3665c42
Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461191
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This patch excludes *.patch files from some checks in check_format.sh:
- trailing spaces,
- POSIX includes
Change-Id: Ic55ce7f4128ddc946d235b4ca487061075edc03b
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460939
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
The test.py script relies upon importing SPDK Python RPC libs.
This requires the user to add ./spdk/scripts/ to PYTHONPATH.
Unfortunately, --help could not be reached when the import failed,
so a user executing the script directly wouldn't know that.
This patch adds instructions for the user when importing
the RPC lib fails.
Change-Id: Icb87fbc5ae9d1c5b71699827d6ea0cd922d38627
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460908
Reviewed-by: Karol Latecki <karol.latecki@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add another NULL bdev for larger IO and higher queue depth
testing.
Change-Id: Iaa4e79006e732b579e6b54ebdb5d9abf5f3472ac
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453782
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
If some module has `async_init = true` and another module that
comes after it fails to initialize, the callback from the
asynchronously initialized module may call
`spdk_bdev_init_complete()` first, and the failed module
will call `spdk_bdev_init_complete()` again later.
This currently results in a NULL dereference because the
first call to `spdk_bdev_init_complete()` sets `g_init_cb_fn = NULL`.
This change prevents the first call to `spdk_bdev_init_complete()`
by treating the failed module as not finished with initialization.
This patch fixes #847
Change-Id: Ib6b231d5ea27896ad88d7f11b8732921077b3d4d
Signed-off-by: Vitaliy Mysak <vitaliy.mysak@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461230
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
In the memcpy elimination patches, the same bug exists in 3
places. When building req->decomp_iov using the host buffers,
req->decomp_iovcnt was being incremented in the loop while also
being used as part of the index, which broke the indexing.
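A minimal sketch of the corrected pattern, assuming host iovecs are
appended to the end of req->decomp_iov. The `req_sketch` struct, the
array capacity, and `append_host_iovs` are hypothetical and are not the
actual compress vbdev code. The point is that the base index is
captured once before the loop and the count is updated once after it.

```c
#include <stddef.h>
#include <sys/uio.h>

#define SKETCH_MAX_IOVS 8	/* hypothetical capacity */

/* Hypothetical, reduced request holding only the iovec array. */
struct req_sketch {
	struct iovec decomp_iov[SKETCH_MAX_IOVS];
	int decomp_iovcnt;
};

/*
 * Append host iovecs to req->decomp_iov. Indexing with
 * decomp_iov[req->decomp_iovcnt + i] while also incrementing
 * decomp_iovcnt inside the loop would skip slots, so the base is
 * snapshotted first. Returns 0 on success, -1 if the array would
 * overflow.
 */
static int
append_host_iovs(struct req_sketch *req, const struct iovec *host_iov,
		 int host_iovcnt)
{
	int base = req->decomp_iovcnt;
	int i;

	if (host_iovcnt < 0 || base + host_iovcnt > SKETCH_MAX_IOVS) {
		return -1;
	}

	for (i = 0; i < host_iovcnt; i++) {
		req->decomp_iov[base + i] = host_iov[i];
	}

	req->decomp_iovcnt = base + host_iovcnt;
	return 0;
}
```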
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I485ac32502801c1e11b8392b2df7eba06b4f5a9b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461053
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The first optimization to eliminate memcpy was too aggressive and
did so for the read-modify-write operation as well. This didn't
affect the fio tests used at the time, but bdevio catches it
right away. When overwriting a chunk with data, we first need
to read the old data before applying the new. This patch uses
the scratch buffer for old data, as sending it to the user buffer
results in it not being written at the end of the read-modify-write.
There is at least one more bug fix coming after this, also found
with bdevio but not caught by fio.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: I8fe074056434bb4757c68077e2df446861edfd94
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The iSCSI target does not allocate a data buffer on read and
delegates allocation to the bdev.
When the bdev is a GPT vbdev, the GPT vbdev does not allocate a
data buffer and delegates allocation to the backend bdev.
In this case, the iSCSI target expects the buffer to stay allocated
until it notifies completion to the GPT vbdev. However, the GPT vbdev
notifies completion to the backend bdev when calling the callback
of the iSCSI target. The backend bdev frees the buffer immediately,
but the iSCSI target still uses the buffer. If the buffer is reused
by another I/O, data corruption will occur.
To fix this issue, vbdev_gpt_submit_request() calls
spdk_bdev_io_get_buf() when the I/O is a read, and its callback
vbdev_gpt_get_buf_cb then calls _vbdev_gpt_submit_request().
This ensures the buffer is allocated before forwarding the I/O
to the backend bdev.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ifb2eac500276ab5012123b7d6f7eb033d87ad17c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461350
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
The iSCSI target does not allocate a data buffer on read and
delegates allocation to the bdev.
When the bdev is a split vbdev, the split vbdev does not allocate a
data buffer and delegates allocation to the backend bdev.
In this case, the iSCSI target expects the buffer to stay allocated
until it notifies completion to the split vbdev. However, the split
vbdev notifies completion to the backend bdev when calling the callback
of the iSCSI target. The backend bdev frees the buffer immediately,
but the iSCSI target still uses the buffer. If the buffer is reused
by another I/O, data corruption will occur.
To fix this issue, vbdev_split_submit_request() calls
spdk_bdev_io_get_buf() when the I/O is a read, and its callback
vbdev_split_get_buf_cb then calls _vbdev_split_submit_request().
This ensures the buffer is allocated before forwarding the I/O
to the backend bdev.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icfd0663b548479ac0bf6b5b49420f144142e3300
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461348
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Don't use a fork of QEMU that's in a specific directory.
On the test systems, we can install the forked QEMU as
the default QEMU and put the binaries in PATH so that
it works.
Change-Id: I637d70452901c85606eb5eeb2bf6c67ae98cd92f
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458597
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We only need the fork for OCSSD tests. And eventually we won't
even need that.
Change-Id: I0f52c44f504435a3bab2f478664a88ba6acfe464
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459960
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Now that the resume path can correctly handle the case where a namespace
was removed and a new one added with the same nsid, this no longer needs
to be asynchronous.
Change-Id: I693045e66a7d4e75255b526d8f5ca5ef8695533e
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459606
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When using a stacked virtual bdev (e.g. split virtual bdev), the block
address space is remapped during I/O processing, so reference tags
have to be remapped accordingly.
This patch adds a new helper function, spdk_bdev_part_remap_dif,
and calls it before submitting write I/O or after completing read
I/O.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Idfc6081893861d412c19a9edfb348a7faa7e8c5b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461106
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
All IO types but reset use the remapped offset to submit I/O
to the base bdev. Previously each IO type computed the remapped
offset by itself. Consolidating this into one place improves
readability and will be helpful for the next patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I29465e92d8fb62e45cfc97c52fedaa661b2f0602
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461105
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When using a stacked virtual bdev (e.g. split virtual bdev), the block
address space is remapped during I/O processing, so reference tags
have to be remapped accordingly.
This patch adds an API, spdk_dif_remap_ref_tag, to cover this case.
UT code is added in this patch as well.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I55cc45c475d4e86e736f5712baf02fcabfde3c82
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461104
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When using a stacked virtual bdev (e.g. split virtual bdev), the block
address space is remapped during I/O processing, so reference tags
have to be remapped accordingly.
The use case is explained in detail as follows:
- Format a single NVMe SSD with DIF enabled.
- Create an NVMe bdev on the NVMe SSD with DIF enabled.
- Create four split vbdevs on the NVMe bdev.
- Add the split vbdevs to an NVMe-oF target.
- Application is aware of block address space of the split vbdevs.
- Application submits read/write I/O to the NVMe-oF target.
Case 1:
- Configure NVMe-oF target to DIF pass-through.
Case 2:
- Configure NVMe-oF target to DIF insert/strip.
For case 1,
- Application inserts DIF for write I/O and verifies DIF for read I/O.
- The split vbdevs remap reference tags of DIF for both read and write
I/O because the application expects reference tags to be based on the
block address space of the split vbdevs.
- The NVMe bdev processes read/write I/Os without remapping reference tags
because reference tags are already based on the block address space
of the NVMe bdev.
For case 2,
- NVMe-oF target inserts DIF for write I/O, and verifies and strips
DIF for read I/O.
- The split vbdevs remap reference tags of DIF for both read and write
I/O because the NVMe-oF target expects reference tags to be based on the
block address space of the split vbdevs.
- The NVMe bdev processes read/write I/Os without remapping reference tags
because reference tags are already based on the block address space
of the NVMe bdev.
This patch adds two APIs, spdk_dif_ctx_set_remapped_init_ref_tag
and spdk_dif_remap_ref_tag to satisfy the use case.
UT code is added in this patch as well.
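The remapping idea can be sketched on a plain array of per-block
reference tags. This is a deliberate simplification: the real API works
on data buffers that contain DIF fields, and `remap_ref_tags_sketch` and
its parameters are hypothetical names. The assumption is that block i
carries the reference tag init_ref_tag + i before remapping and
remapped_init_ref_tag + i afterwards.

```c
#include <stdint.h>

/*
 * Rewrite each block's reference tag from the old address space to the
 * new one. A tag that does not match the expected old value is reported
 * as an error and left untouched. Returns 0 on success, -1 on mismatch.
 */
static int
remap_ref_tags_sketch(uint32_t *ref_tags, uint32_t num_blocks,
		      uint32_t init_ref_tag, uint32_t remapped_init_ref_tag)
{
	uint32_t i;

	for (i = 0; i < num_blocks; i++) {
		if (ref_tags[i] != init_ref_tag + i) {
			return -1;
		}
		ref_tags[i] = remapped_init_ref_tag + i;
	}

	return 0;
}
```

For a split vbdev that starts at block 1000 of the base bdev, a write
to vbdev blocks 0..3 would be remapped to tags 1000..1003 before being
submitted, and a read would be remapped in the opposite direction after
completion.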
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ib3101129225b334d2f578eab75197790b1818770
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461103
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
A future patch will add a new spdk_bdev_open_ext function that calls
spdk_bdev_get_by_name and then the old spdk_bdev_open routine. The
bdev can be removed between these two calls.
We need to add a mutex to prevent that. Any future code
should hold this mutex when accessing the bdevs list to get
a bdev and perform some operation on it.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I785a1791346aebdd394fc51ad0e7fbfbabf317c9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458457
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The length of an xattr descriptor is equal to the length of the xattr
struct, the xattr name and the stored value.
There is no limit to how much can be stored in memory for an xattr.
On disk, xattr size is limited to a single page and, within that, to
the max descriptors that can fit in it.
This size is known at compile time.
Before this patch it was possible to add an xattr exceeding
what could be written to disk. This caused issues
when serializing the metadata during spdk_blob_sync_md()
or spdk_blob_close(), making those fail without specific info
for the user and without actually writing such a descriptor.
Since the maximum length of an xattr descriptor is known at compile
time, this patch compares against this value when setting the xattr.
It immediately reports an error back to the user and does
not store the xattr in memory (so it is never serialized).
This patch should not affect backward compatibility for blobs.
Too large xattrs weren't written to disk before, and the
API for blobstore stays the same - only reporting ENOMEM
when it should.
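The check can be sketched as below. The descriptor layout, the limit
value, and the function name are hypothetical placeholders; only the
idea (descriptor length = struct + name + value, compared against a
compile-time maximum, ENOMEM on failure) comes from this message.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk descriptor header; not the real blobstore layout. */
struct xattr_desc_sketch {
	uint8_t type;
	uint32_t length;
	uint16_t name_length;
	uint16_t value_length;
};

/* Hypothetical compile-time limit on one descriptor within a page. */
#define SKETCH_MAX_XATTR_DESC_LEN ((size_t)4000)

/*
 * Reject an xattr whose descriptor could never be written to disk.
 * The comparisons are ordered so the additions cannot overflow.
 */
static int
xattr_check_length_sketch(size_t name_len, size_t value_len)
{
	const size_t hdr_len = sizeof(struct xattr_desc_sketch);

	if (name_len > SKETCH_MAX_XATTR_DESC_LEN - hdr_len ||
	    value_len > SKETCH_MAX_XATTR_DESC_LEN - hdr_len - name_len) {
		return -ENOMEM;
	}

	return 0;
}
```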
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Change-Id: I6f4af4d079e47f084e20d7a4969d9a78ec1f8610
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460450
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Set DIF context of the corresponding request to PDU when
- processing in-capsule data of the command,
- processing data of C2H PDU, or
- processing data of H2C PDU.
Change-Id: I3a668a55be21dbe2ee6ecf26476290670bd7b4a8
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458929
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ziye Yang <ziye.yang@intel.com>
When an NVMe/TCP initiator transfers in-capsule data, the NVMe/TCP
target has to process it as in-capsule data. If DIF insert/strip is
enabled, the in-capsule data size is increased by the NVMe/TCP target
to insert metadata. However, the size of the in-capsule data buffer
had not been increased, so a buffer overflow occurred when an NVMe/TCP
initiator transferred in-capsule data to an NVMe/TCP target with DIF
insert/strip enabled.
This patch increases the size of the in-capsule data buffer to store
metadata. 16 byte metadata per 512 byte data block is the current
maximum ratio of metadata per block.
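As a worked example of the sizing, the helper below adds 16 bytes of
metadata for every 512 byte block. The names are hypothetical, and
rounding a partial block up to a full one is an assumption of this
sketch rather than something stated above.

```c
#include <stddef.h>

#define SKETCH_DATA_BLOCK_SIZE 512	/* smallest data block size considered */
#define SKETCH_MD_SIZE 16		/* largest metadata size per block */

/* Worst-case buffer size needed to hold the data plus inserted metadata. */
static inline size_t
in_capsule_buf_size_with_md(size_t in_capsule_data_size)
{
	size_t num_blocks = (in_capsule_data_size + SKETCH_DATA_BLOCK_SIZE - 1) /
			    SKETCH_DATA_BLOCK_SIZE;

	return in_capsule_data_size + num_blocks * SKETCH_MD_SIZE;
}
```

With a 4096 byte in-capsule data size this gives 8 blocks and
4096 + 8 * 16 = 4224 bytes.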
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I88b127efd7a945bde167a95df19a0b9175cb8cd0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461333
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
We updated readv_offset before generating DIF to avoid adding
the temporary variable _rc in the previous patch, but that caused
a write error when inserting DIF.
Fix the bug in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Id0788280a83cbea2554c851db77751432fc00cba
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461116
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When handling the capsule command header, call spdk_nvmf_request_get_dif_ctx
by passing the NVMf request and the reference to the DIF context, and set
the flag dif_insert_or_strip of the NVMf/TCP request to true.
spdk_nvmf_request_get_dif_ctx returns false immediately when the
corresponding NVMf controller disables DIF insert/strip.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I16f6b322f2692d5f9653d011a490e7929ec37365
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458928
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
When the NVMf controller's flag dif_insert_or_strip is enabled, DIF is
inserted for write I/O and stripped for read I/O, and the corresponding
NVMe-oF initiator should not be aware of the DIF setting of the
backend bdev.
Hence this patch hides the DIF setting of the backend bdev
when the flag dif_insert_or_strip is enabled.
Change-Id: I3c14880c2e94cba7f76b1bca78afb36bfe884e26
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456731
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
The first idea was that the caller of spdk_nvmf_request_get_dif_ctx()
should check if the current transport enables DIF insert/strip before
calling spdk_nvmf_request_get_dif_ctx().
But the NVMf controller now knows whether DIF insert/strip is enabled,
thanks to the previous patch. Hence spdk_nvmf_request_get_dif_ctx()
checks at its beginning whether the NVMf controller enables DIF
insert/strip.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I78253d356b694800c3a9a9608514df58e0c631a6
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461314
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Add a flag dif_insert_or_strip to struct spdk_nvmf_ctrlr that indicates
whether DIF insert/strip is done.
Copy the DIF insert/strip setting of the corresponding transport options
to the flag at NVMf controller creation.
The purpose of this patch is to make DIF insert/strip a per-controller
option rather than a per-transport option, because we may want to be
able to control DIF insert/strip per controller at some point. This
patch also cleans up the implementation.
In addition, align the indentation around the change.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I57f65960b430e55f4021ed514aacd85581ff9993
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>