When we unlock a range, we remove the range from the
locked bdev list before doing the for_each_channel
iteration to remove the range from each channel.
But at the same time, right after removing from the
locked list, a new lock on that range could start.
In that case, we also do a for_each_channel to add
the range to each channel, and that will race with
the for_each_channel remove. When the lock start
wins, it finds the range already in the channel,
but doesn't set the owner_range which results in
a seg fault when the for_each_channel completes.
The fix is actually rather simple. We just add the
locked_ctx to the comparison when checking if the
range is already in the channel. If the locked_ctx
matches, then we know it was added as part of
initializing a new channel. If it doesn't, then
we create a new range object pointing to the new
locked_ctx. The first one will get removed when
the remove for_each_channel catches up.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I94f8b20376dd437f404add35744d42fc148303ff
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482620
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
If a locking operation has to wait because of an
existing lock, we queue the lock context. When the
existing lock finishes unlocking, we restart the
queued lock context. But we have to make sure
we restart the lock context on the same thread it
was originally submitted, since it has a channel
associated with it.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I555515f3adfc3c13a86584c601ed541d605980b7
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482463
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This function is required for NVMf implementation
for compare and write fused command.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If41611f5c0b8e4ed8eec66f09858c724f1800d59
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477914
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added new function for getting NVMe specific return code
for fused commands. Also changed one of the return codes
in fused commands so that we could distinguish error
cases.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I86417ea4f5b8f3e6496162be3d6c6128076e35d4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481666
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
A new API was added `spdk_bdev_io_get_aux_buf` allowing the caller to request
an auxiliary buffer for its own private use. The API is used in the same manner that
`spdk_bdev_io_get_buf` is used and the length of the buffer is always the same as the
bdev_io primary buffer. 'spdk_bdev_io_put_aux_buf' is called to free the
auxiliary buffer.
The initial use case is crypto, in the next patch in series it is used. No UT were
added as the logic isn't that complicated and it is fully tested with each run
of crypto.
Fixed a comment typo also (not mine for once).
Signed-off-by: paul luse <paul.e.luse@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1939fcbc8e5db36fd909ef26771a725a551e8e6
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478383
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will allow for some reuse in future patches adding
support to get auxiliary buffers from the bdev memory
pools.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icd4e71d0e918cfe940c749e40cd1cc6c560e25a9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482022
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This will be helpful in future patches where auxiliary
buffer support is added, and we have a different
type of callback to invoke.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81d2cf3057e706a0e68de87a20b24c194205bbd5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482021
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
To clarify what seems like it might be unintentional.
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id14b49e8b7bc2372ecb8dd7579e6797a395c2cf2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477363
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Compare and write fused operation num_blocks should
not exceed value of 'atomic compare and write unit'.
In case of NVMe native support we should read this
value from 'namespace atomic compare and write unit'
if set in namespace identify data, otherwise from
'atomic and write unit' field in controller identify
data. If bdev does not support this natively we should
set this value to 1.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ib1ea02dbf9d1eed476d9dd0114ea96b1376e0c45
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Added new field in bdev_io structure for tracking
number of IO retries. It will be used in future patches.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I8e002e93f54c9ce39c7af0dd3a1960e6aea93580
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479828
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We will only support a vectored variant of
compare-and-write for now.
This does no locking for now. Ii will be added
in a separate patch.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5bd075c912de60090e19cf8fced19c4879fcc900
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475941
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can't allow overlapped locked ranges - otherwise
two different channels could be deadlocked.
So add a pending_locked_ranges to the bdev. When we
start a lock operation, check if the new range overlaps
one that's already locked. If so, put it on the pending
list. When an unlock operation completes, we will
check if any pending ranges can now be locked.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2e3113216a195887b954533495ff200df14fadc1
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Keep a mutex protected list of the active locked ranges
in the bdev itself. This is only accessed when a new
channel is created, so that it can be populated with
the currently locked ranges.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id68311b46ad4983b6bc9b0e1a8664d121a7e9f8e
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477871
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Add an io_locked TAILQ to each channel, which hold
IO that will write to a currently locked region.
Also add a new step to the locking process per channel.
Each channel needs to wait until all existing outstanding
writes to the newly locked range have been completed.
Only the channel that locked an LBA range may submit
write I/O to it. It must use the same cb_arg for the
write I/O as the cb_arg used when locking the LBA range.
This ensures that only the specific I/O operations needing
the lock will bypass the lock.
When a range is unlocked, we will just blindly try to
resubmit all IO in the io_locked tailq. This could be
made more efficient in the future, but we don't expect
this path to occur very often, so going for simplicity
in the first pass.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibdc992144dfaffe7c05471a5b3c020cedd8cdfc3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478226
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This adds new internal APIs bdev_lock_lba_range and
bdev_unlock_lba_range. To start, these APIs will
manage dissemination of lock/unlock requests to all
existing channels for a given bdev. This does not
yet interact at all with any I/O sent to the channel.
Future patches will check new I/O to see if they
are trying to write to a range that is locked. Future
patches will also ensure we do not have overlapping
ranges active at once.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6d5b1cc84b41a7adc2a3c5791c766bb77376581f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478225
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This will be used by upcoming patches for implementing
LBA range locks.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa4ad8dcc0d09ccf20d35f010fcae19dcc17abc9
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478224
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
In the case of bdev initialization, the memory could
not be allocated. When cleaning up, it needs to check
whether the memory pointer is valid or not before
touching the address of the pointer.
This is to fix below issue:
https://github.com/spdk/spdk/issues/1126
Change-Id: Iec0e2c5a40ad153a96fb939414a15ac3b9112a01
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478849
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <box.b.chen@intel.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
These bdevs claim native support for zcopy operations. They should
actually support them when running on top of a bdev that claims zcopy
support.
For example, if you were to build a part or passthru bdev on top of a
malloc_bdev (which is just for testing primarily) and then try to run
bdevperf on top of that, you will currently get errors. This patch fixes
that problem.
Change-Id: I023557a3a5b1baf177cc29f1cbc1cd391cc67a8a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478243
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Media management event was introduced. It's sent out to notify that
some portion of the data needs to be rewritten (e.g. due to data
refresh, wear leveling, high error rate, etc.). This type of
notification is only utilized by devices exposing raw access to the
physical medium (e.g. Open Channel SSDs).
Change-Id: Ia30faa5866d71fd597009b441f69c609de974161
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471460
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Create a local variable for the spdk_bdev_channel
pointer. This will be used again later in this patch
series.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I71bef863c422e60294b5249bf65f62d4939d1a53
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477870
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add an API so that the user can enable/disable the bdev IO
timeout. Also, add the bdev io timeout handling callback.
So it means to let the upper user determine how to handle
the IO timeout scenario reset the device or abort the IO.
Change-Id: I9c7138ca46c74c045b687adab59a18d6bccc4996
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469228
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously, when additional resets were submitted while a reset was in
progress, those resets were queued and then executed serially as part of
executing the original reset. Doing multiple resets on a bdev in quick
succession is not useful if the first reset succeeds and is very
unlikely to be useful in the negative case. Instead, we should batch
resets and complete them all at once when the current reset succeeds or
fails.
Change-Id: If10e0f37526860eaeeb41a8803d6298a3eff3212
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474599
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Add the tsc trace for the parent bdev_io.
Change-Id: Idf5c5df3953fa1697c384a4f83b591584dd30a80
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476588
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
The reset cmd and parent of splited IO would not go into
the _bdev_io_submit() so they don't record the submitted
time. The timeout IO check needs the submitted time so we
add them in this patch.
Change-Id: Ic7217171ec878d4a6dfa80d6635957ca0186928b
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476050
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Add a link to the bdev channel for linking all the bdev
IOs that were submitted to this channel so that we can
monitor each IO's consuming-time.
Change-Id: I1e425b2059f20fd7b158eb3d6b023ce8629e7a30
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469227
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The _ was supposed to indicate an internal function,
but really leaving off the spdk in the function
is the standard way to denote an internal function.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I74facafb67b793502838b9c1b5f90aec2c88c69b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475033
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
We try to reserve the spdk_ prefix for functions that
are part of the public SPDK API.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0a95ba0f9db04b588a2e948427f8a9f53a7b5740
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This function is not part of the SPDK public API, so
remove the spdk_ prefix from it to make it more clear
this is internal to the SPDK bdev library.
Also add "channel" to the name since this function is
operating on the channel parameter.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I57897ab4bf601b90551259b7cf6efa63152ed02f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475031
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This is not part of the public SPDK API, so remove
the spdk_ prefix to make it clear this is internal
to the bdev library.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibda76e4d8e34dde0c2fe638cb965e5ba2d9e47b5
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475030
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
This function is not part of the SPDK public API, so
remove the spdk_ prefix to make that more clear when
reading the code.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5ea4c04c474a6cf0862f11921daa97c9df728e96
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475029
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
In emulated zero copy mode on devices with separate metadata support
the start zero copy request could assert after read completion,
during _bdev_io_unset_bounce_buf; the original iovs would be NULL, but
original md_buf would have garbage data.
Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Change-Id: Ia52b012c1b7462ebe490c29f5493168775646848
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473455
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This change allows setting of the NVMe completion queue
CDW0 in spdk_bdev_io_complete_nvme_status.
Before that change, handling of vendor specific NVMe IO
commands was limited since there wasn't a way to return
command specific info back to the initiator.
Change-Id: I250d5df3bd1e62ddb89a011503d42bd4c8390f9b
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470678
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When splitting bdev's, if we run out of child IOVs, we have to
make sure that we end block aligned. That may require us to
shorten or eliminate one or more child IOVs to be picked up
on the next split. If we eliminate enough such that there
are no IOVs for this split IO, just continue and the next
split completion will kick off another split to pick up
the remaining data to be transferred.
Fixes issue #981
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: If2bfbe5f0862295e1d74cbea00692890a2178967
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
When we ran out of child_iov space, ensure the iovs to be aligned
with block size. However the calculation was wrong.
(to_next_boundary_bytes % blocklen) meant not to_last_block but to_next_block.
So calculate to_last_block_size by reducing to_last_block_size from blocklen.
The data was collected when the issue occured. So add unit test
by using the data.
Fixes#979
Reported-by: Geoffrey McRae <geoff@hostfission.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I62a50bada450288ea7c60aec0e557c2a53cd8916
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470806
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch adds reference counter and gurading mutex to bdev_desc
structure to keep track of in-flight messages and avoid release of
descriptor until all of messages has been processed. With reference
counter existing remove_scheduled field in descriptor structure is not
needed anymore.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I97f78955362b04131abf202ba04e6d60343f3faf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469620
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
spdk_bdev_zone_append() allows writing to
open zone from multiple threads or from
single thread with queue depth greater than
one. Zone first logical address and number of
blocks to be written are provided by user.
Logical block address of written data is
returned in completion callback.
Change-Id: I4da994d72b7e0fe6621962e3b0f2380940ec3b45
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
spdk_bdev_io_get_append_location() will be
used during zone append command to retrieve
location of data write.
Change-Id: I1f46ae9d2f745aa53264c1a01da3f7cef4f38c72
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469164
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
spdk_bdev_zone_management() allows to perform
management action on a zone. Zone is specified
by start logical block address. Available zone
actions: open, close, reset and finish.
Change-Id: Ie7eaed3e2cc7b9b49dd51ee2d6c28b4ef2f23eb9
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460647
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_bdev_get_zone_info() is used for retriving
information about zones inside zoned namespace.
Change-Id: I8f931505245e984c0b1ee35ed6592c978ee47544
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Some internal bdev.c static function will be shared
with new zoned bdev module.
Change-Id: Ifbb8bf443f67b2daf97858b15d474ecce98a9efb
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469100
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Added new public header for zoned bdev. Zoned bdev is an
extension of the bdev interface. Generic concept comes
from ATA/SCSI and is also being worked as an NVMe TP.
Zoned device logical blocks space is divided into fixed-sized
zones. Each zone is described by its start logical block address
and capacity. Writes to a single zone need to be sequential.
After zone is fully written it need to be reset to write to it
again. Such writing schema could be very beneficial in terms of
write amplification factor for NAND based devices.
SPDK Flash Translation Layer library will be consuming this
interface in the future.
Extending SPDK bdev interface will allow to use existing bdev
infrastructure for this new type of devices.
Zoned device have several properties defined in spdk_bdev
structure:
- zone_size: default size of each zone
- max_open_zone: maximum number of open zones
- optimal_open_zones: optimal number of open zones to get
best performance on writes
Single zone properties are defined in spdk_bdev_zone_info
structure:
- start_lba: first logical block of z zone
- write_pointer: logical block address in the zone at
which next write shall occur.
- capacity: maximum number of logical blocks that may
be written in the zone when zone is empty.
- state: zone state
Several zone states are defined: Empty, Open, Full, Closed,
Read Only and Offline.
To change zone state zone actions are defined: Close, Finish,
Open and Reset.
Change-Id: I5fcc22d548c15743329344cae96f5ff73e268504
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
This patch is entry point for extending bdev
interface to support devices with zoned namespace
semantics.
spdk_bdev_is_zoned() will allow user to check if
bdev is zoned bdev.
Change-Id: Id9ea9898d406d1d942bf3081b00ebcb574ac2b5e
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Added write_unit_size field to bdev structure.
It describes required number of logical blocks
for write operation. For legacy bdevs this value
will be equal to logical block size.
For bdevs working on top of Open Channel/Zoned
Namespace SSDs or RAID 5 volumes write size unit
could be greater than logical block size and
upper layer should perform write operations
with size of multiple of write unit size.
Change-Id: I55eb6687912a7d0d1157874b2778e7d6c2d44e37
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463802
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
spdk_bdev_part_construct() must be now called on the
same thread that called spdk_bdev_part_base_construct().
This was always the case so far and I don't see any other
case where thread safety could be useful, so just remove
it. The doxygen doesn't say anything about it either.
Even in GPT case, we create a base directly as a part of
examine and then create part bdevs in the spdk_bdev_read()
completion callback, but that callback will be always
executed on the same thread which issued the read.
Change-Id: I752f2a7f08c9faf4231ed53a46b700b33fa13697
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>