1545 Commits

Author SHA1 Message Date
Jim Harris
dc3717296e bdev: handle unlock v. lock race
When we unlock a range, we remove the range from the
locked bdev list before doing the for_each_channel
iteration to remove the range from each channel.

But at the same time, right after removing from the
locked list, a new lock on that range could start.
In that case, we also do a for_each_channel to add
the range to each channel, and that will race with
the for_each_channel remove.  When the lock start
wins, it finds the range already in the channel,
but doesn't set the owner_range which results in
a seg fault when the for_each_channel completes.

The fix is actually rather simple.  We just add the
locked_ctx to the comparison when checking if the
range is already in the channel.  If the locked_ctx
matches, then we know it was added as part of
initializing a new channel.  If it doesn't, then
we create a new range object pointing to the new
locked_ctx.  The first one will get removed when
the remove for_each_channel catches up.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I94f8b20376dd437f404add35744d42fc148303ff

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482620
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-01-27 17:39:52 +00:00
Jim Harris
da11a46466 bdev: start lock process on original channel
If a locking operation has to wait because of an
existing lock, we queue the lock context.  When the
existing lock finishes unlocking, we restart the
queued lock context.  But we have to make sure
we restart the lock context on the same thread it
was originally submitted, since it has a channel
associated with it.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I555515f3adfc3c13a86584c601ed541d605980b7

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482463
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-27 17:39:52 +00:00
Maciej Szwed
a83644fe2b bdev: Lock LBA range for fused command execution
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I577f961484b2ebf350f4f795eda1a018c5f0fd7a

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481710
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-27 17:39:52 +00:00
Maciej Szwed
c13733915b bdev: Add spdk_bdev_get_acwu function
This function is required for NVMf implementation
for compare and write fused command.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If41611f5c0b8e4ed8eec66f09858c724f1800d59

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477914
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-27 17:39:52 +00:00
Maciej Szwed
adf90938b1 bdev: Add spdk_bdev_io_get_nvme_fused_status function
Added new function for getting NVMe specific return code
for fused commands. Also changed one of the return codes
in fused commands so that we could distinguish error
cases.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I86417ea4f5b8f3e6496162be3d6c6128076e35d4

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481666
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-01-27 17:39:52 +00:00
paul luse
82a4c84fec lib/bdev: Add support for new per bdev_io auxiliary buffer.
A new API was added `spdk_bdev_io_get_aux_buf` allowing the caller to request
an auxiliary buffer for its own private use. The API is used in the same manner that
`spdk_bdev_io_get_buf` is used and the length of the buffer is always the same as the
bdev_io primary buffer.  'spdk_bdev_io_put_aux_buf' is called to free the
auxiliary buffer.

The initial use case is crypto, in the next patch in series it is used. No UT were
added as the logic isn't that complicated and it is fully tested with each run
of crypto.

Fixed a comment typo also (not mine for once).

Signed-off-by: paul luse <paul.e.luse@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib1939fcbc8e5db36fd909ef26771a725a551e8e6
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478383
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-01-23 02:36:51 +00:00
Jim Harris
eaa984801e bdev: break spdk_bdev_io_get_buf into multiple functions
This will allow for some reuse in future patches adding
support to get auxiliary buffers from the bdev memory
pools.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icd4e71d0e918cfe940c749e40cd1cc6c560e25a9

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482022
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-20 10:02:45 +00:00
Jim Harris
03171deee8 bdev: add bdev_io_get_buf_complete helper function
This will be helpful in future patches where auxiliary
buffer support is added, and we have a different
type of callback to invoke.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I81d2cf3057e706a0e68de87a20b24c194205bbd5

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482021
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-20 10:02:45 +00:00
paul luse
d68459fcb3 lib/bdev: Add comment to bdev_io_put_buf() when unsetting bounce buffer
To clarify what seems like it might be unintentional.

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: Id14b49e8b7bc2372ecb8dd7579e6797a395c2cf2
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477363
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-20 09:57:16 +00:00
Maciej Szwed
32a961fc5d bdev: Add check for atomic compare and write unit
Compare and write fused operation num_blocks should
not exceed value of 'atomic compare and write unit'.
In case of NVMe native support we should read this
value from 'namespace atomic compare and write unit'
if set in namespace identify data, otherwise from
'atomic and write unit' field in controller identify
data. If bdev does not support this natively we should
set this value to 1.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: Ib1ea02dbf9d1eed476d9dd0114ea96b1376e0c45
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477911
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-15 12:25:26 +00:00
Maciej Szwed
91dac0f4d0 bdev: Add num_retries field in bdev_io
Added new field in bdev_io structure for tracking
number of IO retries. It will be used in future patches.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I8e002e93f54c9ce39c7af0dd3a1960e6aea93580

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479828
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-01-14 07:51:45 +00:00
Jim Harris
ae43c81a30 bdev: add spdk_bdev_comparev_and_writev_blocks
We will only support a vectored variant of
compare-and-write for now.

This does no locking for now.  Ii will be added
in a separate patch.

Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5bd075c912de60090e19cf8fced19c4879fcc900
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475941
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2020-01-14 07:51:45 +00:00
Jim Harris
2a2b7296ee bdev: do not allow overlapped locked ranges
We can't allow overlapped locked ranges - otherwise
two different channels could be deadlocked.

So add a pending_locked_ranges to the bdev.  When we
start a lock operation, check if the new range overlaps
one that's already locked.  If so, put it on the pending
list.  When an unlock operation completes, we will
check if any pending ranges can now be locked.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2e3113216a195887b954533495ff200df14fadc1

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478537
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2020-01-14 07:50:26 +00:00
Jim Harris
ebd1a4f76c bdev: inherit locked ranges for new channels
Keep a mutex protected list of the active locked ranges
in the bdev itself.  This is only accessed when a new
channel is created, so that it can be populated with
the currently locked ranges.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Id68311b46ad4983b6bc9b0e1a8664d121a7e9f8e

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477871
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2020-01-14 07:50:26 +00:00
Jim Harris
b90b7ce477 bdev: queue new IO that overlap a locked region
Add an io_locked TAILQ to each channel, which hold
IO that will write to a currently locked region.

Also add a new step to the locking process per channel.
Each channel needs to wait until all existing outstanding
writes to the newly locked range have been completed.

Only the channel that locked an LBA range may submit
write I/O to it.  It must use the same cb_arg for the
write I/O as the cb_arg used when locking the LBA range.
This ensures that only the specific I/O operations needing
the lock will bypass the lock.

When a range is unlocked, we will just blindly try to
resubmit all IO in the io_locked tailq.  This could be
made more efficient in the future, but we don't expect
this path to occur very often, so going for simplicity
in the first pass.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibdc992144dfaffe7c05471a5b3c020cedd8cdfc3

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478226
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-01-14 07:50:26 +00:00
Jim Harris
d84a88c1a0 bdev: add base infrastructure for locked lba ranges
This adds new internal APIs bdev_lock_lba_range and
bdev_unlock_lba_range.  To start, these APIs will
manage dissemination of lock/unlock requests to all
existing channels for a given bdev.  This does not
yet interact at all with any I/O sent to the channel.

Future patches will check new I/O to see if they
are trying to write to a range that is locked.  Future
patches will also ensure we do not have overlapping
ranges active at once.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6d5b1cc84b41a7adc2a3c5791c766bb77376581f
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478225
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2020-01-14 07:50:26 +00:00
Jim Harris
b87080efa2 bdev: add lba_range and overlapping checks
This will be used by upcoming patches for implementing
LBA range locks.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifa4ad8dcc0d09ccf20d35f010fcae19dcc17abc9

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478224
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
2020-01-14 07:50:26 +00:00
Maciej Szwed
f8d024b8c1 bdev: Add support for compare operations
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I45b99abe1ed760aa59785e5c4b4f71917e1f92c3
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477458
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2020-01-07 12:31:32 +00:00
GangCao
66af6221a0 bdev: check whether memory allocated already
In the case of bdev initialization, the memory could
not be allocated. When cleaning up, it needs to check
whether the memory pointer is valid or not before
touching the address of the pointer.

This is to fix below issue:
https://github.com/spdk/spdk/issues/1126

Change-Id: Iec0e2c5a40ad153a96fb939414a15ac3b9112a01
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478849
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: <box.b.chen@intel.com>
Reviewed-by: yidong0635 <dongx.yi@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-30 11:49:54 +00:00
Seth Howell
1384ad007c module/bdev: add zcopy abilities to virtual bdevs.
These bdevs claim native support for zcopy operations. They should
actually support them when running on top of a bdev that claims zcopy
support.

For example, if you were to build a part or passthru bdev on top of a
malloc_bdev (which is just for testing primarily) and then try to run
bdevperf on top of that, you will currently get errors. This patch fixes
that problem.

Change-Id: I023557a3a5b1baf177cc29f1cbc1cd391cc67a8a
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478243
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-20 10:06:47 +00:00
Konrad Sztyber
a69f90ddd8 lib/bdev: media management events
Media management event was introduced.  It's sent out to notify that
some portion of the data needs to be rewritten (e.g. due to data
refresh, wear leveling, high error rate, etc.).  This type of
notification is only utilized by devices exposing raw access to the
physical medium (e.g. Open Channel SSDs).

Change-Id: Ia30faa5866d71fd597009b441f69c609de974161
Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471460
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-20 10:02:15 +00:00
Jim Harris
96abfe63ef bdev: simplify bdev_io_submit
Create a local variable for the spdk_bdev_channel
pointer.  This will be used again later in this patch
series.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I71bef863c422e60294b5249bf65f62d4939d1a53

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477870
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
2019-12-18 10:09:45 +00:00
Jin Yu
19d803e89c bdev: add bdev timeout support
Add an API so that the user can enable/disable the bdev IO
timeout. Also, add the bdev io timeout handling callback.
So it means to let the upper user determine how to handle
the IO timeout scenario reset the device or abort the IO.

Change-Id: I9c7138ca46c74c045b687adab59a18d6bccc4996
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469228
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Community-CI: SPDK CI Jenkins <sys_sgci@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-12-16 12:20:43 +00:00
Seth Howell
b9af5dd834 lib/bdev: batch queued bdev reset completions.
Previously, when additional resets were submitted while a reset was in
progress, those resets were queued and then executed serially as part of
executing the original reset. Doing multiple resets on a bdev in quick
succession is not useful if the first reset succeeds and is very
unlikely to be useful in the negative case. Instead, we should batch
resets and complete them all at once when the current reset succeeds or
fails.

Change-Id: If10e0f37526860eaeeb41a8803d6298a3eff3212
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/474599
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-12-09 13:57:02 +00:00
Jin Yu
f209b343b6 bdev: add the tsc trace
Add the tsc trace for the parent bdev_io.

Change-Id: Idf5c5df3953fa1697c384a4f83b591584dd30a80
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476588
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-12-06 14:47:56 +00:00
Jin Yu
c6e50f7f41 bdev: record the submit time
The reset cmd and parent of splited IO would not go into
the _bdev_io_submit() so they don't record the submitted
time. The timeout IO check needs the submitted time so we
add them in this patch.

Change-Id: Ic7217171ec878d4a6dfa80d6635957ca0186928b
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476050
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-12-06 14:47:56 +00:00
JinYu
56de1b0061 bdev: add bdev channel a link
Add a link to the bdev channel for linking all the bdev
IOs that were submitted to this channel so that we can
monitor each IO's consuming-time.

Change-Id: I1e425b2059f20fd7b158eb3d6b023ce8629e7a30
Signed-off-by: JinYu <jin.yu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469227
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-12-06 14:47:56 +00:00
Jim Harris
aa4856632e bdev: remove _spdk_bdev function prefix
The _ was supposed to indicate an internal function,
but really leaving off the spdk in the function
is the standard way to denote an internal function.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I74facafb67b793502838b9c1b5f90aec2c88c69b

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475033
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-11-20 09:34:18 +00:00
Jim Harris
0e58d1549b bdev: remove spdk_ prefix from static functions
We try to reserve the spdk_ prefix for functions that
are part of the public SPDK API.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I0a95ba0f9db04b588a2e948427f8a9f53a7b5740

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475032
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-11-20 09:34:18 +00:00
Jim Harris
fdfb4e1270 bdev: rename spdk_bdev_get_io to bdev_channel_get_io
This function is not part of the SPDK public API, so
remove the spdk_ prefix from it to make it more clear
this is internal to the SPDK bdev library.

Also add "channel" to the name since this function is
operating on the channel parameter.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I57897ab4bf601b90551259b7cf6efa63152ed02f

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475031
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-11-20 09:34:18 +00:00
Jim Harris
a1d68e9c02 bdev: remove spdk_ prefix from spdk_bdev_io_init
This is not part of the public SPDK API, so remove
the spdk_ prefix to make it clear this is internal
to the bdev library.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibda76e4d8e34dde0c2fe638cb965e5ba2d9e47b5

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475030
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-11-20 09:34:18 +00:00
Jim Harris
6c07e9d1a8 bdev: remove spdk_ prefix from spdk_bdev_io_submit
This function is not part of the SPDK public API, so
remove the spdk_ prefix to make that more clear when
reading the code.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I5ea4c04c474a6cf0862f11921daa97c9df728e96

Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/475029
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
2019-11-20 09:34:18 +00:00
Kozlowski Mateusz
5b88597ac6 bdev: Initialize md_buf for zero copy start request
In emulated zero copy mode on devices with separate metadata support
the start zero copy request could assert after read completion,
during _bdev_io_unset_bounce_buf; the original iovs would be NULL, but
original md_buf would have garbage data.

Signed-off-by: Kozlowski Mateusz <mateusz.kozlowski@intel.com>
Change-Id: Ia52b012c1b7462ebe490c29f5493168775646848
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/473455
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-11-14 15:41:40 +00:00
Michael Haeuptle
53b92a6c18 nvme: allow setting of completion queue CDW0
This change allows setting of the NVMe completion queue
CDW0 in spdk_bdev_io_complete_nvme_status.

Before that change, handling of vendor specific NVMe IO
commands was limited since there wasn't a way to return
command specific info back to the initiator.

Change-Id: I250d5df3bd1e62ddb89a011503d42bd4c8390f9b
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470678
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-10-24 18:02:38 +00:00
paul luse
ebacdcd923 lib/bdev: don't issue a split IO if there are no IOVs
When splitting bdev's, if we run out of child IOVs, we have to
make sure that we end block aligned. That may require us to
shorten or eliminate one or more child IOVs to be picked up
on the next split.  If we eliminate enough such that there
are no IOVs for this split IO, just continue and the next
split completion will kick off another split to pick up
the remaining data to be transferred.

Fixes issue #981

Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: If2bfbe5f0862295e1d74cbea00692890a2178967
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-10-23 17:03:50 +00:00
Shuhei Matsumoto
3deaf00580 bdev: Fix wrong IO split when child iovs run out
When we ran out of child_iov space, ensure the iovs to be aligned
with block size. However the calculation was wrong.

(to_next_boundary_bytes % blocklen) meant not to_last_block but to_next_block.
So calculate to_last_block_size by reducing to_last_block_size from blocklen.

The data was collected when the issue occured. So add unit test
by using the data.

    Fixes #979

Reported-by: Geoffrey McRae <geoff@hostfission.com>

Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I62a50bada450288ea7c60aec0e557c2a53cd8916
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470806
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-11 00:35:34 +00:00
Chunyang Hui
b60a249bea bdev/part: add API to get name of bdev_part_base
Change-Id: I1545f5aaab67c3a56d37c712cafc932c089a0477
Signed-off-by: Chunyang Hui <chunyang.hui@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468909
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-10-09 14:39:19 +00:00
Evgeniy Kochetov
9522ed36f8 bdev: Add resize event
This patch adds RESIZE event to block device API.

Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: Ic9fe359d19544fa0a48eed91216b858097e82995
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468543
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-10-02 22:30:29 +00:00
Evgeniy Kochetov
ba00bff251 bdev: Add mutex and reference counter to block device descriptor
This patch adds reference counter and gurading mutex to bdev_desc
structure to keep track of in-flight messages and avoid release of
descriptor until all of messages has been processed. With reference
counter existing remove_scheduled field in descriptor structure is not
needed anymore.

Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I97f78955362b04131abf202ba04e6d60343f3faf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469620
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-10-02 22:30:29 +00:00
Wojciech Malikowski
45b5808eba lib/bdev: Added spdk_bdev_zone_append()
spdk_bdev_zone_append() allows writing to
open zone from multiple threads or from
single thread with queue depth greater than
one. Zone first logical address and number of
blocks to be written are provided by user.
Logical block address of written data is
returned in completion callback.

Change-Id: I4da994d72b7e0fe6621962e3b0f2380940ec3b45
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/461614
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
2019-10-01 14:48:54 +00:00
Wojciech Malikowski
a9e6fadf3e lib/bdev: Added spdk_bdev_io_get_append_location()
spdk_bdev_io_get_append_location() will be
used during zone append command to retrieve
location of data write.

Change-Id: I1f46ae9d2f745aa53264c1a01da3f7cef4f38c72
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469164
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Konrad Sztyber <konrad.sztyber@intel.com>
2019-09-26 16:17:34 +00:00
Maciej Wawryk
8e052bf41b RPC: rename set_bdev_options to bdev_set_options
Signed-off-by: Maciej Wawryk <maciejx.wawryk@intel.com>
Change-Id: I02cd87c9e13db6f237db15ccb15c19a63ad4568d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468948
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-24 16:42:41 +00:00
Wojciech Malikowski
2b16dcc740 lib/bdev: Added spdk_bdev_zone_management()
spdk_bdev_zone_management() allows to perform
management action on a zone. Zone is specified
by start logical block address. Available zone
actions: open, close, reset and finish.

Change-Id: Ie7eaed3e2cc7b9b49dd51ee2d6c28b4ef2f23eb9
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460647
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
2019-09-24 16:39:27 +00:00
Wojciech Malikowski
02cabd9eb8 lib/bdev: Added spdk_bdev_get_zone_info() command
spdk_bdev_get_zone_info() is used for retriving
information about zones inside zoned namespace.

Change-Id: I8f931505245e984c0b1ee35ed6592c978ee47544
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460643
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
2019-09-24 16:39:27 +00:00
Wojciech Malikowski
ec63516af5 lib/bdev: Move some bdev.c static functions to internal header
Some internal bdev.c static function will be shared
with new zoned bdev module.

Change-Id: Ifbb8bf443f67b2daf97858b15d474ecce98a9efb
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469100
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
2019-09-24 16:39:27 +00:00
Wojciech Malikowski
aa35048d15 lib/bdev: Zoned bdev public header
Added new public header for zoned bdev. Zoned bdev is an
extension of the bdev interface. Generic concept comes
from ATA/SCSI and is also being worked as an NVMe TP.

Zoned device logical blocks space is divided into fixed-sized
zones. Each zone is described by its start logical block address
and capacity. Writes to a single zone need to be sequential.
After zone is fully written it need to be reset to write to it
again. Such writing schema could be very beneficial in terms of
write amplification factor for NAND based devices.
SPDK Flash Translation Layer library will be consuming this
interface in the future.

Extending SPDK bdev interface will allow to use existing bdev
infrastructure for this new type of devices.

Zoned device have several properties defined in spdk_bdev
structure:
 - zone_size: default size of each zone
 - max_open_zone: maximum number of open zones
 - optimal_open_zones: optimal number of open zones to get
		       best performance on writes

Single zone properties are defined in spdk_bdev_zone_info
structure:
 - start_lba: first logical block of z zone
 - write_pointer: logical block address in the zone at
		  which next write shall occur.
 - capacity: maximum number of logical blocks that may
	     be written in the zone when zone is empty.
 - state: zone state

Several zone states are defined: Empty, Open, Full, Closed,
Read Only and Offline.

To change zone state zone actions are defined: Close, Finish,
Open and Reset.

Change-Id: I5fcc22d548c15743329344cae96f5ff73e268504
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460642
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
2019-09-24 16:39:27 +00:00
Wojciech Malikowski
9b3ae37f33 lib/bdev: Added spdk_bdev_is_zoned() to bdev interface
This patch is entry point for extending bdev
interface to support devices with zoned namespace
semantics.

spdk_bdev_is_zoned() will allow user to check if
bdev is zoned bdev.

Change-Id: Id9ea9898d406d1d942bf3081b00ebcb574ac2b5e
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460641
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
2019-09-24 16:39:27 +00:00
Wojciech Malikowski
18ed0c7f6b lib/bdev: spdk_bdev_get_write_unit_size() interface
Added write_unit_size field to bdev structure.
It describes required number of logical blocks
for write operation. For legacy bdevs this value
will be equal to logical block size.
For bdevs working on top of Open Channel/Zoned
Namespace SSDs or RAID 5 volumes write size unit
could be greater than logical block size and
upper layer should perform write operations
with size of multiple of write unit size.

Change-Id: I55eb6687912a7d0d1157874b2778e7d6c2d44e37
Signed-off-by: Wojciech Malikowski <wojciech.malikowski@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/463802
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
2019-09-20 20:19:40 +00:00
Maciej Wawryk
c61e14be6c RPC: rename set_bdev_qos_limit to bdev_set_qos_limit
Signed-off-by: Maciej Wawryk <maciejx.wawryk@intel.com>
Change-Id: Ia22b0043307e92222541b6e7278e3d1485b61987
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468084
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
2019-09-13 22:39:15 +00:00
Darek Stojaczyk
4368937b4e bdev/part: remove thread safety from part_construct()
spdk_bdev_part_construct() must be now called on the
same thread that called spdk_bdev_part_base_construct().

This was always the case so far and I don't see any other
case where thread safety could be useful, so just remove
it. The doxygen doesn't say anything about it either.

Even in GPT case, we create a base directly as a part of
examine and then create part bdevs in the spdk_bdev_read()
completion callback, but that callback will be always
executed on the same thread which issued the read.

Change-Id: I752f2a7f08c9faf4231ed53a46b700b33fa13697
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
2019-08-30 15:48:02 +00:00