Implementation was updated by the previous patch series but the
comment was not. Hence update the comment in this patch.
Change-Id: Ib2097561501984c4ad213652ba716ce195c7bea5
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424737
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Use helper function raid_bdev_config_find_by_name() in destroy_raid_bdev
RPC.
Besides, unify two iterator variables in construct_raid_bdev RPC.
This is too small to add in a separate patch.
Change-Id: If0bbae8993af4e402a82299db378e89245b8966e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424150
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
raid_bdev_config_destroy_check_raid_bdev_exists() and
raid_bdev_config_destroy() do almost same operation and can be
unified to raid_bdev_config_destroy().
Comment in spdk_rpc_destroy_raid_bdev() about raid_bdev_config_destroy()
are duplicated with the one on top of the raid_bdev_config_destroy().
Hence remove the former in this patch too.
Change-Id: I4a323bf24d2266e2da7f030c52771013d075eb31
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424149
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Initialization of bdev modules don't have fixed ordering.
Hence both cases should succeed to create configured raid bdev
for .INI config file.
1. create raid bdev and then create base bdevs of it.
2. before creating raid bdev, create base bdevs of it.
To do this, factor out adding base bdevs to raid bdev
to a helper function raid_bdev_add_base_devices() first
and then apply it to construct_raid_bdev RPC and parse .INI
config file.
Change-Id: I4389fd4e7bd41296dfa9210bbe8b60348ee5bc9d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423769
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Change the behavior of construct_raid_bdev RPC to be consistent with
other bdev modules.
Create a config of raid bdev first. Then create a raid bdev based
on the config.
If both succeed, the config and the raid bdev are not removed even if
any failure occur later in construct_raid_bdev RPC. Otherwise, both
are removed and return failure.
During iteration of adding base bdevs in construct_raid_bdev RPC,
- skip any nonexistent bdev and move to the next base bdev.
- if adding any base bdev fails, move to the next bdev.
- if adding base bdevs don't fail but any base bdev doesn't exist,
the raid bdev is in the configuring state and return success.
- if adding base bdev fails, the raid bdev is in the configuring state
and return failure.
- if all adding base bdevs succeed, configure the raid bdev. If configuring
the raid bdev fails, move the raid bdev to the offline state and return
failure. If configuring the raid bdev succeed, return success.
check_and_remove_raid_bdev() becomes unused in raid_bdev_rpc.c but
is still necessary in UT. Hence move this function to UT.
In UT, finding a raid bdev config by name becomes necessary. Hence
factor out the iteration to a function and use the function in UT.
Change-Id: Ifa36967bdc987d97030e3a4e36684cb37b329d4e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423622
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When construct_raid_bdev RPC is callled, even if all adding base bdevs
to it fail, a raid bdev with no base bdev should exist and be in
the configuring state.
Extract raid_bdev_create() from raid_bdev_add_base_bdev() and put it
before starting iteration of raid_bdev_add_base_bdev().
This change cam be applied to .INI config file too.
Change-Id: Ia71aca0bbdb33049fe5bd9b8fd8163c99c032401
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/424148
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This prevents us from using the bdev embedded in the lun after it has
been freed.
Change-Id: I780cf3eccca05a58d3461366fec024be42b8ff74
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/425174
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
We cannot split an iov if a buffer hasn't been
allocated yet. So always call spdk_bdev_io_get_buf
on reads before trying to split.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I2c26efb9dc6cb2c7c3e3b7ae5bab2c37844b9113
Reviewed-on: https://review.gerrithub.io/424879
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
spdk_bdev_io_put_buf() is responsible for reclaiming
bdev-allocated buffers from a bdev_io. If there are
bdev_ios waiting for one of these buffers, it calls
spdk_bdev_io_set_buf() on the next bdev_io in the queue.
This will set the iov_base and iov_len on the bdev_io
to point to the bdev-allocated buffer.
But spdk_bdev_io_put_buf() was calling spdk_bdev_io_set_buf()
on the just completed bdev_io, not the next bdev_io in the
queue. So fix that.
Fixes: 844aedf8 ("bdev: Simplify get/set/put buf functions")
Reported-by: Alan Tu
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ibbcad6e35a3db6991bd7deb3516229572f021638
Reviewed-on: https://review.gerrithub.io/424880
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When new bdev was created, the struct spdk_bdev_module::examine_disk()
may open and close bdev. On the other hand, if something goes wrong,
the creation procedure may unregister new created bdev, so race
condition appeared between _remove_notify() and spdk_bdev_close().
Add the new field "closed" and "remove_notified" in struct spdk_bdev_desc,
so _remove_notify() and spdk_bdev_close() knows how to deal with this
situation.
Change-Id: Ibfe915a4d76096796b039a13a4f49f26669eba2c
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/423369
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
spdk_bdev_close() must be called on the same thread as
spdk_bdev_open(). Further, the remove callback on the
descriptor will also be run on the same thread as
spdk_bdev_open().
Change-Id: I949d6dd67de1e63d39f06944d473e4aa7134111b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424738
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
When emulating write_zeroes commands on device that
don't natively support it, we submit a write with
a zeroed buffer. We used to just reuse the original
bdev_io, but that was recently changed due to other
splitting code added for iovs. But when making those
changes, we forgot to free the bdev_io for the
write that was sent down to the device.
Fixes: 183f37e8 (bdev: do not reuse bdev_io when...)
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If08782c65f6305c0a9f9d15d74fd8823e1158e9b
Reviewed-on: https://review.gerrithub.io/424733
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Change-Id: I28f21649feae8022cd0f0afd5b01ae3ee2b3592e
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/424618
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This RPC does not work for a lot of bdev types. For
example, NVMe namespaces and virtio scsi LUNs are not
explicitly constructed by an RPC - they are indirectly
constructed by an RPC associated with an NVMe controller
or virtio-scsi controller.
While here, remove spdk_bdev_config_json. It was
only created to facilitate this get_bdevs_config RPC.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I537166d8f91ab458bd2000859d74f7254bfc9c0a
Reviewed-on: https://review.gerrithub.io/424584
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This isn't valid RPC so it needs to be removed. Bdev modules were
working around this issue by defining empty write_config_json
methods.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3c4c20249eadfcfb4103430f5801190b14897249
Reviewed-on: https://review.gerrithub.io/424582
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
io_device is registered, but never release.
It will lead spdk_io_device_register to fail
because allocate new io_device may reuse the same memory address.
Change-Id: I3359ed783c1df43ef382669d9d73d06bfbc1fe2d
Signed-off-by: Li Feng <lifeng1519@gmail.com>
Reviewed-on: https://review.gerrithub.io/424235
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This was added a long time back for tracking an rte_mbuf
whose buffer was a different rte_mbuf - all related to
a userspace TCP stack that is no longer in development.
The concept isn't useful now, so remove it to reduce
the complexity of the tracing code.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I310e492eba7f55df242bb29d82fb19f6daee1f51
Reviewed-on: https://review.gerrithub.io/424565
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This is a string name used for debugging only.
Change-Id: I9827f0e6c83be7bc13951c7b5f0951ce6c2a1ece
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/424127
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
There could be cases (especially in virtualized and/or test
environments) where we could accumulate significant skew in
the timeslice frequency. Rather than depend on the application
framework to try to guarantee the rate of timeslice poller
callbacks, keep track internally of the last time the poller
was invoked. If/when we accumulate and detect skew equivalent
to one or more timeslices, increase the allowed IO and bandwidth
of the next timeslice to accomodate.
Since bdev poller now calls spdk_get_ticks() to do accounting,
this patch also fixes up the increment_time() unit test function
and the test env layer to properly increment the fake TSC.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iba301ddc0fb3d02042106a8bf6e4a6a9a84dc263
Reviewed-on: https://review.gerrithub.io/423580
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: GangCao <gang.cao@intel.com>
buf is no longer needed, now that RAID I/O will never
span a strip boundary.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Icc03b985f540797597c238005bf9883f6f8ad010
Reviewed-on: https://review.gerrithub.io/423406
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This has been reduced to just a call to
spdk_bdev_io_complete(), so it is no longer needed.
While here, call spdk_bdev_io_complete() directly
in the strip-crossing error case as well.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ia949e3fd55b52e5c873bd591e541ecf6c0c086b1
Reviewed-on: https://review.gerrithub.io/423327
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Now that I/O never span a strip boundary, we don't
need to count splits.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Iea5df5ca055fe4b338fb5cb0828a45502c239458
Reviewed-on: https://review.gerrithub.io/423325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Set the bdev->optimal_io_boundary to the strip size, and
set split_on_optimal_io_boundary = true. This will ensure
that all I/O submitted to the raid module do not cross
a strip boundary, meaning it does not need to be split
across multiple member disks.
This is a step towards removing the iovcnt == 1
limitation. Further improvements and simplifications
will be made in future patches before removing this
restriction.
Unit tests need to be adjusted here to not span
boundaries either.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I08943805def673288f552a1b7662a4fbe16f25eb
Reviewed-on: https://review.gerrithub.io/423323
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Splitting a 1TB unmap into individual 64KB unmap commands
(for a RAID volume with 64KB strip size) would be awful -
the RAID module can be much smarter about this.
So back out the changes for splitting I/O without payload.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I24fe6d911f4e3c9db4b2cb5d66c7236a5596e0d9
Reviewed-on: https://review.gerrithub.io/424103
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
This removes assumptions about hte lifecycle of these variables relative
to the parent bdev and allows us to avoid ambiguity about who
is responsible for freeing these buffers.
Change-Id: Ia996653562d532fa1501faf21d3fdff85033ab33
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/424105
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Next patch will keep RAID bdev without removal when adding any base
bdev to the RAID bdev fails for some reason for construct_raid_bdev
RPC.
The half-baked RAID bdev will become unnamed and this should be avoided.
Change-Id: I3ad0d58a04a7638b9b286babe0e56e057e7f5d0d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423621
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
When check_and_remove_raid_bdev(raid_cfg) is called, raid_bdev
must be in configuring state. However, offline state will be
probable as well. Besides similar and more comprehensive state
check is done in raid_bdev_cleanup(). Hence remove this check.
Change-Id: I8526da51828f18a8393afbd3d81c0a52f2d33261
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423620
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
When a raid bdev is constructed by JSON-RPC construct_raid_bdev,
information about config and slot can be passed to
raid_bdev_add_base_device() and raid_bdev_can_claim() doesn't have
to be called.
Hence extract raid_bdev_can_claim_bdev() from raid_bdev_add_base_device()
and put it to raid_bdev_examine().
Change-Id: I92e02bf3661cb97b691246f32198ba946810d96c
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
When the name of raid bdev is already used by any existing bdev,
spdk_bdev_register() will fail. Hence checking if bdev is duplicated
can be left with spdk_bdev_register().
Change-Id: I22f0c75b0e87c5bd71ece383d146ec61cfaf2627
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423617
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
When raid_bdev_alloc_base_bdev_resource() is called through raid_bdev_examine(),
if raid_bdev_alloc_base_bdev_resource() fails, raid_bdev_cleanup() is called
without freeing base bdev resource.
There is no way to free base bdev resource in this case.
Hence remove cleanup call.
Change-Id: I93494502bfe653293825e11872de5883702e495a
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423616
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
raid_base_bdev_config is suffciently descriptive as name and so
"name" can be used as the name of the base bdev.
Other structs have used "name" as the name of bdev.
Change-Id: I090ef541a84fd14d9ec3bfebbdc96ae649709551
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423614
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Now that we split on I/O boundaries, that code needs to
be able to use the bdev_io split* members to track
what is left to submit. This means that the write_zeroes
code cannot submit the parent bdev_io as the child bdev_io,
since the I/O boundary code will overwrite the write_zeroes
split accounting.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9316b59267508f60799766fc4f1ea05a4b3e5d9e
Reviewed-on: https://review.gerrithub.io/423404
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
This function was meant to handle case when a RAID
volume only had one member disk. This is not a
common case, so just go through the
raid_bdev_submit_children() path in this case. This
will simplify some upcoming patches to use
split_on_optimal_io_boundary to enable vectored
I/O.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic7c7f7987d54ffd836deabb98530f7c339094809
Reviewed-on: https://review.gerrithub.io/423322
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will simplify some future patches which will
account for missed timeslice timers by allowing
additional IO/BW in the following timeslice.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I9dd46a768c98ce267c733a9f9719a2d3d2c3c915
Reviewed-on: https://review.gerrithub.io/423579
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This was defined in two places, so consolidate
the definitions.
Change-Id: I0bbb262b97e90d1064bcc50ee201928f6ca9518a
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/423182
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
According to the purpose of the lists, state_link and global_link
will be enough to understand the purpose well.
Besides, in raid_bdev_remove_base_bdev(), if any entry is not removed
during iteration, TAILQ_FOREACH_SAFE() is not necessary. Hence
change TAILQ_FOREACH_SAFE to TAILQ_FOREACH for this case.
Change-Id: I3022c58faf96721df9241e07dbb5a06d7de89e70
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/423056
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Other helper functions raid_bdev_free_base_bdev_resource() and
raid_bdev_cleanup() can be used in check_and_remove_raid_bdev().
Change-Id: I24ce02371f5bb29c3b6111da4475a9ad08e65c21
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422794
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
RAID parameter check operation of JSON RPC and config file are
duplicated now. This refactoring is one of small preparation to
extend RAID bdev to other RAID levels.
Change-Id: I88527bc9bd0b3a3392aba7d0008a53d518027bfa
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422797
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
IO channel of the RAID bdev has not been changed during IO submission.
Hence it can be set to raid_bdev_io at the start of IO submission.
This will simplify the logic and improve maintainability.
Change-Id: I7d2f35d877be5ea0731e9b635133c1106f294a57
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422790
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
to hold a pointer to the struct raid_bdev_io, raid_bdev_io had
been used throughout. Using struct name as is may be error prone.
raid_io may be sufficiently descriptive and fit other bdev modules.
Change-Id: I2c19359b75e89cc593371d8bee46203494f57841
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422924
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
struct raid_bdev has a pointer raid_bdev_config to its configuration,
but raid_bdev is duplicated between raid_bdev and raid_bdev_config
and just config may be enough as a name.
Change-Id: I77f6a69b913cd540c22f05803c486a4caf103f24
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422923
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
In bdev_raid.c and bdev_raid_rpc.c, to hold a pointer to
raid_bdev_config, some have used raid_cfg and other have used
raid_bdev_config.
raid_bdev_config is the name of the struct and using the name of
the struct as a pointer to it may cause any confusion.
Change-Id: Ic87ff04d58ea46c30081aaae8d9cda5a53135be9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422922
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
base_bdevs_io_channel is good but base_channel may be enough and
fit other bdev modules.
Change-Id: I67a1d224f1ef4ca1fc048b4325333f2552a37150
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422921
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
In bdev_raid.c, to hold a pointer to raid_bdev_io_channel, some
have used ch and other have used raid_bdev_io_channel.
Using raid_ch as that purpose throughout will be consistent and
differentiation to pointers to spdk_io_channel, and will improve
readability.
Change-Id: Ib10729f9fa3ebbfdb7b7cc013386b9b5cc0e93cb
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422920
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
raid_base_bdev_info is sufficiently descriptive as name and so
names of member variables in raid_base_bdev_info can be simplified.
Change-Id: I759a758ec0f3ac21de767ad379a902f1b04cc66d
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/422919
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
A number of modules (RAID, logical volumes) have logical
"stripes" that require splitting an I/O into several
child I/O. For example, on a RAID-0 with 128KB strip size,
an I/O that spans a 128KB boundary will require sending
one I/O for the portion that comes before the boundary to
one member disk, and another I/O for the portion that comes
after the boundary to another member disk. Logical volumes
are similar - data is allocated in clusters, so an I/O that
spans a cluster boundary may need to be split since the
clusters may not be contiguous on disk.
Putting the splitting logic in the common bdev layer ensures
bdev module authors don't have to always do this themselves.
This is especially helpful for cases like splitting an I/O
described by many iovs - we can simplify this a lot by
handling it in the common bdev layer.
Note that currently we will only submit one child I/O
at a time. This could be improved later to submit multiple
child I/O in parallel, but the complexity in the iov splitting
code also increases a lot.
Note: Some Intel NVMe SSDs have a similar characteristic.
We will not use this bdev stripe feature for NVMe though -
we want to primarily use the splitting functionality inside
of the NVMe driver itself to ensure it remains fully
functional. Many SPDK users use the NVMe driver without
the bdev layer.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ife804ecc56f6b2b55345a0d0ae9fda9e68632b3b
Reviewed-on: https://review.gerrithub.io/423024
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This value is used to calculate the disk utilization of a given bdev.
Change-Id: I4bf101c524b92bdd21573941e17f61db59c5c6b8
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/423017
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be needed for using this same descriptor when
splitting an I/O.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Idec759df7ab27f8de567d3c8a4214e25dbe173f5
Reviewed-on: https://review.gerrithub.io/423022
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
So we don't need to allocate memory (maybe failed) just for free other
memory.
Change-Id: I2c83f6acc2aa6ed79455bff90f952a2e70b44d59
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422203
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Previously a null buffer on read would get passed on which would
be fine for a simple example but for something more complex where
the module in question might need that buffer at some time after
the read callback, it's possible that an underlying module might
free it too soon. So, to be a good example we now own the buffer
in this module.
Change-Id: I411477e83b1e222c64011688952cfd587c32c16e
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/422603
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The passthru module associates a vbdev and the underlying bdev.
This association lives throughout the lifetime of the module and because
of this, passthru vbdevs are recreated when they shouldn't be.
This patch ensures that the association between a vbdev and a bdev is
broken when the vbdev is removed, so that if the underlying bdev is ever
recreated (after being deleted), the vbdev does not reappear.
Note that this change does not affect hot-remove operations.
Change-Id: I5cb4cf7efcb5e5eeaeef79849e0aaefec4684a4f
Signed-off-by: Wael Halbawi <waelhalbawi@gmail.com>
Reviewed-on: https://review.gerrithub.io/422474
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The lvol->unique_id will never change - so just point the
bdev->name directly to it. That eliminates the need to
free bdev->name when cleaning up the bdev, and removes
an extra strdup.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6547edb1fd4c248fba696c07326842b073d3716c
Reviewed-on: https://review.gerrithub.io/421408
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Ideally lvols would not point to bdevs (bdev/lvol
should be layered on top of lvol). This is a small
step towards making that happen.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ie5dff245a0be0dcfe9a8a0ce3e79677be5faa696
Reviewed-on: https://review.gerrithub.io/421407
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
The intents of these arrays was to keep track in the
bdev layer of all base<->virtual bdev relationships -
i.e. which member disk bdevs make up a RAID bdev,
which logical volume bdevs are associated with a
bdev that contains an lvolstore, etc.
Currently none of this is used however. And trying
to keep track in the bdev layer instead of asking
the bdev modules for the relationships has a number
of complications. Early one, we tried to do this
with TAILQs - but that doesn't work since this can't
be done with a single TAILQ_ENTRY in the bdev
structures. So we moved to arrays - that works a bit
better, but then the pointer arrays have to be
realloc'd which isn't ideal.
The biggest problem though with these arrays is that
they held bdev pointers - not bdev descriptor pointers.
It's not really valid to access bdevs without a
descriptor - the descriptors are what make sure active
references are accounted for when a bdev is hotplugged.
Of course the bdev layer knows when a bdev is getting
removed and could go and do the updates to these
arrays separately - but that just seems very convoluted.
So for now just remove these arrays completely. If
there is a future need for the bdev layer to
understand relationships between bdevs, we can add
module APIs so that the generic layer can ask
the modules about the relationships.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I99ef1068240bff1262f64f234260cf2fb44df51d
Reviewed-on: https://review.gerrithub.io/420932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
The base_bdev and vbdevs arrays are going away in the
common bdev layer in a future patch, so prepare for that
in the split RPCs to use the bdev_part interface instead
to get the names of the split bdevs created from the
base_bdev.
No change in observed functionality.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib31807258b253ded3c9bac4f936ed57aa0df7534
Reviewed-on: https://review.gerrithub.io/420931
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This ensures (for example) that a RAID volume is
unregistered before its member disks.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I7a7c16acc351f2d5d4218b64b370e2c77c6e2b5e
Reviewed-on: https://review.gerrithub.io/420812
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
During system shutdown, we want each module to release its
claim and close any open descriptors when one of its bdevs
is unregistered.
This prepares for a future patch which will unregister
bdevs at shutdown in reverse order - i.e the RAID bdev
will be unregistered before its member disks.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ib4c2b9d25463e420dbc4ec5d3d7030e3d6e8d981
Reviewed-on: https://review.gerrithub.io/420927
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
The GPT module isn't designed currently to support extended
block sizes like 520 or 4104, and we don't really have a good
way to put GPT down on one of these devices anyways.
So just check the block size when examining a new bdev, and
fail immediately if it's not an even multiple of 512.
Long-term, this should probably be a DEBUG print and not an
ERRLOG, but while we work through the bdev implications
of extended block sizes, keep it as an ERRLOG to facilitate
debugging.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1bce0e3b1f0edee2f4d002edf3aaa7e6c16d9677
Reviewed-on: https://review.gerrithub.io/422321
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Liang Yan <liang.z.yan@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
spdk_nvme_ns_get_sector_size returns the sector
size of the data only. This new function adds
in the metadata size, if any.
While here, modify the bdev/nvme driver to use this
function - this is needed for ongoing extended sector
size work through the bdev layer.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ic8070a7f8d29b0b2ac2a2d65e0df5f4736488351
Reviewed-on: https://review.gerrithub.io/422445
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When an SPDK application shuts down, the bdev layer will
automatically unregister all of the bdevs to ensure they
are properly quiesced and cleaned up.
Some modules may want to perform different operations when
a bdev is destructed during normal runtime vs. shutdown.
For example, for lvol, when the last lvol is cleaned up,
it should unload the lvolstore, release and close the bdev
that contains the lvolstore. You never want to do this
during normal runtime though - it is perfectly valid to
have an lvolstore that contains no lvols. RAID and future
bdev modules such as multipath have similar use cases.
So add a new bdev module callback named "fini_start".
If a module specifies a function pointer for this callback,
the bdev layer will call it before it starts the bdev
unregistrations.
This enables some future patches to the bdev layer such
that it will always unregister block devices that are not
claimed (i.e. logical volumes) before block devices that
are claimed (i.e. the bdev containing an lvolstore).
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I6e87f5c2b27f16731ea5def858f26e882a29495a
Reviewed-on: https://review.gerrithub.io/421175
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This enables easier testing of variable sector size
block lengths (512 + 8, 4096 + 8, etc.) - otherwise
you need an NVMe namespace specially formatted for
VSS.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: Ifafe57a2cf63edacb1e0f80a183841acf84fa1e4
Reviewed-on: https://review.gerrithub.io/422320
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
The lvol's destructor vbdev_lvol_unregister() will do this work
at the end.
Change-Id: I8231978f8336883df16ecc7ecff9a0919f698a1d
Signed-off-by: wuzhouhui <wuzhouhui@kingsoft.com>
Reviewed-on: https://review.gerrithub.io/422195
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Seth Howell <seth.howell5141@gmail.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This patch is used to add the epoll suport and make
the group polling more efficient.
Change-Id: I93416514a53dc45471e375b39aa051e6500de412
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/421432
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
This patch is used to change the per bdev polling into a group polling
for each spdk thread.
Change-Id: I6e99fb1512fe8ee778d4436a193fbc24022f4b84
Signed-off-by: Ziye Yang <optimistyzy@gmail.com>
Reviewed-on: https://review.gerrithub.io/421573
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
In the case of user specified configuration file with the
section [Malloc] only and no valid settings like NumberOfLuns,
LunSizeInMB and BlockSize, just log an error message and
keeps the initialization running.
Change-Id: I567ff9163fc24f89520dc55462c26d3588154e50
Signed-off-by: GangCao <gang.cao@intel.com>
Reviewed-on: https://review.gerrithub.io/421844
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If the controller has a serious error and set the Controller
Fatal Status field to 1, host driver does not know this error,
while here, when timeout happens, try to detect the CFS and
reset the controller to recover from such fatal status.
Change-Id: I9fa5b263b34edc52d0f359d874b2920f7570d1f3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/417622
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
When raid bdev goes to offline, manual removal of raid bdev may be
required. This patch tries to highlight paths to transit to offline
when configuring raid bdev failed.
Change-Id: I394984691fdf1d270d4e107e3131b677b42af49e
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/421205
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
"Timeout" option was replaced with "TimeoutUsec" in SPDK 18.07 release,
while here, remove the option for 18.10.
Change-Id: Id681152a5400f525efcbc9ddbdf558966baef2b3
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/421345
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Claim the block device when adding it to a new Namespace,
and prevent the block device to be added twice for other
modules and Namespaces. Also remove the test that using
same block device over different Namespaces.
Fix issue #371.
Change-Id: Ib7ce18e9fde4a15c0f19ce9e28e69145e54570e0
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/420472
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
This operation is called at one place but this refactoring will be
rewarded by considering subsequent patches and future enhancement.
Change-Id: Id8a16a111847162d5716c3c1eebd0c6eb698903f
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/421195
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
During IO submission, raid_bdev can be get by bdev_io->bdev->ctxt.
Hence holding raid_bdev in raid_bdev_io_channel is duplicated.
Change-Id: I722432718aca8c5846541816b6ecca56821d77f6
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/421182
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: GangCao <gang.cao@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Locating spdk_bdev structure to the beginning of raid_bdev structure
will simplify the hierarchy and match other bdev modules.
Change-Id: I1bfbf773bc96a4f144e6bff772ade05bb42762e9
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420818
Reviewed-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Currently, the bdev layer iterates over all of the
existing channels of a bdev to collect I/O
statistics. But this ignores statistics for
channels that are deleted.
Fix that by keeping an io_stat structure in the
bdev which accumulates statistics for deleted
channels. Use the bdev mutex to protect these
accumulations.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I3103c0b8b55973c827d977765f47e5b9e7f58e5f
Reviewed-on: https://review.gerrithub.io/421029
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
This will be used further in an additional patch.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: I1126e5adfc24f17e5cdf33b0d3e04c78470771c8
Reviewed-on: https://review.gerrithub.io/421027
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Fix the issue #379.
Change-Id: Ia953d4ebe232d389ba3264074592ae8d0cff467a
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/421078
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch fixes github issue #385.
We need to unregister previously created NVMe controller
when creation of NVMe bdevs fails otherwise we won't be
able to use NVMe device as it will stay in wrong state.
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: I28a165aba1a556a2ed2d4cd5ff6fd510e80657d3
Reviewed-on: https://review.gerrithub.io/421006
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
These were left out previously.
Change-Id: I4e97068d2f13ca1dd1cfae1b25564641cee794ef
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/419601
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Two extra fields are added to the iostat rpc.
1. io_time. The amount of time since queue depth tracking was
enabled that has been spent on I/O processing.
2. weighted_io_time. Incremented each time this bdev's queue depth is
polled by the amount of time spent processing I/O since the last polling
event times the measured queue depth.
Change-Id: Ie70489ec24dee83f3eeac8f4f813ec7074ff458f
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/419031
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Change-Id: If90434452dba51fbb72c8c486e3c1e4691549c34
Reviewed-on: https://review.gerrithub.io/420845
Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Even after the bdev is removed, the underlying virtio
device may be still alive for some time. The user may
then try to recreate the virtio bdev and could technically
initialize a second simultaneous connection to a vhost
target. It does not cause any technical problems, but the
target may reject such connection if it reached a connection
cap or doesn't support more than one connection per target
(like SPDK vhost does as of today).
Since the fix is straightforward, here it is.
Note:
Virtio SCSI is already safe, because it uses a separate
completion callback to indicate virtio device destruction.
Change-Id: I2989780ef9b13c19d0432224ff4602a14be48315
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/420576
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Daisuke Aoyama originally contributed to istgt, the
iSCSI target in FreeBSD. The SPDK iSCSI code is
originally derived from that. Due to copy and paste,
some incorrect copyright attributions have been added
to other files that do not derive from istgt, so
this patch removes those.
It is doubtful, at this point, that there is any code
whatsoever that remains from the original istgt, but
we can revisit that at a later time.
Change-Id: I207e1e6b99d271e2f12690be90a96f7d0c113af7
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/420679
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Use strdup for all strings allocated by JSON decoder and free them on
all paths.
Change-Id: Ia30d32c2fd6bd79f62588ca19bd3bb093d38a502
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-on: https://review.gerrithub.io/420325
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Kunal Sablok <kunal.sablok@intel.com>
Also, don't try to remove any bdevs. The bdev layer
does that before module fini is even started.
Change-Id: I869f4b9e6d8590bfae2504a96d971c4f8866aeb9
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-on: https://review.gerrithub.io/420577
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Chandler-Test-Pool: SPDK Automated Test System <sys_sgsw@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>