This patch allows to send IO requests from external module to the nvme
device.
External module should call nvme_io_msg_ctrlr_start() to start IO message
producer on the controller and enable sending messages.
nvme_io_msg_send() is used to send IO to NVMe driver thread context,
where passed function will be called. Allowing the external module to
issue IO as needed.
NVMe driver users should poll spdk_nvme_io_msg_process() to move forward,
sending IO from external module and process their completions.
Change-Id: Ie59abac69870c4e4daa50120c747f3b620395921
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>
Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471386
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This change allows setting of the NVMe completion queue
CDW0 in spdk_bdev_io_complete_nvme_status.
Before that change, handling of vendor specific NVMe IO
commands was limited since there wasn't a way to return
command specific info back to the initiator.
Change-Id: I250d5df3bd1e62ddb89a011503d42bd4c8390f9b
Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470678
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This can be useful when trying to perform multipath failover at the
application level. However, the controller must be in the failed state
before calling this function.
Change-Id: I5403c0036fed5dd3600ee20592925297494ba8aa
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470699
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
While it is unlikely that a single qpair will be failed, it is important
to make it possible to reconnect a single qpair.
This function is also handy at the application layer when going through
a reconnect workflow. If we get -ENXIO from a qpair when we poll, we
will turn around and call this function. If we get -ENXIO from this
function, then we know the whole controller is failed and we need to do
a reset.
Change-Id: I6a8ea0ce27fce2f5fc0a5b3db05834acd68e6a39
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471417
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
When splitting bdev's, if we run out of child IOVs, we have to
make sure that we end block aligned. That may require us to
shorten or eliminate one or more child IOVs to be picked up
on the next split. If we eliminate enough such that there
are no IOVs for this split IO, just continue and the next
split completion will kick off another split to pick up
the remaining data to be transferred.
Fixes issue #981
Signed-off-by: paul luse <paul.e.luse@intel.com>
Change-Id: If2bfbe5f0862295e1d74cbea00692890a2178967
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471313
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
These will form the base of a little state machine for managing the nvme
qpair structure.
Change-Id: If6f6df38cc17221ac8fcb7d8c0d7e2e808897a99
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470534
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The driver has historically waited until we have to do a listen
before enabling the admin qpair. That is a very PCIe-centric mindset.
For fabric controllers, a lot of the early initialization operations such
as get_cc and set_cc are handled through the admin qpair so it should be
enabled before we begin the initialization process.
As a side effect of this cahnge, the internal API
nvme_ctrlr_enable_admin_qpair has been removed. It would have turned
into a one-liner.
Change-Id: Icd162657d01a85c227a3f20c295d0208e07ce44d
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471743
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Extract PDU header handling from PDU payload handling for all PDU
types, and then group them into a new function iscsi_pdu_hdr_handle().
Then the original iscsi_execute() is renamed to iscsi_pdu_payload_handle().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I1fb1937cfaf502797f2c4edb3aeeb97d4697c7d4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471015
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Previously, for iSCSI target, freeing bdev_io of SCSI task was deferred
until the reference count of the SCSI task becomes zero.
But this will cause the use-after-free issue when doing LUN hotplug during
large write I/O workload.
The scenario is the following:
- Large iSCSI write I/O is split into multiple I/Os, the first I/O is from immediate,
and subsetquent I/Os are from R2T.
1. The first I/O allocates iSCSI task as primary, and is submitted to the bdev layer.
The first I/O is pending in the bdev layer.
2. The second I/O allocates iSCSI task as secondary (secondary is associated with
primary by incrementing reference count).
3. Before submitting the second I/O to the bdev layer, LUN hotplug is started.
LUN hotplug waits for getting completion of the first write I/O from the bdev layer.
4. The bdev layer completes the first I/O. The primary iSCSI task is tried to free,
but reference count is still one, and is not done yet.
5. LUN hotplug detects completion of the first write I/O, and returns
LUN I/O channel to the bdev layer.
6. The second I/O is tried to submit to the bdev layer, but LUN is already removed,
and so free the secondary iSCSI task.
7. Then the reference count of the primary iSCSI task becomes zero,
and its bdev_io is freed. However, LUN I/O channel is already freed and freeing
bdev_io fails.
This issue is caused by separating iSCSI task allocation and submission.
For write I/O, we don't have to keep bdev_io after getting completion
of it from the bdev layer.
This applies to other non-read I/O types.
So for non-read I/O, free bdev_io after getting SCSI status in
bdev_scsi_task_complete_cmd(), and for read I/O, set bdev_io to
task as same as before.
The next patch will do the same for management task.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I530fb491514880ce41858e1bea55d422d606dfc4
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471695
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Refine the public helper function spdk_scsi_dev_has_pending_tasks to
be able to check tasks only from the specific initiator.
Then use the function in iSCSI target to fix the issue.
Besides add UT code to test the updated spdk_scsi_dev_has_pending_tasks().
Automated multi hosts test is much better but some UT code will be of any
help to mitigate the risk of degradation.
Fixes#985
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I50afb940de7174360c8a30479450850002a3e525
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471337
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Refine helper functions spdk_scsi_lun_has_pending_mgmt_tasks() and
spdk_scsi_lun_has_pending_tasks() to be able to check tasks only from the
specific initiator.
SCSI port is used by passing the pointer and so simple pointer
comparison is appropriate in the functions.
Add UT code to test the updated functions.
The next patch will change spdk_scsi_dev_has_pending_tasks() to
get not only SCSI device but also initiator port and make iSCSI
target use the function to fix the issue.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I89c33e05bc6ab21baa6cbebf60950039a3dcecd0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471336
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
It can be useful for passing additional information about nvmf
target to a handler for new nvmf connections. Context can be
stored in globals as it is currently done in nvmf code. However
in case of multiple targets or languages where accessing global
state is challenging (i.e. Rust), this becomes inconvenient.
Change-Id: Ia6a2fdba4601531822b3e5fda7ac5ab89d46f6c5
Signed-off-by: Jan Kryl <jan.kryl@mayadata.io>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469263
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Sasha Kotchubievsky <sashakot@mellanox.com>
To make the code clearer,
- make the spdk_iscsi_read_pdu() private in iscsi.c and named iscsi_read_pdu(), and
- make the iscsi_conn_handle_incoming_pdus() public and named
spdk_iscsi_handle_incoming_pdus().
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I967681b8e9b86681a906b18719e91e1d387450d7
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469969
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
This is useful for detecting sockets that have been disconnected
by the other end without reading data.
Change-Id: Ieb6529984d282d48373766d9f5555cf11720f19b
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470513
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
FUSE operations in blobfs_fuse.c are extracted from
test/blobfs/fuse/fuse.c to blobfs_fuse.c in module blobfs_bdev.
And it is extended to create one new thread dedicatedly for one
FUSE mountpoint to handle FUSE requests by blobfs sync API.
spdk_blobfs_bdev_mount is implemented as the export API. So
related code can be utilized by other modules/apps.
Now test/blobfs/fuse/fuse.c is much simplified with function
spdk_blobfs_bdev_mount.
Change-Id: Iefa16977fabbae2008c8f65fe1b69d650b6fd18d
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469347
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Move spdk_iscsi_execute() from iscsi_conn_handle_incoming_pdus()
to spdk_iscsi_read_pdu() and then strip the prefix spdk_ from
spdk_iscsi_execute() and make it private.
This is to introduce state machine into receive incoming PDU processing.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I5d5b3e55ece0994532e924d3c75d898cb373875c
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470287
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
In-capsule data transfer can only be supported by NVME drives with SGL memory layout
Add test to examine new behaviour
Change-Id: Iaef6564c8e5c96c1c5af16ab41d6e3827f6a82b6
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470469
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
When we ran out of child_iov space, ensure the iovs to be aligned
with block size. However the calculation was wrong.
(to_next_boundary_bytes % blocklen) meant not to_last_block but to_next_block.
So calculate to_last_block_size by reducing to_last_block_size from blocklen.
The data was collected when the issue occured. So add unit test
by using the data.
Fixes#979
Reported-by: Geoffrey McRae <geoff@hostfission.com>
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I62a50bada450288ea7c60aec0e557c2a53cd8916
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470806
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This function has not been tested, add test in nvme_ctrlr_cmd_ut.c
to increase code coverage rate.
Change-Id: Ibe5c98924d9e2f3239d782e7a32b828caaddb04a
Signed-off-by: yidong0635 <dongx.yi@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470718
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Following test/blobfs/mkfs case, add it as one RPC
method to let build a new blobfs on given block
device.
Change-Id: I0ffbb1add95dfbc8655e0238ed6f3cd519dd945b
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466485
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Add module blobfs_bdev as a general module to simplify
the operations of blobfs on bdev. Then its functions
can be utilized by other libraries or apps.
blobfs_detect can be used to detect whether there is
one blobfs on given bdev.
Change-Id: Ib425574816061dc945fb652b539f791a44097a43
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466486
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Request logout to the initiator asynchronously when exiting the
running connection from the target.
Based on this patch, the next patch will make flushing PDUs possible
only when the connection is RUNNING (or INVALID) and remove the
tight loop in iscsi_conn_flush_pdus().
Set timeout of logout request as 30 seconds. 30 seconds is as same as
NOP timeout and will be reasonable for now.
Add and use logout_request_timer to check the timeout.
When the connection gets internal a logout request, move the connection
to EXITING state if the connection is in INVALID state, request logout
to initiator and start 30 seconds timer if the connection is in RUNNING
state and logout is not requested yet, or do nothing otherwise.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I43192be9fd7112ad444152c0dd88f99a14aa8d30
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470705
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
If we are to have multiple reconnect attempts, we have to control
whetehr the controller is placed in the failed state from outside the
reset function itself. This will allow us to fail the controller only
after all of our retries are exhausted.
Change-Id: Ia82e10325272f25b2b8527336dc3bc507c93b401
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469932
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Use the standard API function to fail the controller in all cases.
This patch, and the several following patches are aimed at creating a
mechanism for reporting up to the application layer that a controller is
failed and or removed. To do this, I use the reset_cb to inform the
upper layer that the controller is failed.
This also requires changes to how we handle a controller reset to
pave the way for doing optional reset retries in the libraries.
Change-Id: I06dfce08326c23472a1caa8f6efbac2fd1a720f2
Signed-off-by: Seth Howell <seth.howell@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469635
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Next commits in the series will need these callbacks in other tests
that are closer to the beginning of file.
Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I55e346f273f169c7894f662dcfa1f57a02024577
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469708
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Have each sock implementation free the group_impl itself.
This allows C++ based sock implementations like Seastar
to release the group_impl memory using delete rather
than free.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Change-Id: If40a91e8bc93a531701fc30d847ab28fa11858ab
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469618
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
We can merge two loops of req->buffers and req->iov into a single
loop and merge two variables, req->num_buffers and req->iovcnt into
a single variable. For the latter, use req->iovcnt because it is
also used for in-capsule data.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Ia164f2054b98bbcb00308791774e3ffa4fc70baf
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469489
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This follows the good practice of FC transport.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Change-Id: I84a6bb28a27b529335f100c8cab12d642bc156ff
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469488
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is the end of the effort to unify buffer allocation
among NVMe-oF transports.
This patch aggregates multiple calls of spdk_nvmf_request_get_buffers()
into a single spdk_nvmf_request_get_buffers_multi().
As a side effect, we can move zeroing req->iovcnt into
spdk_nvmf_request_get_buffers() and spdk_nvmf_request_get_buffers_multi()
and do it in this patch.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I728bd330a1f533019957d58e06831a79fc17e382
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469206
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
This patch is close to the end of the effort to unify buffer allocation
among NVMe-oF transports.
Merge each transport's fill_buffers() into common
spdk_nvmf_request_get_buffers() of the generic NVMe-oF transport.
One noticeable change is to set req->data_from_pool to true not in
each specific transport but in the generic transport.
The next patch will add spdk_nvmf_request_get_multi_buffers() for
multi SGL case of RDMA transport.
This relatively long patch series is a preparation to support
zcopy APIs in NVMe-oF target.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: Icb04e3a1fa4f5a360b1b26d2ab7c67606ca7c9a0
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469205
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Small cleanup in prep for future RAID levels and to improve
readability.
Change-Id: I66ae64320593ee5b242ccdc50a0041a4ec8207fb
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/469742
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I9fb7a998f7c13ce53cba630a895e8e11cf5f4a1c
Signed-off-by: Benjamin Saunders <bsaunders@google.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/467559
Reviewed-by: Paul Luse <paul.e.luse@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
The subsequent patches unifies getting buffers, filling iovecs, and
filling WRs in a single API. This is a preparation.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I077c4ea8957dcb3c7e4f4181f18b04b343e9927d
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468953
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
This patch makes multi SGL case possible to call spdk_nvmf_request_get_buffers()
per WR.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: I977ebb0c6b2a67218c9b6fc20dc26a93a6ec770b
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468943
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
This patch makes multi SGL case possible to call spdk_nvmf_request_get_buffers()
per WR.
This patch has an unrelated fix to clear req->iovcnt in
reset_nvmf_rdma_request() in UT. We can do the fix in a separate patch
but include it in this patch because it is very small.
Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Change-Id: If6e5af0505fb199c95ef5d0522b579242a7cef29
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/468942
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Update transaction length wrt to medata size
Change buffers handling in the case of enabled DIF - add function nvmf_rdma_fill_buffer_with_md_interleave to split SGL into several parts with metadata blocks between them in order to perform RDMA operation with appropriate offsets
Add DIF generation before executing bdev IO operation
Add parsing of DifInsertOrStrip config parameter.
Since there is a limitation on the number of entries in SG list (16), the current approach has a limitation on the max transaction size which depends on the data block size. E.g. if data block size is 512 bytes then the maximum transaction size will be 512 * 16 = 8192 bytes.
In adiition, the size of IO buffer (IOUnitSize conf param) must be aligned to metadata size for better perfromance since metadata is treated as part of this buffer. E.g. if the initiator uses transaction size = 4096, data block size on nvme disk is 512 then IO buffer size should be aligned to (512 + 8) which is 4160. In other case an extra IO buffer will be consumed which will increase the number of entries in SGL and in iov.
Change-Id: I7ad2270fe9dcceb114ece34675eac44e5783a0d5
Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com>
Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com>
Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/465248
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Ben Walker <benjamin.walker@intel.com>
Reviewed-by: Seth Howell <seth.howell@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>
Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>
Test all 3 IOV positions but only on src bufs as the dst buf
code under test is identifal and the amount of test code is
rather large.
Change-Id: Idbb635149a5737df9d508adc9dba69e84ec024d3
Signed-off-by: paul luse <paul.e.luse@intel.com>
Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466157
Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>
Reviewed-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>