numam-spdk

Author	SHA1	Message	Date
Tomasz Zawadzki	5dc88c6ccb	lib/blob: _spdk_bs_load_replay_md_parse_page() now takes only load ctx _spdk_bs_load_replay_md_parse_page() is only used in replay path during blobstore load. Next patch will expand the load ctx with array of extent pages to be read. It is filled out when reading in-chain metadata of extent table descriptors. Passing the load ctx here will make it simpler to fill out the array when processing extent table. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: If96e6670560c8c4a3610f33ece14c354d7d5da39 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482412 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	b5e993483f	lib/blob: read extents during blob load When an EXTENT_TABLE descriptor is found while parsing metadata, there can be extent pages to read. If an extent page was not allocated, the number of clusters can be increased depending on num_clusters_in_et. An unallocated extent page contains either SPDK_EXTENTS_PER_EP or the remainder of num_clusters_in_et worth of clusters, whichever is less. Added decreasing of num_clusters_in_et to parsing extent pages as well. While here, remove ctx->seq = seq assignment as that is done at beginning of blob load. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I57f54634b908ffb406f3e91e15841b7f36fd6de6 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476429 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	d1f863ca57	lib/blob: write out extent pages before persisting metadata Add new serialization of changed extent pages before persisting md. Iterate over active extent pages (not array!). When they are allocated but not yet present on disk - write them out. All extent pages in clean mutable data are assumed to be written out already. So there are two cases here: 1) Active mutable array is larger than clean All allocated extent pages should be written out. 2) Cluster allocation created new extent page Blob has to be thin provisioned and persist was called as part of cluster allocation. New extent page needs to be written out and EXTENT_TABLE allocated. Iteration is done over num_extent_pages instead of extent_pages_array_size, to prevent writing out too many extent pages when size of blob was made smaller. The two values come back in sync at the end of persist either way. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I780819fd7f3c44e4cf5d71c188c642536d3cc320 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479851 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	2bccb7c9b4	lib/blob: use use_extent_table instead of NULL from extent_page Right now output from _spdk_bs_cluster_to_extent_page() is used to determine whether the extent table is used at all. If NULL pointer was returned this meant that extent table was not allocated, even if the code might suggest just checking if we overran the array. To make it more obvious, the _spdk_bs_cluster_to_extent_page() now only asserts the extent_table_id. blob->use_extent_table is now always used to determine the serialization path. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I9d2630645213539bae5cd1d72e5f9b878f53c2bc Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482599 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	95b478cc70	lib/blob: update single EXTENT_PAGE in place This patch adds single EXTENT_PAGE updates on cluster allocations. There are three possible outcomes after inserting a cluster: 1) blob uses EXTENT_RLE Proceed to usual sync_md. 2) blob uses EXTENT_TABLE and extent page was not yet written out Update the active mutable data to contain the claimed md page, write out the EXTENT_PAGE and sync_md to update EXTENT_TABLE. 3) blob uses EXTENT_TABLE and extent page was previously written out Only serialize that single EXTENT_PAGE and write out the updated cluster map for it. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ia057b074ad1466c0e1eb9c186d09d6e944d93d03 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470015 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	e1ce55158a	lib/blob: require SPDK_EXTENTS_PER_EP to be power of 2 Force the number of Extents that fit into an Extent Page to be a power of 2, in order to simplify calculations on cluster allocations. At this time SPDK_BS_PAGE_SIZE is 4k, which results in SPDK_EXTENTS_PER_EP being 512. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I7e09d92b00dfe5c12d7dd10ac0fc5a9a10d526ac Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/472041 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	f4e58993f7	lib/blob: add EXTENT descriptor to blobs Similar to EXTENT_RLE, this descriptor holds LBA of clusters. Difference is that EXTENT is kept in separate md pages, and only single EXTENT will be updated on cluster allocation. This patch adds the EXTENT processing, which is not used until following patch. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ifbac23db7ca3e7c8c91cee01018f20071f0d5160 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/470014 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	0dfe80c82a	lib/blob: claim and insert extent pages Added claiming the extent page, which is then followed by updates of mutable data on md thread. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: If511564f812685381c48924310105a4cb6f63cd1 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479850 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	cb44fa06f9	lib/blob: add _spdk_bs_claim/release_md_page() Functions to claim and release md pages were added. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I1c8ddc13c8a5806fb874e5c34dae2a327e1ff248 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482011 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	1b23560fcd	lib/blob: add _spdk_bs_cluster_to_extent_page() for easy conversion Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I3e49c398d9bdf9f4eacba65061cc7fe4b300fb56 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479963 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	59f7f3f736	lib/blob: change extent pages array size on blob resize With this patch the extent pages array will change its size according to the size of the blob. Similar to clusters, only resizing up is done on blob resize. Shrinking is done on persisting the blob. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Id7f7c81efbd96af414fce9fc4045cbb476cc93a6 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479962 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	eebbd951cf	lib/blob: pass Extent Page offset on cluster allocation Extent Pages claim and insertion can be asynchronous when cluster allocation happens due to writing to a new cluster. In such case the lowest free cluster and lowest free md page are claimed, and a message is passed to md_thread, where inserting both into the arrays and md_sync happens. This patch adds parameters to pass the Extent Page offset in such case. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I46d8ace9cd5abc0bfe48174c2f2ec218145b9c75 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479849 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	f60b4a7e28	lib/blob: add EXTENT_TABLE descriptor to blobs Added new descriptor SPDK_MD_DESCRIPTOR_TYPE_EXTENT_TABLE. Extent Table will hold md page offsets for new Extent Page descriptor. Entries in Extent Table are run-length encoded 0's as unallocated Extent Page descriptors. Additionally total number of clusters is persisted in each Extent Table descriptor. This is because there is no guarantee that last Extent Page of a blob will be allocated. Even if number of Extents per Extent Page is always the same, Extent Page can hold less Extents than that. This patch does not add more metadata on disk right now. Only added descriptor parsing/serialization and applicable fields to store it in run time. Following patches are going to implement TODO's added in this patch. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Iac5d8f00ddfc655c507bc26d69d7adf8495074e9 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/466920 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	2f8bdb3c82	lib/blob: remove _spdk_blob_serialize_extent_rle() goto Lets get it removed ! :) Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I91b994a883a642d87ecc8c152c801b8a7676f33a Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482010 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	3dadb79e37	lib/blob: add EXTENT_RLE descriptor description Since further patches will be adding new descriptors that are related to cluster layout throughout the blobstore, add description for existing descriptor too. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I722eb633445685789d5185ed59dfc910f76b109f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481724 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-01-27 18:06:43 +00:00
Tomasz Zawadzki	c33840b7e6	lib/blob: add option to enable extent pages This is an additional option that can be passed when creating a blob. When opts->enable_extent_pages is set to false (current default), only EXTENT_RLE should be persisted on sync. During blob load, when EXTENT_RLE is present in md, blob->extent_rle_found is set to true. When opts->enable_extent_pages is set to true, only EXTENT_TABLE and EXTENT_PAGES should be persisted on sync. During blob load, when EXTENT_TABLE is present in md, blob->extent_table_found is set to true. It is possible to find neither EXTENT_* descriptor when loading a blob. This means that blob length is 0 and EXTENT_RLE was supposed to be used. Yet none were persisted due to lack of clusters. In such case blob->use_extent_table is set to true after finishing blob load. When parsing metadata ends, if extent_table_found is set - then support for extent_table is enabled. All other cases disable it. At this time path for Extent Pages is not implemented, so it should not be used. Later in the series, it will become the default path for serialization. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I2146da6130a0645e686ab02a3b5d2d86a7d35a1f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479853 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 18:06:43 +00:00
Ben Walker	7ef33c86b8	sock/posix: Zero copy send If available, automatically use MSG_ZEROCOPY when sending on sockets. Storage workloads contain sufficient data transfer sizes that this is always a performance improvement, regardless of workload. Change-Id: I14429d78c22ad3bc036aec13c9fce6453e899c92 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/471752 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com>	2020-01-27 17:42:24 +00:00
Ben Walker	a02207d778	test: Make nvmf target filesystem test more robust Change-Id: Id35254c1cdc4c8fa938e0322d5455bdab825efa8 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482004 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:42:24 +00:00
Ben Walker	f84c916c41	nvmf/tcp: Correctly kick the recv state machine when a request is freed When a command arrives and no requests are available, the socket recv state machine sits in the RECV_STATE_AWAIT_REQ state until another network event occurs. If this I/O was the last one sent, this leaves the target hung. To fix this, when a request is completed, kick the state machine to make forward progress. In practice, this can only occur once the pdu send acknowledgements are asynchronous relative to arriving commands. That only begins happening with the use of MSG_ZEROCOPY. When MSG_ZEROCOPY is turned on, it's possible to receive the next PDU in a chain for a command prior to seeing the acknowledgement that the response that triggered that PDU was actually sent. Change-Id: I556f31ad56970d36aa3538cfde375d35f3d4e551 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/480002 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:42:24 +00:00
Ben Walker	48a547fd82	nvmf/tcp: Wait for R2T send ack before processing H2C Previously, the R2T was sent and if an H2C arrived prior to seeing the R2T ack, it was processed anyway. Serialize this process. In practice, if the H2C arrives with a correctly functioning initiator, that means the R2T already made it to the initiator. But because the PDU hasn't been released yet, immediately processing the PDU requires an extra PDU associated with the request. Basically, making this change halves the worst-case number of PDUs required per connection. In the current sock layer implementations, it's not actually possible for the R2T send ack to occur after that H2C arrives. But with the upcoming addition of MSG_ZEROCOPY and other sock implementations, it's best to fix this now. Change-Id: Ifefaf48fcf2ff1dcc75e1686bbb9229b7ae3c219 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479906 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:42:24 +00:00
Ben Walker	033ef363a9	nvmf/tcp: Inline spdk_nvmf_tcp_pdu_set_buf_from_req This function was only called from one spot. Change-Id: I856f564d3ef6c6157be7a32a2cd812c702516a8d Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482003 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:42:24 +00:00
Ben Walker	fdfb7908b5	nvmf/tcp: Rename next_expected_r2t_offset to h2c_offset This seems like a more descriptive name Change-Id: Ia616865b3fb36d8f9ccc5fb2ca6185bdd8543cf8 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482002 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>	2020-01-27 17:42:24 +00:00
Ben Walker	a2adca79d9	nvmf/tcp: Set up math to always use 1 R2T per nvme command With our target design, there's no advantage to sending multiple R2T PDUs per nvme command. This patch starts by setting up the math so that at most 1 R2T PDU is required per request. This can be guaranteed because the maximum data transfer size (MDTS) is pre-negotiated in NVMe-oF to a reasonable size at start up. It then proceeds to simplify all of the logic around mapping requests to PDUs. It turns out that the mapping is now always 1:1. There are two additional cases where there is no request object at all but a PDU is still needed - the connection response and termination request. Put an extra PDU on the queue object for that purpose. This is a major simplification. Change-Id: I8d41f9bf95e70c354ece8fb786793624bec757ea Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479905 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>	2020-01-27 17:42:24 +00:00
Ben Walker	399529aaa1	nvmf/tcp: Set max h2c size equal to max I/O size We can always accept up to the maximum I/O size in an H2C, so eliminate the #define. Change-Id: I349dab5f9b6ec482a7c580b1396e03c8d30a250b Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482278 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:42:24 +00:00
Ben Walker	4dba507224	nvmf/tcp: Simplify qpair resource initialization The resources allocated to a queue pair do not need to be directly correlated to the queue size requested by the initiator in NVMe-oF, as long as enough resources are present. The RDMA transport, for instance, does complex pooling of the resources behind the scenes when using a shared receive queue. Simplify the resource allocation for a TCP qpair to just always allocate the max allowed queue size right away. This is a configurable parameter, so system administrators can adjust for their needs. The initiator may then request a queue size less than or equal to that, which will only be enforced by queue depth counting and not impact the actual number of resources allocated on the target. This change relies on the MaxC2HSize being equal to the Maximum Data Transfer Size (MDTS) reported. That is the default configuration, but MDTS is configurable. Changing the MDTS with this patch to a value larger than 128k will cause the target to break. This is addressed in the next patch in this series. Change-Id: Ibd4723785c6a4d8d444f9b7bbfa89f98de2320f5 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479733 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com>	2020-01-27 17:42:24 +00:00
Ben Walker	444cf90c72	nvmf/tcp: Change qpair's state_cntr array to uint32_t These values do not need to be negative. Change-Id: Id9f798cf1c9da354448f9c6fbb90e599f877bb32 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482277 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:42:24 +00:00
Ben Walker	5a7b33ec67	nvmf/tcp: In _pdu_write_done, free pdu before calling user callback By releasing the just-completed PDU prior to calling the callback, for flows that immediately submit another PDU inside the callback, the just-released PDU can be immediately reused. This reduces the number of PDUs required in the pool to continue forward progress to half of the previous value, while also making it more CPU cache friendly. Change-Id: I8031b8f9f57ac05f261d96433d9899fe5e31d318 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479904 Community-CI: SPDK CI Jenkins <sys_sgci@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:42:24 +00:00
Maciej Szwed	931ac757fb	test/bdevio: Add compare and write test This patch adds test for compare-and-write fused command. First it runs successful call of compare and write command and then it runs again and the second run is expected to fail. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Iaf1151414c4dea16487e8c33630cbffb0c09ae3b Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482606 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:39:52 +00:00
Jim Harris	dc3717296e	bdev: handle unlock v. lock race When we unlock a range, we remove the range from the locked bdev list before doing the for_each_channel iteration to remove the range from each channel. But at the same time, right after removing from the locked list, a new lock on that range could start. In that case, we also do a for_each_channel to add the range to each channel, and that will race with the for_each_channel remove. When the lock start wins, it finds the range already in the channel, but doesn't set the owner_range which results in a seg fault when the for_each_channel completes. The fix is actually rather simple. We just add the locked_ctx to the comparison when checking if the range is already in the channel. If the locked_ctx matches, then we know it was added as part of initializing a new channel. If it doesn't, then we create a new range object pointing to the new locked_ctx. The first one will get removed when the remove for_each_channel catches up. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I94f8b20376dd437f404add35744d42fc148303ff Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482620 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-01-27 17:39:52 +00:00
Jim Harris	da11a46466	bdev: start lock process on original channel If a locking operation has to wait because of an existing lock, we queue the lock context. When the existing lock finishes unlocking, we restart the queued lock context. But we have to make sure we restart the lock context on the same thread it was originally submitted, since it has a channel associated with it. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I555515f3adfc3c13a86584c601ed541d605980b7 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/482463 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:39:52 +00:00
Tomasz Kulasek	327668c8c9	test/common: make gdb_attach global Change-Id: Ib73bd9513681360f22251b54a2d15ca9807b72c2 Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/478807 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Tomasz Kulasek	b27f6f9d80	unit/nvmf: spdk_nvmf_bdev_ctrlr_compare_and_write_cmd Change-Id: I2c82cfceed9dac466501da8cfe485f916f202b01 Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481407 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Tomasz Kulasek	98bbb72dc2	unit/nvmf: fused compare and write Change-Id: Iad60e3d53a9f6b0a6f71b95d66b57d6852d90cb2 Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481406 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	34bac0bad6	bdev/nvme: Fix compare and write command completion Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I16f6842703eead32318d2aca53cbf1e2b5b15bce Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481976 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	058ec60eab	bdev/nvme: Fix: bdev_nvme_comparev_and_writev using only compare iovs bdev_nvme_comparev_and_writev function takes as an argument only iovs for compare operation and uses them for write operation. It should also take iovs for write operation. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I5be2610c3d8552559aa4db969d5acb78b1620079 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481806 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	ca07f62b5b	doc/nvme: describe fused operations support Change-Id: Ifcc2bf59051d5cee261cfc4898e267ed7608c1a8 Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477789 Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	a83644fe2b	bdev: Lock LBA range for fused command execution Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I577f961484b2ebf350f4f795eda1a018c5f0fd7a Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481710 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	8a3042b714	bdev/nvme: Fix comments in nvme_bdev_io structure Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I04b310a9f50c1728b9cd260517591c5e9108cc95 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481673 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Tomasz Kulasek	9a80e954f7	lib/nvmf: report support for fused compare and write Change-Id: Ib073719a59972240a68b1a4ad4951820c7ea5323 Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476136 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	ff8a425182	nvmf: Return ACWU and NACWU values in identify structures For ACWU we always set value 1 because bdev holds information specific for namespace only. This value actually does not matter because we also set NACWU which makes ACWU irrelevant. We set ACWU because the NVMe spec requires ACWU != 0 if fused commands are supported. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Ida4357026d3b32677fc824b3cd878e7ad8ef2680 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477915 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	c13733915b	bdev: Add spdk_bdev_get_acwu function This function is required for NVMf implementation for compare and write fused command. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: If41611f5c0b8e4ed8eec66f09858c724f1800d59 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477914 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	71beb568d6	nvmf: Add call support for compare and write cmd in spdk_nvmf_ctrlr_process_io_cmd Add call for spdk_nvmf_bdev_ctrlr_compare_and_write_cmd function in spdk_nvmf_ctrlr_process_io_cmd function when fused command is discovered. This patch also removes redundant defines for fused flags. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I61971a56577ab32b52e1fde1e572f718a9a2d9aa Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476621 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	87be077d0b	nvmf: Add spdk_nvmf_ctrlr_process_io_fused_cmd Move fused cmd related code from spdk_nvmf_ctrlr_process_io_cmd to separate function. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Ic662a968b054f05db7f6e1cf4fa9aa13f6fb7c40 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481942 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	941d9e7aa8	nvmf: Add support for compare op command This patch introduces new spdk_nvmf_bdev_ctrlr_compare_cmd function which implements support for compare operation. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Iadf402a6441a78ea0e6468f1066c6b0e10e63b9b Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/477782 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	05e7f56c3a	nvmf: Add spdk_nvmf_bdev_ctrlr_compare_and_write_cmd function This patch introduces new function that is a part of upcoming support for fused commands. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I019c587bee7fd0f745ec17c141baf4cb7bf86645 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476611 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-01-27 17:39:52 +00:00
Tomasz Kulasek	67c9c1c5d8	lib/nvmf: add fused operations Change-Id: If3162a5683d1c57011f9a66cbcfe47ba161734bf Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/476138 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	adf90938b1	bdev: Add spdk_bdev_io_get_nvme_fused_status function Added new function for getting NVMe specific return code for fused commands. Also changed one of the return codes in fused commands so that we could distinguish error cases. Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: I86417ea4f5b8f3e6496162be3d6c6128076e35d4 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481666 Community-CI: Broadcom SPDK FC-NVMe CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-01-27 17:39:52 +00:00
Maciej Szwed	d417768c00	nvme: Fix define name Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Ice5b7ca3ed3fef93636db064d7f5fcafa9cf2d3b Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/481771 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-01-27 17:39:52 +00:00
paul luse	87dcedb817	docs: add section for containers Starting with how to containerize an SPDK app for Docker by example. Signed-off-by: paul luse <paul.e.luse@intel.com> Change-Id: Ice2cbf0ab54f2c541e38508a133a7ef0f23dd40e Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/479900 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-24 20:16:58 +00:00
Karol Latecki	6df2fa8c2e	test/vhost_perf: refactor test scripts to use disk map Use provided configuration file via --disk-map option instead of creating bdevs and VMs in ordered sequence (e.g. 0 to 10, etc.). This allows to: - specify which NVMe device we want to use (PCI BDF identifier) - how to name it in SPDK nvme bdev configuration - how many splits or lvol bdevs to create on this device - which VMs should use created bdevs With CPU mask configuration file this allows to better control resources when running the test (especially in case of NUMA optimization where using sequential for/while loops is not a good approach). vm_count and max_disks parameters removed. These are not needed anymore as they're controlled by config file. Example of config file contents: (BDF,Spdk NvmeBdev name,Split count,VM list) 0000:1b:00.0,Nvme1,2,2 3 0000:89:00.0,Nvme3,4,4 5 6 7 Change-Id: I9fc73458825d8072537aa04880765a048e034ce4 Signed-off-by: Karol Latecki <karol.latecki@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/464565 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-01-24 15:13:00 +00:00

11086 Commits