numam-spdk

Author	SHA1	Message	Date
Changpeng Liu	1edc5f0040	nvmf: restore the loaded reservation information to NS Load reservation information based on ptpl configuration file, and restore the information to NS data structure. Change-Id: I5f46d49a6d1e6e49aab93ca7cd654469a3a08659 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455912 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-07-08 08:21:03 +00:00
Shuhei Matsumoto	8448adaefa	nvmf/tcp: Verify DIF before sending C2H data in spdk_nvmf_tcp_send_c2h_data If DIF mode is local and C2H data is extended LBA payload, DIF should be verified just before sending the payload. Add a helper function nvmf_tcp_pdu_verify_dif and call it in spdk_nvmf_tcp_send_c2h_data after completing nvme_tcp_pdu_set_data_buf. When nvmf_tcp_pdu_verify_dif returns error, treat the error as fatal transport error because the error is caused by the target itself. Handle the fatal NVMe/TCP transport error by terminating the connection as described in the NVMe specification. On the other hand, data digest error is treated as a non-fatal transport error because the error is caused outside the target. This is reasonable. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I9680af2556c08f5888aeaf0a772097e4744182be Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458921 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-07-08 03:33:07 +00:00
Ziye Yang	57efada508	nvmf/tcp: reorg the structure of struct spdk_nvmf_tcp_req I used pahole to see whether the alignment of the structure is reasonable. After reorganization, we can save 16 bytes and 1 cacheline according to the information from pahole. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I1347e7c582fe2b00707e2841690b87d53cc61e33 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/460572 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-07-05 04:18:41 +00:00
Shuhei Matsumoto	3ff1ff004e	nvme/tcp: Minor cleanups for SGL operations Using naming rules consistent with other related libraries is helpful to ensure the quality as verified by this patch series. This patch changes a few parts to use iov and iovcnt for SGL operations. Besides, name of an array points to the head of the array and is constant. So copying name of array to an another pointer is not necessary and can be removed. Change-Id: I2324f28126b3088098c1c767cf6c060f22c175c3 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455629 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2019-07-04 08:58:40 +00:00
Shuhei Matsumoto	127cfac020	nvmf/tcp: Use nvme_tcp_pdu_set_data_buf for incapsule data Previously we had used nvme_tcp_pdu_set_data() for incapsule data. This patch changes handling incapsule data to use nvme_tcp_pdu_set_data_buf() as same as H2C and C2H. This unification is necessary to support DIF insert and strip in NVMe/TCP target later. Change-Id: I02cae8db94e51cf79a354dd64ad45f0e491ec08e Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455920 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2019-07-04 08:58:40 +00:00
Shuhei Matsumoto	3184884f9d	nvmf/tcp: Properly handle multiple iovecs in processing H2C and C2H NVMe/TCP target had assumed the size of each iovec was io_unit_size. Using nvme_tcp_pdu_set_data_buf() instead removes the assumption and supports any alignment transparently. Hence this patch moves nvme_tcp_pdu_set_data_buf() to include/spdk_internal/nvme_tcp.h and replaces the current code to use it. Besides, this patch simplifies spdk_nvmf_tcp_calc_c2h_data_pdu_num() because sum of iov_len of iovecs is equal to the variable length now. We cannot separate code movement (lib/nvme/nvme_tcp.c to include/ spdk_internal/nvme_tcp.h) and code replacement (lib/nvmf/tcp.c) because moved functions are static and compiler give warning if they are not referenced in lib/nvmf/tcp.c. The next patch will add UT code. Change-Id: Iaece5639c6d9a41bd35ee4eb2b75220682dcecd1 Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455625 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-07-04 08:58:40 +00:00
Ziye Yang	b09bd95ad3	sock: update spdk_sock_group_add_sock And also add spdk_sock_group_get_ctx function Change-Id: I2a2a58b0588ff7d99d3538ea0a633a3b8c7a234b Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454538 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Maciej Szwed <maciej.szwed@intel.com>	2019-07-04 08:21:05 +00:00
Shuhei Matsumoto	12d6dce2aa	nvmf: Use not malloc'ed but fixed size string for host NQN Maximum size of NQN is already defined to be SPDK_NVMF_NQN_MAX_LEN, and hence use fixed size string whose size is SPDK_NVMF_NQN_MAX_LEN + 1 for spdk_nvmf_vhost::nqn. This change will reduce the potential malloc failure. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I2b9c7cc21200b3e88b5485ebfdcd5040bc6e3589 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459742 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-07-04 00:30:22 +00:00
Changpeng Liu	af6ed1e94a	nvmf: update the reservation information for ACQUIRE/RELEASE commands Change-Id: Ibfebffa4d683da08ae8f9350cce144fafe6a5538 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455910 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-07-02 00:06:59 +00:00
Changpeng Liu	196d4f704a	nvmf: enable ptpl feature with reservation register command Add file based reservation information definition, the data structure can be used to store all the reservation information to a json based configuration file, and enable this feature with REGISTER command. Change-Id: Ic93cfc5934a4ad96f11b96ec77bacb877edf6c10 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455909 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-07-02 00:06:59 +00:00
Ziye Yang	cdc0170c1b	nvmf/tcp: Add a maximal PDU loop number In our previous code, we handled all PDUs until there was no incoming data from the network, as long as we could continue the loop. However, this is not quite fair when we are handling multiple connections in a polling group. This change sets a maximum number of NVMe/TCP PDUs we can handle for each connection, which can improve performance. After some tuning, 32 should be a good loop number. Our iSCSI target uses 16. The following shows some performance data: Configuration: 1 Command used in the initiator side: ./examples/nvme/perf/perf -r 'trtype:TCP adrfam:IPv4 traddr:192.168.4.11 trsvcid:4420' -q 128 -o 4096 -w randrw -M 50 -t 10 2 target side, export 4 malloc bdev in a same subsystem Result: Before patch: Starting thread on core 0 ======================================================== Latency(us) Device Information : IOPS MiB/s Average min max TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51554.20 201.38 2483.07 462.31 4158.45 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51533.00 201.30 2484.12 508.06 4464.07 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51630.20 201.68 2479.30 481.19 4120.83 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 51700.70 201.96 2475.85 442.61 4018.67 ======================================================== Total : 206418.10 806.32 2480.58 442.61 4464.07 After patch: Starting thread on core 0 ======================================================== Latency(us) Device Information : IOPS MiB/s Average min max TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57445.30 224.40 2228.46 450.03 4231.23 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57529.50 224.72 2225.17 676.07 4251.76 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57524.80 224.71 2225.29 627.08 4193.28 TCP (addr:192.168.4.11 subnqn:nqn.2016-06.io.spdk:cnode1) from core 0: 57476.50 224.52 2227.17 663.14 4205.12 ======================================================== Total : 229976.10 898.34 2226.52 450.03 4251.76 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I86b7af1b669169eee2225de2d28c2cc313e7d905 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/459572 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-06-28 12:28:54 +00:00
Or Gerlitz	6629202cbd	nvmf/tcp: Use the success optimization by default By now (5.1 is released), the Linux kernel initiator supports the success optimization and further, the version that doesn't support it (5.0) was EOL-ed. As such, lets open it up @ spdk by default. Doing so provides a notable performance improvement: running perf with iodepth of 64, randread, two threads and block size of 512 bytes for 60s ("-q 64 -w randread -o 512 -c 0x5000 -t 60") over the VMA socket acceleration library and null backing store, we got 730K IOPS with the success optimization vs 550K without it. IOPS MiB/s Average min max 549274.10 268.20 232.99 93.23 3256354.96 728117.57 355.53 175.76 85.93 14632.16 To allow for interop with older kernel initiators, we added a config knob under which the success optimization can be enabled or disabled. Change-Id: Ia4c79f607f82c3563523ae3e07a67eac95b56dbb Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457644 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2019-06-26 06:24:03 +00:00
Changpeng Liu	cf5c4a8a2e	nvmf: add ptpl activated flag to Namespace If users set the persist through power loss configuration file, that means the Namespace has the capability to support the ptpl feature. Here we add a ptpl_activated flag to indicate whether users have enabled the feature or not. Users can use Set Features or Reservation Register commands to change the value. Change-Id: Iae3fd44085c5be5bf9574e49efa567e8212dee20 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455906 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-06-26 01:54:10 +00:00
Hailiang Wang	73a171a07c	rdma: assert ibv_send_wr is not NULL Vhost testing crashed from Nightly testing, because a member access within null pointer of type 'struct ibv_send_wr'. Change-Id: If8f34f23864883ea73516d2d1fe3b30137c04316 Signed-off-by: Hailiang Wang <hailiangx.e.wang@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/458913 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-06-25 13:37:15 +00:00
Evgeniy Kochetov	9e3d841d3e	nvmf: Fix connect command SQ size validation for IO queues SQSIZE parameter validation in Connect command was broken because QID field in qpair was used before initialization. Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com> Change-Id: I8a0b359937d661df3b9888e6084e7d0b4a9056ea Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455667 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-06-18 11:39:29 +00:00
Shuhei Matsumoto	c758dc088a	nvmf: Reject bdev with separate metadata to attach to subsystem The NVMe bdev module supports separate metadata now, but the NVMf subsystem cannot process a bdev with separate metadata yet. Hence this patch explicitly rejects any attempt to attach a bdev with separate metadata. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I793c6c5f61deb766d7bf427ff67ccc57a48974cf Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/457167 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-06-13 00:48:11 +00:00
Changpeng Liu	3ec061800f	nvmf: add a persist through power loss configuration file when constructing NS For reservation feature in NVMoF, we can't support the persist through power loss feature, now we will add the configuration file parameter with Namespace, after users set the configuration file parameter with one NS, then the PTPL feature can be enabled. Change-Id: Id72699093f7e68318b9529f7bacc5c9804f7f86b Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455905 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-06-12 00:30:03 +00:00
Alexey Marchuk	53777de855	rdma: Unset IBV_SEND_SIGNALED flag for RDMA_WRITE operations Unsetting this flag will decrease the number of WRs retrieved during CQ polling and will decrease the overall processing time. Since RDMA_WRITE operations are always paired with RDMA_SEND (response), it is possible to track the number of outstanding WRs relying on the completed response WR. Completed WRs of type RDMA_WR_TYPE_DATA are now always RDMA_READ operations. The patch shows 2% better performance for read operations on an x86 machine. The performance was measured using perf with the following parameters: -q 16 -o 4096 -w read -t 300 -c 2 with nvme null device, each measurement was done 4 times avg IOPS (with patch): 865861.71 avg IOPS (master): 847958.77 avg latency (with patch): 18.46 [us] avg latency (master): 18.85 [us] Change-Id: Ifd3329fbd0e45dd5f27213b36b9444308660fc8b Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Sasha Kotchubievsky <sashakot@mellanox.com> Signed-off-by: Evgenii Kochetov <evgeniik@mellanox.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456469 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-06-11 18:07:28 +00:00
JinYu	8fc9ac7b0e	nvmf: complete all I/Os before changing sgroup to PAUSED For the nvme device, I/Os are completed asynchronously. So we need to check the outstanding I/Os before putting IO channel when we hot remove the device. We should be sure that all the I/Os have been completed when we change the sgroup->state to PAUSED, so that we can update the subsystem. Fix #615 #755 Change-Id: I0f727a7bd0734fa9be1193e1f574892ab3e68b55 Signed-off-by: JinYu <jin.yu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452038 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-06-11 01:51:56 +00:00
Ziye Yang	0bb626672b	nvmf/tcp: Support single r2t usage According to the TP 8000 spec in Page 26: Maximum Number of Outstanding R2T (MAXR2T): Specifies the maximum number of outstanding R2T PDUs for a command at any point in time on the connection. Note that by the spec, the target may only support single r2t (which is the minimum possible), it doesn't have to use multiple r2ts even if the initiator supports that. So remove the maxr2t and pending_r2t variable in the tcp qpair structure. In the original design, we thought that maxr2t was the maximum number of active R2Ts for each connection. So if the initiator sent maxr2t=16, it meant that all the commands of a qpair could share that number of R2T PDUs, and we needed to wait for an available R2T for the request when the number of active R2Ts reached the maximum. But that was a wrong understanding of the spec. In fact, each command has its own maximum number of R2Ts, so we do not need to wait for R2Ts anymore. So we remove the state TCP_REQUEST_STATE_DATA_PENDING_FOR_R2T. Furthermore, we adjust the related SPDK_TPOINT_ID definition. In the current patch, the target will support one active R2T for each write NVMe command. Thus, we remove the function spdk_nvmf_tcp_handle_queued_r2t_req. Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I7547b8facbc39139b4584637ccc51ba8b33ca285 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/455763 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Or Gerlitz <gerlitz.or@gmail.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-06-05 16:46:55 +00:00
Jim Harris	f758598c44	nvmf: fix assert in spdk_nvmf_tcp_req_fill_iovs It's OK for iovcnt to equal SPDK_NVMF_MAX_SGL_ENTRIES. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ic95d04f5667858e7fbb025f469c027e2d47b8ba1 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/456111 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2019-05-31 14:46:35 +00:00
Jim Harris	bf647c168a	nvmf: increase default max num qps to 128 This matches the Linux kernel target. Users can still decrease this default when creating the transport (i.e. -p option for nvmf_create_transport in rpc.py). Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Icad59350a2cd35cfc4ad76d06399345191680c05 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/454820 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-05-22 14:50:05 +00:00
Seth Howell	61948a1ca7	rdma: add check for allocating too many SRQ. We could run into issues with this if we were using an arbitrarily large amount of cores to run SPDK. Change-Id: Ia7add027d7e6ef1ccb4a69ac328dbdf4f2751fd8 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452250 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-05-15 20:29:32 +00:00
Seth Howell	14777890a6	rdma: add an stailq for qpairs pending recv This will help us not iterate through the whole list of connections when only some of them have pending recvs. Change-Id: I681bc98befbdda4e77ef333b7a086c08b2708eb3 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449266 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-05-13 22:09:55 +00:00
Seth Howell	c3884f943c	rdma: batch rdma recvs per poll. This will help save MMIO overhead. Especially in the SRQ case. Change-Id: I6fb70cf6de4763450f97961f41ccdce3acec2e63 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449265 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-13 22:09:55 +00:00
Seth Howell	b4dc10fbb7	rdma: create a list for qpairs pending send transfers By creating a list of qpairs, we can avoid looping over every connected qpair to process sends each time we poll. Change-Id: If24bbc363176f52fbfb756d56719edd885a21a11 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449264 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-10 22:24:35 +00:00
Seth Howell	9d63933b7f	rdma: batch rdma sends. By batching ibv sends each time we poll, we can reduce the number of MMIO writes that we do. Change-Id: Ia5a07b0037365abfa8732629c34d34a9ed49ac70 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449253 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-10 22:24:35 +00:00
Ben Walker	fbbbd6ab50	nvmf: Print a message out when a host is disconnecting due to keep alive It isn't obvious why hosts are being disconnected at the moment. Change-Id: I5515ba40883ccb20921d0da013b27670212bf649 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/453034 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-05-09 15:35:11 +00:00
Seth Howell	350e429a57	rdma: add a flag for disabling srq. There are cases where srq can be a detriment. Add a flag to allow users to disable srq even if they have a piece of hardware that supports it. Change-Id: Ia3be8e8c8e8463964e6ff1c02b07afbf4c3cc8f7 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452271 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-05-06 18:11:13 +00:00
Jim Harris	a95fdad68f	nvmf: remove unnecessary size checks when creating transport The individual transports will adjust these sizes when necessary. In fact, we have to remove this check, since RDMA transport may adjust the io_unit_size based on the max number of SGEs - and can adjust it to a value that will fail this check if we reload the configuration. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I2708c7f5aaa54a368ec932ec40dd6447f1a4fde0 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452474 Reviewed-by: Seth Howell <seth.howell5141@gmail.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-05-02 14:44:57 +00:00
Jim Harris	b6206d657c	trace: shorten max name from 44 to 24 characters This restriction helps reduce the amount of padding when printing out the event trace, allowing it to fit in a small number of columns. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ifa31e5a6967c7b9bc7028069effb71533f80596f Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452736 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-05-02 08:41:56 +00:00
Jim Harris	617184be3b	trace: remove short_name This was not used by any of the trace register descriptions. Let's remove it rather than keeping it around if we don't need it. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Idda809e2911db5be555ff6aa13695484a14bf665 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452734 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com>	2019-05-02 08:41:56 +00:00
Anil Veerabhadrappa	2061874474	lib/nvmf: Validate requested SQ size for both admin and IO queue During connect call based on queue type (AQ or IOQ), SQ size should be validated against max sq size for that particular queue type. Change-Id: I977d7556e4d04e37004d16c87efffd3b467fa62c Signed-off-by: Anil Veerabhadrappa <anil.veerabhadrappa@broadcom.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452376 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-05-01 18:51:28 +00:00
Seth Howell	6cc18a64aa	rdma.c: Don't set recv->qpair to NULL We can use the rpoller->srq to check if a qpair is valid when processing recv completions. Change-Id: I6aa360adc48a3312ddcf79f10e2a65b502a7314f Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452247 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-01 18:48:13 +00:00
Seth Howell	33f60621af	lib: resize key mempools Mempools are based off of a ring structure which allocates its elements as a power of two. It also only exposes n-1 elements to the user. So when we create a mempool with 2^n elements in it, we have to allocate a ring with 2^n+1 entries. By decreasing the number of elements in these key mempools by 1, we can save a decent amount of memory. Change-Id: I942c9dd4cf59096969bc2559fb46fd2084a07f09 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/448875 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-05-01 17:45:29 +00:00
Seth Howell	d05c553827	rdma: don't spam people with async event messages. It used to be that we would get async events very infrequently. However, with the introduction of SRQ, this number has gone up tremendously. Change the way we report these events so that we don't spam/confuse people running the target. Change-Id: I33070281fa854cbc17784d61bbbb870196ca8780 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452159 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-26 18:10:56 +00:00
Seth Howell	ec47f92b9b	rdma: fix potential heap-use-after-free in srq shutdown If there are outstanding recvs for a qpair when it is destroyed, we need to clear the qpair from it before reposting it. Otherwise, we have a potential heap-use-after-free or double free (depending on whether the recv completion is in error state or not). See github issues #730 Change-Id: Ic2009c761cbcc5e89174f62fbd0872d0489c67ca Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/452122 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2019-04-26 11:16:22 +00:00
Seth Howell	3856d82b50	subsystem: check for NULL bufs in reservation ops. At the RDMA level, we allow processing requests that should contain a data transfer, but specify a length of zero to be passed up the stack without a data buffer. See spdk_nvmf_rdma_request_get_xfer. In the case of the reservation requests, we weren't checking whether req->data was NULL before trying to copy into it causing us to segfault if we got a malformed reservation request. Found when using the fuzzer. Change-Id: I320174ec72a8d298ab6ca44ef6a99691631f00ca Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451786 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-25 22:52:12 +00:00
Changpeng Liu	3f4426878a	nvmf: disable the protection if the backend doesn't contain valid type It's not an error if the NVMe hard drive was formatted to 512 + 8 but has no protection type, so we will also disable the protection for NVMoF target. Change-Id: I07e605cff9545f46c642f7ca783a4727a26abece Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451926 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-04-24 21:47:18 +00:00
Seth Howell	89d2efe07e	rdma: set the srq param in the initiator. We were setting this value in the target from our initiator, but it turns out the rdma_conn_params struct is responsible for setting the opposite side so we need to add it in the target side when accepting connections. Also, add a test to demonstrate target functionality when we overwhelm the SRQ. It is useful to note that performance really tanks when you start overwhelming the srq so it may be useful to use this test case to check performance gains in edge cases over time. Change-Id: Iac541bd9fc1d82eca9f21e7abc3f625663a6c460 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451678 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-24 09:22:16 +00:00
Jim Harris	b92c3d412d	nvmf: add tcp trace points for data read from socket Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: Ib04abb64dd379dd73c7ff3c8318591124b4bb7dd Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451477 Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-04-23 17:59:23 +00:00
Gregory Shapiro	14032a984c	NVMF: Add model number as parameter to construct_nvmf_subsystem (-d option). Change-Id: Ia1a458a0ac1c5a17d2955a3f31c6dfe77538eb17 Signed-off-by: Gregory Shapiro <gregory.shapiro@kaminario.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/438562 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-04-23 16:51:16 +00:00
Changpeng Liu	68bb3995aa	nvmf: trivial optimization to make the code more consistent Make the use of spdk_uuid_compare() to be consistent in the file, also change the SPDK_INFOLOG to SPDK_DEBUGLOG to avoid the repeated log messages for RESERVATION CONFLICT response. Change-Id: I72fefbd520cefcaf25182c3ca3d21e3d87d17e94 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450884 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-04-23 16:30:24 +00:00
Changpeng Liu	4fa486a1e3	nvmf: add asynchronous event for reservation notification Now Host can get an asynchronous event notification when registrants were unregistered/preempted or reservation was released from the associated namespace, Host can send get log page to clear related log pages and reservation report to get the full overview of current reservation configuration. Change-Id: Idc57c19812490c7536503308989871515e9f2361 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439935 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-23 16:30:24 +00:00
jiaqizho	b70e698465	rdma:fix core dump when rdma_create_qp return error. Signed-off-by: jiaqizho <jiaqi.zhou@intel.com> Change-Id: Ie900e01820f69fc5b2d5e30d519c6b619d7a7281 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/449507 Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-04-22 18:40:35 +00:00
Yair Elharrar	2b0ae30bf1	nvmf: fix segfault in case of multi-range unmap In case of a DSM Deallocate (unmap) with multiple ranges, individual bdev IOs are submitted for each range. If the bdev IO cannot be allocated, the request is queued on io_wait_queue; however previously submitted ranges may complete before memory is available for the next range. In such a case, the completion callback will free unmap_ctx, while the request is still queued for memory - causing a segfault when the request is dequeued. To fix, introduce a new field tracking the unmap ranges, and make sure the count is nonzero when the request is queued for memory. Signed-off-by: Yair Elharrar <yair@excelero.com> Change-Id: Ifcac018f14af5ca408c7793ca9543c1e2d63b777 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/447542 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2019-04-22 15:42:51 +00:00
Jim Harris	4ff7949893	nvmf: remove unused tcp trace point Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I8f2e26f46f8c37312c3201df8210b449279640d0 Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/451476 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2019-04-22 15:25:37 +00:00
Seth Howell	7d7b44f2a6	rdma: decrement descriptor before checking SEND_WITH_INVAL We were incrementing over the end of the descriptor list and assigning undefined values to the rsp opcode in SEND_WITH_INVAL case. We were only hitting this error when mixing sgl and inline requests in the same workload. We were just by chance hitting a four bit value that was set to all 1s from the in capsule data from the last request. Change-Id: Ied06356f3d22fa34a2cd869dfad6bdca8720791d Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450873 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2019-04-19 17:29:45 +00:00
Seth Howell	2cc6b0dfcb	rdma: set the number of wr sge_entries per I/O This was not being properly set in the multi-sgl path. Also add a verification step to the fio configuration file to prevent against future regressions. Change-Id: I510b6acd92bc2fbc9b6fbec1d59945cc53584ad3 Signed-off-by: Seth Howell <seth.howell@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/450305 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2019-04-19 17:29:45 +00:00
Changpeng Liu	468c6c18bd	nvmf: enable get log page with reservation notification page Reservation notification log page can be returned via the get log page command with correct page number, users can get zeroed page buffer if the controller didn't have any reservation notification log. Change-Id: I99f5e4b8917a6919eb68359628efa1bead4b21b5 Signed-off-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-on: https://review.gerrithub.io/c/spdk/spdk/+/439934 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: GangCao <gang.cao@intel.com>	2019-04-18 22:33:26 +00:00

1015 Commits