numam-spdk

Author	SHA1	Message	Date
Ben Walker	4e347038a8	sock: Maps hold group_impls instead of groups Since the maps are unique to modules, they can store the group_impls directly. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I7f11db558e38e940267fdf6eaacbe515334391c2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7222 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:54 +00:00
Ben Walker	5379aa95e7	sock: Each module now maintains its own sock_map This allows for different policies per module, as well as overlapped placement_id values. Change-Id: I0a9c83e68d22733d81f005eb054a4c5f236f88d9 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7221 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:54 +00:00
Ben Walker	6b86039fd9	nvme/tcp: Ensure qpair is polled when it gets a writev_async completion There was a fix for this that went into the posix layer, but the underlying problem is the logic in the nvme/tcp transport. Attempt to fix that instead. Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:24 +00:00
Ben Walker	6d6959e989	sock/posix: Rename pending_recv to pending_events This list will hold any socket that has some event pending and needs to be part of the set returned during polling of the group. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I5acf01677e59c1026f93671c7b7b3dc458075bf7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6748 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	fc551b3a62	sock/posix: Make pending_recv shuffle more efficient Instead of iterating the list, we can just manipulate the list in a single step. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I0172cdbce9af35a62d62dbccfac573e5d723f43a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6747 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	e8bcf36a81	sock: Don't cache placement_id in generic sock struct Instead, move it down to the modules. This allows modules to potentially change the value, if they are able. Change-Id: I08f5fbadf5d1e96b489ddaaca72aa051ce2cb85c Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7212 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-04-16 05:04:29 +00:00
Ben Walker	1d2613fe36	sock/posix: Eliminate so_priority This value is already available in the options structure. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I140dc79da1fa5f155a39f1f9e2f54f46d93b6c1c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-16 05:04:29 +00:00
Ben Walker	28b3889c8e	sock: Use an enum for placement modes Easier to read than integers. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ie9b8b16e1916b393a257e9ed0180ef9837f20cd2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7205 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-04-09 17:15:57 +00:00
Sudheer Mogilappagari	4ac5ca6558	posix: add sock to pending_recv list only if not already added Currently there is possibility of adding a sock to pending_recv list again if sock->pending_recv is true. Check if flag is false before adding to the list. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Change-Id: Ie23e1e8dbe1aa5594d9ddea30e7f235e3bf8ddad Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6381 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2021-03-29 07:43:08 +00:00
Ben Walker	b67aa514a4	sock/posix: No longer remove sockets from pending_recv in poll This seems to be cleaning up the pending_recv list to account for the missed cases in the previous patches in this series. Now that we're correctly cleaning up the list, don't do this. Note that if an EPOLLIN event is received but the application never does a read/recv, the socket will remain in the pending recv list. The next poll will get another EPOLLIN event, but the logic already handles that case. Additionally, left a TODO for a performance optimization. Change-Id: I1cdde500a5c76554401a89de766d35b7a486b207 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6746 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	8ac5f9e924	sock/posix: Fix read logic to avoid double-adding socket to pending_recv Also write some better comments Change-Id: I81d59307c5eacc5a71879a83e5040da667909d96 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6745 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	01aa5cb385	sock/posix: Clear sock from pending_recv even if user does large read If there was an EPOLLIN event the socket gets adding to the pending_recv list. But if the application then does a very large read, it will bypass the logic that clears the socket from the pending_recv list. Fix this. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ia0ba86012f7c6dfd14eb43ba6eeed94dbbce90ce Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6744 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	8e7d559283	sock/posix: When a socket has no recv_pipe, reading should still clear from pending_recv list If the upper layer performs a read/recv, it should still remove the socket from the pending_recv list. Change-Id: I32ca8ecccbfe1e53ecc7d6f57343c2727e84b851 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6743 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Richael Zhuang	201aa63471	sock: introduce SO_INCOMING_CPU to get placement_id Leverage SO_INCOMING_CPU to get the CPU affinity of connections (sockets). And allocate the connections to specific poll groups, which aims to utilize cache locality. From our test: 6 P4600 NVMe on target,target uses 8 cores, NIC irqs are bound to these 8 cores, and initiator side uses 24 and 32 cores, we can get 11%~17% randwrite performance boost for posix, and 8%~12% for uring. Change-Id: I011e0a21502c85adcccd4a14fbe9838b43f54976 Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5748 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-09 08:53:52 +00:00
Ziye Yang	2f1cd867f3	sock/uring: Refactor the code in uring_sock_close Use the same style compared the code in posix_sock_close. Thus if we cannot close sock->fd, i.e., we leak the fd, but we can still free the memory related with uring sock. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Id2f0e8a2c7065f100c2b009e76a49b528fd221b6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6539 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-02-25 10:26:08 +00:00
Ziye Yang	d5cd0b13b6	sock: Fix the "sock remove assert bug" in spdk_sock_group_remove_sock The statement causes this issue is: assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL); The call trace is: The previous solution is: commitid with: `e71e81b631` But with this solution, it will always add the sock into the removed_socks list even if it is not under polling context by sock_group_impl_poll_count. So it will exceed the size of removed_socks array if sock_group_impl_poll_count function will not be called. And we should not use a large array, because it is just a workaround, it just hides the bug. So our current solution is: 1 Remove the code in sock layer, i.e., rollback the commit `e71e81b631`. This patch is not the right fix. The sock->cb_fn's NULL pointer case is caused by the cb_fn of write operation (if the spdk_sock_group_remove_sock is inside the cb_fn). And it is not caused by the epoll related cache issue described in commit "e7181.." commit, but caused by the following situation: (1)The socket's cb_fn is set to NULL which is caused by spdk_sock_group_remove_sock by the socket itself inside a call back function from a write operation. (2) And the socket is already in the pending_recv list. It is not caused by the epoll event issue, e.g., socket A changes Socket B's cb_fn. By the way, A socket A should never remove a socket B from a polling group. If it really does it, it should use spdk_thread_sendmsg to make sure it happens in the next round. 2 Add the code check in each posix, uring implementation module. If sock->cb_fn is NULL, we will not return the socket to the active socks list. And this is enough to address the issue. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I79187f2f1301c819c46a5c3bdd84372f75534f2f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6472 Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-02-24 13:06:50 +00:00
Nick Connolly	a1ae47f34f	module/sock/posix: improve portability Default to using epoll unless __FreeBSD__ is defined. Add macros SPDK_KEVENT and SPDK_EPOLL to indicate whether epoll or kevent is present. The macros match the naming convention for SPDK_ZEROCOPY which controls zero copy in a similar way. Signed-off-by: Nick Connolly <nick.connolly@mayadata.io> Change-Id: I4c46fb94b254cb075427bfe07a8085887254c45a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6466 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-02-19 11:30:32 +00:00
Alexey Marchuk	0d3ad99929	sock/posix: Don't return if zcopy is disabled When socket is being created and zcopy is disabled by the config, we can return from posix_sock_alloc function before we try to set quick_ack Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I6670b8337e70ec12b18a5e6753674fbef9e95648 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6382 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-02-17 10:18:16 +00:00
Ziye Yang	0e9ee17642	posix: Fix the NULL pointer issue of group. A single sock connection can call posix_sock_flush, and this sock may not belong to a polling group. So add the check in sock_check_zcopy to avoid such issue. Fixes #1788 Change-Id: Id0a2f80ad0f3cdb7fc736a3be3211e49513751b1 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6334 Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Karol Latecki <karol.latecki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-02-10 08:40:25 +00:00
Tomasz Zawadzki	e4070ee0e0	so_ver: increase all major versions To allow SO_MINOR updates on LTS for the whole year it is supported, the major version for all components needs to be increased. This is to prevent scenario where two versions exists with matching versions, but conflicting ABI. Ex. Next SPDK release adds an API call increasing the minor version, then LTS needs just a subset of those additions. Increasing major so version after LTS, allows the quarterly releases to update versions as needed. Yet allowing LTS to increase minor version separately. Disabled test for increasing SO version without ABI change, as that is goal of this patch. This check shall be removed with SPDK 21.04 release. This patch: - increases SO_VER by 1 for all components - resets SO_MINOR to 0 for all components - removes suppressions for ABI tests Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I44d01154430a074103bd21c7084f44932e81fe72 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6167 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-02-05 14:43:47 +00:00
Ziye Yang	48b2ac7a13	sock/posix: fix the zero copy enabling in initiator. When zero copy is enabled in initiator, there could be the case that a socket connection does not belong to a polling group, i.e, the application does not use socket polling group. Then we should actively call _sock_check_zcopy in posix_sock_flush function when zero copy policy is enabled. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Idceaa7557eb265daa878db40c922494c3de35ea8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5423 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-02-05 13:45:15 +00:00
Alexey Marchuk	8015386885	sock/posix: Enable zero copy send Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ic21e7ba1b090b4d24ef8ae0c1b0a9c5b1909da3b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6193 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-02-04 08:36:50 +00:00
Alexey Marchuk	2ae4adc342	sock/posix: Add sock to pending list on zcopy event In NVMF TCP initiator when zero copy is disabled, all requests are completed when we receive EPOLLIN event for socket, add socket to pending_recv list and call socket's callback which calls qpair_process_completions. As part of completions processing on NVME level we receive the number of completions and resubmit the same number of queued requests. When zero copy is enabled, some transport requests can be completed when we receive and process EPOLLERR event, it happens out of qpair_process_completions context. So part of requests can be completed, transport level contains free requests but NVME layer don't have info about it until it calls qpair_process_completions. And there is a chance that on posix level when we poll sockets we receive only EPOLLERR flag without EPOLLIN. In this case we can complete several requests but don't call qpair_process_completion so we don't resubmit queued requests. It may lead to a hang in the end of test run when there are no mo requests to be completed on transport level (no EPOLLIN event) and we receive EPOLERR only, so we can't resubmit queured requests. This patch fixes this problem, it add a socket to group's pending_recv list if we received EPOLERR event and completed at least 1 socket request. So socket's callback can be called even without EPOLLIN event. Fixes issue #1685 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I21d5c2fe6eb0787aab9531925a7f0e2fe18bafaa Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6162 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-02-04 08:36:50 +00:00
Ziye Yang	c38a1bc002	sock: create spdk_sock_prep_reqs function. The purpose is to reduce the duplicated functions in posix and uring implmentation. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ia0568b2490d362e7e78fa59b3ca88a60313ba0bd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5284 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-11-27 09:22:30 +00:00
Liu Xiaodong	b7f7bbd16b	sock/uring: reap if some sock has data in pipe Since spdk_sock_group_poll should do level trigger to the callback func registered in spdk_sock_group_add_sock, if there is data in pipe, but not event occurs, poll func should still reap sock who has data in pipe. Change-Id: If3a983f80fd04708e45ad0398c7d34018ec52bc7 Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5072 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2020-11-16 12:22:43 +00:00
Alexey Marchuk	9b19abae3c	sock/posix: Disable zcopy send by default Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I4825c681d742946dfcf5bdc209356194766a15cd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4978 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-10-30 14:51:34 +00:00
Tomasz Zawadzki	34f84c5845	lib/sock: zero out sock_impl opts By design the opts for each implementation can not match spdk_sock_impl_opts. During get_opts for specific implementation only used fields are filled. Yet iterating over all spdk_sock_impl_opts fields would yeild garbage values for unset fields. This is the case right now when doing save_config RPC with uring enabled. A garbe value for enable_zerocopy_send is returned. sock.c:829:62: runtime error: load of value 165, which is not a valid value for type '_Bool' Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ie0512a7dffc36c8ff89256d08f8a2f4fefcf9e83 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4699 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-10-16 08:15:10 +00:00
Alexey Marchuk	cbacd11a5c	posix: Reorder spdk_posix_sock struct Reorder bool fields to eliminate extra hole Change-Id: Ia27ee2be42c13d0aba702a1b00c9b89e6620e36e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4213 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	86865969ff	sock/posix: Enable send zero copy for client sockets In NVME TCP initiator zero copy is enabled for IO qpairs and disabled for admin qpairs Change-Id: Ibdf521dccde9b95ec5dd15a5eb2baed8fcf8b88e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	f0d8396e7a	sock: Add option to configure zero copy per socket A preparation step for enabling zero copy in NVMEoF TCP initiator. This option will be used to disable zero copy for admin qpair. This is needed since the admin qpair's socket is not connected to socket poll group and we can't receive buffer reclaim notification. Change-Id: Ibfbb8a156aafcd7ba8975a50f790da7fbd37d96f Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4210 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-09-29 09:35:47 +00:00
Jeffry Molanus	c66275c075	lib/sock: do not fail spdk_sock_flush() on ENOBUFS When net.core.optmem_max is not set high enough, a call to sendmsg() might fail with ENOBUFS. Currently this is treated as an error. When we have no more buffer space left, we should continue to process any completions and by doing so, free up the auxiliary buffers we ran out of. With this change I was able to run perf against the spdk target with a purposely set to a low, value of optmem_max, where previously it would fail. This fixes github issue #1592 Signed-off-by: Jeffry Molanus <jeffry.molanus@gmail.com> Change-Id: Ieeeed4fbecd827d0da815456b57fbe81495fe54d Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4129 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-09-16 07:55:22 +00:00
Ziye Yang	a6db2f3590	sock: enable placement_id configuration in sock layer This patch is used to enable placement_id getting in sock layer and also add the rpc support. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I70de57b0ed392a0aefce9d3ff1f61ef924015a87 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4146 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-11 10:04:22 +00:00
Ziye Yang	ebb903d46e	sock/uring: Add the support for enable_quickack Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: If908173600b7803dcf0e130f185dfdaec70c71c5 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4148 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-11 10:04:22 +00:00
Ziye Yang	2c80fce02d	sock/uring: enable "enable_recv_pipe" in uring sock Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: If62030a011ded73181b88f90fe87586a907af9ae Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4145 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-09-11 10:04:22 +00:00
Ziye Yang	cf99beb87e	sock/posix: Fix the overflow issue of sendmsg_index The type of sendmsg_idx is uint32_t, so the maximal is 2^32 -1, so it could be overflow and get 0, so we should fix it. PS: I think that our code may have potential defect. In my experiment, I try to init sendmsg_idx with 2^32 -1, so the first req->internal.offset = 2^32 - 1. But for the ee_info and ee_data in "struct sock_extended_err" got from _sock_check_zcopy is all 0 in the target side. So it means that the this req will never be completed. With the increase of sendmsg_idx (the type is uint32_t), sendmsg_idx will finally goto 2^32 - 1, so I think it will still kick the issue I described. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ic9aaf629d73d5b7e2c81800a4f7f92c728adbc34 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3948 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-08-31 07:46:14 +00:00
Tomasz Zawadzki	30a31a16eb	sock/vpp: remove VPP implementation This patch removes implementation of VPP socket abstraction along with ways to compile it. Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I089f7703cfc4fb517f8f80f4368e544bced549b6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3734 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-08-17 08:19:46 +00:00
Shuhei Matsumoto	0f22282fc3	lib/iscsi: Check if IP address-port pair is valid as redirect portal Add a helper function iscsi_parse_redirect_addr() to validate the passed IP address-port pair. iSCSI login redirection will support only numeric IP address and TCP port, and add AI_NUMERICSERV and AI_NUMERICHOST. This function is almost same as nvme_tcp_parse_addr() and nvme_rdma_parse_addr(). Besides, update error log in posix_sock_create() to use gai_strerror(). gai_strerror() will provide more accurate information as done by nvme_tcp_parse_addr() and nvme_rdma_parse_addr(). Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: I65c6de81a64dcb26551ce796172d0458e1c298a7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3357 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-08-11 08:27:43 +00:00
Shuhei Matsumoto	61cd9d308e	lib/sock: Add option to enable or disable quick ACK TCP delayed ACK can be disabled or enabled by enabling or disabling quick ACK, respectively. The recently added spdk_sock_impl_opts is helpful for sock library to control quick ACK. Hence this patch adds and uses an option enable_quickack. The option is effective only for the POSIX sock module. We have spdk_sock_opts now too but spdk_sock_impl_opts will be better for this case. This option is not supported on FreeBSD. FreeBSD users can set the option globally via sysctl if desired. Signed-off-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Change-Id: Ic89620267acce5872dc8ecaf7a99bb70ae97e993 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3603 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-08-10 11:29:20 +00:00
Richael Zhuang	612aa86b50	sock/uring: enable pipe buffer on arm64 The pipe buffer has obvious performance influence on arm64. The following is my test result with 1core, we can also enable it on arm64 currently like the posix socket. And later we can find the optimal pipe size that won't cause a degradation for large payloads. randwrite randread 512 byte 61% 97% 4096 byte 84% 16% 16384 byte -13% -17% Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Change-Id: Ib4df60751c5e06ef9bd7fc7bb7efafa5ad4de211 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3329 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-07-31 08:23:36 +00:00
Evgeniy Kochetov	29f31a90e1	sock: Add sock_impl option to disable zero copy on send Zero copy send can cause performance degradation with small payloads. This patch adds an option to disable it if required. By default zero copy is enabled. Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com> Change-Id: I14f2b21ad375e770cb08f850360898bac675b351 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3344 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-24 00:30:45 +00:00
Evgeniy Kochetov	63c5e51ebc	sock: Add sock_impl option to disable receive pipe Receive pipe reduces number of system calls and gives significant performance improvement with kernel TCP stack and relatively small IO sizes. With user space TCP/IP implementations there are no system calls and double buffering introduced by pipe has negative impact on performance. Receive pipe remains enabled by default. Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com> Change-Id: Ic5ddee42293df2c233ba7ffbe6662de7917ac586 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3343 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-24 00:30:45 +00:00
Evgeniy Kochetov	59b5ba5ca9	sock/posix: Add helper macros to get/set fields in sock_impl_opts structure Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com> Change-Id: I2fe650556edf22e253976dcd4ddf07d649789d11 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3498 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2020-07-24 00:30:45 +00:00
Sudheer Mogilappagari	2d877401fd	nvmf/tcp: Add recv to busy poll hw queue for data Call recv to trigger busy polling even when no socket is active. when epoll_wait returns zero, the first socket in poll group is used to trigger busy polling in kernel stack and potentially reap incoming data Change-Id: I15f04cb4a2c7b382dd07391eda69678fd7919790 Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3180 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-22 12:20:47 +00:00
Ziye Yang	f096f252d0	uring/sock: remove the SPDK_UNREACHABLE For some exceptional cases (e.g., https://github.com/spdk/spdk/issues/1486), we may detect POLLERR or other events. So for those events, we can just ingore it, but not use SPDK_UNREACHABLE. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I073575408783ff75e50b40d45ddf09388a2cab96 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3262 Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2020-07-10 07:30:45 +00:00
Maciej Szwed	eb05cbd677	pollers: Fix pollers to return correct busy status Poller should return status > 0 when it did some work (CPU was used for some time) marking its call as busy CPU time. Active pollers should return BUSY status only if they did any meangful work besides checking some conditions (e.g. processing requests, do some complicated operations). Signed-off-by: Maciej Szwed <maciej.szwed@intel.com> Change-Id: Id4636a0997489b129cecfe785592cc97b50992ba Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2164 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-07-07 07:29:31 +00:00
Ziye Yang	c833729b22	sock/uring: Fix the nvmf_shutdown_tc3 failure issue. If the socks parameter(passed in uring_sock_group_impl_poll) is NULL, we do not need to handle the sock_flush and prep the pollin task, otherwise it will cause the assert issue when we reap the task when we handle the nvmf_shutdown_tc3 issue. Because in uring_sock_group_impl_remove_sock, we finally set sock->group = NULL. Without this patch, when we call uring_sock_group_impl_poll in this function, pollin_task or write_task are prepared, then in the next round, we will reap those tasks again. PS: Error info can be found in https://ci.spdk.io/results/autotest-per-patch/builds/19186/archive/nvmf-tcp-vg-autotest/build.log Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I7e6deaa05e958b52e71e0bbf0ccdd20e35583685 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/3031 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-06-29 09:18:13 +00:00
Ziye Yang	a37fd6e03d	uring: remove user's ring file descriptor operation. According to the description of io_uring_queue_exit: Tear down function for io_uring. Unmaps all setup shared ring buffers and closes the low-level io_uring file descriptor returned by the kernel. So we should remove the close operatoin on ring fd. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I910c6e8acd935925b7985c2aa750df385004eb55 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2922 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-06-19 07:18:40 +00:00
Ziye Yang	d8cafc28bb	Sock: The created pipe for sock should have a minimal value. Thus, we can make sure that when read data is larger than the pipe size, it will not read the data into the pipe. Change-Id: I87f3b03fd9b81eb693e9eae0fea9eef7d1b9eaa8 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2450 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-06-02 09:48:59 +00:00
Ziye Yang	ef9fd47d07	sock/uring: Fix the connection close issue when the initiator closes the connection unexpected. When the initiator terminates the connection suddently, we found the keep alive timeout issue, e.g., nvmf_ctrlr_keep_alive_poll: NOTICE: Disconnecting host from subsystem nqn.2016-06.io.spdk:cnode1 due to keep alive timeout. The root cause is that we did not closes the connection on our target side on time whening using uring sock, and this patch can fix the issue. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I295f58bbdbae0ac3f5308f6eadef6a75c5ad07d8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/2544 Community-CI: Mellanox Build Bot Community-CI: Broadcom CI Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-06-02 09:48:15 +00:00
Evgeniy Kochetov	74b184e73a	sock/posix: Add recv_buf_size and send_buf_size socket layer options Signed-off-by: Evgeniy Kochetov <evgeniik@mellanox.com> Change-Id: Ifb7d542c4da83070344c4baa512da459ba73ec90 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/610 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-06-02 09:48:00 +00:00

1 2

95 Commits