numam-spdk

Author	SHA1	Message	Date
Liu Qing	0fb77ea8c3	sock/posix: move fd creation to a separated function Options modified on sock after connected is also moved to a function. Signed-off-by: Liu Qing <winglq@gmail.com> Change-Id: I4c2a9ae9858c102764959d055bed208b4b0621d9 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9516 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-09-22 07:00:28 +00:00
Maciej Wawryk	e588603d3e	Revert "sock/posix: fix the socket pipe_has_data or socket_has_data." This reverts commit `2cd948c4a6`. This commit caused drop in performance tests. More info in issue https://github.com/spdk/spdk/issues/2158 Signed-off-by: Maciej Wawryk <maciejx.wawryk@intel.com> Change-Id: Id5d353535323c79e773e33377af388dae47238cb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9510 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-09-22 06:56:31 +00:00
Ziye Yang	a03bc55669	uring: Fix socket rotation and ordering issue in the pending_recv list. When we rotated the socket in the list, we did not check whether the uc pointer was NULL, which caused a coredump when the uc pointer was used. When we add a new socket (with a pollin event) to the pending_recv list, we add it to the end of the list, which delays the handling of the new socket and can cause a timeout, especially when a socket is in its connection phase. So the purpose of this patch is: 1 Revise the rotation logic to handle the two cases, i.e., (1) sock is at the beginning of the list; (2) sock is at the end of the list. The purpose is to avoid NULL pointer access and to efficiently handle the exceptional case. 2 When there is a new pollin event on the socket, we should add the socket at the beginning of the list. This avoids starvation of newly added sockets. Since the max poll event num is 32 from the upper layer, if we always put the new socket at the end of the list, starvation will occur when there are many socket connection events. Because if we add the new socket to the end of the pending list, we will always handle the existing socks first, and the socket that arrives later (with a relatively new pollin event) will always be handled late. As a result, the sock connection initialization phase takes a relatively long time, and the upper layer connection based on this socket times out, e.g., ctrlr.c: 185:nvmf_ctrlr_keep_alive_poll: NOTICE: Disconnecting host nqn.2014-08.org.nvmexpress: uuid:af56cce7-2008-408c-a8a0-9c710857febf from subsystem nqn.2019-02.io.spdk:cnode0 due to keep alive timeout.
[2021-08-25 20:13:42.201139] ctrlr.c: 579:_nvmf_ctrlr_add_io_qpair: ERROR: Unknown controller ID 0x1 Fixes #2097 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I171b83ffd800539e86660c7607538e120fbe1a91 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9223 Reviewed-by: John Kariuki <John.K.Kariuki@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-09-07 07:32:50 +00:00
Konrad Sztyber	c7d1c506ef	sock/uring: reorder task cancellations in remove_sock When removing a socket from a sock_group, the recv_task should be cancelled last, because it can be sent out while cancelling other tasks (if POLLERR is received). Otherwise, we could end up with outstanding recv requests from a socket removed from a group. Fixes #2112. Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: Ic8e24c210541390dd8bdffe8d3bc4e7dd746d4b7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9239 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-08-23 08:49:01 +00:00
Konrad Sztyber	57d60b9f17	sock/uring: fix build on systems w/o zerocopy support Signed-off-by: Konrad Sztyber <konrad.sztyber@intel.com> Change-Id: I9c856a6390f227393bb0df8a473895e2368d2fbb Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/9238 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-08-23 08:49:01 +00:00
Ziye Yang	fc9a264595	uring: Do not enable zero copy if fd is opened on a loopback device. This is in order to not affect the loopback test. Also create a sock_common.c file which can be used by the posix/uring implementations. We do not put such code in sock.c because sock.c is the general layer, and other users may include their own user space sock implementations. So put the common code in sock_common.c instead. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I983ec2313119539e6eed2d9f11ba1488c0ed6560 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8769 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-08-13 07:14:38 +00:00
Ziye Yang	34ab803d30	sock/uring: Add the MSG zero copy feature This patch tries to add MSG zero copy feature. Though io_uring supports buffer registration, it only supports io_uring_prep_write_fixed. It means only one registered buffer can be used. It does not satisfy our current usage mode. Given this situation, we still use the MSG_ZEROCOPY flag in io_uring_prep_sendmsg. Furthermore, this new feature is dependent on the kernel version. The currently verified version is kernel 5.12 rc3, so it is not enabled by default. For example, if you want to use it on the target side, you can use the following rpc to configure: ./scripts/rpc.py sock_impl_set_options -i uring --enable-zerocopy-send-server Change-Id: Ie7bb828f466362add94891989ddf0950dccd9e80 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/957 Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-08-13 07:14:38 +00:00
Alexey Marchuk	ec1b78dbd7	socket: Remove deprecated enable_zerocopy_send This parameter is still part of API spdk_sock_impl_opts structure but it is not used. Keep it to support ABI compatibility since it is located in the middle of the structure and removing it may break socket opts initialization or parsing. Change-Id: Ib641ad7d965d68bc9ebb65dba531408d88cf6fa1 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8914 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-07-23 10:30:25 +00:00
Ziye Yang	5a169179be	uring: fix the assert issue. Revise the if case to avoid the assert issue. Change-Id: I095f3d111423e17abaa1f951fe22efb3d2e851b7 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8872 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com>	2021-07-23 07:10:48 +00:00
Ziye Yang	467f16bf7d	uring: Use low level list ops to improve the performance when reordering the list. This patch improves the performance when we need to reorder the list. PS: Bring over the similar operations from the posix implementation. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I7031b35ddb597730ee160690e8ab9caf9b2b64b7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8675 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Broadcom CI <spdk-ci.pdl@broadcom.com> Community-CI: Mellanox Build Bot Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-07-13 09:08:15 +00:00
Ziye Yang	05dc895bea	sock/posix: Fix the coredump when removing the sock from socks_has_data list When entering the if case to order the list, there is a bug that should be fixed. The original code does not address this. The way this happens is when there is a connection left in the socks_with_data list between polls and there are enough new events detected that it would exceed the maximal number of events. A connection is left on this list between polls if it isn't fully drained via reads by the upper layer on each poll loop. Currently, the maximal socket event num is 32, so we did not hit this issue in our normal test cases. But when you use the NVMe-oF tcp target in the test described in #2105, there are more than 32 active sockets, which exceeds the maximal num of events per poll (32), so we trigger this issue. Fixes issue #2015 Change-Id: I9384476fdba8826f5fe55a5d2594e3f4ed3832ba Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8541 Community-CI: Mellanox Build Bot Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-07-01 19:37:38 +00:00
Zhiqiang Liu	583689215f	uring: set fd to -1 after close(fd) in uring_sock_create() In uring_sock_create(), we loop through all the addresses available. If something is wrong, we should close(fd), set fd to -1, and try the next address. Only when one fd satisfies all conditions will we break the loop with the useful fd. Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Change-Id: I22eada5437776fe90a6b57ab42cbad6dc4b0585c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8311 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2021-06-22 00:11:32 +00:00
Zhiqiang Liu	b4226d6f99	posix: set fd to -1 after close(fd) in posix_sock_create() In posix_sock_create(), we loop through all the addresses available. If something is wrong, we should close(fd), set fd to -1, and try the next address. Only when one fd satisfies all conditions will we break the loop with the useful fd. Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Change-Id: Icbfc10246c92b95cacd6eb058e6e46cf8924fc4c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8310 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-06-17 09:18:08 +00:00
Ziye Yang	2cd948c4a6	sock/posix: fix the socket pipe_has_data or socket_has_data. After reading the code in detail, I think that we should not set pipe_has_data= true and socket_has_data at the same time. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I8f9f96b16f4f0e0c585877a0dd687a240252a7cf Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/8283 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-06-14 08:44:28 +00:00
Ben Walker	1ae601b573	sock/posix: Avoid extra readv calls after draining recv_pipe Move from a single flag indicating that the socket is on the pending_events list to two flags - pipe_has_data and socket_has_data. If either flag is true, the socket is on the socks_with_data list. This is necessary to track enough state to avoid doing extra recv() system calls. Change-Id: I65e5701dccb0a5bade19f266f164f26706b110d4 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7595 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-05-20 15:14:08 +00:00
Alexey Marchuk	8e85b675fc	sock: Add new params to configure zcopy for server, client sockets When zero copy send is enabled and used by initiator, it could significantly increase latency in some payloads. To enable more fine-grained configuration of the zero copy send feature, add new parameters enable_zerocopy_send_server and enable_zerocopy_send_client to spdk_sock_impl_opts to enable/disable zcopy for specific types of sockets. The existing enable_zerocopy_send parameter affects all types of sockets. Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I111c75608f8826980a56e210c076ab8ff16ddbdc Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7457 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Karol Latecki <karol.latecki@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-04-27 08:13:32 +00:00
Sudheer Mogilappagari	2cbc9d4dff	posix: Group connections of sock group on host side based on placement_id On host side the connections are created and then added to thread's poll group. Those connections could use different NIC queues underneath. To route all connections of a poll group through a single queue, a unique placement id is chosen as group_placement_id and each socket of the poll group is marked with group_placement_id using the setsockopt(SO_MARK) option. The driver could use the so_mark value of the skb to determine the queue to use. Change-Id: I06bda777fe07a62133b80b2491fa7772150b3b5d Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6160 Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-04-26 15:33:03 +00:00
Jim Harris	0e4690236b	sock/posix: return error immediately if epoll_ctl fails We do not want to do any further work on adding the sock to the group if the epoll_ctl (or kevent) fails. Signed-off-by: Jim Harris <james.r.harris@intel.com> Change-Id: I44b6dc86ce5676aa1b8d6c50b86f22758e4e37fa Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7594 Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-04-26 06:55:48 +00:00
Ben Walker	365db9ee48	sock/posix: Deal with hung I/O with MSG_ZEROCOPY and interrupt suppression When all of the following conditions are met: - non-blocking socket - zero copy is enabled - interrupts are suppressed (i.e. busy polling) - NIC tx queue is full at the time sendmsg() is called - epoll_wait sees there is already an EPOLLIN event then we can get into a situation where data we've sent is queued up in the kernel network stack, but interrupts have been suppressed because other traffic is flowing. This makes the kernel miss the signal to flush the software tx queue. If there wasn't also already a pending EPOLLIN event, then epoll_wait would have been sufficient to kick the system out of this state. But when all of this aligns, it hangs. We deal with this by detecting the scenario and calling poll(), which will force the kernel to issue the pending transmits. Change-Id: Ifb247159b7de16c8fc72a90f0333f5b421c8bd07 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6750 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Community-CI: Mellanox Build Bot	2021-04-23 18:31:07 +00:00
Sudheer Mogilappagari	2974f8d676	posix: replace usage of recv() with poll() Busy polling using recv() is dependent on the kernel socket buffer being empty. Instead, the poll() function busy polls hw queues with no such dependency. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Change-Id: I1cb101848d51f7778cdf3d4c015d2d03201bdb37 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7014 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-04-19 19:13:41 +00:00
Ben Walker	4e347038a8	sock: Maps hold group_impls instead of groups Since the maps are unique to modules, they can store the group_impls directly. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I7f11db558e38e940267fdf6eaacbe515334391c2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7222 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:54 +00:00
Ben Walker	5379aa95e7	sock: Each module now maintains its own sock_map This allows for different policies per module, as well as overlapped placement_id values. Change-Id: I0a9c83e68d22733d81f005eb054a4c5f236f88d9 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7221 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:54 +00:00
Ben Walker	6b86039fd9	nvme/tcp: Ensure qpair is polled when it gets a writev_async completion There was a fix for this that went into the posix layer, but the underlying problem is the logic in the nvme/tcp transport. Attempt to fix that instead. Change-Id: I04dd850bb201641d441c8c1f88c7bb8ba1d09e58 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6751 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-19 12:54:24 +00:00
Ben Walker	6d6959e989	sock/posix: Rename pending_recv to pending_events This list will hold any socket that has some event pending and needs to be part of the set returned during polling of the group. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I5acf01677e59c1026f93671c7b7b3dc458075bf7 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6748 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	fc551b3a62	sock/posix: Make pending_recv shuffle more efficient Instead of iterating the list, we can just manipulate the list in a single step. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I0172cdbce9af35a62d62dbccfac573e5d723f43a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6747 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Community-CI: Mellanox Build Bot	2021-04-19 12:54:24 +00:00
Ben Walker	e8bcf36a81	sock: Don't cache placement_id in generic sock struct Instead, move it down to the modules. This allows modules to potentially change the value, if they are able. Change-Id: I08f5fbadf5d1e96b489ddaaca72aa051ce2cb85c Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7212 Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-04-16 05:04:29 +00:00
Ben Walker	1d2613fe36	sock/posix: Eliminate so_priority This value is already available in the options structure. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: I140dc79da1fa5f155a39f1f9e2f54f46d93b6c1c Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-04-16 05:04:29 +00:00
Ben Walker	28b3889c8e	sock: Use an enum for placement modes Easier to read than integers. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ie9b8b16e1916b393a257e9ed0180ef9837f20cd2 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/7205 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-04-09 17:15:57 +00:00
Sudheer Mogilappagari	4ac5ca6558	posix: add sock to pending_recv list only if not already added Currently there is possibility of adding a sock to pending_recv list again if sock->pending_recv is true. Check if flag is false before adding to the list. Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Change-Id: Ie23e1e8dbe1aa5594d9ddea30e7f235e3bf8ddad Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6381 Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com>	2021-03-29 07:43:08 +00:00
Ben Walker	b67aa514a4	sock/posix: No longer remove sockets from pending_recv in poll This seems to be cleaning up the pending_recv list to account for the missed cases in the previous patches in this series. Now that we're correctly cleaning up the list, don't do this. Note that if an EPOLLIN event is received but the application never does a read/recv, the socket will remain in the pending recv list. The next poll will get another EPOLLIN event, but the logic already handles that case. Additionally, left a TODO for a performance optimization. Change-Id: I1cdde500a5c76554401a89de766d35b7a486b207 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6746 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	8ac5f9e924	sock/posix: Fix read logic to avoid double-adding socket to pending_recv Also write some better comments Change-Id: I81d59307c5eacc5a71879a83e5040da667909d96 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6745 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	01aa5cb385	sock/posix: Clear sock from pending_recv even if user does large read If there was an EPOLLIN event the socket gets added to the pending_recv list. But if the application then does a very large read, it will bypass the logic that clears the socket from the pending_recv list. Fix this. Signed-off-by: Ben Walker <benjamin.walker@intel.com> Change-Id: Ia0ba86012f7c6dfd14eb43ba6eeed94dbbce90ce Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6744 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Ben Walker	8e7d559283	sock/posix: When a socket has no recv_pipe, reading should still clear from pending_recv list If the upper layer performs a read/recv, it should still remove the socket from the pending_recv list. Change-Id: I32ca8ecccbfe1e53ecc7d6f57343c2727e84b851 Signed-off-by: Ben Walker <benjamin.walker@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6743 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-29 07:28:35 +00:00
Richael Zhuang	201aa63471	sock: introduce SO_INCOMING_CPU to get placement_id Leverage SO_INCOMING_CPU to get the CPU affinity of connections (sockets). And allocate the connections to specific poll groups, which aims to utilize cache locality. From our test: 6 P4600 NVMe on target,target uses 8 cores, NIC irqs are bound to these 8 cores, and initiator side uses 24 and 32 cores, we can get 11%~17% randwrite performance boost for posix, and 8%~12% for uring. Change-Id: I011e0a21502c85adcccd4a14fbe9838b43f54976 Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5748 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-03-09 08:53:52 +00:00
Ziye Yang	2f1cd867f3	sock/uring: Refactor the code in uring_sock_close Use the same style as the code in posix_sock_close. Thus if we cannot close sock->fd, i.e., we leak the fd, we can still free the memory related to the uring sock. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Id2f0e8a2c7065f100c2b009e76a49b528fd221b6 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6539 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-02-25 10:26:08 +00:00
Ziye Yang	d5cd0b13b6	sock: Fix the "sock remove assert bug" in spdk_sock_group_remove_sock The statement that causes this issue is: assert(group_impl->num_removed_socks < MAX_EVENTS_PER_POLL); The call trace is: The previous solution is commit `e71e81b631`. But with this solution, the sock is always added to the removed_socks list even if it is not under polling context by sock_group_impl_poll_count. So it will exceed the size of the removed_socks array if the sock_group_impl_poll_count function is not called. And we should not use a larger array, because that is just a workaround that hides the bug. So our current solution is: 1 Remove the code in sock layer, i.e., rollback the commit `e71e81b631`. This patch is not the right fix. The sock->cb_fn's NULL pointer case is caused by the cb_fn of a write operation (if the spdk_sock_group_remove_sock is inside the cb_fn). And it is not caused by the epoll related cache issue described in the "e7181.." commit, but caused by the following situation: (1) The socket's cb_fn is set to NULL because the socket itself calls spdk_sock_group_remove_sock inside a callback function from a write operation. (2) And the socket is already in the pending_recv list. It is not caused by the epoll event issue, e.g., socket A changes socket B's cb_fn. By the way, a socket A should never remove a socket B from a polling group. If it really does so, it should use spdk_thread_sendmsg to make sure it happens in the next round. 2 Add the code check in each posix, uring implementation module. If sock->cb_fn is NULL, we will not return the socket to the active socks list. And this is enough to address the issue.
Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: I79187f2f1301c819c46a5c3bdd84372f75534f2f Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6472 Reviewed-by: Xiaodong Liu <xiaodong.liu@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com>	2021-02-24 13:06:50 +00:00
Nick Connolly	a1ae47f34f	module/sock/posix: improve portability Default to using epoll unless __FreeBSD__ is defined. Add macros SPDK_KEVENT and SPDK_EPOLL to indicate whether epoll or kevent is present. The macros match the naming convention for SPDK_ZEROCOPY which controls zero copy in a similar way. Signed-off-by: Nick Connolly <nick.connolly@mayadata.io> Change-Id: I4c46fb94b254cb075427bfe07a8085887254c45a Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6466 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2021-02-19 11:30:32 +00:00
Alexey Marchuk	0d3ad99929	sock/posix: Don't return if zcopy is disabled When socket is being created and zcopy is disabled by the config, we can return from posix_sock_alloc function before we try to set quick_ack Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I6670b8337e70ec12b18a5e6753674fbef9e95648 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6382 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-02-17 10:18:16 +00:00
Ziye Yang	0e9ee17642	posix: Fix the NULL pointer issue of group. A single sock connection can call posix_sock_flush, and this sock may not belong to a polling group. So add the check in sock_check_zcopy to avoid such issue. Fixes #1788 Change-Id: Id0a2f80ad0f3cdb7fc736a3be3211e49513751b1 Signed-off-by: Ziye Yang <ziye.yang@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6334 Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Karol Latecki <karol.latecki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Paul Luse <paul.e.luse@intel.com> Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot	2021-02-10 08:40:25 +00:00
Tomasz Zawadzki	e4070ee0e0	so_ver: increase all major versions To allow SO_MINOR updates on LTS for the whole year it is supported, the major version for all components needs to be increased. This is to prevent a scenario where two versions exist with matching versions, but conflicting ABI. Ex. Next SPDK release adds an API call increasing the minor version, then LTS needs just a subset of those additions. Increasing the major so version after LTS allows the quarterly releases to update versions as needed, while still allowing LTS to increase the minor version separately. Disabled the test for increasing SO version without ABI change, as that is the goal of this patch. This check shall be removed with SPDK 21.04 release. This patch: - increases SO_VER by 1 for all components - resets SO_MINOR to 0 for all components - removes suppressions for ABI tests Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: I44d01154430a074103bd21c7084f44932e81fe72 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6167 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2021-02-05 14:43:47 +00:00
Ziye Yang	48b2ac7a13	sock/posix: fix the zero copy enabling in initiator. When zero copy is enabled in initiator, there could be the case that a socket connection does not belong to a polling group, i.e, the application does not use socket polling group. Then we should actively call _sock_check_zcopy in posix_sock_flush function when zero copy policy is enabled. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Idceaa7557eb265daa878db40c922494c3de35ea8 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5423 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Community-CI: Mellanox Build Bot	2021-02-05 13:45:15 +00:00
Alexey Marchuk	8015386885	sock/posix: Enable zero copy send Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: Ic21e7ba1b090b4d24ef8ae0c1b0a9c5b1909da3b Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6193 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com>	2021-02-04 08:36:50 +00:00
Alexey Marchuk	2ae4adc342	sock/posix: Add sock to pending list on zcopy event In NVMF TCP initiator when zero copy is disabled, all requests are completed when we receive EPOLLIN event for socket, add socket to pending_recv list and call socket's callback which calls qpair_process_completions. As part of completions processing on NVME level we receive the number of completions and resubmit the same number of queued requests. When zero copy is enabled, some transport requests can be completed when we receive and process EPOLLERR event, it happens out of qpair_process_completions context. So part of requests can be completed, transport level contains free requests but NVME layer doesn't have info about it until it calls qpair_process_completions. And there is a chance that on posix level when we poll sockets we receive only EPOLLERR flag without EPOLLIN. In this case we can complete several requests but don't call qpair_process_completions so we don't resubmit queued requests. It may lead to a hang in the end of test run when there are no more requests to be completed on transport level (no EPOLLIN event) and we receive EPOLLERR only, so we can't resubmit queued requests. This patch fixes this problem, it adds a socket to group's pending_recv list if we received EPOLLERR event and completed at least 1 socket request. So socket's callback can be called even without EPOLLIN event. Fixes issue #1685 Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I21d5c2fe6eb0787aab9531925a7f0e2fe18bafaa Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/6162 Community-CI: Broadcom CI Community-CI: Mellanox Build Bot Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: <dongx.yi@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2021-02-04 08:36:50 +00:00
Ziye Yang	c38a1bc002	sock: create spdk_sock_prep_reqs function. The purpose is to reduce the duplicated functions in posix and uring implementation. Signed-off-by: Ziye Yang <ziye.yang@intel.com> Change-Id: Ia0568b2490d362e7e78fa59b3ca88a60313ba0bd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5284 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Community-CI: Mellanox Build Bot Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-11-27 09:22:30 +00:00
Liu Xiaodong	b7f7bbd16b	sock/uring: reap if some sock has data in pipe Since spdk_sock_group_poll should do level trigger to the callback func registered in spdk_sock_group_add_sock, if there is data in pipe, but no event occurs, poll func should still reap any sock that has data in pipe. Change-Id: If3a983f80fd04708e45ad0398c7d34018ec52bc7 Signed-off-by: Liu Xiaodong <xiaodong.liu@intel.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/5072 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ziye Yang <ziye.yang@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com> Reviewed-by: Changpeng Liu <changpeng.liu@intel.com>	2020-11-16 12:22:43 +00:00
Alexey Marchuk	9b19abae3c	sock/posix: Disable zcopy send by default Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Change-Id: I4825c681d742946dfcf5bdc209356194766a15cd Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4978 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-10-30 14:51:34 +00:00
Tomasz Zawadzki	34f84c5845	lib/sock: zero out sock_impl opts By design the opts for each implementation can not match spdk_sock_impl_opts. During get_opts for specific implementation only used fields are filled. Yet iterating over all spdk_sock_impl_opts fields would yield garbage values for unset fields. This is the case right now when doing save_config RPC with uring enabled. A garbage value for enable_zerocopy_send is returned. sock.c:829:62: runtime error: load of value 165, which is not a valid value for type '_Bool' Signed-off-by: Tomasz Zawadzki <tomasz.zawadzki@intel.com> Change-Id: Ie0512a7dffc36c8ff89256d08f8a2f4fefcf9e83 Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4699 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com> Reviewed-by: Aleksey Marchuk <alexeymar@mellanox.com>	2020-10-16 08:15:10 +00:00
Alexey Marchuk	cbacd11a5c	posix: Reorder spdk_posix_sock struct Reorder bool fields to eliminate extra hole Change-Id: Ia27ee2be42c13d0aba702a1b00c9b89e6620e36e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4213 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	86865969ff	sock/posix: Enable send zero copy for client sockets In NVME TCP initiator zero copy is enabled for IO qpairs and disabled for admin qpairs Change-Id: Ibdf521dccde9b95ec5dd15a5eb2baed8fcf8b88e Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4211 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com>	2020-09-29 09:35:47 +00:00
Alexey Marchuk	f0d8396e7a	sock: Add option to configure zero copy per socket A preparation step for enabling zero copy in NVMEoF TCP initiator. This option will be used to disable zero copy for admin qpair. This is needed since the admin qpair's socket is not connected to socket poll group and we can't receive buffer reclaim notification. Change-Id: Ibfbb8a156aafcd7ba8975a50f790da7fbd37d96f Signed-off-by: Alexey Marchuk <alexeymar@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-on: https://review.spdk.io/gerrit/c/spdk/spdk/+/4210 Tested-by: SPDK CI Jenkins <sys_sgci@intel.com> Reviewed-by: Ben Walker <benjamin.walker@intel.com> Reviewed-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Shuhei Matsumoto <shuhei.matsumoto.xt@hitachi.com>	2020-09-29 09:35:47 +00:00


115 Commits