numam-dpdk

Author	SHA1	Message	Date
Dmitry Kozlyuk	64be0e779f	ethdev: fix device capability to string translation Add support for RTE_ETH_DEV_CAPA_FLOW_{RULE,SHARED_OBJECT}_KEEP to rte_eth_dev_capability_name(), missed when adding the capabilities. Fixes: `1d5a3d68c0` ("ethdev: add capability to keep flow rules on restart") Fixes: `2c9cd45de7` ("ethdev: add capability to keep shared objects on restart") Reported-by: Ali Alnubani <alialnu@nvidia.com> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Xueming Li <xuemingl@nvidia.com> Tested-by: Ali Alnubani <alialnu@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-09 19:26:19 +01:00
Jim Harris	d4fb4eb087	power: remove unused poll counter Following the previous fix, there is nothing using the ppi counter. We can remove the related ppi_av array in struct priority_worker. This allows us to also remove num_dequeue_pkts_prev and pc from struct priority_worker since they are only used in conjunction with the ppi_av array. Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-11-12 15:43:49 +01:00
Jim Harris	0353121c33	power: fix build with clang 13 clang-13 rightfully complains that the tot_ppi variable in update_stats is set but not used, since the final accumulated tot_ppi result isn't used anywhere. Fixes: `450f079131` ("power: add traffic pattern aware power control") Cc: stable@dpdk.org Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-11-12 15:33:35 +01:00
Volodymyr Fialko	001d402c89	eal/arm64: support ASan This patch defines ASAN_SHADOW_OFFSET for arm64 according to the ASan documentation. This offset should cover all arm64 VMAs supported by ASan. Signed-off-by: Volodymyr Fialko <vfialko@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>	2021-11-12 15:30:00 +01:00
David Marchand	19d024003d	build: factorize jansson availability check Since two components want to know if the jansson library is available, move it to config/. Signed-off-by: David Marchand <david.marchand@redhat.com>	2021-11-10 16:23:05 +01:00
David Marchand	d6024c0a67	build: cleanup libpcap dependent components The RTE_PORT_PCAP variable is used to signal libpcap availability, though its name seems to refer to pcap support in the port library. Prefer a generic name and add explicit link dependencies where needed. Fixes: `7a944656b3` ("test/pcapng: test pcapng library") Fixes: `2eccf6afbe` ("bpf: add function to convert classic BPF to DPDK BPF") Fixes: `cbb44143be` ("app/dumpcap: add new packet capture application") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2021-11-10 11:42:34 +01:00
Huichao Cai	013bb504c8	ip_frag: revert fix fragmenting IPv4 fragment The patch ("ip_frag: fix fragmenting IPv4 fragment") introduces a bug and needs to be rolled back. This is because the patch and variables "flag_offset" conflict with each other. Bugzilla ID: 835 Fixes: `567473433b` ("ip_frag: fix fragmenting IPv4 fragment") Cc: stable@dpdk.org Signed-off-by: Huichao Cai <chcchc88@163.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-11-08 23:32:38 +01:00
Konstantin Ananyev	060ef29dc0	ip_frag: hide internal structures Move internal reassembly structures into new private header 'ip_reassembly.h'. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-11-08 22:37:47 +01:00
Maciej Szwed	aeed570a21	interrupt: fix request notifier interrupt processing We should call read() on RTE_INTR_HANDLE_VFIO_REQ event to confirm that event. Fixes: `0eb8a1c4c7` ("vfio: add request notifier interrupt") Cc: stable@dpdk.org Signed-off-by: Maciej Szwed <maciej.szwed@intel.com>	2021-11-08 18:26:07 +01:00
Harman Kalra	7e2083e462	eal/linux: check interrupt file descriptor validity This patch fixes coverity issue by adding a check for negative event fd value. Coverity issue: 373711, 373694 Fixes: `c2bd9367e1` ("lib: remove direct access to interrupt handle") Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-11-08 17:32:42 +01:00
Harman Kalra	3fcca9fac6	interrupts: check file descriptor validity This patch fixes coverity issues by adding a check for negative event fd value. Coverity issue: 373716, 373699, 373693, 373688 Fixes: `bbbac4cd6e` ("interrupts: remove direct access to interrupt handle") Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-11-08 17:32:42 +01:00
Elena Agostini	c7ebd65c13	gpudev: add communication list In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. When mixing network activity with task processing there may be the need to put in communication the CPU with the device in order to synchronize operations. An example could be a receive-and-process application where CPU is responsible for receiving packets in multiple mbufs and the GPU is responsible for processing the content of those packets. The purpose of this list is to provide a buffer in CPU memory visible from the GPU that can be treated as a circular buffer to let the CPU provide fundamental info of received packets to the GPU. A possible use-case is described below. CPU: - Trigger some task on the GPU - in a loop: - receive a number of packets - provide packets info to the GPU GPU: - Do some pre-processing - Wait to receive a new set of packets to be processed Layout of a communication list would be: ------- \| 0 \| => pkt_list \| status \| \| #pkts \| ------- \| 1 \| => pkt_list \| status \| \| #pkts \| ------- \| 2 \| => pkt_list \| status \| \| #pkts \| ------- \| .... \| => pkt_list ------- Signed-off-by: Elena Agostini <eagostini@nvidia.com>	2021-11-08 17:20:53 +01:00
Elena Agostini	f56160a255	gpudev: add communication flag In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. When mixing network activity with task processing there may be the need to put in communication the CPU with the device in order to synchronize operations. The purpose of this flag is to allow the CPU and the GPU to exchange ACKs. A possible use-case is described below. CPU: - Trigger some task on the GPU - Prepare some data - Signal to the GPU the data is ready updating the communication flag GPU: - Do some pre-processing - Wait for more data from the CPU polling on the communication flag - Consume the data prepared by the CPU Signed-off-by: Elena Agostini <eagostini@nvidia.com>	2021-11-08 17:20:53 +01:00
Elena Agostini	2d61b429cf	gpudev: add memory barrier Add a function for the application to ensure the coherency of the writes executed by another device into the GPU memory. Signed-off-by: Elena Agostini <eagostini@nvidia.com>	2021-11-08 17:20:53 +01:00
Elena Agostini	e818c4e2bf	gpudev: add memory API In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. Such workload distribution can be achieved by sharing some memory. As a first step, the features are focused on memory management. A function allows to allocate memory inside the device, or in the main (CPU) memory while making it visible for the device. This memory may be used to save packets or for synchronization data. The next step should focus on GPU processing task control. Signed-off-by: Elena Agostini <eagostini@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:53 +01:00
Thomas Monjalon	a9af048aba	gpudev: support multi-process The device data shared between processes are moved in a struct allocated in a shared memory (a new memzone for all GPUs). The main struct rte_gpu references the shared memory via the pointer mpshared. The API function rte_gpu_attach() is added to attach a device from the secondary process. The function rte_gpu_allocate() can be used only by primary process. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:53 +01:00
Thomas Monjalon	82e5f6b658	gpudev: add child device representing a device context The computing device may operate in some isolated contexts. Memory and processing are isolated in a silo represented by a child device. The context is provided as an opaque by the caller of rte_gpu_add_child(). Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Thomas Monjalon	18cb075631	gpudev: add event notification Callback functions may be registered for a device event. Callback management is per-process and not thread-safe. The events RTE_GPU_EVENT_NEW and RTE_GPU_EVENT_DEL are notified respectively after creation and before removal of a device, as part of the library functions. Some future events may be emitted from drivers. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Elena Agostini	8b8036a66e	gpudev: introduce GPU device class library In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. The new library gpudev is for dealing with GPGPU computing devices from a DPDK application running on the CPU. The infrastructure is prepared to welcome drivers in drivers/gpu/. Signed-off-by: Elena Agostini <eagostini@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Anatoly Burakov	4fd15c6af0	vfio: set errno on unsupported OS Currently, when code is running on FreeBSD or Windows, there is no way to distinguish between a genuine error and a "VFIO is unsupported" error. Fix the dummy implementations to also set the rte_errno flag. Fixes: `279b581c89` ("vfio: expose functions") Fixes: `c564a2a200` ("vfio: expose clear group function for internal usages") Fixes: `964b2f3bfb` ("vfio: export some internal functions") Fixes: `ea2dc10668` ("vfio: add multi container support") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-11-08 16:45:28 +01:00
Anatoly Burakov	da6e4cdca1	vfio: fix FreeBSD documentation On FreeBSD, `rte_vfio_is_enabled()` and `rte_vfio_noiommu_is_enabled()` API calls will not return error, and will instead return 0. This is intentional, because the caller of this API does not care whether VFIO is supported at all, and will instead be interested in whether VFIO is enabled or not. However, the doxygen comments for these functions state that they will return an error on FreeBSD, which is incorrect. Fix the doxygen comment to call out the fact that these functions are only relevant on Linux, but remove the reference to returning errors. Fixes: `279b581c89` ("vfio: expose functions") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-08 16:42:55 +01:00
Anatoly Burakov	bf8b792f3b	vfio: fix FreeBSD clear group stub On FreeBSD, `rte_vfio_clear_group()` was returning 0 even though this function is not valid for FreeBSD, and is called out to return error in doxygen comments. Fix the return value to match documentation. Fixes: `c564a2a200` ("vfio: expose clear group function for internal usages") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-11-08 16:42:44 +01:00
Anatoly Burakov	84e03bde1c	vfio: drop fallback Linux implementation Currently, VFIO support for Linux is compiled unconditionally, and supported kernel versions start with 4.4, so VFIO is assumed to always be enabled. There is no way of disabling VFIO support at compile time anyway, so just drop the "VFIO not available" fallback code altogether. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-08 16:27:15 +01:00
Chengwen Feng	b1f4933ef3	kni: check error code of allmulticast mode switch Some drivers may return an error code when switching allmulticast mode, so it's necessary to check the return code. Fixes: `b34801d1aa` ("kni: support allmulticast mode set") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-08 11:56:13 +01:00
Ali Alnubani	948be25cdb	sched: fix debug build Compare pkt_len to 0 instead of NULL to avoid the following build failure with debug mode enabled: ../lib/sched/rte_pie.h: In function 'rte_pie_enqueue_empty': ../lib/sched/rte_pie.h:125:21: error: comparison between pointer and integer [-Werror] RTE_ASSERT(pkt_len != NULL); Bugzilla ID: 878 Fixes: `44c730b0e3` ("sched: add PIE based congestion management") Signed-off-by: Ali Alnubani <alialnu@nvidia.com>	2021-11-07 18:52:51 +01:00
Ferruh Yigit	b7ade5d31a	ethdev: fix crash on owner delete 'eth_dev->data' can be null before ethdev allocated. The API walks through all eth devices, at least for some data can be null. Adding 'eth_dev->data' null check before accessing it. Fixes: `33c73aae32` ("ethdev: allow ownership operations on unused port") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-05 15:35:57 +01:00
Mattias Rönnblom	578402f2d5	eventdev: support device maintenance in adapters Introduce support for event devices requiring calls to rte_event_maintain() in the Ethernet RX, Timer and Crypto Eventdev adapters. Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Tested-by: Richard Eklycke <richard.eklycke@ericsson.com>	2021-11-04 13:28:09 +01:00
Mattias Rönnblom	54f17843a8	eventdev: add port maintenance API Extend Eventdev API to allow for event devices which require various forms of internal processing to happen, even when events are not enqueued to or dequeued from a port. Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Richard Eklycke <richard.eklycke@ericsson.com> Tested-by: Liron Himi <lironh@marvell.com>	2021-11-04 13:27:54 +01:00
Naga Harish K S V	9e58318531	eventdev/eth_rx: support telemetry Added telemetry support for rxa_queue_stats and rxa_queue_stats_reset to get and reset rx queue stats respectively. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-11-04 08:41:25 +01:00
Naga Harish K S V	995b150c1a	eventdev/eth_rx: add queue stats API This patch adds new api ``rte_event_eth_rx_adapter_queue_stats_get`` to retrieve queue stats. The queue stats are in the format ``struct rte_event_eth_rx_adapter_queue_stats``. For resetting the queue stats, ``rte_event_eth_rx_adapter_queue_stats_reset`` api is added. The adapter stats_get and stats_reset apis are also updated to handle queue level event buffer use case. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-11-04 08:41:25 +01:00
Gowrishankar Muthukrishnan	259ca6d161	security: add telemetry endpoint for capabilities Add telemetry endpoint for cryptodev security capabilities. Details of endpoints added in documentation. Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:46:27 +01:00
Raja Zidane	9ad776442d	crypto/mlx5: support 1MB data-unit Add 1MB data-unit length to the capability's bitmap. Handle 1MB data-unit length in the mlx5 session create operation, and expose its capability in the mlx5 capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-04 19:46:27 +01:00
Radu Nicolau	ff4a29d167	ipsec: support TSO Add support for transmit segmentation offload to inline crypto processing mode. This offload is not supported by other offload modes, as at a minimum it requires inline crypto for IPsec to be supported on the network interface. Signed-off-by: Declan Doherty <declan.doherty@intel.com> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com> Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:46:27 +01:00
Nicolas Chautru	ba2469cddf	bbdev: promote API as stable This promotes the bbdev interface to stable. Overdue for some time as bbdev interface has been stable. Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Gowrishankar Muthukrishnan	1c559ee846	cryptodev: add telemetry endpoint for capabilities Add telemetry endpoint for getting cryptodev capabilities. Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Rebecca Troy	d3d98f5ce9	cryptodev: support telemetry The cryptodev library now registers commands with telemetry, and implements the corresponding callback functions. These commands allow a list of cryptodevs to be queried, as well as info and stats for the corresponding cryptodev. An example usage can be seen below: Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2 {"version": "DPDK 21.11.0-rc0", "pid": 1135019, "max_output_len": 16384} --> / {"/": ["/", "/cryptodev/info", "/cryptodev/list", "/cryptodev/stats", ...]} --> /cryptodev/list {"/cryptodev/list": [0,1,2,3]} --> /cryptodev/info,0 {"/cryptodev/info": {"device_name": "0000:1c:01.0_qat_sym", \ "max_nb_queue_pairs": 2}} --> /cryptodev/stats,0 {"/cryptodev/stats": {"enqueued_count": 0, "dequeued_count": 0, \ "enqueue_err_count": 0, "dequeue_err_count": 0}} Signed-off-by: Rebecca Troy <rebecca.troy@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Gregory Etelson	de39080bd6	ethdev: fix variable length flow elements support RTE flow API defines two flow elements types - common and PMD private. Common RTE flow types are defined in rte_flow.h while PMD private types exists inside specific PMD only. Application can create a flow rule with PMD private items or actions. RTE flow API restricts private PMD types to negative values. Current implementation tried to use negative PMD private item type value as index in the rte_flow_desc_item[] array. The patch allows access to rte_flow_desc_item[] and rte_flow_desc_action[] arrays to non-private PMD types only. Fixes: `6cf7204733` ("ethdev: support flow elements with variable length") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-04 13:23:29 +01:00
Thomas Monjalon	285725d93b	ethdev: promote device removal check function as stable The function rte_eth_dev_is_removed() was introduced in DPDK 18.02, and is integrated in error checks of ethdev library. It is promoted as stable ABI. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-04 11:11:28 +01:00
Maxime Coquelin	ab4bb42406	vhost: rename driver callbacks struct As previously announced, this patch renames struct vhost_device_ops to struct rte_vhost_device_ops. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-03 11:59:27 +01:00
Maxime Coquelin	94c16e89d7	vhost: mark vDPA driver API as internal This patch marks the vDPA driver APIs as internal and renames the corresponding header file to vdpa_driver.h. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-03 09:11:34 +01:00
Dmitry Kozlyuk	2c9cd45de7	ethdev: add capability to keep shared objects on restart rte_flow_action_handle_create() did not mention what happens with an indirect action when a device is stopped and started again. It is natural for some indirect actions, like counter, to be persistent. Keeping others at least saves application time and complexity. However, not all PMDs can support it, or the support may be limited by particular action kinds, that is, combinations of action type and the value of the transfer bit in its configuration. Add a device capability to indicate if at least some indirect actions are kept across the above sequence. Without this capability the behavior is still unspecified, and application is required to destroy the indirect actions before stopping the device. In the future, indirect actions may not be the only type of objects shared between flow rules. The capability bit intends to cover all possible types of such objects, hence its name. Declare that the application can test for the persistence of a particular indirect action kind by attempting to create an indirect action of that kind when the device is stopped and checking for the specific error type. This is logical because if the PMD can create an indirect action when the device is not started and use it after the start happens, it is natural that it can move its internal flow shared object to the same state when the device is stopped and restore the state when the device is started. Indirect action persistence across reconfigurations is not required. In case a PMD cannot keep the indirect actions across reconfiguration, it is allowed just to report an error. Application must then flush the indirect actions before attempting it. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-02 18:59:17 +01:00
Dmitry Kozlyuk	1d5a3d68c0	ethdev: add capability to keep flow rules on restart Previously, it was not specified what happens to the flow rules when the device is stopped, possibly reconfigured, then started. If flow rules were kept, it could be convenient for application developers, because they wouldn't need to save and restore them. However, due to the number of flows and possible creation rate it is impractical to save all flow rules in DPDK layer. This means that flow rules persistence really depends on whether PMD and HW can implement it efficiently. It can also be limited by the rule item and action types, and its attributes transfer bit (a combination of an item/action type and a value of the transfer bit is called a rule feature). Add a device capability bit for PMDs that can keep at least some of the flow rules across restart. Without this capability behavior is still unspecified and it is declared that the application must flush the rules before stopping the device. Allow the application to test for persistence of rules using a particular feature by attempting to create a flow rule using that feature when the device is stopped and checking for the specific error. This is logical because if the PMD can create the flow rule when the device is not started and use it after the start happens, it is natural that it can move its internal flow rule object to the same state when the device is stopped and restore the state when the device is started. Rule persistence across reconfigurations is not required, because tracking all the rules and configuration-dependent resources they use may be infeasible. In case a PMD cannot keep the rules across reconfiguration, it is allowed just to report an error. Application must then flush the rules before attempting it. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-02 18:59:17 +01:00
Maxime Coquelin	541053891a	vhost: increase number of async IO vectors This patch increases the number of IO vectors for the asynchronous data path from 512 to 2048. Starvation of IO vectors has been reported during testing with an iperf benchmark using 64KB packet size. As there is no direct relationship between VHOST_MAX_ASYNC_VEC and BUF_VECTOR_MAX, this patch also assigns the VHOST_MAX_ASYNC_VEC value directly instead of making it a multiple of BUF_VECTOR_MAX. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	816a565bf3	vhost: merge sync and async mbuf to descriptor filling This patch merges copy_mbuf_to_desc() used by the sync path with async_mbuf_to_desc() used by the async path. Most of these complex functions are identical, so merging them will make the maintenance easier. In order not to degrade performance, the patch introduces a boolean function parameter to specify whether it is called in async context. This boolean is statically passed to this always-inlined function, so the compiler will optimize this out. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	b84e85e3b1	vhost: prepare sync for mbuf to descriptor refactoring This patch extracts the descriptors buffers filling from copy_mbuf_to_desc() into a dedicated function as a preliminary step of merging copy_mbuf_to_desc() and async_mbuf_to_desc(). Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	dbfa4c0bbd	vhost: prepare async for mbuf to descriptor refactoring This patch extracts the IO vectors filling from async_mbuf_to_desc() into a dedicated function as a preliminary step of merging copy_mbuf_to_desc() and async_mbuf_to_desc(). Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	c75987487d	vhost: simplify getting first in-flight index This patch reworks the function getting the index for the first packet in-flight. When this index turns out to be zero, let's use the simple path. Doing that avoids having to do a modulo with the virtqueue size. The patch also renames the function for better clarification, and only passes the virtqueue metadata pointer, as all the needed information is stored there. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	a3cfa8081f	vhost: simplify async enqueue completion vhost_poll_enqueue_completed() assumes some inflight packets could have been completed in a previous call but not returned to the application. But this is not the case, since check_completed_copies callback is never called with more than the current count as argument. In other words, async->last_pkts_n is always 0. Removing it greatly simplifies the function. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	2cbe826e26	vhost: remove notion of async descriptor Now that the IO vector iterators have been simplified, the rte_vhost_async_desc struct only contains a pointer to the iterator array stored in the async metadata. This patch removes it, and passes the iterators array pointer directly to the transfer_data callback. Doing that, we avoid declaring the descriptor array in the stack, and also avoid the cost of filling it. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	d5d25cfd85	vhost: improve IO vector logic IO vectors and their iterators arrays were part of the async metadata but not their indexes. In order to make this more consistent, the patch adds the indexes to the async metadata. Doing that, we can avoid triggering DMA transfer within the loop as IO vector index overflow is now prevented in the async_mbuf_to_desc() function. Note that previous detection mechanism was broken since the overflow already happened when detected, so OOB memory access would already have happened. With these changes done, virtio_dev_rx_async_submit_split() and virtio_dev_rx_async_submit_packed() can be further simplified. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00

7517 Commits