numam-dpdk

Author	SHA1	Message	Date
Elena Agostini	e818c4e2bf	gpudev: add memory API In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. Such workload distribution can be achieved by sharing some memory. As a first step, the features are focused on memory management. A function allows to allocate memory inside the device, or in the main (CPU) memory while making it visible for the device. This memory may be used to save packets or for synchronization data. The next step should focus on GPU processing task control. Signed-off-by: Elena Agostini <eagostini@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:53 +01:00
Thomas Monjalon	a9af048aba	gpudev: support multi-process The device data shared between processes are moved in a struct allocated in a shared memory (a new memzone for all GPUs). The main struct rte_gpu references the shared memory via the pointer mpshared. The API function rte_gpu_attach() is added to attach a device from the secondary process. The function rte_gpu_allocate() can be used only by primary process. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:53 +01:00
Thomas Monjalon	82e5f6b658	gpudev: add child device representing a device context The computing device may operate in some isolated contexts. Memory and processing are isolated in a silo represented by a child device. The context is provided as an opaque by the caller of rte_gpu_add_child(). Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Thomas Monjalon	18cb075631	gpudev: add event notification Callback functions may be registered for a device event. Callback management is per-process and not thread-safe. The events RTE_GPU_EVENT_NEW and RTE_GPU_EVENT_DEL are notified respectively after creation and before removal of a device, as part of the library functions. Some future events may be emitted from drivers. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Elena Agostini	8b8036a66e	gpudev: introduce GPU device class library In heterogeneous computing system, processing is not only in the CPU. Some tasks can be delegated to devices working in parallel. The new library gpudev is for dealing with GPGPU computing devices from a DPDK application running on the CPU. The infrastructure is prepared to welcome drivers in drivers/gpu/. Signed-off-by: Elena Agostini <eagostini@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-11-08 17:20:52 +01:00
Anatoly Burakov	4fd15c6af0	vfio: set errno on unsupported OS Currently, when code is running on FreeBSD or Windows, there is no way to distinguish between a genuine error and a "VFIO is unsupported" error. Fix the dummy implementations to also set the rte_errno flag. Fixes: `279b581c89` ("vfio: expose functions") Fixes: `c564a2a200` ("vfio: expose clear group function for internal usages") Fixes: `964b2f3bfb` ("vfio: export some internal functions") Fixes: `ea2dc10668` ("vfio: add multi container support") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-11-08 16:45:28 +01:00
Anatoly Burakov	da6e4cdca1	vfio: fix FreeBSD documentation On FreeBSD, `rte_vfio_is_enabled()` and `rte_vfio_noiommu_is_enabled()` API calls will not return error, and will instead return 0. This is intentional, because the caller of this API does not care whether VFIO is supported at all, and will instead be interested in whether VFIO is enabled or not. However, the doxygen comments for these functions state that they will return an error on FreeBSD, which is incorrect. Fix the doxygen comment to call out the fact that these functions are only relevant on Linux, but remove the reference to returning errors. Fixes: `279b581c89` ("vfio: expose functions") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-08 16:42:55 +01:00
Anatoly Burakov	bf8b792f3b	vfio: fix FreeBSD clear group stub On FreeBSD, `rte_vfio_clear_group()` was returning 0 even though this function is not valid for FreeBSD, and is called out to return error in doxygen comments. Fix the return value to match documentation. Fixes: `c564a2a200` ("vfio: expose clear group function for internal usages") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-11-08 16:42:44 +01:00
Anatoly Burakov	84e03bde1c	vfio: drop fallback Linux implementation Currently, VFIO support for Linux is compiled unconditionally, and supported kernel versions start with 4.4, so VFIO is assumed to always be enabled. There is no way of disabling VFIO support at compile time anyway, so just drop the "VFIO not available" fallback code altogether. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-08 16:27:15 +01:00
Chengwen Feng	b1f4933ef3	kni: check error code of allmulticast mode switch Some drivers may return an error code when switching allmulticast mode, so it is necessary to check the return code. Fixes: `b34801d1aa` ("kni: support allmulticast mode set") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-08 11:56:13 +01:00
Ali Alnubani	948be25cdb	sched: fix debug build Compare pkt_len to 0 instead of NULL to avoid the following build failure with debug mode enabled: ../lib/sched/rte_pie.h: In function 'rte_pie_enqueue_empty': ../lib/sched/rte_pie.h:125:21: error: comparison between pointer and integer [-Werror] RTE_ASSERT(pkt_len != NULL); Bugzilla ID: 878 Fixes: `44c730b0e3` ("sched: add PIE based congestion management") Signed-off-by: Ali Alnubani <alialnu@nvidia.com>	2021-11-07 18:52:51 +01:00
Ferruh Yigit	b7ade5d31a	ethdev: fix crash on owner delete 'eth_dev->data' can be null before ethdev allocated. The API walks through all eth devices, at least for some data can be null. Adding 'eth_dev->data' null check before accessing it. Fixes: `33c73aae32` ("ethdev: allow ownership operations on unused port") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-05 15:35:57 +01:00
Mattias Rönnblom	578402f2d5	eventdev: support device maintenance in adapters Introduce support for event devices requiring calls to rte_event_maintain() in the Ethernet RX, Timer and Crypto Eventdev adapters. Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Tested-by: Richard Eklycke <richard.eklycke@ericsson.com>	2021-11-04 13:28:09 +01:00
Mattias Rönnblom	54f17843a8	eventdev: add port maintenance API Extend Eventdev API to allow for event devices which require various forms of internal processing to happen, even when events are not enqueued to or dequeued from a port. Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Richard Eklycke <richard.eklycke@ericsson.com> Tested-by: Liron Himi <lironh@marvell.com>	2021-11-04 13:27:54 +01:00
Naga Harish K S V	9e58318531	eventdev/eth_rx: support telemetry Added telemetry support for rxa_queue_stats and rxa_queue_stats_reset to get and reset rx queue stats respectively. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-11-04 08:41:25 +01:00
Naga Harish K S V	995b150c1a	eventdev/eth_rx: add queue stats API This patch adds new api ``rte_event_eth_rx_adapter_queue_stats_get`` to retrieve queue stats. The queue stats are in the format ``struct rte_event_eth_rx_adapter_queue_stats``. For resetting the queue stats, ``rte_event_eth_rx_adapter_queue_stats_reset`` api is added. The adapter stats_get and stats_reset apis are also updated to handle queue level event buffer use case. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-11-04 08:41:25 +01:00
Gowrishankar Muthukrishnan	259ca6d161	security: add telemetry endpoint for capabilities Add telemetry endpoint for cryptodev security capabilities. Details of endpoints added in documentation. Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:46:27 +01:00
Raja Zidane	9ad776442d	crypto/mlx5: support 1MB data-unit Add 1MB data-unit length to the capability's bitmap. Handle 1MB data-unit length in the mlx5 session create operation, and expose its capability in the mlx5 capabilities. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-04 19:46:27 +01:00
Radu Nicolau	ff4a29d167	ipsec: support TSO Add support for transmit segmentation offload to inline crypto processing mode. This offload is not supported by other offload modes, as at a minimum it requires inline crypto for IPsec to be supported on the network interface. Signed-off-by: Declan Doherty <declan.doherty@intel.com> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com> Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:46:27 +01:00
Nicolas Chautru	ba2469cddf	bbdev: promote API as stable This promotes the bbdev interface to stable. Overdue for some time as bbdev interface has been stable. Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Gowrishankar Muthukrishnan	1c559ee846	cryptodev: add telemetry endpoint for capabilities Add telemetry endpoint for getting cryptodev capabilities. Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Rebecca Troy	d3d98f5ce9	cryptodev: support telemetry The cryptodev library now registers commands with telemetry, and implements the corresponding callback functions. These commands allow a list of cryptodevs to be queried, as well as info and stats for the corresponding cryptodev. An example usage can be seen below: Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2 {"version": "DPDK 21.11.0-rc0", "pid": 1135019, "max_output_len": 16384} --> / {"/": ["/", "/cryptodev/info", "/cryptodev/list", "/cryptodev/stats", ...]} --> /cryptodev/list {"/cryptodev/list": [0,1,2,3]} --> /cryptodev/info,0 {"/cryptodev/info": {"device_name": "0000:1c:01.0_qat_sym", \ "max_nb_queue_pairs": 2}} --> /cryptodev/stats,0 {"/cryptodev/stats": {"enqueued_count": 0, "dequeued_count": 0, \ "enqueue_err_count": 0, "dequeue_err_count": 0}} Signed-off-by: Rebecca Troy <rebecca.troy@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-11-04 19:43:14 +01:00
Gregory Etelson	de39080bd6	ethdev: fix variable length flow elements support RTE flow API defines two flow elements types - common and PMD private. Common RTE flow types are defined in rte_flow.h while PMD private types exists inside specific PMD only. Application can create a flow rule with PMD private items or actions. RTE flow API restricts private PMD types to negative values. Current implementation tried to use negative PMD private item type value as index in the rte_flow_desc_item[] array. The patch allows access to rte_flow_desc_item[] and rte_flow_desc_action[] arrays to non-private PMD types only. Fixes: `6cf7204733` ("ethdev: support flow elements with variable length") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-04 13:23:29 +01:00
Thomas Monjalon	285725d93b	ethdev: promote device removal check function as stable The function rte_eth_dev_is_removed() was introduced in DPDK 18.02, and is integrated in error checks of ethdev library. It is promoted as stable ABI. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-11-04 11:11:28 +01:00
Maxime Coquelin	ab4bb42406	vhost: rename driver callbacks struct As previously announced, this patch renames struct vhost_device_ops to struct rte_vhost_device_ops. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-03 11:59:27 +01:00
Maxime Coquelin	94c16e89d7	vhost: mark vDPA driver API as internal This patch marks the vDPA driver APIs as internal and rename the corresponding header file to vdpa_driver.h. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-11-03 09:11:34 +01:00
Dmitry Kozlyuk	2c9cd45de7	ethdev: add capability to keep shared objects on restart rte_flow_action_handle_create() did not mention what happens with an indirect action when a device is stopped and started again. It is natural for some indirect actions, like counter, to be persistent. Keeping others at least saves application time and complexity. However, not all PMDs can support it, or the support may be limited by particular action kinds, that is, combinations of action type and the value of the transfer bit in its configuration. Add a device capability to indicate if at least some indirect actions are kept across the above sequence. Without this capability the behavior is still unspecified, and application is required to destroy the indirect actions before stopping the device. In the future, indirect actions may not be the only type of objects shared between flow rules. The capability bit intends to cover all possible types of such objects, hence its name. Declare that the application can test for the persistence of a particular indirect action kind by attempting to create an indirect action of that kind when the device is stopped and checking for the specific error type. This is logical because if the PMD can create an indirect action when the device is not started and use it after the start happens, it is natural that it can move its internal flow shared object to the same state when the device is stopped and restore the state when the device is started. Indirect action persistence across reconfigurations is not required. In case a PMD cannot keep the indirect actions across reconfiguration, it is allowed just to report an error. Application must then flush the indirect actions before attempting it. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-02 18:59:17 +01:00
Dmitry Kozlyuk	1d5a3d68c0	ethdev: add capability to keep flow rules on restart Previously, it was not specified what happens to the flow rules when the device is stopped, possibly reconfigured, then started. If flow rules were kept, it could be convenient for application developers, because they wouldn't need to save and restore them. However, due to the number of flows and possible creation rate it is impractical to save all flow rules in DPDK layer. This means that flow rules persistence really depends on whether PMD and HW can implement it efficiently. It can also be limited by the rule item and action types, and its attributes transfer bit (a combination of an item/action type and a value of the transfer bit is called a rule feature). Add a device capability bit for PMDs that can keep at least some of the flow rules across restart. Without this capability behavior is still unspecified and it is declared that the application must flush the rules before stopping the device. Allow the application to test for persistence of rules using a particular feature by attempting to create a flow rule using that feature when the device is stopped and checking for the specific error. This is logical because if the PMD can create the flow rule when the device is not started and use it after the start happens, it is natural that it can move its internal flow rule object to the same state when the device is stopped and restore the state when the device is started. Rule persistence across reconfigurations is not required, because tracking all the rules and configuration-dependent resources they use may be infeasible. In case a PMD cannot keep the rules across reconfiguration, it is allowed just to report an error. Application must then flush the rules before attempting it. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-11-02 18:59:17 +01:00
Maxime Coquelin	541053891a	vhost: increase number of async IO vectors This patch increases the number of IO vectors for the asynchronous data path from 512 to 2048. It has been reported during testing the starvation of IO vectors during iperf benchmark with 64KB packet size. As there are no direct relationship between VHOST_MAX_ASYNC_VEC and BUF_VECTOR_MAX, this patch also assign VHOST_MAX_ASYNC_VEC value directly instead of being a multiple of BUF_VECTOR_MAX. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	816a565bf3	vhost: merge sync and async mbuf to descriptor filling This patch merges copy_mbuf_to_desc() used by the sync path with async_mbuf_to_desc() used by the async path. Most of these complex functions are identical, so merging them will make the maintenance easier. In order not to degrade performance, the patch introduces a boolean function parameter to specify whether it is called in async context. This boolean is statically passed to this always-inlined function, so the compiler will optimize this out. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	b84e85e3b1	vhost: prepare sync for mbuf to descriptor refactoring This patch extracts the descriptors buffers filling from copy_mbuf_to_desc() into a dedicated function as a preliminary step of merging copy_mbuf_to_desc() and async_mbuf_to_desc(). Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	dbfa4c0bbd	vhost: prepare async for mbuf to descriptor refactoring This patch extracts the IO vectors filling from async_mbuf_to_desc() into a dedicated function as a preliminary step of merging copy_mbuf_to_desc() and async_mbuf_to_desc(). Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	c75987487d	vhost: simplify getting first in-flight index This patch reworks the function getting the index for the first packet in-flight. When this index turns out to be zero, let's use the simple path. Doing that avoids having to do a modulo with the virtqueue size. The patch also renames the function for better clarification, and only passes the virtqueue metadata pointer, as all the needed information is stored there. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	a3cfa8081f	vhost: simplify async enqueue completion vhost_poll_enqueue_completed() assumes some inflight packets could have been completed in a previous call but not returned to the application. But this is not the case, since check_completed_copies callback is never called with more than the current count as argument. In other words, async->last_pkts_n is always 0. Removing it greatly simplifies the function. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	2cbe826e26	vhost: remove notion of async descriptor Now that IO vectors iterator have been simplified, the rte_vhost_async_desc struct only contains a pointer on the iterator array stored in the async metadata. This patch removes it, and pass directly the iterators array pointer to the transfer_data callback. Doing that, we avoid declaring the descriptor array in the stack, and also avoid the cost of filling it. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	d5d25cfd85	vhost: improve IO vector logic IO vectors and their iterators arrays were part of the async metadata but not their indexes. In order to make this more consistent, the patch adds the indexes to the async metadata. Doing that, we can avoid triggering DMA transfer within the loop as IO vector index overflow is now prevented in the async_mbuf_to_desc() function. Note that previous detection mechanism was broken since the overflow already happened when detected, so OOB memory access would already have happened. With these changes done, virtio_dev_rx_async_submit_split() and virtio_dev_rx_async_submit_packed() can be further simplified. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	0af9f99221	vhost: remove useless fields in async iterator struct Offset and count fields are unused and so can be removed. The offset field was actually in the Vhost example, but in a way that does not make sense. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	6171bfbfb2	vhost: introduce specific iovec structure This patch introduces rte_vhost_iovec struct that contains both source and destination addresses since we always have a 1:1 mapping between source and destination. While using the standard iovec struct might have seemed better, having to duplicate IO vectors and its iterators is memory inefficient and make the implementation more complex. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	8b3fc5a213	vhost: remove async batch threshold Reaching the async batch threshold was one of the condition to trigger the DMA transfer. However, this condition was never met since the threshold value is 32, same as the MAX_PKT_BURST value. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	3fe629547e	vhost: simplify async IO vectors iterators This patch splits the iterator arrays in two, one for source and one for destination. The goal is make the code easier to understand. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	97064162d4	vhost: simplify async IO vectors IO vectors implementation is unnecessarily complex, mixing source and destinations vectors in the same array. This patch declares two arrays, one for the source and one for the destination. It also gets rid of seg_awaits variable in both packed and split implementation, which is the same as iovec_idx. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	5f89c5e1e9	vhost: hide in-flight async structure This patch moves async_inflight_info struct to internal header since it should not be part of the API. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Maxime Coquelin	ee8024b3d4	vhost: move async data in dedicated structure This patch moves async-related metadata from vhost_virtqueue to a dedicated struct. It makes it clear which fields are async related, and also saves some memory when async feature is not in use. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-10-29 12:32:30 +02:00
Miao Li	c6e305141a	power: support missing Rx queue info Since some vdevs like virtio and vhost do not support rxq_info_get and queue state inquiry, the error return value -ENOTSUP need to be ignored when queue_stopped cannot get rx queue information and rx queue state. This patch changes the return value of queue_stopped when rte_eth_rx_queue_info_get return -ENOTSUP to support vdevs which cannot provide rx queue information and rx queue state enable power management. Fixes: `209fd58545` ("power: make ethdev power management thread unsafe") Cc: stable@dpdk.org Signed-off-by: Miao Li <miao.li@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-10-29 12:32:29 +02:00
Miao Li	34fd4373ce	vhost: add power monitor API This commit defines rte_vhost_power_monitor_cond which is used to pass some information to vhost driver. The information is including the address to monitor, the expected value, the mask to extract value read from 'addr', the value size of monitor address, the match flag used to distinguish the value used to match something or not match something. Vhost driver can use these information to fill rte_power_monitor_cond. Signed-off-by: Miao Li <miao.li@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-10-29 12:32:29 +02:00
Xuan Ding	5fd6e93b7e	vhost: remove async DMA map status Async DMA map status flag was added to prevent the unnecessary unmap when DMA devices bound to kernel driver. This brings maintenance cost for a lot of code. This patch removes the DMA map status by using rte_errno instead. This patch relies on the following patch to fix a partial unmap check in vfio unmapping API. [1] https://www.mail-archive.com/dev@dpdk.org/msg226464.html Signed-off-by: Xuan Ding <xuan.ding@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-10-29 12:32:22 +02:00
David Marchand	e7c727c307	net: fix build with sparse on L2TPv2 bitfields An external project that wants to do additional checks on fields endianness can remap rte_beXX types to instrumented types and use sparse. The current code breaks OVS build with sparse: ../../lib/ofp-packet.c: note: in included file (through .../ovs/dpdk-dir/build/include/rte_flow.h, ../../lib/netdev-dpdk.h, ../../lib/dp-packet.h): .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:92:37: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:93:37: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:94:40: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:95:37: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:96:40: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:97:37: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:98:37: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:99:40: error: invalid bitfield specifier for type restricted ovs_be16. .../ovs/dpdk-dir/build/include/rte_l2tpv2.h:100:39: error: invalid bitfield specifier for type restricted ovs_be16. make[3]: *** [lib/ofp-packet.lo] Error 1 Use simple uint16_t types for bitfields in L2TPv2 struct. Fixes: `3a929df1f2` ("ethdev: support L2TPv2 and PPP procotol") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-10-28 20:28:01 +02:00
David Marchand	41f2f05574	ethdev: warn once when using port not ready Warning continuously is a pain when developing or if a unit test is/gets broken. It could also be a problem if application behaves badly only in some corner cases and a DoS results of those logs being continuously displayed. Let's warn once per port and per rx/tx. Getting such a log is scary, but let's make it more eye catching by dumping a backtrace with it. Tested by introducing a bug in testpmd: static int eth_dev_start_mp(uint16_t port_id) { - if (is_proc_primary()) + if (!is_proc_primary()) return rte_eth_dev_start(port_id); return 0; Then, running a basic null test: $ ./devtools/test-null.sh ... Start automatic packet forwarding io packet forwarding - ports=2 - cores=1 - streams=2 - NUMA support enabled, MP allocation mode: native Logical Core 1 (socket 0) forwards packets on 2 streams: RX P=0/Q=0 (socket 0) -> TX P=1/Q=0 (socket 0) peer=02:00:00:00:00:01 RX P=1/Q=0 (socket 0) -> TX P=0/Q=0 (socket 0) peer=02:00:00:00:00:00 lcore 0 called rx_pkt_burst for not ready port 0 8: [build/app/dpdk-testpmd() [0x59e839]] 7: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff481b69555]] 6: [build/app/dpdk-testpmd(main+0x54b) [0x662d24]] 5: [build/app/dpdk-testpmd(start_packet_forwarding+0x263) [0x65e795]] 4: [build/app/dpdk-testpmd() [0x65e1be]] 3: [build/app/dpdk-testpmd() [0x65a996]] 2: [build/app/dpdk-testpmd() [0xa6cbc7]] 1: [build/app/dpdk-testpmd(rte_dump_stack+0x27) [0xaee796]] lcore 0 called rx_pkt_burst for not ready port 1 8: [build/app/dpdk-testpmd() [0x59e839]] 7: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff481b69555]] 6: [build/app/dpdk-testpmd(main+0x54b) [0x662d24]] 5: [build/app/dpdk-testpmd(start_packet_forwarding+0x263) [0x65e795]] 4: [build/app/dpdk-testpmd() [0x65e1be]] 3: [build/app/dpdk-testpmd() [0x65a996]] 2: [build/app/dpdk-testpmd() [0xa6cbc7]] 1: [build/app/dpdk-testpmd(rte_dump_stack+0x27) [0xaee796]] io packet forwarding packets/burst=32 nb forwarding cores=1 - nb forwarding ports=2 port 0: RX queue number: 1 Tx queue number: 1 Rx offloads=0x0 Tx offloads=0x0 Fixes: `c87d435a4d` ("ethdev: copy fast-path API into separate structure") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-10-27 19:28:45 +02:00
Olivier Matz	9bffc92850	mem: fix dynamic hugepage mapping in container Since its introduction in 2018, the SIGBUS handler was never registered, and all related functions were unused. A SIGBUS can be received by the application when accessing hugepages even if mmap() was successful. This happens especially when running inside containers when there are not enough hugepages. In this case, we need to recover. A similar scheme can be found in eal_memory.c. Fixes: `582bed1e1d` ("mem: support mapping hugepages at runtime") Cc: stable@dpdk.org Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-11-05 15:28:55 +01:00
Ilyes Ben Hamouda	770d41bf33	malloc: fix allocation with unknown socket ID When using rte_malloc() from a thread which is not bound to a numa socket (the typical case is a control thread, but it can also happen on a dataplane thread if its cpu affinity is on cores attached to several sockets), the used heap is the one from numa socket 0, which may not have available memory. Fix this by selecting the first socket which has available memory. Note: malloc_get_numa_socket() is only used from one .c file, so move it there, and remove the inline keyword. Fixes: `b94580d688` ("malloc: avoid unknown socket id") Cc: stable@dpdk.org Signed-off-by: Ilyes Ben Hamouda <ilyes.ben_hamouda@6wind.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-11-05 15:28:49 +01:00
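
The socket fallback described in the malloc commit directly above (770d41bf33) could look roughly like the sketch below. It only illustrates the selection rule stated in the commit message ("selecting the first socket which has available memory") and is not the DPDK implementation: `SOCKET_UNKNOWN`, `struct heap_info` and `pick_socket()` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical marker: the calling thread is not bound to a single NUMA socket. */
#define SOCKET_UNKNOWN (-1)

/* Hypothetical per-socket heap summary; not a DPDK structure. */
struct heap_info {
	int socket_id;      /* NUMA socket this heap belongs to */
	size_t free_bytes;  /* memory currently available in this heap */
};

/*
 * Choose the socket to allocate from.
 * A thread bound to a socket keeps its own socket. Otherwise, return the
 * first socket whose heap has memory available, instead of always socket 0.
 */
static int
pick_socket(int caller_socket, const struct heap_info *heaps, size_t n)
{
	size_t i;

	assert(heaps != NULL && n > 0);

	if (caller_socket != SOCKET_UNKNOWN)
		return caller_socket;

	for (i = 0; i < n; i++) {
		if (heaps[i].free_bytes > 0)
			return heaps[i].socket_id;
	}

	/* No heap has free memory: fall back to the first one listed. */
	return heaps[0].socket_id;
}
```

With heaps `{0, 0}` and `{1, 4096}`, an unbound caller gets socket 1 here, whereas the behavior the commit message describes as broken would have picked socket 0, which has no memory.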


7503 Commits