numam-dpdk

Author	SHA1	Message	Date
David Marchand	1a44f67a03	vhost: reduce memory footprint when IOMMU is disabled If an application does not request IOMMU support, we can avoid allocating an IOMMU pool. This saves 112kB (IOTLB_CACHE_SIZE * sizeof(struct vhost_iotlb_entry)) per vq. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-09-29 10:13:22 +02:00
Andy Pei	64e155f01a	vhost: use dedicated variable for message result Currently in function vhost_user_msg_handler, variable ret is used to store both vhost msg result code and function call return value. After this patch, variable ret is used only to store function call return value, a new dedicated variable msg_result is used to store vhost msg result. This can improve readability. Signed-off-by: Andy Pei <andy.pei@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-09-29 10:13:22 +02:00
David Marchand	5b546fa718	vhost: remove deprecated helpers Those helpers have been marked as deprecated for a long time and have documented equivalent helpers. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-09-29 10:13:22 +02:00
Ali Alnubani	9891ded2ea	net: fix build with -Wpedantic Adds the __extension__ attribute to resolve the following warning with gcc 4.8.5 on CentOS 7: lib/net/rte_macsec.h:38:2: error: type of bit-field 'short_length' is a GCC extension [-Werror=pedantic] Fixes: `0aaf097585` ("ethdev: add MACsec flow item") Signed-off-by: Ali Alnubani <alialnu@nvidia.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2022-09-29 09:20:12 +02:00
Akhil Goyal	bbbe6c596b	security: support MACsec Added support for MACsec in rte_security for offloading MACsec protocol operations to an inline NIC device or a crypto device. To support MACsec, it is not enough to create one security session and send it with the packet for processing. The MACsec specification defines three different entities: SECY entity, SC (secure channel) and SA (security association). The same SA can be used by multiple SCs, and similarly many SECYs can share the same SCs. Hence, to support these many-to-one relationships between the entities, two new APIs are created: rte_security_macsec_sc_create and rte_security_macsec_sa_create. The APIs would be called in this order: rte_security_macsec_sa_create, rte_security_macsec_sc_create, rte_security_session_create (for SECY). In case of inline protocol processing, an rte_flow can be created with the rte_security action. A new flow item will be added for the MACsec header. New APIs are also created for getting SC and SA stats. Signed-off-by: Akhil Goyal <gakhil@marvell.com>	2022-09-28 21:31:14 +02:00
Akhil Goyal	0aaf097585	ethdev: add MACsec flow item A new flow item is defined for MACsec flows which can be offloaded to an inline device. If the flow matches with MACsec header, device will process as per the security session created using rte_security APIs. If an error comes while MACsec processing in HW, PMD will notify with the events defined in this patch. Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-28 21:31:02 +02:00
Akhil Goyal	32ae9939d9	net: add MACsec header Added MACsec protocol header to be used for supporting MACsec protocol offload in hardware or directly in the application. Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-28 21:17:21 +02:00
Volodymyr Fialko	4f3c773825	eventdev: fix name of Rx conf type in documentation Rename the configuration structure type to match its definition. Fixes: `dcc806c263` ("eventdev: add eth Rx adapter API") Cc: stable@dpdk.org Signed-off-by: Volodymyr Fialko <vfialko@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-09-28 13:02:55 +02:00
Naga Harish K S V	3c3328aef3	eventdev/eth_tx: add queue start/stop API Add support to start or stop a particular queue that is associated with the adapter. Start function enables the Tx adapter to start enqueueing packets to the Tx queue. Stop function stops the Tx adapter from enqueueing any packets to the Tx queue. The stop API also frees any packets that may have been buffered for this queue. All in-flight packets destined to the queue are freed by the adapter runtime until the queue is started again. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-09-28 05:47:38 +02:00
Pavan Nikhilesh	0fbb55efa5	eventdev: add element offset to event vector Add the `rte_event_vector::elem_offset:12` bit field to the event vector structure; the bits are taken from `rte_event_vector::rsvd:15`. The element offset defines the offset into the vector array at which valid elements start. The valid elements count will be equal to `rte_event_vector::nb_elem`. Update Rx/Tx adapter SW implementation to use elem_offset. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>	2022-09-27 15:41:03 +02:00
Andrew Boyer	19d5f52240	eal/linux: restore some header includes for HPET These are necessary to build when HPET is enabled. Fixes: `2ff3976e67` ("eal: remove unneeded header includes") Cc: stable@dpdk.org Signed-off-by: Andrew Boyer <andrew.boyer@amd.com> Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>	2022-09-28 13:18:29 +02:00
Vamsi Attunuru	d1ce79d14b	ethdev: add IPsec SA expiry event subtypes Patch adds new event subtypes for notifying expiry events upon reaching IPsec SA soft packet expiry and hard packet/byte expiry limits. Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2022-09-27 20:31:16 +02:00
Volodymyr Fialko	6c55e2c3e6	cryptodev: fix missing SHA3 algorithm strings SHA3 family algorithms were missing in the array of algorithm strings. Fixes: `1df800f895` ("crypto/ccp: support SHA3 family") Cc: stable@dpdk.org Signed-off-by: Volodymyr Fialko <vfialko@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2022-09-27 18:20:35 +02:00
Nithin Dabilpuram	338eb15178	mbuf: clarify meta data needed for inline IPsec Clarify mbuf meta data needed for Outbound Inline IPsec processing. Application needs to provide mbuf.l3_len and L3 type in mbuf.ol_flags so that, just as tunnel mode uses mbuf.l2_len, transport mode can use l3_len and l3_type to perform proper transport mode IPsec processing. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2022-09-27 17:18:45 +02:00
Amit Prakash Shukla	2be383423e	ipsec: fix build with GCC 12 GCC 12 raises the following warning: In function '_mm_loadu_si128', inlined from 'rte_mov16' at ../lib/eal/x86/include/rte_memcpy.h:507:9, inlined from 'rte_mov128' at ../lib/eal/x86/include/rte_memcpy.h:549:2, inlined from 'rte_memcpy_generic' at ../lib/eal/x86/include/rte_memcpy.h:732:4, inlined from 'rte_memcpy' at ../lib/eal/x86/include/rte_memcpy.h:882:10, inlined from 'outb_tun_pkt_prepare' at ../lib/ipsec/esp_outb.c:224:2: /usr/lib/gcc/x86_64-pc-linux-gnu/12.1.1/include/emmintrin.h:703:10: error: array subscript '__m128i_u[15]' is partly outside array bounds of 'const uint8_t[255]' {aka 'const unsigned char[255]'} [-Werror=array-bounds] 703 | return *__P; | ^~~~ In file included from ../lib/ipsec/esp_outb.c:17: ../lib/ipsec/pad.h: In function 'outb_tun_pkt_prepare': ../lib/ipsec/pad.h:10:22: note: at offset 240 into object 'esp_pad_bytes' of size 255 10 | static const uint8_t esp_pad_bytes[IPSEC_MAX_PAD_SIZE] = { | ^~~~~~~~~~~~~ This patch restricts the copy to the minimum size. Bugzilla ID: 1060 Fixes: `6015e6a133` ("ipsec: move inbound and outbound code") Cc: stable@dpdk.org Signed-off-by: Amit Prakash Shukla <amitprakashs@marvell.com> Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>	2022-09-27 17:16:52 +02:00
Shijith Thotton	2f279a1b6e	eventdev: add weight and affinity attributes to queue conf Added new fields to represent event queue weight and affinity in rte_event_queue_conf structure. Internal op to get queue attribute is removed as it is no longer needed. Updated driver to use the new field. Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2022-09-26 15:34:57 +02:00
Naga Harish K S V	3d9d8adf8c	eventdev/timer: support periodic event timer Add support to configure and use periodic event timers in software timer adapter. The structure ``rte_event_timer_adapter_stats`` is extended by adding a new field, ``evtim_drop_count``. This stat represents the number of times an event_timer expiry event is dropped by the event timer adapter. Updated the software eventdev pmd timer_adapter_caps_get callback function to report the support of periodic event timer capability. Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>	2022-09-26 15:33:48 +02:00
Ganapati Kundapura	f089d62895	eventdev/eth_rx: remove resetting sub-event type In rte_event_eth_rx_adapter_queue_add(), sub_event_type of rte_event structure is reset which can be used by the application to determine the processing function. Removed resetting of rte_event::sub_event_type Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-09-26 15:33:48 +02:00
Naga Harish K S V	fbb9f7e215	eventdev/eth_tx: fix adapter stop The adapter_stop function stops the adapter service using the rte_service_runstate_set() API and then waits in an infinite loop until the rte_service_may_be_active() API returns the stopped state. This results in a hang if the application calls rte_service_lcore_stop() before adapter stop. Remove the state check after setting the service state, which avoids the hang. This also brings Tx adapter stop in line with the remaining adapters. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-09-26 15:33:47 +02:00
Naga Harish K S V	2ab9869c01	eventdev/eth_tx: add spinlock for adapter start/stop Add spinlock protection for the Tx adapter stop and start APIs. Add a null check for the Tx adapter service pointer in the adapter start/stop APIs. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-09-26 15:33:47 +02:00
Ganapati Kundapura	b2963cbd5f	eventdev/eth_tx: add adapter instance get API Added rte_event_eth_tx_adapter_instance_get() to get the adapter instance id for specified ethernet device id and tx queue index. Added testcase for rte_event_eth_tx_adapter_instance_get(). Added rte_event_eth_tx_adapter_instance_get() details in prog_guide/event_ethernet_tx_adapter.rst Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Reviewed-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2022-09-26 15:33:45 +02:00
Ganapati Kundapura	a1793ee8ab	eventdev/eth_rx: add adapter instance get API Added rte_event_eth_rx_adapter_instance_get() to get adapter instance id for specified ethernet device id and rx queue index. Added telemetry handler for rte_event_eth_rx_adapter_instance_get(). Added test case for rte_event_eth_rx_adapter_instance_get() Added rte_event_eth_rx_adapter_instance_get() details in prog_guide/event_ethernet_rx_adapter.rst Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com> Reviewed-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2022-09-26 15:33:44 +02:00
Ivan Malov	235558fe94	ethdev: remove deprecated flow action physical port Such deprecation was commenced in DPDK 21.11. Since then, no parties have objected. Remove. The patch breaks ABI. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:51 +02:00
Ivan Malov	5e3779b7ab	ethdev: remove deprecated flow item physical port Such deprecation was commenced in DPDK 21.11. Since then, no parties have objected. Remove. The patch breaks ABI. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:51 +02:00
Ivan Malov	5c45fde3e9	ethdev: remove deprecated flow item VF Such deprecation was commenced in DPDK 21.11. Since then, no parties have objected. Remove. The patch breaks ABI. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:51 +02:00
Ivan Malov	13f8de927a	ethdev: remove deprecated flow item PF Such deprecation was commenced in DPDK 21.11. Since then, no parties have objected. Remove. The patch breaks ABI. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:51 +02:00
Ivan Malov	50a239fac6	ethdev: promote flow transfer proxy API rte_flow_pick_transfer_proxy() was first added to DPDK 21.11. Since then, no one has requested any fixes. At the same time, the API is required by series [1] in OvS for the new release. [1] http://patchwork.ozlabs.org/project/openvswitch/list/?series=310415 Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:50 +02:00
Ivan Malov	41b7de3534	ethdev: promote port ID items and actions as stable The following set of primitives has been introduced in 21.11: - RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR - RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT - RTE_FLOW_ITEM_TYPE_PORT_REPRESENTOR - RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT Since then, no one has requested any fixes. At the same time, the set is required by series [1] in OvS for the new release. [1] http://patchwork.ozlabs.org/project/openvswitch/list/?series=310415 Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:33 +02:00
Ivan Malov	fc0a92937a	ethdev: promote Rx metadata negotiation API rte_eth_rx_metadata_negotiate() was introduced in DPDK 21.11. Since then, no one has requested any fixes. At the same time, the API is required by series [1] in OvS for the new release. [1] http://patchwork.ozlabs.org/project/openvswitch/list/?series=310415 Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2022-09-27 10:25:48 +02:00
Andrew Rybchenko	b9b509246d	mbuf: remove deprecated offload flags Remove deprecated ``PKT_*`` flags. Use corresponding flags with ``RTE_MBUF_F_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-09-26 14:17:50 +02:00
Andrew Rybchenko	f51abd4397	mempool: remove deprecated unused defines MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX defines are unused since xmem API removal. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-09-26 14:05:13 +02:00
Andrew Rybchenko	6efe9ff22d	mempool: remove deprecated macro to register mempool driver Replacement RTE_MEMPOOL_REGISTER_OPS() should be used instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-09-26 14:05:06 +02:00
Andrew Rybchenko	b0831090cd	mempool: remove deprecated macro to calculate header size MEMPOOL_HEADER_SIZE() is removed. The replacement with RTE_ prefix is internal only since it is implementation details which are not required in applications. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-09-26 14:04:51 +02:00
Bruce Richardson	2607c9cbeb	telemetry: make help command more helpful The /help telemetry command prints out the help text for the given command passed in as parameter. However, entering /help without any parameters does not give any useful information as to the fact that you need to pass in a command to get help on. Update the command so it prints its own help text when called without any parameters. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:38 +02:00
Bruce Richardson	0f3d92f373	telemetry: eliminate duplicate code for json output When preparing the json response to a telemetry socket query, the code for prefixing the command name, and appending the final "}" on the end of the response was duplicated for multiple reply types. Taking this code out of the switch statement reduces the duplication and makes the code more maintainable. For completeness of testing, add in a test case to validate the "null" response type - the only leg of the switch statement not already covered by an existing test case in the telemetry_data tests. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:38 +02:00
Bruce Richardson	b64b2a6ce3	telemetry: limit command characters Limit the telemetry command characters to the minimum set needed for current implementations. This prevents issues with invalid json characters needing to be escaped on replies. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:37 +02:00
Bruce Richardson	6918ef8eeb	telemetry: add escaping of strings in dicts When strings are added to a dict variable, we need to properly escape the invalid json characters in the strings. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:11 +02:00
Bruce Richardson	14d84fccb2	telemetry: add escaping of strings in arrays When strings are added to an array variable, we need to properly escape the invalid json characters in the strings. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:11 +02:00
Bruce Richardson	babc521444	telemetry: fix escaping of invalid json characters For string values returned from telemetry, escape any values that cannot normally appear in a json string. According to the json spec[1], the characters that need to be handled are control chars (char value < 0x20) and '"' and '\' characters. To handle this, we replace the snprintf call with a separate string copying and encapsulation routine which checks each character as it copies it to the final array. [1] https://www.rfc-editor.org/rfc/rfc8259.txt Bugzilla ID: 1037 Fixes: `6dd571fd07` ("telemetry: introduce new functionality") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:11 +02:00
Bruce Richardson	2537fb0c5f	telemetry: limit characters allowed in dictionary names To save issues with encoding the names of values in dicts, we limit the allowed names to a subset of character values. This list of allowed characters can be expanded as necessary in future. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-26 13:49:10 +02:00
Fidaullah Noonari	f92b9ebed0	malloc: fix storage size for some allocations The amount of memory to allocate from the system for heap expansion was calculated in a way that may yield one page more than needed. This could hit the allocation limit from the system or EAL. The allocation would fail despite enough memory being available. In response to mail: https://inbox.dpdk.org/dev/CAEYuUWCnRZNwxiOHEeTHw0Gy9aFJRLZtvAG9g=smuUvUEMcFXg@mail.gmail.com/ A reproducer has been provided by Dmitry, see: https://inbox.dpdk.org/dev/20220922015212.03bfde66@sovereign/ Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2022-09-26 11:40:20 +02:00
Andrew Rybchenko	5007ac1318	ethdev: remove deprecated Flow Director configuration Remove deprecated fdir_conf from device configuration. Assume that mode is equal to RTE_FDIR_MODE_NONE. Add an internal Flow Director configuration copy in ixgbe and txgbe device private data since flow API support requires it. Initialize mode to the first flow rule mode on the rule validation or creation. Since Flow Director configuration data types are still used by some drivers internally, move them from public API to ethdev driver internal API. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Dongdong Liu <liudongdong3@huawei.com>	2022-08-31 15:24:23 +02:00
Andrew Rybchenko	36eccf7b4a	ethdev: remove deprecated metadata dynamic fields defines Remove deprecated ``PKT_{R,T}X_DYNF_METADATA`` defines. Use corresponding defines ``RTE_MBUF_DYNFLAG_{R,T}X_METADATA`` instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:57:19 +02:00
Andrew Rybchenko	16acf7b1fc	ethdev: remove deprecated maximum Rx MAC addresses define Remove deprecated ``ETH_NUM_RECEIVE_MAC_ADDR`` define. Use corresponding define with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:57:14 +02:00
Andrew Rybchenko	ab8701199e	ethdev: remove deprecated VLAN offload configuration defines Remove deprecated ``ETH_VLAN_*`` and ``ETH_QINQ_*`` defines. Use corresponding defines with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:57:10 +02:00
Andrew Rybchenko	25def33d71	ethdev: remove deprecated flow control defines Remove deprecated ``RTE_FC_*`` defines. Use corresponding defines with ``RTE_ETH_FC_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:57:06 +02:00
Andrew Rybchenko	25b77b6089	ethdev: remove deprecated tunnel type defines Remove deprecated ``RTE_TUNNEL_*`` defines. Use corresponding defines with ``RTE_ETH_TUNNEL_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:57:02 +02:00
Andrew Rybchenko	935b9a43b3	ethdev: remove deprecated DCB and VMDq defines Remove deprecated ``ETH_DCB_*``, ``ETH_VMDQ_*``, ``ETH_*_TCS``, ``ETH_*_POOLS`` and ``ETH_MAX_VMDQ_POOL`` defines. Use corresponding defines with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:56:58 +02:00
Andrew Rybchenko	3be1c23343	ethdev: remove deprecated Rx and Tx offload defines Remove deprecated ``DEV_RX_OFFLOAD_*`` and ``DEV_TX_OFFLOAD_*`` defines. Use corresponding defines with ``RTE_ETH_RX_OFFLOAD_`` and ``RTE_ETH_TX_OFFLOAD_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:56:53 +02:00
Andrew Rybchenko	683f93efb8	ethdev: remove deprecated defines for RSS configuration Remove deprecated ``ETH_RSS_*`` defines used for hash function and RETA size specification. Use corresponding defines with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:56:49 +02:00
Andrew Rybchenko	0593c208a4	ethdev: remove deprecated Rx and Tx multi-queue mode defines Remove deprecated ``ETH_MQ_RX_*`` and ``ETH_MQ_TX_*`` defines. Use corresponding defines with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:56:45 +02:00
Andrew Rybchenko	667c74defa	ethdev: remove deprecated link speed and duplex defines Remove deprecated ``ETH_LINK_SPEED_*``, ``ETH_SPEED_NUM_*`` and ``ETH_LINK_*`` defines. Use corresponding defines with ``RTE_`` prefix instead. Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>	2022-08-24 09:55:54 +02:00
Cristian Dumitrescu	ac0d65df3e	pipeline: support direct meters on control path Add pipeline control path API to manage direct meters. These meters are identified by a table key, whose entry ID is used as the index into the meter array. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-24 11:35:34 +02:00
Cristian Dumitrescu	aa1855235d	pipeline: support direct registers on control path Add pipeline control path API to read/write direct registers. These registers are identified by a table key, whose entry ID is used as the index into the register array. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-24 11:35:29 +02:00
Cristian Dumitrescu	d4e95281a9	pipeline: add table entry ID read instruction Add the entry ID instruction that reads the entry ID of the latest table lookup operation from the pipeline into the meta-data. The entry ID is then used by the register and meter instructions as the index into the register or meter array. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-24 11:35:25 +02:00
Cristian Dumitrescu	c1b4b26a53	table: add entry ID for learner tables Add support for unique ID for each learner table entry. The entry ID is retrieved as part of the learner table lookup operation and is saved by the pipeline for later use. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-24 11:35:23 +02:00
Cristian Dumitrescu	42605e5651	table: add entry ID for regular tables Add support for unique ID for each table entry. The entry ID is retrieved as part of the table lookup operation and is saved by the pipeline for later use. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-24 11:35:14 +02:00
Cristian Dumitrescu	fa7723b5e3	pipeline: configure hash function for learner tables Make the hash function configurable for the learner pipeline tables. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:42 +02:00
Cristian Dumitrescu	0d871d7ed9	table: configure hash function for learner tables Make the hash function configurable. The internal hash function that was not configurable, mask-based and limited to 64 bytes is removed. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:41 +02:00
Cristian Dumitrescu	9560a3294e	pipeline: configure hash function for regular tables Make the hash function configurable for the regular pipeline tables. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:41 +02:00
Cristian Dumitrescu	82ff070188	table: configure hash function for regular tables Make the hash function configurable. The internal hash function that was not configurable, mask-based and limited to 64 bytes is removed. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:41 +02:00
Cristian Dumitrescu	da65740a16	table: add key comparison functions Add key comparison functions to be used by the exact match and the learner table types as part of the performance critical lookup operation. Since the key size is fixed, it is possible to select a specialized memory copy function as opposed to using the variable size version, resulting in a performance improvement of around 5%. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:41 +02:00
Cristian Dumitrescu	f5fda6863a	table: add hash function prototype Add hash function prototype to be used by the exact match and the learner table types. The hash function is not mask-based, so the table key fields have to be contiguous in memory. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-23 18:04:41 +02:00
David Marchand	fbd59c8ecb	dev: hide device object Make rte_device opaque for non internal users. This will make extending this object possible without breaking the ABI. Some applications may have been dereferencing rte_device objects, mark this object's accessors as stable. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	5b569f2ea8	dev: provide bus specific information For diagnostic, it may be useful to provide a description of the device with bus specific information. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	ec5ecd7e37	dev: introduce device accessors Prepare for making the device object opaque by adding accessors. Update existing "external" users. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	1acb7f5474	dev: hide driver object Make rte_driver opaque for non internal users. This will make extending this object possible without breaking the ABI. Introduce a new driver header and move rte_driver definition. Update drivers and library to use the internal header. Some applications may have been dereferencing rte_driver objects, mark this object's accessors as stable. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	97bbdba31d	dev: introduce driver accessors Prepare for making the driver object opaque by adding accessors. Update existing "external" users. Internal users may still dereference a rte_driver object. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	a04322f616	bus: hide bus object Make rte_bus opaque for non internal users. This will make extending this object possible without breaking the ABI. Introduce a new driver header and move rte_bus definition and helpers. Update drivers and library to use the internal header. Some applications may have been dereferencing rte_bus objects, mark this object's accessors as stable. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	148c51a3de	bus: introduce accessors Add helpers to get a rte_bus object details. This will be used externally. Internal users may still dereference a rte_bus object. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	770ebc060e	bus: move IOVA definition from header iova enum definition does not need to be defined as part of the bus API. Move it to rte_eal.h. With this step, rte_eal.h does not depend on rte_bus.h and rte_dev.h. Fix existing code that was relying on these implicit inclusions. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	4851ef2b40	bus/vdev: make driver-only headers private The vdev bus interface is for drivers only. Mark as internal and move the header in the driver headers list. While at it, cleanup the code: - fix indentation, - remove unneeded reference to bus specific singleton object, - remove unneeded list head structure type, - reorder the definitions and macro manipulating the bus singleton object, - remove inclusion of rte_bus.h and fix the code that relied on implicit inclusion, Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2022-09-23 16:14:34 +02:00
David Marchand	1f37cb2bb4	bus/pci: make driver-only headers private The pci bus interface is for drivers only. Mark as internal and move the header in the driver headers list. While at it, cleanup the code: - fix indentation, - remove unneeded reference to bus specific singleton object, - remove unneeded list head structure type, - reorder the definitions and macro manipulating the bus singleton object, - remove inclusion of rte_bus.h and fix the code that relied on implicit inclusion, Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Rosen Xu <rosen.xu@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	82ee04a855	devargs: remove dependency on bus header We don't need to include rte_bus.h in rte_devargs.h. Only a forward declaration of rte_bus and an inclusion of rte_dev.h are needed. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	8f1d23ece0	eal: deprecate RTE_FUNC_PTR_* macros Those macros have no real value and are easily replaced with a simple if() block. Existing users have been converted using a new cocci script. Deprecate them. Signed-off-by: David Marchand <david.marchand@redhat.com>	2022-09-23 16:14:34 +02:00
David Marchand	99948194a8	dev: hide debug messages in device iterator For any bus that does not support device iteration, rte_dev_iterator_init both returned an error code and logged an error message. An application (like testpmd) that only wants to list devices, would have no choice but to inspect a bus object to avoid spewing error logs. Make those log messages debug level, and remove the check in testpmd. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
Cristian Dumitrescu	9ebff617fe	pipeline: support moving large structure fields Add support to the move instruction for operands bigger than 64 bits. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>	2022-09-23 13:56:01 +02:00
Cristian Dumitrescu	86347cd9ce	pipeline: support large default action arguments Support structure fields bigger than 64 bits as default action arguments. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>	2022-09-23 13:56:01 +02:00
Cristian Dumitrescu	103b303d4c	pipeline: read large structure fields on control path Support reading structure fields that are bigger than 64 bits on the control path for the table update operations. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>	2022-09-23 13:56:01 +02:00
Cristian Dumitrescu	076d612875	pipeline: remove 64-bit limit for structure fields Remove the generic limitation of structure fields to 64 bits or less and push this restriction to the instructions that require it. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>	2022-09-23 13:56:01 +02:00
Abdullah Sevincer	aec322ce15	eal: export coremask parsing helper DLB2 has a need to parse a user supplied coremask as part of an optimization that associates optimal core/resource pairs. Therefore eal_parse_coremask has been renamed to rte_eal_parse_coremask and exported but kept internal. Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2022-09-23 12:07:49 +02:00
Cristian Dumitrescu	f8b0c950a5	port: prevent unnecessary flush for ring output port Do not flush the buffered packets unnecessarily when a burst was sent since the last flush call. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	0b5122dcf5	port: free buffered packets on ring output port free Free the buffered packets as opposed to retrying to send them when the output port is freed. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	87eda68957	port: rework ring output port behavior to non-blocking Drop packets that cannot be sent instead of retrying to send the same packets, potentially forever, when the ring consumer is down. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	45b3fcbfd1	port: prevent unnecessary flush for ethdev output port Do not flush the buffered packets unnecessarily when a burst was sent since the last flush call. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	7150064dc2	port: free buffered packets on ethdev output port free Free the buffered packets as opposed to retrying to send them when the output port is freed. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	f317fff048	port: rework ethdev output port behavior to non-blocking Drop packets that cannot be sent instead of retrying to send the same packets, potentially forever, when the Ethernet device is down. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	7c10ca26f0	port: adjust sink port counters The sink port is tasked to drop all packets, hence the packet and byte counters should be named to reflect the drop operation. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Cristian Dumitrescu	e28e064b28	port: add output port packet drop counters Add packet drop statistics counters for the output ports. Required by the non-blocking output port behavior where the packets that cannot be sent at the time of the operation are dropped as opposed to the send operation being retried potentially forever for the same packets. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-09-22 16:56:58 +02:00
Dmitry Kozlyuk	72b452c5f2	eal: remove unneeded includes from a public header Do not include <ctype.h>, <errno.h>, and <stdlib.h> from <rte_common.h>, because they are not used by this file. Include the needed headers directly from the files that need them. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk	347623c9c7	eal: uninline some string formatting helper There is no reason for rte_str_to_size() to be inline. Move the implementation out of <rte_common.h>. Export it as a stable ABI because it always has been public. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk	107dc0664c	eal: deduplicate roundup code RTE_CACHE_LINE_ROUNDUP() implementation repeated RTE_ALIGN_MUL_CEIL(). In other places RTE_CACHE_LINE_SIZE is assumed to be a power-of-2, so define RTE_CACHE_LINE_ROUNDUP() using RTE_ALIGN_CEIL(). Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk	1a7374c956	eal: fix side effect in some pointer arithmetic macros RTE_PTR_SUB(ptr, x) and RTE_PTR_ALIGN_FLOOR() worked incorrectly if "ptr" was an expression: uint32_t arr[3]; RTE_PTR_SUB(arr + 1, sizeof(arr[0])); // expected: (uint32_t *)((uintptr_t)(arr + 1) - 4) == arr // actual: (uint32_t *)((uintptr_t) arr + 1 - 4) != arr RTE_PTR_ALIGN_FLOOR(arr + 2, sizeof(arr[0])); // expected: RTE_ALIGN_FLOOR((uintptr_t)(arr + 2), 4) == &arr[2] // actual: RTE_ALIGN_FLOOR((uintptr_t) arr + 2, 4) == &arr[0] Fix the macros and extend the relevant unit test. Convert uses of a custom test failure macro to RTE_TEST_ASSERT*(). Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-09-21 15:31:03 +02:00
Henning Schild	0832959682	bpf: avoid useless memcpy call There is no point in such a call and UBSan complains about a call to memcpy with a null pointer as second arg. When building with -Db_sanitize=undefined, Clang gives the following warning ../lib/bpf/bpf_load.c:37:20: runtime error: null pointer passed as argument 2, which is declared to never be null A check of the sz before calling memcpy fixes that. Signed-off-by: Henning Schild <henning.schild@siemens.com>	2022-09-21 15:04:45 +02:00
Mattias Rönnblom	1c9a7fba5c	net: accept unaligned data in checksum routines __rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its data through an uint16_t pointer, which allowed the compiler to assume the data was 16-bit aligned. This in turn would, with certain architectures and compiler flag combinations, result in code with SIMD load or store instructions with restrictions on data alignment. This patch keeps the old algorithm, but data is read using memcpy() instead of direct pointer access, forcing the compiler to always generate code that handles unaligned input. The __may_alias__ GCC attribute is no longer needed. The data on which the Internet checksum functions operates are almost always 16-bit aligned, but there are exceptions. In particular, the PDCP protocol header may (literally) have an odd size. Performance impact seems to range from none to a very slight regression. Bugzilla ID: 1035 Fixes: `6006818cfb` ("net: new checksum functions") Cc: stable@dpdk.org Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-09-20 18:09:42 +02:00
David Marchand	f8f6b1c174	vhost: stop using mempool for IOTLB cache A mempool consumes 3 memzones (with the default ring mempool driver). The default DPDK configuration allows RTE_MAX_MEMZONE (2560) memzones. Assuming there are no other memzones, this means that we can have a maximum of 853 mempools. In the vhost library, the IOTLB cache code so far was requesting a mempool per vq, which means that, at the maximum, the vhost library could request mempools for 426 qps. This limit was recently reached on big systems with a lot of virtio ports (and multiqueue in use). While the limit on mempool count could be something we fix at the DPDK project level, there is no reason to use mempools for the IOTLB cache: - the IOTLB cache entries do not need to be DMA-able and are only used by the current process (in multiprocess context), - getting/putting objects from/in the mempool is always associated with some other locks, so some level of lock contention is already present. We can convert to a malloc'd pool with objects put in a free list protected by a spinlock. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-09-15 17:56:24 +02:00
David Marchand	57e414e3ec	vhost: keep a reference to virtqueue index Having a back reference to the index of the vq in the dev->virtqueue[] array makes it possible to unify the internal API, with only passing dev and vq. It also allows displaying the vq index in log messages. Remove virtqueue index checks where unneeded (like in static helpers called from a loop on all available virtqueues). Perform virtqueue index validity checks as early as possible. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-09-15 17:56:13 +02:00
David Marchand	5a5a72a875	vhost: make NUMA reallocation code more robust translate_ring_addresses and numa_realloc may change a virtio device and virtio queue. Callers of those helpers must be extra careful and refresh any reference to old data. Change those functions prototype as a way to hint about this issue and always ask for an indirect pointer. Besides, when reallocating the device and queue, the code already made sure it will return a pointer to a valid device. The checks on such returned pointer can be removed. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-09-15 17:56:04 +02:00
David Marchand	0b2a2ca350	vhost: fix virtqueue use after free on NUMA reallocation translate_ring_addresses (via numa_realloc) may change a virtio device and virtio queue. The virtqueue object must be refreshed before accessing the lock. Fixes: `04c27cb673` ("vhost: fix unsafe vring addresses modifications") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-09-15 17:54:55 +02:00
Cristian Dumitrescu	68b95704a6	pipeline: add API for shared library-based pipeline build Previously, the pipeline build operation was done based on the specification file (typically produced by the P4 compiler), then the C code with optimized functions for the pipeline actions and instructions was generated, built into a shared object library, loaded and installed into the pipeline in a completely hardcoded and non-customizable way. Now, this process is split into three explicit stages: i) code generation (specification file -> C file); ii) code build (C file -> shared object library); iii) code installation (library load into the pipeline). Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:46:09 +02:00
Cristian Dumitrescu	8f55f896a8	pipeline: add API for code generation Previously, the C code generation for the pipeline was hidden under the hood; now, we make this an explicit API operation. Besides the functions for the pipeline actions and the pipeline instructions, the generated C source code now includes the pipeline specification structure required for the pipeline configuration operations. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:45:50 +02:00
Cristian Dumitrescu	54cae37ef4	pipeline: support I/O specification Add specification data structure and API for the pipeline I/O ports and related pipeline configuration such as packet mirroring. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:45:28 +02:00
Cristian Dumitrescu	894c93e14e	pipeline: generate code for specification structure Add support to export the pipeline specification data structure to a C source code file. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:45:16 +02:00
Cristian Dumitrescu	30c4abb909	pipeline: rework specification file-based pipeline build Rework the specification file-based pipeline build operation to first parse the specification file into the previously introduced pipeline specification data structure, then use this structure to configure and build the pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:45:16 +02:00
Cristian Dumitrescu	4684aa757d	pipeline: add pipeline specification data structure Add specification data structure for the entire pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:45:06 +02:00
Cristian Dumitrescu	3421eb4218	pipeline: move specification structures to internal header Move all the pipeline object specification data structures to an internal header file. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:07:40 +02:00
Cristian Dumitrescu	d69c90c85d	pipeline: add pipeline name Add a unique name to every pipeline. This enables the library to maintain a list of the existing pipeline objects, which can be queried by the application. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>	2022-09-15 17:06:57 +02:00
Huichao Cai	4aee6110bb	ip_frag: add IPv4 fragment copy Some NIC drivers support the MBUF_FAST_FREE offload (the device supports an optimization for fast release of mbufs; when set, the application must guarantee that, per queue, all mbufs come from the same mempool, have refcnt = 1, and are direct and non-segmented). To accommodate this offload, add this API. Add some test data for this API. Signed-off-by: Huichao Cai <chcchc88@163.com> Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>	2022-08-29 16:24:18 +02:00
Maxime Coquelin	dc1516e260	vhost: fix header spanned across more than two descriptors This patch aims at supporting the unlikely case where a Virtio-net header is spanned across more than two descriptors. CVE-2022-2132 Fixes: `fd68b4739d` ("vhost: use buffer vectors in dequeue path") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-08-29 15:12:40 +02:00
Maxime Coquelin	71bd0cc536	vhost: discard too small descriptor chains This patch discards descriptor chains which are smaller than the Virtio-net header size, and ones that are equal. Indeed, such descriptor chains sizes mean there is no packet data. This patch also has the advantage of requesting the exact packets sizes for the mbufs. CVE-2022-2132 Fixes: `62250c1d09` ("vhost: extract split ring handling from Rx and Tx functions") Fixes: `c3ff0ac70a` ("vhost: improve performance by supporting large buffer") Fixes: `84d5204310` ("vhost: support async dequeue for split ring") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-08-29 15:12:13 +02:00
Cristian Dumitrescu	af00c9352e	pipeline: fix code generation for recirculation ID The "offset" and "n_bits" fields were generated incorrectly, hence the output C file was producing compilation errors when the "recircid" instruction was used. Fixes: `5ec76d29dc` ("pipeline: support packet recirculation") Cc: stable@dpdk.org Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-08-29 12:14:03 +02:00
Marcin Danilewicz	a61b319674	sched: enable congestion management at runtime Added changes to enable CMAN (RED or PIE) at init from profile configuration file. By default CMAN code is enabled but not in use, when there is no RED or PIE profile configured. Signed-off-by: Marcin Danilewicz <marcinx.danilewicz@intel.com> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>	2022-08-25 16:18:06 +02:00
David Marchand	72206323a5	version: 22.11-rc0 Start a new release cycle with empty release notes. The ABI version becomes 23.0. The map files are updated to the new ABI major number (23). The ABI exceptions are dropped and CI ABI checks are disabled because compatibility is not preserved. Special handling of removed drivers is also dropped in check-abi.sh and a note has been added in libabigail.abignore as a reminder. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2022-07-21 12:13:48 +02:00
Maxime Coquelin	16b2977562	vhost: fix missing copy length in batched copies This patch fixes missing the assignment of the copy length when doing batched copies in the sync dequeue path. Fixes: `cd79d1b030` ("vhost: fix unnecessary dirty page logging") Suggested-by: Chenbo Xia <chenbo.xia@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Tested-by: Wei Ling <weix.ling@intel.com>	2022-07-14 15:47:30 +02:00
Xuan Ding	cd79d1b030	vhost: fix unnecessary dirty page logging The dirty page logging is only required in vhost enqueue direction for live migration. This patch removes the unnecessary dirty page logging in vhost dequeue direction. Otherwise, it will result in a performance drop. Some if-else judgements are also optimized to improve performance. Fixes: `6d823bb302` ("vhost: prepare sync for descriptor to mbuf refactoring") Fixes: `b6eee3e834` ("vhost: fix sync dequeue offload") Signed-off-by: Xuan Ding <xuan.ding@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Xingguang He <xingguang.he@intel.com>	2022-07-08 11:15:32 +02:00
Kumara Parameshwaran	bdf2f895a6	gro: fix identifying fragmented packets A packet with RTE_PTYPE_L4_FRAG(0x300) contains both RTE_PTYPE_L4_TCP (0x100) & RTE_PTYPE_L4_UDP (0x200). A fragmented packet as defined in rte_mbuf_ptype.h cannot be recognized as other L4 types and hence the GRO layer should not use IS_IPV4_TCP_PKT or IS_IPV4_UDP_PKT for RTE_PTYPE_L4_FRAG. Hence, if the packet type is RTE_PTYPE_L4_FRAG the IP header should be parsed to recognize the appropriate IP type and invoke the respective gro handler. Fixes: `1ca5e67408` ("gro: support UDP/IPv4") Cc: stable@dpdk.org Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2022-07-05 18:30:38 +02:00
Harry van Haaren	6550113be6	service: fix lingering active status This commit fixes an issue where calling rte_service_lcore_stop() would result in a service's "active on lcore" status becoming stale. The stale status would result in rte_service_may_be_active() always returning "1", indicating that the service is not certainly stopped. This is fixed by ensuring the "active on lcore" status of each service is set to 0 when an lcore is stopped. Fixes: `e30dd31847` ("service: add mechanism for quiescing") Fixes: `8929de043e` ("service: retrieve lcore active state") Reported-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>	2022-07-05 16:24:43 +02:00
David Marchand	36c525a035	vhost: prefix logs with context We recently improved the log messages in the vhost library, adding some context that helps filtering for a given vhost-user device. However, some parts of the code were missed, and some later code changes broke this new convention (fixes were sent previous to this patch). Change the VHOST_LOG_CONFIG/DATA helpers and always ask for a string used as context. This should help limit regressions on this topic. Most of the time, the context is the vhost-user device socket path. For the rest when a vhost-user device can not be related, generic names were chosen: - "dma", for vhost-user async DMA operations, - "device", for vhost-user device creation and lookup, - "thread", for threads management, Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-07-01 15:49:49 +02:00
David Marchand	481a2c7ef2	vhost: improve some datapath log messages Those messages were missed when adding socket context. Fix this. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-07-01 15:49:49 +02:00
David Marchand	bb15129da9	vhost: restore device information in log messages device information in the log messages was dropped. Fixes: `52ade97e36` ("vhost: fix physical address mapping") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-07-01 15:49:49 +02:00
David Marchand	1ef468a7e9	vhost: add some trailing newline in log messages VHOST_LOG_* macros don't append a newline. Add missing ones. Fixes: `e623e0c6d8` ("vhost: add reconnect ability") Fixes: `af14759181` ("vhost: introduce API to start a specific driver") Fixes: `2dfeebe265` ("vhost: check return of mutex initialization") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-07-01 15:49:49 +02:00
Jiayu Hu	1e4bcee9ba	vhost: check DMA info return This patch checks the return value of rte_dma_info_get() called in rte_vhost_async_dma_configure(). Coverity issue: 379066 Fixes: `53d3f4778c` ("vhost: integrate dmadev in asynchronous data-path") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-07-01 15:49:49 +02:00
Xuan Ding	b6eee3e834	vhost: fix sync dequeue offload This patch fixes the missing virtio net header copy in sync dequeue path caused by refactoring, which affects dequeue offloading. Fixes: `6d823bb302` ("vhost: prepare sync for descriptor to mbuf refactoring") Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Wei Ling <weix.ling@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-07-01 15:49:49 +02:00
Andy Pei	b90574b10e	vhost: fix virtio block vDPA live migration IO drop In the virtio blk vDPA live migration use case, before the live migration process, QEMU sets the call fd on the vDPA back-end. QEMU and the vDPA back-end stand by until live migration starts. During the live migration process, QEMU sets the kick fd and a new call fd. However, once the kick fd is set on the vDPA back-end, the back-end configures the device and the data path starts. The new call fd then triggers a kind of "re-configuration", and this "re-configuration" causes IO drops. After this patch, the vDPA back-end configures the device only after both the kick fd and the call fd are set, ensuring that no IO is dropped. This patch only impacts virtio blk vDPA devices and does not impact net devices. Fixes: `7015b65771` ("vdpa/ifc: add block device SW live-migration") Signed-off-by: Andy Pei <andy.pei@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-07-01 15:49:49 +02:00
Naga Harish K S V	881d4b4d85	eventdev/eth_tx: fix queue delete Add spinlock protection in queue delete function. This protects the data path while the queue delete operation is in progress. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-07-04 18:04:52 +02:00
Gregory Etelson	0718b7716c	net: fix GTP PSC headers Fix bitmap fields order in little endian section of GTP PSC headers. Fixes: `e8ca1479cd` ("net: add extension header for GTP PSC") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Aman Singh <aman.deep.singh@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2022-06-23 13:38:09 +02:00
Stephen Hemminger	96c3a92809	eal: promote experimental sleep function This has been around since 2018 release. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2022-06-26 15:00:36 +02:00
Stephen Hemminger	7f7efe829b	interrupts: promote some experimental functions These are functions related to interrupts that have been in since 20.02 release or earlier. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2022-06-26 15:00:36 +02:00
Stephen Hemminger	218f2f9749	eal: promote some lcore experimental accessors These APIs have been around for a long time and by now are fixed. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2022-06-26 15:00:36 +02:00
Stephen Hemminger	025e9bcd61	log: promote some experimental macros and function The RTE_LOG_REGISTER is not experimental, and the experimental tag was never enforced on these. Make rte_log_can_log a fully supported function. It was introduced nearly 2yrs ago. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2022-06-26 14:43:19 +02:00
Stephen Hemminger	26f5a0d5a6	rib: fix references for IPv6 implementation The comments in rte_rib6 were cut-and-pasted from rte_rib and because of that some references to rte_rib_node were not updated. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2022-06-26 12:20:24 +02:00
Stephen Hemminger	64e14b8b07	remove unnecessary null checks Found by nullfree.cocci. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> [David: for lpm parts:] Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> [David: for vdpa/mlx5 parts:] Acked-by: Matan Azrad <matan@nvidia.com> [David: for dma/dpaa2, raw/ifpga, vdpa/mlx5:] Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> [David: reran cocci.sh and updated common/mlx5 and cryptodev asym test] Signed-off-by: David Marchand <david.marchand@redhat.com>	2022-06-24 14:51:09 +02:00
Stephen Hemminger	448e01f1b5	lib: document free functions Make sure all functions which use the convention that XXX_free(NULL) is a nop are all documented. The wording is chosen to match the documentation of free(3). "If ptr is NULL, no operation is performed." Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Chengwen Feng <fengchengwen@huawei.com> [David: squashed with other series updates, unified wording]	2022-06-24 14:50:34 +02:00
Stephen Hemminger	cb68a56343	remove passive voice in function description Remove extraneous phrase "This API is used to" and use active instead of passive voice when describing a function. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com> [David: for raw/ioat and dmadev parts:] Acked-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Conor Walsh <conor.walsh@intel.com>	2022-06-24 14:05:54 +02:00
Don Wallwork	42fbb8e85d	eal/linux: allocate worker lcore stacks in hugepages Add support for using hugepages for worker lcore stack memory. The intent is to improve performance by reducing stack memory related TLB misses and also by using memory local to the NUMA node of each lcore. EAL option '--huge-worker-stack[=stack-size-in-kbytes]' is added to allow the feature to be enabled at runtime. If the size is not specified, the system pthread stack size will be used. Signed-off-by: Don Wallwork <donw@xsightlabs.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-06-23 22:36:33 +02:00
Huichao Cai	e1522b328c	ip_frag: fix build with GCC 12 GCC 12 raises warnings on usage of rte_memcpy with IPv4 options handling in fragments for both the ip_frag library and unit tests. For example in the library: In function ‘_mm256_storeu_si256’, inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:347:2, inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:369:2, inlined from ‘rte_memcpy_generic’ at ../lib/eal/x86/include/rte_memcpy.h:445:4, inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:851:10, inlined from ‘__create_ipopt_frag_hdr’ at ../lib/ip_frag/rte_ipv4_fragmentation.c:68:4, inlined from ‘rte_ipv4_fragment_packet’ at ../lib/ip_frag/rte_ipv4_fragmentation.c:242:16: /usr/lib/gcc/x86_64-redhat-linux/12/include/avxintrin.h:935:8: error: array subscript ‘__m256i_u[1]’ is partly outside array bounds of ‘uint8_t[60]’ {aka ‘unsigned char[60]’} [-Werror=array-bounds] 935 | *__P = __A; | ~~~~~^~~~~ ../lib/ip_frag/rte_ipv4_fragmentation.c: In function ‘rte_ipv4_fragment_packet’: ../lib/ip_frag/rte_ipv4_fragmentation.c:122:17: note: at offset [52, 60] into object ‘ipopt_frag_hdr’ of size 60 122 | uint8_t ipopt_frag_hdr[IPV4_HDR_MAX_LEN]; | ^~~~~~~~~~~~~~ To resolve the compilation warning, replace the rte_memcpy with memcpy. Fixes: `b50a14a853` ("ip_frag: add IPv4 options fragment") Signed-off-by: Huichao Cai <chcchc88@163.com>	2022-06-23 16:32:00 +02:00
Fidaullah Noonari	ce2f7d472e	malloc: fix allocation of almost hugepage size If called to allocate memory whose size is between a multiple of the hugepage size minus malloc_header_len and that multiple of the hugepage size, rte_malloc fails. This fix replaces malloc_elem_trailer_len with malloc_elem_overhead in try_expand_heap() to include malloc_elem_header_len when calculating n_seg. Bugzilla ID: 800 Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2022-06-23 13:40:50 +02:00
Stephen Hemminger	0efcd352e2	eal/unix: make stack dump signal safe rte_dump_stack() needs to be usable in situations when a bug is encountered and from signal handlers (such as SEGV). Glibc backtrace_symbols() calls malloc which makes it dangerous in a signal handler that is handling errors that may be due to memory corruption. Additionally, rte_log() is unsafe because syslog() is not signal safe; printf() is also documented as not being safe. This version formats each message and uses writev for each line in a manner similar to what the glibc version of backtrace_symbols_fd() does. The FreeBSD version of backtrace_symbols_fd() is not signal safe. Sample output: 0: ./build/app/dpdk-testpmd (rte_dump_stack+0x2b) [560a6e9c002b] 1: ./build/app/dpdk-testpmd (main+0xad) [560a6decd5ad] 2: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xcd) [7fd43d3e27fd] 3: ./build/app/dpdk-testpmd (_start+0x2a) [560a6e83628a] Bugzilla ID: 929 Acked-by: Morten Brørup <mb@smartsharesystems.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-23 13:40:50 +02:00
David Marchand	2fbada9154	vhost/crypto: fix descriptor processing copy_data was returning a pointer to an increased (off by one) descriptor. Subsequent calls to copy_data in the library were then failing. Fix this by incrementing the descriptor only if there is some left data to copy. Fixes: `4414bb6701` ("vhost/crypto: fix build with GCC 12") Cc: stable@dpdk.org Reported-by: Jakub Poczatek <jakub.poczatek@intel.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Jakub Poczatek <jakub.poczatek@intel.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com>	2022-06-22 18:58:31 +02:00
David Marchand	4414bb6701	vhost/crypto: fix build with GCC 12 GCC 12 raises the following warning: In file included from ../lib/mempool/rte_mempool.h:46, from ../lib/mbuf/rte_mbuf.h:38, from ../lib/vhost/vhost_crypto.c:7: ../lib/vhost/vhost_crypto.c: In function ‘rte_vhost_crypto_fetch_requests’: ../lib/eal/x86/include/rte_memcpy.h:371:9: warning: array subscript 1 is outside array bounds of ‘struct virtio_crypto_op_data_req[1]’ [-Warray-bounds] 371 | rte_mov32((uint8_t *)dst + 3 * 32, (const uint8_t *)src + 3 * 32); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../lib/vhost/vhost_crypto.c:1178:42: note: while referencing ‘req’ 1178 | struct virtio_crypto_op_data_req req; | ^~~ Split this function and separate the per descriptor copy. This makes the code clearer, and the compiler happier. Note: logs for errors have been moved to callers to avoid duplicates. Fixes: `3c79609fda` ("vhost/crypto: handle virtually non-contiguous buffers") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-21 11:20:58 +02:00
David Marchand	ad0eeb83ea	vhost: fix log message for async dequeue Since the commit `02798b0735` ("vhost: improve virtio-net layer logs"), vhost logs contain the socket path as a prefix. Async dequeue path was copied from the sync dequeue path but a log was incorrect. Fixes: `84d5204310` ("vhost: support async dequeue for split ring") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-17 16:01:21 +02:00
Xuan Ding	b7c2ecbacb	vhost: fix statistics update in async dequeue This patch adds missing per-virtqueue statistics in async dequeue path. Fixes: `84d5204310` ("vhost: support async dequeue for split ring") Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Wei Ling <weix.ling@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-17 15:57:55 +02:00
Maxime Coquelin	f4284e2dda	vhost: rename number of available entries This patch renames the local variables free_entries to avail_entries in the dequeue path. Indeed, this variable represents the number of new packets available in the Virtio transmit queue, so these entries are actually used, not free. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-17 15:51:02 +02:00
Cheng Jiang	fe8477ebbd	vhost: support async packed ring dequeue This patch implements packed ring dequeue data path for asynchronous vhost. Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-17 15:33:51 +02:00
Yuan Wang	3753ebf075	vhost: support clear in-flight packets for async dequeue rte_vhost_clear_queue_thread_unsafe() supports clearing in-flight packets for async enqueue only. But after supporting async dequeue, this API should support async dequeue too. This patch also adds the thread-safe version of this API; the difference between the two APIs is that the thread-safe version takes a lock. These APIs may be used to clean up packets in the async channel to prevent packet loss when the device state changes or when the device is destroyed. Signed-off-by: Yuan Wang <yuanx.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2022-06-17 15:19:57 +02:00
Maxime Coquelin	7316b4fd61	vhost: fix missing enqueue pseudo-header calculation The Virtio specification requires that in case of checksum offloading, the pseudo-header checksum must be set in the L4 header. When received from another Vhost-user port, the packet checksum might already contain the pseudo-header checksum but we have no way to know it. So we have no other choice than doing the pseudo-header checksum systematically. This patch handles this using the rte_net_intel_cksum_prepare() helper. Fixes: `859b480d5a` ("vhost: add guest offload setting") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-17 14:41:04 +02:00
Naga Harish K S V	3d6e9dd493	eventdev/eth_tx: fix adapter creation During adapter create, memory is allocated for storing event port configuration which is freed during adapter free. The following error is seen during free "EAL: Error: Invalid memory" The service data pointer storage for txa_service_data_array is allocated during adapter create with an incorrect size, which is less than the required size. Initialization of this memory causes a buffer overflow that overwrites the metadata of the event port config memory allocated above, which results in the above error message during free. Allocating the correct size of memory for txa_service_data_array prevents overwriting other memory areas like event port config memory. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2022-06-20 20:39:11 +02:00
Radu Nicolau	778bbc0891	ipsec: fix NAT-T ports and length Fix the UDP header fields, wrong byte order used for src and dst port and wrong offset used when updating UDP datagram length. Fixes: `01eef5907f` ("ipsec: support NAT-T") Cc: stable@dpdk.org Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com>	2022-06-21 20:04:49 +02:00
David Marchand	11f61ea2f6	eal/x86: drop export of internal alignment macro ALIGNMENT_MASK is only used internally. Besides it lacks a DPDK-related prefix. Hide it from external eyes. Fixes: `f5472703c0` ("eal: optimize aligned memcpy on x86") Cc: stable@dpdk.org Reported-by: Morten Brørup <mb@smartsharesystems.com> Signed-off-by: David Marchand <david.marchand@redhat.com>	2022-06-22 11:32:35 +02:00
Stephen Hemminger	45a192b2b4	sched: fix floating point math The function rte_pie_drop was attempting to do a random probability drop, but because of incorrect usage of fixed point divide it would always return 1. Change to use new rte_drand() instead. Fixes: `44c730b0e3` ("sched: add PIE based congestion management") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>	2022-06-22 11:11:48 +02:00
Stephen Hemminger	bb421e98c0	sched: remove unnecessary floating point The qdelay variable is derived from and compared to 64 bit value so it doesn't have to be floating point. Fixes: `44c730b0e3` ("sched: add PIE based congestion management") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>	2022-06-22 11:11:44 +02:00
Stephen Hemminger	0cd10724bf	eal: provide pseudo-random floating point number The PIE code and other applications can benefit from having a fast way to get a random floating point value. This new function is equivalent to drand48() in the standard library. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2022-06-22 10:59:09 +02:00
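A minimal sketch of the idea behind the two commits above: draw a double uniformly distributed in [0.0, 1.0) and compare it with a drop probability. This is not the DPDK implementation of rte_drand() or rte_pie_drop(); the generator (splitmix64) and the names sketch_uniform01() and sketch_random_drop() are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit generator (splitmix64); DPDK uses its own PRNG. */
static uint64_t sketch_rand64(uint64_t *state)
{
	uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);

	z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
	z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
	return z ^ (z >> 31);
}

/* Keep the top 53 bits (the width of a double mantissa) and scale by 2^-53,
 * which yields a value in [0.0, 1.0). */
static double sketch_uniform01(uint64_t *state)
{
	return (double)(sketch_rand64(state) >> 11) * 0x1.0p-53;
}

/* Return 1 (drop) with probability drop_prob, otherwise 0 (keep). */
static int sketch_random_drop(uint64_t *state, double drop_prob)
{
	assert(drop_prob >= 0.0 && drop_prob <= 1.0);
	return sketch_uniform01(state) < drop_prob;
}
```

With this shape, a probability of 0.0 never drops and a probability of 1.0 always drops, which is the behaviour a random probability drop is expected to have at the extremes.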
Sean Morrissey	3653eda1c2	vhost: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-21 16:46:57 +02:00
Sean Morrissey	2237e6da52	timer: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:57 +02:00
Sean Morrissey	5ec9bff003	sched: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	8259c3ef4e	rawdev: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	887c29abbe	power: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	2518fcb67f	net: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	cfb9eb0af3	meter: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	e66ba416a0	jobstats: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	fe3a71b724	hash: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	e15331dd26	gso: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	564178d3b1	gpudev: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	83be526a2e	ethdev: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	2ff3976e67	eal: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Sean Morrissey	0eefb45370	cmdline: remove unneeded header includes These header includes have been flagged by the iwyu_tool and removed. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>	2022-06-21 16:46:56 +02:00
Chengwen Feng	d59a940667	trace: fix init with long file prefix Bug scenario: 1. start testpmd: $ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* \ --file-prefix=trace_autotest -- -i 2. then observed: EAL: eal_trace_init():93 failed to initialize trace [File exists] EAL: FATAL: Cannot init trace EAL: Cannot init trace EAL: Error - exiting with code: 1 The root cause is that the offset is set wrong with a long file-prefix, which then causes strftime to fail. At the same time, trace_session_name_generate() uses errno as the return value, but the errno was not set if strftime returned zero. A previously set errno (EEXIST or ENOENT from call to mkdir for creating the runtime configuration directory) was returned in this case. This is fragile and may lead to incorrect logic if errno was set to 0 previously. This also resulted in inaccurate prompting. Set errno to ENOSPC if strftime returns zero. Fixes: `321dd5f8fa` ("trace: add internal init and fini interface") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2022-06-21 11:11:00 +02:00
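The failure mode described above can be sketched in a few lines: strftime() returns 0 when the remaining buffer is too small, and it does not set errno in that case, so the caller has to set it explicitly. The function below is a hypothetical stand-in, not trace_session_name_generate() itself; its name, signature, return convention and timestamp format are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Write "<prefix>-<UTC timestamp>" into buf.
 * Returns 0 on success, or -1 with errno set explicitly on failure. */
static int sketch_session_name(char *buf, size_t size, const char *prefix,
			       time_t now)
{
	const struct tm *tm;
	int offset;

	assert(buf != NULL && prefix != NULL && size > 0);

	offset = snprintf(buf, size, "%s-", prefix);
	if (offset < 0 || (size_t)offset >= size) {
		errno = ENOSPC; /* the prefix alone does not fit */
		return -1;
	}

	tm = gmtime(&now);
	if (tm == NULL) {
		errno = EINVAL;
		return -1;
	}

	/* strftime() returns 0 if the result does not fit and leaves errno
	 * untouched, so a stale EEXIST or ENOENT would otherwise leak out. */
	if (strftime(buf + offset, size - (size_t)offset,
		     "%Y-%m-%d-%H-%M-%S", tm) == 0) {
		errno = ENOSPC;
		return -1;
	}
	return 0;
}
```

A caller that reports the error with strerror(errno) then prints "No space left on device" instead of an unrelated message such as "File exists".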
Chengwen Feng	a8f23b444d	trace: fix crash when exiting Bug scenario: 1. start testpmd: $ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i 2. quit testpmd and then observed segmentation fault: Bye... Segmentation fault (core dumped) The root cause is that rte_trace_save() and eal_trace_fini() access the huge pages which were cleaned up by rte_eal_memory_detach(). This patch moves rte_trace_save() and eal_trace_fini() before rte_eal_memory_detach() to fix the bug. Fixes: `dfbc61a2f9` ("mem: detach memsegs on cleanup") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-21 11:11:00 +02:00
Cristian Dumitrescu	a1b2afe431	pipeline: rework optimization pattern for header generation The P4 language requires marking a header as valid before any of the header fields are written as opposed to after the writes are done. Hence, the optimization of replacing the sequence of instructions to generate a header by reading it from the table action data with a single DMA internal instruction is reworked from "mov all + validate -> dma" to "validate + mov all -> dma". Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-06-20 16:14:50 +02:00
Harshad Narayane	ce1511f4b8	pipeline: fix check maximum learner table timeouts Fix comparison used to check against the maximum number of learner table timeouts. Fixes: `e2ecc53582` ("pipeline: improve learner table timers") Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com> Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-06-20 16:06:41 +02:00
Cristian Dumitrescu	0a00384a76	pipeline: fix packet mirroring configuration Fix segmentation fault due to null pointer dereferencing inside the "mirror" instruction when number of mirroring slots is set to 0. This was taking place when the "mirror" instruction was used without the mirror feature being properly configured, i.e. the API function rte_swx_pipeline_mirroring_config was not called at initialization. Fixes: `dac0ecd909` ("pipeline: support packet mirroring") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2022-06-20 16:04:13 +02:00
Stanislaw Kardach	b13fb77583	lpm: fix scalar version header for C++ rte_xmm_t is a union type which wraps around xmm_t and maps its contents to scalar structures. Since C++ has stricter type conversion rules than C, the rte_xmm_t::x has to be used instead of C-casting. The generated assembly is identical to the code without the fix (checked both on x86 and RISC-V). Fixes: `406937f89f` ("lpm: add scalar version of lookupx4") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-15 09:12:19 +02:00
Stanislaw Kardach	93cba71bdc	eal/riscv: fix vector header for C++ rte_xmm_t is a union type which wraps around xmm_t and maps its contents to scalar structures. Since C++ has stricter type conversion rules than C, the rte_xmm_t::x has to be used instead of C-casting. Fixes: `f22e705ebf` ("eal/riscv: support RISC-V architecture") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2022-06-15 09:12:16 +02:00
Bruce Richardson	758d92795b	cmdline: add function to verify valid commands The cmdline library cmdline_parse() function parses a command and executes the action automatically too. The cmdline_valid_buffer function also uses this function to validate commands, meaning that there is no function to validate a command as ok without executing it. To fix this omission, we extract the body of cmdline_parse into a new static inline function with an extra parameter to indicate whether the action should be performed or not. Then we create two wrappers around that - a replacement for the existing cmdline_parse function where the extra parameter is "true" to execute the command, and a new function "cmdline_parse_check" which passes the parameter as "false" to perform cmdline validation only. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Weiyuan Li <weiyuanx.li@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2022-06-13 10:26:39 +02:00
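The refactoring pattern described in the commit above (one shared inline body with a flag, plus two thin wrappers) can be sketched as follows. This is a toy parser, not the cmdline library: every type and function name here (toy_cmd, toy_parse_common, toy_parse, toy_parse_check) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void (*toy_action)(void *ctx);

struct toy_cmd {
	const char *name;
	toy_action action;
};

/* Shared body: look the command up, and run its action only if call_fn. */
static inline int toy_parse_common(const struct toy_cmd *cmds, size_t n,
				   const char *line, void *ctx, bool call_fn)
{
	assert(cmds != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (strcmp(cmds[i].name, line) == 0) {
			if (call_fn && cmds[i].action != NULL)
				cmds[i].action(ctx);
			return 0; /* valid command */
		}
	}
	return -1; /* no match */
}

/* Parse and execute. */
static int toy_parse(const struct toy_cmd *cmds, size_t n, const char *line,
		     void *ctx)
{
	return toy_parse_common(cmds, n, line, ctx, true);
}

/* Validate only, without side effects. */
static int toy_parse_check(const struct toy_cmd *cmds, size_t n,
			   const char *line)
{
	return toy_parse_common(cmds, n, line, NULL, false);
}

/* Example action: count how many times it was invoked. */
static void toy_count_action(void *ctx)
{
	++*(int *)ctx;
}
```

Because the flag is a compile-time constant in each wrapper, the compiler can typically fold the branch away when the shared body is inlined.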
Spike Du	bc70e55948	ethdev: introduce available Rx descriptors threshold A new event RTE_ETH_EVENT_RX_AVAIL_THRESH should be generated by HW when number of available descriptors in Rx queue goes below the threshold. The threshold is defined as a percentage of an Rx queue size with valid values from 0 to 99 (inclusive). Zero (default) value disables it. There is no capability reporting for the feature. Application should simply try to set required threshold value and handle result. Add testpmd commands to control the threshold: set port <port_id> rxq <rxq_id> avail_thresh <avail_thresh_num> Signed-off-by: Spike Du <spiked@nvidia.com> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2022-06-08 21:41:00 +02:00
David Marchand	e5e613f05b	eal: remove unused arch-specific headers for locks MCS lock, PF lock and Ticket lock have no arch specific implementation, there is no need for the extra redirection in headers. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Stanislaw Kardach <kda@semihalf.com>	2022-06-08 15:44:20 +02:00
Michal Mazurek	f22e705ebf	eal/riscv: support RISC-V architecture Add all necessary elements for DPDK to compile and run EAL on SiFive Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc) core complex. This includes: - EAL library implementation for rv64imafdc ISA. - meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define is added for architecture identification. - xmm_t structure operation stubs as there is no vector support in the U74 core. Compilation was tested on Ubuntu and Arch Linux using riscv64 toolchain. Clang compilation currently not supported due to issues with missing relocation relaxation. Two rte_rdtsc() schemes are provided: stable low-resolution using rdtime (default) and unstable high-resolution using rdcycle. User can override the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 during compile time of both DPDK and the application. The reasoning for this is as follows. The RISC-V ISA mandates that clock read by rdtime has to be of constant period and synchronized between all hardware threads within 1 tick (chapter 10.1 in version 20191213 of RISC-V spec). However this clock may not be of high-enough frequency for dataplane uses. I.e. on HiFive Unmatched (FU740) it is 1MHz. There is a high-resolution alternative in form of rdcycle which is clocked at the core clock frequency. The drawbacks are that it may be disabled during sleep (WFI), its frequency might change due to DVFS and it is core-local and therefore cannot be used as a wall-clock. It can however be used for micro-benchmarking user applications, similarly to Aarch64's PMCCNTR PMU counter. The platform is currently marked as linux-only because rte_cycles implementation uses the timebase-frequency device-tree node read through the proc file system. Such approach was chosen because Linux kernel depends on the presence of this device-tree node. The i40e PMD driver is disabled on RISC-V as the rv64gc ISA has no vector operations. 
The compilation of following modules has been disabled by this commit and will be re-enabled in later commits as fixes are introduced: net/ixgbe, net/memif, net/tap, example/l3fwd. Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com> Sponsored-by: Sam Grove <sam.grove@sifive.com> Signed-off-by: Michal Mazurek <maz@semihalf.com> Signed-off-by: Stanislaw Kardach <kda@semihalf.com>	2022-06-08 11:26:20 +02:00
Bruce Richardson	f93a605f2d	build: add definitions for use as Meson subproject To allow other projects to easily use DPDK as a subproject, add in the necessary dependency definitions. Slightly different definitions are necessary for static and shared builds, since for shared builds the drivers should not be linked in, and the internal meson dependency objects are more complete. To use DPDK as a subproject fallback i.e. use installed DPDK if present, otherwise the shipped one, the following meson statement can be used: libdpdk = dependency('libdpdk', fallback: ['dpdk', 'dpdk_dep']) Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Ben Magistro <koncept1@gmail.com> Tested-by: Ben Magistro <koncept1@gmail.com>	2022-06-07 20:49:20 +02:00
Bruce Richardson	013b4c52c7	replace zero-length arrays with flexible ones This patch replaces instances of zero-sized arrays i.e. those at the end of structures with "[0]" with the more standard syntax of "[]". Replacement was done using coccinelle script, with some revert and cleanup of whitespace afterwards. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2022-06-07 16:44:21 +02:00
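For readers unfamiliar with the distinction, a small sketch of the standard "[]" form: a C99 flexible array member must be the last member of the structure, and the payload is sized at allocation time. The structure and helper below are hypothetical examples, not code taken from DPDK.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_msg {
	uint16_t len;
	uint8_t data[]; /* flexible array member; was "data[0]" as a GNU extension */
};

/* Allocate the header plus len payload bytes in a single block. */
static struct sketch_msg *sketch_msg_create(const void *payload, uint16_t len)
{
	struct sketch_msg *m;

	assert(payload != NULL || len == 0);
	m = malloc(sizeof(*m) + len);
	if (m == NULL)
		return NULL;
	m->len = len;
	if (len > 0)
		memcpy(m->data, payload, len);
	return m;
}
```

The caller releases the whole object, header and payload together, with a single free().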
Tyler Retzlaff	ca04c78b62	eal: get/set thread priority per thread identifier Add functions for setting and getting the priority of a thread. Priorities on multiple platforms are similarly determined by a priority value and a priority class/policy. Currently in DPDK most threads operate at the OS-default priority level but there are cases when increasing the priority is useful. For example, high performance applications may require elevated priority levels. For these reasons, EAL will expose two priority levels which are named suggestively "normal" and "realtime_critical" and are computed as follows: On Linux, the following mapping is created: RTE_THREAD_PRIORITY_NORMAL corresponds to * policy SCHED_OTHER * priority value: (sched_get_priority_min(SCHED_OTHER) + sched_get_priority_max(SCHED_OTHER))/2; RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to * policy SCHED_RR * priority value: sched_get_priority_max(SCHED_RR); On Windows, the following mapping is created: RTE_THREAD_PRIORITY_NORMAL corresponds to * class NORMAL_PRIORITY_CLASS * priority THREAD_PRIORITY_NORMAL RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to * class REALTIME_PRIORITY_CLASS (when running with privileges) * class HIGH_PRIORITY_CLASS (when running without privileges) * priority THREAD_PRIORITY_TIME_CRITICAL Note that on Linux the resulting priority value will be 0, in accordance with the documentation, which mentions that the value should be 0 for the SCHED_OTHER policy. Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>	2022-06-07 13:33:14 +02:00
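The Linux mapping listed in the commit above can be written down directly with the POSIX scheduler queries. This is only a sketch of that mapping, not the EAL code: the enum, the structure and sketch_map_priority() are hypothetical names, and only sched_get_priority_min() and sched_get_priority_max() are real library calls.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two EAL priority levels. */
enum sketch_priority {
	SKETCH_PRIORITY_NORMAL,
	SKETCH_PRIORITY_REALTIME_CRITICAL,
};

struct sketch_sched {
	int policy;
	int priority;
};

/* Translate a priority level into a (policy, priority value) pair. */
static int sketch_map_priority(enum sketch_priority prio,
			       struct sketch_sched *out)
{
	assert(out != NULL);
	switch (prio) {
	case SKETCH_PRIORITY_NORMAL:
		out->policy = SCHED_OTHER;
		/* Midpoint of the allowed range; the commit notes that this
		 * evaluates to 0 on Linux. */
		out->priority = (sched_get_priority_min(SCHED_OTHER) +
				 sched_get_priority_max(SCHED_OTHER)) / 2;
		return 0;
	case SKETCH_PRIORITY_REALTIME_CRITICAL:
		out->policy = SCHED_RR;
		out->priority = sched_get_priority_max(SCHED_RR);
		return 0;
	}
	return -1; /* unknown level */
}
```

The resulting pair is the kind of input that pthread_setschedparam() takes (as the policy argument and the sched_priority field); switching a thread to SCHED_RR usually requires elevated privileges.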
Mattias Rönnblom	0bee070907	eal: add seqlock A sequence lock (seqlock) is a synchronization primitive which allows for data-race free, low-overhead, high-frequency reads, suitable for data structures shared across many cores and which are updated relatively infrequently. A seqlock permits multiple parallel readers. A spinlock is used to serialize writers. In cases where there is only a single writer, or writer-writer synchronization is done by some external means, the "raw" sequence counter type (and accompanying rte_seqcount_*() functions) may be used instead. To avoid resource reclamation and other issues, the data protected by a seqlock is best off being self-contained (i.e., no pointers [except to constant data]). One way to think about seqlocks is that they provide means to perform atomic operations on data objects larger than what the native atomic machine instructions allow for. DPDK seqlocks (and the underlying sequence counters) are not preemption safe on the writer side. A thread preemption affects performance, not correctness. A seqlock contains a sequence number, which can be thought of as the generation of the data it protects. A reader will 1. Load the sequence number (sn). 2. Load, in arbitrary order, the seqlock-protected data. 3. Load the sn again. 4. Check if the first and second sn are equal, and even numbered. If they are not, discard the loaded data, and restart from 1. The first three steps need to be ordered using suitable memory fences. A writer will 1. Take the spinlock, to serialize writer access. 2. Load the sn. 3. Store the original sn + 1 as the new sn. 4. Perform load and stores to the seqlock-protected data. 5. Store the original sn + 2 as the new sn. 6. Release the spinlock. Proper memory fencing is required to make sure the first sn store, the data stores, and the second sn store appear to the reader in the mentioned order. The sn loads and stores must be atomic, but the data loads and stores need not be. 
The original seqlock design and implementation was done by Stephen Hemminger. This is an independent implementation, using C11 atomics. For more information on seqlocks, see https://en.wikipedia.org/wiki/Seqlock Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>	2022-06-07 13:33:14 +02:00
Duncan Bellamy	0615dd2aa1	eal/ppc: fix compilation for musl musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase() Signed-off-by: Duncan Bellamy <dunk@denkimushi.com> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>	2022-06-07 13:33:14 +02:00
Sean Morrissey	39b5ab60df	dmadev: add telemetry Telemetry commands are now registered through the dmadev library for the gathering of DSA stats. The corresponding callback functions for listing dmadevs and providing info and stats for a specific dmadev are implemented in the dmadev library. An example usage can be seen below: Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2 {"version": "DPDK 22.03.0-rc2", "pid": 2956551, "max_output_len": 16384} Connected to application: "dpdk-dma" --> / {"/": ["/", "/dmadev/info", "/dmadev/list", "/dmadev/stats", ...]} --> /dmadev/list {"/dmadev/list": [0, 1]} --> /dmadev/info,0 {"/dmadev/info": {"name": "0000:00:01.0", "nb_vchans": 1, "numa_node": 0, "max_vchans": 1, "max_desc": 4096, "min_desc": 32, "max_sges": 0, "capabilities": {"mem2mem": 1, "mem2dev": 0, "dev2mem": 0, ...}}} --> /dmadev/stats,0,0 {"/dmadev/stats": {"submitted": 0, "completed": 0, "errors": 0}} Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Tested-by: Sunil Pai G <sunil.pai.g@intel.com> Tested-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Chengwen Feng <fengchengwen@huawei.com>	2022-06-06 23:31:29 +02:00
Bruce Richardson	e345594f3c	dmadev: clarify visibility of completed jobs Clarify that once an operation has completed, the output of that operation is visible to all cores. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Morten Brørup <mb@smartsharesystems.com>	2022-06-06 23:31:23 +02:00
Andy Pei	486f65e682	vhost: get vDPA device type Vhost backends of different devices have different features. Add an API to get vDPA device type, net device or blk device currently, so users can set different features for different kinds of devices. Signed-off-by: Andy Pei <andy.pei@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:10 +02:00
Andy Pei	7a3ac98428	vhost: support vhost message for get/set config Add support for VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG. VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG messages are only supported by virtio blk VDPA device. Signed-off-by: Andy Pei <andy.pei@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Andy Pei	fb2315564e	vhost: add vDPA operations for block device Get_config and set_config are necessary ops for blk device. Add get_config and set_config ops to vDPA ops. Signed-off-by: Andy Pei <andy.pei@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Wenwu Ma	9e89b06d02	vhost: fix deadlock when message handling failed In vhost_user_msg_handler(), if vhost message handling failed, we should check whether the queue is locked and release the lock before returning. Otherwise, it will cause a deadlock later. Fixes: `7f31d4ea05` ("vhost: fix lock on device readiness notification") Cc: stable@dpdk.org Signed-off-by: Wenwu Ma <wenwux.ma@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Tested-by: Wei Ling <weix.ling@intel.com> Acked-by: David Marchand <david.marchand@redhat.com>	2022-06-01 11:50:09 +02:00
Xuan Ding	84d5204310	vhost: support async dequeue for split ring This patch implements asynchronous dequeue data path for vhost split ring, a new API rte_vhost_async_try_dequeue_burst() is introduced. Signed-off-by: Xuan Ding <xuan.ding@intel.com> Signed-off-by: Yuan Wang <yuanx.wang@intel.com> Tested-by: Yvonne Yang <yvonnex.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-01 11:50:09 +02:00
Xuan Ding	844e113a05	vhost: merge sync and async descriptor to mbuf filling This patch refactors copy_desc_to_mbuf() used by the sync path to support both sync and async descriptor to mbuf filling. Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Yvonne Yang <yvonnex.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-01 11:50:09 +02:00
Xuan Ding	d796fee5e5	vhost: prepare async for descriptor to mbuf refactoring This patch refactors vhost async enqueue path and dequeue path to use the same function async_fill_seg() for preparing batch elements, which simplifies the code without performance degradation. Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Yvonne Yang <yvonnex.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-01 11:50:09 +02:00
Xuan Ding	6d823bb302	vhost: prepare sync for descriptor to mbuf refactoring This patch extracts the descriptors to buffers filling from copy_desc_to_mbuf() into a dedicated function. Besides, enqueue and dequeue path are refactored to use the same function sync_fill_seg() for preparing batch elements, which simplifies the code without performance degradation. Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Yvonne Yang <yvonnex.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-01 11:50:09 +02:00
Maxime Coquelin	0a8363efdb	vhost: add runtime locking check in unsafe API This patch adds runtime checks in unsafe Vhost async APIs, to ensure the access lock is taken. The detection won't work every time, as another thread could take the lock, but it would help to detect misuse of these unsafe APIs. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Maxime Coquelin	69c94e35d5	vhost: add statistics for in-flight packets This patch adds statistics for packets in-flight submission and completion, when Vhost async mode is used. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Maxime Coquelin	7247b7464e	vhost: add statistics for IOTLB This patch adds statistics for IOTLB hits and misses. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Maxime Coquelin	1ea74efd7f	vhost: add statistics for guest notification This patch adds a new virtqueue statistic for guest notifications. It is useful to deduce from hypervisor side whether the corresponding guest Virtio device is using Kernel Virtio-net driver or DPDK Virtio PMD. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
Maxime Coquelin	be75dc99ea	vhost: support per-virtqueue statistics This patch introduces new APIs for the application to query and reset per-virtqueue statistics. The patch also introduces generic counters. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2022-06-01 11:50:09 +02:00
David Marchand	2d47fd3dfb	vhost: fix async access vq->async accesses must be protected with vq->access_lock. Fixes: `eb666d2408` ("vhost: fix async unregister deadlock") Fixes: `0c0935c5f7` ("vhost: allow to check in-flight packets for async vhost") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2022-06-01 11:50:09 +02:00
Thomas Monjalon	bdde2d9094	ethdev: promote port ownership API as stable The port ownership concept was introduced in ethdev in DPDK 18.02. Not sure it is used by applications except those using failsafe or netvsc. It can also be used by libraries or applications to sort out how ports are controlled. Hiding sub-ports controlled by failsafe or netvsc looks to be enough justification to promote this API as stable. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2022-06-02 13:38:26 +02:00
Jiawei Wang	c4e442fa4c	ethdev: add IPv4/IPv6 ECN header rewrite action This patch introduces the IPv4/IPv6 ECN modify field support, and adds the testpmd CLI commands support. Usage: modify_field op set dst_type ipv4_ecn src_type ... For example: flow create 0 ingress group 1 pattern eth / ipv4 / end actions modify_field op set dst_type ipv4_ecn src_type value src_value 0x03 width 2 / queue index 0 / end Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com>	2022-06-02 11:58:38 +02:00

8127 Commits