numam-dpdk

Author	SHA1	Message	Date
Eugenio Pérez	e7cb7fdf54	vhost: clean IOTLB cache on vring stop Old IOVA cache entries are left when there is a change on virtio driver in VM. In case that all these old entries have iova addresses lesser than new iova entries, vhost code will need to iterate all the cache to find the new ones. In case of just a new iova entry needed for the new translations, this condition will last forever. This has been observed in virtio-net to testpmd's vfio-pci driver transition, reducing the performance from more than 10Mpps to less than 0.07Mpps if the hugepage address was higher than the networking buffers. Since all new buffers are contained in this new gigantic page, vhost needs to scan IOTLB_CACHE_SIZE - 1 for each translation at worst. Fixes: `69c90e98f4` ("vhost: enable IOMMU support") Cc: stable@dpdk.org Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Reported-by: Pei Zhang <pezhang@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-09-28 17:26:44 +02:00
Stephen Hemminger	8680efa04c	mbuf: promote Tx offload helper to stable This function should be made stable now. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-10-05 11:04:03 +02:00
Stephen Hemminger	a767a3f43e	mbuf: promote check helper to stable This one has been in for required time period. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-10-05 11:04:03 +02:00
Stephen Hemminger	7ddade3555	mbuf: promote dynamic fields to stable These functions to register dynamic fields were added in 19.11 and should be promoted to stable. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-10-05 11:03:58 +02:00
Stephen Hemminger	bf384709c7	mbuf: promote more helpers to stable These two functions were added in 19.11 as experimental. Time to promote them to stable status. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-10-05 10:59:35 +02:00
David Marchand	8d2a436d69	mbuf: promote some helpers to stable Those accessors have been introduced more than two years ago (rte_mbuf_to_priv in v18.08, rte_mbuf__addr in v19.02). Time to mark them stable. rte_mbuf_to_baddr() could be removed, but since we lack a deprecation notice, keep it as a simple wrapper. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-10-05 10:59:31 +02:00
Sean Morrissey	f01eff0d23	ring: promote new sync modes and peek to stable These methods were introduced in 20.05. There have been no changes in their public API since then. They seem mature enough to remove the experimental tag. Signed-off-by: Sean Morrissey <sean.morrissey@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-10-05 10:09:15 +02:00
Bruce Richardson	47a4f2650c	eal/freebsd: lock memory device to prevent conflicts Only a single DPDK process on the system can be using the /dev/contigmem mappings at a time, but this was never explicitly enforced, e.g. when using --in-memory flag on two processes. To prevent possible conflict issues, we lock the dev node when it's in use, preventing other DPDK processes from starting up and causing problems for us. Fixes: `764bf26873` ("add FreeBSD support") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-10-02 16:30:16 +02:00
Vladimir Medvedkin	8f532a34c4	fib: promote API to stable The fib and fib6 APIs have been in since 19.11 and should be marked as stable. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Conor Walsh <conor.walsh@intel.com>	2021-10-02 11:37:25 +02:00
Stephen Hemminger	4aeb92396b	rib: promote API to stable The rib and rib6 APIs have been in since 19.11 and should be marked as stable. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2021-10-02 11:37:25 +02:00
Stephen Hemminger	2cea5168c1	net: promote string to ethernet to stable This function has been in since 19.11. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-10-02 11:35:17 +02:00
Xiao Wang	20ab35d032	net: promote make rarp packet function to stable rte_net_make_rarp_packet was introduced in version v18.02, there was no change in this public API since then, and it's still being used by vhost lib and virtio driver, so promote it as stable ABI. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-10-02 11:12:32 +02:00
Ivan Malov	8cfad59e29	log: promote some function to stable This one might be quite mature to be attested as stable. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-02 11:12:32 +02:00
Mattias Rönnblom	15a1e00a65	eal: promote random generator with upper bound to stable Remove experimental tag from rte_rand_max(). Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-02 11:12:19 +02:00
Bruce Richardson	b8a0fbab98	telemetry: promote API to stable The telemetry APIs have been present and unchanged for >1 year now, so remove experimental tag from them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-10-01 17:17:28 +02:00
Jie Zhou	09e4eceacb	mempool/stack: build on Windows Enable build of mempool/stack on Windows. Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-10-01 16:46:05 +02:00
Pablo de Lara	8751a7e983	efd: allow more CPU sockets in table creation rte_efd_create() function was using uint8_t for a socket bitmask, for one of its parameters. This limits the maximum of NUMA sockets to be 8. Changing to uint64_t increases it to 64, which should be more future-proof. Coverity issue: 366390 Fixes: `56b6ef874f` ("efd: new Elastic Flow Distributor library") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Tested-by: David Christensen <drc@linux.vnet.ibm.com>	2021-10-01 16:33:20 +02:00
Kevin Traynor	4ad8807cfc	bitrate: promote free function to stable rte_stats_bitrate_free() has been in DPDK since 20.11. Its signature is very basic as it just frees an opaque data struct allocated in rte_stats_bitrate_create() and returns void. It's unlikely that such a basic signature would need to change so might as well promote it to stable for the next major ABI. Signed-off-by: Kevin Traynor <ktraynor@redhat.com>	2021-10-01 15:31:47 +02:00
Kevin Traynor	bdd478eede	bitrate: fix calculation to match API description rte_stats_bitrate_calc() API states it returns 'Negative value on error'. However, the implementation will return the error code from rte_eth_stats_get() which may be non-zero on error. Change the implementation of rte_stats_bitrate_calc() to match the API description by always returning a negative value on error. Fixes: `2ad7ba9a65` ("bitrate: add bitrate statistics library") Signed-off-by: Kevin Traynor <ktraynor@redhat.com>	2021-10-01 15:31:06 +02:00
Kevin Traynor	06ae9f0f92	bitrate: fix registration to match API description rte_stats_bitrate_reg() API states it returns 'Zero on success'. However, the implementation directly returns the return of rte_metrics_reg_names() which may be zero or positive on success, with a positive value also indicating the index. The user of rte_stats_bitrate_reg() should not care about the index as it is stored in the opaque rte_stats_bitrates struct. Change the implementation of rte_stats_bitrate_reg() to match the API description by always returning zero on success. Fixes: `2ad7ba9a65` ("bitrate: add bitrate statistics library") Signed-off-by: Kevin Traynor <ktraynor@redhat.com>	2021-10-01 15:29:21 +02:00
Stephen Hemminger	71ecc415c5	telemetry: detach threads There are a number of telemetry threads which are created and there is nothing that does pthread_join() to wait for them. Mark these threads as detached, so that the pthread library can clean up state when the thread exits. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Ciara Power <ciara.power@intel.com>	2021-10-01 14:50:16 +02:00
Cian Ferriter	0203a14c72	ring: fix Doxygen comment of internal function Change "enqueue" to "dequeue" because the __rte_ring_move_cons_head() function is updating the consumer head for dequeue. Fixes: `0dfc98c507` ("ring: separate out head index manipulation") Cc: stable@dpdk.org Signed-off-by: Cian Ferriter <cian.ferriter@intel.com> Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-10-01 14:39:12 +02:00
William Tu	f1f6ebc0ea	eal: remove sys/queue.h from public headers Currently there are some public headers that include 'sys/queue.h', which is not POSIX, but usually provided by the Linux/BSD system library. (Not in POSIX.1, POSIX.1-2001, or POSIX.1-2008. Present on the BSDs.) The file is missing on Windows. During the Windows build, DPDK uses a bundled copy, so building a DPDK library works fine. But when OVS or other applications use DPDK as a library, because some DPDK public headers include 'sys/queue.h', on Windows, it triggers an error due to no such file. One solution is to install the 'lib/eal/windows/include/sys/queue.h' into Windows environment, such as [1]. However, this means DPDK exports the functionalities of 'sys/queue.h' into the environment, which might cause symbols, macros, headers clashing with other applications. The patch fixes it by removing the "#include <sys/queue.h>" from DPDK public headers, so programs including DPDK headers don't depend on the system to provide 'sys/queue.h'. When these public headers use macros such as TAILQ_xxx, we replace it by the ones with RTE_ prefix. For Windows, we copy the definitions from <sys/queue.h> to rte_os.h in Windows EAL. Note that these RTE_ macros are compatible with <sys/queue.h>, both at the level of API (to use with <sys/queue.h> macros in C files) and ABI (to avoid breaking it). Additionally, the TAILQ_FOREACH_SAFE is not part of <sys/queue.h>, the patch replaces it with RTE_TAILQ_FOREACH_SAFE. [1] http://mails.dpdk.org/archives/dev/2021-August/216304.html Suggested-by: Nick Connolly <nick.connolly@mayadata.io> Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>	2021-10-01 13:09:43 +02:00
Dmitry Kozlyuk	6787d0af94	lib: remove sched.h from public headers Public headers including POSIX-specific <sched.h> were unusable on Windows. These includes were superfluous, remove them. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: William Tu <u9012063@gmail.com>	2021-10-01 08:35:05 +02:00
Dmitry Kozlyuk	b7c3eb57bb	eal/windows: fix export list * Version and randomness API were not added to .def file by mistake, which is why they were later excluded from the export list. * Device API stubs were added to EAL but not exported. Fixes: `edd66d57d5` ("eal/windows: add random function") Fixes: `3d2fcb0e0a` ("eal/windows: add device event stubs") Fixes: `5b637a8481` ("eal: fix querying DPDK version at runtime") Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: William Tu <u9012063@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-09-30 22:47:43 +02:00
Dmitry Kozlyuk	cf665406b1	eal: remove Windows-specific list of common files The majority of common EAL sources that are built for all platforms were listed separately for Windows and for other OS. It seems that developers adding modules to EAL perceived this as if Windows supported only a limited subset of modules and only added new ones into another. Factor the truly common modules into a shared list, then extend it with modules supported by different platforms. When the two lists were created, UUID API implementation was removed from Windows build (apparently by mistake), then excluded from the export list for no reason other than not being built. Restore it. Fixes: `df3ff6be2b` ("eal: simplify meson build of common directory") Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: William Tu <u9012063@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-09-30 22:47:28 +02:00
William Tu	fe81e52a91	eal/windows: export version function When OVS inits, it calls rte_version to get the DPDK's version. The patch fixes the error below by exposing rte_version symbol. libopenvswitch.a(dpdk.c.obj) : error LNK2019: unresolved external symbol rte_version referenced in function dpdk_init Fixes: `5b637a8481` ("eal: fix querying DPDK version at runtime") Cc: stable@dpdk.org Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-09-30 22:47:23 +02:00
Pallavi Kadam	876d40fe6d	net: enable random address on Windows IAVF PMD needs to generate a random MAC address if it is not configured by host. 'random' is now supported on Windows. Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Shivanshu Shukla <shivanshu.shukla@intel.com>	2021-09-30 20:51:11 +02:00
Olivier Matz	f0e18cb4a8	kvargs: fix comments style A '*' is missing at 2 places, add them. Fixes: `e1a00536c8` ("kvargs: add a new library to parse key/value arguments") Fixes: `3ab385063c` ("kvargs: add get by key") Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-30 17:38:13 +02:00
Olivier Matz	6aebb94290	kvargs: add function to get from key and value A quite common scenario with kvargs is to look up a <key>=<value> in a kvlist. For instance, check if name=foo is present in name=toto,name=foo,name=bar. This is currently done in drivers/bus with rte_kvargs_process() + the rte_kvargs_strcmp() handler. This approach is not straightforward, and can be replaced by this new function. rte_kvargs_strcmp() is then removed. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-30 17:38:02 +02:00
Olivier Matz	b36d02ab39	kvargs: promote get from key as stable The function rte_kvargs_get() is used by eal and pci bus driver since its introduction in commit `3ab385063c` ("kvargs: add get by key") and commit `d2a66ad794` ("bus: add device arguments name parsing"), in dpdk 21.05. Let's promote it as stable. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-30 15:31:01 +02:00
Olivier Matz	4f5520d910	kvargs: promote delimited parsing as stable This function is used by EAL to parse key/value strings separated with specified delimiters. It was introduced in 2018 by commit `5d6af85ab0` ("kvargs: introduce a more flexible parsing function"), and can be promoted as stable. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-30 15:30:31 +02:00
Raslan Darawsheh	16b8e92d49	ethdev: use extension header for GTP PSC item This updates the gtp_psc flow item to use the net header definition of the gtp_psc, which is based on 3GPP TS 38.415 (g30). Signed-off-by: Raslan Darawsheh <rasland@nvidia.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-28 12:34:58 +02:00
Raslan Darawsheh	e8ca1479cd	net: add extension header for GTP PSC Define new rte header for GTP PDU session container based on 3GPP TS 38.415 (g30). Signed-off-by: Raslan Darawsheh <rasland@nvidia.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-28 12:34:58 +02:00
Thomas Monjalon	0ce56b057b	ethdev: group constant definitions in Doxygen A lot of flags are parts of a group but are documented alone. The Doxygen syntax @{ and @} for grouping is used to make flags appear together and have a common description. Some Rx/Tx offload flags and RSS definitions are not grouped because they need to be all properly documented first. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Kevin Traynor <ktraynor@redhat.com>	2021-09-27 13:34:45 +02:00
Alvin Zhang	81b0fbb85b	ethdev: add IPv4 and L4 checksum RSS offload types This patch defines new RSS offload types for IPv4 and L4(TCP/UDP/SCTP) checksum, which are required when users want to distribute packets based on the IPv4 or L4 checksum field. For example "flow create 0 ingress pattern eth / ipv4 / end actions rss types ipv4-chksum end queues end / end", this flow causes all matching packets to be distributed to queues on basis of IPv4 checksum. Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com> Reviewed-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Aman Deep Singh <aman.deep.singh@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-21 10:25:42 +02:00
Anatoly Burakov	de4ffd50c9	mem: promote some shared memory config API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	27e7e2509c	mem: promote DMA mask API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	acddc33b3e	mem: promote external memory API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	b893775065	mem: promote memseg API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	437cb6e826	malloc: promote some experimental API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	c335ffdbf7	fbarray: promote experimental API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Anatoly Burakov	1611654bd6	ipc: promote experimental API to stable As per ABI policy, move the formerly experimental APIs to the stable section. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-28 22:07:41 +02:00
Tejasree Kondoj	f0b538a5f8	security: add option to configure tunnel header verification Add option to indicate whether outer header verification needs to be done as part of inbound IPsec processing. With inline IPsec processing, SA lookup would be happening in the Rx path of rte_ethdev. When rte_flow is configured to support more than one SA, SPI would be used to look up the SA. In such cases, additional verification would be required to ensure duplicate SPIs are not getting processed in the inline path. For lookaside cases, the same option can be used by the application to offload tunnel verification to the PMD. These verifications would help in averting possible DoS attacks. Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-09-28 17:40:52 +02:00
Anoob Joseph	ad7515a39f	security: add SA lifetime configuration Add SA lifetime configuration to register soft and hard expiry limits. Expiry can be in units of number of packets or bytes. Crypto op status is also updated to include new field, aux_flags, which can be used to indicate cases such as soft expiry in case of lookaside protocol operations. In case of soft expiry, the packets are successfully IPsec processed but the soft expiry would indicate that SA needs to be reconfigured. For inline protocol capable ethdev, this would result in an eth event while for lookaside protocol capable cryptodev, this can be communicated via `rte_crypto_op.aux_flags` field. In case of hard expiry, the packets will not be IPsec processed and would result in error. Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-09-28 14:11:29 +02:00
Anoob Joseph	63992166ba	security: support user-specified IV Enable the user to provide the IV to be used per security operation. This would be used with lookaside protocol offload for comparing against known vectors. By default, the PMD would internally generate a random IV. Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-09-28 13:35:32 +02:00
Nithin Dabilpuram	d08dcd28c3	security: add option for faster user/meta data access Currently the rte_security_set_pkt_metadata() and rte_security_get_userdata() methods, which set pkt metadata on Inline outbound and get userdata after Inline inbound processing, are always driver-specific callbacks. For drivers that do not have much to do in the callbacks other than update metadata in the rte_security dynamic field and get userdata from the rte_security dynamic field, having to jump to a PMD-specific callback is a costly per-packet operation. This patch provides a mechanism to do the same in an inline function and avoid the function pointer jump if a driver supports it. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-09-28 08:43:47 +02:00
Nithin Dabilpuram	6d1f8c1319	mbuf: enforce semantics for Tx inline IPsec processing Not all net PMDs/HW can parse a packet and identify L2 header and L3 header locations on Tx. This is in line with other Tx offload requirements such as L3 checksum, L4 checksum offload, etc., where mbuf.l2_len, mbuf.l3_len, etc. need to be set for HW to be able to generate the checksum. Since Inline IPsec is also such a Tx offload, some PMDs at least need mbuf.l2_len to be valid to find the L3 header and perform outbound IPsec processing. Hence, this patch updates documentation to enforce setting mbuf.l2_len while setting PKT_TX_SEC_OFFLOAD in mbuf.ol_flags for Inline IPsec Crypto / Protocol offload processing to work on Tx. Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-09-27 09:55:41 +02:00
Shijith Thotton	a0a388a897	eal: add macro to swap two variables Add a macro to swap two variables and update common autotest for the same. Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-09-27 18:33:45 +02:00
Julien Meunier	6ded44bce4	stack: fix reload head when pop fails The previous commit `18effad9cf` ("stack: reload head when pop fails") only changed C11 implementation, not generic implementation. List head must be loaded right before continue (when failed to find the new head). Without this, one thread might keep trying and failing to pop items without ever loading the new correct head. Fixes: `3340202f59` ("stack: add lock-free implementation") Cc: stable@dpdk.org Signed-off-by: Julien Meunier <julien.meunier@nokia.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-09-27 17:28:55 +02:00
Xueming Li	eb5636e879	sched: get 64-bit greatest common divisor This patch adds a new function that computes the greatest common divisor of 64-bit values, and changes the original 32-bit function to call this new 64-bit version. Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Kevin Traynor <ktraynor@redhat.com>	2021-09-27 17:24:16 +02:00
Cristian Dumitrescu	175d213bf8	pipeline: improve handling of learner action arguments The arguments of actions that are learned are now specified as part of the learn instruction as opposed to being statically specified as part of the learner table configuration. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:18:49 +02:00
Cristian Dumitrescu	1c6571c837	pipeline: enable pipeline compilation Commit the pipeline changes when the compilation process is successful: change the table lookup instructions to execute the action function for each action, replace the regular pipeline instructions with the custom instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:10:26 +02:00
Cristian Dumitrescu	f898a475c3	pipeline: build shared object for pipeline Build the generated C file into a shared object library. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Cunming Liang <cunming.liang@intel.com>	2021-09-27 12:10:20 +02:00
Cristian Dumitrescu	724f3ef422	pipeline: generate custom instruction functions Generate a C function for each custom instruction, which essentially consolidates multiple regular instructions into a single function call. The pipeline program is split into groups of instructions, and a custom instruction is generated for each group that has more than one instruction. Special care is taken for the instructions that can do thread yield (RX, extern) and for those that can change the instruction pointer (TX, near/far jump). Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:54 +02:00
Cristian Dumitrescu	d025528d74	pipeline: generate action functions Generate a C function for each action. For most instructions, the associated inline function is called directly. Special care is taken for TX, jump and return instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:45 +02:00
Cristian Dumitrescu	216bc906d0	pipeline: export pipeline instructions to file Export the array of translated instructions to a C file. There is one such array per action and one for the pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:26 +02:00
Cristian Dumitrescu	fc64098a1a	pipeline: introduce pipeline compilation Lay the foundation to generate C code for the pipeline: C functions for actions and custom instructions are generated, built as shared object library and loaded into the pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:15 +02:00
Cristian Dumitrescu	dfa9491a18	pipeline: introduce custom instructions For better performance, the option to create custom instructions when the program is translated and add them on-the-fly to the pipeline is now provided. Multiple regular instructions can now be consolidated into a single C function optimized by the C compiler directly. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:13 +02:00
Cristian Dumitrescu	5dc6a5f2e7	pipeline: introduce action functions For better performance, the option to run a single function per action is now provided, which requires a single function call per action that can be better optimized by the C compiler, as opposed to one function call per instruction. Special table lookup instructions are added to support this feature. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:09:11 +02:00
Cristian Dumitrescu	4bd025dc98	pipeline: enable persistent instruction meta-data Save the instruction meta-data for later use instead of freeing it up once the instruction translation is completed. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:23 +02:00
Cristian Dumitrescu	40baf712ef	pipeline: create inline functions for instruction operands Create inline functions to get the instruction operands. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:20 +02:00
Cristian Dumitrescu	0d5910ddcf	pipeline: create inline functions for meter instructions Create inline functions for the meter instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:18 +02:00
Cristian Dumitrescu	c5d03ffda7	pipeline: create inline functions for register instructions Create inline functions for the register instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:16 +02:00
Cristian Dumitrescu	ed7567c9d7	pipeline: create inline functions for ALU instructions Create inline functions for the ALU instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:15 +02:00
Cristian Dumitrescu	fae7b2baa3	pipeline: create inline functions for DMA instruction Create inline functions for the DMA instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:13 +02:00
Cristian Dumitrescu	b82733ab25	pipeline: create inline functions for move instruction Create inline functions for the move instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:12 +02:00
Cristian Dumitrescu	4884264b17	pipeline: create inline functions for extern instruction Create inline functions for the extern instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:10 +02:00
Cristian Dumitrescu	d1a58ada1a	pipeline: create inline functions for learn instruction Create inline functions for the learn and forget instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:09 +02:00
Cristian Dumitrescu	4565d7db70	pipeline: create inline functions for validate instruction Create inline functions for the validate and invalidate instructions. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:07 +02:00
Cristian Dumitrescu	d60dbdc88a	pipeline: create inline functions for emit instruction Create inline functions for the emit instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 12:03:05 +02:00
Cristian Dumitrescu	2574fd607e	pipeline: create inline functions for extract instruction Create inline functions for the extract instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 11:59:47 +02:00
Cristian Dumitrescu	fcb03ae09e	pipeline: create inline functions for Tx instruction Create inline functions for the Tx instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 11:59:37 +02:00
Cristian Dumitrescu	101d7f09bf	pipeline: create inline functions for Rx instruction Create inline functions for the Rx instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 11:59:28 +02:00
Cristian Dumitrescu	c693add3bf	pipeline: move thread inline functions to header file Move the thread inline functions to the internal header file. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 11:59:07 +02:00
Cristian Dumitrescu	97b8278ad9	pipeline: move data structures to internal header file Start to consolidate the data structures and inline functions required by the pipeline instructions into an internal header file. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 11:43:36 +02:00
Cristian Dumitrescu	4f59d37261	pipeline: support learner tables Add pipeline level support for learner tables. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:52:04 +02:00
Cristian Dumitrescu	0c06fa3bfa	table: support learner tables A learner table is typically used for learning or connection tracking, where it allows for the implementation of the "add on miss" scenario: whenever the lookup key is not found in the table (lookup miss), the data plane can decide to add this key to the table with a given action with no control plane intervention. Likewise, the table keys expire based on a configurable timeout and are automatically deleted from the table with no control plane intervention. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:30:41 +02:00
Cristian Dumitrescu	220d419b86	pipeline: add header look-ahead instruction Added look-ahead instruction to read a header from the input packet without advancing the extraction pointer. This is typically used in correlation with the special extract instruction to extract variable size headers from the input packet: the first few header fields are read without advancing the extraction pointer, just enough to detect the actual length of the header (e.g. IPv4 IHL field); then the full header is extracted. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:15:07 +02:00
Cristian Dumitrescu	5972f82936	pipeline: add variable size headers extract instruction Added a mechanism to extract variable size headers through a special flavor of the extract instruction. The length of the last struct field which has variable size is passed as argument to the instruction. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:14:56 +02:00
Cristian Dumitrescu	cef3896928	pipeline: support variable size headers Added support for variable size headers. The last field of a struct type can now have a variable size between 0 and N bytes. Useful to accommodate IPv4 packets with options, etc. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:14:45 +02:00
Cristian Dumitrescu	5f3e610422	pipeline: prepare for variable size headers The emit instruction that is responsible for pushing headers into the output packet is now reading the header length from internal run-time structures as opposed to constant value from the instruction opcode. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-09-27 09:14:41 +02:00
Harman Kalra	3f156ec284	metrics: promote deinitialize API Remove experimental flag from rte_metrics_deinit(). This API was introduced in 19.11 release. Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-23 19:21:47 +02:00
Tal Shnaiderman	826476bc70	eal/windows: fix debug build When building DPDK on Windows in debug mode the following warning appears: warning: token pasting of ',' and __VA_ARGS__ is a GNU extension [-Wgnu-zero-variadic-macro-arguments] #define open(path, flags, ...) _open(path, flags, ##__VA_ARGS__) Modify the 'open' macro to avoid it. Fixes: `45d62067c2` ("eal: make OS shims internal") Cc: stable@dpdk.org Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-09-23 19:16:27 +02:00
Radu Nicolau	d2671e642a	telemetry: support dict of dicts Add support for dicts of dicts to telemetry library. Increase the max string size to 128. Signed-off-by: Declan Doherty <declan.doherty@intel.com> Signed-off-by: Radu Nicolau <radu.nicolau@intel.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-09-23 14:15:29 +02:00
Thomas Monjalon	ca0c25bbc9	eal: reword logs for CPU and NUMA counts Some logs about cores and nodes were using a hypothetical plural (s) form. A fixed plural form with value at the end is preferred. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-09-23 08:55:20 +02:00
Thomas Monjalon	7a33572057	lib: remove C++ include guard from private headers The private headers are compiled internally with a C compiler. Thus extern "C" declaration is useless in such files. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-09-22 22:00:17 +02:00
Chenbo Xia	945ef8a040	vhost: promote some APIs to stable As reported by symbol bot, APIs listed in this patch have been experimental for more than two years. This patch promotes these 18 APIs to stable. Signed-off-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-09-14 13:21:57 +02:00
Gaoxiang Liu	e53084d84b	vhost: log socket path on adding connection Add log print of socket path in vhost_user_add_connection. It's useful when adding a mass of socket connections, because the information of every connection is clearer. Fixes: `8f972312b8` ("vhost: support vhost-user") Cc: stable@dpdk.org Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-09-14 13:21:57 +02:00
Gaoxiang Liu	451dc0fad8	vhost: fix crash on port deletion The rte_vhost_driver_unregister() and vhost_user_read_cb() can be called at the same time by 2 threads. When the memory of vsocket is freed in rte_vhost_driver_unregister(), the invalid memory of vsocket is accessed in vhost_user_read_cb(). The bug affects both vhost modes, server and client. E.g., vhostuser port is created as server. Thread1 calls rte_vhost_driver_unregister(). Before the listen fd is deleted from poll waiting fds, "vhost-events" thread then calls vhost_user_server_new_connection(), then a new conn fd is added in fdset when trying to reconnect. "vhost-events" thread then calls vhost_user_read_cb() and accesses invalid memory of socket while thread1 frees the memory of vsocket. E.g., vhostuser port is created as client. Thread1 calls rte_vhost_driver_unregister(). Before vsocket of reconn is deleted from reconn list, "vhost_reconn" thread then calls vhost_user_add_connection() then a new conn fd is added in fdset when trying to reconnect. "vhost-events" thread then calls vhost_user_read_cb() and accesses invalid memory of socket while thread1 frees the memory of vsocket. The fix is to move the "fdset_try_del" call before the memory of conn is freed, which avoids the race condition. The core trace is: Program terminated with signal 11, Segmentation fault. Fixes: `52d874dc67` ("vhost: fix crash on closing in client mode") Cc: stable@dpdk.org Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-09-14 13:21:57 +02:00
Jiayu Hu	abeb865255	vhost: remove copy threshold for async path Copy threshold has been introduced in async vhost data path to select the appropriate copy engine to do copies for higher efficiency. However, it may cause packets ordering issues and also introduces performance unpredictability. Therefore, this patch removes copy threshold support in async vhost data path. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-09-14 13:21:55 +02:00
Mohsin Kazmi	818ce1132a	net: fix checksum offload for outer IPv4 Preparation of the headers for the hardware offload misses the outer IPv4 checksum offload. It results in bad checksum computed by hardware NIC. This patch fixes the issue by setting the outer IPv4 checksum field to 0. Fixes: `4fb7e803eb` ("ethdev: add Tx preparation") Cc: stable@dpdk.org Signed-off-by: Mohsin Kazmi <mohsin.kazmi14@gmail.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-09-15 12:51:49 +02:00
David Marchand	b37ed6def3	ethdev: promote sibling iterators to stable This API saw no update since its introduction and will help applications like OVS ([1] and [2]) that currently look at rte_eth_devices[] to achieve the same. 1: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1285 2: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1476 Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-15 11:48:54 +02:00
Haiyue Wang	9fecac6c3a	ethdev: promote burst mode API The DPDK Symbol Bot reports: Please note the symbols listed below have expired. In line with the DPDK ABI policy, they should be scheduled for removal, in the next DPDK release. Symbol rte_eth_rx_burst_mode_get rte_eth_tx_burst_mode_get Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-09-15 10:46:00 +02:00
Joyce Kong	97d7b92772	ethdev: fix typo in Rx queue setup API comment Fix a typo that mb_pool was misspelt as mp_pool. Fixes: `4ff702b5df` ("ethdev: introduce Rx buffer split") Cc: stable@dpdk.org Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-15 10:38:43 +02:00
Pavan Nikhilesh	2def522abc	ethdev: promote API to set packet types Remove experimental tag from rte_eth_dev_set_ptypes(). Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-09-15 09:54:04 +02:00
Xiaoyun Li	dbd34bee98	ethdev: promote API to get interrupt FD per queue Remove the experimental tag for rte_eth_dev_rx_intr_ctl_q_get_fd API that was introduced in 18.11 and has been around for 11 releases. Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: David Marchand <david.marchand@redhat.com>	2021-09-14 18:18:06 +02:00
Conor Walsh	4777674c44	eal: fix memory leak when saving arguments This patch fixes a memleak which was reported in Bugzilla within the eal_save_args function. This was caused by the function mistakenly adding -- to the eal args instead of breaking beforehand. Bugzilla ID: 722 Fixes: `293c53d8b2` ("eal: add telemetry callbacks") Reported-by: Zhihong Peng <zhihongx.peng@intel.com> Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Conor Fogarty <conor.fogarty@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-09-16 22:24:54 +02:00
Stephen Hemminger	7df485eb3d	eal: remove deprecated noninclusive API New API for these were added in 20.11 and the old API was retained but marked deprecated. Since 21.11 is the next LTS, it is time to remove the deprecated ones. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-09-16 17:21:22 +02:00
Dmitry Kozlyuk	5ffa86a2b4	build: propagate Windows system dependencies to pkg-config Windows EAL depends on some system libraries. They were linked using add_project_link_arguments('-l<LIB>'), which prevented meson from adding them to Libs.private of pkg-config file. As a result, applications using pkg-config to find DPDK hit link errors, for example: librte_eal.a(eal_windows_eal_debug.c.obj) : error LNK2019: unresolved external symbol __imp_SymInitialize referenced in function rte_dump_stack Reference required libraries in EAL using ext_deps meson variable. bus/pci and net/pcap depend on lib/eal and will pull them automatically. Drop advapi32 dependency, as MinGW locates VirtualAlloc2() dynamically. Fixes: `2a5d547a4a` ("eal/windows: implement basic memory management") Fixes: `c91717eb75` ("eal/windows: support exit and panic") Cc: stable@dpdk.org Reported-by: William Tu <u9012063@gmail.com> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Tested-by: William Tu <u9012063@gmail.com>	2021-09-14 15:58:34 +02:00
Thomas Monjalon	557610a8ff	cryptodev: fix indent in Meson file Fixes: `af668035f7` ("cryptodev: expose driver interface as internal") Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-09-14 15:58:32 +02:00
Aman Deep Singh	a7db3afce7	net: add macro to extract MAC address bytes Added macros to simplify printing of MAC addresses. The six bytes of a MAC address are extracted in a macro here, to improve code readability. Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-07 19:08:05 +02:00
Aman Deep Singh	c2c4f87b12	net: add macro for MAC address print Added macro to print six bytes of MAC address. The MAC addresses will be printed in upper case hexadecimal format. In case there is a specific check for lower case MAC address, the user may need to make a change in such test case after this patch. Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-07 19:07:46 +02:00
Hemant Agrawal	864c1a40d7	security: support PDCP short MAC-I This patch adds support for handling the PDCP short MAC-I domain along with the standard control and data domains, as it has to be treated as a special case with PDCP protocol offload support. ShortMAC-I is the 16 least significant bits of the calculated MAC-I. Usually when an RRC message is exchanged between UE and eNodeB it is integrity and cipher protected. MAC-I = f(key, varShortMAC-I, count, bearer, direction). Here varShortMAC-I is prepared by using (current cellId, pci of source cell and C-RNTI of old cell). Other parameters like count, bearer and direction are set to all 1. crypto-perf app is updated to take short MAC as input mode. Signed-off-by: Gagandeep Singh <g.singh@nxp.com> Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-09-08 16:54:37 +02:00
Akhil Goyal	af668035f7	cryptodev: expose driver interface as internal The rte_cryptodev_pmd.* files are for drivers only and should be private to DPDK, and not installed for app use. Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2021-09-08 09:35:12 +02:00
Akhil Goyal	e74abd4843	cryptodev: rename function to check device validity The API rte_cryptodev_pmd_is_valid_dev, can be used by the application as well as PMD to check whether the device is valid or not. Hence, _pmd is removed from the API. The applications and drivers which use this API are also updated. Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2021-09-08 09:21:10 +02:00
David Christensen	c13e617739	eal/ppc: ignore GCC 10 stringop-overflow warnings Suppress gcc warning "warning: writing 16 bytes into a region of size 0" for users of the POWER rte_memcpy() function. Existing rte_memcpy() code takes different code paths based on the actual size of the move so the warning is already addressed. See also commit `b5b3ea803e` ("eal/x86: ignore gcc 10 stringop-overflow warnings") Cc: stable@dpdk.org Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>	2021-09-13 09:18:06 +02:00
Xueming Li	b344eb5d94	devargs: parse global device syntax When parsing a devargs, try to parse using the global device syntax first. Fallback on legacy syntax on error. Example of new global device syntax: -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1 Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Gaetan Rivet <grive@u256.net>	2021-09-02 17:02:27 +02:00
Xueming Li	d2a66ad794	bus: add device arguments name parsing For device probe and iterator, devargs name was key information, parsed by rte_devargs_parse. In legacy parser, devargs name was extracted after bus name: bus:name,kv_arguments,,, Example: pci:83:00.0,arguments,... vdev:pcap0,... To be compatible with legacy parser, this patch introduces new bus driver API devargs_parse to parse devargs and update devargs name. If devargs_parse is not implemented by the bus driver, the new syntax parser rte_devargs_layers_parse will by default resolve the devargs name from the bus's "name" argument. Different bus drivers might choose different keys from arguments with unified format. The PCI bus implementation fills the devargs name with the "addr" argument, example: -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,... name: 0000:03:00.0 -a bus=vdev,name=pcap0/class=eth/driver=pcap,... name:pcap0 Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Gaetan Rivet <grive@u256.net>	2021-09-02 16:58:19 +02:00
Thomas Monjalon	fdab8f2e17	version: 21.11-rc0 Start a new release cycle with empty release notes. The ABI version becomes 22.0. The map files are updated to the new ABI major number (22). The ABI exceptions are dropped and CI ABI checks are disabled because compatibility is not preserved. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-08-17 08:37:52 +02:00
Churchill Khangar	69fa4a61ae	pipeline: fix table statistics This patch fixes the memcpy function call which was incorrect and led to memory corruption for tables with more than just a few actions. Fixes: `742b0a57f5` ("pipeline: add table statistics to SWX") Cc: stable@dpdk.org Signed-off-by: Churchill Khangar <churchill.khangar@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-08-04 14:45:13 +02:00
Ciara Power	eeaeca82b8	cryptodev: fix freeing after device release The PMD destroy function was calling the release function, which frees cryptodev->data, and then tries to free cryptodev->data->dev_private, which causes the heap use after free issue. A temporary pointer is set before the free of cryptodev->data, which can then be used afterwards to free dev_private. The free cannot be moved to before the release function is called, as dev_private is used in the PMD close function while being released. Fixes: `9e6edea418` ("cryptodev: add APIs to assist PMD initialisation") Cc: stable@dpdk.org Reported-by: Zhihong Peng <zhihongx.peng@intel.com> Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-07-30 21:08:12 +02:00
Dmitry Kozlyuk	23ce9e0a19	eal/windows: cleanup virt2phys handle eal_mem_virt2phys_init() opens a handle for use by rte_mem_virt2phy(). Close this handle on EAL cleanup. Fixes: `2a5d547a4a` ("eal/windows: implement basic memory management") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-07-30 18:56:01 +02:00
Naga Harish K S V	6922655cad	eventdev: fix event port setup in Tx adapter The event port config set by application in rte_event_eth_tx_adapter_create API is modified in default configuration callback function. This patch removes this hardcode to use application provided event port config value. Fixes: `a3bbf2e097` ("eventdev: add eth Tx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>	2021-07-30 12:55:19 +02:00
Maxime Coquelin	3c929a0bb3	vhost: fix crash on reconnect When the vhost-user frontend like Virtio-user tries to reconnect to the restarted Vhost backend, the Vhost backend segfaults when multiqueue is enabled. This is caused by VHOST_USER_GET_VRING_BASE being called for a virtqueue that has not been created before, causing a NULL pointer dereferencing. This patch adds the VHOST_USER_GET_VRING_BASE requests to the list of requests that trigger queue pair allocations. Fixes: `160cbc815b` ("vhost: remove a hack on queue allocation") Cc: stable@dpdk.org Reported-by: Yinan Wang <yinan.wang@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-07-28 08:27:53 +02:00
Huisong Li	c6ccd1e392	sched: rework configuration failure handling Currently, rte_sched_free_memory() is called multiple times by the exception handling code in rte_sched_subport_config() and rte_sched_pipe_config(). This patch optimizes them into a unified outlet to free memory. Fixes: `ac6fcb841b` ("sched: update subport rate dynamically") Fixes: `34a90f8665` ("sched: modify pipe functions for config flexibility") Fixes: `ce7c4fd7c2` ("sched: add pipe config to subport level") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-24 10:58:58 +02:00
Huisong Li	a042481ecd	sched: fix profile allocation failure handling This patch fixes the return value check when allocating memory to store the subport profile, and releases the memory of 'rte_sched_port' if that allocation fails. Fixes: `0ea4c6afca` ("sched: add subport profile table") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-24 10:58:39 +02:00
Richael Zhuang	d37462e56c	power: check frequencies count before filling array The freqs array size is RTE_MAX_LCORE_FREQS. Before filling the array with num_freqs elements, restrict the total num to RTE_MAX_LCORE_FREQS. This fix aims to fix the coverity scan issue like: Overrunning array "pi->freqs" of 256 bytes by passing it to a function which accesses it at byte offset 464. Coverity issue: 371913 Fixes: `ef1cc88f18` ("power: support cppc_cpufreq driver") Cc: stable@dpdk.org Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-07-24 10:09:58 +02:00
Stephen Hemminger	128c22b998	eal: fix argument in 32-bit safe BSF function The first argument to rte_bsf32_safe was incorrectly declared as a 64 bit value. The code only works on 32 bit values and the underlying function rte_bsf32 only accepts 32 bit values. This was a mistake introduced when the safe version was added and probably caused by copy/paste from the 64 bit version. The bug passed silently under the radar until some other code was built with -Wall and -Wextra in C++ and C++ complains about the missing cast. Yes, this is an API signature change, but the original code was wrong. It is an inline so not an ABI change. Fixes: `4e261f5519` ("eal: add 64-bit bsf and 32-bit safe bsf functions") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>	2021-07-24 09:51:30 +02:00
Jiayu Hu	259caa21d7	vhost: handle memory hotplug for async vhost When the guest memory is hotplugged, the vhost application which enables DMA acceleration must stop DMA transfers before the vhost re-maps the guest memory. This patch is to notify the vhost application of stopping DMA transfers. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-23 10:58:53 +02:00
Cheng Jiang	b737fd6139	vhost: add unsafe async API to clear packets Applications need to stop DMA transfers and finish all the inflight packets when in VM memory hot-plug case and async vhost is used. This patch is to provide an unsafe API to clear inflight packets which are submitted to DMA engine in vhost async data path. Update the program guide and release notes for virtqueue inflight packets clear API in vhost lib. Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-23 10:58:53 +02:00
Cheng Jiang	3f63c19b2b	vhost: fix async callbacks return type The async vhost callback ops should return a negative value when something goes wrong in the callback, so the return type should be changed to int32_t. The issue in the vhost example is also fixed. Fixes: `cd6760da10` ("vhost: introduce async enqueue for split ring") Fixes: `819a716858` ("vhost: fix async callback return type") Fixes: `6b3c81db8b` ("vhost: simplify async copy completion") Fixes: `abec60e711` ("examples/vhost: support vhost async data path") Fixes: `6e9a9d2a02` ("examples/vhost: fix ioat dependency") Fixes: `873e8dad6f` ("vhost: support packed ring in async datapath") Cc: stable@dpdk.org Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-23 10:58:53 +02:00
Jie Zhou	b8617fcc51	eal/windows: check callback parameter of alarm functions EAL functions rte_eal_alarm_set() and rte_eal_alarm_cancel() did not check for invalid parameters in the Windows implementation, which is caught by the unit test alarm_autotest. Enforce parameter check to fail fast for invalid parameters. Fixes: `f4cbdbc7fb` ("eal/windows: implement alarm API") Cc: stable@dpdk.org Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-07-22 22:06:27 +02:00
Anatoly Burakov	565d01226e	power: fix multi-queue scale mode Currently in scale mode, multi-queue initialization will attempt to initialize and de-initialize the per-lcore power library structures multiple times. Fix it to only do this when we are either enabling the first queue or disabling the last queue. Fixes: `5dff9a72b0` ("power: support callbacks for multiple Rx queues") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2021-07-22 21:36:30 +02:00
Jiayu Hu	fa51f1aa08	vhost: add thread-unsafe async registration This patch adds thread unsafe version for async register and unregister functions. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Jiayu Hu	acbc38887b	vhost: rework async configuration structure This patch reworks the async configuration structure to improve code readability. In addition, add preserved padding fields on the structure for future usage. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Jiayu Hu	7f31d4ea05	vhost: fix lock on device readiness notification The vhost notifies the application of device readiness via vhost_user_notify_queue_state(), but calling this function is not protected by the lock. This patch is to make this function call lock protected. Fixes: `d0fcc38f5f` ("vhost: improve device readiness notifications") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Maxime Coquelin	92ed77dce6	vhost: fix packed ring index wrapping Unlike split ring, packed ring does not mandate the ring size to be a power of 2. So we have to use a modulo operation when wrapping ring index. Fixes: `873e8dad6f` ("vhost: support packed ring in async datapath") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-07-21 07:56:13 +02:00
Jiayu Hu	0c0935c5f7	vhost: allow to check in-flight packets for async vhost This patch allows to check the amount of in-flight packets for the vhost queue using async acceleration. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Cheng Jiang	2e3f1ab0d8	vhost: fix async packed ring batch datapath We assume that in the sync path, if there is no buffer wrap in the avail descriptors fetched in a batch, there is no buffer wrap in the used descriptors which need to be written back in this batch, but this assumption is wrong in the async path since there are inflight descriptors which are processed by the DMA device. This patch refactors the batch copy code and adds used ring buffer wrap check as a batch copy condition to fix this issue. Fixes: `873e8dad6f` ("vhost: support packed ring in async datapath") Cc: stable@dpdk.org Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Cheng Jiang	8d2c1260af	vhost: fix index overflow for packed ring in async vhost We introduced some new indexes in packed ring of async vhost. They will eventually overflow and lead to errors if the ring size is not a power of 2. This patch is to check and keep these indexes within a reasonable range. Fixes: `873e8dad6f` ("vhost: support packed ring in async datapath") Cc: stable@dpdk.org Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-07-21 07:56:13 +02:00
Xiao Wang	706ba48665	vhost: check header for legacy dequeue offload When parsing the virtio net header and packet header for dequeue offload, we need to perform sanity check on the packet header to ensure: - No out-of-boundary memory access. - The packet header and virtio_net header are valid and aligned. Fixes: `d0cf91303d` ("vhost: add Tx offload capabilities") Cc: stable@dpdk.org Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-21 07:56:13 +02:00
Cristian Dumitrescu	40d42de563	pipeline: fix selector freeing Due to a typo, the selector_free() function incorrectly takes an early return when the selectors array is non-NULL, as opposed to the other way around. Coverity issue: 371912 Fixes: `cdaa937d3e` ("pipeline: support selector table") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-21 13:51:17 +02:00
Anatoly Burakov	87fb608356	power: fix crash on error for intel_pstate Currently, the error paths can lead to attempts at dereferencing NULL pointers. Add the check to avoid attempts at dereferencing NULL pointers. Coverity issue: 371895 Coverity issue: 371889 Fixes: `06cffd468f` ("power: refactor ACPI and intel_pstate support") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-07-20 17:24:00 +02:00
David Hunt	de8606bf73	distributor: fix 128-bit write alignment When the distributor sample app is built as a 32-bit app, the data buffer passed to find_match_vec can be unaligned, causing a segmentation fault due to writing a 128-bit value using _mm_store_si128(). 128-bit align the data being passed in so this does not happen. Fixes: `775003ad2f` ("distributor: add new burst-capable library") Cc: stable@dpdk.org Signed-off-by: David Hunt <david.hunt@intel.com>	2021-07-20 14:32:08 +02:00
Viacheslav Galaktionov	10eaf41d70	ethdev: keep count of representor ranges in API In its current state, the API can overflow the user-passed buffer if a new representor range appears between function calls. In order to solve this problem, augment the representor info structure with the numbers of allocated and initialized ranges. This way the users of this structure can be sure they will not overrun the buffer. Fixes: `85e1588ca7` ("ethdev: add API to get representor info") Cc: stable@dpdk.org Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Xueming Li <xuemingl@nvidia.com>	2021-07-10 11:29:11 +02:00
Changpeng Liu	7bc7bc3516	eal: suppress error log on multi-process hotplug It is normal for the primary process to already own a device when the secondary process tries to attach it, so suppress the error log in this case. Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>	2021-07-10 10:07:07 +02:00
Cristian Dumitrescu	a3ac0a4836	pipeline: support LPM lookup Add support for the Longest Prefix Match (LPM) lookup to the SWX pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>	2021-07-10 08:30:59 +02:00
Cristian Dumitrescu	cdaa937d3e	pipeline: support selector table Add pipeline-level support for selector tables. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-10 08:26:12 +02:00
Cristian Dumitrescu	f7598a62d1	table: support selector table A selector table is made up of groups of weighted members, with a given member potentially part of several groups. The select operation returns a member ID by first selecting a group based on an input group ID and then selecting a member within that group based on hashing one or several input header/meta-data fields. It is very useful for implementing an ECMP/WCMP-enabled FIB or a load balancer. It is part of the action selector described by the P4 Portable Switch Architecture (PSA) specification. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-09 23:31:54 +02:00
Cristian Dumitrescu	a57d92d73d	pipeline: fix table entry read The rte_swx_pipeline_table_entry_read() function is used to read from a character string a table entry that is to be added to the table, deleted from the table or set as the default entry of the table. Addition needs both the match and the action part of the entry, deletion ignores the action part, while the default set ignores the match part, hence the need to make both the match and the action part optional. The logic for skipping the match or the action part was broken, hence the current fix. Fixes: `b32c0a2c5e` ("pipeline: add SWX table update high level API") Cc: stable@dpdk.org Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com> Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>	2021-07-09 22:52:19 +02:00
Thierry Herbelot	3fc2ddffde	table: fix bucket empty check Due to a typo, only 3 out of 4 keys in the bucket of the exact match table were considered, which can result in valid keys being incorrectly dropped from the table. Fixes: `d0a0096661` ("table: add exact match SWX table") Cc: stable@dpdk.org Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-07-09 22:42:24 +02:00
Chengwen Feng	5aa9189d74	config/arm: fix SVE build with GCC 8.3 If the target machine has the SVE feature (e.g. "-march=armv8.2-a+sve"), and the compiler is gcc-8.3, it will produce this error: In file included from lib/eal/common/eal_common_options.c:38: lib/eal/arm/include/rte_vect.h:13:10: fatal error: arm_sve.h: No such file or directory #include <arm_sve.h> ^~~~~~~~~~~ The root cause is that gcc-8.3 supports SVE (the macro __ARM_FEATURE_SVE was 1), but it doesn't support SVE ACLE [1]. The solution: a) Detect whether the compiler supports SVE ACLE; if it does, define the RTE_HAS_SVE_ACLE macro. b) Use the RTE_HAS_SVE_ACLE macro to include the SVE header file. [1] ACLE: Arm C Language Extensions, the SVE ACLE header file is <arm_sve.h>, user should include it when writing ACLE SVE code. Fixes: `67b68824a8` ("lpm/arm: support SVE") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-07-09 22:25:24 +02:00
Ruifeng Wang	cac2a49b4a	ring: use WFE to wait for tail update on aarch64 Instead of polling for tail to be updated, use WFE instruction. Signed-off-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-07-09 21:33:01 +02:00
Gavin Hu	fa6b488998	spinlock: use WFE to reduce contention on aarch64 In acquiring a spinlock, cores repeatedly poll the lock variable. This is replaced by rte_wait_until_equal API. Running micro benchmarking and testpmd and l3fwd traffic tests on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well and no notable performance gain nor degradation was measured. Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-07-09 21:33:01 +02:00
Anatoly Burakov	f53fe635c1	power: support monitoring multiple Rx queues Use the new multi-monitor intrinsic to allow monitoring multiple ethdev Rx queues while entering the energy efficient power state. The multi version will be used unconditionally if supported, and the UMWAIT one will only be used when multi-monitor is not supported by the hardware. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2021-07-09 21:13:13 +02:00
Anatoly Burakov	5dff9a72b0	power: support callbacks for multiple Rx queues Currently, there is a hard limitation on the PMD power management support that only allows it to support a single queue per lcore. This is not ideal as most DPDK use cases will poll multiple queues per core. The PMD power management mechanism relies on ethdev Rx callbacks, so it is very difficult to implement such support because callbacks are effectively stateless and have no visibility into what the other ethdev devices are doing. This places limitations on what we can do within the framework of Rx callbacks, but the basics of this implementation are as follows: - Replace per-queue structures with per-lcore ones, so that any device polled from the same lcore can share data - Any queue that is going to be polled from a specific lcore has to be added to the list of queues to poll, so that the callback is aware of other queues being polled by the same lcore - Both the empty poll counter and the actual power saving mechanism is shared between all queues polled on a particular lcore, and is only activated when all queues in the list were polled and were determined to have no traffic. - The limitation on UMWAIT-based polling is not removed because UMWAIT is incapable of monitoring more than one address. Also, while we're at it, update and improve the docs. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2021-07-09 21:13:13 +02:00
Anatoly Burakov	209fd58545	power: make ethdev power management thread unsafe Currently, we expect that only one callback can be active at any given moment, for a particular queue configuration, which is relatively easy to implement in a thread-safe way. However, we're about to add support for multiple queues per lcore, which will greatly increase the possibility of various race conditions. We could have used something like an RCU for this use case, but absent a pressing need for thread safety we'll go the easy way and just mandate that the APIs are to be called when all affected ports are stopped, and document this limitation. This greatly simplifies the `rte_power_monitor`-related code. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2021-07-09 21:13:13 +02:00
Anatoly Burakov	66834f2974	eal: add power monitor for multiple events Use RTM and WAITPKG instructions to perform a wait-for-writes similar to what UMWAIT does, but without the limitation of having to listen for just one event. This works because the optimized power state used by the TPAUSE instruction will cause a wake up on RTM transaction abort, so if we add the addresses we're interested in to the read-set, any write to those addresses will wake us up. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: David Hunt <david.hunt@intel.com>	2021-07-09 21:13:13 +02:00
Anatoly Burakov	6afc4baf4f	eal: use callbacks for power monitoring comparison Previously, the semantics of power monitor were such that we were checking current value against the expected value, and if they matched, then the sleep was aborted. This is somewhat inflexible, because it only allowed us to check for a specific value in a specific way. This commit replaces the comparison with a user callback mechanism, so that any PMD (or other code) using `rte_power_monitor()` can define their own comparison semantics and decision making on how to detect the need to abort the entering of power optimized state. Existing implementations are adjusted to follow the new semantics. Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: David Hunt <david.hunt@intel.com> Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>	2021-07-09 21:13:13 +02:00
Juraj Linkeš	845048c522	eal/arm: update CPU flags There are two execution states on armv8 architecture, aarch64 and aarch32. Add PLATFORM_STR for the latter and update RTE_ARCH_* flags according to `e9b9739264`. Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>	2021-07-09 20:00:19 +02:00
Ferruh Yigit	b67f598e23	kni: update link only on change 'rte_kni_update_link()' updates virtual KNI interface link using kernel sysfs interface. If the requested link status is same as interface link status, do not update the link status but return with success. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-07-09 17:22:42 +02:00
Richael Zhuang	ef1cc88f18	power: support cppc_cpufreq driver Currently in DPDK only acpi_cpufreq and pstate_cpufreq drivers are supported, which are both not available on arm64 platforms. Add support for cppc_cpufreq driver which works on most arm64 platforms. Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-07-09 16:04:46 +02:00
Anatoly Burakov	06cffd468f	power: refactor ACPI and intel_pstate support Currently, ACPI and PSTATE modes have lots of code duplication, confusing logic, and a bunch of other issues that can, and have, led to various bugs and resource leaks. This commit factors out the common parts of sysfs reading/writing for ACPI and PSTATE drivers. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com>	2021-07-08 22:32:13 +02:00
Anatoly Burakov	02a6d68311	power: fix namespace for internal struct Currently, ACPI code uses rte_power_info as the struct name, which gives the appearance that this is an externally visible API. Fix to use internal namespace. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-07-08 22:32:13 +02:00
Huisong Li	02edbfab1e	ethdev: add dev configured flag Currently, if dev_configure is not called or fails to be called, users can still call dev_start successfully. So it is necessary to have a flag which indicates whether the device is configured, to control whether dev_start can be called and eliminate dependency on user invocation order. The flag stored in "struct rte_eth_dev_data" is more reasonable than "enum rte_eth_dev_state". "enum rte_eth_dev_state" is private to the primary and secondary processes, and can be independently controlled. However, the secondary process does not make resource allocations and does not call dev_configure(). These are done by the primary process and can be obtained or used by the secondary process. So this patch adds a "dev_configured" flag in "rte_eth_dev_data", like "dev_started". Signed-off-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> libabigail raised a warning on this change. This change is fine wrt ABI as far as we understand, but we can't express an exception rule (see libabigail bug #28060) to waive the changes only in this part of the rte_eth_dev_data struct. The solution for now is to globally waive any change on the rte_eth_dev_data structure. Signed-off-by: David Marchand <david.marchand@redhat.com>	2021-07-08 13:05:55 +02:00
David Marchand	e7885281de	ipc: stop mp control thread on cleanup When calling rte_eal_cleanup, the mp channel cleanup routine only sets mp_fd to -1 leaving the rte_mp_handle control thread running. This control thread can spew warnings on reading on an invalid fd. This is especially noticed with ASAN enabled. To handle this situation, set mp_fd to -1 to signal the control thread it should exit, but since this thread might be sleeping on the socket, cancel the thread too. Fixes: `85d6815fa6` ("eal: close multi-process socket during cleanup") Cc: stable@dpdk.org Reported-by: Owen Hilyard <ohilyard@iol.unh.edu> Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-07-08 13:05:55 +02:00
Jan Viktorin	56912ddef2	ethdev: fix doc of flow action The struct rte_flow_action was missing from DPDK API documentation. Fixes: `3850cf0c8c` ("ethdev: add tunnel encap/decap actions") Cc: stable@dpdk.org Signed-off-by: Jan Viktorin <viktorin@cesnet.cz> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Aman Deep Singh <aman.deep.singh@intel.com>	2021-07-02 19:03:03 +02:00
Jie Zhou	799a5b9aca	eal/windows: add clock function Add clock_gettime() on Windows in rte_os_shim.h. Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-07-02 19:03:03 +02:00
Jie Zhou	3d2fcb0e0a	eal/windows: add device event stubs Add device event stubs in eal_dev.c for Windows Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-07-02 19:03:03 +02:00
Jie Zhou	22f463e181	eal/windows: add macros required by testpmd Add required macros by testpmd on Windows in rte_os_shim.h Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-07-02 19:03:03 +02:00
Jie Zhou	786881d152	lib: build testpmd dependencies on Windows Enable building libraries that testpmd depends on for Windows Signed-off-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-07-02 19:03:03 +02:00
Olivier Matz	45a08ef55e	net: introduce functions to verify L4 checksums Since commit `d5df2ae042` ("net: fix unneeded replacement of TCP checksum 0"), the functions rte_ipv4_udptcp_cksum() and rte_ipv6_udptcp_cksum() can return either 0x0000 or 0xffff when used to verify a packet containing a valid checksum. Since these functions should be used to calculate the checksum to set in a packet, introduce 2 new helpers for checksum verification. They return 0 if the checksum is valid in the packet. Use this new helper in net/tap driver. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-07-02 19:03:03 +02:00
David Marchand	40edb9c0d3	eal: handle compressed firmware Introduce an internal firmware loading helper to remove code duplication in our drivers and handle xz compressed firmware by calling libarchive. This helper tries to look for .xz suffixes so that drivers are not aware the firmware has been compressed. libarchive is set as an optional dependency: without libarchive, a runtime warning is emitted so that users know there is a compressed firmware. Windows implementation is left as an empty stub. Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Igor Russkikh <irusskikh@marvell.com> Acked-by: Aaron Conole <aconole@redhat.com> Tested-by: Haiyue Wang <haiyue.wang@intel.com>	2021-07-07 16:41:53 +02:00
Bruce Richardson	d5252f7d4b	telemetry: add extra log message on socket bind failure If the library fails to create the needed socket, add an additional check to report if the error is due to a missing DPDK runtime dir. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-07-07 15:23:53 +02:00
Bruce Richardson	ce382fdddb	eal: create runtime dir even when shared data is not used When multi-process is not wanted and DPDK is run with the "no-shconf" flag, the telemetry library still needs a runtime directory to place the unix socket for telemetry connections. Therefore, rather than not creating the directory when this flag is set, we can change the code to attempt the creation anyway, but not error out if it fails. If it succeeds, then telemetry will be available, but if it fails, the rest of DPDK will run without telemetry. This ensures that the "in-memory" flag will allow DPDK to run even if the whole filesystem is read-only, for example. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Morten Brørup <mb@smartsharesystems.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-07-07 15:23:09 +02:00
Maxime Coquelin	b3d4a18b9c	vhost: use DPDK allocations for in-flight data Inflight metadata are allocated using glibc's calloc. This patch converts them to rte_zmalloc_socket to take care of the NUMA affinity. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:32:38 +02:00
Maxime Coquelin	b81c93466d	vhost: allocate all data on same node as virtqueue This patch saves the NUMA node the virtqueue is allocated on at init time, in order to allocate all other data on the same node. While most of the data are allocated before numa_realloc() is called and so the data will be reallocated properly, some data like the log cache are most likely allocated after. For the virtio device metadata, we decide to allocate them on the same node as the VQ 0. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:32:13 +02:00
Maxime Coquelin	97b4e3b1d0	vhost: improve NUMA reallocation This patch improves the numa_realloc() function by making use of rte_realloc_socket(), which takes care of the memory copy and freeing of the old data. Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:32:13 +02:00
Maxime Coquelin	6305dfeff4	vhost: fix NUMA reallocation with multi-queue Since the Vhost-user device initialization has been reworked, enabling the application to start using the device as soon as the first queue pair is ready, NUMA reallocation no longer happened on queue pairs other than the first one since numa_realloc() was returning early if the device was running. This patch fixes this issue by reallocating the device metadata only if the device is not running. For the virtqueues, a vring state change notification is sent to notify the application of its disablement. Since the callback is supposed to be blocking, it is safe to reallocate it afterwards. Fixes: `d0fcc38f5f` ("vhost: improve device readiness notifications") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:32:13 +02:00
Maxime Coquelin	eb40c50c17	vhost: fix missing cache logging NUMA realloc When the guest allocates virtqueues on a different NUMA node than the one the Vhost metadata are allocated, both the Vhost device struct and the virtqueues struct are reallocated. However, reallocating the log cache on the new NUMA node was not done. This patch fixes this by reallocating it if it has been allocated already, which means a live-migration is on-going. Fixes: `1818a63147` ("vhost: move dirty logging cache out of virtqueue") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:32:13 +02:00
Maxime Coquelin	57589cdfd7	vhost: fix missing guest pages table NUMA realloc When the guest allocates virtqueues on a different NUMA node than the one the Vhost metadata are allocated, both the Vhost device struct and the virtqueues struct are reallocated. However, reallocating the guest pages table was missing, which likely causes at least one cross-NUMA access for every burst of packets. This patch reallocates this table on the same NUMA node as the other metadata. Fixes: `e246896178` ("vhost: get guest/host physical address mappings") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:26:58 +02:00
Maxime Coquelin	8119ca9114	vhost: fix missing memory table NUMA realloc When the guest allocates virtqueues on a different NUMA node than the one the Vhost metadata are allocated, both the Vhost device struct and the virtqueues struct are reallocated. However, reallocating the Vhost memory table was missing, which likely causes at least one cross-NUMA access for every burst of packets. This patch reallocates this table on the same NUMA node as the other metadata. Fixes: `552e8fd3d2` ("vhost: simplify memory regions handling") Cc: stable@dpdk.org Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-06-30 13:26:58 +02:00
Xueming Li	35d4f17b3d	devargs: add common key definition Add common devargs key definition for "bus", "class" and "driver". Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-07-05 16:33:18 +02:00
Thomas Monjalon	dbba7c9efb	eal: save error in string copy The string copy API rte_strscpy() did not set rte_errno on failure; instead it just returned a negative error number. Set rte_errno if the destination buffer is too small. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-07-05 15:11:30 +02:00
Ruifeng Wang	18f0b28eec	eal/arm: remove unused type Data types Elf32_auxv_t and Elf64_auxv_t are used by OS Linux auxiliary vector read, and not used by arch specific cpu flag API implementations. Hence remove them from Arm file. Reported-by: James Grant <j.grant@qub.ac.uk> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-07-05 09:50:51 +02:00
Owen Hilyard	03b8372a9a	rib: fix max depth IPv6 lookup ASAN found a stack buffer overflow in lib/rib/rte_rib6.c:get_dir. The fix for the stack buffer overflow was to make sure depth was always < 128, since when depth = 128 it caused the index into the ip address to be 16, which read off the end of the array. While trying to solve the buffer overflow, I noticed that a few changes could be made to remove the for loop entirely. Fixes: `f7e861e21c` ("rib: support IPv6") Cc: stable@dpdk.org Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>	2021-06-24 15:34:45 +02:00
Owen Hilyard	016441e3c7	flow_classify: fix leaking rules on delete Rules in a classify table were not freed if the table had a delete function. Fixes: `be41ac2a33` ("flow_classify: introduce flow classify library") Cc: stable@dpdk.org Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu> Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>	2021-06-24 15:34:45 +02:00
Yunjian Wang	0db3d5551a	kni: fix mbuf allocation for kernel side use In kni_allocate_mbufs(), we allocate mbufs for alloc_q with this code: allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \ & (MAX_MBUF_BURST_NUM - 1); The value of allocq_free may be zero, for example: The ring size is 1024. After init, write = read = 0. Then we fill kni->alloc_q to full. At this time, write = 1023, read = 0. Then the kernel sends 32 packets to userspace. At this time, write = 1023, read = 32. And then the userspace receives these 32 packets. Then fill the kni->alloc_q, (32 - 1023 - 1) & 31 = 0, fill nothing. ... Then the kernel sends 32 packets to userspace. At this time, write = 1023, read = 992. And then the userspace receives these 32 packets. Then fill the kni->alloc_q, (992 - 1023 - 1) & 31 = 0, fill nothing. Then the kernel sends 32 packets to userspace. The kni->alloc_q only has 31 mbufs and will drop one packet. Admittedly, this is a special case. Normally, it will fill some mbufs every time, but possibly not enough for the kernel to use. In this patch, we always keep the kni->alloc_q full for the kernel to use. Fixes: `49da4e82cf` ("kni: allocate no more mbuf than empty slots in queue") Cc: stable@dpdk.org Signed-off-by: Cheng Liu <liucheng11@huawei.com> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-06-24 09:42:37 +02:00
Balazs Nemeth	242695f612	vhost: allocate and free packets in bulk in Tx split Same idea as commit `a287ac2891` ("vhost: allocate and free packets in bulk in Tx packed"), allocate and free packets in bulk. Also remove the unused function virtio_dev_pktmbuf_alloc. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-06-23 09:55:34 +02:00
Thierry Herbelot	9cfbe67691	vhost/crypto: check request pointer before dereference Use vc_req only after it was checked not to be NULL. Fixes: `2d962bb736` ("vhost/crypto: fix possible TOCTOU attack") Cc: stable@dpdk.org Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-06-23 09:55:23 +02:00
Dmitry Kozlyuk	cfdaa678b3	eal/windows: cleanup interrupt resources Interrupt manager in Windows EAL allocates an IOCP and starts a control thread that runs indefinitely. At DPDK cleanup this thread was not stopped and IOCP handle was not closed. Gracefully stop interrupt-handling in rte_eal_cleanup(). The thread already closes IOCP handle before exiting. Fixes: `5c016fc020` ("eal/windows: add interrupt thread skeleton") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Jie Zhou <jizh@microsoft.com> Tested-by: Jie Zhou <jizh@microsoft.com>	2021-06-23 09:05:36 +02:00
Dmitry Kozlyuk	3888f31950	eal/windows: fix interrupt thread handle leakage Each time a work was scheduled in the interrupt thread, usually an alarm, a handle was opened but not closed. Opening a handle is a system call, which harms alarm precision. Instead of opening and closing a handle each time, open it when interrupt thread starts and close it when the thread finishes. Fixes: `5c016fc020` ("eal/windows: add interrupt thread skeleton") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>	2021-06-23 09:04:28 +02:00
Dmitry Kozlyuk	35dff5d3b7	eal/windows: fix interrupt thread ID Interrupt thread ID retained its value after interrupt thread finish. Other interrupt routines could then operate on the wrong thread. Clear interrupt thread ID before thread termination. Fixes: `5c016fc020` ("eal/windows: add interrupt thread skeleton") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>	2021-06-23 09:03:14 +02:00
Christian Ehrhardt	31c5af644b	vfio: add stdbool include This became visible by backporting the following for the 19.11 stable tree: `c13ca4e8` "vfio: fix DMA mapping granularity for IOVA as VA" The usage of type bool in the vfio code would require "#include <stdbool.h>", but rte_vfio.h has no direct paths to stdbool.h. It happens that in eal_vfio_mp_sync.c it comes after "#include <rte_log.h>". And rte_log.h since 20.05 includes stdbool since this change: `241e67bfe` "log: add API to check if a logtype can log in a given level" and thereby mitigates the issue. It should be safe to include stdbool.h from rte_vfio.h itself to be present exactly when needed for the struct it defines using that type. Fixes: `c13ca4e81c` ("vfio: fix DMA mapping granularity for IOVA as VA") Cc: stable@dpdk.org Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-06-17 10:31:33 +02:00
Konstantin Ananyev	b3b36f0fbf	acl: fix build with GCC 6.3 --buildtype=debug with gcc 6.3 produces the following error: ../lib/librte_acl/acl_run_avx512_common.h: In function ‘resolve_match_idx_avx512x16’: ../lib/librte_acl/acl_run_avx512x16.h:33:18: error: the last argument must be an 8-bit immediate ^ ../lib/librte_acl/acl_run_avx512_common.h:373:9: note: in expansion of macro ‘_M_I_’ return _M_I_(slli_epi32)(mi, match_log); ^~~~~ Seems like gcc-6.3 complains about the following construct: static const uint32_t match_log = 5; ... _mm512_slli_epi32(mi, match_log); It can't substitute constant variable 'match_log' with its actual value. The fix replaces constant variable with its immediate value. Bugzilla ID: 717 Fixes: `b64c2295f7` ("acl: add 256-bit AVX512 classify method") Fixes: `45da22e42e` ("acl: add 512-bit AVX512 classify method") Cc: stable@dpdk.org Reported-by: Liang Ma <liangma@liangbit.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-06-17 09:37:11 +02:00
David Marchand	2ca92f5441	malloc: fix size annotation for NUMA-aware realloc __rte_alloc_size is mapped to compiler alloc_size attribute. Quoting gcc documentation: """ alloc_size The alloc_size attribute is used to tell the compiler that the function return value points to memory, where the size is given by one or two of the functions parameters. GCC uses this information to improve the correctness of __builtin_object_size. The function parameter(s) denoting the allocated size are specified by one or two integer arguments supplied to the attribute. The allocated size is either the value of the single function argument specified or the product of the two function arguments specified. Argument numbering starts at one. """ In rte_realloc_socket case, only 'size' matters. Note: this has been spotted by Maxime trying to use rte_realloc_socket and compiling with gcc 11. Fixes: `17b347dab7` ("malloc: add alloc_size attribute to functions") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-06-11 11:03:38 +02:00
Ivan Ilchenko	1ffd3bc125	bitmap: fix buffer overrun in bitmap init The bitmap initialization function was allowed to memset() the caller-provided buffer with a number of bytes exceeding the buffer size. This happens due to a wrong comparison sign between the buffer size and the number of bytes required to initialize the bitmap. Fixes: `602c9ca33a` ("sched: bitmap is now dynamically allocated") Cc: stable@dpdk.org Reported-by: Andy Moreton <amoreton@xilinx.com> Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-06-11 11:03:25 +02:00
Haiyue Wang	21f6adec07	bus/pci: configure PCI bus master Add the API to set 'Bus Master Enable' bit to be enabled or disabled in the PCI command register. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>	2021-06-04 09:38:08 +02:00
David Marchand	64307fad7d	telemetry: remove static limit on callbacks count This code is not performance sensitive and can be switched to dynamic allocations. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-06-03 18:36:03 +02:00
Hongbo Zheng	2d2bf7de1a	graph: fix null dereference in stats In function 'stats_mem_init', pointer 'stats' should be confirmed not null before memset it. Fixes: `af1ae8b6a3` ("graph: implement stats") Cc: stable@dpdk.org Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-06-03 18:35:57 +02:00
Hongbo Zheng	3b47572fbe	graph: fix memory leak in stats Fix function 'stats_mem_populate' return without free dynamic memory referenced by 'stats'. Fixes: `af1ae8b6a3` ("graph: implement stats") Cc: stable@dpdk.org Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-06-03 18:35:47 +02:00
Thomas Monjalon	0d025f019c	ethdev: fix comments of packet integrity flow item The Doxygen comments are placed before the related lines, but the markers were /**< instead of /**. The struct rte_flow_item_integrity did not appear in Doxygen output because there was no general comment for the struct. Fixes: `b10a421a1f` ("ethdev: add packet integrity check flow rules") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-05-19 22:52:03 +02:00
David Marchand	f31ce483bc	vhost: restore IOTLB mempool allocation IOTLB messages will be sent when some queues are not enabled. If we initialize IOTLB in vhost_user_set_vring_num, it could happen that IOTLB update comes when IOTLB pool of disabled queues are not initialized. Fixes: `968bbc7e2e` ("vhost: avoid IOTLB mempool allocation while IOMMU disabled") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-05-18 09:54:44 +02:00
Balazs Nemeth	3ad55b8e94	vhost: fix stored last used index The optimization introduced by commit `d18db8049c` ("vhost: read last used index once") didn't account for the fact that vhost_flush_enqueue_shadow_packed increments the last_used_idx. For this reason, store last_used_idx after the potential call to vhost_flush_enqueue_shadow_packed. Bugzilla ID: 699 Fixes: `d18db8049c` ("vhost: read last used index once") Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Wei Ling <weix.ling@intel.com>	2021-05-18 09:43:35 +02:00
Cheng Jiang	35139e648a	vhost: fix sign extension in async packed ring Change the variable type in store_dma_desc_info_packed() to fix suspicious implicit sign extension. Coverity issue: 370608, 370610, 370612 Fixes: `873e8dad6f` ("vhost: support packed ring in async datapath") Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>	2021-05-12 10:28:18 +02:00
Cheng Jiang	11a7cd8c92	vhost: fix sign extension in async split ring Change the variable type in store_dma_desc_info_split() to fix suspicious implicit sign extension. Coverity issue: 370604, 370607, 370609 Fixes: `3d6cb86b0d` ("vhost: refactor async split ring functions") Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>	2021-05-12 10:28:08 +02:00
David Marchand	8eff201b00	net/ice: fix leak on thread termination A terminated pthread should be joined or detached so that its associated resources are released. The "ice-reset-<vf_id>" threads are used to service some reset task in the background, but they are never joined by the thread that created them. The easiest solution is to detach new threads. The Windows EAL did not provide a pthread_detach wrapper but there is no resource to release for Windows threads, so add an empty wrapper. Fixes: `3b3757bda3` ("net/ice: get VF hardware index in DCF") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-05-11 23:40:22 +02:00
Hongbo Zheng	1fe00fd358	power: fix sanity checks for guest channel read In function power_guest_channel_read_msg, 'lcore_id' is used before validity check, which may cause buffer 'global_fds' accessed by index 'lcore_id' overflow. This patch moves the validity check of 'lcore_id' before the 'lcore_id' being used for the first time. Fixes: `9dc843eb27` ("power: extend guest channel API for reading") Cc: stable@dpdk.org Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-05-12 17:18:38 +02:00
Chengwen Feng	cc994d3922	ipc: use monotonic clock Currently, the mp uses gettimeofday() API to get the time, and used as timeout parameter. But the time which gets from gettimeofday() API isn't monotonically increasing. The process may fail if the system time is changed. This fixes it by using clock_gettime() API with monotonic attribution. Fixes: `783b6e5497` ("eal: add synchronous multi-process communication") Fixes: `f05e26051c` ("eal: add IPC asynchronous request") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Morten Brørup <mb@smartsharesystems.com>	2021-05-12 16:49:08 +02:00
Lance Richardson	6beb2d2947	eal: fix memory mapping on 32-bit target For 32-bit targets, size_t is normally a 32-bit type and does not have sufficient range to represent 64-bit offsets that are needed when mapping PCI addresses. Use uint64_t instead. Found when attempting to run 32-bit Linux dpdk-testpmd using VFIO driver: EAL: pci_map_resource(): cannot map resource(63, 0xc0010000, \ 0x200000, 0x20000000000): Invalid argument ((nil)) Fixes: `c4b89ecb64` ("eal: introduce memory management wrappers") Cc: stable@dpdk.org Signed-off-by: Lance Richardson <lance.richardson@broadcom.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-05-11 23:01:06 +02:00
David Marchand	1657e1f871	net: fix header include order for FreeBSD Spotted by sparse in OVS build: ../../lib/netdev-dpdk.c: note: in included file (through /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_ip.h, /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h, ...): ../../include/sparse/arpa/inet.h:22:2: error: "Must include <netinet/in.h> before <arpa/inet.h> for FreeBSD support" This is a check enforced by OVS itself. See [1] for some context. 1: https://github.com/openvswitch/ovs/commit/b2befd5bb2db Fixes: `89813a522e` ("net: provide IP-related API on any OS") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-11 15:44:38 +02:00
David Marchand	dc2c712f72	net: add endianness annotations to ethernet headers Spotted by sparse in OVS build: /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27: error: incorrect type in initializer (different base types) /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27: expected unsigned short [usertype] ether_type /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27: got restricted ovs_be16 [usertype] /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25: error: incorrect type in initializer (different base types) /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25: expected unsigned short [usertype] vlan_tci /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25: got restricted ovs_be16 [usertype] /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26: error: incorrect type in initializer (different base types) /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26: expected unsigned short [usertype] eth_proto /home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26: got restricted ovs_be16 [usertype] This was not caught before as no code in headers was using those fields. This changed with commit `6f2168b69a` ("ethdev: reuse ethernet header definition in flow item") and commit `a56a262e34` ("ethdev: reuse VLAN header definition in flow item"). Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-11 15:22:26 +02:00
David Marchand	eeded2044a	log: register with standardized names Let's try to enforce the convention where most drivers use a pmd. logtype with their class reflected in it, and libraries use a lib. logtype. Introduce two new macros: - RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is used in a component. It is associated to the default name provided by the build system, - RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used, and then the passed name is appended to the default name, RTE_LOG_REGISTER is left untouched for existing external users and for components that do not comply with the convention. There is a new Meson variable log_prefix to adapt the default name for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes. Note: achieved with below commands + reverted change on net/bonding + edits on crypto/virtio, compress/mlx5, regex/mlx5 $ git grep -l RTE_LOG_REGISTER drivers/ | while read file; do pattern=${file##drivers/}; class=${pattern%%/*}; pattern=${pattern#$class/}; drv=${pattern%%/*}; case "$class" in baseband) pattern=pmd.bb.$drv;; bus) pattern=bus.$drv;; mempool) pattern=mempool.$drv;; *) pattern=pmd.$class.$drv;; esac sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file; done $ git grep -l RTE_LOG_REGISTER lib/ | while read file; do pattern=${file##lib/}; pattern=lib.${pattern%%/*}; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file; done Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-11 15:17:55 +02:00
Vladimir Medvedkin	be81f77d80	hash: fix tuple adjustment rte_thash_adjust_tuple() uses random to generate a new subtuple if fn() callback reports about collision. In some cases random changes the subtuple in a way that after complementary bits are applied the original tuple is obtained. This patch replaces random with subtuple increment. Fixes: `28ebff11c2` ("hash: add predictable RSS") Cc: vladimir.medvedkin@intel.com Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Tested-by: Stanislaw Kardach <kda@semihalf.com> Reviewed-by: Stanislaw Kardach <kda@semihalf.com>	2021-05-10 15:31:42 +02:00
David Marchand	b81bf1efe3	eal: fix leak in shared lib mode detection This is reported by our internal covscan: 1. dpdk-20.11/lib/librte_eal/common/eal_common_options.c:508: alloc_fn: Storage is returned from allocation function "dlopen". 6. dpdk-20.11/lib/librte_eal/common/eal_common_options.c:508: leaked_storage: Failing to save or free storage allocated by "dlopen("librte_eal.so.21.0", 5)" leaks it. # 506| * shared library is not already loaded i.e. it's # statically linked.) # 507| */ # 508|-> if (dlopen("librte_eal.so."ABI_VERSION, RTLD_LAZY | # RTLD_NOLOAD) != NULL && # 509| *default_solib_dir != '\0' && # 510| stat(default_solib_dir, &sb) == 0 && This leak is not an issue per se, but on the other hand, this is easy to fix and I prefer not having to waive this warning later. Fixes: `06c7871dde` ("eal: restrict default plugin path to shared lib mode") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-10 15:31:42 +02:00
David Marchand	627c5b41bb	build: fix default drivers list without Python If no enable_drivers option is passed, the default is to build the drivers list by calling list-dir-globs.py. But if no Python interpreter is installed, no error is reported and all drivers end up being disabled. Example on a minimal FreeBSD VM: dpdk@freebsd:~/dpdk $ meson setup build ... drivers: common/cpt: not in enabled drivers build config common/dpaax: not in enabled drivers build config common/iavf: not in enabled drivers build config common/mvep: not in enabled drivers build config common/octeontx: not in enabled drivers build config common/octeontx2: not in enabled drivers build config bus/dpaa: not in enabled drivers build config bus/fslmc: not in enabled drivers build config ... dpdk@freebsd:~/dpdk $ cd drivers/ dpdk@freebsd:~/dpdk/drivers $ ~/dpdk/buildtools/list-dir-globs.py */* env: python3: No such file or directory Rely on meson internal interpreter. Check return code when calling this script. Fixes: `ab9407c3ad` ("build: allow using wildcards to disable drivers") Fixes: `2e33309ebe` ("config: enable/disable drivers in Arm builds") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-07 15:41:45 +02:00
Chengwen Feng	97ca1e786b	eal: fix service core list parsing This patch adds checking for service core index validity when parsing service corelist. Fixes: `7dbd7a6413` ("service: add -S corelist option") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2021-05-05 23:19:23 +02:00
Chengwen Feng	76b49dcbd2	ipc: check malloc sync reply result This patch adds checking for mp reply result in handle_sync(). Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>	2021-05-05 23:16:07 +02:00
David Marchand	2223b6cee9	lib: restore developer mode checks Most of the checks on developer_mode have been accidentally dropped. Restore them. Fixes: `7d611e35b0` ("lib: simplify main build file") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-05 22:05:21 +02:00
David Marchand	ca7036b4af	vhost: fix offload flags in Rx path The vhost library currently configures Tx offloading (PKT_TX_*) on any packet received from a guest virtio device which asks for some offloading. This is problematic, as Tx offloading is something that the application must ask for: the application needs to configure devices to support every used offloads (ip, tcp checksumming, tso..), and the various l2/l3/l4 lengths must be set following any processing that happened in the application itself. On the other hand, the received packets are not marked wrt current packet l3/l4 checksumming info. Copy virtio rx processing to fix those offload flags with some differences: - accept VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP, - ignore anything but the VIRTIO_NET_HDR_F_NEEDS_CSUM flag (to comply with the virtio spec), Some applications might rely on the current behavior, so it is left untouched by default. A new RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS flag is added to enable the new behavior. The vhost example has been updated for the new behavior: TSO is applied to any packet marked LRO. Fixes: `859b480d5a` ("vhost: add guest offload setting") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-05-04 10:22:17 +02:00
Cheng Jiang	eb36520444	vhost: add batch datapath for async packed ring Add batch datapath for async vhost packed ring to improve the performance of small packet processing. Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-05-04 10:21:59 +02:00
Cheng Jiang	873e8dad6f	vhost: support packed ring in async datapath For now async vhost data path only supports split ring. This patch enables packed ring in async vhost data path to make async vhost compatible with virtio 1.1 spec. Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-05-04 10:21:52 +02:00
Cheng Jiang	3d6cb86b0d	vhost: refactor async split ring functions This patch moves some code of async vhost split ring into inline functions to improve the readability. Also, it changes the pointer index style of iterator to make the code more concise. Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>	2021-05-04 10:21:46 +02:00
Ciara Power	048960272e	telemetry: fix race on callbacks list The list_commands() function accessed the callbacks list, but did not take the lock. This may have caused inconsistencies if callbacks were being registered at the same time. This is now fixed to lock before iterating the list, and unlock afterwards. Fixes: `f38748736e` ("telemetry: add default callback commands") Cc: stable@dpdk.org Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-05 18:21:26 +02:00
Jerin Jacob	dbab511874	telemetry: hide internal define Remove TELEMETRY_MAX_CALLBACKS symbol from the public rte_telemetry.h header file. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-05-05 18:21:26 +02:00
Hemant Agrawal	a956adb281	ethdev: add missing buses in device iterator This patch fixes issue with OVS 2.15 not working on DPAA/FSLMC based platform due to missing support for these busses in dev_iterate. This patch adds dpaa_bus and fslmc to dev iterator for bus arguments. Fixes: `214ed1acd1` ("ethdev: add iterator to match devargs input") Cc: stable@dpdk.org Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-05-04 18:33:09 +02:00
Gregory Etelson	1d0b9c7d94	ethdev: fix integrity flow item Add integrity item definition to the rte_flow_desc_item array. The new entry allows to build RTE flow item from a data stored in rte_flow_item_integrity type. Fixes: `b10a421a1f` ("ethdev: add packet integrity check flow rules") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-05-04 17:37:22 +02:00
Balazs Nemeth	a287ac2891	vhost: allocate and free packets in bulk in Tx packed Move allocation out further and perform all allocation in bulk. The same goes for freeing packets. In the process, also introduce virtio_dev_pktmbuf_prep and make virtio_dev_pktmbuf_alloc use that. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-28 04:53:30 +02:00
Balazs Nemeth	56fa279124	vhost: remove remaining packets count The remained variable stores the same information as the difference between count and pkt_idx. Remove the remained variable to simplify. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-28 04:49:09 +02:00
Balazs Nemeth	d18db8049c	vhost: read last used index once Instead of calculating the address of a packed descriptor based on the vq->desc_packed and vq->last_used_idx every time, store that base address in desc_base. On arm, this saves 176 bytes in code size of function in which vhost_flush_enqueue_batch_packed gets inlined. Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-28 04:21:37 +02:00
Jiayu Hu	c94c9f3152	vhost: fix redundant vring status change notification When VHOST_USER_F_PROTOCOL_FEATURES is not negotiated, there is no need for vhost_user_set_vring_kick() to notify the application of vring enabled, as vhost_user_msg_handler() also notifies the application. This patch is to remove unnecessary vring_state_changed() call. Fixes: `d0fcc38f5f` ("vhost: improve device readiness notifications") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-28 03:52:44 +02:00
Jiayu Hu	0090535f7a	vhost: remove unnecessary free This patch removes unnecessary rte_free() for async_pkts_info and async_descs_split. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Yinan Wang <yinan.wang@intel.com>	2021-04-28 03:24:04 +02:00
Jiayu Hu	678a91efa2	vhost: fix queue initialization This patch allocates vhost queue by rte_zmalloc() to avoid undefined values. Fixes: `a277c71598` ("vhost: refactor code structure") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Yinan Wang <yinan.wang@intel.com>	2021-04-28 03:24:04 +02:00
Anatoly Burakov	7f15f0fbed	power: save original ACPI governor always Currently, when we set the acpi governor to "userspace", we check if it is already set to this value, and if it is, we skip setting it. However, we never save this value anywhere, so that next time we come back and request the governor to be set to its original value, the original value is empty. Fix it by saving the original pstate governor first. While we're at it, replace `strlcpy` with `rte_strscpy`. Fixes: `445c6528b5` ("power: common interface for guest and host") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2021-05-05 12:29:12 +02:00
Hongbo Zheng	cdcee2ec9b	bpf: fix JSLT validation In function 'eval_jcc', judgment 'op == EBPF_JLT' occurs twice, as a result, the corresponding second statement cannot be accessed. This patch fix this problem. Fixes: `8021917293` ("bpf: add extra validation for input BPF program") Cc: stable@dpdk.org Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-05-05 12:22:30 +02:00
Konstantin Ananyev	8e2dd74f0a	acl: fix build with GCC 11 gcc 11 with '-O2' complains about some variables being used without being initialized: In function ‘start_flow_avx512x8’, inlined from ‘search_trie_avx512x8.constprop’ at acl_run_avx512_common.h:317: lib/librte_acl/acl_run_avx512_common.h:210:13: warning: ‘pdata’ is used uninitialized [-Wuninitialized] In function ‘search_trie_avx512x8.constprop’: lib/librte_acl/acl_run_avx512_common.h:314:32: note: ‘pdata’ declared here ... Indeed, these variables are not explicitly initialized, but this is done intentionally. We rely on constant mask value that we pass to start_flow*() functions as a parameter to mask out uninitialized values. Note that '-O3' doesn't produce this warning. Anyway, to support clean build with gcc-11 this patch adds explicit initialization for these variables. I checked the output binary: with '-O3' both clang and gcc 10/11 generate no extra code for it. Also performance test didn't reveal any regressions. Bugzilla ID: 673 Fixes: `b64c2295f7` ("acl: add 256-bit AVX512 classify method") Fixes: `45da22e42e` ("acl: add 512-bit AVX512 classify method") Cc: stable@dpdk.org Reported-by: Ali Alnubani <alialnu@nvidia.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-05-05 12:10:15 +02:00
Chengwen Feng	f6681ab76b	eventdev: fix memory leakage on thread creation failure This patch fixes the issue that epoll_events memory is not released after the intr thread created fail. Fixes: `3810ae4357` ("eventdev: add interrupt driven queues to Rx adapter") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-05-03 11:46:26 +02:00
Chengwen Feng	0bac9fc791	eventdev: remove redundant thread name setting The thread name already set by rte_ctrl_thread_create() API, so remove the call of rte_thread_setname() API. Fixes: `3810ae4357` ("eventdev: add interrupt driven queues to Rx adapter") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-05-03 11:46:26 +02:00
Olivier Matz	bd6113858f	mbuf: clarify usage of packet pool initializers Clarify that the mempool private initializer and object initializer used for packet pools require that the mempool private size is large enough. Also add an assert (only enabled when -DRTE_ENABLE_ASSERT is passed) to check this constraint. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Aaron Conole <aconole@redhat.com>	2021-05-04 22:41:32 +02:00
Chengwen Feng	d4902ed31c	mbuf: check shared memory before dumping dynamic space Because mbuf dyn shared memory was allocated runtime, so it's necessary to check validity when dump mbuf dyn info. Also this patch adds an error logging when init shared memory fail. Fixes: `4958ca3a44` ("mbuf: support dynamic fields and flags") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-04 19:49:44 +02:00
Tal Shnaiderman	2231388f21	eal/windows: fix MinGW build the strncasecmp macro defined in rte_os_shim.h is already defined in MinGW-w64, as a result the compiler prints out the warning below on function redefinition whenever compiling a file including the header in debug mode. lib/eal/windows/include/rte_os_shim.h:21: warning: "strncasecmp" redefined Fixed by defining the macro only to the clang compiler. Fixes: `45d62067c2` ("eal: make OS shims internal") Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-05-04 19:32:28 +02:00
Juraj Linkeš	20c7744f8d	eal/arm64: fix platform register bit REG_PLATFORM only uses bit 0 to indicate whether the value retrieved from hardware matches PLATFORM_STR. Fixes: `97523f822b` ("eal/arm: add CPU flags for ARMv8") Cc: stable@dpdk.org Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>	2021-05-04 18:55:09 +02:00
Bruce Richardson	7d5cfaa750	build: fix formatting of Meson lists Running "./devtools/check-meson.py --fix" on the DPDK repo fixes a number of issues with whitespace and formatting of files: * indentation of lists * missing trailing commas on final list element * multiple list entries per line when list is not all single-line Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-05-04 15:01:47 +02:00
Joyce Kong	cee151b41b	mempool: distinguish cache and pool debug counters If cache is enabled, objects will be retrieved/put from/to cache, subsequently from/to the common pool. Now the debug stats calculate the objects retrieved/put from/to cache and pool together, it is better to distinguish them. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-04 09:44:55 +02:00
Dharmik Thakkar	5648704065	mempool: make stats macro generic Make __MEMPOOL_STAT_ADD macro more generic and delete __MEMPOOL_CONTIG_BLOCKS_STAT_ADD macro. Suggested-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-04 09:34:28 +02:00
Stanislaw Kardach	1abb185d6c	stack: allow lock-free only on relevant architectures Since commit `7911ba0473` ("stack: enable lock-free implementation for aarch64"), lock-free stack is supported on arm64 but this description was missing from the doxygen for the flag. Currently it is impossible to detect programmatically whether lock-free implementation of rte_stack is supported. One could check whether the header guard for lock-free stubs is defined (_RTE_STACK_LF_STUBS_H_) but that's an unstable implementation detail. Because of that currently all lock-free ring creations silently succeed (as long as the stack header is 16B long) which later leads to push and pop operations being NOPs. The observable effect is that stack_lf_autotest fails on platforms not supporting the lock-free. Instead it should just skip the lock-free test altogether. This commit adds a new errno value (ENOTSUP) that may be returned by rte_stack_create() to indicate that a given combination of flags is not supported on a current platform. This is detected by checking a compile-time flag in the include logic in rte_stack_lf.h which may be used by applications to check the lock-free support at compile time. Use the added RTE_STACK_LF_SUPPORTED flag to disable the lock-free stack tests at the compile time. Perf test doesn't fail because rte_ring_create() succeeds, however marking this test as skipped gives a better indication of what actually was tested. Fixes: `7911ba0473` ("stack: enable lock-free implementation for aarch64") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-05-03 18:46:15 +02:00
David Marchand	7ddb625c3d	mbuf: mark offload flag as deprecated PKT_RX_EIP_CKSUM_BAD has been declared deprecated but there was no warning to applications still using it. Fix this by marking as deprecated with the newly introduced RTE_DEPRECATED. Fixes: `e8a419d6de` ("mbuf: rename outer IP checksum macro") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Lance Richardson <lance.richardson@broadcom.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-05-03 16:12:49 +02:00
Dmitry Kozlyuk	1ee899977d	eal: add timespec_get shim C11 timespec_get() is not provided on some platforms: * MinGW-w64 does not currently implement it [1]. * FreeBSD 11 with Clang 10.0.0 does not provide it. Add internal shims to Windows and FreeBSD EALs. For Windows, it can be removed after [1] is fixed. [1]: https://sourceforge.net/p/mingw-w64/mailman/message/37224689/ Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Jie Zhou <jizh@linux.microsoft.com> Acked-by: Nick Connolly <nick.connolly@mayadata.io>	2021-04-21 23:32:13 +02:00
Min Hu (Connor)	53ef1b3477	ethdev: add sanity checks in control APIs This patch adds more sanity checks in control path APIs. Fixes: `214ed1acd1` ("ethdev: add iterator to match devargs input") Fixes: `3d98f921fb` ("ethdev: unify prefix for static functions and variables") Fixes: `0366137722` ("ethdev: check for invalid device name") Fixes: `d948f596fe` ("ethdev: fix port data mismatched in multiple process model") Fixes: `5b7ba31148` ("ethdev: add port ownership") Fixes: `f8244c6399` ("ethdev: increase port id range") Cc: stable@dpdk.org Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-21 18:36:56 +02:00
Li Zhang	5f0d54f372	ethdev: add pre-defined meter policy API Currently, the flow meter policy does not support multiple actions per color; also the allowed action types per color are very limited. In addition, the policy cannot be pre-defined. Due to the growth in flow action offload abilities, there is a potential for the user to use a variety of actions per color. This new meter policy API allows this in the most common ethdev way, using the rte_flow action definition. A list of rte_flow actions will be provided by the user per color in order to create a meter policy. In addition, the API forces the policy to be pre-defined before meter creation in order to allow efficient sharing of a single policy among multiple meters. meter_policy_id is added into struct rte_mtr_params so that the policy can be retrieved during meter creation. Allow coloring the packet using a new rte_flow_action_color, as could be done by the old policy API. Add two common policy templates as macros in the header file. The following API functions were added: - rte_mtr_meter_policy_add - rte_mtr_meter_policy_delete - rte_mtr_meter_policy_update - rte_mtr_meter_policy_validate The following structs were changed: - rte_mtr_params - rte_mtr_capabilities The following API was deleted: - rte_mtr_policer_actions_update To support this API the following apps were changed: app/test-flow-perf: clean meter policer app/testpmd: clean meter policer To support this API the following drivers were changed: net/softnic: support meter policy API 1. Cleans meter rte_mtr_policer_action. 2. Supports policy API to get color action as policer action did. The color action will be mapped into rte_table_action_policer. net/mlx5: clean meter creation management Cleans and breaks part of the current meter management in order to allow better design with policy API. 
Signed-off-by: Li Zhang <lizh@nvidia.com> Signed-off-by: Haifei Luo <haifeil@nvidia.com> Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-04-21 12:22:17 +02:00
Bing Zhao	9847fd125d	ethdev: introduce conntrack flow action and item This commit introduces the conntrack action and item. Usually the HW offloading is stateless. For some stateful offloading like a TCP connection, HW module will help provide the ability of a full offloading w/o SW participation after the connection was established. The basic usage is that in the first flow rule the application should add the conntrack action and jump to the next flow table. In the following flow rule(s) of the next table, the application should use the conntrack item to match on the result. A TCP connection has two directions traffic. To set a conntrack action context correctly, the information of packets from both directions are required. The conntrack action should be created on one ethdev port and supply the peer ethdev port as a parameter to the action. After context created, it could only be used between these two ethdev ports (dual-port mode) or a single port. The application should modify the action via the API "rte_action_handle_update" only when before using it to create a flow rule with conntrack for the opposite direction. This will help the driver to recognize the direction of the flow to be created, especially in the single-port mode, in which case the traffic from both directions will go through the same ethdev port if the application works as an "forwarding engine" but not an end point. There is no need to call the update interface if the subsequent flow rules have nothing to be changed. Query will be supported via "rte_action_handle_query" interface, about the current packets information and connection status. The fields query capabilities depends on the HW. For the packets received during the conntrack setup, it is suggested to re-inject the packets in order to make sure the conntrack module works correctly without missing any packet. 
Only the valid packets should pass the conntrack, packets with invalid TCP information, like out of window, or with invalid header, like malformed, should not pass. Naming and definition: https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/ netfilter/nf_conntrack_tcp.h https://elixir.bootlin.com/linux/latest/source/net/netfilter/ nf_conntrack_proto_tcp.c Other reference: https://www.usenix.org/legacy/events/sec01/invitedtalks/rooij.pdf Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-20 01:24:57 +02:00
Ori Kam	b10a421a1f	ethdev: add packet integrity check flow rules Currently, DPDK application can offload the checksum check, and report it in the mbuf. However, as more and more applications are offloading some or all logic and action to the HW, there is a need to check the packet integrity so the right decision can be taken. The application logic can be positive meaning if the packet is valid jump / do actions, or negative if packet is not valid jump to SW / do actions (like drop) and add default flow (match all in low priority) that will direct the miss packet to the miss path. Since currently rte_flow works in positive way the assumption is that the positive way will be the common way in this case also. When thinking what is the best API to implement such feature, we need to consider the following (in no specific order): 1. API breakage. 2. Simplicity. 3. Performance. 4. HW capabilities. 5. rte_flow limitation. 6. Flexibility. First option: Add integrity flags to each of the items. For example add checksum_ok to IPv4 item. Pros: 1. No new rte_flow item. 2. Simple in the way that on each item the app can see what checks are available. Cons: 1. API breakage. 2. Increase number of flows, since app can't add global rule and must have dedicated flow for each of the flow combinations, for example matching on ICMP traffic or UDP/TCP traffic with IPv4 / IPv6 will result in 5 flows. Second option: dedicated item Pros: 1. No API breakage, and there will be no for some time due to having extra space. (by using bits) 2. Just one flow to support the ICMP or UDP/TCP traffic with IPv4 / IPv6. 3. Simplicity application can just look at one place to see all possible checks. 4. Allow future support for more tests. Cons: 1. New item, that holds number of fields from different items. For starter the following bits are suggested: 1. packet_ok - means that all HW checks depending on packet layer have passed. 
This may mean that in some HW such flow should be split to number of flows or fail. 2. l2_ok - all check for layer 2 have passed. 3. l3_ok - all check for layer 3 have passed. If packet doesn't have L3 layer this check should fail. 4. l4_ok - all check for layer 4 have passed. If packet doesn't have L4 layer this check should fail. 5. l2_crc_ok - the layer 2 CRC is O.K. 6. ipv4_csum_ok - IPv4 checksum is O.K. It is possible that the IPv4 checksum will be O.K. but the l3_ok will be 0. It is not possible that checksum will be 0 and the l3_ok will be 1. 7. l4_csum_ok - layer 4 checksum is O.K. 8. l3_len_OK - check that the reported layer 3 length is smaller than the frame length. Example of usage: 1. Check packets from all possible layers for integrity. flow create integrity spec packet_ok = 1 mask packet_ok = 1 ..... 2. Check only packet with layer 4 (UDP / TCP) flow create integrity spec l3_ok = 1, l4_ok = 1 mask l3_ok = 1 l4_ok = 1 Signed-off-by: Ori Kam <orika@nvidia.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-19 19:05:17 +02:00
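The spec/mask usage in the two examples above can be sketched in plain C. This is a minimal illustration only: the flag names and the `integrity_match` helper are hypothetical and are not the rte_flow definitions; the bit list simply mirrors the eight bits suggested in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values mirroring the bits listed in the commit message. */
enum {
    INTEG_PACKET_OK    = 1u << 0,
    INTEG_L2_OK        = 1u << 1,
    INTEG_L3_OK        = 1u << 2,
    INTEG_L4_OK        = 1u << 3,
    INTEG_L2_CRC_OK    = 1u << 4,
    INTEG_IPV4_CSUM_OK = 1u << 5,
    INTEG_L4_CSUM_OK   = 1u << 6,
    INTEG_L3_LEN_OK    = 1u << 7,
};

/*
 * A packet matches when every bit selected by the mask has the value
 * requested in the spec; bits outside the mask are ignored.
 */
static bool integrity_match(uint32_t pkt_bits, uint32_t spec, uint32_t mask)
{
    return (pkt_bits & mask) == (spec & mask);
}
```

With spec and mask both set to `INTEG_L3_OK | INTEG_L4_OK`, a packet that reports only `INTEG_L3_OK` does not match, which corresponds to the second usage example (only packets with a valid layer 4).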
Bing Zhao	4b61b8774b	ethdev: introduce indirect flow action Right now, rte_flow_shared_action_* APIs are used for some shared actions, like RSS, count. The shared action should be created before using it inside a flow. These shared actions sometimes are not really shared but just some indirect actions decoupled from a flow. The new functions rte_flow_action_handle_* are added to replace the current shared functions rte_flow_shared_action_*. There are two types of flow actions: 1. the direct (normal) actions that could be created and stored within a flow rule. Such action is tied to its flow rule and cannot be reused. 2. the indirect action, in the past named shared_action. It is created from a direct action, like count or rss, and then used in the flow rules with an object handle. The PMD will take care of resolving the indirect action to the direct action when it is referenced. The indirect action is accessed (update / query) w/o any flow rule, just via the action object handle. For example, when querying or resetting a counter, it could be done out of any flow using this counter, but only the handle of the counter action object is required. The indirect action object could be shared by different flows or used by a single flow, depending on the direct action type and the real-life requirements. The handle of an indirect action object is opaque and defined in each driver and possibly different per direct action type. The old name "shared" is improper in a sense and should be replaced. Since the APIs are changed from "rte_flow_shared_action" to the new "rte_flow_action_handle", the testpmd application code and command line interfaces also need to be updated accordingly. The testpmd application user guide is also updated. All the "shared action" related parts are replaced with "indirect action" to have a correct explanation. The parameter of "update" interface is also changed. 
A general pointer will replace the rte_flow_action struct pointer due to the facts: 1. Some action may not support fields updating. In the example of a counter, the only "update" supported should be the reset. So passing a rte_flow_action struct pointer is meaningless and there is even no such corresponding action struct. What's more, if more than one operations should be supported, for some other action, such pointer parameter may not meet the need. 2. Some action may need conditional or partial update, the current parameter will not provide the ability to indicate which part(s) to update. For different types of indirect action objects, the pointer could either be the same of rte_flow_action struct - in order not to break the current driver implementation, or some wrapper structures with bits as masks to indicate which part to be updated, depending on real needs of the corresponding direct action. For different direct actions, the structures of indirect action objects updating will be different. All the underlayer PMD callbacks will be moved to these new APIs. The RTE_FLOW_ACTION_TYPE_SHARED is kept for now in order not to break the ABI. All the implementations are changed by using RTE_FLOW_ACTION_TYPE_INDIRECT. Since the APIs are changed from "rte_flow_shared_action" to the new "rte_flow_action_handle" and the "update" interface's 3rd input parameter is changed to generic pointer, the mlx5 PMD that uses these APIs needs to do the adaption to the new APIs as well. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Andrey Vesnovaty <andreyv@nvidia.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-19 18:25:42 +02:00
Lijun Ou	9ad9ff476c	ethdev: add queue state in queried queue information Currently, an upper-layer application can get the queue state only through pointers such as dev->data->tx_queue_state[queue_id], which is not the recommended way to access it. So this patch adds the queue state to the information returned by the rte_eth_rx_queue_info_get and rte_eth_tx_queue_info_get APIs. Note: After adding the queue_state field, the 'struct rte_eth_rxq_info' size remains 128B, and the 'struct rte_eth_txq_info' size remains 64B, so it could be ABI compatible. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-19 18:25:35 +02:00
Thomas Monjalon	027c931be8	telemetry: fix build on FreeBSD < 12.2 The function pthread_setname_np() was originally not available on FreeBSD. It has been added in FreeBSD 12.2: https://svnweb.freebsd.org/base?view=revision&revision=362264 The EAL implementation of rte_thread_setname() is duplicated in the telemetry library, which does not depend on EAL, so the compilation is safe in all systems. Fixes: `5da7736f8c` ("telemetry: set socket listener thread name") Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-21 20:07:59 +02:00
Savinay Dharmappa	3a91d2d138	sched: fix traffic class oversubscription parameter This patch fixes the traffic class oversubscription watermark value by initialising it with computed value of maximum watermark. Fixes: `ac6fcb841b` ("sched: update subport rate dynamically") Cc: stable@dpdk.org Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com> Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>	2021-04-21 16:57:18 +02:00
Pu Xu	1edf7a796d	ip_frag: fix fragmenting IPv4 packet with header option When fragmenting IPv4 packet, the data offset should be calculated through the IHL field in IP header rather than using sizeof(struct rte_ipv4_hdr). Fixes: `4c38e5532a` ("ip_frag: refactor IPv4 fragmentation into a proper library") Cc: stable@dpdk.org Signed-off-by: Pu Xu <583493798@qq.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-04-21 16:50:46 +02:00
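The point of the fix above can be shown with a small self-contained sketch. The struct and function names here are hypothetical stand-ins rather than the DPDK definitions; the only fact relied on is the IPv4 header layout, where the low 4 bits of the first byte (IHL) give the header length in 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal view of an IPv4 header: only the first byte is needed. */
struct ipv4_hdr_view {
    uint8_t version_ihl; /* high nibble: version, low nibble: IHL in 32-bit words */
};

/*
 * Header length in bytes derived from IHL. Unlike a fixed sizeof() of the
 * option-less header (20 bytes), this accounts for IPv4 options.
 */
static size_t ipv4_hdr_len(const struct ipv4_hdr_view *hdr)
{
    assert(hdr != NULL);
    return (size_t)(hdr->version_ihl & 0x0f) * 4;
}
```

For a first byte of 0x45 this yields 20 bytes; for 0x46 (one 32-bit word of options) it yields 24, so the payload offset used for fragmentation moves accordingly.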
Chengwen Feng	c53a5f3efb	telemetry: check thread creations Add result check and message print out for thread creation after failure. Fixes: `b80fe1805e` ("telemetry: introduce backward compatibility") Fixes: `6dd571fd07` ("telemetry: introduce new functionality") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-04-21 16:23:50 +02:00
Chengwen Feng	5da7736f8c	telemetry: set socket listener thread name This patch sets the names of the init threads, which is helpful for debugging. Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 15:57:47 +02:00
Bruce Richardson	0bf5832222	lib: allow disabling optional libraries Add support for the disable_libs option, to allow disabling the build of particular libraries. As part of this, maintain a list of what libraries can safely be disabled, without breaking the build - for now this list is solely those libraries which are not built on FreeBSD, kni, power and vhost. This list can be expanded by future patches. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 14:17:29 +02:00
Bruce Richardson	99a2dd955f	lib: remove librte_ prefix from directory names There is no reason for the DPDK libraries to all have 'librte_' prefix on the directory names. This prefix makes the directory names longer and also makes it awkward to add features referring to individual libraries in the build - should the lib names be specified with or without the prefix. Therefore, we can just remove the library prefix and use the library's unique name as the directory name, i.e. 'eal' rather than 'librte_eal' Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 14:04:09 +02:00
Bruce Richardson	6fc406593a	lib: clean up build files Switch from using tabs to 4 spaces for meson.build indentation. Perform other formatting cleanups such as ensure that long lists of files are one per line, and terminating with a final comma before the closing brace to make addition/removals easier. In some cases, reorder lists of items where they were not in alphabetical order. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 12:37:55 +02:00
Bruce Richardson	9cc02b1794	lib: tidy up build list With the lib/meson.build file changed from C-style indentation to python-style indentation, we need to correct the indentation of the lists of libraries, since these libs were not modified in the previous patches. For ease of management of the list and working with patches for adding to the list, put each library on its own line. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 12:37:55 +02:00
Bruce Richardson	7d611e35b0	lib: simplify main build file Two simplifications can be made to the build file which reduce indentation levels and make it easier to read: 1. When meson build support was first added, the compat library existed in DPDK as a single header file. Since that header has been merged into EAL, we no longer need to support header-only libraries, so can shorten the code. 2. From meson 0.49 onwards we have the "continue" keyword available to break out of one loop iteration and begin the next. This allows us to remove blocks in the build configuration file which were conditional on the "build" variable being true. Instead we can use "continue" to abort processing at the point where the "build" value becomes false. Since this patch changes the indentation level of large parts of the meson.build file, we use the opportunity to adjust the whitespace used to the meson-standard 4-space indentation level. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-21 12:37:55 +02:00
Ali Alnubani	fe4b8c7bcd	pipeline: fix build with GCC 4.8.5 Compilation on CentOS 7 with gcc version 4.8.5 fails with the following errors: error: 'src_struct_id' may be used uninitialized in this function [-Werror=maybe-uninitialized] error: 'dst_struct_id' may be used uninitialized in this function [-Werror=maybe-uninitialized] This patch fixes the build errors by initializing both variables. Bugzilla ID: 683 Fixes: `783768136f` ("pipeline: auto-detect endianness of action arguments") Signed-off-by: Ali Alnubani <alialnu@nvidia.com>	2021-04-21 12:36:17 +02:00
Elad Nachman	6b1f8e4f9b	kni: support async user request Adding async userspace requests which don't wait for the userspace response and always return success. This is preparation to address a regression in KNI. Signed-off-by: Elad Nachman <eladv6@gmail.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-21 01:05:15 +02:00
Dmitry Kozlyuk	ff4cf5265c	mem: fix cleanup after incomplete initialization In case of EAL initialization failure rte_eal_memory_detach() may be called before mapping memory configuration, which in this case points to the static structure. Attempt to unmap it yields error: EAL: Could not unmap shared memory config: Invalid argument Skip unmapping memory configuration if it's not yet shared. Fixes: `dfbc61a2f9` ("mem: detach memsegs on cleanup") Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-20 23:33:03 +02:00
Vladimir Medvedkin	28ebff11c2	hash: add predictable RSS This patch adds predictable RSS API. It is based on the idea of searching partial Toeplitz hash collisions. Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2021-04-20 23:13:23 +02:00
Cristian Dumitrescu	783768136f	pipeline: auto-detect endianness of action arguments Each table entry is made up of match fields and action data, with the latter made up of the action ID and the action arguments. The approach of having the user explicitly specify the endianness of the action arguments is difficult for P4 compilers to pick up, as the P4 compiler is generally unaware of this aspect. This commit introduces the auto-detection of the endianness of the action arguments by examining the endianness of their destination: network byte order (NBO) when they get copied to headers and host byte order (HBO) when they get copied to packet meta-data or mailboxes. The endianness specification of each action argument as part of the rule specification, e.g. H(...) and N(...), is removed from the rule file and auto-detected based on their destination. The DMA instruction scope is made internal, so mov instructions need to be used. The pattern of transferring complete headers from table entry action args to headers is detected, and the associated set of mov instructions plus header validate is internally replaced with the internal-only DMA instruction to preserve performance. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 21:55:43 +02:00
Cristian Dumitrescu	0f5df4ea5d	pipeline: modularize SWX instruction optimizer Decouple the different instruction optimizers. Allow each optimization to run as a separate iteration on the entire instruction stream. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 21:55:34 +02:00
Cristian Dumitrescu	48ad58964c	pipeline: fix endianness conversions The SWX pipeline instructions work with operands of different types: header fields (h.header.field), packet meta-data (m.field), extern object mailbox field (e.obj.field), extern function (f.field), action data read from table entries (t.field), or immediate values; hence the HMEFTI acronym. The H operands are stored in network byte order (NBO), while the MEFT operands are stored in host byte order (HBO), hence the need to operate endianness conversions. Some of the endianness conversion macros were not working correctly for some cases such as operands of different sizes, and they are fixed now. Affected instructions: mov, and, or, xor, jmpeq, jmpneq. Fixes: `7210349d5b` ("pipeline: add SWX move instruction") Fixes: `650195cf96` ("pipeline: introduce SWX and instruction") Fixes: `8f796198dc` ("pipeline: introduce SWX or instruction") Fixes: `b4e607f9fd` ("pipeline: introduce SWX XOR instruction") Fixes: `b3947e25be` ("pipeline: introduce SWX jump and return instructions") Cc: stable@dpdk.org Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 02:34:16 +02:00
Cristian Dumitrescu	e8e22eb0dd	pipeline: adjust error code for internal function Adjusting the error code for the internal function instruction_config to match the rest of the code which is returning a negative value on error. Cosmetic change. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 02:33:12 +02:00
Cristian Dumitrescu	6b840b7c53	pipeline: validate header on SWX emit Enhance the behavior of the emit instruction to ignore invalid headers, as mandated by the P4 language specification. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 02:31:21 +02:00
Cristian Dumitrescu	742b0a57f5	pipeline: add table statistics to SWX Add support for table statistics for the SWX pipeline. For each table, we maintain a counter for lookup hit packets, one for lookup miss packets and one packet counter for each table action. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>	2021-04-20 02:27:56 +02:00
Cristian Dumitrescu	fe16d678e7	pipeline: add drop instruction to SWX Enabled the TX instruction to accept an immediate value for the output port argument. The drop instruction is simply an alias to the TX instruction for the last output port of the pipeline. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 02:24:36 +02:00
Cristian Dumitrescu	ea5ab65f57	pipeline: relax table match field requirements The match fields for a given table have to be part of the same header or the metadata structure. This commit removes the requirement that the list of match fields must observe the order of fields within their structure. For example, the h.ipv4.dst_addr field can now be listed before the h.ipv4.src_addr field in a table match field list, even though within the IPv4 header the dst_addr field is present after the src_addr field. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-20 02:24:36 +02:00
Churchill Khangar	1af0e07b27	table: relax requirements for entry action data Currently, the table entry action data is required to be NULL when the action data size is zero. We now require that action data is ignored when the action data size is zero. This is to allow for a table entry instance to be allocated once with max action data size for the table and reused repeatedly for actions of different sizes, including zero. Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>	2021-04-19 20:21:18 +02:00
Cristian Dumitrescu	97005a6665	table: fix out of bounds write Fix out of bounds write. The allocated string size was incorrect. Coverity issue: 369670 Fixes: `66440b7b22` ("table: add wildcard match table type") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-19 19:49:09 +02:00
Cristian Dumitrescu	ae650ff9ba	port: fix allocation check in ring SWX Fix logically dead code in ring port. Coverity issue: 369664 Fixes: `77a413017c` ("port: add ring SWX port") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2021-04-19 19:33:46 +02:00
Xueming Li	e4b5b2ba9c	devargs: fix list entry update When inserting a devargs that is already in the list, the existing one was reset and completely replaced by the new one, and the list entry info was lost during the copy. This patch backs up the entry info before the copy. Fixes: `64051bb1f1` ("devargs: unify scratch buffer storage") Reported-by: Jim Harris <james.r.harris@intel.com> Signed-off-by: Xueming Li <xuemingl@nvidia.com>	2021-04-19 18:13:02 +02:00
Yunjian Wang	22677b0eef	vfio: fix duplicated user mem map Currently, new user mem maps are checked if they are adjacent to an existing mem map and if so, the mem map entries are merged. It didn't check for duplicate mem maps, so if the API is called with the same mem map multiple times, they will occupy multiple mem map entries. This will reduce the amount of entries available for unique mem maps. So check for duplicate mem maps and merge them into one mem map entry if any found. Fixes: `0cbce3a167` ("vfio: skip DMA map failure if already mapped") Cc: stable@dpdk.org Suggested-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-19 11:57:46 +02:00
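The duplicate check described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the EAL VFIO code: the `mem_map` and `map_table` types, the `MAX_MAPS` capacity, and the `map_add` helper are all hypothetical, and only exact duplicates and directly adjacent ranges are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MAPS 8 /* hypothetical table capacity */

struct mem_map { uint64_t addr, iova, len; };
struct map_table { struct mem_map maps[MAX_MAPS]; size_t n; };

/* Same virtual address, IOVA and length: the map is already recorded. */
static bool is_duplicate(const struct mem_map *a, const struct mem_map *b)
{
    return a->addr == b->addr && a->iova == b->iova && a->len == b->len;
}

/* True when b starts exactly where a ends, in both address spaces. */
static bool is_adjacent_after(const struct mem_map *a, const struct mem_map *b)
{
    return a->addr + a->len == b->addr && a->iova + a->len == b->iova;
}

/* Returns 0 on success, -1 when the table is full. */
static int map_add(struct map_table *t, struct mem_map m)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->n; i++) {
        struct mem_map *cur = &t->maps[i];

        if (is_duplicate(cur, &m))
            return 0; /* do not consume another entry */
        if (is_adjacent_after(cur, &m)) {
            cur->len += m.len; /* extend the existing entry upwards */
            return 0;
        }
        if (is_adjacent_after(&m, cur)) {
            cur->addr = m.addr; /* extend the existing entry downwards */
            cur->iova = m.iova;
            cur->len += m.len;
            return 0;
        }
    }
    if (t->n == MAX_MAPS)
        return -1;
    t->maps[t->n++] = m;
    return 0;
}
```

Adding the same map twice leaves the table with a single entry, which is the behaviour the commit message asks for; without the duplicate check, each repeated call would use up one of the limited entries.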
Tyler Retzlaff	1cd3ce0953	power: replace unsigned -1 with unsigned maximum Use UINT64_MAX instead of -1ULL. Some compilers generate a warning when applying a '-' to an unsigned literal so avoid this by initializing with unsigned preprocessor definition. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-19 11:36:12 +02:00
Tyler Retzlaff	5535bbadcb	eal: replace unsigned -1 with unsigned maximums Use UINT64_MAX and UINT32_MAX instead of -1 or ~0 literal variations of different explicit widths when creating masks and sentinel values. Some compilers generate a warning when applying a '-' to an unsigned literal so avoid this by initializing with unsigned preprocessor definitions where appropriate. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Morten Brørup <mb@smartsharesystems.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-19 11:29:45 +02:00
Tyler Retzlaff	e8eb80e8bf	eal: check vsnprintf failure in devargs parsing Check for failure, while here just increment len once after checking for failure instead of duplicating len + 1 math in two different argument lists. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>	2021-04-19 11:18:07 +02:00
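The pattern described above (check the `vsnprintf` result, then add 1 to the length once) can be sketched like this. `format_alloc` is a hypothetical helper written for illustration, not the devargs parsing function itself.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated buffer. Returns NULL on a formatting
 * or allocation failure; the caller frees the result.
 */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *buf;
    int len;

    assert(fmt != NULL);

    /* First pass: measure the formatted length (excluding the NUL). */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0)
        return NULL; /* formatting failed: do not use len */

    len += 1; /* account for the terminating NUL exactly once */

    buf = malloc((size_t)len);
    if (buf == NULL)
        return NULL;

    /* Second pass: write into the buffer sized above. */
    va_start(ap, fmt);
    len = vsnprintf(buf, (size_t)len, fmt, ap);
    va_end(ap);
    if (len < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

Incrementing `len` in one place means the allocation size and the size passed to the second `vsnprintf` call cannot drift apart.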
Shijith Thotton	d69123d266	eventdev: fix case to initiate crypto adapter service Initiate software crypto adapter service, only if hardware capabilities are not reported. In OP_FORWARD mode, software service is not required to enqueue events if OP_FORWARD capability is supported by the PMD. Fixes: `7901eac340` ("eventdev: add crypto adapter implementation") Cc: stable@dpdk.org Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>	2021-04-17 19:22:41 +02:00
Akhil Goyal	f96a8ebb27	eventdev: introduce crypto adapter enqueue API In case an event from a previous stage is required to be forwarded to a crypto adapter and PMD supports internal event port in crypto adapter, exposed via capability RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD, we do not have a way to check in the API rte_event_enqueue_burst(), whether it is for crypto adapter or for eth tx adapter. Hence we need a new API similar to rte_event_eth_tx_adapter_enqueue(), which can send to a crypto adapter. Note that RTE_EVENT_TYPE_* cannot be used to make that decision, as it is meant for event source and not event destination. And event port designated for crypto adapter is designed to be used for OP_NEW mode. Hence, in order to support an event PMD which has an internal event port in crypto adapter (RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode), exposed via capability RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD, application should use rte_event_crypto_adapter_enqueue() API to enqueue events. When internal port is not available(RTE_EVENT_CRYPTO_ADAPTER_OP_NEW mode), application can use API rte_event_enqueue_burst() as it was doing earlier, i.e. retrieve event port used by crypto adapter and bind its event queues to that port and enqueue events using the API rte_event_enqueue_burst(). Signed-off-by: Akhil Goyal <gakhil@marvell.com> Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-04-17 18:49:52 +02:00
Raslan Darawsheh	7d96f5717a	ethdev: update flow item GTP QFI definition The 'qfi' field is 8 bits wide: a single bit for PPP (Paging Policy Presence), a single bit for RQI (Reflective QoS Indicator), and 6 bits for QFI (QoS Flow Identifier). This is based on 3GPP TS 38.415 (38415-g30): https://www.3gpp.org/ftp/Specs/archive/38_series/38.415/38415-g30.zip Updated the doxygen comment and the mask for 'qfi' to properly identify the full 8 bits of the field. Note: changing the default mask would cause testpmd to generate different patterns. Fixes: `346553db5b` ("ethdev: add GTP extension header to flow API") Cc: stable@dpdk.org Signed-off-by: Raslan Darawsheh <rasland@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-14 14:17:41 +02:00
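A minimal sketch of how such an 8-bit field splits into its three parts. The helper names are hypothetical, and the bit order (PPP in the most significant bit, then RQI, then QFI in the low 6 bits) is an assumption that should be checked against the referenced specification.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the octet: | PPP (1) | RQI (1) | QFI (6) | */
struct qfi_octet_fields {
    uint8_t ppp; /* Paging Policy Presence, 1 bit */
    uint8_t rqi; /* Reflective QoS Indicator, 1 bit */
    uint8_t qfi; /* QoS Flow Identifier, 6 bits */
};

static struct qfi_octet_fields qfi_octet_unpack(uint8_t octet)
{
    struct qfi_octet_fields f;

    f.ppp = (octet >> 7) & 0x1;
    f.rqi = (octet >> 6) & 0x1;
    f.qfi = octet & 0x3f;
    return f;
}

static uint8_t qfi_octet_pack(uint8_t ppp, uint8_t rqi, uint8_t qfi)
{
    assert(ppp <= 1 && rqi <= 1 && qfi <= 0x3f);
    return (uint8_t)((ppp << 7) | (rqi << 6) | qfi);
}
```

A 6-bit mask (0x3f) matches on the QFI alone, while a full 0xff mask also matches on PPP and RQI, which is why a change of the default mask changes the generated patterns.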
Haifei Luo	50c383793b	ethdev: dump single flow rule Previous implementations support dump all the flows. Add new arg rte_flow in rte_flow_dev_dump to dump one flow. Signed-off-by: Haifei Luo <haifeil@nvidia.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Ori Kam <orika@nvidia.com>	2021-04-14 13:19:55 +02:00
Li Zhang	74c8ec894d	ethdev: add packet mode in meter profile structure Currently, meter algorithms only support rates in bytes per second (BPS). Add a packet_mode flag to the meter profile parameters data structure so that traffic can be metered in packets per second. When packet_mode is 0, the profile rates and bucket sizes are specified in bytes per second and bytes, respectively; when packet_mode is not 0, they are specified in packets per second and packets, respectively. The following structures are extended: rte_mtr_meter_profile, rte_mtr_capabilities Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>	2021-04-13 18:40:58 +02:00
Gregory Etelson	f8e7132572	ethdev: fix VXLAN mask initialization In GCC compiler, __builtin_constant_p(exp) is a function. The function returns the integer 1 if the argument is known to be a compile-time constant. Therefore, __builtin_constant_p(0xffffff << 8) returned 1. As the result, rte_flow_item_vxlan_mask was initiated to {{ {flags = 0x0, rsvd0 = {0x0, 0x0, 0x0}, vni = {0x0, 0x0, 0x0}, rsvd1 = 0x1}, hdr = {vx_flags = 0x0, vx_vni = 0x1000000}}} }} GCC fails initialization rte_flow_item_vxlan_mask.hdr.vni = (0xffffff << 8) with "initializer element is not a constant expression" error. Use immediate 0xffffff00 value instead. Fixes: `43af98e687` ("ethdev: reuse VXLAN header definition in flow item") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>	2021-04-09 11:27:40 +02:00
Matan Azrad	07b0b75370	cryptodev: formalize key wrap method in API The Key Wrap approach is used by applications in order to protect keys located in untrusted storage or transmitted over untrusted communications networks. The constructions are typically built from standard primitives such as block ciphers and cryptographic hash functions. The Key Wrap method and its parameters are a secret between the keys provider and the device, meaning that the device is preconfigured for this method in a very secure way. The key wrap method may change the key length and layout. Add a description for the cipher transformation key to allow a wrapped key to be forwarded by the same API. Add a new feature flag RTE_CRYPTODEV_FF_CIPHER_WRAPPED_KEY to be enabled by PMDs that support wrapped keys in cipher transformation. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-04-16 12:43:33 +02:00
Fan Zhang	c21574edc5	cryptodev: add dequeue count parameter in raw API This patch changes the experimental raw data path dequeue burst API. Originally the API forced the user to provide a callback function to get the maximum dequeue count. This change gives the user one more option: passing the expected dequeue count directly. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-04-16 12:43:33 +02:00
Matan Azrad	d014dddb2d	cryptodev: support multiple cipher data-units In cryptography, a block cipher is a deterministic algorithm operating on fixed-length groups of bits, called blocks. A block cipher consists of two paired algorithms, one for encryption and the other for decryption. Both algorithms accept two inputs: an input block of size n bits and a key of size k bits; and both yield an n-bit output block. The decryption algorithm is defined to be the inverse function of the encryption. For AES standard the block size is 16 bytes. For AES in XTS mode, the data to be encrypted\decrypted does not have to be multiple of 16B size, the unit of data is called data-unit. The data-unit size can be any size in range [16B, 2^24B], so, in this case, a data stream is divided into N amount of equal data-units and must be encrypted\decrypted in the same data-unit resolution. For ABI compatibility reason, the size is limited to 64K (16-bit field). The new field dataunit_len is inserted in a struct padding hole, which is only 2 bytes long in 32-bit build. It could be moved and extended later during an ABI-breakage window. The current cryptodev API doesn't allow the user to select a specific data-unit length supported by the devices. In addition, there is no definition how the IV is detected per data-unit when single operation includes more than one data-unit. That causes applications to use single operation per data-unit even though all the data is continuous in memory what reduces datapath performance. Add a new feature flag to support multiple data-unit sizes, called RTE_CRYPTODEV_FF_CIPHER_MULTIPLE_DATA_UNITS. Add a new field in cipher capability, called dataunit_set, where the devices can report the range of the supported data-unit sizes. Add a new cipher transformation field, called dataunit_len, where the user can select the data-unit length for all the operations. All the new fields do not change the size of their structures, by filling some struct padding holes. 
They are added as exceptions in the ABI check file libabigail.abignore. Using a bitmap to report the supported data-unit sizes capability allows the devices to report a range simply, and the user to read it just as simply. Also, these sizes are usually common and will probably be shared among different devices. Signed-off-by: Matan Azrad <matan@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Akhil Goyal <gakhil@marvell.com>	2021-04-16 12:43:33 +02:00
Nicolas Chautru	48fc315f02	bbdev: add explicit enum for code block mode Using explicit enum instead of ambiguous integer value Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com> Reviewed-by: Tom Rix <trix@redhat.com>	2021-04-16 12:43:33 +02:00
Anatoly Burakov	2414ce9b7b	power: fix closing frequency file Currently, we open the system base frequency file, but never close it, which results in a memory leak. Coverity issue: 369693 Fixes: `8a5febaac4` ("power: fix P-state base frequency handling") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2021-04-15 23:53:39 +02:00
Anatoly Burakov	64f22b91c6	power: remove redundant close of frequency file Previous fix has addressed the incorrect handling of `base_frequency` file, but has added a use-after-free error due to the fact that all further code paths will lead to an `fclose()` call at the end, so the additional `fclose()` call right after processing the file was unnecessary. Coverity issue: 369901 Fixes: `8a5febaac4` ("power: fix P-state base frequency handling") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Liang Ma <liangma@liangbit.com> Acked-by: David Hunt <david.hunt@intel.com>	2021-04-15 23:53:33 +02:00
Tyler Retzlaff	a6f88f7eb9	eal: add C++ include guard for reciprocal header Add missing extern "C" linkage for rte_reciprocal.h consistent with other eal headers. Fixes: `ffe3ec811e` ("sched: introduce reciprocal divide") Cc: stable@dpdk.org Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: David Marchand <david.marchand@redhat.com>	2021-04-15 16:44:18 +02:00
Dmitry Kozlyuk	89813a522e	net: provide IP-related API on any OS Users of <rte_ip.h> relied on it to provide IP-related defines, like IPPROTO_* constants, but still had to include POSIX headers for inet_pton() and other standard IP-related facilities. Extend <rte_ip.h> so that it is a single header to gain access to IP-related facilities on any OS. Use it to replace POSIX includes in components enabled on Windows. Move missing constants from Windows networking shim to OS shim header and include it where needed. Remove Windows networking shim that is no longer needed. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-04-15 01:56:43 +02:00
Dmitry Kozlyuk	6c068dbd9f	net: work around s_addr macro on Windows Windows Sockets headers contain `#define s_addr S_un.S_addr`, which conflicts with definition of `s_addr` field of `struct rte_ether_hdr`. Previously `s_addr` was undefined in <rte_ether.h>, which had been breaking access to `s_addr` field of `struct in_addr`, so some DPDK and Windows headers could not be included in one file. Renaming of `struct rte_ether_hdr` is planned: https://mails.dpdk.org/archives/dev/2021-March/201444.html Temporarily disable `s_addr` macro around `struct rte_ether_hdr` definition to avoid conflict. Place source MAC address in both `s_addr` and `S_un.S_addr` fields, so that access works either directly or through the macro as defined in Windows headers. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2021-04-15 01:56:40 +02:00
Dmitry Kozlyuk	45d62067c2	eal: make OS shims internal DPDK code often relies on functions and macros that are not standard C, but are found on all platforms, even if by slightly different names. Windows <rte_os.h> provided macros or inline definitions for such symbols. However, when placed in public header, these symbols were unnecessarily exposed, breaking consumer POSIX compatibility code. Move most of the shims to <rte_os_shim.h>, a header to be used instead of <rte_os.h> by internal code. Include it in libraries and PMDs that previously imported shims from <rte_os.h>. Directly replace shims that were only used inside EAL: * index -> strchr, rindex -> strrchr * sleep -> rte_delay_us_sleep * strerror_r -> strerror_s Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-04-15 01:56:20 +02:00
Dmitry Kozlyuk	9ec521006d	eal/windows: hide asprintf shim Make asprintf(3) implementation for Windows private to EAL, so that it's hidden from external consumers. It is not exposed to internal consumers either, because they don't need asprintf() and also because callers from other modules would have no reliable way to free allocated memory. Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Khoa To <khot@microsoft.com> Acked-by: Nick Connolly <nick.connolly@mayadata.io> Acked-by: Ranjit Menon <ranjit.menon@intel.com>	2021-04-15 01:56:06 +02:00
Xueming Li	3ab385063c	kvargs: add get by key Adds a new function to get value of a specific key from kvargs list. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Gaetan Rivet <grive@u256.net>	2021-04-14 22:25:22 +02:00
Xueming Li	e132ee8690	devargs: fix memory leak on parsing failure This patch fixes memory leak in parsing error handling. Fixes: `338327d731` ("devargs: add function to parse device layers") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Gaetan Rivet <grive@u256.net>	2021-04-14 22:25:15 +02:00
Xueming Li	64051bb1f1	devargs: unify scratch buffer storage In current design, legacy parser rte_devargs_parse() saved scratch buffer to devargs.args while new parser rte_devargs_layers_parse() saved to devargs.data. Code using devargs had to know the difference and cleaned up memory accordingly - error prone. This patch unifies scratch buffer to data field, introduces rte_devargs_reset() function to wrap the memory clean up logic. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Reviewed-by: Gaetan Rivet <grive@u256.net>	2021-04-14 22:25:08 +02:00
Stephen Hemminger	9667d97c25	pflock: add phase-fair reader writer locks This is a new type of reader-writer lock that provides better fairness guarantees, which are better suited for typical DPDK applications. A pflock has two ticket pools, one for readers and one for writers. Phase-fair reader writer locks ensure that neither reader nor writer will be starved. Neither readers nor writers are preferred; they execute in alternating phases. All operations of the same type (reader or writer) that acquire the lock are handled in FIFO order. Write operations are exclusive, and multiple read operations can be run together (until a write arrives). A similar implementation is in Concurrency Kit package in FreeBSD. For more information see: "Reader-Writer Synchronization for Shared-Memory Multiprocessor Real-Time Systems", http://www.cs.unc.edu/~anderson/papers/ecrts09b.pdf Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2021-04-14 21:59:47 +02:00
Pavan Nikhilesh	206c562e4b	eventdev: fix build on RHEL 7 Since queue identifier is passed as signed integer, a compilation error is generated: rte_event_eth_rx_adapter.c:1810:57: error: signed and unsigned type in conditional expression [-Werror=sign-compare] Make queue identifier as unsigned when adding it to vector data. Bugzilla ID: 672 Fixes: `d7c428e557` ("eventdev: support Rx adapter event vector") Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-04-14 10:02:05 +02:00
Tyler Retzlaff	7498c2d74e	eal: do not redefine asm keyword in C++ C++ forbids redefining a keyword as a macro. The keyword asm is conditionally-supported and implementation defined, but it seems our best guess. In C, if asm does not exist, it is defined as __asm__ which is a GNU extension. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-13 15:32:48 +02:00
Pavan Nikhilesh	bc7d4b0346	eventdev: support Tx adapter event vector Add event vector support for event eth Tx adapter, the implementation receives events from the single linked queue and based on rte_event_vector::attr_valid transmits the vector of mbufs to a given port, queue pair. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Pavan Nikhilesh	d7c428e557	eventdev: support Rx adapter event vector Add event vector support for event eth Rx adapter, the implementation creates vector flows based on port and queue identifier of the received mbufs. The flow id for SW Rx event vectorization will use 12-bits of queue identifier and 8-bits port identifier when custom flow id is not set for simplicity. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Pavan Nikhilesh	3da4060a30	eventdev: introduce event vector Tx capability Introduce event vector transmit capability for event eth tx adapter. The capability indicates that the Tx adapter is capable of transmitting event vectors. When rte_event_vector::union_valid is set, the Tx adapter should transmit all the packets to the rte_event_vector::port using the rte_event_vector::queue. If rte_event_vector::union_valid is not set then the Tx adapter should peek into each mbuf to get the destination port and queue pair. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Pavan Nikhilesh	3c838062b9	eventdev: introduce event vector Rx capability Introduce event ethernet Rx adapter event vector capability. If an event eth Rx adapter has the capability of RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR then a given Rx queue can be configured to enable event vectorization by passing the flag RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR to rte_event_eth_rx_adapter_queue_conf::rx_queue_flags while configuring Rx adapter through rte_event_eth_rx_adapter_queue_add(). The max vector size, vector timeout define the vector size and mempool used for allocating vector event are configured through rte_event_eth_rx_adapter_queue_add. The element size of the element in the vector pool should be equal to sizeof(struct rte_event_vector) + (vector_sz * sizeof(uintptr_t)) Application can use `rte_event_vector_pool_create` to create the vector mempool used for rte_event_eth_rx_adapter_queue_conf::vector_mp. The Rx adapter would be responsible for vectorizing the mbufs based on the flow, the vector limits configured by the application and add the vector event of mbufs to the event queue set via rte_event_eth_rx_adapter_queue_conf::ev::queue_id. It should also mark rte_event_vector::union_valid and fill rte_event_vector::port, rte_event_vector::queue. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Pavan Nikhilesh	1cc44d4092	eventdev: introduce event vector capability Introduce rte_event_vector data structure which is capable of holding multiple uintptr_t of the same flow, thereby allowing applications to vectorize their pipeline and reducing the complexity of pipelining the events across multiple stages. This approach also reduces the scheduling overhead on an event device. Add an event vector mempool create handler to create mempools based on the best mempool ops available on a given platform. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>	2021-04-12 09:23:34 +02:00
Shijith Thotton	a10d79a60b	eventdev: introduce adapter flags for periodic mode A timer adapter in periodic mode can be used to arm periodic timers. This patch adds flags used to advertise capability and configure timer adapter in periodic mode. Capability flag should be set for adapters which support periodic mode. Below is a programming sequence on the usage: /* check for periodic mode support by reading capability. */ rte_event_timer_adapter_caps_get(...); /* create adapter in periodic mode by setting periodic flag (RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) and resolution. */ rte_event_timer_adapter_create_ext(...); /* arm periodic timer of configured resolution */ rte_event_timer_arm_burst(...); /* timer event will be periodically generated at configured resolution till cancel is called. */ while (running) { rte_event_dequeue_burst(...); } /* cancel periodic timer which stops generating events */ rte_event_timer_cancel_burst(...); Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2021-04-12 09:23:34 +02:00
Tal Shnaiderman	13e06abc9d	eal/windows: fix return codes of pthread shim layer The macro definitions of the following pthread functions return incorrect values from the inner function return code. While pthread_barrier_init(), pthread_barrier_destroy() and pthread_cancel() return 0 in a case of success and non-zero (errno) value otherwise, the shimming functions InitializeSynchronizationBarrier, DeleteSynchronizationBarrier and TerminateThread return FALSE (0) in a case of failure and TRUE (1) in a case of success. This issue was undetected as none of the functions' return codes were checked until such check was added in commit `34cc55cce6` ("eal: fix race in control thread creation") exposing the issue by failing pthread_barrier_init() and rte_eal_init() on Windows as a result. The fix aligned the return value of the 3 functions with the expected pthread API return values. Fixes: `e8428a9d89` ("eal/windows: add some basic functions and macros") Cc: stable@dpdk.org Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>	2021-04-12 22:35:31 +02:00
Chengchang Tang	18239da4ac	ethdev: validate input in EEPROM info This patch adds validity check of input pointer in EEPROM dump API. Fixes: `7a3f27cbf5` ("ethdev: add access to specific device info") Cc: stable@dpdk.org Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-08 00:26:39 +02:00
Chengchang Tang	94af45f400	ethdev: validate input in register info This patch adds validity check of input pointer in regs dump API. Fixes: `7a3f27cbf5` ("ethdev: add access to specific device info") Fixes: `936eda25e8` ("net/hns3: support dump register") Cc: stable@dpdk.org Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-08 00:26:39 +02:00
Chengchang Tang	e2bd08d569	ethdev: validate input in module EEPROM dump The validity verification of input parameters should be performed at API layer, not in the PMD. Fixes: `3a18c44b45` ("ethdev: add access to EEPROM") Fixes: `40ff8b305a` ("net/e1000: add module EEPROM callbacks for e1000") Fixes: `f2088e785c` ("net/i40e: fix dereference before check when getting EEPROM") Fixes: `b74d0cd43e` ("net/ixgbe: add module EEPROM callbacks for ixgbe") Fixes: `8a6a09f853` ("net/mlx5: support reading module EEPROM data") Fixes: `58f6f93c34` ("net/octeontx2: add module EEPROM dump") Cc: stable@dpdk.org Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-08 00:26:39 +02:00
Junjie Wan	968bbc7e2e	vhost: avoid IOTLB mempool allocation while IOMMU disabled If vhost device's IOMMU feature is disabled, IOTLB mempool allocation is unnecessary. Reported-by: Peng He <hepeng.0320@bytedance.com> Signed-off-by: Junjie Wan <wanjunjie@bytedance.com> Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-07 08:54:05 +02:00
Marvin Liu	98da5545be	vhost: fix initialization of async temporary header This patch fixes coverity issue in async enqueue function by adding initialization step before using temporary virtio header. Coverity issue: 366123 Fixes: `cd6760da10` ("vhost: introduce async enqueue for split ring") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>	2021-04-07 08:41:30 +02:00
Marvin Liu	5b784a2d80	vhost: fix initialization of temporary header This patch fixes coverity issue by adding initialization step before using temporary virtio header. Coverity issue: 366181 Fixes: `fb3815cc61` ("vhost: handle virtually non-contiguous buffers in Rx-mrg") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-04-07 08:41:30 +02:00
Qi Zhang	bb6270dab3	ethdev: refine debug build option PMDs use RTE_LIBRTE_<PMD_NAME>_DEBUG_RX|TX as build option to wrap data path debug code. As .config has been removed since the meson build, it is hard for new DPDK users to notice those debug options. The patch introduces the build options below for data path debug, so PMDs can choose to reuse them to avoid maintaining their own. - RTE_ETHDEV_DEBUG_RX - RTE_ETHDEV_DEBUG_TX All the build options are documented at programming guide "3.1 Driver Option", so users can easily find them. The original undocumented RTE_LIBRTE_ETHDEV_DEBUG will alias to both RTE_ETHDEV_DEBUG_RX and RTE_ETHDEV_DEBUG_TX for backward compatibility. Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-04-01 16:10:20 +02:00
Bruce Richardson	720dfda455	build: limit symbol checks to developer mode The checking of symbols within each library and driver is only of interest to developers, so limit to developer mode only. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-09 19:07:25 +02:00
Bruce Richardson	d317f7ebd4	build: hide debug messages in non-developer mode The messages about what components have what dependency names, and information about function versioning not being supported on windows are only of interest to developers, so hide them when building in non-developer mode. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-09 19:07:25 +02:00
Luc Pelletier	af68c1d699	eal: fix hang in control thread creation The affinity of a control thread is set after it has been launched. If setting the affinity fails, pthread_cancel is called followed by a call to pthread_join, which can hang forever if the thread's start routine doesn't call a pthread cancellation point. This patch modifies the logic so that the control thread exits gracefully if the affinity cannot be set successfully and removes the call to pthread_cancel. Fixes: `6383d2642b` ("eal: set name when creating a control thread") Cc: stable@dpdk.org Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-04-09 16:37:55 +02:00
Luc Pelletier	34cc55cce6	eal: fix race in control thread creation The creation of control threads uses a pthread barrier for synchronization. This patch fixes a race condition where the pthread barrier could get destroyed while one of the threads has not yet returned from the pthread_barrier_wait function, which could result in undefined behaviour. Fixes: `3a0d465d4c` ("eal: fix use-after-free on control thread creation") Cc: stable@dpdk.org Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>	2021-04-09 16:36:17 +02:00
Thomas Monjalon	76d409ce6e	vfio: reformat logs The log messages had various issues: - split on 2 lines, making search (grep) difficult - long lines (can be split after the string) - indented for no good reason (parent message may have higher log level) - inconsistent use of __func__, not meaningful context for user - lack of context (general message not mentioning VFIO) - log level too high (more below) Message having its level decreased from WARNING to NOTICE: "not managed by VFIO driver, skipping" Message having its level decreased from INFO to DEBUG: "Probing VFIO support..." Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-09 15:38:26 +02:00
David Marchand	3e08081637	eal: fix evaluation of log level option --log-level option is handled early, no need to reevaluate it later in EAL init. Before: $ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \ --log-level=lib.eal:debug \ --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \ |& grep -i log.level EAL: lib.eal log level changed from info to debug EAL: lib.ethdev log level changed from info to debug EAL: lib.ethdev log level changed from debug to info EAL: lib.ethdev log level changed from info to debug EAL: lib.ethdev log level changed from debug to info EAL: lib.telemetry log level changed from disabled to warning After: $ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \ --log-level=lib.eal:debug \ --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \ |& grep -i log.level EAL: lib.eal log level changed from info to debug EAL: lib.ethdev log level changed from info to debug EAL: lib.ethdev log level changed from debug to info EAL: lib.telemetry log level changed from disabled to warning Fixes: `6c7216eefd` ("eal: fix log level of early messages") Fixes: `1c806ae5c3` ("eal/windows: support command line options parsing") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>	2021-04-09 14:20:23 +02:00
David Marchand	eba022b926	log: track log level changes Add a log message when registering log types and changing log levels. __rte_log_register previously handled both legacy and dynamic logtypes. To simplify the code, __rte_log_register is reworked to only handle dynamic logtypes and takes a log level. Example: $ DPDK_TEST=logs_autotest ./build/app/test/dpdk-test --no-huge -m 512 \ --log-level=lib.eal:debug ... RTE>>logs_autotest == dynamic log types EAL: logtype1 log level changed from disabled to info EAL: logtype2 log level changed from disabled to info EAL: logtype1 log level changed from info to error EAL: logtype3 log level changed from error to emergency EAL: logtype2 log level changed from info to emergency EAL: logtype3 log level changed from emergency to debug EAL: logtype1 log level changed from error to debug EAL: logtype2 log level changed from emergency to debug error message critical message critical message error message == static log types TESTAPP1: error message TESTAPP1: critical message TESTAPP2: critical message TESTAPP1: error message Test OK Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-09 14:19:53 +02:00
Thomas Monjalon	3c20e6fe72	log: add option argument help The option --log-level was not completely described in the usage text, and it was difficult to guess the names of the log types and levels. A new value "help" is accepted after --log-level to give more details about the syntax and listing the log types and levels. The array "levels" used for level name parsing is replaced with a (modified) existing function which was used in rte_log_dump(). The new function rte_log_list_types() is exported in the API for allowing an application to give this info to the user if not exposing the EAL option --log-level. The list of log types cannot include all drivers if not linked in the application (shared object plugin case). Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-04-09 12:56:58 +02:00
Thomas Monjalon	c2bd208a90	log: catch invalid level option number The parsing check for invalid log level was not trying to catch irrelevant numeric values. A log level 0 becomes a failure in parsing so it can be caught early. A log level higher than the max (8) is accepted with a warning message. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-04-09 12:56:09 +02:00
Thomas Monjalon	806d888a80	log: introduce macro for maximum level RTE_DIM(...) and RTE_LOG_DEBUG were used to get the highest log level. For better clarity a new constant RTE_LOG_MAX is introduced and mapped to RTE_LOG_DEBUG. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-04-09 12:56:09 +02:00
Thomas Monjalon	867bb637ae	log: move private functions Some private log functions had a wrong "rte_" prefix. All private log functions are moved from eal_private.h to the new file eal_log.h: rte_eal_log_init -> eal_log_init rte_log_save_regexp -> eal_log_save_regexp rte_log_save_pattern -> eal_log_save_pattern eal_log_set_default The static functions in the file eal_common_log.c are renamed: rte_log_save_level -> log_save_level rte_log_lookup -> log_lookup rte_log_init -> log_init __rte_log_register -> log_register Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-04-09 12:56:09 +02:00
Stanislaw Kardach	5516e4760e	timer: clarify error if subsystem already initialized rte_timer_subsystem_init() may return -EALREADY if it has been already initialized. Therefore put explicitly into doxygen that this is not a failure for the application. Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Reviewed-by: Michal Krawczyk <mk@semihalf.com> Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>	2021-04-08 23:12:55 +02:00
David Marchand	551b29a714	eal: fix telemetry log type on registration failure rte_log_register_type_and_pick_level() returns an int. Casting to a uint32_t will make us miss the -1 passed in case of failure. Fall back to EAL log type like RTE_LOG_REGISTER. Fixes: `37b881a961` ("telemetry: use log function from pointer") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2021-04-08 18:31:58 +02:00
Thomas Monjalon	bd057ae47d	log: choose EAL log type on registration failure In the unlikely case where something goes wrong while registering a log type, the fallback is to use the EAL log type. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-04-08 18:31:58 +02:00
David Marchand	56ea803e87	build: remove Windows export symbol list Rather than have two files that keep getting out of sync, let's annotate the version.map to generate the Windows export file. Some mlx5 symbols (haswell_broadwell_cpu, mlx5_glue, mlx5_os_) were only exported for Windows. All of them are available and used by Linux too, so this patch adds them in version.map. Note: Existing version.map annotation achieved with: $ for dir in lib/librte_eal drivers/common/mlx5; do ./buildtools/map-list-symbol.sh $dir/.map | while read file version sym; do ! git grep -qw $sym $dir/.def || continue; sed -i -e "s/$sym;/$sym; # WINDOWS_NO_EXPORT/" $dir/.map; done; done Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-08 17:57:33 +02:00
David Marchand	60e0e75b61	service: clean references to removed symbol rte_service_get_id() was removed in v17.11 but the API description still referenced it and a version node was still present in EAL map. Fixes: `8edc9aaaf2` ("service: use id in get by name function") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-04-08 17:47:18 +02:00
Renata Saiakhova	2e761ce184	eal: add synchronous interrupt unregister Avoid a race when unregistering an interrupt handler while the interrupt source has active callbacks: use a wrapper around rte_intr_callback_unregister() that checks for the -EAGAIN return value and loops until rte_intr_callback_unregister() succeeds. Signed-off-by: Renata Saiakhova <renata.saiakhova@ekinops.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Harman Kalra <hkalra@marvell.com>	2021-04-07 11:16:11 +02:00
Roy Shterman	edf20bd8a5	mem: fix freeing segments in --huge-unlink mode When using huge_unlink we unlink the segment right after allocation. Although we unlink the file, we keep the fd in fd_list, so the file still exists and only the path is deleted. When freeing the hugepage we need to close the fd and assign it (-1) in fd_list for the page to be released. The current flow fails rte_malloc in the following flow when working with the --huge-unlink option: 1. alloc_seg() for segment A - We allocate a segment, unlink the path to the segment and keep the file descriptor in fd_list. 2. free_seg() for segment A - We clear the segment metadata and return - without closing the fd or assigning (-1) in fd_list. 3. alloc_seg() for segment A again - We find segment A as available, try to allocate it, find the old fd in fd_list, and try to unlink it as part of alloc_seg(), which fails because the path doesn't exist. The impact of such an error is falsely failing rte_malloc() although we have hugepages available. Fixes: `d435aad37d` ("mem: support --huge-unlink mode") Cc: stable@dpdk.org Signed-off-by: Roy Shterman <roy.shterman@vastdata.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-04-07 11:13:45 +02:00
Thomas Monjalon	4d509afa7b	pci: rename catch-all ID The name of the constant PCI_ANY_ID was missing RTE_ prefix. It is renamed, and the old name becomes a deprecated alias. While renaming, the duplicate definitions in rte_bus_pci.h are removed to keep only those in rte_pci.h. Note: rte_pci.h is included in rte_bus_pci.h Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2021-04-06 14:52:49 +02:00
Anatoly Burakov	190f38773a	power: do not skip saving original P-state governor Currently, when we set the pstate governor to "performance", we check if it is already set to this value, and if it is, we skip setting it. However, we never save this value anywhere, so that next time we come back and request the governor to be set to its original value, the original value is empty. Fix it by saving the original pstate governor first. While we're at it, replace `strlcpy` with `rte_strscpy`. Fixes: `e6c6dc0f96` ("power: add p-state driver compatibility") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2021-04-06 10:36:49 +02:00
Anatoly Burakov	8a5febaac4	power: fix P-state base frequency handling Previous fix for base frequency handling in pstate mode introduced a couple of issues: - When base_frequency file does not exist, it simply bails out because of what appears to be accidental addition of FOPEN_OR_ERR_RET. This is incorrect, as absence of this file is not fatal and is in fact expected on kernel versions earlier than 5.3 - When base_frequency file does exist, it gets opened, but never gets closed, resulting in a resource leak Both issues also manifest themselves as Coverity defects (dead code, and a resource leak), so this fix addresses both. Coverity issue: 369693, 369694 Bugzilla ID: 668 Fixes: `4db9587bbf` ("power: check sysfs base frequency") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Reshma Pattan <reshma.pattan@intel.com>	2021-04-06 10:36:42 +02:00
Marvin Liu	af584d21bf	vhost: fix batch dequeue potential buffer overflow As with single dequeue, multiple accesses of the descriptor length pose a potential risk. Accessing the descriptor length once eliminates this risk. Fixes: `75ed516978` ("vhost: add packed ring batch dequeue") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-03-31 09:34:17 +02:00
Marvin Liu	93ed2f49de	vhost: fix packed ring potential buffer overflow As with split ring, multiple accesses of the descriptor length pose a potential risk. Accessing the descriptor length once eliminates this risk. Fixes: `2f3225a7d6` ("vhost: add vector filling support for packed ring") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-03-31 09:34:17 +02:00
Marvin Liu	134228ca39	vhost: fix split ring potential buffer overflow In the vhost datapath, a descriptor's length is mostly used in two coherent operations. The first step uses it for address translation, the second for the memory transaction from guest to host. But the interval between the two steps gives a window in which a malicious guest can change the descriptor length after vhost has calculated the buffer size. This may lead to a buffer overflow on the vhost side. This potential risk can be eliminated by accessing the descriptor length once. Fixes: `1be4ebb1c4` ("vhost: support indirect descriptor in mergeable Rx") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-03-31 09:34:17 +02:00
Keiichi Watanabe	790b1c3171	vhost: get negotiated protocol features Add rte_vhost_get_negotiated_protocol_features, which returns a set of enabled protocol features. Signed-off-by: Keiichi Watanabe <keiichiw@chromium.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2021-03-31 08:15:14 +02:00
Maxime Coquelin	af4844503e	vhost: optimize virtqueue structure This patch moves vhost_virtqueue struct fields in order to both optimize packing and move hot fields on the first cachelines. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: Balazs Nemeth <bnemeth@redhat.com>	2021-03-31 07:48:32 +02:00
Maxime Coquelin	1818a63147	vhost: move dirty logging cache out of virtqueue This patch moves the per-virtqueue's dirty logging cache out of the virtqueue struct, by allocating it dynamically only when live-migration is enabled. It saves 8 cachelines in vhost_virtqueue struct. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: Balazs Nemeth <bnemeth@redhat.com>	2021-03-31 07:48:32 +02:00
Maxime Coquelin	2453bbf7e1	vhost: remove unused virtqueue field This patch removes the "backend" field of the vhost_virtqueue struct, which is not used by the library. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: Balazs Nemeth <bnemeth@redhat.com>	2021-03-31 07:48:32 +02:00
Tyler Retzlaff	19cc526d6c	ethdev: install driver headers Introduce a meson option 'enable_driver_sdk', when true installs internal driver headers for ethdev. This allows drivers that do not depend on stable api/abi to be built external to the dpdk source tree. Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-30 14:46:33 +02:00
Thomas Monjalon	fb7ad441d4	ethdev: replace callback getting filter operations Since rte_flow is the only API for filtering operations, the legacy driver interface filter_ctrl was too complicated for the simple task of getting the struct rte_flow_ops. The filter type RTE_ETH_FILTER_GENERIC and the filter operation RTE_ETH_FILTER_GET are removed. The new driver callback flow_ops_get replaces filter_ctrl. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-26 18:37:13 +01:00
Ivan Malov	43af98e687	ethdev: reuse VXLAN header definition in flow item One ought to reuse existing header structs in flow items. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-22 17:19:16 +01:00
Ivan Malov	694d6ad392	net: clarify endianness of 32-bit fields in VXLAN headers These fields have network byte order. Highlight it using dedicated type. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-22 17:19:16 +01:00
Ivan Malov	a56a262e34	ethdev: reuse VLAN header definition in flow item One ought to reuse existing header structs in flow items. This particular item contains non-header fields, so it's important to keep the header fields in a separate struct. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-22 17:19:16 +01:00
Ivan Malov	6f2168b69a	ethdev: reuse ethernet header definition in flow item One ought to reuse existing header structs in flow items. This particular item contains non-header fields, so it's important to keep the header fields in a separate struct. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-03-22 17:19:16 +01:00
David Marchand	6545d6c52b	telemetry: cleanup internal header The experimental banner can be removed. Every in-tree file is compiled with _GNU_SOURCE, so RTE_HAS_CPUSET is unneeded for an internal header. Fixes: `0e64ae618e` ("telemetry: move init function to internal header") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ciara Power <ciara.power@intel.com>	2021-03-26 17:17:45 +01:00
Dmitry Kozlyuk	b2f24588b5	mem: fix cleanup when multi-process is disabled rte_eal_memory_detach() did not account for cases where multi-process mode is disabled: --in-memory and --no-shconf. This resulted in unmapping memory that had not been mapped, which caused errors: EAL: Could not unmap memory: No error (Windows) EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument (Linux) Confusing "No error" was caused by using errno instead of rte_errno set by rte_mem_unmap(). Skip detaching memory altogether when --in-memory is specified. Skip unmapping configuration when it's not shared. Fix and add error handling to produce proper log messages. Fixes: `dfbc61a2f9` ("mem: detach memsegs on cleanup") Reported-by: Jie Zhou <jizh@microsoft.com> Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2021-03-26 17:17:45 +01:00
Tal Shnaiderman	1325a1ffd9	eal: rename thread TLS API Rename the key opaque pointer from rte_tls_key to rte_thread_key to avoid confusion with transport layer security. Also rename and remove the "_tls" term from the following functions to avoid redundancy: rte_thread_tls_key_create rte_thread_tls_key_delete rte_thread_tls_value_set rte_thread_tls_value_get Suggested-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com> Suggested-by: Morten Brørup <mb@smartsharesystems.com> Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Morten Brørup <mb@smartsharesystems.com>	2021-03-26 09:22:39 +01:00
Tal Shnaiderman	3d2913c67c	eal: add error numbers in thread TLS API Add error number reporting to rte_errno in all functions in the rte_thread_tls_* API. Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>	2021-03-26 09:21:05 +01:00
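
The three vhost buffer overflow fixes listed above (`134228ca39`, `93ed2f49de`, `af584d21bf`) all rely on reading the descriptor length only once. The following is a minimal sketch of that pattern in plain C, not the actual vhost code: the function name `copy_desc_payload` and its parameters are hypothetical, and the guest-writable length is modelled as a `volatile` field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): copy a descriptor payload into a
 * host buffer of capacity dst_cap.
 *
 * desc_len points at memory the guest can modify at any time, so it is
 * read exactly once into a local variable. Both the bounds check and the
 * copy then use that same snapshot, so a later change by the guest cannot
 * make the copy larger than the size that was validated.
 *
 * Returns the number of bytes copied, or 0 if the length does not fit.
 */
size_t copy_desc_payload(const volatile uint32_t *desc_len,
			 const uint8_t *src, uint8_t *dst, size_t dst_cap)
{
	uint32_t len = *desc_len; /* single read of the guest-owned field */

	if (len > dst_cap)
		return 0; /* reject instead of overflowing dst */

	memcpy(dst, src, len); /* uses the snapshot, never *desc_len again */
	return len;
}
```

The vulnerable variant would dereference `desc_len` a second time for the `memcpy()` size, which reopens the window between validation and use that the commit messages describe.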

7518 Commits