numam-dpdk

Author	SHA1	Message	Date
Michael Baum	151cbe3aab	net/mlx5: separate Rx function declarations to another file The mlx5_rxtx.c file contains a lot of Tx burst functions, each of those is performance-optimized for the specific set of requested offloads. These ones are generated on the basis of the template function and it takes significant time to compile, just due to a large number of giant functions generated in the same file and this compilation is not being done in parallel with using multithreading. Therefore we can split the mlx5_rxtx.c file into several separate files to allow different functions to be compiled simultaneously. In this patch, we separate Rx function declarations to different header file in preparation for removing them from the source file and as an optional preparation step for further consolidation of Rx burst functions. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-04-15 08:24:49 +02:00
Alexander Kozyrev	5fc2e5c27d	net/mlx5: fix mbuf overflow in vectorized MPRQ Changing the allocation scheme to improve mbufs locality caused mbufs overrun in some cases. Revert the previous replenish logic back. Calculate a number of unused mbufs and replenish max this number of mbufs. Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed mbufs in the future. Keep the consumed index and the produced index 4 mbufs apart for this purpose. Replenish some mbufs only in case the consumed index is within the replenish threshold of the produced index in order to retain the cache locality for the vectorized MPRQ routine. Fixes: `5c68764377` ("net/mlx5: improve vectorized MPRQ descriptors locality") Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-11-22 15:37:03 +01:00
Alexander Kozyrev	5c68764377	net/mlx5: improve vectorized MPRQ descriptors locality There is a performance penalty for the replenish scheme used in vectorized Rx burst for both MPRQ and SPRQ. Mbuf elements are being filled at the end of the mbufs array and being replenished at the beginning. That leads to an increase in cache misses and the performance drop. The more Rx descriptors are used the worse the situation. Change the allocation scheme for vectorized MPRQ Rx burst: allocate new mbufs only when consumed mbufs are almost depleted (always have one burst gap between allocated and consumed indices). Keeping a small number of mbufs allocated improves cache locality and improves performance a lot. Unfortunately, this approach cannot be applied to SPRQ Rx burst routine. In MPRQ Rx burst we simply copy packets from external MPRQ buffers or attach these buffers to mbufs. In SPRQ Rx burst we allow the NIC to fill mbufs for us. Hence keeping a small number of allocated mbufs will limit NIC ability to fill as many buffers as possible. This fact offsets the advantage of better cache locality. Fixes: `0f20acbf5e` ("net/mlx5: implement vectorized MPRQ burst") Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-11-13 23:36:14 +01:00
Alexander Kozyrev	0f20acbf5e	net/mlx5: implement vectorized MPRQ burst MPRQ (Multi-Packet Rx Queue) processes one packet at a time using simple scalar instructions. MPRQ works by posting a single large buffer (consisted of multiple fixed-size strides) in order to receive multiple packets at once on this buffer. A Rx packet is then copied to a user-provided mbuf or PMD attaches the Rx packet to the mbuf by the pointer to an external buffer. There is an opportunity to speed up the packet receiving by processing 4 packets simultaneously using SIMD (single instruction, multiple data) extensions. Allocate mbufs in batches for every MPRQ buffer and process the packets in groups of 4 until all the strides are exhausted. Then switch to another MPRQ buffer and repeat the process over again. The vectorized MPRQ burst routine is engaged automatically in case the mprq_en=1 devarg is specified and the vectorization is not disabled explicitly by providing rx_vec_en=0 devarg. There is a limitation: LRO is not supported and scalar MPRQ is selected if it is on. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-11-03 23:24:25 +01:00
Alexander Kozyrev	1ded26239a	net/mlx5: refactor vectorized Rx Move the main processing cycle into a separate function: rxq_cq_process_v. Put the regular rxq_burst_v function to a non-arch specific file. Having all SIMD instructions in a single reusable block is a first preparatory step to implement vectorized Rx burst for MPRQ feature. Pass a pointer to the storage of mbufs directly to the rxq_copy_mbuf_v instead of calculating the pointer inside this function. This is needed for the future vectorized Rx routing which is going to pass a different pointer here. Calculate the number of packets to replenish inside the mlx5_rx_replenish_bulk_mbuf. Containing this logic in one place allows us to do the same for MPRQ case. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-11-03 23:24:25 +01:00
Ciara Power	2c5e0dd21f	net/mlx5: check max SIMD bitwidth When choosing a vector path to take, an extra condition must be satisfied to ensure the max SIMD bitwidth allows for the CPU enabled path. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2020-10-19 16:45:02 +02:00
Ophir Munk	9d60f54569	common/mlx5: remove inclusion of Verbs header files Several source files include Verbs header files as in (1). These source files will not compile under non-Linux operating systems. This commit removes this inclusion in two cases: Case 1: There is no usage of ibv_* or mlx5dv_* symbols in the source file so the inclusion in (1) can be safely removed. Case 2: Verbs symbols are used. Please note the inclusion in (1) already appears in file linux/mlx5_glue.h (which represents the interface to the rdma-core library). Therefore, replace (1) in the source file with (2). Under non-Linux operating systems - file mlx5_glue.h will not include (1). (1) #include <infiniband/verbs.h> #include <infiniband/mlx5dv.h> (2) #include <mlx5_glue.h> Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-07-21 15:46:30 +02:00
Michael Baum	0f006468c5	net/mlx5: fix iterator type in Rx queue management The mlx5_check_vec_rx_support function in the mlx5_rxtx_vec.c file passes the RX queues array in the loop. Similarly, the mlx5_mprq_enabled function in the mlx5_rxq.c file passes the RX queues array in the loop. In both cases, the iterator of the loop is called i and the variable representing the array size is called rxqs_n. The i variable is of UINT16_T type while the rxqs_n variable is of unsigned int type. The size of the rxqs_n variable is much larger than the number of iterations allowed by the i type, theoretically there may be a situation where the value of the rxqs_n will be greater than can be represented by 16 bits and the loop will never end. Change the type of i to UINT32_T. Fixes: `7d6bf6b866` ("net/mlx5: add Multi-Packet Rx support") Fixes: `6cb559d67b` ("net/mlx5: add vectorized Rx/Tx burst for x86") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-30 14:52:30 +02:00
Alexander Kozyrev	c9cc554ba4	net/mlx5: fix vectorized Rx burst termination Maximum burst size of Vectorized Rx burst routine is set to MLX5_VPMD_RX_MAX_BURST(64). This limits the performance of any application that would like to gather more than 64 packets from the single Rx burst for batch processing (i.e. VPP). The situation gets worse with a mix of zipped and unzipped CQEs. They are processed separately and the Rx burst function returns small number of packets every call. Repeat the cycle of gathering packets from the vectorized Rx routine until a requested number of packets are collected or there are no more CQEs left to process. Fixes: `6cb559d67b` ("net/mlx5: add vectorized Rx/Tx burst for x86") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-03 17:20:32 +02:00
Thomas Monjalon	ce6427ddca	replace cold attributes The new macro __rte_cold, for compiler hinting, is now used where appropriate for consistency. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>	2020-04-16 18:30:58 +02:00
Alexander Kozyrev	8e46d4e18f	common/mlx5: improve assert control Use the MLX5_ASSERT macros instead of the standard assert clause. Depends on the RTE_LIBRTE_MLX5_DEBUG configuration option to define it. If RTE_LIBRTE_MLX5_DEBUG is enabled MLX5_ASSERT is equal to RTE_VERIFY to bypass the global CONFIG_RTE_ENABLE_ASSERT option. If RTE_LIBRTE_MLX5_DEBUG is disabled, the global CONFIG_RTE_ENABLE_ASSERT can still make this assert active by calling RTE_VERIFY inside RTE_ASSERT. Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-02-05 09:51:21 +01:00
Matan Azrad	7b4f1e6bd3	common/mlx5: introduce common library A new Mellanox vdpa PMD will be added to support vdpa operations by Mellanox adapters. This vdpa PMD design includes mlx5_glue and mlx5_devx operations and large parts of them are shared with the net/mlx5 PMD. Create a new common library in drivers/common for mlx5 PMDs. Move mlx5_glue, mlx5_devx_cmds and their dependencies to the new mlx5 common library in drivers/common. The files mlx5_devx_cmds.c, mlx5_devx_cmds.h, mlx5_glue.c, mlx5_glue.h and mlx5_prm.h are moved as is from drivers/net/mlx5 to drivers/common/mlx5. Share the log mechanism macros. Separate also the log mechanism to allow different log level control to the common library. Build files and version files are adjusted accordingly. Include lines are adjusted accordingly. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-02-05 09:51:20 +01:00
David Christensen	2e542da709	net/mlx5: add Altivec Rx Added mlx5_rxtx_vec_altivec.h which supports vectorized RX using Altivec vector code. Modified associated build files to use the new code. Signed-off-by: David Christensen <drc@linux.vnet.ibm.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Tested-by: Raslan Darawsheh <rasland@mellanox.com>	2019-10-25 19:00:22 +02:00
Matan Azrad	17ed314c6c	net/mlx5: allow LRO per Rx queue Enabling LRO offload per queue makes sense because the user will probably want to allocate different mempool for LRO queues - the LRO mempool mbuf size may be bigger than non LRO mempool. Change the LRO offload to be per queue instead of per port. If one of the queues is with LRO enabled, all the queues will be configured via DevX. If RSS flows direct TCP packets to queues with different LRO enabling, these flows will not be offloaded with LRO. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-29 16:54:27 +02:00
Matan Azrad	bd41389e35	net/mlx5: allow LRO in regular Rx queue LRO support was only for MPRQ, hence mprq Rx burst was selected when LRO was configured in the port. The current support for MPRQ is suffering from bad memory utilization since an external mempool is allocated by the PMD for the packets data in addition to the user mempool, besides that, the user may get packet data addresses which were not configured by him. Even though MPRQ has the best performance for packet receiving in the most cases and because of the above facts it is better to remove the automatic MPRQ select when LRO is configured. Move MPRQ to be selected only when the user force it by the PMD arguments including LRO case. Allow LRO offload using the regular RQ with the regular Rx burst function. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-29 16:54:27 +02:00
Viacheslav Ovsiienko	a6bd4911ad	net/mlx5: remove Tx implementation This patch removes the existing Tx datapath code as preparation step before introducing the new implementation. The following entities are being removed: - deprecated devargs support - tx_burst() routines - related PRM definitions - SQ configuration code - Tx routine selection code - incompatible Tx completion code The following devargs are deprecated and ignored: - "txq_inline" is going to be converted to "txq_inline_max" for compatibility issue - "tx_vec_en" - "txqs_max_vec" - "txq_mpw_hdr_dseg_en" - "txq_max_inline_len" is going to be converted to "txq_inline_mpw" for compatibility issue The deprecated devarg keys are recognized by PMD and ignored/converted to the new ones in order not to block device probing. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-07-23 14:31:36 +02:00
Matan Azrad	88c0733535	net/mlx5: extend Rx completion with error handling When WQEs are posted to the HW to receive packets, the PMD may receive a completion report with error from the HW, aka error CQE which is associated to a bad WQE. The error reason may be bad address, wrong lkey, small buffer size, etc. that can wrongly be configured by the PMD or by the user. Checking all the optional mistakes to prevent error CQEs doesn't make sense due to performance impacts, moreover, some error CQEs can be triggered because of the packets coming from the wire when the DPDK application has no any control. Most of the error CQE types change the RQ state to error state what causes all the next received packets to be dropped by the HW and to be completed with CQE flush error forever. The current solution detects these error CQEs and even reports the errors to the user by the statistics error counters but without recovery, so if the RQ inserted to the error state it never moves to ready state again and all the next packets ever will be dropped. Extend the error CQEs handling for recovery by moving the state to ready again, and rearranging all the RQ WQEs and the management variables appropriately. Sometimes the error CQE root cause is very hard to debug and even may be related to some corner cases which are not reproducible easily, hence a dump file with debug information will be created for the first number of error CQEs, this number can be configured by the PMD probe parameters. Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-06-14 00:01:06 +09:00
Thomas Monjalon	dbeba4cf18	net/mlx: prefix private structure The private structure stored in rte_eth_dev->data->dev_private was named "struct priv". In order to ease code browsing, the structure is renamed "struct mlx[45]_priv". Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-03-01 18:17:35 +01:00
Yongseok Koh	09d8b41699	net/mlx5: make vectorized Tx threshold configurable Add txqs_max_vec parameter to configure the maximum number of Tx queues to enable vectorized Tx. And its default value is set according to the architecture and device type. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-11-05 15:01:25 +01:00
Dekel Peled	6bd7fbd03c	net/mlx5: support metadata as flow rule criteria As described in series starting at [1], it adds option to set metadata value as match pattern when creating a new flow rule. This patch adds metadata support in mlx5 driver, in two parts: - Add the validation and setting of metadata value in matcher, when creating a new flow rule. - Add the passing of metadata value from mbuf to wqe when indicated by ol_flag, in different burst functions. [1] "ethdev: support metadata as flow rule criteria" http://mails.dpdk.org/archives/dev/2018-September/113269.html Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-10-26 22:14:06 +02:00
Yongseok Koh	7d6bf6b866	net/mlx5: add Multi-Packet Rx support Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe bandwidth by posting a single large buffer for multiple packets. Instead of posting a buffer per a packet, one large buffer is posted in order to receive multiple packets on the buffer. A MPRQ buffer consists of multiple fixed-size strides and each stride receives one packet. Rx packet is mem-copied to a user-provided mbuf if the size of Rx packet is comparatively small, or PMD attaches the Rx packet to the mbuf by external buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external buffers will be allocated and managed by PMD. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-05-14 22:31:52 +01:00
Xueming Li	5f8ba81c42	net/mlx5: support generic tunnel offloading This commit adds support for generic tunnel TSO and checksum offload. PMD will compute the inner/outer headers offset according to the mbuf fields. Hardware will do calculation based on offsets and types. Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-04-27 17:34:43 +01:00
Shahaf Shuler	5feecc57d9	align SPDX Mellanox copyrights Aligning Mellanox SPDX copyrights to a single format. In addition replace to SPDX licence files which were missed. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-04-11 01:47:47 +02:00
Nélio Laranjeiro	af4f09f282	net/mlx5: prefix all functions with mlx5 This change removes the need to distinguish unlocked priv_() functions which are therefore renamed using a mlx5_() prefix for consistency. At the same time, all functions from mlx5 uses a pointer to the ETH device instead of the one to the PMD private data. Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-03-30 14:08:44 +02:00
Olivier Matz	8fd92a66c6	net/mlx5: use SPDX tags in 6WIND copyrighted files Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-02-01 02:32:52 +01:00
Shahaf Shuler	dbccb4cddc	net/mlx5: convert to new Tx offloads API Ethdev Tx offloads API has changed since: commit `cba7f53b71` ("ethdev: introduce Tx queue offloads API") This commit support the new Tx offloads API. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2018-01-16 18:47:49 +01:00
Shahaf Shuler	7fe24446e9	net/mlx5: add device configuration structure Move device configuration and features capabilities to its own structure. This structure is filled by mlx5_pci_probe(), outside of this function it should be treated as read only. This configuration struct will be used for the Tx/Rx queue setup to select the Tx/Rx queue parameters based on the user configuration and device capabilities. In addition it will be used by the burst selection function to decide on the best pkt burst to be used. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2018-01-16 18:47:49 +01:00
Yongseok Koh	d27fb0de2f	net/mlx5: fix overwriting bit-fields in SW Rx queue Bit-fields in mlx5_rxq_data can be changed on the fly by a control plane - e.g. rxq->mark. However, vectorized Rx uses a bit-field to mark pending errors. Even if one bit is written, consequence is to write the whole integer and this can cause a synchronization issue - two entities write to a same block without locking. As the pending_err bit is entirely internal use for the datapath, this can be replaced with a local variable. Fixes: `6cb559d67b` ("net/mlx5: add vectorized Rx/Tx burst for x86") Fixes: `570acdb1da` ("net/mlx5: add vectorized Rx/Tx burst for ARM") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-01-16 18:47:49 +01:00
Yongseok Koh	4b0d7b7fff	net/mlx5: add fallback in Tx for multi-segment packet mlx5_tx_burst_empw() falls back to legacy Tx descriptor for multi-segmented packets without taking advantage of inlining. In many cases, the 1st segment can be inlined and this could make device fetch only one segment instead of two. This helps saving PCIe bandwidth when transmitting out multi-segmented packets with still using the Enhanced Multi-Packet Send for other packets. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-01-16 18:47:49 +01:00
Nélio Laranjeiro	4aa15eb19a	net/mlx5: fix Tx checksum offloads Tx checksum offloads are correctly handled in a single Tx burst function whereas the capability is always set. This causes VXLAN packet with checksum offloads request to be ignored when the (E)MPS Tx functions are selected. Fixes: `f5fde52051` ("net/mlx5: add hardware checksum offload for tunnel packets") Cc: stable@dpdk.org Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-01-16 18:47:49 +01:00
Yongseok Koh	570acdb1da	net/mlx5: add vectorized Rx/Tx burst for ARM Brings vectorization through NEON instructions. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-12 01:36:58 +01:00
Yongseok Koh	3c2ddbd413	net/mlx5: separate shareable vector functions Considering more architecture (e.g. ARM and PowerPC) will be added for vectorized Rx/Tx burst, all the shareable functions which don't use any vector intrinsics need to be separated from architecture-dependent functions. All the vector functions for x86 SSE are moved to a new header file - mlx5_rxtx_vec_sse.h. And shareable common functions are now in mlx5_rxtx_vec.c. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-12 01:36:58 +01:00
Yongseok Koh	5bfc9fc112	net/mlx5: use static assert for compile-time sanity checks Replace compile-time sanity check with static_assert() as c11 standard has been set. Add mlx5_rxtx_vec.h and move the sanity checks to the file Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-12 01:36:58 +01:00
Yongseok Koh	f0d2114f91	net/mlx5: rename a file of SSE Rx/Tx Rename mlx5_rxtx_vec_sse.c to mlx5_rxtx_vec.c to separate shareable vector functions. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-12 01:36:58 +01:00

34 Commits