numam-dpdk

Author	SHA1	Message	Date
Viacheslav Ovsiienko	572c9d4bda	net/mlx5: fix shared Rx queue segment configuration match While joining the shared Rx queue to the existing queue group, the queue configuration is checked to be the same as the one specified when the first queue of the group was created - all shared queues should be created with identical configurations. During the Rx queue creation the buffer split segment configuration can be altered - the zero segment sizes are substituted with the actual ones, inherited from the pools, the number of segments can be extended to cover the maximal packet length, etc. It means the actual queue segment configuration cannot be used directly to match the configuration provided in the queue setup call. To resolve the issue we should store the original parameters in the shared queue structure and perform the check against these stored ones. Fixes: `09c2555303` ("net/mlx5: support shared Rx queue") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-24 17:25:37 +01:00
Dmitry Kozlyuk	ec9b812b6c	net/mlx5: fix Rx queue reference count for indirect RSS mlx5_ind_table_obj_modify() was changing the reference counters of neither the new set of RxQs nor the old set of RxQs. On the other hand, creation of the RSS incremented the RxQ refcnt. If an RxQ was present in both the initial and the modified set, its reference counter was incremented one extra time compared to the queues that were only present in the new set. This prevented releasing said RxQ resources on port stop: flow indirect_action 0 create action_id 1 \ action rss queues 0 1 end / end flow indirect_action 0 update 1 \ action rss queues 2 3 end / end quit ... mlx5_net: mlx5.c:1622: mlx5_dev_close(): port 0 some Rx queue objects still remain mlx5_net: mlx5.c:1626: mlx5_dev_close(): port 0 some Rx queues still remain Increment reference counters for the new set of RxQs and decrement them for the old set of RxQs when needed. Remove explicit referencing of RxQ from mlx5_ind_table_obj_attach() because it reuses mlx5_ind_table_obj_modify() code doing this. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com>	2021-11-24 17:25:35 +01:00
Dmitry Kozlyuk	c65d684497	net/mlx5: fix indirect RSS creation when port is stopped mlx5_ind_table_obj_setup() was incrementing RxQ reference counters even when the port was stopped, which prevented RxQ release and triggered an internal assertion. Only increment reference counter when the port is started. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com>	2021-11-24 17:25:33 +01:00
Dariusz Sosnowski	8fbce96fbe	net/mlx5: fix reference count on detached indirect action This patch fixes a segfault which was triggered when a port with indirect actions created was closed. The segfault occurred only when RTE_LIBRTE_MLX5_DEBUG was defined. It was caused by a redundant decrement of the Rx queue refcount: - refcount was decremented when port was stopped and indirect actions were detached from Rx queues (port stop), - refcount was decremented when indirect action objects were destroyed (port close or destroying of indirect action). This patch fixes the behavior. Dereferencing Rx queues is done if and only if the indirect action is explicitly destroyed by the user or detached on port stop. Dereferencing Rx queues on the action destroy operation depends on an argument to the wrapper of the indirect action destroy operation, introduced in this patch. Fixes: `ec4e11d41d` ("net/mlx5: preserve indirect actions on restart") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-23 17:57:19 +01:00
Michael Baum	71304b5c7b	common/mlx5: fix redundant field in MR control structure Inside the MR control structure there is a pointer to the common device. This pointer enables access to the global cache as well as hardware objects that may be required in case a new MR needs to be created. The purpose of adding this pointer into the MR control structure was to avoid passing it as a parameter to all the functions that search for an MR in the caches. However, adding it to this structure enlarged the Rx and Tx data-path structures: all the fields that followed it were moved slightly further away, which caused a reduction in performance. This patch removes the pointer from the structure. It can be accessed through the existing "dev_gen_ptr" field using the "container_of" operator. Fixes: `334ed198ab` ("common/mlx5: remove redundant parameter in MR search") Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-17 10:42:20 +01:00
Dmitry Kozlyuk	077be91dd7	net/mlx5: fix split buffer Rx Routine to lookup LKey on Rx was assuming that the mbuf address always belongs to a single mempool: the one associated with an RxQ or the MPRQ mempool. This assumption is false for split buffers case. A wrong LKey was looked up, resulting in completion errors. Modify lookup routines to lookup LKey in the mbuf->pool for non-MPRQ cases both on Rx datapath and on queue initialization. Fixes: `fec28ca0e3` ("net/mlx5: support mempool registration") Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-08 13:56:29 +01:00
Michael Baum	5dfa003db5	common/mlx5: fix post doorbell barrier The rdma-core library can map the doorbell register in two ways, depending on the environment variable "MLX5_SHUT_UP_BF": - as regular cached memory, the variable is either missing or set to zero. This type of mapping may cause significant doorbell register write latency and requires an explicit memory write barrier to mitigate this issue and prevent write combining. - as non-cached memory, the variable is present and set to a value other than "0". This type of mapping may cause a performance impact under heavy load, but the explicit write memory barrier is not required and it may improve core performance. The UAR creation function maps a doorbell in one of the above ways according to the system. At run time, it always adds an explicit memory barrier after writing to the doorbell. In cases where the doorbell was mapped as non-cached memory, the explicit memory barrier is unnecessary and may impair performance. The commit [1] solved this problem for a Tx queue. At run time, it checks the mapping type and provides the memory barrier after writing to a Tx doorbell register if it is needed. The mapping type is extracted directly from the uar_mmap_offset field in the queue properties. This patch shares this code between the drivers and extends the above solution to each of them. [1] commit `8409a28573` ("net/mlx5: control transmit doorbell register mapping") Fixes: `f8c97babc9` ("compress/mlx5: add data-path functions") Fixes: `8e196c08ab` ("crypto/mlx5: support enqueue/dequeue operations") Fixes: `4d4e245ad6` ("regex/mlx5: support enqueue") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 16:21:03 +01:00
Michael Baum	334ed198ab	common/mlx5: remove redundant parameter in MR search Memory region management has recently been shared between drivers, including the search for caches in the data plane. The initial search in the local linear cache of the queue, usually yields a result and one should not continue searching in the next level caches. The function that searches in the local cache gets the pointer to a device as a parameter, that is not necessary for its operation but for subsequent searches (which, as mentioned, usually do not happen). Transferring the device to a function and maintaining it, takes some time and causes some impact on performance. Add the pointer to the device as a field of the mr_ctrl structure. The field will be updated during control path and will be used only when needed in the search. Fixes: `fc59a1ec55` ("common/mlx5: share MR mempool registration") Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-07 14:11:16 +01:00
Bing Zhao	febcac7b46	net/mlx5: support Rx queue delay drop For the Ethernet RQs, if all receive descriptors are exhausted, the packets being received will be dropped. This behavior prevents slow or malicious software entities at the host from affecting the network. For hairpin cases, however, even if there is no software involved during the packet forwarding from Rx to Tx side, some hiccup in the hardware or back pressure from the Tx side may still cause the descriptors to be exhausted. In certain scenarios it may be preferred to configure the device to avoid such packet drops, assuming the posting of descriptors will resume shortly. To support this, a new devarg "delay_drop" is introduced. By default, the delay drop is enabled for hairpin Rx queues and disabled for standard Rx queues. This value is used as a bit mask: - bit 0: enablement of standard Rx queue - bit 1: enablement of hairpin Rx queue And this attribute will be applied to all Rx queues of a device. The "rq_delay_drop" capability in the HCA_CAP is checked before creating any queue. If the hardware capabilities do not support this delay drop, all the Rx queues will still be created without this attribute, and the devarg setting will be ignored even if it is specified explicitly. A warning log is used to notify the application when this occurs. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-05 17:04:53 +01:00
Xueming Li	09c2555303	net/mlx5: support shared Rx queue This patch introduces shared RxQ. All shared Rx queues with the same group and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared: all queues from different member ports share the same WQ and CQ, essentially one Rx WQ, and mbufs are filled into this singleton WQ. Shared rxq_data is set into the device Rx queues of all member ports as the RxQ object, used for receiving packets. Polling the queue of any member port returns packets of any member; mbuf->port is used to identify the source port. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:50 +01:00
Xueming Li	5cf0707fc7	net/mlx5: remove Rx queue data list from device The Rx queue data list (priv->rxqs) can be replaced by the Rx queue list (priv->rxq_privs). Remove it and replace it with a universal wrapper API. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:49 +01:00
Xueming Li	5ceb3a02b0	net/mlx5: move Rx queue DevX resource To support shared Rx queue, move the DevX RQ, which is a per-queue resource, to Rx queue private data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:48 +01:00
Xueming Li	5db77fef78	net/mlx5: remove port info from shareable Rx queue To prepare for shared Rx queue, remove port info from the shareable Rx queue control. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:47 +01:00
Xueming Li	44126bd9d0	net/mlx5: move Rx queue hairpin info to private data Hairpin info of an Rx queue can't be shared, so move it to private queue data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:47 +01:00
Xueming Li	0cedf34da7	net/mlx5: move Rx queue reference count The Rx queue reference count is the counter of the RQ, used to count references to the RQ object. To prepare for shared Rx queue, this patch moves it from rxq_ctrl to Rx queue private data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:46 +01:00
Xueming Li	4cda06c3c3	net/mlx5: split Rx queue into shareable and private To prepare for shared Rx queue, split RxQ data into shareable and private parts. Struct mlx5_rxq_priv is per-queue data. Struct mlx5_rxq_ctrl is shared queue resources and data. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:45 +01:00
Dmitry Kozlyuk	ec4e11d41d	net/mlx5: preserve indirect actions on restart MLX5 PMD uses reference counting to manage RX queue resources. After port stop shared RSS actions kept references to RX queues, preventing resource release. As a result, internal PMD mempool for such queues had been exhausted after a number of port restarts. Diagnostic message from rte_eth_dev_start(): Rx queue allocation failed: Cannot allocate memory Dereference RX queues used by indirect actions on port stop (detach) and restore references on port start (attach) in order to allow RX queue resource release, but keep indirect RSS across the port restart. Replace queue IDs in HW by drop queue ID on detach and restore actual queue IDs on attach. When the port is stopped, create indirect RSS in the detached state. As a result, MLX5 PMD is able to keep all its indirect actions across port restart. Advertise this capability. Fixes: `4b61b8774b` ("ethdev: introduce indirect flow action") Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-02 18:59:17 +01:00
Olivier Matz	daa02b5cdd	mbuf: add namespace to offload flags Fix the mbuf offload flags namespace by adding an RTE_ prefix to the name. The old flags remain usable, but a deprecation warning is issued at compilation. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>	2021-10-24 13:37:43 +02:00
Michael Baum	fc59a1ec55	common/mlx5: share MR mempool registration Expand the use of mempool registration to MR management for other drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:58:00 +02:00
Michael Baum	9f1d636f3e	common/mlx5: share MR management Add global shared MR cache as a field of common device structure. Move MR management to use this global cache for all drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:57:58 +02:00
Konstantin Ananyev	8d7d4fcdca	ethdev: change input parameters for Rx queue count Currently the majority of fast-path ethdev ops take pointers to internal queue data structures as an input parameter, while eth_rx_queue_count() takes a pointer to rte_eth_dev and a queue index. For future work to hide rte_eth_devices[] and friends it would be plausible to unify the parameter lists of all fast-path ethdev ops. This patch changes eth_rx_queue_count() to accept a pointer to internal queue data as an input parameter. While this change is transparent to the user, it still counts as an ABI change, as eth_rx_queue_count_t is used by the ethdev public inline function rte_eth_rx_queue_count(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Feifei Wang <feifei.wang2@arm.com>	2021-10-13 22:14:58 +02:00
Dmitry Kozlyuk	fec28ca0e3	net/mlx5: support mempool registration When the first port in a given protection domain (PD) starts, install a mempool event callback for this PD and register all existing memory regions (MR) for it. When the last port in a PD closes, remove the callback and unregister all mempools for this PD. This behavior can be switched off with a new devarg: mr_mempool_reg_en. On TX slow path, i.e. when an MR key for the address of the buffer to send is not in the local cache, first try to retrieve it from the database of registered mempools. Supported are direct and indirect mbufs, as well as externally-attached ones from MLX5 MPRQ feature. Lookup in the database of non-mempool memory is used as the last resort. RX mempools are registered regardless of the devarg value. On RX data path only the local cache and the mempool database is used. If implicit mempool registration is disabled, these mempools are unregistered at port stop, releasing the MRs. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-19 16:35:16 +02:00
Xueming Li	7483341ae5	ethdev: change queue release callback Currently, most ethdev callback APIs use the queue ID as a parameter, but the Rx and Tx queue release callbacks use the queue object which is used by the Rx and Tx burst data plane callbacks. To align with other eth device queue configuration callbacks: - queue release callbacks are changed to use queue ID - all drivers are adapted Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-10-06 19:16:03 +02:00
Suanming Mou	6507c9f51d	common/mlx5: call list callbacks with context This commit optimizes list callback invocation by calling the list callback functions with the global context directly. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:17 +02:00
Matan Azrad	491b7137ff	net/mlx5: add per-lcore cache to the list utility When an mlx5 list object is accessed by multiple cores, the list lock counter is written all the time by all the cores, which increases cache misses in the memory caches. In addition, when one thread accesses the list for an add/remove/lookup operation, all the other threads coming to do an operation in the list are stuck on the lock. Add a per-lcore cache to allow thread manipulations to be lockless when the list objects are mostly reused. Synchronization with atomic operations should be done in order to allow threads to unregister an entry from another thread's cache. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:10 +02:00
Matan Azrad	e78e5408da	net/mlx5: remove cache term from the list utility The internal mlx5 list tool is used mainly when the list objects need to be synchronized between multiple threads. The "cache" term is used in the internal mlx5 list API. The next enhancements to this tool will use the "cache" term for per-thread cache management. To prevent confusion, remove the current "cache" term from the API names. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:10 +02:00
Alexander Kozyrev	a8f0df6bf9	net/mlx5: support power monitoring Support the PMD power management API in the MLX5 driver. The monitor policy of this API puts a CPU core to sleep until the data at a monitored memory address is changed by the NIC. Implement the get_monitor_addr function to return the address of a CQE owner bit to monitor the arrival of a new packet. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-05-03 12:12:42 +02:00
Michael Baum	a96102c869	net/mlx5: separate Rx function implementations to new file This patch separates Rx function implementations into a different source file as an optional preparation step for further consolidation of Rx burst functions. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-04-15 08:24:51 +02:00
Michael Baum	151cbe3aab	net/mlx5: separate Rx function declarations to another file The mlx5_rxtx.c file contains a lot of Tx burst functions, each of which is performance-optimized for a specific set of requested offloads. These are generated from a template function and take significant time to compile, because a large number of giant functions are generated in the same file and this compilation is not done in parallel. Therefore we can split the mlx5_rxtx.c file into several separate files to allow different functions to be compiled simultaneously. In this patch, we separate Rx function declarations into a different header file in preparation for removing them from the source file and as an optional preparation step for further consolidation of Rx burst functions. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-04-15 08:24:49 +02:00
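
The reference count rule described in commit ec9b812b6c (increment counters for the new set of RxQs, decrement them for the old set) can be illustrated with a minimal sketch. This is not the driver code: the function name `ind_table_requeue`, the flat `refcnt` array, and the argument layout are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: refcnt[] holds one reference counter per Rx queue,
 * indexed by queue ID. This is not the mlx5 PMD data layout.
 *
 * Move an indirection table from the old queue set to the new one.
 * The new set is referenced first, so a queue present in both sets
 * never drops to zero in between.
 */
static void
ind_table_requeue(uint32_t *refcnt,
		  const uint16_t *old_q, size_t old_n,
		  const uint16_t *new_q, size_t new_n)
{
	size_t i;

	for (i = 0; i < new_n; i++)
		refcnt[new_q[i]]++;
	for (i = 0; i < old_n; i++) {
		/* Guard against underflow: the old set must hold a reference. */
		assert(refcnt[old_q[i]] > 0);
		refcnt[old_q[i]]--;
	}
}
```

With this ordering, a queue that appears in both the old and the new set ends up with the same count it started with, while queues only in the old set are released and queues only in the new set gain exactly one reference.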

29 Commits
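
The "delay_drop" bit mask from commit febcac7b46 (bit 0 for standard Rx queues, bit 1 for hairpin Rx queues, hairpin enabled by default, setting ignored without the hardware capability) can be sketched as follows. The macro names and the helper `rxq_delay_drop_enabled` are hypothetical and are not taken from the driver; only the bit positions, the default, and the capability check come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the commit message. */
#define DELAY_DROP_STANDARD (UINT32_C(1) << 0) /* bit 0: standard Rx queues */
#define DELAY_DROP_HAIRPIN  (UINT32_C(1) << 1) /* bit 1: hairpin Rx queues */
/* Default per the commit message: hairpin on, standard off. */
#define DELAY_DROP_DEFAULT  DELAY_DROP_HAIRPIN

static_assert(DELAY_DROP_STANDARD != DELAY_DROP_HAIRPIN,
	      "the two queue types must use distinct bits");

/*
 * Decide whether delay drop applies to one Rx queue.
 * hw_cap stands in for the "rq_delay_drop" capability check: when the
 * hardware lacks it, the devarg value is ignored for every queue.
 */
static bool
rxq_delay_drop_enabled(uint32_t delay_drop, bool hw_cap, bool is_hairpin)
{
	uint32_t bit = is_hairpin ? DELAY_DROP_HAIRPIN : DELAY_DROP_STANDARD;

	if (!hw_cap)
		return false;
	return (delay_drop & bit) != 0;
}
```

Under these assumptions, a value of 0x3 enables delay drop for both queue types, 0x1 for standard queues only, and 0x0 disables it everywhere.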