numam-dpdk

Author	SHA1	Message	Date
Ophir Munk	391b8bcc81	common/mlx5: move some getter functions from net driver Getter functions such as: 'mlx5_os_get_ctx_device_name', 'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name', 'mlx5_os_get_umem_id' are implemented under net directory. To enable additional devices (e.g. regex, vdpa) to access these getter functions they are moved under common directory. As part of this commit string sizes DEV_SYSFS_NAME_MAX and DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination string size in strncpy() function is bigger than the source string size. This update will avoid GCC version 8 error -Werror=stringop-truncation. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-30 14:52:30 +02:00
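The strncpy() sizing described in 391b8bcc81 can be sketched as follows. This is a minimal illustration, not the driver code: SRC_NAME_MAX, DEV_NAME_BUF and copy_dev_name() are hypothetical names, and the sizes are assumptions. The point is that the destination is one byte larger than the copy bound, so a terminator always fits.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical sizes: the buffer is one byte larger than the copy bound. */
#define SRC_NAME_MAX 64
#define DEV_NAME_BUF (SRC_NAME_MAX + 1)

/*
 * Copy at most SRC_NAME_MAX bytes of src into dst, then terminate
 * explicitly. Because dst holds DEV_NAME_BUF bytes, the terminator
 * always has room even when src is SRC_NAME_MAX characters or longer.
 */
static void
copy_dev_name(char dst[DEV_NAME_BUF], const char *src)
{
	assert(dst != NULL && src != NULL);
	strncpy(dst, src, DEV_NAME_BUF - 1);
	dst[DEV_NAME_BUF - 1] = '\0';
}
```

A caller would declare `char name[DEV_NAME_BUF];` and pass any NUL-terminated source; over-long sources are truncated to SRC_NAME_MAX characters.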
Suanming Mou	ac79183dc6	net/mlx5: optimize free counter lookup Currently, allocating a new counter requires looping over the whole container pool list to find a free counter. With millions of counters allocated and no free counter left in any pool, the allocation still has to traverse the whole pool list first and only then allocates a new pool to get a free counter. The pool list traversal wastes cycles. Adding a global free counter list to the container helps get free counters more efficiently. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-30 14:52:30 +02:00
Suanming Mou	b1cc226644	net/mlx5: optimize single counter pool search For a single counter, allocating a new counter requires finding the pool it belongs to so that the counters can be queried together. Once millions of counters are allocated, the pool array in the counter container becomes very large, and searching the pool array becomes extremely slow. Save the minimum and maximum counter IDs to allow a quick check of the current counter ID range. Starting the search from the last pool in the container will mostly find the needed pool, since counter IDs increase sequentially. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-30 14:52:29 +02:00
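The search strategy from b1cc226644 (range check first, then scan from the newest pool backwards) might look roughly like this. The struct layouts, the find_pool() name and the pool size of 512 are assumptions made for the sketch, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COUNTERS_PER_POOL 512u /* assumed pool size */

struct pool {
	uint32_t base_id; /* first counter ID covered by this pool */
};

struct container {
	struct pool *pools; /* pool array, oldest pool first */
	size_t n_pools;
	uint32_t min_id; /* smallest counter ID in any pool */
	uint32_t max_id; /* largest counter ID in any pool */
};

/* Return the pool that covers id, or NULL if no pool does. */
static struct pool *
find_pool(const struct container *c, uint32_t id)
{
	size_t i;

	assert(c != NULL);
	/* Quick reject: the ID lies outside the range seen so far. */
	if (c->n_pools == 0 || id < c->min_id || id > c->max_id)
		return NULL;
	/* IDs grow sequentially, so the newest pools are the likeliest hit. */
	for (i = c->n_pools; i-- > 0; ) {
		struct pool *p = &c->pools[i];

		if (id >= p->base_id && id - p->base_id < COUNTERS_PER_POOL)
			return p;
	}
	return NULL;
}
```

The range check costs two comparisons and avoids the scan entirely for IDs that cannot be present; the reverse scan keeps the common case (a recently created counter) close to constant time.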
Suanming Mou	632f0f1905	net/mlx5: manage shared counters in three-level table Currently, checking whether a shared counter with the same ID already exists requires looping over the counter pools to search for the counter. Adding the counters to a list would not help much either once the list holds thousands of shared counters. Looking up the counter index saved in the relevant entry of a three-level table is more efficient. This patch introduces the three-level table to save the counter index associated with each ID. The next time the same ID comes, checking the table entry for this ID yields the counter index directly. No search is needed. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-30 14:52:29 +02:00
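As an illustration of the idea in 632f0f1905, a three-level table keyed by a 32-bit shared counter ID could be sketched as below. The 10/10/12 bit split, the type names and the l3t_get()/l3t_set() helpers are assumptions for this example and do not reflect the PMD's actual table implementation. A stored value of 0 means "no counter".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed split of the 32-bit ID: 10 + 10 + 12 bits. */
#define L2_BITS 10
#define L3_BITS 12
#define L1_SIZE (1u << 10)
#define L2_SIZE (1u << L2_BITS)
#define L3_SIZE (1u << L3_BITS)

struct l3_tbl { uint32_t idx[L3_SIZE]; };      /* counter indexes, 0 = empty */
struct l2_tbl { struct l3_tbl *l3[L2_SIZE]; }; /* allocated on demand */
struct l3t    { struct l2_tbl *l2[L1_SIZE]; }; /* top level */

/* Return the counter index stored for id, or 0 if none was stored. */
static uint32_t
l3t_get(const struct l3t *t, uint32_t id)
{
	const struct l2_tbl *l2;
	const struct l3_tbl *l3;

	assert(t != NULL);
	l2 = t->l2[id >> (L2_BITS + L3_BITS)];
	if (l2 == NULL)
		return 0;
	l3 = l2->l3[(id >> L3_BITS) & (L2_SIZE - 1)];
	if (l3 == NULL)
		return 0;
	return l3->idx[id & (L3_SIZE - 1)];
}

/* Store counter_idx for id; returns 0 on success, -1 if out of memory. */
static int
l3t_set(struct l3t *t, uint32_t id, uint32_t counter_idx)
{
	struct l2_tbl **l2;
	struct l3_tbl **l3;

	assert(t != NULL);
	l2 = &t->l2[id >> (L2_BITS + L3_BITS)];
	if (*l2 == NULL && (*l2 = calloc(1, sizeof(**l2))) == NULL)
		return -1;
	l3 = &(*l2)->l3[(id >> L3_BITS) & (L2_SIZE - 1)];
	if (*l3 == NULL && (*l3 = calloc(1, sizeof(**l3))) == NULL)
		return -1;
	(*l3)->idx[id & (L3_SIZE - 1)] = counter_idx;
	return 0;
}
```

A lookup is three array accesses regardless of how many shared counters exist, and lower levels are only allocated for ID ranges actually in use. Freeing the lower levels is omitted here for brevity.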
Matan Azrad	aec086c9f1	common/mlx5: share kernel interface name getter Some configurations of the mlx5 port are done through the kernel net device associated with the IB device that represents the PCI device. The DPDK mlx5 driver uses Linux system calls, for example ioctl, to apply the per-port configurations requested by the DPDK user. One basic piece of information required to access the correct kernel net device is its name. Move the function that gets the interface name from the IB device path to the common library. Signed-off-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2020-06-30 14:52:29 +02:00
Ophir Munk	d5ed8aa944	net/mlx5: add memory region callbacks in per-device cache Prior to this commit MR operations were verbs based and hard coded under common/mlx5/linux directory. This commit enables upper layers (e.g. net/mlx5) to determine which MR operations to use. For example the net layer could set devx based MR operations in non-Linux environments. The reg_mr and dereg_mr callbacks are added to the global per-device MR cache 'struct mlx5_mr_share_cache'. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-17 16:32:01 +02:00
Ophir Munk	73bf9235e9	net/mlx5: refactor statistics mlx5 statistics are calculated by several methods: 1. In software when packets go through datapath. 2. Calling ioctl with ETHTOOL command (Linux specific). 3. Reading counters from SYSFS device path (Linux specific). The Linux related functions are moved to file linux/mlx5_os.c. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	042f5c94fd	net/mlx5: refactor device operations for Linux There are three types of eth_dev_ops: primary, secondary and isolate. Their function calls assignments are moved from common file mlx5.c to the Linux specific file linux/mlx5_os.c. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	1256805dd5	net/mlx5: move Linux-specific functions File mlx5_ethdev.c is partially moved to linux/mlx5_ethdev_os.c for functions which are Linux specific. Functions which are Linux agnostics remain in mlx5_ethdev.c file. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	9138989036	net/mlx5: rename ib in names Renames in this commit: mlx5_ibv_list -> mlx5_dev_ctx_list mlx5_alloc_shared_ibctx -> mlx5_alloc_shared_dev_ctx mlx5_free_shared_ibctx -> mlx5_free_shared_dev_ctx mlx5_ibv_shared_port -> mlx5_dev_shared_port ibv_port -> dev_port Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	21b7c452a6	net/mlx5: remove completion object dependency on DV Replace 'struct mlx5dv_devx_cmd_comp *' with 'void *' in 'struct mlx5_dev_ctx_shared'. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	834a9019ec	net/mlx5: remove Verbs dependency in spawn struct 1. Replace 'struct ibv_device *' with 'void *' in 'struct mlx5_dev_spawn_data'. Define a getter function to retrieve the device name. 2. Rename ibv_dev and ibv_port as phys_dev and phys_port respectively. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	10f3581dfd	net/mlx5: add Linux-specific header file File drivers/net/linux/mlx5_os.h is added. It includes specific Linux definitions such as PCI driver flags, link state changes interrupts, link removal interrupts, etc. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	2eb4d0107a	net/mlx5: refactor PCI probing on Linux Refactor PCI probing related code. Move Linux specific functions (as well as verbs and dv related code) from mlx5.c file to linux/mlx5_os.c file. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	c7f6ba0e53	net/mlx5: remove umem field dependency on Direct Verbs umem field is used in several structs. Its type 'struct mlx5dv_devx_umem *' is changed to 'void *'. This change will allow non-Linux OS compilations. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	e85f623e13	net/mlx5: remove attributes dependency on Verbs Define 'struct mlx5_dev_attr' which is ibv and dv independent. It contains attribute that were originally contained in 'struct ibv_device_attr_ex' and 'struct mlx5dv_context dv_attr'. Add a new API mlx5_os_get_dev_attr() which fills in the new defined struct. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	c468501658	common/mlx5: remove protection domain dependency on Verbs Replace 'struct ibv_pd *' with 'void *' in struct mlx5_ctx_shared and all function calls in mlx5 PMD. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	f44b09f9e3	net/mlx5: add Linux-specific file with getter functions 'ctx' type (field in 'struct mlx5_ctx_shared') is changed from 'struct ibv_context *' to 'void *'. 'ctx' members which are verbs dependent (e.g. device_name) will be accessed through getter functions which are added to a new file under Linux directory: linux/mlx5_os.c. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Ophir Munk	6e88bc42c7	net/mlx5: rename Verbs shared object Replace all 'mlx5_ibv_shared' appearances with 'mlx5_dev_ctx_shared'. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-16 19:21:07 +02:00
Suanming Mou	a1da6f624c	net/mlx5: add reclaim memory mode Currently, when a flow is destroyed, some memory resources may still be kept cached to make the next flow creation more efficient. Some systems may need the resources to be more flexible with flow creation and destruction. After peak time, with millions of flows destroyed, the system would prefer the resources to be reclaimed completely, with no cache needed. The resources can then be allocated and used by other components. Such a system is not so sensitive to the flow insertion rate but cares more about the resources. Both the DPDK mlx5 PMD and the low-level component rdma-core allow the flow resources to be configured as cached or not, but no API or parameter is exposed to the user to configure the flow resources cache mode. Introducing a new PMD devarg that lets the user configure the flow resources cache mode is therefore helpful. This commit adds a new "reclaim_mem_mode" devarg to let the user configure whether the cached resources of destroyed flows should be kept. Three modes can be chosen: 1. 0 (none). The flow resources will be cached as usual, which helps the flow insertion rate. 2. 1 (light). Only the DPDK PMD level resource reclaim is enabled. 3. 2 (aggressive). Both the DPDK PMD level and the rdma-core low level are configured in reclaim mode. With these three modes, the user can configure the resources cache mode at different levels. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-06-03 17:19:26 +02:00
Suanming Mou	33860cfab6	net/mlx5: fix interrupt installation timing Currently, the DevX counter query works asynchronously, with the DevX interrupt handler returning the query result. When the port closes, the interrupt handler is uninstalled and the DevX completion object is destroyed, while the query is still not cancelled. In this case, the counter query may use the invalid DevX completion object which has been destroyed, and a query failure with an invalid FD is reported. Adjust the shared interrupt install and uninstall timing so that the asynchronous counter query stops before the interrupt is uninstalled. Fixes: `f15db67df0` ("net/mlx5: accelerate DV flow counter query") Cc: stable@dpdk.org Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-06-02 16:06:24 +02:00
Matan Azrad	5af61440dd	net/mlx5: fix flow counter container resize The design of the counter container resize used a double buffer algorithm to synchronize the query thread with the control thread. When the control thread detected that a resize was needed, it created a new, bigger buffer for the counter pools in a new container and changed the container index atomically. If the query thread had not detected the previous resize before the control thread detected the need for a new one, the control thread returned EAGAIN to the flow creation API that used a COUNT action. The rte_flow API doesn't allow non-blocking commands and doesn't expect to get an EAGAIN error. So, when a lot of flows were created between 2 periodic queries, 2 resizes might be attempted, causing an EAGAIN error. This behavior may cause flow creations to fail. Change the synchronization to use a lock instead of the double buffer algorithm. The critical section of this lock is very small, so the flow insertion rate should not decrease. Fixes: `ebbac312e4` ("net/mlx5: resize a full counter container") Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-05-18 20:35:57 +02:00
Dong Zhou	fa2d01c87d	net/mlx5: support flow aging Currently, the mlx5 driver has no flow aging check or age-out event callback mechanism; this patch implements them. It includes: - Splitting the current counter container into aged and non-aged containers to reduce memory consumption. The aged container allocates extra memory to save the aging parameters from the user configuration. - An aging check and age-out event callback mechanism based on the current counters. When a flow is found to be aged out, the RTE_ETH_EVENT_FLOW_AGED event is triggered to applications. - Implementing the new API rte_flow_get_aged_flows, which applications can use to get the aged flows. Signed-off-by: Dong Zhou <dongz@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-05-05 15:54:27 +02:00
Dong Zhou	8d93c830e4	net/mlx5: modify ext-counter memory allocation Currently, the counter pool needs memory for 512 ext-counters for non-batch counters; it is allocated separately in one piece, behind the memory for the 512 basic counters. This makes it hard to get the ext-counter pointer from the corresponding basic-counter pointer. It also makes it hard to add other potential types of counter memory. So, allocate each ext-counter together with its basic counter, as a single piece of memory. The same will apply to further additional types of counter memory. In this case, one piece of memory contains all types of memory for one counter, and each type is easy to reach by offsetting. Signed-off-by: Dong Zhou <dongz@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-05-05 15:54:27 +02:00
Alexander Kozyrev	6c55b622a9	net/mlx5: set dynamic flow metadata in Rx queues Using a global mbuf dynamic field for metadata incurs some performance penalty on a datapath. Store this information in the Rx queue descriptor for a better cache locality. Fixes: `a18ac61133` ("net/mlx5: add metadata support to Rx datapath") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 22:28:06 +02:00
Suanming Mou	ab612adc1e	net/mlx5: allocate flow API from indexed pool This commit allocates rte flow from indexed memory pool. Allocate rte flow memory from indexed memory pool helps save more than MALLOC_ELEM_OVERHEAD bytes memory from rte_malloc(). Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	e745f90007	net/mlx5: optimize flow RSS struct When destroying a flow with RSS, the flow can get the queue information from the hrxq index table object, since the queue number and list are both saved in the index table object. There is no need to save the duplicated data in the rte flow. Saving the RSS description information in the intermediate private data when creating a flow with an RSS action helps save memory for the rte flow. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Wentao Cui	c2ddde7950	net/mlx5: optimize flow director filter memory This commit optimizes the mlx5 fdir flow memory. Currently, the fdir member in the rte_flow structure saves the fdir memory pointer directly. As fdir is fading away, use one bit to indicate the function in the flow and add the content to an extra list, saving memory for the other, more widely used cases. Signed-off-by: Wentao Cui <wentaoc@mellanox.com> Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	90e6053a19	net/mlx5: convert mark copy resource to indexed Allocate mark copy resource from indexed pool helps rte flow saves the 4 bytes index instead of 8 bytes pointer. For mark copy resource itself, it helps save MALLOC_ELEM_OVERHEAD bytes from rte_malloc(). Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	8638e2b076	net/mlx5: allocate meter from indexed pool This patch allocate the meter object memory from indexed memory pool which will help to save the MALLOC_ELEM_OVERHEAD memory taken by rte_malloc(). Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	b88341ca35	net/mlx5: convert flow dev handle to indexed This commit converts flow dev handle to indexed. Change the mlx5 flow handle from pointer to uint32_t saves memory for flow. With million flow, it saves several MBytes memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	772dc0eb83	net/mlx5: convert hrxq to indexed This commit converts hrxq to indexed. Using the uint32_t index instead of pointer saves 4 bytes memory for the flow handle. For millions flows, it will save several MBytes of memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	7ac99475ce	net/mlx5: convert jump resource to indexed This commit convert jump resource to indexed. The table data struct is allocated from indexed memory. As it is add in the hash list, the pointer is still used for hash list search. The index is added to the table struct, and the pointer in flow handle is decrease to uint32_t type. For flow without jump flows, it saves 4 bytes memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	f3faf9ea11	net/mlx5: convert port id action to indexed This commit converts port id action to indexed. Using the uint32_t index instead of pointer saves 4 bytes memory for the flow handle. For millions flows, it will save several MBytes of memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	5f1142692a	net/mlx5: convert tag resource to indexed This commit converts the tag resource to indexed. As tag resources are added to the hash list, only the tag resource memory is allocated from indexed memory, to avoid introducing a performance issue and to keep the hash list. The resources are still added to the hash list. Adding a four-byte index to the tag resource struct and changing the tag resource in the flow handle from a pointer to uint32_t seems to bring no benefit for the tag resource itself, but it saves memory for flows without a tag action, and also for sub flows that share one tag action resource. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	8acf8ac9b7	net/mlx5: convert push VLAN resource to indexed This commit converts the push VLAN resource to indexed. Using the uint32_t index instead of pointer saves 4 bytes memory for the flow handle. For millions flows, it will save several MBytes of memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Suanming Mou	014d1cbe51	net/mlx5: convert encap/decap resource to indexed This commit converts the flow encap/decap resource to indexed. Using the uint32_t index instead of pointer saves 4 bytes memory for the flow handle. For millions flows, it will save several MBytes of memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:09 +02:00
Vu Pham	b8dc6b0e29	common/mlx5: refactor memory management Refactor common memory btree and cache management to common driver. Replace some input parameters of MR APIs to more common data structure like PD, port_id, share_cache,... so that multiple PMD drivers can use those MR APIs. Modify mlx5 net pmd driver to use MR management APIs from common driver. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:08 +02:00
Vu Pham	a4de9586ac	common/mlx5: refactor IPC handling from net driver Refactor common multi-process handling codes from net PMD to common driver. Using tuple mp_id{name, port_id} as standard input parameter for all multi-process IPC APIs instead of using rte_eth_dev. Modify net PMD to use multi-process APIs from common driver. Signed-off-by: Vu Pham <vuhuong@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:08 +02:00
Suanming Mou	9dbaf7eef6	net/mlx5: fix meter suffix table leak Currently, the meter suffix table is created and saved in the mlx5 shared struct. It causes the suffix table will never be released even without any meter rules. Move the suffix table to meter domain struct to help the suffix table be released when all the meter rules are destroyed. Fixes: `46a5e6bc6a` ("net/mlx5: prepare meter flow tables") Cc: stable@dpdk.org Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:08 +02:00
Alexander Kozyrev	ecb160456a	net/mlx5: add device parameter for MPRQ stride size Define a device parameter to configure log 2 of a stride size for MPRQ - mprq_log_stride_size. User is able to specify a stride size in a range allowed by an underlying hardware. The default stride size is defined as 2048 bytes to encompass most commonly used packet sizes in the Internet (MTU 1518 and less) and will be used in case a maximum configured packet size cannot fit into the largest possible stride size. Otherwise a stride size is set to a large enough value to encompass a whole packet. Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:08 +02:00
Bing Zhao	3ac3d8234b	net/mlx5: fix index when creating flow When creating a flow, usually the creating routine is called in serial. No parallel execution is supported right now. The same function will be called only once for a single flow creation. But there is a special case that the creating routine will be called nested. If the xmeta feature is enabled and there is FLAG / MARK in the actions list, some metadata reg copy flow needs to be created before the original flow is applied to the hardware. In the flow non-cached mode, resources only for flow creation will not be saved anymore. The memory space is pre-allocated and reused for each flow. A global index for each device is used to indicate the memory address of the resources. If the function is called in a nested mode, then the index will be reset and make everything get corrupted. To solve this, a nested index is introduced to save the position for the original flow creation. Currently, only one level nested call of the flow creating routine is supported. Fixes: `e7bfa3596a` ("net/mlx5: separate the flow handle resource") Signed-off-by: Bing Zhao <bingz@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:07 +02:00
Suanming Mou	261bb99a21	net/mlx5: reorganize fallback counter management Currently, the fallback counter is also allocated from the pool, so the code specific to the fallback functions has become somewhat duplicated. Reorganize the fallback counter code to reuse the normal counter code. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
Suanming Mou	826b8a8732	net/mlx5: split flow counter struct Currently, the counter struct holds the members used by both batch and non-batch counters. The members used only by non-batch counters cost 16 bytes of extra memory for each batch counter. As there is normally a limited number of non-batch counters, mixing the non-batch and batch counter members becomes quite expensive for batch counters. If 1 million batch counters are created, 16 MB of memory that the batch counters never use is allocated. Splitting the mlx5_flow_counter struct for batch and non-batch counters helps save the memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
Suanming Mou	956d5c74d7	net/mlx5: optimize flow counter handle type Currently, DV and verbs counters are both changed to indexed. It means while creating the flow with counter, flow can save the indexed value to address the counter. Save the 4 bytes indexed value in the rte_flow instead of 8 bytes pointer helps to save memory with millions of flows. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
Suanming Mou	4001d7ad26	net/mlx5: change Direct Verbs counter to indexed This part of the counter optimization changes the DV counter to indexed, as has already been done for verbs. In this case, all mlx5 flow counters can be addressed by index. The counter index is composed of the pool index and the counter offset in the pool counter array. The dcs ID offset 0x800000 between batch and non-batch counters is used to avoid mixing up the indexes. As batch counter dcs IDs start from 0x800000 and non-batch counter dcs IDs start from 0, the 0x800000 offset is added to the batch counter index to mark it as a batch counter index. The counter member in the rte_flow struct is changed from a pointer to an index. This saves 4 bytes of memory for every rte_flow. With millions of rte_flow, it saves megabytes of memory. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
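The index layout described in 4001d7ad26 (pool index plus offset within the pool, with 0x800000 added for batch counters) can be sketched as follows. The helper names and the pool size of 512 are assumptions for illustration; only the 0x800000 offset comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define COUNTERS_PER_POOL 512u     /* assumed pool size */
#define BATCH_OFFSET      0x800000u /* offset named in the commit message */

/* Build a counter index from a pool index and an offset inside the pool. */
static uint32_t
cnt_make_index(uint32_t pool_idx, uint32_t offset, int batch)
{
	uint32_t idx = pool_idx * COUNTERS_PER_POOL + offset;

	assert(offset < COUNTERS_PER_POOL && idx < BATCH_OFFSET);
	return batch ? idx + BATCH_OFFSET : idx;
}

/* A batch counter index is one at or above the batch offset. */
static int
cnt_is_batch(uint32_t index)
{
	return index >= BATCH_OFFSET;
}

/* Remove the batch offset, if present, to get the raw position. */
static uint32_t
cnt_strip(uint32_t index)
{
	return cnt_is_batch(index) ? index - BATCH_OFFSET : index;
}

/* Pool index encoded in a counter index. */
static uint32_t
cnt_pool_idx(uint32_t index)
{
	return cnt_strip(index) / COUNTERS_PER_POOL;
}

/* Offset inside the pool encoded in a counter index. */
static uint32_t
cnt_offset(uint32_t index)
{
	return cnt_strip(index) % COUNTERS_PER_POOL;
}
```

Because the index fits in a uint32_t, a flow can store it in 4 bytes instead of an 8-byte pointer, which is the saving the commit message refers to.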
Suanming Mou	c3d3b14099	net/mlx5: change verbs counter allocator to indexed This is part of the counter optimization, which saves the indexed counter ID instead of the counter pointer in the rte_flow. Placing the verbs counter in the container pool helps the counter be indexed correctly, independently of the raw counter. The counter pointer in rte_flow will be changed to an indexed value after the DV counter is also changed to indexed. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
Suanming Mou	c989f49a38	net/mlx5: optimize counter release query generation Query generation was introduced to keep a counter from being reallocated before its statistics are fully updated, since a counter released between the query trigger and the query handler may miss the packets that arrived in the gap between the two. In this case, the user can only allocate the counter once the pool query_gen is greater than the counter query_gen + 1, which indicates that a new round of query has finished and the statistics are fully updated. Splitting the pool query_gen into start_query_gen and end_query_gen helps to better identify counters released in the gap period. It also lets counters released before the query trigger or after the query handler be reallocated more efficiently. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-04-21 13:57:07 +02:00
Jiawei Wang	c5193a0bbe	net/mlx5: fix imissed counter overflow The HW counters are defined as 32-bit unsigned values and read from sysfs. The base value is read first when the application starts; each query then fetches the new value and subtracts the base value. If the new value is less than the base value, the result is negative and is converted to a large unsigned 64-bit value. The PMD adds an xstats field to store the last successfully read counter and uses it if reading the HW counter from sysfs fails. The PMD also records the last output value to handle the wrap-around case: if an overflow happened, it increases the wrap count by 1 and saves it in the higher 32 bits, updates the lower 32 bits with the new value, and finally returns the 64-bit counter value. Fixes: `ce9494d76c` ("net/mlx5: report imissed statistics") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:05 +02:00
Bing Zhao	1ad9a3d09f	net/mlx5: introduce buffer size parameter for hairpin When creating a hairpin queue, the total data size and the maximal number of packets are interrelated. They differ by a factor of the stride size. A larger buffer size means that big packets such as jumbo frames can be supported, but it also introduces more cache misses and has a side effect on performance. Now a new device parameter "hp_buf_log_sz" is introduced for applications to set the total data buffer size (as a logarithm value). The maximal number of packets is then calculated automatically from this value. Applications can change this value to a larger one to support larger packets in the hairpin case. A smaller value benefits memory consumption. If it is not set, the default value is used. Signed-off-by: Bing Zhao <bingz@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2020-04-21 13:57:05 +02:00

338 Commits