numam-dpdk

Author	SHA1	Message	Date
Suanming Mou	a5835d530f	net/mlx5: optimize Rx queue match As hrxq struct has the indirect table pointer, while matching the hrxq, better to use the hrxq indirect table instead of searching from the list. This commit optimizes the hrxq indirect table matching. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:23 +02:00
Suanming Mou	cde19e8634	net/mlx5: change memory release configuration This commit changes the index pool memory release configuration to 0 when memory reclaim mode is not required. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:22 +02:00
Suanming Mou	f3020a331d	net/mlx5: optimize hash list table allocate on demand Currently, all the hash list tables are allocated during start up. Since different applications may use only a limited set of actions, allocating the hash list tables on demand saves initial memory. This commit optimizes hash list table allocate on demand. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:22 +02:00
Suanming Mou	07b51bb9fe	net/mlx5: enable indexed pool per-core cache This commit enables the tag and header modify action indexed pool per-core cache in non-reclaim memory mode. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:21 +02:00
Suanming Mou	f7c3f3c290	net/mlx5: adjust hash bucket size With the new per core optimization to the list, the hash bucket size can be tuned to a more accurate number. This commit adjusts the hash bucket size. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:21 +02:00
Matan Azrad	4f3d8d0ea3	net/mlx5: move header modify allocator to ipool Modify header actions are allocated by mlx5_malloc which has a big overhead of memory and allocation time. One of the action types under the modify header object is SET_TAG, The SET_TAG action is commonly not reused by the flows and each flow has its own value. Hence, the mlx5_malloc becomes a bottleneck in flow insertion rate in the common cases of SET_TAG. Use ipool allocator for SET_TAG action. Ipool allocator has less overhead of memory and insertion rate and has better synchronization mechanism in multithread cases. Different ipool is created for each optional size of modify header handler. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 16:09:20 +02:00
Suanming Mou	7e1cf89271	common/mlx5: support list non-lcore operations This commit supports the list non-lcore operations with an extra sub-list and lock. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:20 +02:00
Suanming Mou	9a4c368807	common/mlx5: optimize cache list object memory Currently, hash list uses the cache list as bucket list. The list in the buckets have the same name, ctx and callbacks. This wastes the memory. This commit abstracts all the name, ctx and callback members in the list to a constant struct and others to the inconstant struct, uses the wrapper functions to satisfy both hash list and cache list can set the list constant and inconstant struct individually. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:19 +02:00
Suanming Mou	25481e5025	common/mlx5: allocate cache list memory individually Currently, the list's local cache instance memory is allocated with the list. The local cache instance array size is RTE_MAX_LCORE, but in most cases the system has only a limited number of cores, so allocating the instance memory individually per core is more economical. This commit changes the instance array to a pointer array and allocates the local cache memory only when the core is to be used. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:19 +02:00
Matan Azrad	961b6774c4	common/mlx5: add per-lcore cache to hash list utility Using the mlx5 list utility object in the hlist buckets. This patch moves the list utility object to the common utility, creates all the clone operations for all the hlist instances in the driver. Also adjust all the utility callbacks to be generic for both list and hlist. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 16:09:18 +02:00
Suanming Mou	6507c9f51d	common/mlx5: call list callbacks with context This commit optimizes to call the list callback functions with global context directly. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:17 +02:00
Suanming Mou	d03b786005	common/mlx5: add per-lcore sharing flag in object list Without lcores_share flag, mlx5 PMD was sharing the rdma-core objects between all lcores. Having lcores_share flag disabled, means each lcore will have its own objects, which will eventually lead to increased insertion/deletion rates. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:50:31 +02:00
Suanming Mou	9c373c524b	common/mlx5: move list utility from net driver Hash list is planned to be implemented with the cache list code. This commit moves the list utility to common directory. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:13 +02:00
Matan Azrad	679f46c775	net/mlx5: allocate list memory in create function Currently, the list memory is allocated by the list API caller. Move the allocation into the create API in order to keep consistency with the hlist utility. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:13 +02:00
Matan Azrad	84fbba5b9e	net/mlx5: relax list utility atomic operations The atomic operations in the list utility do not need barriers because the critical parts are managed by the RW lock. Relax them. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:12 +02:00
Matan Azrad	a603b55ad9	net/mlx5: manage list cache entries release When a cache entry is allocated by lcore A and is released by lcore B, the driver should synchronize the cache list access of lcore A. The design decision is to manage a counter per lcore cache that will be increased atomically when the non-original lcore decreases the reference counter of cache entry to 0. In list register operation, before the running lcore starts a lookup in its cache, it will check the counter in order to free invalid entries in its cache. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:11 +02:00
Matan Azrad	0b4ce17a11	net/mlx5: minimize list critical sections The mlx5 internal list utility is thread safe. In order to synchronize list access between the threads, a RW lock is taken for the critical sections. The create\remove\clone\clone_free operations are in the critical sections. These operations are heavy and make the critical sections heavy because they are used for memory and other resources allocations\deallocations. Moved out the operations from the critical sections and use generation counter in order to detect parallel allocations. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:11 +02:00
Matan Azrad	491b7137ff	net/mlx5: add per-lcore cache to the list utility When mlx5 list object is accessed by multiple cores, the list lock counter is all the time written by all the cores what increases cache misses in the memory caches. In addition, when one thread accesses the list for add\remove\lookup operation, all the other threads coming to do an operation in the list are stuck in the lock. Add per lcore cache to allow thread manipulations to be lockless when the list objects are mostly reused. Synchronization with atomic operations should be done in order to allow threads to unregister an entry from other thread cache. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:10 +02:00
Matan Azrad	e78e5408da	net/mlx5: remove cache term from the list utility The internal mlx5 list tool is used mainly when the list objects need to be synchronized between multiple threads. The "cache" term is used in the internal mlx5 list API. Next enhancements on this tool will use the "cache" term for per thread cache management. To prevent confusing, remove the current "cache" term from the API's names. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:10 +02:00
Matan Azrad	e681eb0515	net/mlx5: optimize header modify action memory Define the types of the modify header action fields to be with the minimum size needed for the optional values range. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 15:19:09 +02:00
Suanming Mou	b4edeaf3ef	net/mlx5: replace flow list with indexed pool The flow list is used to save the create flows and to be used only when port closes all the flows need to be flushed. This commit takes advantage of the index pool foreach operation to flush all the allocated flows. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:09 +02:00
Suanming Mou	42f463395f	net/mlx5: support indexed pool non-lcore operations This commit supports the index pool non-lcore operations with an extra cache and lcore lock. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:08 +02:00
Suanming Mou	64a80f1c48	net/mlx5: add indexed pool iterator In some cases, application may want to know all the allocated index in order to apply some operations to the allocated index. This commit adds the indexed pool functions to support foreach operation. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:08 +02:00
Suanming Mou	d15c0946be	net/mlx5: add indexed pool local cache For object which wants efficient index allocate and free, local cache will be very helpful. Two level cache is introduced to allocate and free the index more efficient. One as local and the other as global. The global cache is able to save all the allocated index. That means all the allocated index will not be freed. Once the local cache is full, the extra index will be flushed to the global cache. Once local cache is empty, first try to fetch more index from global, if global is still empty, allocate new trunk with more index. This commit adds new local cache mechanism for indexed pool. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:07 +02:00
Suanming Mou	58ecd3ad0b	net/mlx5: allow limiting the indexed pool maximum index Some ipool instances in the driver are used as ID\index allocator and added other logic in order to work with limited index values. Add a new configuration for ipool specify the maximum index value. The ipool will ensure that no index bigger than the maximum value is provided. Use this configuration in ID allocator cases instead of the current logics. This patch add the maximum ID configurable for the index pool. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 15:19:01 +02:00
Ruifeng Wang	1db288f941	net/mlx5: reduce unnecessary memory access in Rx MR btree len is a constant during Rx replenish. Moved retrieve of the value out of loop to reduce data loads. Slight performance uplift was measured on both N1SDP and x86. Suggested-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-15 15:17:22 +02:00
Ruifeng Wang	ff6fcd415f	net/mlx5: remove redundant operations in NEON Rx Mask of entries after the compressed CQE is covered by invalid mask of non-compressed valid CQEs. Hence remove redundant calculation on mask. The change showed slight performance uplift on N1SDP. Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-15 15:16:26 +02:00
Rongwei Liu	dbd8e4102d	app/testpmd: support matching on VXLAN reserved field Add a new testpmd pattern field 'last_rsvd' that supports the last 8-bits matching of VXLAN header. The examples for the "last_rsvd" pattern field are as below: 1. ...pattern eth / ipv4 / udp / vxlan last_rsvd is 0x80 / end ... This flow will exactly match the last 8-bits to be 0x80. 2. ...pattern eth / ipv4 / udp / vxlan last_rsvd spec 0x80 vxlan mask 0x80 / end ... This flow will only match the MSB of the last 8-bits to be 1. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Raslan Darawsheh <rasland@nvidia.com>	2021-07-13 15:06:43 +02:00
Rongwei Liu	630a587bfb	net/mlx5: support matching on VXLAN reserved field This adds matching on the reserved field of VXLAN header (the last 8-bits). The capability from rdma-core is detected by creating a dummy matcher using misc5 when the device is probed. For non-zero groups and FDB domain, the capability is detected from rdma-core, meanwhile for NIC domain group zero it's relying on the HCA_CAP from FW. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Raslan Darawsheh <rasland@nvidia.com>	2021-07-13 15:06:43 +02:00
Gregory Etelson	730bf06652	app/testpmd: add flow matching on IPv4 version and IHL The new flow item allows PMD to offload IPv4 IHL field for matching, if hardware supports that operation. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-13 13:15:14 +02:00
Viacheslav Ovsiienko	b6b8a1ebd4	app/testpmd: fix offloads for newly attached port For the newly attached ports (with "port attach" command) the default offloads settings, configured from application command line, were not applied, causing port start failure following the attach. For example, if scattering offload was configured in command line and rxpkts was configured for multiple segments, the newly attached port start was failed due to missing scattering offload enable in the new port settings. The missing code to apply the offloads to the new device and its queues is added. The new local routine init_config_port_offloads() is introduced, embracing the shared part of port offloads initialization code. Fixes: c9cce42876f5 ("ethdev: remove deprecated attach/detach functions") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Aman Deep Singh <aman.deep.singh@intel.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>	2021-07-13 11:52:05 +02:00
Huisong Li	fafa81dece	net/hns3: support multiple TC MAC pause MAC PAUSE can take effect on a single TC or multiple TCs, depending on the hardware. For example, the Kunpeng 920 supports MAC pause in a single TC, and the Kunpeng 930 supports MAC pause in multiple TCs. This patch supports MAC PAUSE in multiple TC for some hardware. Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>	2021-07-13 11:41:32 +02:00
Chengchang Tang	0f5bf5a856	net/hns3: support VLAN filter state modify for VF Due to a HW limitation for VF, the VLAN filter is enabled by default and cannot be disabled. This limitation has been removed in the Kunpeng930 network engine, so this patch adds support for the VF to modify the VLAN filter state. A capability bit is added to differentiate between platforms and achieve compatibility. When the VF runs on an incompatible platform or an incompatible kernel-mode driver version is used, the VF behavior is the same as before. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>	2021-07-13 11:41:32 +02:00
Chengchang Tang	2735b35538	net/hns3: query basic info for VF Some VF features depend on the PF, so the VF needs to know whether the current PF supports them. Therefore, the final capability set of the VF is composed of the capability set of the hardware and the capability set of the PF. For compatibility reasons, the mailbox HNS3_MBX_GET_TCINFO has been modified to obtain more basic information about the current PF, including the communication interface version and the current PF capability set. Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>	2021-07-13 11:41:32 +02:00
Dapeng Yu	ae2b3ba643	net/softnic: fix connection memory leak In function softnic_conn_init(), a block of memory is allocated as connection buffer, but it is never freed in softnic_conn_free(), which cause memory leak. Fixes: 7709a63bf178 ("net/softnic: add connection agent") Cc: stable@dpdk.org Signed-off-by: Dapeng Yu <dapengx.yu@intel.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-07-13 11:34:57 +02:00
Jochen Behrens	046f116195	net/vmxnet3: support MSI-X interrupt Add support for MSI-X interrupt vectors to the vmxnet3 driver. This will allow more efficient deployments in cloud environments. By default it will try to allocate 1 vector (0) for link event and one MSI-X vector for each Rx queue. To simplify things, it will only be enabled if the number of Tx and Rx queues are equal (so that Tx/Rx share the same vector). If for any reason vmxnet3 cannot enable intr mode, it will fall back to the LSC only mode. Signed-off-by: Yong Wang <yongwang@vmware.com> Signed-off-by: Jochen Behrens <jbehrens@vmware.com>	2021-07-13 11:31:10 +02:00
Martin Havlik	d844400966	net/bonding: check flow setting Return value from bond_ethdev_8023ad_flow_set() is now checked and appropriate message is logged on error. Fixes: 112891cd27e5 ("net/bonding: add dedicated HW queues for LACP control") Cc: stable@dpdk.org Signed-off-by: Martin Havlik <xhavli56@stud.fit.vutbr.cz> Acked-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-07-13 11:04:55 +02:00
Martin Havlik	cb8dc97f9d	net/bonding: fix error message on flow verify Return value is now saved to errval and log message on error reports correct function name, doesn't use q_id which was out of context, and uses up-to-date errval. Fixes: 112891cd27e5 ("net/bonding: add dedicated HW queues for LACP control") Cc: stable@dpdk.org Signed-off-by: Martin Havlik <xhavli56@stud.fit.vutbr.cz> Acked-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>	2021-07-13 10:58:14 +02:00
Jiawen Wu	cc63194e89	net/ngbe: support close and reset device Support to close and reset device. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	aad91edd81	net/ngbe: add simple Tx flow Initialize device with the simplest transmit functions. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	93dfebd2c2	net/ngbe: add simple Rx flow Initialize device with the simplest receive function. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	62fc35e63d	net/ngbe: support Rx queue start/stop Initializes receive unit, support to start and stop receive unit for specified queues. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	001c782330	net/ngbe: support Tx queue start/stop Initializes transmit unit, support to start and stop transmit unit for specified queues. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	3518df5774	net/ngbe: support device start/stop Setup MSI-X interrupt, complete PHY configuration and set device link speed to start device. Disable interrupt, stop hardware and clear queues to stop device. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	a58e7c312c	net/ngbe: support Tx queue setup/release Setup device Tx queue and release Tx queue. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:22 +02:00
Jiawen Wu	43b7e5ea60	net/ngbe: support Rx queue setup/release Setup device Rx queue and release Rx queue. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:21 +02:00
Jiawen Wu	3d0af70667	net/ngbe: setup PHY link Setup PHY, determine link and speed status from PHY. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:21 +02:00
Jiawen Wu	b9246b8fa2	net/ngbe: support link update Register to handle device interrupt. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:21 +02:00
Jiawen Wu	539d55dab6	net/ngbe: store MAC address Store MAC addresses and init receive address filters. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:21 +02:00
Jiawen Wu	44e97550ca	net/ngbe: identify and reset PHY Identify PHY to get the PHY type, and perform a PHY reset. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>	2021-07-12 17:55:21 +02:00
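
The two-level index cache described in commit d15c0946be (net/mlx5: add indexed pool local cache) can be modelled roughly as below. This is a minimal single-threaded sketch in Python, not the driver's C code: the class name `TwoLevelIndexPool`, its parameters, the reserved index 0, and the one-at-a-time flush policy are all assumptions made for illustration, and the real ipool adds per-lcore synchronization that is omitted here.

```python
class TwoLevelIndexPool:
    """Toy model of a two-level (local + global) index cache.

    Hypothetical names throughout; not the mlx5 ipool API.
    """

    def __init__(self, num_cores, local_size=4, trunk_size=8):
        self.local = [[] for _ in range(num_cores)]  # per-core free-index caches
        self.global_free = []                        # shared free-index cache
        self.local_size = local_size                 # capacity of each local cache
        self.trunk_size = trunk_size                 # indices added per new trunk
        self.next_index = 1                          # assumption: 0 is reserved as invalid
        self.trunks = 0                              # number of trunks allocated so far

    def _refill(self, core):
        """Fill an empty local cache from global, growing the pool if needed."""
        if not self.global_free:
            # Global cache is empty too: allocate a new trunk of indices.
            start = self.next_index
            self.next_index += self.trunk_size
            self.trunks += 1
            self.global_free.extend(range(start, start + self.trunk_size))
        # Move up to local_size indices from the global to the local cache.
        take = min(self.local_size, len(self.global_free))
        for _ in range(take):
            self.local[core].append(self.global_free.pop())

    def alloc(self, core):
        """Return a free index, preferring the caller's local cache."""
        if not self.local[core]:
            self._refill(core)
        return self.local[core].pop()

    def free(self, core, index):
        """Return an index; indices are cached for reuse, never released."""
        cache = self.local[core]
        if len(cache) >= self.local_size:
            # Local cache is full: flush the extra index to the global cache.
            self.global_free.append(index)
        else:
            cache.append(index)
```

In this model a core that frees more indices than its local cache can hold pushes the surplus to the global cache, where another core can pick it up without a new trunk being allocated. That matches the commit's description that allocated indices are kept rather than freed.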

28888 Commits