numam-dpdk

Author	SHA1	Message	Date
Michael Baum	34776af600	net/mlx5: fix MPRQ stride devargs adjustment In Multi-Packet RQ creation, the user can choose the number of strides and their size in bytes. The user sets them using a specific devarg for each of these parameters. These two parameters determine the size of the WQE, which is their product. If the user selects values that are not in the supported range, the PMD changes them to default values. However, apart from the range limitations for each parameter individually, there is also a minimum value for their product. When the user selects values whose product is lower than the minimum value, no adjustment is made and the creation of the WQE fails. This patch adds an adjustment in these cases as well. When the user selects values whose product is lower than the minimum, they are replaced with the default values. Fixes: `ecb160456a` ("net/mlx5: add device parameter for MPRQ stride size") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-12-05 12:22:09 +01:00
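The adjustment described in commit 34776af600 can be sketched in a few lines. This is only an illustration: the `ex_` names and the numeric limits are hypothetical and are not taken from the driver, and the sketch assumes both parameters are stored as log2 values, so the product check becomes a sum of logs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits, chosen for illustration only. */
#define EX_LOG_STRIDE_NUM_DEFAULT   6u   /* log2 of the default stride count */
#define EX_LOG_STRIDE_SIZE_DEFAULT 11u   /* log2 of the default stride size in bytes */
#define EX_LOG_WQE_SIZE_MIN        14u   /* log2 of the minimum num * size product */

/* Hypothetical config holding both parameters as log2 values. */
struct ex_mprq_cfg {
	uint32_t log_stride_num;
	uint32_t log_stride_size;
};

/*
 * Replace both parameters with the defaults when their product
 * (the sum of the logs) is below the minimum WQE size.
 * Returns true when an adjustment was made.
 */
static bool
ex_mprq_adjust(struct ex_mprq_cfg *cfg)
{
	if (cfg->log_stride_num + cfg->log_stride_size >= EX_LOG_WQE_SIZE_MIN)
		return false;
	cfg->log_stride_num = EX_LOG_STRIDE_NUM_DEFAULT;
	cfg->log_stride_size = EX_LOG_STRIDE_SIZE_DEFAULT;
	return true;
}
```

With these illustrative limits, a request of 8 strides of 64 bytes (logs 3 and 6) falls below the minimum and is reset to the defaults, while a request whose logs sum to 14 or more is left untouched.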
Michael Baum	0947ed380f	net/mlx5: improve stride parameter names In the striding RQ management there are two important parameters, the size of the single stride in bytes and the number of strides. Both the data-path structure and config structure keep the log of the above parameters. However, in their names there is no mention that the value is a log which may be misleading as if the fields represent the values themselves. This patch updates their names describing the values more accurately. Fixes: `ecb160456a` ("net/mlx5: add device parameter for MPRQ stride size") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-12-05 12:22:09 +01:00
Josh Soref	7be78d0279	fix spelling in comments and strings The tool comes from https://github.com/jsoref Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2022-01-11 12:16:53 +01:00
Viacheslav Ovsiienko	11cfe349b3	net/mlx5: fix Tx scheduling check There was a redundant check for the enabled E-Switch, this resulted in device probing failure if the Tx scheduling was requested and E-Switch was enabled. Fixes: `f17e4b4ffe` ("net/mlx5: add Tx scheduling check on queue creation") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-14 09:18:28 +01:00
Bing Zhao	febcac7b46	net/mlx5: support Rx queue delay drop For the Ethernet RQs, if all receiving descriptors are exhausted, the packets being received will be dropped. This behavior prevents slow or malicious software entities at the host from affecting the network. While for hairpin cases, even if there is no software involved during the packet forwarding from Rx to Tx side, some hiccup in the hardware or back pressure from Tx side may still cause the descriptors to be exhausted. In certain scenarios it may be preferred to configure the device to avoid such packet drops, assuming the posting of descriptors will resume shortly. To support this, a new devarg "delay_drop" is introduced. By default, the delay drop is enabled for hairpin Rx queues and disabled for standard Rx queues. This value is used as a bit mask: - bit 0: enablement of standard Rx queue - bit 1: enablement of hairpin Rx queue And this attribute will be applied to all Rx queues of a device. The "rq_delay_drop" capability in the HCA_CAP is checked before creating any queue. If the hardware capabilities do not support this delay drop, all the Rx queues will still be created without this attribute, and the devarg setting will be ignored even if it is specified explicitly. A warning log is used to notify the application when this occurs. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-05 17:04:53 +01:00
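The "delay_drop" bit mask from commit febcac7b46 can be illustrated with a small decision helper. This is a sketch under assumptions: the `ex_` names are hypothetical and not the driver's identifiers, and only the bit layout, the default, and the capability fallback come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit layout as described in the commit message. */
#define EX_DELAY_DROP_STANDARD (1u << 0)  /* bit 0: standard Rx queues */
#define EX_DELAY_DROP_HAIRPIN  (1u << 1)  /* bit 1: hairpin Rx queues */
/* Default per the commit message: hairpin enabled, standard disabled. */
#define EX_DELAY_DROP_DEFAULT  EX_DELAY_DROP_HAIRPIN

/*
 * Decide whether a queue gets the delay drop attribute.
 * hw_cap models the "rq_delay_drop" capability: when it is missing,
 * the devarg is ignored and the queue is created without the attribute.
 */
static bool
ex_rxq_delay_drop(uint32_t delay_drop, bool hw_cap, bool is_hairpin)
{
	uint32_t bit = is_hairpin ? EX_DELAY_DROP_HAIRPIN : EX_DELAY_DROP_STANDARD;

	if (!hw_cap)
		return false;
	return (delay_drop & bit) != 0;
}
```

For instance, `ex_rxq_delay_drop(0x3, true, false)` enables delay drop on a standard queue, while the default mask enables it only for hairpin queues.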
Xueming Li	09c2555303	net/mlx5: support shared Rx queue This patch introduces shared RxQ. All shared Rx queues with same group and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared, all queues from different member port share same WQ and CQ, essentially one Rx WQ, mbufs are filled into this singleton WQ. Shared rxq_data is set into device Rx queues of all member ports as RxQ object, used for receiving packets. Polling queue of any member ports returns packets of any member, mbuf->port is used to identify source port. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:50 +01:00
Gregory Etelson	9086ac093a	net/mlx5: add flex parser DevX object management The DevX flex parsers can be shared between representors within the same IB context. We should put the flex parser objects into the shared list and engage the standard mlx5_list_xxx API to manage ones. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:38 +01:00
Viacheslav Ovsiienko	db25cadc08	net/mlx5: add flex item operations This patch is a preparation step of implementing flex item feature in driver and it provides: - external entry point routines for flex item creation/deletion - flex item objects management over the ports. The flex item object keeps information about the item created over the port - reference counter to track whether item is in use by some active flows and the pointer to underlying shared DevX object, providing all the data needed to translate the flow flex pattern into matcher fields according hardware configuration. There is not too many flex items supposed to be created on the port, the design is optimized rather for flow insertion rate than memory savings. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-11-04 22:55:38 +01:00
Dmitry Kozlyuk	bc5bee028e	net/mlx5: create drop queue using DevX Drop queue creation and destruction were not implemented for DevX flow engine and Verbs engine methods were used as a workaround. Implement these methods for DevX so that there is a valid queue ID that can be used regardless of queue configuration via API. Cc: stable@dpdk.org Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-11-02 18:59:17 +01:00
Jiawei Wang	3c4338a421	net/mlx5: optimize device spawn time with representors During the device spawn process, mlx5 PMD queried the available flow priorities by calling mlx5_flow_discover_priorities, queried if the DR drop action was supported on the root table by calling the mlx5_flow_discover_dr_action_support routine, and queried the availability of metadata register C by calling mlx5_flow_discover_mreg_c. These functions created the test flows to get the supported fields, and at the end destroyed the test flows. The test flows in the first two functions were created on the root table. If the device was spawned with multiple representors, these test flows were created and destroyed on each representor as well. The above operations took a significant amount of init time during the device spawn. This patch optimizes the device discovery functions: if a device with multiple representors (VF/SF) is being spawned, the priority, drop action, and metadata register support checks can be done only once and the check results can be shared by all representors. Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-27 14:04:39 +02:00
Rongwei Liu	7299ab6822	net/mlx5: support socket direct mode bonding In socket direct mode, it's possible to bind any two (maybe four in future) PCIe devices with IDs like xxxx:xx:xx.x and yyyy:yy:yy.y. Bonding member interfaces no longer need to have the same PCIe domain/bus/device ID. The kernel driver uses "system_image_guid" to identify whether devices can be bound together. Sysfs "phys_switch_id" is used to get the "system_image_guid" of each network interface. OFED 5.4+ is required to support "phys_switch_id". Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-26 13:24:20 +02:00
Harman Kalra	d61138d4f0	drivers: remove direct access to interrupt handle Removing direct access to interrupt handle structure fields, rather use respective get set APIs for the same. Making changes to all the drivers access the interrupt handle fields. Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com> Signed-off-by: David Marchand <david.marchand@redhat.com> Tested-by: Raslan Darawsheh <rasland@nvidia.com>	2021-10-25 21:20:12 +02:00
Ferruh Yigit	295968d174	ethdev: add namespace Add 'RTE_ETH' namespace to all enums & macros in a backward compatible way. The macros for backward compatibility can be removed in next LTS. Also updated some struct names to have 'rte_eth' prefix. All internal components switched to using new names. Syntax fixed on lines that this patch touches. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Wisam Jaddo <wisamm@nvidia.com> Acked-by: Rosen Xu <rosen.xu@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>	2021-10-22 18:15:38 +02:00
Rongwei Liu	a89f6433aa	net/mlx5: set Tx queue affinity in round-robin Previously, we set txq affinity to 0 and let the firmware perform round-robin when bonding. Firmware uses a global counter to assign txq affinity to different physical ports according to the remainder after division. There are three disadvantages: 1. The global counter is shared between kernel and dpdk. 2. After restarting pmd or port, the previous counter value is reused, so the new affinity is unpredictable. 3. There is no way to get what affinity is set by firmware. In this update, we will create several TISs up to the number of bonding ports and bind each TIS to one PF port. For each port, it will start to pick up TIS using its port index. Upper layer application can quickly calculate each txq's affinity without querying. At DPDK layer, when creating txq with 2 bonding ports, the affinity is set like: port 0: 1-->2-->1-->2 port 1: 2-->1-->2-->1 port 2: 1-->2-->1-->2 Note: Only applicable to DevX api. This affinity is subject to the HW hash. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 12:37:00 +02:00
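The affinity pattern listed in commit a89f6433aa is consistent with a simple modular formula. The sketch below is inferred from that example pattern only, not from the driver source, and the function name and parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Compute a 1-based Tx queue affinity so that each port starts at the
 * slot given by its own index and then rotates over the bonding ports.
 * Returns 0 (no affinity) when there are no bonding ports.
 */
static uint32_t
ex_txq_affinity(uint32_t port_index, uint32_t txq_index, uint32_t n_bond_ports)
{
	if (n_bond_ports == 0)
		return 0;
	return (port_index + txq_index) % n_bond_ports + 1;
}
```

With 2 bonding ports this yields 1, 2, 1, 2 for the queues of port 0 and 2, 1, 2, 1 for the queues of port 1, matching the sequence in the commit message, so an application could compute a queue's affinity without querying the device.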
Dmitry Kozlyuk	ea823b2c51	net/mlx5: close tools socket with last device MLX5 PMD exposes a socket for external tools to dump port state. Socket events are listened using an interrupt source of EXT type. The socket was closed and the interrupt callback was unregistered at program exit, which is incorrect because DPDK could be already shut down at this point. Move actions performed at program exit to the moment the last MLX5 port is closed. The socket will be opened again if later a new MLX5 device is plugged in and probed. Also fix comments that were decisively talking about secondary processes instead of external tools. Fixes: `e6cdc54cc0` ("net/mlx5: add socket server for external tools") Cc: stable@dpdk.org Reported-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2021-10-21 10:31:53 +02:00
Xueming Li	614966c2fa	net/mlx5: check DevX to support more Verbs ports Verbs API doesn't support device port number larger than 255 by design. To support more VF or SubFunction port representors, force the DevX API check when the maximum number of Verbs device link ports is larger than 255. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:14 +02:00
Xueming Li	686d05b60d	net/mlx5: enable DevX Tx queue creation Verbs API does not support Infiniband device port number larger than 255 by design. To support more representors on a single Infiniband device DevX API should be engaged. While creating Send Queue (SQ) object with Verbs API, the PMD assigned IB device port attribute and kernel created the default miss flows in FDB domain, to redirect egress traffic from the queue being created to the representor's appropriate peer (wire, HPF, VF or SF). With DevX API there is no IB-device port attribute (it is merely kernel one, DevX operates in PRM terms) and PMD must create default miss flows in FDB explicitly. PMD did not provide this and using DevX API for E-Switch configurations was disabled. The default miss FDB flow matches E-Switch manager vport (to make sure the source is some representor) and SQn (Send Queue number - device internal queue index). The root flow table is managed by kernel/firmware and does not support the vport redirect action, so we have to split the default miss flow into two: - flow with lowest priority in the root table that matches E-Switch manager vport ID and jumps to group 1. - flow in group 1 that matches E-Switch manager vport ID and SQn and forwards packet to peer vport Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:13 +02:00
Xueming Li	3fd2961efa	net/mlx5: use Netlink when IB port greater than 255 IB spec doesn't allow 255 ports on a single HCA; a port number of 256 was cast to the u8 value 0, which is invalid for ibv_query_port(). This patch invokes the Netlink API to query the port state when the port number is greater than 255. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-21 09:31:08 +02:00
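The truncation mentioned in commit 3fd2961efa is ordinary unsigned narrowing in C. The sketch below shows why port 256 becomes 0 and how a caller might decide to take a different query path; the `ex_` helper names are hypothetical and do not appear in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Narrow a port number to 8 bits, as happens when it is passed as a u8. */
static uint8_t
ex_port_as_u8(uint32_t port)
{
	return (uint8_t)port; /* keeps only the low 8 bits: 256 becomes 0 */
}

/* True when the port number no longer fits in a u8 and needs another path. */
static bool
ex_port_needs_netlink(uint32_t port)
{
	return port > UINT8_MAX;
}
```

Port 255 still fits in the 8-bit field, while 256 wraps to 0 and 257 wraps to 1, which is why the commit switches to a different query mechanism above 255.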
Michael Baum	9f1d636f3e	common/mlx5: share MR management Add global shared MR cache as a field of common device structure. Move MR management to use this global cache for all drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:57:58 +02:00
Michael Baum	5fbc75ace1	common/mlx5: add global MR cache create function Add function for global shared MR cache structure initialization. This function includes: - btree initialization. - setting callbacks for reg and dereg MR. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:57:24 +02:00
Michael Baum	fe46b20c96	common/mlx5: share HCA capabilities handle Add HCA attributes structure as a field of device config structure. It is queried in common probing, which also updates the timestamp format fields. Each driver uses the HCA attributes from the common device config structure, instead of querying them itself. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:46 +02:00
Michael Baum	e35ccf243b	common/mlx5: share protection domain object Create shared Protection Domain in common area and add it and its PDN as fields of common device structure. Use this Protection Domain in all drivers and remove the PD and PDN fields from their private structure. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:46 +02:00
Michael Baum	ca1418ce39	common/mlx5: share device context object Create shared context device in common area and add it as a field of common device. Use this context device in all drivers and remove the ctx field from their private structure. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:44 +02:00
Michael Baum	5bc38358b5	net/mlx5: remove redundant flag in device config Device configure structure has flag named devx as same as SH structure with the same meaning. Remove the flag from the configuration structure and move all the usages to the SH flag. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:36 +02:00
Michael Baum	887183effa	common/mlx5: move basic probing functions to common Move open IBV/DevX device function to common. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:53:32 +02:00
Michael Baum	8520992403	common/mlx5: share memory related devargs Add device configure structure and function to parse user device arguments into it. Move parsing and management of relevant device arguments to common. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:39:04 +02:00
Michael Baum	7af08c8f1a	common/mlx5: share basic probing with internal drivers Create common probing structure that includes, for now, basic probing information detected by the common driver and share it with all the internal drivers. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-21 15:38:46 +02:00
Tal Shnaiderman	c1a320bf89	net/mlx5: fix tunneling support query Currently, the PMD decides if the tunneling offload can enable VXLAN/GRE/GENEVE tunneled TSO support by checking config->tunnel_en (single bit) and config->tso. This is incorrect, the right way is to check the following flags returned by the mlx5dv_query_device function: MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN - if supported the offload DEV_TX_OFFLOAD_VXLAN_TNL_TSO can be enabled. MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE - if supported the offload DEV_TX_OFFLOAD_GRE_TNL_TSO can be enabled. MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE - if supported the offload DEV_TX_OFFLOAD_GENEVE_TNL_TSO can be enabled. The fix enables the offloads according to the correct flags returned by the kernel. Fixes: `dbccb4cddc` ("net/mlx5: convert to new Tx offloads API") Cc: stable@dpdk.org Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com>	2021-10-12 15:29:34 +02:00
Tal Shnaiderman	accf3cfce4	net/mlx5: fix software parsing support query Currently, the PMD decides if the software parsing offload can enable outer IPv4 checksum and tunneled TSO support by checking config->hw_csum and config->tso respectively. This is incorrect, the right way is to check the following flags returned by the mlx5dv_query_device function: MLX5DV_SW_PARSING - check general swp support. MLX5DV_SW_PARSING_CSUM - check swp checksum support. MLX5DV_SW_PARSING_LSO - check swp LSO/TSO support. The fix enables the offloads according to the correct flags returned by the kernel. Fixes: `e46821e9fc` ("net/mlx5: separate generic tunnel TSO from the standard one") Cc: stable@dpdk.org Signed-off-by: Tal Shnaiderman <talshn@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com>	2021-10-12 15:29:25 +02:00
Viacheslav Galaktionov	ff4e52efb3	ethdev: fix representor port ID search by name The patch is required for all PMDs which do not provide representors info on the representor itself. The function, rte_eth_representor_id_get(), is used in eth_representor_cmp() which is required in ethdev class iterator to search ethdev port ID by name (representor case). Before the patch the function is called on the representor itself and tries to get representors info to match. Search of port ID by name is used after hotplug to find out port ID of the just plugged device. Getting a list of representors from a representor does not make sense. Instead, a backer device should be used. To this end, extend the rte_eth_dev_data structure to include the port ID of the backing device for representors. Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Beilei Xing <beilei.xing@intel.com> Reviewed-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-10-12 16:54:20 +02:00
Dmitry Kozlyuk	fec28ca0e3	net/mlx5: support mempool registration When the first port in a given protection domain (PD) starts, install a mempool event callback for this PD and register all existing memory regions (MR) for it. When the last port in a PD closes, remove the callback and unregister all mempools for this PD. This behavior can be switched off with a new devarg: mr_mempool_reg_en. On TX slow path, i.e. when an MR key for the address of the buffer to send is not in the local cache, first try to retrieve it from the database of registered mempools. Supported are direct and indirect mbufs, as well as externally-attached ones from MLX5 MPRQ feature. Lookup in the database of non-mempool memory is used as the last resort. RX mempools are registered regardless of the devarg value. On RX data path only the local cache and the mempool database is used. If implicit mempool registration is disabled, these mempools are unregistered at port stop, releasing the MRs. Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-10-19 16:35:16 +02:00
Michael Baum	97c9b0aa25	net/mlx5: fix duplicate pattern option default In order to allow/disallow configuring rules with identical patterns, the new device argument 'allow_duplicate_pattern' was introduced. The default is to allow, and it is initialized to 1 in PCI probe function. However, on auxiliary bus probing (for Sub-Function) it is not initialized at all, so it's actually initialized to 0. Move the initialization to default config function which is called from both. Fixes: `919488fbfa` ("net/mlx5: support Sub-Function") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-09-20 23:13:40 +02:00
Michael Baum	6856efa54e	net/mlx5: fix PF leak on PCI probing failure During PCI probe, the internal probe function is called per PF. If one of them fails, it was missing a proper destroy for the previously probed PFs. This fixes the behavior by destroying all previously probed PFs. Fixes: `08c2772fc7` ("net/mlx5: support list of representor PF") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-09-20 23:12:10 +02:00
Aman Deep Singh	a7db3afce7	net: add macro to extract MAC address bytes Added macros to simplify printing of MAC addresses. The six bytes of a MAC address are extracted in a macro here, to improve code readability. Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-07 19:08:05 +02:00
Aman Deep Singh	c2c4f87b12	net: add macro for MAC address print Added macro to print six bytes of MAC address. The MAC addresses will be printed in upper case hexadecimal format. In case there is a specific check for lower case MAC address, the user may need to make a change in such test case after this patch. Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2021-09-07 19:07:46 +02:00
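The pair of macros described in commits a7db3afce7 and c2c4f87b12 can be pictured with local stand-ins. The `EX_` macros and the helper below are hypothetical and defined here for illustration only; they are not the DPDK definitions, and only the upper case hexadecimal output format is taken from the commit message.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in format macro: six bytes in upper case hexadecimal. */
#define EX_ETHER_ADDR_PRT_FMT "%02X:%02X:%02X:%02X:%02X:%02X"
/* Stand-in macro expanding to the six bytes of a MAC address array. */
#define EX_ETHER_ADDR_BYTES(m) (m)[0], (m)[1], (m)[2], (m)[3], (m)[4], (m)[5]

/*
 * Format a MAC address into buf using the two macros above.
 * Returns the snprintf result (17 for a complete address).
 */
static int
ex_mac_to_str(const uint8_t mac[6], char *buf, size_t len)
{
	return snprintf(buf, len, EX_ETHER_ADDR_PRT_FMT, EX_ETHER_ADDR_BYTES(mac));
}
```

Because the format uses `%02X`, any test that compares against a lower case address string would need updating, which is the caveat the commit message raises.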
Gregory Etelson	e9d420dfc2	net/mlx5: fix find sibling devices The routine mlx5_eth_find_next() and related iterating macro MLX5_ETH_FOREACH_DEV is used to iterate through sibling devices (all representors share the same configuration and switching domain) on top of specified root device. The root device parameter was specified as NULL, and it caused missing siblings in iteration during representor device probing, causing: 1. allocating new domain_id for the device being probed. 2. discrepancy in representor configurations and potential overall driver malfunctions. Fixes: `56bb3c84e9` ("net/mlx5: reduce PCI dependency") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-08-04 11:27:49 +02:00
Suanming Mou	45633c460c	net/mlx5: workaround drop action with old kernel Currently, there are two types of drop action implementation in the PMD. One is the DR (Direct Rules) dummy placeholder drop action and another is the dedicated dummy queue drop action. When a flow is created on the root table with the DR drop action, the action is converted to the MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP Verbs attribute in rdma-core. In some inbox systems, the MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP Verbs attribute may not be supported in the kernel driver. Creating a flow with the drop action on the root table then fails because it is not supported. In this case, the dummy queue drop action should be used instead of the DR dummy placeholder drop action. This commit adds detection of DR drop action support on the root table. If the MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP Verbs attribute is not supported in the system, a dummy queue will be used as the drop action. Fixes: `da845ae9d7` ("net/mlx5: fix drop action for Direct Rules/Verbs") Cc: stable@dpdk.org Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-08-03 15:08:02 +02:00
Gregory Etelson	ce4062cb10	net/mlx5: fix port initialization of switch domain All active ports that belong to the same E-switch share domain_id value. Port initialization procedure searches through a database for existing port with matching properties. New domain_id allocated if match was not located. Otherwise, new port inherits existing domain_id. Port initialization did not pass enough info to search procedure to find existing matches. Therefore, each port was created with a private domain_id value. As the result, port_id flow action failed because it could not match ports in a rule to E-switch. The patch adds dpdk_dev with port properties to device search. Fixes: `56bb3c84e9` ("net/mlx5: reduce PCI dependency") Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-08-03 14:19:33 +02:00
Viacheslav Ovsiienko	f17e4b4ffe	net/mlx5: add Tx scheduling check on queue creation The send scheduling on timestamp offload requires the Send Queue (SQ) shares its User Access Region (UAR) with the pacing Clock Queue. The SQ can be created by mlx5 PMD either with DevX or with Verbs. If the SQ is being created with DevX, the dedicated UAR can be specified and all the SQs share the single UAR. Once SQ is being created with Verbs the SQ's UAR is allocated by the rdma-core library internally on its own and there is no UAR sharing. This caused hardware errors on WAIT WQEs and overall send scheduling malfunction. If SQs are going to be created with Verbs and the send scheduling offload is explicitly requested via tx_pp devarg the device probing is rejected as device configuration can't satisfy the requirements. Fixes: `3ec73abeed` ("net/mlx5/linux: fix Tx queue operations decision") Fixes: `8f848f32fc` ("net/mlx5: introduce send scheduling devargs") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-29 18:01:23 +02:00
Gregory Etelson	494d6863c2	net/mlx5: fix representor interrupt handler In mlx5 PMD the PCI device interrupt vector was used by Uplink representor exclusively and other VF representors did not support interrupt mode. All the VFs and Uplink representors are separate ethernet devices and must have dedicated interrupt vectors. The fix provides each representor with a dedicated interrupt vector. Fixes: `5882bde88d` ("net/mlx5: fix representor interrupts handler") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-29 18:01:15 +02:00
Viacheslav Ovsiienko	9f430dd751	net/mlx5: fix RoCE LAG bond device probing The RoCE LAG bond device requires neither E-Switch nor SR-IOV configurations. It means the RoCE LAG bond device might be presented as a single port Infiniband device. The mlx5 PMD wrongly recognized standalone RoCE LAG bond device as E-Switch configuration, this triggered the calls of E-Switch ports related API and the latter failed (over the new OFED kernel driver, starting since 5.4.1), causing the overall device probe failure. If there is a single port Infiniband bond device found the E-Switch related flags must be cleared indicating standalone configuration. Also, it is not true anymore the bond device can exist over E-Switch configurations only (as it was claimed for VF LAG bond devices). The related checks are not relevant anymore and removed. Fixes: `790164ce1d` ("net/mlx5: check kernel support for VF LAG bonding") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-22 16:43:49 +02:00
Dong Zhou	96f85ec489	net/mlx5: check VLAN push/pop support For ConnectX-6 in FDB domain, pop and push VLAN on both ingress and egress directions are supported. For ConnectX-6 in NIC domain, and ConnectX-5 in both FWD and NIC domain, pop VLAN is only supported on ingress direction, push VLAN is only supported on egress direction. Signed-off-by: Dong Zhou <dongzhou@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-22 15:40:01 +02:00
Xueming Li	cdfdb82d0b	net/mlx5: check maximum Verbs port number Verbs API doesn't support device port number larger than 255 by design. Add check and fail probing with proper error log. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-22 00:11:14 +02:00
Xueming Li	919488fbfa	net/mlx5: support Sub-Function Introduce SF support. Similar to VF, SF on auxiliary bus is a portion of hardware PF, no representor or bonding parameters for SF. Devargs to support SF: -a auxiliary:mlx5_core.sf.8,dv_flow_en=1 New global syntax to support SF: -a bus=auxiliary,name=mlx5_core.sf.8/class=eth/driver=mlx5,dv_flow_en=1 Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-22 00:11:14 +02:00
Xueming Li	a7f34989e9	net/mlx5: migrate to bus-agnostic common interface To support SubFunction based on auxiliary bus, common driver supports new bus-agnostic driver. This patch migrates net driver to new common driver. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-22 00:11:14 +02:00
Xueming Li	56bb3c84e9	net/mlx5: reduce PCI dependency To support more bus types, remove PCI dependency where possible. Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-22 00:11:14 +02:00
Thomas Monjalon	4d567938be	common/mlx5: get PCI device address from any bus A function is exported to allow retrieving the PCI address of the parent PCI device of a Sub-Function in auxiliary bus sysfs. The function mlx5_dev_to_pci_str() is accepting both PCI and auxiliary devices. In case of a PCI device, it is simply using the device name. The function mlx5_dev_to_pci_addr(), which is based on sysfs path and do not use any device object, is renamed to mlx5_get_pci_addr() for clarity purpose. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2021-07-22 00:11:14 +02:00
Suanming Mou	f3020a331d	net/mlx5: optimize hash list table allocate on demand Currently, all the hash list tables are allocated during start up. Since different applications may only use dedicated limited actions, optimized the hash list table allocate on demand will save initial memory. This commit optimizes hash list table allocate on demand. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:22 +02:00
Suanming Mou	f7c3f3c290	net/mlx5: adjust hash bucket size With the new per core optimization to the list, the hash bucket size can be tuned to a more accurate number. This commit adjusts the hash bucket size. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2021-07-15 16:09:21 +02:00
Matan Azrad	961b6774c4	common/mlx5: add per-lcore cache to hash list utility Using the mlx5 list utility object in the hlist buckets. This patch moves the list utility object to the common utility, creates all the clone operations for all the hlist instances in the driver. Also adjust all the utility callbacks to be generic for both list and hlist. Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Suanming Mou <suanmingm@nvidia.com>	2021-07-15 16:09:18 +02:00


180 Commits