This commit adds counters support when creating flows via Direct
Verbs. The implementation uses the devx interface in order to create,
query and delete the counters.
This support requires MLNX_OFED_LINUX-4.5-0.1.0.1 installation.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This commit adds:
- ConnectX-6 device ID
- ConnectX-6 SRIOV device ID
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The ethdev flag RTE_ETH_DEV_CLOSE_REMOVE is set for drivers
having migrated to the new behaviour of rte_eth_dev_close().
Like any other flag, it is useful to know its value as soon as the port
is probed.
Unfortunately, it was set inside the close operation,
just before being erased by memset() in rte_eth_dev_release_port().
The flag assignment is moved to the probing stage, so it can
be checked by the application in order to anticipate the behaviour.
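For illustration, a minimal sketch of such an application-side check right
after probing, assuming the DPDK 18.11 layout where dev_flags is reachable
through rte_eth_devices[]:
#include <rte_ethdev.h>

/* Returns non-zero if closing the port also releases all its resources. */
static int
port_is_released_on_close(uint16_t port_id)
{
	return (rte_eth_devices[port_id].data->dev_flags &
		RTE_ETH_DEV_CLOSE_REMOVE) != 0;
}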
Fixes: 42603bbdb5 ("net/mlx5: release port on close")
Fixes: 6c99085d97 ("net/vmxnet3: fix hot-unplug")
Fixes: 4d7877fde2 ("net/ena: remove resources when port is being closed")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds support for the rx_queue_count API in the mlx5 driver.
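A minimal usage sketch of the generic ethdev API this enables (not
mlx5-specific code):
#include <stdio.h>
#include <rte_ethdev.h>

/* Query how many Rx descriptors are currently in use on queue 0.
 * A negative return (e.g. -ENOTSUP) means the PMD lacks the callback. */
static void
print_rxq_fill_level(uint16_t port_id)
{
	int used = rte_eth_rx_queue_count(port_id, 0);

	if (used >= 0)
		printf("port %u rxq 0: %d descriptors in use\n",
		       port_id, used);
}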
Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Add a txqs_max_vec parameter to configure the maximum number of Tx queues
for which vectorized Tx is enabled. Its default value is set according to
the architecture and device type.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When a device is spawned, it makes more sense for the configuration
parameters to be passed in by the caller. Furthermore, setting default
values for some configuration options requires the PCIe device ID, which
is known in the probe function.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
A PMD parameter (rxq_cqe_pad_en) is added to enable 128B padding of the CQE
on the Rx side. The CQE size is aligned with the cacheline size of the core.
If the cacheline size is 128B, the CQE size is configured to be 128B even
though the device writes only 64B of data on the cacheline. This avoids
unnecessary cache invalidation caused by the device's two consecutive writes
to one cacheline. However, on some architectures it is more beneficial to
update the entire cacheline by padding the remaining 64B rather than
striding, because the read-modify-write could hurt performance significantly.
On the other hand, writing the extra data consumes more PCIe bandwidth and
could also lower the maximum throughput. It is recommended to set this
parameter empirically. Disabled by default.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
With the introduction of representors several eth devices are using
the same rte device (e.g. a PCI bus). When calling port detach on one
eth device it is required that all eth devices belonging to the
same rte device have been closed in advance, then the rte device
itself can be removed/detached.
This commit implements this requirement implicitly by adding a
remove callback to struct rte_pci_driver.
The new behavior can be demonstrated in testpmd.
First, we attach representor 0 using PCI address 0000:08:00.0:
testpmd> port attach 0000:08:00.0,representor=[0]
Attaching a new port...
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL: probe driver: 15b3:1013 net_mlx5
Port 0 is attached.
Done
Port 1 is attached.
Done
Port 0 is the master device (PF) - an ethdev of the PCI address.
Port 1 is representor 0 - another ethdev (representing a VF) using the
same PCI address. Next, we detach port 1:
testpmd> port detach 1
Removing a device...
Port 0 is closed
Port 1 is closed
Now total ports is 0
Done
Since port 0 has been implicitly closed we cannot act on it anymore.
testpmd> port stop 0
Invalid port 0
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
With the introduction of representors several eth devices are using
the same rte device (e.g. a PCI bus). It is therefore required to
release the eth device resources during an eth device close operation
rather than during an rte device removal (detach) operation.
In the current version many PMDs still release the eth device as
part of the rte device removal. In order to allow a smooth transition
for all PMDs to behave correctly, an ethdev flag RTE_ETH_DEV_CLOSE_REMOVE
is used. When this flag is set it indicates to rte_eth_dev_close() to
call rte_eth_dev_release_port(), so the port is freed during the close
operation.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Implement probing of a rte device multiple times, see [1].
Set PCI driver RTE_PCI_DRV_PROBE_AGAIN flag to enable multiple probing
of the PCI device by the PCI common driver.
Consecutive probing requests with a devargs string may contain
repetitive master and representor devices for which an eth device should
be created only once. If an eth device already exists, it is silently
ignored.
[1]
commit e9d159c3d5 ("eal: allow probing a device again")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
If DV flow is enabled by 'dv_flow_en=1' but the library does not support
it, print a warning message and disable it.
Fixes: 51e72d386c ("net/mlx5: add runtime parameter to enable Direct Verbs")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
The redundant runtime check of Flow counters support is removed.
The flag flow_counter_en is eliminated from the code. The Verbs
counter creation function simply returns an error if no counter
support is present in the system.
If none of the Flow counters configuration macros is defined,
a log message is emitted, indicating the missing counter support.
mlx5_flow_validate_action_count() is also updated due to the
flow_counter_en flag removal.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT is replaced with
HAVE_IBV_DEVICE_COUNTERS_SET_V42. At this stage it is just
macro renaming. This macro is defined if the system supports
the "old" Flow counters functionality; MLNX_OFED versions
from 4.2 to 4.4 are required.
We need to do this preparation before introducing the new
configuration macro (HAVE_IBV_DEVICE_COUNTERS_SET_V45) for
the "new" Flow counters support.
Both makefile and meson.build are changed.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The representor id is added in rte_eth_dev_data in order to be able
to match a port with its representor id in devargs.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
This commit refactors tc_flow as a preparation for coming commits
that send different types of messages and expect different types of replies
while still using the same underlying routines.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
This is a clean-up of common ethdev data freeing.
All data freeing is moved to rte_eth_dev_release_port()
and done only in the primary process.
This probably fixes some memory leaks for PMDs which were
not freeing all data.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
The PCI mapping requires knowing the PCI driver to use,
even before probing is done. That's why the PCI driver is
referenced early inside the PCI device structure. See
commit 1d20a073fa ("bus/pci: reference driver structure before mapping")
However the rte_driver does not need to be referenced in rte_device
before the device probing is done.
By moving back this assignment at the end of the device probing,
it becomes possible to make clear the status of a rte_device.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Flows having 'transfer' attribute have to be inserted to E-Switch on the
NIC and the control path uses Linux TC flower interface via Netlink
socket.
This patch adds the flow driver on top of the new flow engine.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Netlink based E-Switch flow engine will be migrated to the new flow
engine.
nl_flow will be renamed to flow_tcf as it goes through Linux TC flower
interface.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Flow engine has to support multiple driver paths. Verbs/DV for NIC flow
steering and Linux TC flower for E-Switch flow steering. In the future,
another flow driver could be added (devX).
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
The DV flow API is based on a new kernel API; it is
missing some functionality, like counters, but adds other functionality,
like encap.
In order not to affect current users even if the kernel supports
the new DV API it should be enabled only manually.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This commit modifies the conversion of the input parameters into Verbs
specs, in order to support all previous changes.
Some of those changes are:
- removing the use of the parser,
- storing each flow in its own flow structure.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.
All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools are a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
On ConnectX-4 Lx the Multi Packet Send (MPW) feature is considered
insecure, as in some cases where the application provides incorrect mbufs
in the Tx burst, the host or NIC can get stuck.
Hence, the feature is disabled by default for this specific NIC.
Users can still enable this feature and enjoy the performance gain
(mostly for a low number of cores) by using the txq_mpw_en devarg.
This patch will impact the out-of-the-box performance of some applications
using ConnectX-4 Lx, for the sake of security and robustness.
Since we need different defaults based on the underlying device the mpw
field in the configuration struct was extended to contain also the
MLX5_ARG_UNSET option.
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling allmulti
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as
rte_eth_allmulticast_enable/disable() fail to set the flag
(dev->data->all_multicast). The flag is used when starting traffic by
mlx5_traffic_enable(). When switching out of flow isolation mode, allmulti
mode will not be set even though it has been enabled.
Fixes: 0887aa7f27 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling promiscuous
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as rte_eth_promiscuous_enable/disable()
fail to set the flag (dev->data->promiscuous). The flag is used when
starting traffic by mlx5_traffic_enable(). When switching out of flow
isolation mode, promiscuous mode will not be set even though it has been
enabled.
Fixes: 0887aa7f27 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
The route Netlink message socket is wrongly initialized by registering to
the route link group. This causes the socket to receive all link
messages related to routes, whereas the PMD does not expect to receive such
information. In some situations it ends up filling the socket to the point
where no new message can be exchanged.
As the PMD is not expected to process such broadcast messages, the
nl_groups parameter of the function is also removed.
Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")
Cc: stable@dpdk.org
Signed-off-by: Zijie Pan <zijie.pan@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
On systems where the required Netlink commands are not supported but
Mellanox OFED is installed, representors information must be retrieved
through sysfs.
Fixes: 26c08b979d ("net/mlx5: add port representor awareness")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
With mlx5, unlike normal flow rules implemented through Verbs for traffic
emitted and received by the application, those targeting different logical
ports of the device (VF representors for instance) are offloaded at the
switch level and must be configured through Netlink (TC interface).
This patch adds preliminary support to manage such flow rules through the
flow API (rte_flow).
Instead of rewriting tons of Netlink helpers and as previously suggested by
Stephen [1], this patch introduces a new dependency to libmnl [2]
(LGPL-2.1) when compiling mlx5.
[1] https://mails.dpdk.org/archives/dev/2018-March/092676.html
[2] https://netfilter.org/projects/libmnl/
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch adds support for building and running mlx5 PMD on
32bit systems such as i686.
The main issue to tackle was handling the 32bit access to the UAR
as quoted from the mlx5 PRM:
QP and CQ DoorBells require 64-bit writes. For best performance, it
is recommended to execute the QP/CQ DoorBell as a single 64-bit write
operation. For platforms that do not support 64 bit writes, it is
possible to issue the 64 bits DoorBells through two consecutive
writes,
each write 32 bits, as described below:
* The order of writing each of the Dwords is from lower to upper
addresses.
* No other DoorBell can be rung (or even start ringing) in the midst
of an on-going write of a DoorBell over a given UAR page.
The last rule implies that in a multi-threaded environment, the access
to a UAR page (which can be accessible by all threads in the process)
must be synchronized (for example, using a semaphore) unless an atomic
write of 64 bits in a single bus operation is guaranteed. Such a
synchronization is not required for when ringing DoorBells on different
UAR pages.
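A minimal sketch of such a guarded two-Dword doorbell write; the names and
the single global lock below are illustrative only, the PMD keys its
synchronization per UAR page:
#include <stdint.h>
#include <rte_io.h>
#include <rte_spinlock.h>

/* Hypothetical lock; a real implementation holds one per UAR page. */
static rte_spinlock_t uar_lock = RTE_SPINLOCK_INITIALIZER;

/* Ring a 64-bit doorbell on a 32-bit platform: two 32-bit writes,
 * lower Dword first, serialized against other doorbells on the same
 * UAR page. */
static void
ring_doorbell_32(volatile void *uar_addr, uint64_t db_val)
{
	volatile uint32_t *dst = (volatile uint32_t *)uar_addr;

	rte_spinlock_lock(&uar_lock);
	rte_write32((uint32_t)db_val, dst);             /* lower Dword */
	rte_write32((uint32_t)(db_val >> 32), dst + 1); /* upper Dword */
	rte_spinlock_unlock(&uar_lock);
}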
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Previous work introduced Verbs priorities, whereas the PMD translates
flow priorities into Verbs priorities. Rename this to better reflect what
the PMD has to translate.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Drop queues are only used in flows because of the Verbs API; the
information on whether the fate of a flow is a drop or not is already
present in the flow. Thanks to this, drop queues can be fully mapped onto
regular queues.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Prior to this patch, all port representors detected on a given device were
probed and Ethernet devices instantiated for each of them.
This patch adds support for the standard "representor" parameter, which
implies that port representors are not probed by default anymore, except
for the list provided through device arguments.
(Patch based on prior work from Yuanhan Liu)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
Port representors are probed in whatever unspecified order
ibv_get_device_list() returns them.
This is counterintuitive to users since DPDK port IDs assignment almost
never follows the same sequence as representor IDs. Additionally, the
master device does not necessarily inherit the lowest DPDK port ID.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Probe existing port representors in addition to their master device and
associate them automatically.
To avoid collision between Ethernet devices, they are named as follows:
- "{DBDF}" for master/switch devices.
- "{DBDF}_representor_{rep}" with "rep" starting from 0 for port
representors.
(Patch based on prior work from Yuanhan Liu)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
The current PCI probing method is not aware of Verbs port representors,
which appear as standard Verbs devices bound to the same PCI address and
cannot be distinguished.
Problem is that more often than not, the wrong Verbs device is used,
resulting in unexpected traffic.
This patch makes the driver discard representors to only use the master
device. If unable to identify it (e.g. kernel drivers not recent enough),
either:
- There is only one matching device which isn't identified as a
representor, in that case use it.
- Otherwise log an error and do not probe the device.
(Patch based on prior work from Yuanhan Liu)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
Since commit "net/mlx5: drop useless support for several Verbs ports"
removed an inner loop, mlx5_dev_spawn() is left with an unnecessary indent
level.
This patch eliminates a block, moves its local variables to function scope,
and re-indents its contents (diff best viewed with --ignore-all-space).
No functional impact.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
All the generic probing code needs is an IB device. While this device is
currently supplied by a PCI lookup, other methods will be added soon.
This patch divides the original function, which has become huge over time,
as follows:
1. PCI-specific (mlx5_pci_probe()).
2. Verbs device (mlx5_dev_spawn()).
(Patch based on prior work from Yuanhan Liu)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
Unlike mlx4 from which this capability was inherited, mlx5 devices expose
exactly one Verbs port per PCI bus address. Each physical port gets
assigned its own bus address with a single Verbs port.
While harmless, this code requires an extra loop that would get in the way
of subsequent refactoring.
No functional impact.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This patch gets rid of redundant calls to open the device and query its
attributes in order to simplify the code.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
There are several attribute objects in this function:
- IB device attributes (struct ibv_device_attr_ex device_attr).
- Direct Verbs attributes (struct mlx5dv_context attrs_out).
- Port attributes (struct ibv_port_attr).
- IB device attributes again (struct ibv_device_attr_ex device_attr_ex).
"attrs_out" is both odd and initialized using a nonstandard syntax. Rename
it "dv_attr" for consistency.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, there is no need to declare it before its definition.
The macro is used directly in the function definition.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Starting from rdma-core v19, Mellanox OFED 4.4, the Verbs resources
cleanup is properly activated in plug-out process when setting the
MLX5_DEVICE_FATAL_CLEANUP environment variable to 1.
Set the aforementioned variable to 1.
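The PMD now sets this internally; for reference, the same effect can be
forced from a wrapper or the application before the device is probed, e.g.:
#include <stdlib.h>

int main(void)
{
	setenv("MLX5_DEVICE_FATAL_CLEANUP", "1", 1);
	/* ... rte_eal_init() and the rest of the application ... */
	return 0;
}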
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
If Multi-Packet RQ is enabled but not supported by device or
kernel/library, print out a warning message.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
The mlx5 driver had two init functions, but this could
cause log initialization to be done after the
other initialization. Also, the name of the function does
not match convention (cut/paste error?).
Fix by initializing the log type first at the start of pmd_init().
This also gets rid of having two constructor functions.
Fixes: a170a30d22 ("net/mlx5: use dynamic logging")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This artificial limitation was inherited from the mlx4 code base and has no
purpose other than adding unnecessary noise.
This patch is a port of commit f2318196c7 ("net/mlx4: remove limitation
on number of instances").
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The callback should be invoked only for memory that has been registered
in a device, hence, no need to track cleanup events if no device is
present.
Bugzilla ID: 56
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
MR cache init takes place during device configuration.
When the device is re-configured multiple times, for example when
changing the number of queues on the fly, a deadlock can happen.
This patch moves the MR cache init from the device configuration function
to the probe function to make sure it is initialized only once.
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Although uncommon, applications may destroy a device immediately after
probing it without going through dev_configure() first.
This patch addresses a crash which occurs when mlx5_dev_close() calls
mlx5_mr_release() due to an uninitialized entry in the private structure.
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
A new hook function is added and called inside the PMDs at the end
of the device probing:
- in the primary process, after allocating, init and config
- in the secondary process, after attaching and local init
This new function is almost empty for now.
It will be used later to add some post-initialization processing.
For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.
Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe
bandwidth by posting a single large buffer for multiple packets. Instead of
posting a buffer per packet, one large buffer is posted in order to
receive multiple packets on it. An MPRQ buffer consists of multiple
fixed-size strides and each stride receives one packet.
Rx packet is mem-copied to a user-provided mbuf if the size of Rx packet is
comparatively small, or PMD attaches the Rx packet to the mbuf by external
buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external
buffers will be allocated and managed by PMD.
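A rough sketch of the external buffer attachment path using the public mbuf
helpers (the PMD's own mempool management and stride handling are omitted):
#include <rte_common.h>
#include <rte_mbuf.h>
#include <rte_malloc.h>

/* Free callback for the external buffer, called when its refcnt drops to 0. */
static void
ext_buf_free_cb(void *addr, void *opaque __rte_unused)
{
	rte_free(addr);
}

/* Hand an externally allocated buffer to an mbuf instead of copying into it. */
static struct rte_mbuf *
attach_ext_buffer(struct rte_mempool *mb_pool, uint16_t buf_len)
{
	struct rte_mbuf *m = rte_pktmbuf_alloc(mb_pool);
	struct rte_mbuf_ext_shared_info *shinfo;
	void *buf;

	if (m == NULL)
		return NULL;
	buf = rte_malloc(NULL, buf_len, 0);
	if (buf == NULL) {
		rte_pktmbuf_free(m);
		return NULL;
	}
	/* The helper carves the shared info out of the buffer tail and
	 * shrinks buf_len accordingly. */
	shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
						    ext_buf_free_cb, buf);
	if (shinfo == NULL) {
		rte_free(buf);
		rte_pktmbuf_free(m);
		return NULL;
	}
	rte_pktmbuf_attach_extbuf(m, buf, rte_malloc_virt2iova(buf),
				  buf_len, shinfo);
	return m;
}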
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.
There are multiple layers for MR search.
L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx5_mr_lookup_cache().
If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh()
and it is not an inline function. The data structure for L2 is a binary tree.
If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.
If L3 misses, a new MR for the address should be created -
mlx5_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.
In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.
To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
the per-lcore cache of the Mempool. Even though there are many MRs for a
device or MRs are highly fragmented, the Mempool cache will still help to
reduce misses on the per-queue caches.
'--legacy-mem' is also supported.
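A simplified, hypothetical sketch of the L0/L1 fast path described above; the
types and helpers below are illustrative, not the actual driver definitions:
#include <stdint.h>

/* Simplified structures mirroring the description above. */
struct mr_entry { uintptr_t start; uintptr_t end; uint32_t lkey; };
#define MR_CACHE_N 8
struct mr_ctrl {
	unsigned int mru;                   /* most recently used L1 index */
	struct mr_entry cache[MR_CACHE_N];  /* L1: small linear-search array */
};

/* Stub for the bottom half: L2 per-queue B-tree, L3 global device cache
 * under a lock, then MR creation if everything misses (details omitted). */
static uint32_t
mr_lookup_bottom_half(struct mr_ctrl *ctrl, uintptr_t addr)
{
	(void)ctrl;
	(void)addr;
	return UINT32_MAX; /* placeholder for an invalid lkey */
}

/* Fast path: L0 (last hit), then L1 (linear search), then the bottom half. */
static inline uint32_t
mr_addr2lkey(struct mr_ctrl *ctrl, uintptr_t addr)
{
	const struct mr_entry *e = &ctrl->cache[ctrl->mru];
	unsigned int i;

	if (addr >= e->start && addr < e->end)           /* L0 hit */
		return e->lkey;
	for (i = 0; i < MR_CACHE_N; ++i) {
		e = &ctrl->cache[i];
		if (addr >= e->start && addr < e->end) { /* L1 hit */
			ctrl->mru = i;
			return e->lkey;
		}
	}
	return mr_lookup_bottom_half(ctrl, addr);        /* L2/L3/create */
}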
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
rte_eth_devices[] is not shared between the primary and secondary process;
it is a static array in each process. The reverse pointer to the device
(priv->dev) is therefore invalid. Instead, priv holds a pointer to the
shared data of the device:
struct rte_eth_dev_data *dev_data;
Two macros are added,
#define PORT_ID(priv) ((priv)->dev_data->port_id)
#define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
If something goes wrong in mlx5_pci_probe(), the allocated eth dev
will cause a memory leak.
This commit releases the eth dev that was previously allocated.
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
When attr_ctx is NULL it will attempt to free the list of devices twice.
Avoid double freeing the list by directly going to error handling.
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5-prefixed functions return negative errno values;
the error handler in mlx5_pci_probe() now does the same.
Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Upon success, mlx5_socket_connect() should return the file descriptor of
the primary process.
Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This patch supports L3 VXLAN, which has no inner L2 header compared to the
standard VXLAN protocol. L3 VXLAN uses a specific overlay UDP destination
port to discriminate against standard VXLAN; a device parameter and FW have
to be configured to support it:
sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This patch supports the new 16 Verbs flow priorities by trying to create a
simple flow at priority 15. If 16 priorities are not available, it falls
back to the traditional 8 priorities.
Verbs priority mapping:
                     8 priorities    >=16 priorities
  Control flow:      4-7             8-15
  User normal flow:  1-3             4-7
  User tunnel flow:  0-2             0-3
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit adds support for generic tunnel TSO and checksum offload.
The PMD will compute the inner/outer header offsets according to the
mbuf fields. Hardware will do the calculation based on offsets and types.
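A sketch of the mbuf fields an application typically fills to request such
an offload for a VXLAN-encapsulated TCP packet (the header lengths below are
assumptions for a plain Ethernet/IPv4 case, not taken from this commit):
#include <rte_mbuf.h>

static void
request_vxlan_tso(struct rte_mbuf *m, uint16_t mss)
{
	m->ol_flags |= PKT_TX_TUNNEL_VXLAN |
		       PKT_TX_OUTER_IPV4 | PKT_TX_OUTER_IP_CKSUM |
		       PKT_TX_IPV4 | PKT_TX_IP_CKSUM |
		       PKT_TX_TCP_CKSUM | PKT_TX_TCP_SEG;
	m->outer_l2_len = 14;          /* outer Ethernet */
	m->outer_l3_len = 20;          /* outer IPv4 */
	/* For tunnel TSO, l2_len covers UDP + VXLAN + inner Ethernet. */
	m->l2_len = 8 + 8 + 14;
	m->l3_len = 20;                /* inner IPv4 */
	m->l4_len = 20;                /* inner TCP */
	m->tso_segsz = mss;
}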
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Following commit 7ba5320baa ("net/mlx5: fix link status behavior")
The initial link status is no longer set as part of the port start.
When LSC interrupts are enabled, ethdev layer reads the link status
directly from the device data instead of using the PMD callback.
This may cause application to query the link as down while in fact it was
already up before the DPDK application start (and no interrupt to fix
it).
Fixes: 7ba5320baa ("net/mlx5: fix link status behavior")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
All Netlink requests the PMD makes can also be done through the iproute2
command line interface, enabling VF behavior configuration without having
to modify the application nor hit PMD limits (e.g. the MAC address number
limit).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
VF devices are not able to receive traffic unless they fully request it
through Netlink. This causes the request to be processed by the PF,
which adds/removes the MAC address to/from the VF table if the VF is
trusted.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on total amount of VA
memory that can be allocated by DPDK application.
So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.
On 32-bit targets due to limited VA space, DPDK will no longer
spread memory to different sockets like before. Instead, it will
(by default) allocate all of the memory on socket where master
lcore is. To override this behavior, --socket-mem must be used.
The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.
Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
[1] http://dpdk.org/dev/patchwork/patch/34002/
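A small sketch of walking these lists with the public API introduced
alongside this rework (callback prototype as of DPDK 18.05):
#include <stdio.h>
#include <rte_memory.h>

/* Enumerate the per-socket, per-page-size memseg lists. */
static int
dump_msl(const struct rte_memseg_list *msl, void *arg)
{
	unsigned int *count = arg;

	printf("memseg list %u: socket %i, page size %zu\n",
	       (*count)++, msl->socket_id, (size_t)msl->page_sz);
	return 0; /* continue walking */
}

static void
dump_memseg_lists(void)
{
	unsigned int count = 0;

	rte_memseg_list_walk(dump_msl, &count);
}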
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Aligning Mellanox SPDX copyrights to a single format.
In addition, switch to SPDX license tags in files which were missed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Remove the second declaration of device_attr [1] inside the loop as well as
the query_device_ex() which has already been done outside of the loop.
[1] https://dpdk.org/ml/archives/dev/2018-March/091744.html
Fixes: 9a761de8ea ("net/mlx5: flow counter support")
Cc: stable@dpdk.org
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
This behavior is mixed between what should be handled by the application
and what is under PMD responsibility.
According to DPDK API:
- link_update() should only query the link status [1]
- link_set_{up,down}() should only set the link to the according status [1]
- dev_{start,stop}() should enable/disable traffic reception/emission [2]
On this PMD, the link status is retrieved from the net device associated
with and owned by the Linux kernel. Even when this interface is down, the
PMD can still send/receive traffic from the NIC; those two pieces of
information are unrelated. As long as the physical port is active and has a
link, the PMD can receive/send traffic on the wire.
According to the DPDK API, calling rte_eth_dev_start() even when the Linux
interface link is down is therefore possible and allowed, as the traffic
will flow between the DPDK application and the physical port.
This also means that a synchronization between the Linux interface and the
DPDK application remains under the DPDK application responsibility.
To handle such synchronization the application should follow the scheme
below. To start:
rte_eth_link_get(port_id, &link);
if (link.link_status == ETH_LINK_DOWN)
        rte_eth_dev_set_link_up(port_id);
rte_eth_dev_start(port_id);
taking into account the possible return values of each function.
and to stop:
rte_eth_dev_stop(port_id);
rte_eth_dev_set_link_down(port_id);
The application should also set the LSC interrupt callbacks to catch and
behave accordingly when the administrator sets the Linux device down/up.
The same callbacks are called when the link on the medium goes down/up.
[1] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev_core.h
[2] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.h#n1677
Fixes: c7bf62255e ("net/mlx5: fix handling link status event")
Fixes: e313ef4c2f ("net/mlx5: fix link state on device start")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This change removes the need to distinguish unlocked priv_*() functions,
which are therefore renamed with an mlx5_*() prefix for consistency.
At the same time, all mlx5 functions use a pointer to the ETH device
instead of the one to the PMD private data.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
In the priv struct only the memory region needs to be protected against
concurrent access between the control plane and the data plane.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Some empty lines have been added in the middle of the code without any
reason. This commit removes them.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Replace all (void)foo; with the __rte_unused macro, except when variables
are under #if statements.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
priv_get_num_vfs() was used to help the PMD prefetch the mbuf in the
datapath when the PMD was behaving in VF mode.
This knowledge is no longer used.
Fixes: 528a9fbec6 ("net/mlx5: support ConnectX-5 devices")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Glue object files are looked up in RTE_EAL_PMD_PATH by default when set and
should be installed in this directory.
During startup, EAL attempts to load them automatically like other plug-ins
found there. While normally harmless, dlopen() fails when rdma-core is not
installed, EAL interprets this as a fatal error and terminates the
application.
This patch requests glue objects to be installed in a different directory
to prevent their automatic loading by EAL since they are PMD helpers, not
actual DPDK plug-ins.
Fixes: f6242d0655 ("net/mlx: make rdma-core glue path configurable")
Cc: stable@dpdk.org
Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Tested-by: Timothy Redaelli <tredaelli@redhat.com>
The query for the tunnel stateless offloads was wrongly implemented
because:
1. It was using the device id to query for the offloads.
2. It was using a compilation flag for Verbs which no longer exists.
The main reason was the lack of a proper API from Verbs.
Fix the query to use the rdma-core API. The capability returned from
rdma-core refers to both Tx and Rx sides.
Even though there is a separate cap for GRE and VXLAN, the implementation
merges them into a single flag in order to simplify the checks on the data
path.
Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Fixes: f5fde52051 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This patch fixes a primary socket assertion error during close on a device
that failed to start.
Fixes: f8b9a3bad4 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Since rdma-core glue libraries are intrinsically tied to their respective
PMDs and used as internal plug-ins, their presence in the default search
path among other system libraries for the dynamic linker is not necessarily
desired.
This commit enables their installation and subsequent look-up at run time
in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also
be overridden by environment variables MLX[45]_GLUE_PATH.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).
This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This code should catch mistakes early if a glue structure member is added
without a corresponding implementation in the library.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
IBV_WQ_FLAGS_SCATTER_FCS is a WQ flag used to configure CRC
stripping on a queue upon creation.
Use IBV_RAW_PACKET_CAP_SCATTER_FCS instead to query the capability.
Even though this is a RAW_QP capability, it is used by rdma-core to
indicate support for both RAW_QP and WQ.
Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org
Reported-by: Alex Rosenbaum <alexr@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
When mlx5 is not compiled directly as an independent shared object (e.g.
CONFIG_RTE_BUILD_SHARED_LIB not enabled for performance reasons), DPDK
applications inherit its dependencies on libibverbs and libmlx5 through
rte.app.mk.
This is an issue both when DPDK is delivered as a binary package (Linux
distributions) and for end users because rdma-core then propagates as a
mandatory dependency for everything.
Application writers relying on binary DPDK packages are not necessarily
aware of this fact and may end up delivering packages with broken
dependencies.
This patch therefore introduces an intermediate internal plug-in
hard-linked with rdma-core (to preserve symbol versioning) loaded by the
PMD through dlopen(), so that a missing rdma-core does not cause unresolved
symbols, allowing applications to start normally.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This lays the groundwork for externalizing rdma-core as an optional
run-time dependency instead of a mandatory one.
No functional change.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reserving the memory space for the UAR near huge pages helps to
reduce the cases where the secondary process cannot start. Those
pages being physical pages, they must be mapped at the same virtual
address as in the primary process to have a working secondary process.
As this remapping is almost the last one done by the processes
(libraries, heaps, stacks are already loaded), similar to huge pages,
there is no guarantee this mechanism will always work.
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Following commit c7bf62255e ("net/mlx5: fix handling link status event")
the link state must be up in order for the burst function to be set on
the device ops.
As the link may take time to move between down and up state it is
possible the rte_eth_dev_start call will return with wrong burst
function (either null or the empty burst function).
Fixing it by forcing the link to be up before returning from device
start. In case the link is still not up after 5 seconds fail the function.
In addition initialize the burst function on device probe to prevent
crashes before the link is up.
Fixes: c7bf62255e ("net/mlx5: fix handling link status event")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
It is suggested to use the PCI BDF to identify a port for port addition
in OVS-DPDK, while mlx5 has its own naming style: naming it by IB device
name. This breaks the typical OVS-DPDK use case and brings more confusion
to the end users.
To fix it, this patch changes it to use the PCI BDF as the name, too.
Also, a postfix " port %u" is added, just in case there might be more
than one port associated with a PCI device.
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
When no memory is available on the same NUMA node as the device, the
initialization of the device fails. However, the use case where the
cores and memory are on a different socket than the device is valid,
even if not optimal.
To fix this issue, this commit introduces an infrastructure to select
the socket on which to allocate the verbs objects based on the ethdev
configuration and the object type, rather than the PCI numa node.
Fixes: 1e3a39f72d ("net/mlx5: allocate verbs object into shared memory")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content, and since ethdev.h is included by
ethdev_driver.h, nothing changes from the driver point of view; it is only a
logical grouping of APIs. From the application point of view, applications
can't access driver-specific APIs anymore, and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Even though the link of a port goes down, the device can still receive
traffic. That is the reason why mlx5_set_link_up/down() switches
rx/tx_pkt_burst(). However, if the link goes down due to an external command
(e.g. ifconfig), this isn't effective. It is better to change the burst
functions when a link status change is detected.
Fixes: 62072098b5 ("mlx5: support setting link up or down")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Counter_set is a counter used for flows when its support is available.
Renaming it to flow counter.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Move device configuration and features capabilities to its own structure.
This structure is filled by mlx5_pci_probe(); outside of this function
it should be treated as read-only.
This configuration struct will be used for the Tx/Rx queue setup to
select the Tx/Rx queue parameters based on the user configuration and
device capabilities.
In addition it will be used by the burst selection function to decide
on the best pkt burst to be used.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Change the function prototype to return the function pointer of the
selected Tx/Rx burst function instead of assigning it directly to the
device context.
Such change will enable to use those select functions to query the burst
function that will be selected according to the device configuration.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Tx checksum offloads are correctly handled in a single Tx burst function
only, whereas the capability is always advertised.
This causes VXLAN packets with checksum offload requests to be ignored when
the (E)MPS Tx functions are selected.
Fixes: f5fde52051 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Since the secondary process has its own devops, functions which cannot be
called by the secondary process no longer need to verify which process is
calling them.
Fixes: 87ec44ce16 ("net/mlx5: add operations for secondary process")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_get_priv() is barely used across the driver. To avoid mixing accesses,
this function is removed altogether.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The PCI lib defines the types and methods allowing to use PCI elements.
The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.
Move the relevant code out of the EAL to its expected place.
Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Configuring the UAR as IO-mapped makes the maximum throughput decline by a
noticeable amount. If the UAR is configured as a write-combining register,
a write memory barrier is needed on ringing a doorbell.
rte_wmb() is mostly effective when the size of a burst is comparatively
small. Revert the register back to write-combining and enforce a write
memory barrier instead, except for vectorized Tx burst routines.
An application can change this by setting MLX5_SHUT_UP_BF according to its
own needs.
Fixes: 9f9bebae55 ("net/mlx5: don't map doorbell register to write combining")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This flag is not necessary at the ether layer anymore.
Buses are able to advertise their hotplug support. The ether layer can
rely upon this capability instead of a special flag.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Example of setting a rule to count packets with dest
ip = 192.168.3.1 in testpmd:
testpmd: flow create 0 ingress pattern eth / ipv4 dst is 192.168.3.1
/ end actions queue index 0 / count / end
Reading the number of packets and bytes for the rule:
testpmd: flow query 0 0 count
Note: This feature is only supported starting Mellanox OFED 4.2
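The same rule can be built through the C API; a hedged sketch follows, using
the rte_flow_query() prototype that takes an action pointer (DPDK 18.05 and
later; older releases passed the action type enum instead):
#include <stdio.h>
#include <inttypes.h>
#include <rte_flow.h>
#include <rte_byteorder.h>

static int
count_flow_example(uint16_t port_id)
{
	struct rte_flow_attr attr = { .ingress = 1 };
	struct rte_flow_item_ipv4 ip_spec = {
		.hdr.dst_addr = RTE_BE32(0xc0a80301), /* 192.168.3.1 */
	};
	struct rte_flow_item_ipv4 ip_mask = {
		.hdr.dst_addr = RTE_BE32(0xffffffff),
	};
	struct rte_flow_item pattern[] = {
		{ .type = RTE_FLOW_ITEM_TYPE_ETH },
		{ .type = RTE_FLOW_ITEM_TYPE_IPV4,
		  .spec = &ip_spec, .mask = &ip_mask },
		{ .type = RTE_FLOW_ITEM_TYPE_END },
	};
	struct rte_flow_action_queue queue = { .index = 0 };
	struct rte_flow_action actions[] = {
		{ .type = RTE_FLOW_ACTION_TYPE_QUEUE, .conf = &queue },
		{ .type = RTE_FLOW_ACTION_TYPE_COUNT },
		{ .type = RTE_FLOW_ACTION_TYPE_END },
	};
	struct rte_flow_query_count qc = { .reset = 0 };
	struct rte_flow_error err;
	struct rte_flow *flow;

	flow = rte_flow_create(port_id, &attr, pattern, actions, &err);
	if (flow == NULL)
		return -1;
	/* Read back the counter attached to the rule. */
	if (rte_flow_query(port_id, flow, &actions[1], &qc, &err) == 0)
		printf("hits=%" PRIu64 " bytes=%" PRIu64 "\n",
		       qc.hits, qc.bytes);
	return 0;
}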
Signed-off-by: Ori Kam <orika@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
With the upstream rdma-core, to enable Rx CQE compression,
mlx5dv_create_cq() in Direct Verbs has to be used instead of regular
Verbs call (ibv_create_cq()). And if the size of CQE is 128 bytes,
compression is supported only by certain devices. Thus, it has to be
decided by checking the capability bits.
Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The size of an Rx completion entry should match the size of a cacheline.
This is already reflected in struct mlx5_cqe by adding 64 bytes of padding
if a cacheline is 128 bytes. Some ARM CPUs have 128-byte cachelines.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Isolated mode works exclusively with the generic flow API; this patch adds
a new set of operations valid in this mode.
- promiscuous*()
- allmulticast*()
- reta*()
- rss*()
are not supported in this mode as their functionality is fully covered by
the generic flow API.
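A minimal sketch of how an application enters this mode through the generic
flow API:
#include <rte_flow.h>

/* Enter isolated mode so that only flows explicitly created through
 * rte_flow reach the application queues. Typically called before the
 * port is configured and started. */
static int
enable_isolated_mode(uint16_t port_id)
{
	struct rte_flow_error err;

	return rte_flow_isolate(port_id, 1, &err);
}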
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
From this commit on, RSS support becomes unavailable until it is
replaced by the generic flow implementation.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Since the RSS configuration can also be used by the flow API, there is no
longer any need to keep a list of RSS configurations for each protocol.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
RSS hash configuration is currently ignored by the PMD; this commit
removes the RSS feature.
This functionality will be added in a later commit.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
RSS hash configuration is currently ignored by the PMD; this commit
removes the RSS feature in promiscuous mode.
This functionality will be added in a later commit.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
A hash Rx queue is a high-level queue providing the RSS hash algorithm,
key and indirection table to spread the packets. Those objects can be
easily shared between several Verbs flows. This commit brings this
capability to the PMD.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The indirection table on the Verbs side consists of a list of final work
queues to spread the packets according to a higher-level queue. This
indirection table can be shared among the hash Rx queues which point
to it.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Use the same design for the DPDK queue as for the Verbs queue for symmetry;
this also helps in fixing some issues, like the DPDK release queue API which
is not expected to fail. With such a design, the queue is released when
the reference counter reaches 0.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Use the same design for the DPDK queue as for the Verbs queue for symmetry;
this also helps in fixing some issues, like the DPDK release queue API which
is not expected to fail. With such a design, the queue is released when
the reference counter reaches 0.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Move Verbs objects to their own functions to allocate/release them
independently from the DPDK queue. At the same time a reference counter
is added to help detect issues when the queue is being released
but is still in use somewhere else (in flows for instance).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Move Verbs objects to their own functions to allocate/release them
independently from the DPDK queue. At the same time a reference counter
is added to help detect issues when the queue is being released
but is still in use somewhere else (in flows for instance).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch introduces the Memory Region as a shared object; users
should get a reference to it by calling priv_mr_get() or
priv_mr_new() to create the memory region. The latter
registers the memory pool in the kernel driver and retrieves the
associated memory region.
This should help to reduce the memory consumption caused by registering
the same memory pool multiple times.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Debug tools to verify all flows are unregistered from the NIC.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The generic flow API should be used for flow steering as it provides a
better and easier way to configure flows.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Add operations that are safe for secondary processes:
* (x)stats
* device info get
* rx/tx descriptor status
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The PMD uses Verbs objects which were not available in shared memory.
This patch modifies the location where Verbs objects are allocated (from
the process memory address space to the shared memory address space) and
thus allows a secondary process to use those objects by mapping this
shared memory space into its own memory space.
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Use a Unix socket to retrieve, from the primary process, the
communication channel with the kernel driver; this is necessary to remap
those pages in the secondary process memory space and thus use the same
Tx queues. This is only supported with rdma-core (v15 or later).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
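The channel is retrieved through standard SCM_RIGHTS file-descriptor
passing over a Unix socket; the generic sketch below illustrates the
mechanism only and is not the actual mlx5 socket code.

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send a file descriptor (e.g. a Verbs command FD) to a peer process. */
    static int
    send_fd(int sock, int fd)
    {
        char buf[CMSG_SPACE(sizeof(fd))];
        char dummy = 0;
        struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
        struct msghdr msg = {
            .msg_iov = &iov,
            .msg_iovlen = 1,
            .msg_control = buf,
            .msg_controllen = sizeof(buf),
        };
        struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;
        cmsg->cmsg_len = CMSG_LEN(sizeof(fd));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
        return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
    }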
This removes the dependency on specific Mellanox OFED libraries by
using the upstream rdma-core and Linux upstream community code.
Both rdma-core upstream and Mellanox OFED are Linux user-space packages:
1. Rdma-core is the Linux upstream user-space package (generic).
2. Mellanox OFED is Mellanox's Linux user-space package (proprietary).
The difference between the two is the APIs towards the kernel.
Support for x86-32 is removed due to issues in rdma-core library.
ICC compilation will be supported as soon as the following patch is
integrated in rdma-core:
https://marc.info/?l=linux-rdma&m=150643474705690&w=2
Signed-off-by: Shachar Beiser <shacharbe@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Extend the LSC event handling to support the device removal as well.
The mlx5 event handling has been made capable of receiving and
signaling several event types at once.
This support includes the following:
1. Removal event detection according to the user configuration.
2. Calling all registered mlx5 removal callbacks.
3. Capabilities extension to include removal interrupt handling.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The PMD got a random tunnel_en value on ConnectX-4 Lx NICs, depending on
the compiler optimization level. The variable was not initialized and
the detection logic was absent.
Fixes: f5fde52051 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The previous code did not return an error when the probe function
failed due to an error while getting the device MAC address.
Because of this, the probe function could return success even though
the ethdev was not allocated.
Hence, the probe caller, for example the failsafe PMD, fails when it
tries to get the ethdev after the device was plugged out while mlx5
was probing it.
The fix reports an error to the probe caller when priv_get_mac() fails,
as well as in all other failure paths which were missing it.
This prevents the unexpected behavior of a missing ethdev after the
device was probed successfully.
This bug was already present in the original code taken from mlx4.
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Fixes: 1371f4df16 ("mlx5: check port is configured as ethernet device")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
By default, Verbs maps the doorbell register as write-combining.
Working with write-combining is useful for drivers which use BlueFlame
for the doorbell write.
Since the mlx5 PMD uses only doorbells, and a write-combining mapping
requires an extra memory barrier to flush the doorbell after writing it,
set the mapping to non-cached by default.
This change is expected to reduce the maximum and average round-trip latency.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Alexander Solganik <solganik@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The secondary process support is a copy/paste of the mlx4 driver's; it
was never tested and it even segfaults at secondary process start in
mlx5_pci_probe().
It makes more sense to remove this non-working feature and rewrite a
working and functional version later.
Fixes: a48deada65 ("mlx5: allow operation in secondary processes")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Those are useless since DPDK headers have been cleaned up.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The vector code is very young and can present some issues for users. To
avoid forcing them to modify the selection functions by commenting out
the code and recompiling the PMD, new device parameters are added to
deactivate the Tx and/or Rx vector code.
By using these device parameters, the user can fall back to the regular
burst functions.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Not exposing Rx interrupt callbacks when this feature is unsupported is
less intrusive than having two different versions of these functions.
Fixes: 3c7d44af25 ("net/mlx5: support user space Rx interrupt event")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Flow rules must be applied in the same order as they have been created and
thus destroyed in the reverse order.
Fixes: 2097d0d1e2 ("net/mlx5: support basic flow items and actions")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The PCI code will move to the bus drivers directory.
Rename functions from rte_eal_pci_ to rte_pci_
to prepare the move of the driver out of EAL.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
With the addition of enhanced multi-packet send, the defaults were set
to get the maximum out-of-the-box performance.
Features like TSO don't use the enhanced send, however the defaults
still apply. This causes Tx queue creation to fail.
Fixes: 3f13f8c23a ("net/mlx5: support hardware TSO")
Fixes: 6ce84bd889 ("net/mlx5: add enhanced multi-packet send for ConnectX-5")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Currently, argument processing is done without any indication of which
parameters were forced by the application and which ones are at their
default values.
This becomes problematic when different features require different
defaults, for example enhanced multi-packet send and TSO.
This commit modifies the argument processing so that it can tell which
parameters were forced by the application.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
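A common way to achieve this is to initialize every parameter to an
"unset" sentinel and resolve the defaults only afterwards. The sketch
below is illustrative; the sentinel, field names and default values are
placeholders, not the actual mlx5 configuration code.

    #define ARG_UNSET (-1) /* parameter not forced by the application */

    struct pmd_config {
        int txq_inline; /* inline threshold, or ARG_UNSET */
        int tso;        /* TSO requested, or ARG_UNSET */
    };

    static void
    config_resolve_defaults(struct pmd_config *cfg)
    {
        /* Only parameters the application did not force get a default,
         * and that default may depend on other features (e.g. TSO). */
        if (cfg->txq_inline == ARG_UNSET)
            cfg->txq_inline = (cfg->tso == 1) ? 0 : 256; /* illustrative */
    }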
ConnectX-5 supports enhanced version of multi-packet send (MPS). An MPS Tx
descriptor can carry multiple packets either by including pointers of
packets or by inlining packets. Inlining packet data can be helpful to
better utilize PCIe bandwidth. In addition, Enhanced MPS supports hybrid
mode - mixing inlined packets and pointers in a descriptor. This feature is
enabled by default if supported by HW.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Prior to this commit Tx checksum offload was supported only for the
inner headers.
This commit adds support for the hardware to compute the checksum for the
outer headers as well.
The support is for tunneling protocols GRE and VXLAN.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Since there is no "descriptor done" flag like on Intel drivers, the
approach is different in the mlx5 driver.
- for Tx, we call txq_complete() to free descriptors processed by
the HW, then we check whether the descriptor is between tail and head
- for Rx, we need to browse the CQEs, handling compressed ones,
to get the number of used descriptors.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
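Conceptually, once txq_complete() has freed what the HW already processed,
the Tx descriptor status boils down to a ring-index comparison. The sketch
below is a simplified illustration (it assumes a power-of-two ring size),
not the actual driver code.

    #include <errno.h>
    #include <stdint.h>
    #include <rte_ethdev.h>

    /* Report whether Tx descriptor 'offset' is still in flight. */
    static int
    tx_descriptor_status(uint16_t head, uint16_t tail, uint16_t ring_size,
                         uint16_t offset)
    {
        /* Descriptors still owned by the HW, modulo the ring size. */
        uint16_t used = (head - tail) & (ring_size - 1);

        if (offset >= ring_size)
            return -EINVAL;
        return offset < used ? RTE_ETH_TX_DESC_FULL : RTE_ETH_TX_DESC_DONE;
    }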
In case of an error, the argument list is not freed.
Fixes: e72dd09b61 ("net/mlx5: add support for configuration through kvargs")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
The size of the Rx RSS indirection table was limited to 256, but this
limitation is not required anymore for all Mellanox NICs. However,
librte_ether still limits the size to 512.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Retrieving link status information through the link update callback should
be quick and non-blocking.
Mellanox PMDs retrieve this information through ioctl() calls on the
related kernel netdevice. This appears to take a long time to
complete and may cause significant slowdowns in applications.
While these system calls cannot be accelerated, removing the lock on the
private structure allows applications to perform other control operations
from separate threads in the meantime. This function remains safe without
locking as it does not write to the private structure; it is only used
to retrieve the name of the netdevice.
Signed-off-by: Matthieu Ternisien d'Ouville <matthieu.tdo@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Add PCI device ID for ConnectX-5 and enable multi-packet send for PF and VF
along with changing documentation and release note.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This makes struct rte_eth_dev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
This moves the non-PCI related initialization of the link state interrupt
callback list and the setting of the default MTU to rte_eth_dev_allocate()
so that drivers only need to set non-default values.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Let's clear the eth_dev->data when allocating a new rte_eth_dev so that
drivers only need to set non-zero values.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add a new macro RTE_PMD_REGISTER_KMOD_DEP() that allows a driver to
declare the list of kernel modules required to run properly.
Today, most PCI drivers require uio/vfio.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
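Usage is a one-line declaration per driver; in the example below the
driver name and module list are illustrative only.

    /* This PMD needs one of the listed kernel modules to be loaded. */
    RTE_PMD_REGISTER_KMOD_DEP(net_foo, "* igb_uio | uio_pci_generic | vfio");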
This commit fixes the link status report on device start-up when the
LSC callback is configured.
Fixes: 62072098b5 ("mlx5: support setting link up or down")
Signed-off-by: Olga Shern <olgas@mellanox.com>
All macros related to driver registration are renamed from DRIVER_*
to RTE_PMD_*.
This includes:
DRIVER_REGISTER_PCI -> RTE_PMD_REGISTER_PCI
DRIVER_REGISTER_PCI_TABLE -> RTE_PMD_REGISTER_PCI_TABLE
DRIVER_REGISTER_VDEV -> RTE_PMD_REGISTER_VDEV
DRIVER_REGISTER_PARAM_STRING -> RTE_PMD_REGISTER_PARAM_STRING
DRIVER_EXPORT_* -> RTE_PMD_EXPORT_*
Fix PMDINFOGEN tool to look for matches of RTE_PMD_REGISTER_*.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
With recent gcc versions, e.g. gcc 6.1, compilation of mlx drivers with
debug enabled produces lots of errors complaining that "pedantic" is
not a warning level that can be ignored.
error: ‘-pedantic’ is not an option that controls warnings [-Werror=pragmas]
#pragma GCC diagnostic ignored "-pedantic"
^~~~~~~~~~~
These errors can be removed by changing the "-pedantic" to "-Wpedantic".
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.
As of now no one is using this list, but it can be used to iterate over
all devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Remove the 'name' member from rte_pci_driver and move it to the generic
rte_driver.
Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to the eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver have been removed.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Now that hotplug has been moved to eal, there is no reason to keep the
device type in this layer.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.
Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.
Update documentation for replacing an example referring to
PMD_REGISTER_DRIVER.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Probe and Remove are more appropriate names for the PCI init and uninit
operations. This is a cosmetic change.
Only MLX* drivers use direct PCI registration, bypassing the PMD_*
macros. Their callbacks have been updated too.
VDEV are left out. For them, init/uninit are more appropriate.
Suggested-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
As discussed in the past release, driver names are modified to be more
consistent, and future drivers should follow this new convention.
Driver names consist of:
Driver names consist of:
"driver category"_"driver folder name"_"optional extra name".
For example:
- Crypto null driver -> "crypto_null"
- Network IXGBE VF driver -> "net_ixgbe_vf"
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Since the PMD_REGISTER_DRIVER macro now sets the driver names, there is
no need for the rte_driver structure to set them statically, as they
would get overridden.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Compilation fails because of some typos.
Fixes: cb6696d220 ("drivers: update registration macro usage")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Modify the PMD_REGISTER_DRIVER macro, adding a name argument to it. The
addition of a name argument creates a token that can be used for subsequent
macros in the creation of unique symbol names to export additional bits of
information for use by the pmdinfogen tool. For example:
PMD_REGISTER_DRIVER(ena_driver, ena);
registers the ena_driver struct as it always did, and creates a symbol
const char this_pmd_name0[] __attribute__((used)) = "ena";
which pmdinfogen can search for and extract. The subsequent macro
DRIVER_REGISTER_PCI_TABLE(ena, ena_pci_id_map);
creates a symbol const char ena_pci_tbl_export[] __attribute__((used)) =
"ena_pci_id_map";
which allows pmdinfogen to find the PCI table of this driver.
Using this pattern, we can export arbitrary bits of information.
pmdinfo uses this information to extract hardware support from an object
file and create a json string to make hardware support info discoverable
later.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Remy Horton <remy.horton@intel.com>
This feature enables the TX burst function to emit up to 5 packets using
only two work queue entries (WQEs) on devices that support it. This saves
PCI bandwidth and improves performance.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Implement send inline feature which copies packet data directly into
work queue entries (WQEs) for improved latency. The maximum packet
size and the minimum number of Tx queues to qualify for inline send
are user-configurable.
This feature is effective when HW causes a performance bottleneck.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.
Before decompression:
    0          1          2           6         7         8
+-------+ +---------+ +-------+   +-------+ +-------+ +-------+
| CQE64 | |  CQE64  | | CQE64 |   | CQE64 | | CQE64 | | CQE64 |
|-------| |---------| |-------|   |-------| |-------| |-------|
| ..... | | cqe8[0] | |       | . |       | |       | | ..... |
| ..... | | cqe8[1] | |       | . |       | |       | | ..... |
| ..... | | ....... | |       | . |       | |       | | ..... |
| ..... | | cqe8[7] | |       |   |       | |       | | ..... |
+-------+ +---------+ +-------+   +-------+ +-------+ +-------+
After decompression:
    0         1    ...    8
+-------+ +-------+   +-------+
| CQE64 | | CQE64 |   | CQE64 |
|-------| |-------|   |-------|
| ..... | | ..... | . | ..... |
| ..... | | ..... | . | ..... |
| ..... | | ..... | . | ..... |
| ..... | | ..... |   | ..... |
+-------+ +-------+   +-------+
This patch does not perform the entire decompression step as it would be
really expensive; instead, the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.
Intermediate empty CQE64 entries are handed back to the HW without
further processing.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
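For illustration, a sketch of the bookkeeping described above is shown
below; the structure, fields and function are placeholders and not the
actual Rx burst code.

    /* State kept while a compressed CQE session is being consumed. */
    struct cqe_zip {
        unsigned int count; /* mini-CQEs announced by the first CQE64 */
        unsigned int index; /* next cqe8[] entry to interpret */
    };

    /* Called per packet: consume a mini-CQE instead of a full CQE64. */
    static int
    rx_poll_compressed(struct cqe_zip *zip, const volatile void *cqe8_array)
    {
        (void)cqe8_array; /* real code parses cqe8_array[zip->index] */
        if (zip->count == 0)
            return 0; /* no compressed session in progress */
        /* ... fill mbuf fields from the mini-CQE here ... */
        zip->index++;
        zip->count--;
        return 1;
    }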
The intent is to replace the remaining compile-time options and environment
variables with a common means of runtime configuration. This commit only
adds the kvargs handling code; subsequent commits will update the rest.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
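The parsing relies on the rte_kvargs API; below is a minimal hedged
sketch handling a single parameter, where the "txq_inline" key and the
helper names are only examples.

    #include <errno.h>
    #include <stdlib.h>
    #include <rte_kvargs.h>

    static const char *const valid_keys[] = { "txq_inline", NULL };

    /* Callback invoked for each matching key=value pair. */
    static int
    handle_arg(const char *key, const char *val, void *opaque)
    {
        int *out = opaque;

        (void)key;
        *out = atoi(val);
        return 0;
    }

    static int
    parse_devargs(const char *devargs, int *txq_inline)
    {
        struct rte_kvargs *kvlist = rte_kvargs_parse(devargs, valid_keys);

        if (kvlist == NULL)
            return -EINVAL;
        rte_kvargs_process(kvlist, "txq_inline", handle_arg, txq_inline);
        rte_kvargs_free(kvlist);
        return 0;
    }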
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure rxq_ctrl.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure txq_ctrl.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Use the RTE_PCI_DEVICE macro to set all fields rather than explicitly
setting them individually in the code. This shortens the code while
helping to future-proof it against changes to the rte_pci_id structure.
Fixes: 701c8d80c8 ("pci: support class id probing")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
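For reference, a PCI ID table entry then reduces to the following; the
table name is illustrative and the vendor/device macros are assumed to
be defined by the driver.

    static const struct rte_pci_id pci_id_map[] = {
        { RTE_PCI_DEVICE(PCI_VENDOR_ID_MELLANOX,
                         PCI_DEVICE_ID_MELLANOX_CONNECTX4) },
        { .vendor_id = 0 } /* sentinel terminating the table */
    };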
SR-IOV mode is currently set when dealing with VF devices. PF devices must
be taken into account as well if they have active VFs.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.
When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.
In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.
This is disabled by default since this can also decrease performance for
unaligned packet sizes.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Fix the packet padding macro check.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Secondary processes are expected to use queues and other resources
allocated by the primary process; however, Verbs resources can only be
shared between processes when inherited through fork().
This limitation can be worked around for TX by configuring separate
queues from secondary processes.
Signed-off-by: Or Ami <ora@mellanox.com>
Add driver functions to set link state up or down.
Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.
Signed-off-by: Or Ami <ora@mellanox.com>
Add a new API rte_eth_dev_get_supported_ptypes to query what packet types
can be filled by a given device. The device should be already started or
its PMD RX burst function already decided, since the packet types supported
may vary depending on RX function.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
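Typical usage from an application, as a sketch (the mask and array size
are arbitrary examples):

    #include <stdio.h>
    #include <inttypes.h>
    #include <rte_ethdev.h>

    /* Print which L3 packet types the started port can report in mbufs. */
    static void
    show_l3_ptypes(uint8_t port_id)
    {
        uint32_t ptypes[16];
        int i;
        int n = rte_eth_dev_get_supported_ptypes(port_id, RTE_PTYPE_L3_MASK,
                                                 ptypes, RTE_DIM(ptypes));

        for (i = 0; i < n && i < (int)RTE_DIM(ptypes); i++)
            printf("supported ptype: 0x%" PRIx32 "\n", ptypes[i]);
    }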
RSS configuration should not be freed when priv is NULL.
Fixes: 2f97422e77 ("mlx5: support RSS hash update and get")
Signed-off-by: Or Ami <ora@mellanox.com>
Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.
This feature requires MLNX_OFED >= 3.2.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Until now, broadcast frames were handled like unicast. Moving the related
flow to the special flows table frees up the related unicast MAC entry.
The same method is used to handle IPv6 multicast frames.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.
New function priv_rehash_flows() implements the logic to enable/disable
relevant flows in one place from any context.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The following error occurs when CONFIG_RTE_LIBRTE_MLX5_DEBUG=y:
drivers/net/mlx5/mlx5.c:381:4: error: ISO C forbids braced-groups within expressions
RTE_MIN() uses the non-standard ({ ... }) syntax to declare variables within
parentheses, which is rejected by -pedantic.
Since the RSS_INDIRECTION_TABLE_SIZE check is meant to go away as soon as
DPDK supports larger/variable indirection tables, put it in a separate
condition.
Fixes: 634efbc2c8 ("mlx5: support RETA query and update")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Use new function rte_eth_copy_pci_info.
Copy device info for the following pdevs:
bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
ConnectX-4 adapters do not have a constant indirection table size, which is
set at runtime from the number of RX queues. The maximum size is retrieved
using a hardware query and is normally 512.
Since the current RETA API cannot handle a variable size, any query/update
command causes it to be silently updated to RSS_INDIRECTION_TABLE_SIZE
entries regardless of the original size.
Also due to the underlying type of the configuration structure, the maximum
size is limited to RSS_INDIRECTION_TABLE_SIZE (currently 128, at most 256
entries).
A port stop/start must be done to apply the new RETA configuration.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Add interrupts handler for port status notification.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
DPDK expects to have an RSS hash key per flow type (IPv4, IPv6, UDPv4,
etc.); to handle this, the PMD must keep a table of hash keys to be able
to reconfigure the queues at each start/stop call.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
First implementation of rss_hash_update and rss_hash_conf_get; these
functions still lack functionality but can be used to change the RSS
hash key. For now, the PMD does not handle an indirection table for
each kind of flow (IPv4, IPv6, etc.); the same RSS hash key is used
for all protocols. This explains why rss_hash_conf_get returns the RSS
hash key for all DPDK-supported protocols and why the hash key is set
for all of them too.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Use the maximum size of the indirection table when the number of requested
RX queues is not a power of two; this helps to improve RSS balancing.
A message informs users that balancing is not optimal in such cases.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
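The resulting spread is a simple round-robin of the configured Rx queues
over the largest indirection table, as in the sketch below (illustrative
code, not the driver's).

    #include <stdint.h>

    /* Fill the indirection table by cycling over the Rx queues. */
    static void
    fill_reta(uint16_t *reta, unsigned int reta_size, unsigned int rxqs_n)
    {
        unsigned int i;

        for (i = 0; i < reta_size; i++)
            reta[i] = i % rxqs_n;
        /* When rxqs_n is not a power of two, some queues receive one
         * more entry than others, hence the "not optimal" message. */
    }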
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.
Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.
RSS hash key is not configurable yet.
Summary of changes:
- Individual DPDK RX queues do not store flow properties anymore; this
info is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
RSS) to remove that special case.
- Hash RX queues and indirection table are created/destroyed when device
is started/stopped in addition to create/destroy flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
it does not work correctly with WQs. Use the largest possible size
instead, since CQ size does not have to be the same as the number of
elements in the RX queue. This also applies to the maximum number of
outstanding WRs in a WQ (max_recv_wr).
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
All MAC RX flows must be updated with VLAN information when configuring a
VLAN filter.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Like most other device control operations, those are handled by the related
kernel network device through syscalls.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Link information is retrieved using ethtool ioctls.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Depending on the MTU and whether jumbo frames are enabled, RX queues may
switch between SG and non-SG modes for better performance.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit adds the remaining missing callbacks to make mlx5 usable.
Like mlx4, device start and stop are implemented on top of MAC RX flows.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Francesco Santoro <francesco.santoro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
This commit adds support for MAC flow steering rules mandatory for the RX
path as well as the related callbacks to add/remove MAC addresses.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
RSS implementation with parent/child QPs comes from mlx4 and is temporary.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In its current state, this driver implements the bare minimum to initialize
itself and Mellanox ConnectX-4 adapters without doing anything else
(no RX/TX for instance). It is disabled by default since it is based on the
mlx4 driver and also depends on libibverbs.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>