numam-dpdk

Author	SHA1	Message	Date
Ivan Ilchenko	ca041cd44f	ethdev: change allmulticast callbacks to return status Enabling/disabling of allmulticast mode is not always successful and it should be taken into account to be able to handle it properly. When correct return status is unclear from driver code, -EAGAIN is used. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2019-10-07 15:00:55 +02:00
Igor Romanov	9970a9ad07	ethdev: make stats and xstats reset callbacks return int Change return value of the callbacks from void to int. Make implementations across all drivers return negative errno values in case of error conditions. Both callbacks are updated together because a large number of drivers assign the same function to both callbacks. Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 15:00:54 +02:00
Andrew Rybchenko	9039c81257	ethdev: change promiscuous callbacks to return status Enabling/disabling of promiscuous mode is not always successful and it should be taken into account to be able to handle it properly. When correct return status is unclear from driver code, -EAGAIN is used. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Hyong Youb Kim <hyonkim@cisco.com>	2019-10-07 15:00:54 +02:00
Ivan Ilchenko	bdad90d12e	ethdev: change device info get callback to return int Change eth_dev_infos_get_t return value from void to int. Make eth_dev_infos_get_t implementations across all drivers to return negative errno values in case of error conditions. Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-10-07 14:45:35 +02:00
Stephen Hemminger	164cad78b0	net/mlx4: fix crash on info query in secondary process mlx4_dev_info_get calls mlx4_get_ifname, but mlx4_get_ifname uses priv->ctx which is not a valid pointer in a secondary process. The fix is to cache the value in primary. In the primary process, get and store the interface index of the device so that secondary process can see it. Bugzilla ID: 320 Fixes: `61cbdd4194` ("net/mlx4: separate device control functions") Cc: stable@dpdk.org Reported-by: Suyang Ju <sju@paloaltonetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Matan Azrad <matan@mellanox.com>	2019-08-06 17:42:12 +02:00
Dekel Peled	187ec068d5	net/mlx4: fix typo in comment A spelling mistake was found in comment. This patch fixes it. Fixes: `8e49376400` ("net/mlx4: add external allocator for Verbs object") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-23 14:31:36 +02:00
Anatoly Burakov	edf73dd330	ipc: handle unsupported IPC in action register Currently, IPC API will silently ignore unsupported IPC. Fix the API call and its callers to explicitly handle unsupported IPC cases. For primary processes, it is OK to not have IPC because there may not be any secondary processes in the first place, and there are valid use cases that disable IPC support, so all primary process usages are fixed up to ignore IPC failures. For secondary processes, IPC will be crucial, so leave all of the error handling as is. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-06-05 11:27:36 +02:00
Olivier Matz	35b2d13fd6	net: add rte prefix to ether defines Add 'RTE_' prefix to defines: - rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN. - rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN. - rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN. - rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN. - rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN. - rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN. - rename ETHER_MTU as RTE_ETHER_MTU. - rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN. - rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID. - rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN. - rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU. - rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR. - rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR. - rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4. - rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6. - rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP. - rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN. - rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP. - rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ. - rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG. - rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588. - rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW. - rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB. - rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP. - rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS. - rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM. - rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN. - rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE. - rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4. - rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6. - rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH. - rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH. - rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS. - rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP. - rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG. - rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN. 
Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:45 +02:00
Olivier Matz	6d13ea8e8e	net: add rte prefix to ether structures Add 'rte_' prefix to structures: - rename struct ether_addr as struct rte_ether_addr. - rename struct ether_hdr as struct rte_ether_hdr. - rename struct vlan_hdr as struct rte_vlan_hdr. - rename struct vxlan_hdr as struct rte_vxlan_hdr. - rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr. Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-05-24 13:34:45 +02:00
Adrien Mazarguil	138a740c45	net/mlx4: support multicast address list interface Since this driver does not distinguish unicast/multicast addresses, applications could always rely on the standard MAC add/remove/set interface to configure both types. As a result, the multicast address list interface never got implemented (rte_eth_dev_set_mc_addr_list()) however PMD-agnostic applications still rely on it for compatibility reasons; a wrapper is therefore required. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-05-03 18:45:23 +02:00
Yongseok Koh	97d37d2c1f	net/mlx4: remove device register remap UAR (User Access Region) register does not need to be remapped for primary process but it should be remapped only for secondary process. UAR register table is in the process private structure in rte_eth_devices[], (struct mlx4_proc_priv *)rte_eth_devices[port_id].process_private The actual UAR table follows the data structure and the table is used for both Tx and Rx. For Tx, BlueFlame in UAR is used to ring the doorbell. MLX4_TX_BFREG(txq) is defined to get a register for the txq. Processes access its own private data to acquire the register from the UAR table. For Rx, the doorbell in UAR is required in arming CQ event. However, it is a known issue that the register isn't remapped for secondary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-12 11:02:02 +02:00
Yongseok Koh	0b259b8e96	net/mlx4: enable secondary process to register DMA memory The Memory Region (MR) for DMA memory can't be created from secondary process due to lib/driver limitation. Whenever it is needed, secondary process can make a request to primary process through the EAL IPC channel (rte_mp_msg) which is established on initialization. Once a MR is created by primary process, it is immediately visible to secondary process because the MR list is global per a device. Thus, secondary process can look up the list after the request is successfully returned. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Yongseok Koh	f4efc0eb97	net/mlx4: add control of excessive memory pinning by kernel A new PMD parameter (mr_ext_memseg_en) is added to control extension of memseg when creating a MR. It is enabled by default. If enabled, mlx4_mr_create() tries to maximize the range of MR registration so that the LKey lookup tables on datapath become smaller and get the best performance. However, it may worsen memory utilization because registered memory is pinned by kernel driver. Even if a page in the extended chunk is freed, that doesn't become reusable until the entire memory is freed and the MR is destroyed. To make freed pages available immediately, this parameter has to be turned off but it could drop performance. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Yongseok Koh	0203d33a10	net/mlx4: support secondary process In order to support secondary process, a few features are required. a) rdma-core library should allocate device resources using DPDK's memory allocator. b) UAR should be remapped for secondary processes. Currently, in order not to use different data structure for secondary processes, PMD tries to reserve identical virtual address space for both primary and secondary processes. c) IPC channel is necessary, which can be easily set with rte_mp APIs. Through the channel, Verbs command FD is delivered to the secondary process and the device stop/start event is also broadcast from primary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Yongseok Koh	8e49376400	net/mlx4: add external allocator for Verbs object To support secondary process, the memory allocated by library such as completion rings (CQ) and buffer rings (WQ) must be manageable by EAL, in order to share it with secondary processes. With new changes in rdma-core and kernel driver, it is possible to provide an external allocator to the library layer for this purpose. All such resources will now be allocated within DPDK framework. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Yongseok Koh	099c2c5376	net/mlx4: change device reference for secondary process rte_eth_devices[] is not shared between primary and secondary process, but a static array to each process. The reverse pointer of device (priv->dev) becomes invalid if mlx4 supports secondary process. Instead, priv has the pointer to shared data of the device, struct rte_eth_dev_data *dev_data; Two macros are added, #define PORT_ID(priv) ((priv)->dev_data->port_id) #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)]) Cc: stable@dpdk.org Suggested-by: Raslan Darawsheh <rasland@mellanox.com> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Thomas Monjalon	dbeba4cf18	net/mlx: prefix private structure The private structure stored in rte_eth_dev->data->dev_private was named "struct priv". In order to ease code browsing, the structure is renamed "struct mlx[45]_priv". Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-03-01 18:17:35 +01:00
Thomas Monjalon	714bf46ebb	net/mlx: support firmware version query The API function rte_eth_dev_fw_version_get() is querying drivers via the operation callback fw_version_get(). The implementation of this operation is added for mlx4 and mlx5. Both functions are copying the same ibverbs field fw_ver which is retrieved when calling ibv_query_device[_ex]() during the port probing. It is tested with command "drvinfo" of examples/ethtool/. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-02-13 12:55:38 +01:00
Moti Haimovsky	2b4e423fd4	net/mlx4: check RSS queues number limitation This patch verifies that the number of Rx queues configured for RSS is supported by the device hardware. RSS support in mlx4 requires contiguous chunk of QPs to be reserved, there is a hardware limitation on the amount of contiguous QPs which is reported by the hardware. Ignoring this value will cause Rx queues creation to fail. Cc: stable@dpdk.org Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2018-08-02 12:34:16 +02:00
Moti Haimovsky	ba576975a8	net/mlx4: support hardware TSO Implement support for hardware TSO. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2018-07-10 14:02:57 +02:00
Yongseok Koh	9797bfcce1	net/mlx4: add new memory region support This is the new design of Memory Region (MR) for mlx PMD, in order to: - Accommodate the new memory hotplug model. - Support non-contiguous Mempool. There are multiple layers for MR search. L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized array by linear search. L0/L1 is in an inline function - mlx4_mr_lookup_cache(). If L1 misses, the bottom-half function is called to look up the address from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh() and it is not an inline function. Data structure for L2 is the Binary Tree. If L2 misses, the search falls into the slowest path which takes locks in order to access global device cache (priv->mr.cache) which is also a B-tree and caches the original MR list (priv->mr.mr_list) of the device. Unless the global cache is overflowed, it is all-inclusive of the MR list. This is L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and can't be expanded on the fly due to deadlock. Refer to the comments in the code for the details - mr_lookup_dev(). If L3 is overflowed, the list will have to be searched directly bypassing the cache although it is slower. If L3 misses, a new MR for the address should be created - mlx4_mr_create(). When it creates a new MR, it tries to register adjacent memsegs as much as possible which are virtually contiguous around the address. This must take two locks - memory_hotplug_lock and priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any allocation/free of memory inside. In the free callback of the memory hotplug event, freed space is searched from the MR list and corresponding bits are cleared from the bitmap of MRs. This can fragment a MR and the MR will have multiple search entries in the caches. 
Once there's a change by the event, the global cache must be rebuilt and all the per-queue caches will be flushed as well. If memory is frequently freed in run-time, that may cause jitter on dataplane processing in the worst case by incurring MR cache flush and rebuild. But, it would be the least probable scenario. To guarantee the most optimal performance, it is highly recommended to use an EAL option - '--socket-mem'. Then, the reserved memory will be pinned and won't be freed dynamically. And it is also recommended to configure per-lcore cache of Mempool. Even though there're many MRs for a device or MRs are highly fragmented, the cache of Mempool will be much helpful to reduce misses on per-queue caches anyway. '--legacy-mem' is also supported. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:52 +01:00
Yongseok Koh	2d684b911d	net/mlx4: remove memory region support This patch removes current support of Memory Region (MR) in order to accommodate the dynamic memory hotplug patch. This patch can be compiled but traffic can't flow and HW will raise faults. Subsequent patches will add new MR support. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:51 +01:00
Adrien Mazarguil	84a684862f	net/mlx4: fix Rx resource leak in case of error When creation of a flow rule fails during dev_start(), the usage count of the common RSS context is not decremented, which triggers an assertion failure in debug mode during dev_close(). This is addressed by tracking the initialization status of the common RSS context in order to add missing cleanup code. A similar issue exists in mlx4_rxq_attach(), where usage count is incremented on a Rx queue but not released in case of error. This may lead to the above issue since RSS contexts created by flow rules attach themselves to Rx queues, incrementing their usage count. Fixes: `5697a41421` ("net/mlx4: relax Rx queue configuration order") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-05-02 19:28:48 +02:00
Olivier Matz	caccf8b318	ethdev: return diagnostic when setting MAC address Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a return code is added to notify the caller (librte_ether) if an error occurred in the PMD. The new default MAC address is now copied in dev->data->mac_addrs[0] only if the operation is successful. The patch also updates all the PMDs accordingly. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2018-04-14 00:43:30 +02:00
Ophir Munk	de1df14e6e	net/mlx4: support CRC strip toggling Previous to this commit mlx4 CRC stripping was executed by default and there was no verbs API to disable it. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-04-14 00:43:30 +02:00
Shahaf Shuler	5feecc57d9	align SPDX Mellanox copyrights Aligning Mellanox SPDX copyrights to a single format. In addition replace to SPDX licence files which were missed. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-04-11 01:47:47 +02:00
Olivier Matz	82092c8734	net/mlx4: use SPDX tags in 6WIND copyrighted files Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-02-01 02:33:04 +01:00
Moti Haimovsky	fc1b5ec522	net/mlx4: fix removal detection of stopped port In failsafe device start can be called for ports/devices that had been plugged out. The mlx4 PMD detects device removal by listening to the device RMV events, when the mlx4 port is being stopped, the PMD no longer listens to these events causing the PMD to stop detecting device removals. This patch fixes this issue by moving installation of the interrupt handler to device configuration, and toggle only the Rx-queue interrupts on start/stop. Fixes: `a6e8b01c3c` ("net/mlx4: compact interrupt functions") Cc: stable@dpdk.org Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-01-30 10:20:35 +01:00
Ferruh Yigit	ffc905f3b8	ethdev: separate driver APIs Create a rte_ethdev_driver.h file and move PMD specific APIs here. Drivers updated to include this new header file. There is no update in header content and since ethdev.h included by ethdev_driver.h, nothing changed from driver point of view, only logically grouping of APIs. From applications point of view they can't access to driver specific APIs anymore and they shouldn't. More PMD specific data structures still remain in ethdev.h because of inline functions in header use them. Those will be handled separately. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-01-22 01:26:49 +01:00
Matan Azrad	cdf4ec6eaa	net/mlx4: support a device removal check operation Add support to get removal status of mlx4 device. Signed-off-by: Matan Azrad <matan@mellanox.com>	2018-01-21 21:09:41 +01:00
Adrien Mazarguil	55e8991e31	net/mlx4: restore inner VXLAN RSS support Inner VXLAN RSS was supported and performed by default prior to the entire mlx4 refactoring that occurred in DPDK 17.11, however so far the new Verbs RSS API did not provide means to enable it. This will be addressed in Linux 4.15 and in RDMA core. Thanks to RSS capabilities, the PMD can now probe for its support and enable it again by default. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2018-01-16 18:47:49 +01:00
Adrien Mazarguil	024e87bef4	net/mlx4: restore UDP RSS by probing capabilities Until now, UDP RSS support could not be relied on due to a problem in the Linux kernel implementation and mlx4 RSS capabilities were not reported at all, hence the PMD had to make assumptions. Since both issues will be addressed simultaneously in Linux 4.15 (related patches already upstream) and likely backported afterward, UDP RSS support can be enabled by probing RSS capabilities. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2018-01-16 18:47:49 +01:00
Adrien Mazarguil	c7869af57e	net/mlx4: fix documentation in private structure A couple of structure fields are not Doxygen-friendly. Fixes: `5db1d36408` ("net/mlx4: restore Tx checksum offloads") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2018-01-16 18:47:49 +01:00
Moti Haimovsky	aee4a03fee	net/mlx4: enhance Rx packet type offloads This patch enhances the Rx packet type offload to also report the L4 protocol information in the hw ptype filled by the PMD for each received packet. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-11-07 12:47:13 +01:00
Adrien Mazarguil	0d03353077	net/mlx4: share memory region resources Memory regions assigned to hardware and used during Tx/Rx are mapped to mbuf pools. Each Rx queue creates its own MR based on the mempool provided during queue setup, while each Tx queue looks up and registers MRs for all existing mbuf pools instead. Since most applications use few large mbuf pools (usually only a single one per NUMA node) common to all Tx/Rx queues, the above approach wastes hardware resources due to redundant MRs. This negatively affects performance, particularly with large numbers of queues. This patch therefore makes the entire MR registration common to all queues using a reference count. A spinlock is added to protect against asynchronous registration that may occur from the Tx side where new mempools are discovered based on mbuf data. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-11-03 21:30:41 +01:00
Ophir Munk	326d2cdf7b	net/mlx4: associate MR to MP in a short function Associate memory region to mempool (on data path) in a short function. Handle the less common case of adding a new memory region to mempool in a separate function. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-11-03 20:22:07 +01:00
Moti Haimovsky	5db1d36408	net/mlx4: restore Tx checksum offloads This patch adds hardware offloading support for IPv4, UDP and TCP checksum calculation, including inner/outer checksums on supported tunnel types. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	078b8b452e	net/mlx4: add RSS flow rule action support This patch dissociates single-queue indirection tables and hash QP objects from Rx queue structures to relinquish their control to users through the RSS flow rule action, while simultaneously allowing multiple queues to be associated with RSS contexts. Flow rules share identical RSS contexts (hashed fields, hash key, target queues) to save on memory and other resources. The trade-off is some added complexity due to reference counters management on RSS contexts. The QUEUE action is re-implemented on top of an automatically-generated single-queue RSS context. The following hardware limitations apply to RSS contexts: - The number of queues in a group must be a power of two. - Queue indices must be consecutive, for instance the [0 1 2 3] set is allowed, however [3 2 1 0], [0 2 1 3] and [0 0 1 1 2 3 3 3] are not. - The first queue of a group must be aligned to a multiple of the context size, e.g. if queues [0 1 2 3 4] are defined globally, allowed group combinations are [0 1] and [2 3]; groups [1 2] and [3 4] are not supported. - RSS hash key, while configurable per context, must be exactly 40 bytes long. - The only supported hash algorithm is Toeplitz. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	fc4e66649a	net/mlx4: convert Rx path to work queues Work queues (WQs) are lower-level than standard queue pairs (QPs). They are dedicated to one traffic direction and have to be used in conjunction with indirection tables and special "hash" QPs to get the same level of functionality. These extra objects however are the building blocks for RSS support brought by subsequent commits, as a single "hash" QP can manage several WQs through an indirection table according to a hash algorithm and other parameters. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	eacaac7bae	net/mlx4: restore promisc and allmulti support Implement promiscuous and all multicast through internal flow rules automatically generated according to the configured mode. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	30695adbdd	net/mlx4: add VLAN filter configuration support This commit brings back VLAN filter configuration support without any artificial limitation on the number of simultaneous VLANs that can be configured (previously 127). Also thanks to the fact it does not rely on fixed per-queue arrays for potential Verbs flow handle storage anymore, this version wastes a lot less memory (previously 128 * 127 * pointer size, i.e. 130 kiB per Rx queue, only one of which actually had any use for this room: the RSS parent queue). The number of internal flow rules generated still depends on the number of configured MAC addresses times that of configured VLAN filters though. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	1437784b03	net/mlx4: add MAC addresses configuration support This commit brings back support for configuring up to 128 MAC addresses on a port through internal flow rules automatically generated on demand. Unlike its previous incarnation, the necessary extra flow rule for broadcast traffic does not consume an entry from the MAC array anymore. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	bdcad2f484	net/mlx4: refactor internal flow rules When not in isolated mode, a flow rule is automatically configured by the PMD to receive traffic addressed to the MAC address of the device. This somewhat duplicates flow API functionality. Remove legacy support for internal flow rules to instead handle them through the flow API implementation. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	d3a7e09234	net/mlx4: allocate drop flow resources on demand Verbs QP and CQ resources for drop flow rules do not need to be permanently allocated, only when at least one rule needs them. Besides, struct rte_flow_drop is outside the mlx4 PMD name space and should never have been defined there. struct rte_flow is currently the only exception to this rule. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:48 +01:00
Adrien Mazarguil	ed0cc677ad	net/mlx4: enhance header files comments Add missing comments and fix those not Doxygen-friendly. Since the private structure definition is modified, use this opportunity to add one remaining missing include required by one of its fields (sys/queue.h for LIST_HEAD()). Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:47 +01:00
Adrien Mazarguil	a9cfedf39d	net/mlx4: replace bit-field type Make clear it's 32-bit wide. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2017-10-13 01:18:47 +01:00
Matan Azrad	d5b0924ba6	ethdev: add return value to stats get dev op The stats_get dev op API doesn't include return value, so PMD cannot return an error in case of failure at stats getting process time. Since PCI devices can be removed and there is a time between the physical removal to the RMV interrupt, the user may get invalid stats without any indication. This patch changes the stats_get API return value to be int instead of void. All the net PMDs stats_get dev ops are adjusted by this patch. Signed-off-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2017-10-12 01:52:49 +01:00
Adrien Mazarguil	be65fdcbfb	net/mlx4: rely on ethdev for Tx/Rx queue arrays Allocation and management of Tx/Rx queue arrays is done by wrappers at the ethdev level. The resulting information is copied to the private structure while configuring the device, where it is managed separately by the PMD. This is redundant and consumes space in the private structure. Relying more on ethdev also means there is no need to protect the PMD against burst function calls while closing the device anymore. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-10-06 02:49:48 +02:00
Adrien Mazarguil	37491c7f8f	net/mlx4: clean up includes and comments Add missing includes and sort them, then update/remove comments around them for consistency. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-10-06 02:49:48 +02:00
Adrien Mazarguil	655588afc8	net/mlx4: separate memory management functions No impact on functionality. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2017-10-06 02:49:48 +02:00
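
Note on commit 078b8b452e (net/mlx4: add RSS flow rule action support): the hardware limitations listed in that message can be read as a simple validity check on a queue group. The following is a minimal sketch of that reading, not code from the mlx4 PMD. The function name, the constant name and the parameter layout are hypothetical and chosen only for illustration; the Toeplitz-only restriction is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: the commit message only states "exactly 40 bytes". */
#define EXAMPLE_RSS_KEY_LEN 40

/*
 * Hypothetical helper (not part of the mlx4 PMD): check a queue group
 * against the RSS context limitations described in commit 078b8b452e.
 */
static bool
example_rss_group_is_valid(const uint16_t *queues, size_t n, size_t key_len)
{
	size_t i;

	/* The number of queues in a group must be a power of two. */
	if (queues == NULL || n == 0 || (n & (n - 1)) != 0)
		return false;
	/* The RSS hash key must be exactly 40 bytes long. */
	if (key_len != EXAMPLE_RSS_KEY_LEN)
		return false;
	/* The first queue must be aligned to a multiple of the group size. */
	if (queues[0] % n != 0)
		return false;
	/* Queue indices must be consecutive and in ascending order. */
	for (i = 1; i < n; ++i)
		if (queues[i] != queues[0] + i)
			return false;
	return true;
}
```

Under these assumptions, the groups [0 1 2 3], [0 1] and [2 3] pass the check, while [3 2 1 0], [0 2 1 3], [1 2] and [3 4] are rejected, which matches the examples given in the commit message.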


97 Commits