Commit Graph

97 Commits

Author SHA1 Message Date
Ivan Ilchenko
ca041cd44f ethdev: change allmulticast callbacks to return status
Enabling/disabling of allmulticast mode is not always successful and
it should be taken into account to be able to handle it properly.

When correct return status is unclear from driver code, -EAGAIN is used.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2019-10-07 15:00:55 +02:00
Igor Romanov
9970a9ad07 ethdev: make stats and xstats reset callbacks return int
Change return value of the callbacks from void to int. Make
implementations across all drivers return negative errno
values in case of error conditions.

Both callbacks are updated together because a large number of
drivers assign the same function to both callbacks.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 15:00:54 +02:00
Andrew Rybchenko
9039c81257 ethdev: change promiscuous callbacks to return status
Enabling/disabling of promiscuous mode is not always successful and
it should be taken into account to be able to handle it properly.

When correct return status is unclear from driver code, -EAGAIN is used.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2019-10-07 15:00:54 +02:00
Ivan Ilchenko
bdad90d12e ethdev: change device info get callback to return int
Change eth_dev_infos_get_t return value from void to int.
Make eth_dev_infos_get_t implementations across all drivers to return
negative errno values if case of error conditions.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 14:45:35 +02:00
Stephen Hemminger
164cad78b0 net/mlx4: fix crash on info query in secondary process
mlx4_dev_info_get calls mlx4_get_ifname, but mlx4_get_ifname
uses priv->ctx which is not a valid pointer in a secondary
process. The fix is to cache the value in primary.

In the primary process, get and store the interface index of
the device so that secondary process can see it.

Bugzilla ID: 320
Fixes: 61cbdd4194 ("net/mlx4: separate device control functions")
Cc: stable@dpdk.org

Reported-by: Suyang Ju <sju@paloaltonetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matan Azrad <matan@mellanox.com>
2019-08-06 17:42:12 +02:00
Dekel Peled
187ec068d5 net/mlx4: fix typo in comment
A spelling mistake was found in comment.
This patch fixes it.

Fixes: 8e49376400 ("net/mlx4: add external allocator for Verbs object")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23 14:31:36 +02:00
Anatoly Burakov
edf73dd330 ipc: handle unsupported IPC in action register
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.

For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.

For secondary processes, IPC will be crucial, so leave all
of the error handling as is.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-06-05 11:27:36 +02:00
Olivier Matz
35b2d13fd6 net: add rte prefix to ether defines
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.

Do not update the command line library to avoid adding a dependency to
librte_net.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Olivier Matz
6d13ea8e8e net: add rte prefix to ether structures
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.

Do not update the command line library to avoid adding a dependency to
librte_net.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 13:34:45 +02:00
Adrien Mazarguil
138a740c45 net/mlx4: support multicast address list interface
Since this driver does not distinguish unicast/multicast addresses,
applications could always rely on the standard MAC add/remove/set
interface to configure both types.

As a result, the multicast address list interface never got implemented
(rte_eth_dev_set_mc_addr_list()) however PMD-agnostic applications still
rely on it for compatibility reasons; a wrapper is therefore required.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03 18:45:23 +02:00
Yongseok Koh
97d37d2c1f net/mlx4: remove device register remap
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx4_proc_priv *)rte_eth_devices[port_id].process_private

The actual UAR table follows the data structure and the table is used
for both Tx and Rx.

For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX4_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.

For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
0b259b8e96 net/mlx4: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
f4efc0eb97 net/mlx4: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx4_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smalle
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
0203d33a10 net/mlx4: support secondary process
In order to support secondary process, a few features are required.

a) rdma-core library should allocate device resources using DPDK's
   memory allocator.

b) UAR should be remapped for secondary processes. Currently, in order
   not to use different data structure for secondary processes, PMD
   tries to reserve identical virtual address space for both primary
   and secondary processes.

c) IPC channel is necessary, which can be easily set with rte_mp APIs.
   Through the channel, Verbs command FD is delivered to the secondary
   process and the device stop/start event is also broadcast from
   primary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
8e49376400 net/mlx4: add external allocator for Verbs object
To support secondary process, the memory allocated by library such as
completion rings (CQ) and buffer rings (WQ) must be manageable by EAL,
in order to share it with secondary processes. With new changes in
rdma-core and kernel driver, it is possible to provide an external
allocator to the library layer for this purpose. All such resources
will now be allocated within DPDK framework.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
099c2c5376 net/mlx4: change device reference for secondary process
rte_eth_devices[] is not shared between primary and secondary process,
but a static array to each process. The reverse pointer of device
(priv->dev) becomes invalid if mlx4 supports secondary process.
Instead, priv has the pointer to shared data of the device,
  struct rte_eth_dev_data *dev_data;

Two macros are added,
  #define PORT_ID(priv) ((priv)->dev_data->port_id)
  #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])

Cc: stable@dpdk.org

Suggested-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Thomas Monjalon
dbeba4cf18 net/mlx: prefix private structure
The private structure stored in rte_eth_dev->data->dev_private
was named "struct priv".
In order to ease code browsing, the structure is renamed
"struct mlx[45]_priv".

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-03-01 18:17:35 +01:00
Thomas Monjalon
714bf46ebb net/mlx: support firmware version query
The API function rte_eth_dev_fw_version_get() is querying drivers
via the operation callback fw_version_get().
The implementation of this operation is added for mlx4 and mlx5.
Both functions are copying the same ibverbs field fw_ver
which is retrieved when calling ibv_query_device[_ex]()
during the port probing.

It is tested with command "drvinfo" of examples/ethtool/.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-02-13 12:55:38 +01:00
Moti Haimovsky
2b4e423fd4 net/mlx4: check RSS queues number limitation
This patch verifies that the number of Rx queues configured for RSS
is supported by the device hardware.
RSS support in mlx4 requires contiguous chunk of QPs to be reserved,
there is a hardware limitation on the amount of contiguous QPs which
is reported by the hardware. Ignoring this value will cause Rx queues
creation to fail.

Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2018-08-02 12:34:16 +02:00
Moti Haimovsky
ba576975a8 net/mlx4: support hardware TSO
Implement support for hardware TSO.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2018-07-10 14:02:57 +02:00
Yongseok Koh
9797bfcce1 net/mlx4: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx4_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx4_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
2d684b911d net/mlx4: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Adrien Mazarguil
84a684862f net/mlx4: fix Rx resource leak in case of error
When creation of a flow rule fails during dev_start(), the usage count of
the common RSS context is not decremented, which triggers an assertion
failure in debug mode during dev_close().

This is addressed by tracking the initialization status of the common RSS
context in order to add missing cleanup code.

A similar issue exists in mlx4_rxq_attach(), where usage count is
incremented on a Rx queue but not released in case of error. This may lead
to the above issue since RSS contexts created by flow rules attach
themselves to Rx queues, incrementing their usage count.

Fixes: 5697a41421 ("net/mlx4: relax Rx queue configuration order")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-05-02 19:28:48 +02:00
Olivier Matz
caccf8b318 ethdev: return diagnostic when setting MAC address
Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
return code is added to notify the caller (librte_ether) if an error
occurred in the PMD.

The new default MAC address is now copied in dev->data->mac_addrs[0]
only if the operation is successful.

The patch also updates all the PMDs accordingly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-14 00:43:30 +02:00
Ophir Munk
de1df14e6e net/mlx4: support CRC strip toggling
Previous to this commit mlx4 CRC stripping was executed by default and
there was no verbs API to disable it.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:43:30 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Olivier Matz
82092c8734 net/mlx4: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-02-01 02:33:04 +01:00
Moti Haimovsky
fc1b5ec522 net/mlx4: fix removal detection of stopped port
In failsafe device start can be called for ports/devices that
had been plugged out.
The mlx4 PMD detects device removal by listening to the device RMV
events, when the mlx4 port is being stopped, the PMD no longer
listens to these events causing the PMD to stop detecting device
removals.
This patch fixes this issue by moving installation of the interrupt
handler to device configuration, and toggle only the Rx-queue
interrupts on start/stop.

Fixes: a6e8b01c3c ("net/mlx4: compact interrupt functions")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-01-30 10:20:35 +01:00
Ferruh Yigit
ffc905f3b8 ethdev: separate driver APIs
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.

There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.

More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-22 01:26:49 +01:00
Matan Azrad
cdf4ec6eaa net/mlx4: support a device removal check operation
Add support to get removal status of mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
2018-01-21 21:09:41 +01:00
Adrien Mazarguil
55e8991e31 net/mlx4: restore inner VXLAN RSS support
Inner VXLAN RSS was supported and performed by default prior to the entire
mlx4 refactoring that occurred in DPDK 17.11, however so far the new Verbs
RSS API did not provide means to enable it. This will be addressed in
Linux 4.15 and in RDMA core.

Thanks to RSS capabilities, the PMD can now probe for its support and
enable it again by default.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
024e87bef4 net/mlx4: restore UDP RSS by probing capabilities
Until now, UDP RSS support could not be relied on due to a problem in the
Linux kernel implementation and mlx4 RSS capabilities were not reported at
all, hence the PMD had to make assumptions.

Since both issues will be addressed simultaneously in Linux 4.15 (related
patches already upstream) and likely backported afterward, UDP RSS support
can be enabled by probing RSS capabilities.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
c7869af57e net/mlx4: fix documentation in private structure
A couple of structure fields are not Doxygen-friendly.

Fixes: 5db1d36408 ("net/mlx4: restore Tx checksum offloads")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Moti Haimovsky
aee4a03fee net/mlx4: enhance Rx packet type offloads
This patch enhances the Rx packet type offload to also report the L4
protocol information in the hw ptype filled by the PMD for each received
packet.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-07 12:47:13 +01:00
Adrien Mazarguil
0d03353077 net/mlx4: share memory region resources
Memory regions assigned to hardware and used during Tx/Rx are mapped to
mbuf pools. Each Rx queue creates its own MR based on the mempool
provided during queue setup, while each Tx queue looks up and registers
MRs for all existing mbuf pools instead.

Since most applications use few large mbuf pools (usually only a single
one per NUMA node) common to all Tx/Rx queues, the above approach wastes
hardware resources due to redundant MRs. This negatively affects
performance, particularly with large numbers of queues.

This patch therefore makes the entire MR registration common to all
queues using a reference count. A spinlock is added to protect against
asynchronous registration that may occur from the Tx side where new
mempools are discovered based on mbuf data.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 21:30:41 +01:00
Ophir Munk
326d2cdf7b net/mlx4: associate MR to MP in a short function
Associate memory region to mempool (on data path) in a short function.
Handle the less common case of adding a new memory region to mempool
in a separate function.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:07 +01:00
Moti Haimovsky
5db1d36408 net/mlx4: restore Tx checksum offloads
This patch adds hardware offloading support for IPv4, UDP and TCP checksum
calculation, including inner/outer checksums on supported tunnel types.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
078b8b452e net/mlx4: add RSS flow rule action support
This patch dissociates single-queue indirection tables and hash QP objects
from Rx queue structures to relinquish their control to users through the
RSS flow rule action, while simultaneously allowing multiple queues to be
associated with RSS contexts.

Flow rules share identical RSS contexts (hashed fields, hash key, target
queues) to save on memory and other resources. The trade-off is some added
complexity due to reference counters management on RSS contexts.

The QUEUE action is re-implemented on top of an automatically-generated
single-queue RSS context.

The following hardware limitations apply to RSS contexts:

- The number of queues in a group must be a power of two.
- Queue indices must be consecutive, for instance the [0 1 2 3] set is
  allowed, however [3 2 1 0], [0 2 1 3] and [0 0 1 1 2 3 3 3] are not.
- The first queue of a group must be aligned to a multiple of the context
  size, e.g. if queues [0 1 2 3 4] are defined globally, allowed group
  combinations are [0 1] and [2 3]; groups [1 2] and [3 4] are not
  supported.
- RSS hash key, while configurable per context, must be exactly 40 bytes
  long.
- The only supported hash algorithm is Toeplitz.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
fc4e66649a net/mlx4: convert Rx path to work queues
Work queues (WQs) are lower-level than standard queue pairs (QPs). They are
dedicated to one traffic direction and have to be used in conjunction with
indirection tables and special "hash" QPs to get the same level of
functionality.

These extra objects however are the building blocks for RSS support brought
by subsequent commits, as a single "hash" QP can manage several WQs through
an indirection table according to a hash algorithm and other parameters.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
eacaac7bae net/mlx4: restore promisc and allmulti support
Implement promiscuous and all multicast through internal flow rules
automatically generated according to the configured mode.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
30695adbdd net/mlx4: add VLAN filter configuration support
This commit brings back VLAN filter configuration support without any
artificial limitation on the number of simultaneous VLANs that can be
configured (previously 127).

Also thanks to the fact it does not rely on fixed per-queue arrays for
potential Verbs flow handle storage anymore, this version wastes a lot less
memory (previously 128 * 127 * pointer size, i.e. 130 kiB per Rx queue,
only one of which actually had any use for this room: the RSS parent
queue).

The number of internal flow rules generated still depends on the number of
configured MAC addresses times that of configured VLAN filters though.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
1437784b03 net/mlx4: add MAC addresses configuration support
This commit brings back support for configuring up to 128 MAC addresses on
a port through internal flow rules automatically generated on demand.

Unlike its previous incarnation, the necessary extra flow rule for
broadcast traffic does not consume an entry from the MAC array anymore.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
bdcad2f484 net/mlx4: refactor internal flow rules
When not in isolated mode, a flow rule is automatically configured by the
PMD to receive traffic addressed to the MAC address of the device. This
somewhat duplicates flow API functionality.

Remove legacy support for internal flow rules to instead handle them
through the flow API implementation.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
d3a7e09234 net/mlx4: allocate drop flow resources on demand
Verbs QP and CQ resources for drop flow rules do not need to be permanently
allocated, only when at least one rule needs them.

Besides, struct rte_flow_drop is outside the mlx4 PMD name space and should
never have been defined there. struct rte_flow is currently the only
exception to this rule.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
ed0cc677ad net/mlx4: enhance header files comments
Add missing comments and fix those not Doxygen-friendly.

Since the private structure definition is modified, use this opportunity to
add one remaining missing include required by one of its fields
(sys/queue.h for LIST_HEAD()).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Adrien Mazarguil
a9cfedf39d net/mlx4: replace bit-field type
Make clear it's 32-bit wide.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Matan Azrad
d5b0924ba6 ethdev: add return value to stats get dev op
The stats_get dev op API doesn't include return value, so PMD cannot
return an error in case of failure at stats getting process time.

Since PCI devices can be removed and there is a time between the
physical removal to the RMV interrupt, the user may get invalid stats
without any indication.

This patch changes the stats_get API return value to be int instead of
void.

All the net PMDs stats_get dev ops are adjusted by this patch.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:52:49 +01:00
Adrien Mazarguil
be65fdcbfb net/mlx4: rely on ethdev for Tx/Rx queue arrays
Allocation and management of Tx/Rx queue arrays is done by wrappers at the
ethdev level. The resulting information is copied to the private structure
while configuring the device, where it is managed separately by the PMD.

This is redundant and consumes space in the private structure.

Relying more on ethdev also means there is no need to protect the PMD
against burst function calls while closing the device anymore.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
37491c7f8f net/mlx4: clean up includes and comments
Add missing includes and sort them, then update/remove comments around them
for consistency.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
655588afc8 net/mlx4: separate memory management functions
No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00