Commit Graph

1099 Commits

Author SHA1 Message Date
Yongseok Koh
d588f12ffb net/mlx: fix library search in meson build
If MLNX_OFED is installed, there's no .pc file installed for libraries and
dependency() can't find libraries by pkg-config. By adding fallback of
using cc.find_library(), libraries are properly located.

Fixes: e30b4e566f ("build: improve dependency handling")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-04-18 18:22:42 +02:00
Yongseok Koh
120dc4a7dc net/mlx5: remove device register remap
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private

The actual UAR table follows the data structure and the table is used
for both Tx and Rx.

For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.

For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
d5c900d1dd net/mlx5: remove redundant queue index
Queue index is redundantly stored for both Rx and Tx structures.
E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single
storage - rxq->idx and txq->idx.

Also, rxq and txq are moved to the beginning of its control structure
(rxq_ctrl and txq_ctrl) for cacheline alignment.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
227684feb8 net/mlx5: fix recursive inclusion of header file
mlx5.h includes mlx5_rxtx.h and mlx5_rxtx.h includes mlx5.h recursively.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Ori Kam
bbda883ca0 net/mlx5: fix build on Arm
In case of cross compilation on aarch64 we must add include for
stdlib in order to use the free function.

Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
942d13e6e7 net/mlx5: fix sharing context destroy order
At the mlx5 device closing the shared IB context was destroyed
before cleanup routines completion. As it was found on some
setups (Netlink fails with old kernel drivers and we have to use
sysfs to retrieve interface index, this requires IB device name,
which is stored in shared context) the mlx5_nl_mac_addr_flush()
requires IB device name, and if shared context is removed it
causes the segmentation fault.

Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
9c2bbd0488 net/mlx5: fix device probing for old kernel drivers
Retrieving network interface index via Netlink fails in
case of old ib_core kernel driver installed - mlx5_nl_ifindex()
routine fails due to RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not
supported by the old driver.

The patch allowing to retrieve the network interface index and
name via Netlink [1]. So, the problem depends on ib_core module
version - 4.16 supports getting ifindex via Netlink, 4.15 does not.

This error was ignored in previous versions of MLX5 PMD probing
routine. For single device ifindex was retrieved via sysfs
and link control was not lost, so problem just was not noticed.
In order to support MLX5 PMD functioning over old kernel driver
this patch adds ifindex retrieving via sysfs into probing routine.
It is worth to note this method works for master/standalone
device only.

[1] https://www.spinics.net/lists/linux-rdma/msg62948.html
    Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)

Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
ae4eb7dc79 net/mlx5: fix typos in comments
Fixes: 299d7dc28c ("net/mlx5: add representor recognition on Linux 5.x")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
79e35d0d59 net/mlx5: share Direct Rules/Verbs flow related structures
Direct Rules/Verbs related structures are moved to
the shared context:
  - rx/tx namespaces, shared by master and representors
  - rx/tx flow tables
  - matchers
  - encap/decap action resources
  - flow tags (MARK actions)
  - modify action resources
  - jump tables

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
b2177648b8 net/mlx5: add Direct Rules flow data alloc/free routines
We are going to share the Direct Rules and Direct Verbs flow
device data structures between master and representors in the
E-Switch configurations over multiport IB device.

The code of initializing and destroying these data is
moved to dedicated routines, this is just a preparation
step for actual data sharing.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
684b9a1b1f net/mlx5: support jump action
When using Direct Rules we can add actions to jump between tables.
This is extra useful since rule insertion rate is much higher on other
tables compared to table zero.

If no group is selected the rule is added to group 0.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
4f84a19779 net/mlx5: add Direct Rules API
Adds calls to the Direct Rules API inside the glue functions.
Due to difference in parameters between the Direct Rules and Direct
Verbs some of the glue functions API was updated.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
cbb66daa3c net/mlx5: prepare Direct Verbs for Direct Rule
This is the first patch of a series that is designed to enable the
Direct Rules API.

The main difference between Direct Verbs and Direct Rules from API
perspective is that in Direct Rules each action has it's own create
function and the object itself is of type void.

In this patch I'm adding functions to generate actions that currently
are done without create action, and I'm changing the action type to be
void *, so in next patches only the glue functions will need to change.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
f54aeb3ec0 net/mlx5: fix flow counters using devx
The API that was defined in OFED 4.5 was replaced both in OFED 4.6 and
in upstream.

This commit updates the API to match the upstream one.

Fixes: f5bf91de73 ("net/mlx5: support flow counters using devx")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Thomas Monjalon
d874a4eed5 net/mlx5: use port sibling iterators
Iterating over siblings was done with RTE_ETH_FOREACH_DEV()
which skips the owned ports.
The new iterators RTE_ETH_FOREACH_DEV_SIBLING()
and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
c18cf501a7 net/mlx5: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
dceb502942 net/mlx5: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
207fe7ac72 net/mlx5: fix external memory registration
Secondary process is not allowed to register MR due to a restriction of
library and kernel driver.

Fixes: 7e43a32ee0 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
3d1f3c7c83 net/mlx: remove debug messages on datapath
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
2aac5b5d11 net/mlx5: sync stop/start with secondary process
Rx/Tx burst function pointers are stored in the rte_eth_dev structure,
which is local to a process. Even though primary process replaces the
function pointers, secondary will not run the new ones. With rte_mp
APIs, primary can easily broadcast a request to stop/start the datapath
of secondary processes.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
7be600c8d8 net/mlx5: rework PMD global data init
There's more need to have PMD global data structure. This should be
initialized once per a process regardless of how many PMD instances are
probed. mlx5_init_once() is called during probing and make sure all the
init functions are called once per a process. Currently, such global
data and its initialization functions are even scattered. Rather than
'extern'-ing such variables and calling such functions one by one making
sure it is called only once by checking the validity of such variables, it
will be better to have a global storage to hold such data and a
consolidated function having all the initializations. The existing shared
memory gets more extensively used for this purpose. As there could be
multiple secondary processes, a static storage (local to process) is also
added.

As the reserved virtual address for UAR remap is a PMD global resource,
this doesn't need to be stored in the device priv structure, but in the
PMD global data.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
9a8ab29b84 net/mlx5: replace IPC socket with EAL API
Socket API is used for IPC in order for secondary process to acquire
Verb command file descriptor. The FD is used to remap UAR address.
The multi-process APIs (rte_mp) in EAL are newly introduced.
mlx5_socket.c is replaced with mlx5_mp.c, which uses the new APIs.

As it is PMD global infrastructure, only one IPC channel is established.
All the IPC message types may have port_id in the message if there is
need to reference a specific device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
3ebe658059 net/mlx5: fix memory event on secondary process
As the memory event is propagated to secondary processes, the event is
processed redundantly. This should be processed once because the data
structure used for MR and the event is global across the processes.

Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Dekel Peled
de90612f40 net/mlx5: fix errno typos in comments
Correct typing mistake in several locations:
ernno ==> errno

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
9c55c6bd86 net/mlx5: revert mbuf address calculation for x86
When replenishing mbufs on Rx, buffer address (mbuf->buf_addr) should be
loaded. non-x86 processors (mostly RISC such as ARM and Power) are more
vulnerable to load stall. For x86, reducing the number of instructions
seems to matter most.

For x86, this is simply a load but for other architectures, it is
calculated from the address of mbuf structure by rte_mbuf_buf_addr()
without having to load the first cacheline of the mbuf.

Fixes: 12d468a62b ("net/mlx5: fix instruction hotspot on replenishing Rx buffer")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
0fe3f18f78 net/mlx5: add source vport match to the ingress rules
For E-Switch configurations over multiport Infiniband devices
we should add source vport match to correctly distribute
traffic between representors.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
028b2a28c3 net/mlx5: update event handler for multiport IB devices
This patch modifies asynchronous event handler to support multiport
Infiniband devices. Handler queries the event parameters, including
event source port index, and invokes the handler for specific
devices with appropriate port_id.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
53e5a82fd1 net/mlx5: update install/uninstall event handlers
We are implementing the support for multiport Infiniband device
with representors attached to these multiple ports. Asynchronous
device event notifications (link status change, removal event, etc.)
should be shared between ports. We are going to implement shared
event handler and this patch introduces appropriate device
structure changes and updated event handler install and uninstall
routines.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
1e14090e31 net/mlx5: provide IB port for the object being created
The code is updated to provide IB port index for the Verbs
objects being created - QPs and Verbs Flows.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
f048f3d479 net/mlx5: switch to the shared IB device context
The code is updated to use the shared IB device context and
device handles. The IB device context is shared between
reprentors created over the single multiport IB device. All
Verbs and DevX objects will be created within this shared context.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
d485cdca01 net/mlx5: switch to the shared context IB attributes
The code is updated to use the shared IB device attributes,
located in the shared IB context. It saves some memory if
there are representors created over the single Infiniband
device with multiple ports.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
1b782252cb net/mlx5: switch to the shared protection domain
The PMD code is updated to use Protected Domain from the
shared IB device context. The Domain is shared between
all devices belonging to the same multiport Infiniband device.
If IB device has only one port, the PD is not shared, because
there is only ethernet device created over IB one.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
9c0a9eed37 net/mlx5: switch to the names in the shared IB context
The IB device names are moved from device private data
to the shared context, code involving the names is updated.
The IB port index treatment is added where it is relevant.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
17e19bc4dd net/mlx5: add IB shared context alloc/free functions
The Mellanox NICs support SR-IOV and have E-Switch feature.
When SR-IOV is set up in switchdev mode and E-Switch is enabled
we have so called VF representors in the system. All representors
belonging to the same E-Switch are created on the basis of the
single PCI function and with current implementation each representor
has its own dedicated Infiniband device and operates within its
own Infiniband context. It is proposed to provide representors
as ports of the single Infiniband device and operate on the
shared Infiniband context saving various resources. This patch
introduces appropriate structures.

Also the functions to allocate and free shared IB context for
multiport are added. The IB device context, Protection Domain,
device attributes, Infiniband names are going to be relocated
to the shared structure from the device private one.
mlx5_dev_spawn() is updated to support shared context.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
ad74bc6195 net/mlx5: support multiport IB device during probing
mlx5_pci_probe() routine is refactored to probe the ports
of found Infiniband devices. All active ports (with attached
network interface), belonging to the same Infiniband device
will use the single shared Infiniband context of that device.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
bbfad6427b net/mlx5: add getting IB ports number for multiport IB
There is the routine mlx5_nl_portnum() added to get
the number of ports of multiport Infiniband device.
It is assumed the Uplink/VF representors are attached
on these ports.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
e505508a38 net/mlx5: modify get ifindex routine for multiport IB
There is the routine mlx5_nl_ifindex() returning the
network interface index associated with Infiniband device.
We are going to support multiport IB devices, now function
takes the IB port as argument and returns ifindex associated
with tuple <IB device, IB port>

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko
299d7dc28c net/mlx5: add representor recognition on Linux 5.x
The master device and VF representors were distinguished by
presence of port name, master device did not have one. The new Linux
kernels starting from 5.0 provide the port name for master device
and the implemented representor recognizing method does not work.
The new recognizing method is based on querying the VF number,
has been created on the base of the device.

The IFLA_NUM_VF attribute is returned by kernel if IFLA_EXT_MASK
attribute is specified in the Netlink request message.

Also the presence check of device symlink in device sysfs folder
is added to distinguish representors with sysfs based method.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-29 17:25:32 +01:00
Ali Alnubani
e3bcaf3a0f net/mlx5: add missing return value check
This patch fixes the build failure with message:
  drivers/net/mlx5/mlx5_ethdev.c: In function ‘mlx5_sysfs_switch_info’:
  drivers/net/mlx5/mlx5_ethdev.c:1381:3:
    error: ignoring return value of ‘fscanf’, declared with attribute
           warn_unused_result [-Werror=unused-result]
  fscanf(file, "%s", port_name);
    ^

Which reproduces on Ubuntu 16.04 LTS with
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609.

Fixes: b2f3a38101 ("net/mlx5: support new representor naming format")

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Dekel Peled <dekelp@mellanox.com>
2019-03-29 17:25:32 +01:00
Shahaf Shuler
989e999d93 net/mlx5: support PCI device DMA map and unmap
The implementation reuses the external memory registration work done by
commit[1].

Note about representors:

The current representor design will not work
with those map and unmap functions. The reason is that for representors
we have multiple IB devices share the same PCI function, so mapping will
happen only on one of the representors and not all of them.

While it is possible to implement such support, the IB representor
design is going to be changed during DPDK19.05. The new design will have
a single IB device for all representors, hence sharing of a single
memory region between all representors will be possible.

[1]
commit 7e43a32ee0
("net/mlx5: support externally allocated static memory")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-30 16:48:58 +01:00
Shahaf Shuler
0f132546a8 net/mlx5: refactor external memory registration
Move the memory region creation to a separate function to
prepare the ground for the reuse of it on the PCI driver map and unmap
functions.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-30 16:48:57 +01:00
Dekel Peled
b2f3a38101 net/mlx5: support new representor naming format
Kernel update [1] introduce new format of representors names.
This patch implements RFC [2], updating MLX5 PMD to support the new
format, while maintaining support of the existing format.

[1] https://github.com/torvalds/linux/commit/c12ecc2
[2] http://mails.dpdk.org/archives/dev/2019-March/125676.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-20 18:15:42 +01:00
Shahaf Shuler
57c0e2494c net/mlx5: fix packet inline on Tx queue wraparound
Inlining a packet to WQE that cross the WQ wraparound, i.e. the WQE
starts on the end of the ring and ends on the beginning, is not
supported and blocked by the data path logic.

However, in case of TSO, an extra inline header is required before
inlining. This inline header is not taken into account when checking if
there is enough room left for the required inline size.
On some corner cases were
(ring_tailroom - inline header) < inline size < ring_tailroom ,
this can lead to WQE being written outsize of the ring buffer.

Fixing it by always assuming the worse case that inline of packet will
require the inline header.

Fixes: 3f13f8c23a ("net/mlx5: support hardware TSO")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-03-20 18:15:42 +01:00
Dekel Peled
fd350d3c9a net/mlx5: fix sync when handling Tx completions
Function mlx5_tx_complete() reads completion entry information
from Tx queue.
For some processors not having strongly-ordered memory model,
there has to be a memory barrier between reading the entry index
and the entry fields, in order to guarantee data is valid.

Fixes: 54d3fe948d ("net/mlx5: poll completion queue once per a call")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Dekel Peled
38f0a160b5 net/mlx5: fix hex dump of error completion
struct mlx5_cqe is defined in MLX5 PMD code (mlx5_prm.h).
It includes 64 bytes padding in case of (RTE_CACHE_LINE_SIZE == 128).

struct mlx5_err_cqe is defined in kernel, and doesn't include padding.

When running in debug mode, in case an error CQE is detected
it is printed using rte_hexdump().

The size of data to print should be sizeof(*cqe) instead of
sizeof(*err_cqe), to handle the case of (RTE_CACHE_LINE_SIZE == 128),
and print the full data in any case.

Fixes: c771499209 ("net/mlx5: extend debug logs verbosity")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Thomas Monjalon
09c9c4d23d net/mlx5: call generic strlcpy
The call to strlcpy uses either libc, libbsd or internal rte_strlcpy.
No need to call the DPDK flavor explicitly.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Viacheslav Ovsiienko
4fb27c1dfe net/mlx5: fix flow priorities probing error path
The mlx5 PMD probes the Verbs flow priorities supported with
ibv_create_flow() function. If rdma-core or kernel fails for
some reason, the returned error causes the drop queue is not
destroyed, and pd is locked by not freed resource.

Also the mlx5_flow_discover_priorities() returned negative value
as error, and this code was reported "as is", without sign
changing (eventually causing assert(err > 0)).

Fixes: 2815702bae ("net/mlx5: replace verbs priorities by flow")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-03-08 17:52:22 +01:00
Thomas Monjalon
dbeba4cf18 net/mlx: prefix private structure
The private structure stored in rte_eth_dev->data->dev_private
was named "struct priv".
In order to ease code browsing, the structure is renamed
"struct mlx[45]_priv".

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-03-01 18:17:35 +01:00
Luca Boccassi
e30b4e566f build: improve dependency handling
Whenever possible (if the library ships a pkg-config file) use meson's
dependency() function to look for it, as it will automatically add it
to the Requires.private list if needed, to allow for static builds to
succeed for reverse dependencies of DPDK. Otherwise the recursive
dependencies are not parsed, and users doing static builds have to
resolve them manually by themselves.
When using this API avoid additional checks that are superfluous and
take extra time, and avoid adding the linker flag manually which causes
it to be duplicated.

Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2019-02-27 12:13:54 +01:00
Thomas Monjalon
714bf46ebb net/mlx: support firmware version query
The API function rte_eth_dev_fw_version_get() is querying drivers
via the operation callback fw_version_get().
The implementation of this operation is added for mlx4 and mlx5.
Both functions are copying the same ibverbs field fw_ver
which is retrieved when calling ibv_query_device[_ex]()
during the port probing.

It is tested with command "drvinfo" of examples/ethtool/.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-02-13 12:55:38 +01:00
Dekel Peled
7f4019d370 net/mlx5: fix Tx metadata for multi-segment packet
Original patch implemented the use of match_metadata offload in the
different burst functions.
The concurrent use of match_metadata and multi_segs offloads was
not handled.

This patch updates function txq_scatter_v(), to pass metadata value
from mbuf to wqe, when indicated by offload flags.

Fixes: 6bd7fbd03c ("net/mlx5: support metadata as flow rule criteria")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-02-13 12:55:38 +01:00
Viacheslav Ovsiienko
c9e462d17b net/mlx5: fix build for armv8
Added <rte_cycles.h> inclusion, was not included on some
building setups (armv8).

Fixes: 71ab2d6472 ("net/mlx5: fix VXLAN port registration race condition")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-30 11:08:35 +01:00
Viacheslav Ovsiienko
a9c94cc050 net/mlx5: fix VXLAN without decap action for E-Switch
There is an intention to support VXLAN tunnel match without
hardware offloaded decapsulation, just to redirect ingress
tunnelled frame untouched. This small fix allows to specify
Flows with VXLAN VNI pattern and with or without following
decapsulation action.

Fixes: 251e8d02cf ("net/mlx5: add VXLAN to flow translate routine")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-27 12:33:47 +01:00
Viacheslav Ovsiienko
71ab2d6472 net/mlx5: fix VXLAN port registration race condition
E-Switch VXLAN tunneling rules require virtual VXLAN network
devices be created. These devices are managed by MLX5 PMD and
created/deleted dynamically.

Kernel creates the VXLAN devices and registers VXLAN UDP ports
to be hardware offloaded within the NIC kernel drivers. The
registration process is being performed into context of working
kernel thread and the race conditions might happen.

The VXLAN device is created and success code is returned to calling
application, but the UDP port registration process is not completed
yet and the next applied rule might be rejected by the driver with
ENOSUP code. This patch adds some timeout for new created devices,
allowing port registration process to be completed. The waiting
is performed once after device been created and first rule is being
applied.

Fixes: 95a464cecc ("net/mlx5: add E-switch VXLAN tunnel devices management")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-24 14:53:10 +01:00
Viacheslav Ovsiienko
0deb984fd7 net/mlx5: fix TC rule handle assignment
When tc rule is created via Netlink message application
can provide the unique rule value which can be accepted
by the kernel. Than rule is managed with this assigned
handle. It was found that kernel can reject the proposed
handle and assign its own handle value, the rule control
is lost, because application uses its initially prorosed
rule handle and knows nothing about handle been repleced.

The kernel can assign handle automatically, the application
can get the assigned handle value by specifying NLM_F_ECHO
flag in Netlink message when rule is being created. The
kernel sends back the full descriptor of rule and handle
can be retrieved from and stored by application for further
rule management.

Fixes: 57123c00c1 ("net/mlx5: add Linux TC flower driver for E-Switch flow")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-24 14:53:10 +01:00
Dekel Peled
15c8015587 net/mlx5: block RSS action without Rx queue
This patch modifies function mlx5_flow_validate_action_rss(), to
prevent the setting of rule with rss action, but without specifying
any queues.
For example:
flow create 0 ingress pattern end actions rss queues end / end

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-24 14:53:10 +01:00
Dekel Peled
ff160dbcba net/mlx5: allow port start with zero Rx queue
During port start, function mlx5_ctrl_flow_vlan() is called to create
default ingress flow rules.
For specific use-cases, a port can be used for Tx only.
In such case, number of Rx queues can be set to 0 to save resources,
hence the default ingress rules are irrelevant.

This patch modifies function mlx5_ctrl_flow_vlan() to avoid the
creation of the default ingress rules when number of Rx queues is 0.
It also includes update of validation functions for relevant actions,
mlx5_flow_validate_action_queue() and mlx5_flow_validate_action_rss(),
to prevent creation of flow rules with these actions when number of Rx
queues is 0.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-24 14:53:10 +01:00
Yongseok Koh
2014a7fbae net/mlx5: fix deprecated library API for Rx padding
In rdma-core library IBV_WQ_FLAG_RX_END_PADDING is renamed to
IBV_WQ_FLAGS_PCI_WRITE_END_PADDING. Way to query the capability is also
changed.

Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Reviewed-by: Erez Ferber <erezf@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-18 09:47:26 +01:00
Yongseok Koh
78c7a16daa net/mlx5: fix Rx packet padding
Rx packet padding is supposed to be set by an environment variable -
MLX5_PMD_ENABLE_PADDING, but it has been missing for some time by mistake.
Rather than using such a variable, a PMD parameter (rxq_pkt_pad_en) is
added instead.

Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Reviewed-by: Erez Ferber <erezf@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-18 09:47:26 +01:00
Yongseok Koh
12d468a62b net/mlx5: fix instruction hotspot on replenishing Rx buffer
On replenishing Rx buffers for vectorized Rx, mbuf->buf_addr isn't needed
to be accessed as it is static and easily calculated from the mbuf address.
Accessing the mbuf content causes unnecessary load stall and it is worsened
on ARM.

Fixes: 545b884b1d ("net/mlx5: fix buffer address posting in SSE Rx")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-15 02:40:40 +01:00
Viacheslav Ovsiienko
55c61fa714 net/mlx5: validate TOS and TTL on E-Switch
This patch adds the type-of-service and time-to-live IP header
fields validation on E-Switch, both for match pattern and
VXLAN encapsulation action IP header itesm. The E-Switch flows
will use the common mlx5_flow_validate_item_ipv4/6 routines
with added extra parameter, specifying the supported fields
mask.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
363fa2f296 net/mlx5: support TOS and TTL fields on E-Switch
This patch adds the type-of-service and time-to-live IP header
fields support on E-Switch. There match pattern for both fields
with masking is added. Also these fields can be set for VXLAN
tunnel encapsulation header.

This issue is critical for some Open VSwitch configuration
on overlayed (tunneled) networks, where the tos field can be
inherited from outer header to inner header.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
9d6d159a3f net/mlx5: add TOS and TTL flower match and tunnel keys
This patch is a preparation for adding the type-of-service and
time-to-live IP header fields support on E-Switch. There are
two types of keys added - one for match pattern, other for
tunnel encapsulation header.

This issue is critical for some Open VSwitch configuration
on overlayed (tunneled) networks, where the tos field can be
inherited from outer header to inner header.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
01925b8c64 net/mlx5: add RHEL-7.2 VXLAN device metadata workaround
RH7.2 with kernel 3.10.0-327 does not support VXLAN
devices metadata and IFLA_VXLAN_COLLECT_METADATA
key is neither defined nor supported. We must specify
VNI parameter, which will be actually ignored by kernel,
applied rules will be processed by mlx5 kernel driver
and the actual VNI from rules will be used.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
8ba325b051 net/mlx5: switch to detached VXLAN network devices
Current design uses the VXLAN virtual devices attached
to outer network interface for decapsulation. Kernel
allows to use non-attached devices, so now we can create
not attached device and use it both for encapsulation
and decapsulation. Devices management becomes simpler,
less VXLAN devices are created and used.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
80712ab8e9 net/mlx5: switch encap rules to use container
The VXLAN encapsulation neigh/local rules will use
the new introduced structure, which keeps the
rules lists, related to specified outer interface,
instead of attached VTEP structure. It allows us to
unbind VTEP structure from keeping the rules for
interface.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
18c3e6c90b net/mlx5: introduce encapsulation rules container
Currently the VXLAN encapsulation neigh/local rules
are stored in the list contained in the VTEP device
structure. Encapsulation VTEP device is attached to
outer interface and stored rules are related to this
underlying interface. We are going to use unattached
VXLAN devices for encapsulation (kernel does not use
attached interface to find egress one), so we should
introduce the structure to keep interface related
neigh/local rules instead of VTEP structure. This
patch introduces internal tcf_irule structure, and
its create/delete methods.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
4b837fe90e net/mlx5: optimize neigh and local encap rules search
This patch removes unnecessary local varialbles and optimizes
local and neigh encapsulation rules search.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
2398106073 net/mlx5: fix typos and code style
This patch fixes typos and codestyle issues in mlx5_flow_tcf.c file

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
3d14ad9be3 net/mlx5: support ethernet type for tunnels on E-Switch
This patch add support for inner and outer ethernet types for the
E-Switch Flows with tunnels. Inner and outer ethernet type match
can be specified with ethernet items, vlan items, or implicitly
deduced from IP address items. The tcm_info field in Netlink message
tcm structure is filled always with outer protocol.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
ea952e0273 net/mlx5: validate ethernet type on E-Switch
This patch updates the validation routine for the E-Switch Flows.
The ethernet type field can be specified within inner and outer
tunnel ethernet items, by vlan item or implicitly deduced from
IP address items. The validation routine checks all these items
and their combinations for mutual compatibility issues and possible
conflicts.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
78f5341d71 net/mlx5: support tunnel inner items on E-Switch
This patch updates the translation routine for the E-Switch Flows.
Inner tunnel pattern items are translated into Netlink message,
support for tunnel inner IP addresses (v4 or v6), IP protocol,
and TCP and UDP ports is added.

We are going to support Flows matching with outer tunnel items
and not containing the explicit tunnel decap action (this one
might be drop, redirect or table jump, for exapmle).
So we can not rely on presence of tunnel decap action in the
list to decide whether the Flow is for tunnel, instead we will
use the presence of tunnel item. Item translation is rebound
to presence of tunnel items, instead of relying on decap action.

There is no way to tell kernel driver the outer address type
(IPv4 or IPv6) but specify the address flower key. The outer
address key is put on Netlink with zero mask if there is no
RTE item is specified in the list.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:30 +01:00
Viacheslav Ovsiienko
9f4eb98f82 net/mlx5: validate tunnel inner items on E-Switch
This patch updates the validation routine for the E-Switch Flows.
The inner/outer item flags are added and set correctly, the
validation routine will accept and check the inner items
which follow the tunnel item (like VNI).

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Viacheslav Ovsiienko
6b1a9b65be net/mlx5: remove checks for outer tunnel items on E-Switch
This patch removes unnecessary outer tunnel parameters check in the
validation routine for the E-Switch Flows. IPv4/IPv6 may have any
spec and mask, and transferred to tc without changes, all checks
are performed by kernel.

We are going to support Flows matching with outer tunnel items
and not containing the explicit tunnel decap action (this one
might be drop, redirect or table jump, for exapmle). So we can
not rely on presence of tunnel decap action in the list to decide
whether the Flow is for tunnel, instead we will use the presence
of tunnel item (like RTE_FLOW_ITEM_TYPE_VXLAN) in the item list.
The tunnel pattern checks within Flow validation routine are
rebound to presence of tunnel item. VXLAN decap action checks
for presence of VXLAN VNI item.

The tunnel UDP item is checked at the point of processing the tunnel
item (i.e. VXLAN). We can not perform UDP item check as tunnel once
UDP item encountered in the list, because it is not known yet whether
the tunnel item follows. The pointer to UDP item is saved and
checked as outer ones if tunnel item found.

Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Thomas Monjalon
2c0dd7b69f config: add static linkage of mlx dependency
The libraries provided by rdma-core may be statically linked
if enabling CONFIG_RTE_IBVERBS_LINK_STATIC in the make-based build.
If CONFIG_RTE_BUILD_SHARED_LIB is disabled, the applications
will embed the mlx PMDs with ibverbs and the mlx libraries.
If CONFIG_RTE_BUILD_SHARED_LIB is enabled,
the mlx PMDs will embed ibverbs and the mlx libraries.

Support with meson may be added later.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Thomas Monjalon
72b934adce config: gather options for dlopen mlx dependency
Rename options CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS and
CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS to a single option
CONFIG_RTE_IBVERBS_LINK_DLOPEN.
Rename meson option enable_driver_mlx_glue to ibverbs_link.

There was no good reason for setting a different link option
for mlx4 and mlx5. Having a single common option makes it
easier to understand and unify make and meson systems.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Moti Haimovsky
f5bf91de73 net/mlx5: support flow counters using devx
This commit adds counters support when creating flows via direct
verbs. The implementation uses devx interface in order to create
query and delete the counters.
This support requires MLNX_OFED_LINUX-4.5-0.1.0.1 installation.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Moti Haimovsky
6de1ffaa41 net/mlx5: add devx functions to glue
This patch adds glue functions for operations:
  - dv_open_device.
  - devx object create, destroy, query and modify.
  - devx general command
The new operations depend on HAVE_IBV_DEVX_OBJ.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Moti Haimovsky
5f09e80cf8 net/mlx5: fix shared counter allocation logic
This commit fixes the logic for searching and allocating a shared
counter in mlx5_flow_verbs.
Now only the shared counters in the counters list are checked for
a match and not all the counters as before.

Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 17:44:29 +01:00
Yongseok Koh
952f4cf5f0 mbuf: remove deprecated macro
RTE_MBUF_INDIRECT() is replaced with RTE_MBUF_CLONED() and removed.
This macro was deprecated in release 18.05 when EXT_ATTACHED_MBUF was
introduced.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-01-14 16:37:36 +01:00
Wisam Jaddo
f0354d8423 net/mlx5: add ConnectX-6 device IDs
This commit includes the add of:
- ConnectX-6 device ID
- ConnectX-6 SRIOV device ID

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-03 13:07:06 +01:00
Dekel Peled
4bb14c83df net/mlx5: support modify header using Direct Verbs
This patch implements the set of actions to support offload
of packet header modifications to MLX5 NIC.

Implementation is based on RFC [1].

[1] http://mails.dpdk.org/archives/dev/2018-November/119971.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-03 12:56:43 +01:00
Yongseok Koh
39e98c21be net/mlx5: fix Multi-Packet RQ mempool free
When MPRQ mempool is freed, the pointer stored in priv structure must be
reset to null. Otherwise, the mempool can be freed again if the port is
restarted.

Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-12-21 16:22:41 +01:00
Dekel Peled
8dd569abad net/mlx5: fix validation of Rx queue number
Function mlx5_ctrl_flow_vlan() is used to set the rss rule in
MLX5 PMD, using priv->reta_idx_n as number of Rx queues.
This number is passed to mlx5_flow_validate_action_rss(), which
attempts to access the Rx queues at priv->rxqs.
In case priv->rxqs_n is 0, priv->rxqs is empty, and
mlx5_flow_validate_action_rss() will crash with segmentation fault.

priv->reta_idx_n can never be 0, even if priv->rxqs_n is set to 0.
But when priv->rxqs_n is set to 0, setting the rss rule is invalid.

This patch updates mlx5_ctrl_flow_vlan(), if priv->rxqs_n is 0 the
function will fail with EINVAL errno.

Fixes: 8086cf08b2 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-12-13 17:44:52 +00:00
Asaf Penso
16a25aa1b7 net/mlx5: fix function documentation
tso and vlan parameters were removed from the signature
of txq_mbuf_to_swp function.

The documentation of the function was not updated accordingly.

Remove the tso and vlan documentation to match the function signature.

Fixes: 8f6d9e13a9 ("net/mlx5: remove redundant checks")
Cc: stable@dpdk.org

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-12-13 16:32:10 +00:00
Tom Barbette
ce9494d76c net/mlx5: report imissed statistics
The imissed counters (number of packets dropped because the queues were
full) were actually reported through xstats as "rx_out_of_buffer"
but was not reported through stats.

Following a recent discussion on the ML, as there is no way to tell the
user if a counter is implemented or not, this should be considered a
bug. For example, user looking at imissed will think the packets are
lost before reaching the device.

Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-12-13 16:31:06 +00:00
Viacheslav Ovsiienko
4022d78982 net/mlx5: fix TPID check for VLAN push action on E-Switch
The VLAN push action on E-Switch supports only 802.1Q (0x8100)
and 802.1AD (0x88A8) Tag Protocol ID (TPID) insertions. The
parameter check for RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN action
is added.

Fixes: 57123c00c1 ("net/mlx5: add Linux TC flower driver for E-Switch flow")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-22 16:17:31 +01:00
Viacheslav Ovsiienko
e71b32f172 net/mlx5: fix VLAN inner ethernet type on E-Switch
The TCA_FLOWER_KEY_VLAN_ETH_TYPE should be specified for the E-Switch
Flows with VLAN and L3 pattern items in the Netlink messages. The patch
adds missing flower key to the messages. This patch partially reverts to
the code smashed by http://patches.dpdk.org/patch/47781

Fixes: 251e8d02cf ("net/mlx5: add VXLAN to flow translate routine")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-22 16:17:31 +01:00
Dekel Peled
c5e508f0c3 net/mlx5: fix packet type for MPLS in UDP
Change the relevant value in tunnels_info[] to match tunnel type.

Fixes: a4a5cd21d2 ("net/mlx5: add flow MPLS item")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Dekel Peled
38f7efaaf5 net/mlx5: fix MPLS item validation
Update the mlx5_flow_validate_item_mpls() function to allow
MPLS over IP, UDP, and GRE.
Modify the flow_dv_validate() function with the new logic introduced
in previous patch of this series: set new variable last_item
after each validation, update item_flags after each item iteration.
The new variable last_item is sent to mlx5_flow_validate_item_mpls()
and used to validate the MPLS item.
Same change implemented in flow_verbs_validate().

Update the mlx5_flow_validate_item_mpls() function to verify that
device configuration has mpls_en set to 1.
This code was added earlier this year, but unintentionaly removed
in recent versions.

Fixes: 84c406e745 ("net/mlx5: add flow translate function")

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Dekel Peled
d1abe664dd net/mlx5: add MPLS to Direct Verbs flow engine
The support in MPLS on this flow engine was overlooked. It's absence is
critical because there are required actions for MPLS which can be done
only with the DV engine.

To set correctly the MPLS filter, we need to reason about the flow item
before the MPLS (UDP, GRE or other).
To do that, a new variable last_item was added and updated after each
translation. the full item flags are updated after each item iteration.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Yongseok Koh
545db54c7c net/mlx5: optimize Rx buffer replenishment threshold
Due to redundant calculation per every burst, performance drops a little.

Fixes: e10245a13b ("net/mlx5: fix Rx buffer replenishment threshold")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Yongseok Koh
317e64739d net/mlx5: optimize Tx doorbell write
Unnecessary volatile attribute keeps compiler from further optimizing the
code and this results in a little performance drop (~2%). Because of memory
barriers, it is safe to remove.

Fixes: 6bf10ab69b ("net/mlx5: support 32-bit systems")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Yongseok Koh
feddd5d243 net/mlx5: optimize Tx external memory registration
There's some performance drop due to extra condition checks on the
datapath. Checking for external memory registration should be consolidated
to the existing bottom-half.

Fixes: 7e43a32ee0 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Yongseok Koh
57e1073bdc net/mlx5: fix flow destruction
As flow_drv_destroy() frees dev_flow, flow_rxq_flags_trim() must be called
ahead.

Fixes: 84c406e745 ("net/mlx5: add flow translate function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:37 +01:00
Viacheslav Ovsiienko
a3166356ab net/mlx5: fix flow query routine in Direct Verbs
The flow_dv_query() just returns -ENOTSUP value and does not
set provided error parameter structure, that crashes the
port_flow_query(). The patch fixes flow_db_query(), now it
sets an error parameter structure.

Fixes: 684dafe795 ("net/mlx5: add flow query abstraction interface")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:45:38 +01:00
Ali Alnubani
0c15f3c010 net/mlx5: fix initialization of struct members
This patch fixes compilation errors with meson and the clang
compiler caused by some of the struct members not being
initialized.

```
../drivers/net/mlx5/mlx5_mr.c:345:37: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
                struct mlx5_mr_cache entry = { 0, };
                                                  ^
../drivers/net/mlx5/mlx5_mr.c:389:36: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
                        struct mlx5_mr_cache ret = { 0, };
                                                        ^
../drivers/net/mlx5/mlx5_mr.c:691:35: error: missing field 'end'
initializer [-Werror,-Wmissing-field-initializers]
                struct mlx5_mr_cache ret = { 0, };
                                                ^
```

The compilation errors reproduce with
clang version 3.4.2 (tags/RELEASE_34/dot2-final) on RHEL.

Fixes: e1114ff6a5 ("net/mlx5: support e-switch flow count action")
Fixes: db48f9db5d ("net/mlx5: support new flow counter API")
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Fixes: 65c9d24170 ("net/mlx5: enable loopback by configured mode")
Fixes: 87011737b7 ("mlx5: add software counters")
Cc: stable@dpdk.org

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-16 10:44:55 +01:00
Ali Alnubani
d77b9aac5d net/mlx5: fix minor typos
Fixes: e1114ff6a5 ("net/mlx5: support e-switch flow count action")
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-15 23:54:53 +01:00
Viacheslav Ovsiienko
6526d465f0 net/mlx5: add E-switch rule hardware offload flag check
This patch adds the in_hw flag check for tc flower filter rules.
If rule is applied without skip_sw flag set driver should check
whether the in_hw flag is set after rule applying. If no in_hw
flag set found it means the rule is not hardware offloaded and
error should be returned to application.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-15 23:54:53 +01:00
Viacheslav Ovsiienko
59d6457111 net/mlx5: prepare to add E-switch rule flags check
The tc flower filter rules are used to control E-switch from
the application side. In order to gain garanteed rule hardware
offload the skip_sw flag should be specified while applying the
rule. But some tc rules is rejected by kernel if skip_sw flag
is set by design. Currently this regards VXLAN tunneling rules,
which are applied to special VXLAN virtual devices. Albeit these
rules are applied with skip_sw flag reset kernel tries to
perform hardware offload. If kernel succeeded the in_hw flag
is set in rule properties and application should check this
flag to get know whether the applied rule is actually hardware
offloaded.

This patch prepares for the rule flags query and check in_hw flag.
The driver checks only the rules with skip_sw flag reset, so we
need to test the actual flags of rule is being applied. The pointer
to flags field into translated rule is introduced and flags are
tested directly in the rule body. It is more reliable than save and
check flags copy.

Also patch swaps the flow_tcf_apply() and flow_tcf_remove(),
we are going to call flow_tcf_remove() from flow_tcf_appy() if
no in_hw flag set found in applied rule.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-15 23:54:53 +01:00
Viacheslav Ovsiienko
84cacbe34b net/mlx5: fix Netlink communication routine
While receiving the Netlink reply messages we should stop at DONE
or ACK message. The existing implementation stops at DONE message
or if no multiple message flag set ( NLM_F_MULTI). It prevents
the single query requests from working, these requests send the
single reply message without multi-message flag followed by
ACK message. This patch fixes receiving part of Netlink
communication routine.

Fixes: 6e74990b34 ("net/mlx5: update E-Switch VXLAN netlink routines")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-15 23:54:53 +01:00
Viacheslav Ovsiienko
ccdb3b2373 net/mlx5: remove unused TC message length parameter
This patch removes the unused message length parameter, we
do not send multiple commands in the single message anymore,
message length can be taken from the prepared message header,
so length parameter can be removed.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-15 23:54:53 +01:00
Yongseok Koh
e31b6a4151 net/mlx5: fix Direct Verbs RSS hash field
As mlx5_flow_hashfields_adjust() refers to flow->rss, actions must be
translated prior to items like in Verbs. Otherwise, hash fields are not
correctly set.

Fixes: d02cb06912 ("net/mlx5: add Direct Verbs translate actions")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-14 00:35:53 +01:00
Thomas Monjalon
15febafdd4 drivers/net: set close behaviour flag at probing
The ethdev flag RTE_ETH_DEV_CLOSE_REMOVE is set for drivers
having migrated to the new behaviour of rte_eth_dev_close().

As any other flag, it can be useful to know about its value
as soon as the port is probed.
Unfortunately, it was set inside the close operation,
just before being erased by memset() in rte_eth_dev_release_port().
The flag assignment is moved to the probing stage, so it can
be checked by the application in order to anticipate the behaviour.

Fixes: 42603bbdb5 ("net/mlx5: release port on close")
Fixes: 6c99085d97 ("net/vmxnet3: fix hot-unplug")
Fixes: 4d7877fde2 ("net/ena: remove resources when port is being closed")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-11-14 00:35:53 +01:00
Shahaf Shuler
325384fcd9 net/mlx5: remove GRE inner IPv6 matching limitation
Such limitation seems not to exist on:
 - MLNX_OFED_linux-4.5-0.3.0.0 (Beta)
 - MLNX_OFED_LINUX-4.4-2.0.7.0 (GA)
 - upstream kernel 4.19.0-rc7

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Viacheslav Ovsiienko
817a6c4740 net/mlx5: fix VXLAN device rollback if rule apply fails
If rule contains tunneling action (like VXLAN encapsulation)
the VTEP (Virtual Tunneling EndPoint) device is pre-configured
before applying the rule. If kernel returns an error this
VTEP configuration should be rolled back to the origin state.
The patch adds the missing VTEP configuration restoration.

Fixes: 95a464cecc ("net/mlx5: add E-switch VXLAN tunnel devices management")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Viacheslav Ovsiienko
1f64486170 net/mlx5: fix rule cleanup Netlink command sending
The VXLAN related rule cleanup routine queries and gathers all
existing local IP and neigh rules into buffer list. One buffer
may contain multiple rule deletion commands and is prepared
to send into Netlink as single message. But, if error occurs
for some deletion commands in the buffer, the multiple ACK
message with errors can be send back by the kernel. It breaks
the Netlink communication sequence numbers, because we expect
only one ACK message and it smashes out futher Netlik
communication.

The workaround of this problem is to send rule deletion commands
from buffer in one-by-one fashion and get ACK message for every
command sent. We do not expect too may rules preexist, so there
should not be critical performance degradation at VXLAN outer
interface initialization.

Fixes: f420f03d67 ("net/mlx5: add E-switch VXLAN rule cleanup routines")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Viacheslav Ovsiienko
b9e5c3ab2e net/mlx5: add Netlink message size check in rule cleanup
This patch is preparation for the following fix, we are going to send
Netlink message from buffer in one-by-one fashion. It is highly
desirable to check multimessage buffer consistency for debug purposes.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Viacheslav Ovsiienko
00ae11c4e8 net/mlx5: fix buffer allocation check in rule cleanup
The Netlink message buffer is allocated and there is the typo,
the other pointer is checked instead of returned one. If no
memory is allocated and NULL is returned by allocation routine
the bug causes segmentation fault. The patch fixes typo,
returned pointer is validated.

Fixes: f420f03d67 ("net/mlx5: add E-switch VXLAN rule cleanup routines")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Luca Boccassi
0c9df53de4 net/mlx5: fallback quietly if pkg-config is unavailable
Don't fail the build if pkg-config can't be found, instead print the
linker flag as it was doing before the change.

Fixes: b6b8793919 ("net/mlx5: use pkg-config to handle SUSE libmnl")
Cc: stable@dpdk.org

Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-14 00:35:53 +01:00
Dekel Peled
99813c2a32 net/mlx5: fix flow director add and delete
Fix the flow_fdir_cmp() function, used by flow_fdir_filter_lookup().
This function is used by flow_fdir_filter_add() to check if same rule
exists, and by flow_fdir_filter_delete() to find flow rule to delete.

The function compared actions conf pointers, changed to compare
actions type only.

Fixes: 2720f833d4 ("net/mlx5: add missing flow director delete")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-14 00:35:53 +01:00
Thomas Monjalon
725f5dd0bf net/mlx5: fix build on PPC64
The AltiVec header file breaks boolean type:

error: incompatible types when initializing type
'__vector _bool int' {aka '_vector(4) __bool int'} using type 'int'

If __APPLE_ALTIVEC__ is defined, then bool type is redefined
and conflicts with stdbool.h.

There is no good solution to fix it for the whole project without
breaking something else, so a workaround is inserted in mlx5 PMD.
This workaround is not compatible with C++ but there is no C++ in DPDK.

Suggested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: David Wilder <dwilder@us.ibm.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
2018-11-14 00:35:53 +01:00
Yongseok Koh
3193c2494e net/mlx5: fix L4 protocol validation
- Currently, no device supports partial mask for protocol in IP header.
- As there could be multiple IP items, next_protocol variable in flow
  validation has to be reset for inner layer. Otherwise, inner TCP/UDP
  will see protocol number of outer IP header.
- Remove redundant protocol checking for MPLS, which is done in
  mlx5_flow_validate_item_mpls().

Fixes: 3d69434113 ("net/mlx5: add Direct Verbs validation function")
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-14 00:35:53 +01:00
Yongseok Koh
636a71dedc net/mlx5: fix device flow reference
dev_flow->verbs is mistakenly used instead of dev_flow->dv. A sanity
check is added for debugging purpose.

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-14 00:35:52 +01:00
Tom Barbette
26f0488344 net/mlx5: support Rx queue count API
This patch adds support for the rx_queue_count API in mlx5 driver

Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
c1cfb132fa net/mlx5: remove flags setting from flow preparation
Even though flow_drv_prepare() takes item_flags and action_flags to be
filled in, those are not used and will be overwritten by parsing of
flow_drv_translate(). There's no reason to keep the flags and fill it.
Appropriate notes are added to the documentation of flow_drv_prepare() and
flow_drv_translate().

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
9ce75f5122 net/mlx5: fix Direct Verbs flow tunnel
1) Fix layer parsing
In translation of tunneled flows, dev_flow->layers must not be used to
check tunneled layer as it contains all the layers parsed from
flow_drv_prepare(). Checking tunneled layer is needed to distinguish
between outer and inner item. This should be based on dynamic parsing. With
dev_flow->layers on a tunneled flow, items will always be interpreted as
inner as dev_flow->layer already has all the items. Dynamic parsing
(item_flags) is added as there's no such code.

2) Refactoring code
- flow_dv_create_item() and flow_dv_create_action() are merged into
  flow_dv_translate() for consistency with Verbs and *_validate().

Fixes: 2466364115 ("net/mlx5: fix flow tunnel handling")
Fixes: d02cb06912 ("net/mlx5: add Direct Verbs translate actions")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
4a78c88e3b net/mlx5: fix Verbs flow tunnel
1) Fix layer parsing
In translation of tunneled flows, dev_flow->layers must not be used to
check tunneled layer as it contains all the layers parsed from
flow_drv_prepare(). Checking tunneled layer is needed to set
IBV_FLOW_SPEC_INNER and it should be based on dynamic parsing. With
dev_flow->layers on a tunneled flow, items will always be interpreted as
inner as dev_flow->layer already has all the items.

2) Refactoring code
It is partly because flow_verbs_translate_item_*() sets layer flag. Same
code is repeating in multiple locations and that could be error-prone.

- Introduce VERBS_SPEC_INNER() to unify setting IBV_FLOW_SPEC_INNER.
- flow_verbs_translate_item_*() doesn't set parsing result -
  MLX5_FLOW_LAYER_*.
- flow_verbs_translate_item_*() doesn't set priority or adjust hashfields
  but does only item translation. Both have to be done outside.
- Make more consistent between Verbs and DV.

3) Remove flow_verbs_mark_update()
This code can never be reached as validation prohibits specifying mark and
flag actions together. No need to convert flag to mark.

Fixes: 84c406e745 ("net/mlx5: add flow translate function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-11-05 15:01:25 +01:00
Ophir Munk
f1b85a2719 net/mlx5: support default RSS key as null
Applications which add RSS rules must supply an RSS key and length.
If an application is only interested in default RSS operation it
should not care about the exact RSS key.
By setting the key to NULL - the PMD will use the default RSS key.
In addition if the application does not care about the RSS type it can
set it to 0 and the PMD will use the default type (ETH_RSS_IP).

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
e7d2e32b26 net/mlx5: limit priority range for Linux TC flower driver
Due to a limitation on driver/FW, priority ranges from 1 to 16 in kernel.
Priority in rte_flow attribute starts from 0 and is added by 1 in
translation. This is subject to be changed to determine the max priority
based on trial-and-error like Verbs driver once the restriction is lifted
or the range is extended.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
09d8b41699 net/mlx5: make vectorized Tx threshold configurable
Add txqs_max_vec parameter to configure the maximum number of Tx queues to
enable vectorized Tx. And its default value is set according to the
architecture and device type.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
f87bfa8eae net/mlx5: move device spawn configuration to probing
When a device is spawned, it does make more sense that the configuration
parameters are passed by callee. Furthermore, setting default value for
some configuration would need PCIe device ID which can be found in the
probe function.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
f420f03d67 net/mlx5: add E-switch VXLAN rule cleanup routines
The last part of patchset contains the rule cleanup routines.
These ones is the part of outer interface initialization at
the moment of VXLAN VTEP attaching. These routines query
the list of attached VXLAN devices, the list of local IP
addresses with peer and link scope attribute and the list
of permanent neigh rules, then all found abovementioned
items on the specified outer device are flushed.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
7a2d6d99a4 net/mlx5: add E-Switch VXLAN encapsulation rules
VXLAN encap rules are applied to the VF ingress traffic and have the
VTEP as actual redirection destinations instead of outer PF.
The encapsulation rule should provide:
- redirection action VF->PF
- VF port ID
- some inner network parameters (MACs/IP)
- the tunnel outer source IP (v4/v6)
- the tunnel outer destination IP (v4/v6). Current
- VNI - Virtual Network Identifier

There is no direct way found to provide kernel with all required
encapsulatioh header parameters. The encapsulation VTEP is created
attached to the outer interface and assumed as default path for
egress encapsulated traffic. The outer tunnel IP address are
assigned to interface using Netlink, the implicit route is
created like this:

  ip addr add <src_ip> peer <dst_ip> dev <outer> scope link

Peer address provides implicit route, and scode link reduces
the risk of conflicts. At initialization time all local scope
link addresses are flushed from device (see next part of patchset).

The destination MAC address is provided via permenent neigh rule:

  ip neigh add dev <outer> lladdr <dst_mac> to <dst_ip> nud permanent

At initialization time all neigh rules of this type are flushed
from device (see the next part of patchset).

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
95a464cecc net/mlx5: add E-switch VXLAN tunnel devices management
VXLAN interfaces are dynamically created for each local UDP port
of outer networks and then used as targets for TC "flower" filters
in order to perform encapsulation. These VXLAN interfaces are
system-wide, the only one device with given UDP port can exist
in the system (the attempt of creating another device with the
same UDP local port returns EEXIST), so PMD should support the
shared device instances database for PMD instances. These VXLAN
implicitly created devices are called VTEPs (Virtual Tunnel
End Points).

Creation of the VTEP occurs at the moment of rule applying. The
link is set up, root ingress qdisc is also initialized.

Encapsulation VTEPs are created on per port basis, the single
VTEP is attached to the outer interface and is shared for all
encapsulation rules on this interface. The source UDP port is
automatically selected in range 30000-60000.

For decapsulaton one VTEP is created per every unique UDP
local port to accept tunnel traffic. The name of created
VTEP consists of prefix "vmlx_" and the number of UDP port in
decimal digits without leading zeros (vmlx_4789). The VTEP
can be preliminary created in the system before the launching
application, it allows to share	UDP ports between primary
and secondary processes.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
4eaa82256c net/mlx5: fix E-Switch flow counter deletion
The counters for E-Switch rules were erroneously deleted in
flow_tcf_remove() routine. The counters deletion is moved to
flow_tcf_destroy() routine.

Fixes: e1114ff6a5 ("net/mlx5: support e-switch flow count action")

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
6e74990b34 net/mlx5: update E-Switch VXLAN netlink routines
This part of patchset updates Netlink exchange routine. Message
sequence numbers became not random ones, the multipart reply messages
are supported, not propagating errors to the following socket calls,
Netlink replies buffer size is increased to MNL_SOCKET_BUFFER_SIZE
and now is preallocated at context creation time instead of stack
usage. This update is needed to support Netlink query operations.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
251e8d02cf net/mlx5: add VXLAN to flow translate routine
This part of patchset adds support of VXLAN-related items and
actions to the flow translation routine. Later some tunnel types,
other than VXLAN can be addedd (GRE). No VTEP devices are created at
this point, the flow rule is just translated, not applied yet.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
1a02c6781f net/mlx5: add VXLAN to flow prepare routine
The e-switch Flow prepare function is updated to support VXLAN
encapsulation/and decapsulation actions. The function calculates
buffer size for Netlink message and Flow description structures,
including optional ones for tunneling purposes.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
2a3fed2042 net/mlx5: add E-Switch VXLAN to validation routine
This patch adds VXLAN support for flow item/action lists validation.
The following entities are now supported:

- RTE_FLOW_ITEM_TYPE_VXLAN, contains the tunnel VNI

- RTE_FLOW_ACTION_TYPE_VXLAN_DECAP, if this action is specified
  the items in the flow items list treated as outer  network
  parameters for tunnel outer header match. The ethernet layer
  addresses always are treated as inner ones.

- RTE_FLOW_ACTION_TYPE_VXLAN_ENCAP, contains the item list to
  build the encapsulation header. In current implementation the
  values is the subject for some constraints:
    - outer source MAC address will be always unconditionally
      set to the one of MAC addresses of outer egress interface
    - no way to specify source UDP port
    - all abovementioned parameters are ignored if specified
      in the rule, warning messages are sent to the log

Minimal tunneling support is also added. If VXLAN decapsulation
action is specified the ETH item can follow the VXLAN VNI item,
the content of this ETH item is treated as inner MAC addresses
and type. The outer ETH item for VXLAN decapsulation action
is always ignored.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
0decab8ed1 net/mlx5: swap items/actions validations for E-Switch rules
The rule validation function for E-Switch checks item list first,
then action list is checked. This patch swaps the validation order,
now actions are checked first. This is preparation for validation
function update with VXLAN tunnel actions. VXLAN decapsulation
action requires to check the items in special way. We could do
this special check in the single item check pass if the action
flags were gathered before. This is the reason to swap the
item/actions checking loops.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
6f2f07e228 net/mlx5: add necessary structures for E-Switch VXLAN
This patch introduces the data structures needed to implement VXLAN
encapsulation/decapsulation hardware offload support for E-Switch.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
a57f692be0 net/mlx5: add necessary definitions for E-Switch VXLAN
This patch contains tc flower related and some other definitions
needed to implement VXLAN encapsulation/decapsulation hardware
offload support for E-Switch.

mlx5 driver dynamically creates and manages the VXLAN virtual
tunnel endpoint devices, the following definitions control
the parameters of these network devices:

- MLX5_VXLAN_PORT_MIN - minimal allowed UDP port for VXLAN device
- MLX5_VXLAN_PORT_MAX - maximal allowed UDP port for VXLAN device
- MLX5_VXLAN_DEVICE_PFX - name prefix of driver created VXLAN device

The mlx5 drivers creates the VXLAN devices with UDP port within
specified range, devices have the names with specified prefix,
followed by decimal digits of UDP port.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
72579fae4e net/mlx5: prepare meson build for adding E-Switch VXLAN
This patch updates meson.build before adding E-Switch VXLAN
encapsulation/decapsulation hardware offload support.
E-Switch rules are controlled via tc Netilnk commands,
so we need to include tc related headers, and check for
some tunnel specific key definitions.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
65888c9274 net/mlx5: prepare makefile for adding E-Switch VXLAN
This patch updates makefile before adding E-Switch VXLAN
encapsulation/decapsulation hardware offload support.
E-Switch rules are controlled via tc Netilnk commands,
so we need to include tc related headers, and check for
some tunnel specific key definitions.

Suggested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
65c9d24170 net/mlx5: enable loopback by configured mode
Enable NIC loopback mode based on rte_eth_conf.lpbk_mode
configuration.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
c513f05cde net/mlx5: add caching of encap/decap actions
Make flow encap and decap Verbs actions cacheable resources.
Store created actions in local database.
This enables MLX5 PMD reuse of existing actions.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
8ba9eee4ce net/mlx5: add raw data encap/decap to Direct Verbs
This patch implements the encap and decap actions, using raw data,
in DV flow for MLX5 PMD.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
4b8727f08b net/mlx5: add NVGRE decap action to Direct Verbs
This patch implements the NVGRE decap action in DV flow for MLX5 PMD.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
a124cff00f net/mlx5: add NVGRE encap action to Direct Verbs
This patch implements the nvgre encap action in DV flow for MLX5 PMD.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
49d6465af3 net/mlx5: add VXLAN decap action to Direct Verbs
This patch implements the VXLAN decap action in DV flow for MLX5 PMD.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
34d41b7aa3 net/mlx5: add VXLAN encap action to Direct Verbs
This patch implements the VXLAN encap action in DV flow for MLX5 PMD.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
56ef2c5815 net/mlx5: add flow action functions to glue
This patch adds glue functions for operations:
- Create packet reformat (encap/decap) flow action.
- Destroy flow action.

The new operations depend on HAVE_IBV_FLOW_DV_SUPPORT.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
3bcc6d6a83 net/mlx5: fix validation of MLPS-in-GRE
Multiple tunnel isn't allowed but MPLS over GRE should be accepted.

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
2720f833d4 net/mlx5: add missing flow director delete
Deleting FDIR flow is not implemented by mistake. Also the name of static
functions are properly renamed.

Fixes: b42c000e37 ("net/mlx5: remove flow support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Dekel Peled
7199088f9d net/mlx5: fix memory leak on Direct Verbs error
Add freeing of allocated memory before exiting on mlx5dv error.

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Yongseok Koh
bc91e8db12 net/mlx5: add 128B padding of Rx completion entry
A PMD parameter (rxq_cqe_pad_en) is added to enable 128B padding of CQE on
RX side. The size of CQE is aligned with the size of a cacheline of the
core. If cacheline size is 128B, the CQE size is configured to be 128B even
though the device writes only 64B data on the cacheline. This is to avoid
unnecessary cache invalidation by device's two consecutive writes on to one
cacheline. However in some architecture, it is more beneficial to update
entire cacheline with padding the rest 64B rather than striding because
read-modify-write could drop performance a lot. On the other hand, writing
extra data will consume more PCIe bandwidth and could also drop the maximum
throughput. It is recommended to empirically set this parameter. Disabled
by default.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Viacheslav Ovsiienko
84be903c0c net/mlx5: fix flow counters deletion in Verbs
The Flow counters created with Verbs are erroneously destroyed
in Flow remove function (flow_verbs_remove()). Counter Verbs
handles stored in the translated rule buffer become invalid.
If rule is reapplied with these invalid counter handles the
driver hangs.

The counter should be destroyed with Verbs in the Flow destroy
function. The Flow remove function should keep counters intact.

Fixes: 60bd8c9747 ("net/mlx5: add count flow action")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-11-05 15:01:25 +01:00
Shahaf Shuler
3b557cac65 net/mlx5: fix detection and error for multiple item layers
1. The check for the Eth item was wrong. causing an error with
flow rules like:

flow create 0 ingress pattern eth / vlan vid is 13 / ipv4 / gre / eth /
vlan vid is 15 / end actions drop / end

2. align all error messages.

3. align multiple item layers check.

Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Shahaf Shuler
ed4c524753 net/mlx5: fix bit width of flow items
Apply the changes from commit c744f6b1b969 ("net/mlx5: fix bit width of
item and action flags") in some places that were overlooked.

Fixes: 0ddd11437a ("net/mlx5: fix bit width of item and action flags")
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-11-05 15:01:25 +01:00
Stephen Hemminger
b6b8793919 net/mlx5: use pkg-config to handle SUSE libmnl
SUSE decided to install the libmnl include file in a non-standard
place: /usr/include/libmnl/libmnl/libmnl.h

This was probably a mistake by the SUSE package maintainer,
but hard to get fixed. Workaround the problem by pkg-config to find
the necessary include directive for libmnl.

Fixes: 20b71e92ef ("net/mlx5: lay groundwork for switch offloads")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2018-11-05 15:01:25 +01:00
Ophir Munk
3a8207423a net/mlx5: close all ports on remove
With the introduction of representors several eth devices are using
the same rte device (e.g. a PCI bus). When calling port detach on one
eth device it is required that all eth devices belonging to the
same rte device have been closed in advance, then the rte device
itself can be removed/detached.
This commit implements this requirement implicitly by adding a
remove callback to struct rte_pci_driver.
The new behavior can be demonstrated in testpmd.
First we attach a representor 0 using PCI address 0000:08:00.0
testpmd> port attach  0000:08:00.0,representor=[0]
Attaching a new port...
EAL: PCI device 0000:08:00.0 on NUMA socket 0
EAL:   probe driver: 15b3:1013 net_mlx5
Port 0 is attached.
Done
Port 1 is attached.
Done

Port 0 is the master device (PF) - an ethdev of the PCI address.
Port 1 is representor 0 - another ethdev (representing a VF) using the
same PCI address. Next we detach port 1
testpmd> port detach 1
Removing a device...
Port 0 is closed
Port 1 is closed
Now total ports is 0
Done

Since port 0 has been implicitly closed we cannot act on it anymore.
testpmd> port stop 0
Invalid port 0

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Ophir Munk
42603bbdb5 net/mlx5: release port on close
With the introduction of representors several eth devices are using
the same rte device (e.g. a PCI bus). It is therefore required to
release the eth device resources during an eth device close operation
rather than during an rte device removal (detach) operation.
In current version many PMDs are still releasing the eth device as
part of the rte device removal. In order to allow a smooth transition
for all PMDs to behave correctly an ethdev flag RTE_ETH_DEV_CLOSE_REMOVE
is used. When this flag is set it indicates to rte_eth_dev_close() to
call rte_eth_dev_release_port(), so the port is freed during the close
operation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Ophir Munk
206254b7dc net/mlx5: allow multiple probing for representor
Implement probing of a rte device multiple times, see [1].
Set PCI driver RTE_PCI_DRV_PROBE_AGAIN flag to enable multiple probing
of the PCI device by the PCI common driver.
Consecutive probing requests with a devargs string may contain
repetitive master and representors devices for which eth device should
be created only once. In case an eth device already exists - silently
ignore it.

[1]
commit e9d159c3d5 ("eal: allow probing a device again")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
0ddd11437a net/mlx5: fix bit width of item and action flags
Most of the code uses uint64_t for MLX5_FLOW_LAYER_* and
MLX5_FLOW_ACTION_*, but there're some code using uint32_t.

Fixes: 2ed2fe5f0a ("net/mlx5: rewrite IP address UDP/TCP port by E-Switch")
Fixes: 57123c00c1 ("net/mlx5: add Linux TC flower driver for E-Switch flow")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Fixes: 3d69434113 ("net/mlx5: add Direct Verbs validation function")
Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
2466364115 net/mlx5: fix flow tunnel handling
Both rte_flow and mlx5_flow redundantly have item flags. And it is not
properly set in the code. This causes wrong tunnel flag handling. A
rte_flow can have multiple expanded device flows if the flow has an RSS
action. Therefore, mlx5_flow should have the layers field.

Fixes: c4d9b9f7f3 ("net/mlx5: add Direct Verbs final functions")
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Fixes: 3d69434113 ("net/mlx5: add Direct Verbs validation function")
Fixes: 84c406e745 ("net/mlx5: add flow translate function")
Fixes: 4e05a229c5 ("net/mlx5: add flow prepare function")
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
98521a3926 net/mlx5: rename static functions
In mlx5_flow*.c, static functions have names starting from 'flow_' while
shared ones start from "mlx5_flow_'.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
2096f61b95 net/mlx5: fix flow mark ID conversion in Direct Verbs
Fixes: d02cb06912 ("net/mlx5: add Direct Verbs translate actions")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
39bee16117 net/mlx5: fix wildcard item for Direct Verbs
If a network layer is specified with no spec, it means wildcard match.
flow_dv_translate_item_*() returns without writing anything if spec is
null and it causes creation of wrong flow. E.g., the following flow has to
patch with any ipv4 packet.

  flow create 0 ingress pattern eth / ipv4 / end actions ...

But, with the current code, it matches any packet because PMD doesn't write
anything about IPv4. The matcher value and mask becomes completely zero. It
should have written the IP version at least. It is same for the rest of
items.

Even if the spec is null, PMD has to write constant fields before return,
e.g. IP version and IP protocol number.

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
6949c43c14 net/mlx5: fix item validation in Direct Verbs
1) remove MPLS item in validation as it doesn't have a translator.

2) add missing NVGRE item to validation

3) match switch-case order between validation and translation.

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Fixes: 3d69434113 ("net/mlx5: add Direct Verbs validation function")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
31b1999991 net/mlx5: fix UDP hash field flag in Direct Verbs
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-26 22:14:06 +02:00
Yongseok Koh
58b1312e9d net/mlx5: add warning message for Direct Verbs flow
In case that the library doesn't support DV flow, if enabled by
'dv_flow_en=1', print out a warning message and disable it.

Fixes: 51e72d386c ("net/mlx5: add runtime parameter to enable Direct Verbs")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
db48f9db5d net/mlx5: support new flow counter API
This patch updates the functions performing the Verbs ibrary calls
in order to support different versions of the library.
The functions:
  - flow_verbs_counter_new()
  - flow_verbs_counter_release()
  - flow_verbs_counter_query()
now have the several compilation branches, depending on the
counters support found in the system at compile time.

The flow_verbs_counter_create() function is introduced as
helper for flow_verbs_counter_new(), actually this helper
create the counters with Verbs.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
e00767ee04 net/mlx5: remove unnecessary structure initializers
The unnecessary structure filed initializers to zero are removed.
We need to do this minor preparation before the following
mlx5 counter structure modification.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
3b2c0bbc52 net/mlx5: add new flow counter Verbs API to glue library
This patch updates the mlx5 glue library, new counter support
Verbs function pointers are added to the glue linking structure
mlx5_glue. This structure now contains the pointers to the both
versions of counter supporting functions due to compatibility
issues. Depending on configuration macros the functions perform
actual Verbs library calls or return an error with meaning
"feature is not supported" (NULL or ENOTSUP).

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
06629d860d net/mlx5: relocate flow counters query function
The flow_verbs_query_count() is moved into the the group of counter
managing functions and renamed to flow_verbs_counter_query() in order
to be in unified fashion with others.

Also minor function modification is made to avoid unreachable code
warnings if there is no counter support at compile time.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
2dd8b72167 net/mlx5: simplify flow counters support check
The redundant check of Flow counters support in runtime is removed.
The flag flow_counter_en is eliminated from the code. The Verbs
create counter function just returns an error if no counter
support presented in the system.

If there is no any of Flow counters configuration macro defined
the log message is emited, indicating the missing counter support.

mlx5_flow_validate_action_count() fuctnion is also updated due to
flow_counter_en flag removal.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
9a945375a8 net/mlx5: introduce new flow counters configuration macro
The new configuration macro HAVE_IBV_DEVICE_COUNTERS_SET_V45 is
introduced. Both makefile and meson.build are changed.

Flow counter support code depends on the following configuration
macros:

- HAVE_IBV_DEVICE_COUNTERS_SET_V42 - is defined if system supports
  the "old" flow counters functionality, MLNX_OFED version from
  4.2 to 4.4 is required.

- HAVE_IBV_DEVICE_COUNTERS_SET_V45 - is defined if system supports
  the "new" flow counters functionality, MLNX_OVED 4.5 (or higher)
  or Linux rdma-core v19 (or higher) is required.

Neither HAVE_IBV_DEVICE_COUNTERS_SET_V42 nor
HAVE_IBV_DEVICE_COUNTERS_SET_V45 is defined if there is no
counters support.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
0d8c6e6ba0 net/mlx5: rename flow counter configuration macro
The HAVE_IBV_DEVICE_COUNTERS_SET_SUPPORT is replaced with
HAVE_IBV_DEVICE_COUNTERS_SET_V42. At this stage it is just
macro renaming. This macro is defined if system supports
the "old" Flow counters functionality, MLNX_OFED version
from 4.2 to 4.4 is required.

We need to do this preparation before introducing the new
configuration macro (HAVE_IBV_DEVICE_COUNTERS_SET_V45) for
the "new" Flow counters support.

Both makefile and meson.build are changed.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Viacheslav Ovsiienko
c4301b9caf net/mlx5: fix flow counters creation
The Flow counter creation function contains two problems:

  - Flow counter object in Verbs is not freed in case of memory
    allocation error. The call of counter Verbs object deallocating
    function is added to fix.

  - The initial value of reference counter is set to one in order
    to provide the correct counter object freeing in the
    flow_verbs_counter_release() function. The reference counter
    field should be initialized to one.

Fixes: 60bd8c9747 ("net/mlx5: add count flow action")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2018-10-26 22:14:06 +02:00
Thomas Monjalon
a7d3c6271d ethdev: support representor id as iterator filter
The representor id is added in rte_eth_dev_data in order to be able
to match a port with its representor id in devargs.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:06 +02:00
Dekel Peled
6bd7fbd03c net/mlx5: support metadata as flow rule criteria
As described in series starting at [1], it adds option to set
metadata value as match pattern when creating a new flow rule.

This patch adds metadata support in mlx5 driver, in two parts:
- Add the validation and setting of metadata value in matcher,
  when creating a new flow rule.
- Add the passing of metadata value from mbuf to wqe when
  indicated by ol_flag, in different burst functions.

[1] "ethdev: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113269.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 22:14:06 +02:00
Moti Haimovsky
e1114ff6a5 net/mlx5: support e-switch flow count action
This commit adds support for configuring flows destined to the mlx5
eswitch with 'count' action and for querying these counts at runtime.

Each flow rule configured by the mlx5 driver is implicitly assigned
with flow counters. These counters can be retrieved when querying
the flow rule via Netlink, they can be found in each flow action
section of the reply. Hence, supporting the 'count' action in the
flow configuration command is straight-forward. When transposing
the command to a tc Netlink message we just ignore it instead of
rejecting it.
In the 'flow query count' side, the command now uses tc Netlink
query command in order to retrieve the values of the flow counters.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-10-26 22:14:05 +02:00
Moti Haimovsky
684dafe795 net/mlx5: add flow query abstraction interface
Flow engine now supports multiple driver paths with each having
its own flow query implantation routine.
This patch adds an abstraction to the flow query routine in accordance
to commit 0c76d1c9a1 ("net/mlx5: add abstraction for multiple flow
drivers") done by Yongseok Koh.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-10-26 22:14:05 +02:00
Moti Haimovsky
743fdbeb45 net/mlx5: use the new infrastructure for tc flow
modified TC-flow code to use the new infrastructure
introduced in "net/mlx5: refactor TC-flow infrastructure"
commit.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-10-26 22:14:05 +02:00
Moti Haimovsky
d53180afe3 net/mlx5: refactor TC-flow infrastructure
This commit refactors tc_flow as a preparation to coming commits
that sends different type of messages and expect differ type of replies
while still using the same underlying routines.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-10-26 22:14:05 +02:00
Thomas Monjalon
e16adf08e5 ethdev: free all common data when releasing port
This is a clean-up of common ethdev data freeing.
All data freeing are moved to rte_eth_dev_release_port()
and done only in case of primary process.

It is probably fixing some memory leaks for PMDs which were
not freeing all data.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:05 +02:00
Keith Wiles
81bede55e3 eal: add macro for attribute weak
eal: add shorthand __rte_weak macro
qat: update code to use __rte_weak macro
avf: update code to use __rte_weak macro
fm10k: update code to use __rte_weak macro
i40e: update code to use __rte_weak macro
ixgbe: update code to use __rte_weak macro
mlx5: update code to use __rte_weak macro
virtio: update code to use __rte_weak macro
acl: update code to use __rte_weak macro
bpf: update code to use __rte_weak macro

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-25 02:11:23 +02:00
Anatoly Burakov
5d7b673d5f mk: build with _GNU_SOURCE defined by default
We use _GNU_SOURCE all over the place, but often times we miss
defining it, resulting in broken builds on musl. Rather than
fixing every library's and driver's and application's makefile,
fix it by simply defining _GNU_SOURCE by default for all
builds.

Remove all usages of _GNU_SOURCE in source files and makefiles,
and also fixup a couple of instances of using __USE_GNU instead
of _GNU_SOURCE.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-10-22 11:28:27 +02:00
Shahaf Shuler
cd18e1b72f net/mlx5: fix build on Arm
On some ARM environment, the below compilation error will be seen

dpdk/drivers/net/mlx5/mlx5_flow_dv.c: In function
'flow_dv_translate_item_nvgre':
/tmp/dpdk/drivers/net/mlx5/mlx5_flow_dv.c:785:22: error: pointer targets
in initialization differ in signedness [-Werror=pointer-sign]
  const char *tni_v = nvgre_v->tni;

The reason for this error is that nvgre_v->tni is defined as byte array
in size of 3B. However the code in the function iterate till the 4B in
order to copy/set also the subsequent field after it (flow_id)

Fixing by pointing to this struct from a different pointer.

Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-18 10:24:39 +02:00
Xiaoyu Min
7604677883 net/mlx5: rewrite MAC address by E-Switch
Offload following modify MAC address actions to E-Switch
via TC-Flower driver

- RTE_FLOW_ACTION_TYPE_SET_MAC_SRC
- RTE_FLOW_ACTION_TYPE_SET_MAC_DST

The corresponding rte_flow_item_eth must be present in
rte_flow pattern

Only support modify outer layer MAC address

The example testpmd command is:

    flow create 0 transfer ingress
         pattern eth / ipv4 / udp dst is 7000 / end
         actions set_mac_dst mac_addr dd:00:aa:11:bb:33 /
         set_mac_src mac_addr bb:00:cc:11:aa:22 /
	 port_id id 1 / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-18 10:24:39 +02:00
Xiaoyu Min
a7cb5bcd28 net/mlx5: rewrite TTL by E-Switch
Offload following modify TTL actions to E-Switch via
TC-Flower driver

- RTE_FLOW_ACTION_TYPE_SET_TTL
- RTE_FLOW_ACTION_TYPE_DEC_TTL

The corresponding IP protocol rte_flow_item_ipv[4|6]
must be present in rte_flow pattern otherwith PMD
return error

The example testpmd commands are:

    flow create 0 transfer ingress
         pattern eth / ipv4 / udp dst is 7000 / end
	 actions dec_ttl /
	 port_id id 1 / end

    flow create 0 transfer ingress
         pattern eth / ipv4 / udp dst is 7001 / end
	 actions set_ttl ttl_value 10 /
	 port_id id 1 / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-18 10:24:39 +02:00
Xiaoyu Min
0eef26d073 net/mlx5: fix build with zero-size array
If the build environment doesn't have 'linux/tc_act/tc_pedit.h' header,
compiler will use needed structs defined in mlx5_flow_tcf.c.

However, there is a zero-size array defined in one struct and
ISO C forbids this when -Wpedantic is set by debug mode.

Simply put __extension__ keyword before the struct in question.

Fixes: 2ed2fe5f0a ("net/mlx5: rewrite IP address UDP/TCP port by E-Switch")

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
2018-10-18 10:24:39 +02:00
Yongseok Koh
31fda51877 net/mlx5: support multiple groups and jump action
rte_flow has 'group' attribute and 'jump' action in order to support
multiple groups. This feature is known as multi-table support ('chain' in
linux TC flower) in general because a group means a table of flows. Example
commands are:

	flow create 0 transfer priority 1 ingress
	     pattern eth / vlan vid is 100 / end
	     actions jump group 1 / end

	flow create 0 transfer priority 1 ingress
	     pattern eth / vlan vid is 200 / end
	     actions jump group 2 / end

	flow create 0 transfer group 1 priority 2 ingress
	     pattern eth / vlan vid is 100 /
	     	     ipv4 dst spec 192.168.40.0 dst prefix 24 / end
	     actions drop / end

	flow create 0 transfer group 1 priority 2 ingress
	     pattern end
	     actions of_pop_vlan / port_id id 1 / end

	flow create 0 transfer group 2 priority 2 ingress
	     pattern eth / vlan vid is 200 /
	     	     ipv4 dst spec 192.168.40.0 dst prefix 24 / end
	     actions of_pop_vlan / port_id id 2 / end

	flow create 0 transfer group 2 priority 2 ingress
	     pattern end
	     actions port_id id 2 / end

With theses flows, if a packet having vlan 200 and src_ip as 192.168.40.1,
this packet will firstly hit the 1st flow. Then it will hit the 5th flow
because of the 'jump' action. As a result, the packet will be forwarded to
port 2 (VF representor) with vlan tag being stripped off. If the packet had
vlan 100 instead, it would be dropped by the 3rd flow.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-18 10:24:39 +02:00
Thomas Monjalon
391797f042 drivers/bus: move driver assignment to end of probing
The PCI mapping requires to know the PCI driver to use,
even before the probing is done. That's why the PCI driver is
referenced early inside the PCI device structure. See
commit 1d20a073fa ("bus/pci: reference driver structure before mapping")

However the rte_driver does not need to be referenced in rte_device
before the device probing is done.
By moving back this assignment at the end of the device probing,
it becomes possible to make clear the status of a rte_device.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2018-10-17 10:26:59 +02:00
Thomas Monjalon
a1883178e9 net/mlx5: remove useless driver name comparison
The function mlx5_dev_to_port_id() is returning all the ports
associated to a rte_device.
It was comparing driver names while already comparing rte_device pointers.
If two devices are the same, they will have the same driver.
So the useless driver name comparison is removed.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-17 10:26:59 +02:00
Xiaoyu Min
2ed2fe5f0a net/mlx5: rewrite IP address UDP/TCP port by E-Switch
Offload the following rte_flow actions by inserting accordingly
E-Switch rules via TC Flower driver

 - RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC
 - RTE_FLOW_ACTION_TYPE_SET_IPV4_DST
 - RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC
 - RTE_FLOW_ACTION_TYPE_SET_IPV6_DST
 - RTE_FLOW_ACTION_TYPE_SET_TP_SRC
 - RTE_FLOW_ACTION_TYPE_SET_TP_DST

The example testpmd command is:

    flow create 0 transfer ingress
         pattern eth / ipv4 / udp dst is 7000 / end
	 actions set_ipv4_src ipv4_addr 172.168.0.1 /
	 set_ipv4_dst ipv4_addr 172.168.10.1 /
	 set_tp_dst port 9000 /
	 set_tp_src port 700 /
	 port_id id 1 / end

Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:56:02 +02:00
Moti Haimovsky
92378c2b7f net/mlx5: support e-switch TCP-flags flow filter
This patch adds support for offloading flow rules with TCP-flags
filter to mlx5 eswitch Hardwrae.

With mlx5 it is possible to offload a limited set of flow rules to
the mlxsw (or e-switch) using the DPDK flow commands using the
"transfer" attribute. This set of flow rules also supports filtering
according to the values found in the TCP flags.
This patch implements this offload capability in the mlx5 PMD under
transfer attribute.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-10-11 18:56:02 +02:00
Dekel Peled
3e9fa07908 net/mlx5: allow flow rule with attribute egress
This patch complements [1], adding to MLX5 PMD the option to set
flow rule for egress traffic.

[1] "net/mlx5: support metadata as flow rule criteria"
    http://mails.dpdk.org/archives/dev/2018-September/113275.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:56:02 +02:00
Shahaf Shuler
7dd7be29b4 net/mlx5: always use representor ifindex for ioctl
In the current code, on some cases the representor ethdev is using the
PF interface to query some link status information or pause parameters.

It was done because in previous kernel versions there was no support
from the kernel for the representor info.

Using the PF i/f for such ioctl is error prone and not always working
because:
 * On some cases there is no PF at all, only representors (e.g Bluefield
   with host representors)
 * Query the up/down status from representor and link status from PF
   is in-consist
 * PF link is down doesn't necessarily means representor is down.
 * setting different pause configuration for the PF and the
   representors will result on undefined behaviour

Making the code cleaner and more robust by using only the representor
i/f for the ioctl. whatever the kernel will provide on this query will
be used. No need to do W.A. for kernel missing functionality.

Note:
 1. Setting pause parameters will obviously won't work on representors
 2. Old kernel will not report all the possible representor info

Fixes: 2b73026388 ("net/mlx5: probe all port representors")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-11 18:56:02 +02:00
Shahaf Shuler
d469f6a5eb net/mlx5: add representor specific statistics
Representor ports has a different set of extended statistics (as those are
logical ports which cannot count all that the PF can).

Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-11 18:56:02 +02:00
Shahaf Shuler
1a611fdaf6 net/mlx5: support missing counter in extended statistics
The current code would fail if one of the counters DPDK counters was not
found on the device counters.

As representors and PF port has different counters the both cannot work
together.

Addressing this issue by making the counter init more flexible to
contain all the counter found and skipping the error.

Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-11 18:56:02 +02:00
Yongseok Koh
c10f5d643b net/mlx5: fix errno values for flow engine
Fixes: af689f1f04 ("net/mlx5: support flow Ethernet item along with drop action")
Fixes: 919d53ad78 ("net/mlx5: fix count query when flow has not counter")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-10-11 18:53:49 +02:00
Yongseok Koh
65254667c0 net/mlx5: add missing VLAN action constraints
1) VLAN modify isn't supported by driver.

2) FW syndrome (0xA9C090):
	set_flow_table_entry: push vlan action fte in fdb can ONLY be
	forward to the uplink.

3) FW syndrome (0x294609):
	set_flow_table_entry: modify/pop/push actions in fdb flow table are
	supported only while forwarding to vport.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-11 18:53:49 +02:00
Yongseok Koh
7e43a32ee0 net/mlx5: support externally allocated static memory
When MLX PMD registers memory for DMA, it accesses the global memseg list
of DPDK to maximize the range of registration so that LKey search can be
more efficient. Granularity of MR registration is per page.

Externally allocated memory shouldn't be used for DMA because it can't be
searched in the memseg list and free event can't be tracked by DPDK. If it
is used, the following error will occur:

	net_mlx5: port 0 unable to find virtually contiguous chunk for
	address (0x5600017587c0). rte_memseg_contig_walk() failed.

There's a pending patchset [1] which enables externally allocated memory.
Once it is merged, users can register their own memory out of EAL then that
will resolve this issue.

Meanwhile, if the external memory is static (allocated on startup and never
freed), such memory can also be registered by little tweak in the code.

[1] http://patches.dpdk.org/project/dpdk/list/?series=1415

This patch is not a bug fix but needs to be included in stable versions.

Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Xueming Li
45b83b9b04 net/mlx5: fix representor port xstats
This patch fixes the issue that representor port shows xstats of PF.

Fixes: 2b73026388 ("net/mlx5: probe all port representors")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Xueming Li
7bc47fb839 net/mlx5: fix representor port link status
Current code uses PF links status for representor port, not the
representor interface itself.
This caused wrong representor port link status when toggling
interface up or down.

Fixes: 2b73026388 ("net/mlx5: probe all port representors")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Yongseok Koh
57123c00c1 net/mlx5: add Linux TC flower driver for E-Switch flow
Flows having 'transfer' attribute have to be inserted to E-Switch on the
NIC and the control path uses Linux TC flower interface via Netlink
socket.
This patch adds the flow driver on top of the new flow engine.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Yongseok Koh
40c9ccf9e9 net/mlx5: remove Netlink flow driver
Netlink based E-Switch flow engine will be migrated to the new flow
engine.
nl_flow will be renamed to flow_tcf as it goes through Linux TC flower
interface.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Yongseok Koh
0c76d1c9a1 net/mlx5: add abstraction for multiple flow drivers
Flow engine has to support multiple driver paths. Verbs/DV for NIC flow
steering and Linux TC flower for E-Switch flow steering. In the future,
another flow driver could be added (devX).

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
51e72d386c net/mlx5: add runtime parameter to enable Direct Verbs
DV flow API is based on new kernel API and is
missing some functionality like counter but add other functionality
like encap.

In order not to affect current users even if the kernel supports
the new DV API it should be enabled only manually.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
c4d9b9f7f3 net/mlx5: add Direct Verbs final functions
This commits add the missing function which are apply, remove, and
destroy.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
509782b35b net/mlx5: add Direct Verbs driver to glue
This commit adds all Direct Verbs required functions to the glue lib.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
d02cb06912 net/mlx5: add Direct Verbs translate actions
In this commit we add the translation of flow actions.
Unlike the Verbs API actions are separeted from the items and are passed
to the API in array structure.
Since the target action like RSS require the QP information those
actions are handled both in the translate action and in the apply.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
fc2c498ccb net/mlx5: add Direct Verbs translate items
This commit handles the translation of the requested flow into Direct
Verbs API.

The Direct Verbs introduce the matcher object which acts as shared mask
for all flows that are using the same mask. So in this commit we
translate the item and get in return a matcher and the value that should
be matched.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
865a0c1567 net/mlx5: add Direct Verbs prepare function
This function allocates the Direct Verbs device flow, and
introduce the relevant PRM structures.

This commit also adds the matcher object. The matcher object acts as a
mask and should be shared between flows. For example all rules that
should match source IP with full mask should use the same matcher. A
flow that should match dest IP or source IP but without full mask should
have a new matcher allocated.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
3d69434113 net/mlx5: add Direct Verbs validation function
This is commit introduce the Direct Verbs driver API.
The Direct Verbs is an API adds new features like encapsulation, match
on metatdata.
In this commit the validation function was added, most of the validation
is done with functions that are also in use for the Verbs API.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
84c406e745 net/mlx5: add flow translate function
This commit modify the conversion of the input parameters into Verbs
spec, in order to support all previous changes.

Some of those changes are:
removing the use of the parser,
storing each flow in its own flow structure.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
4e05a229c5 net/mlx5: add flow prepare function
In current implementation the calculation of the flow size is done
during the validation stage, and the same function is also used to
translate the input parameters into verbs spec.  This is hard to
maintain and error prone.
Another issue is dev-flows (flows that are created implicitly in order
to support the requested flow for example when the user request RSS on
UDP 2 rules need to be created one for IPv4 and one for IPv6).
In current implementation the dev-flows are created on the same
memory allocation. This will be harder to implement in future drivers.

The commits extract the calculation and creation of the dev-flow from
the translation part (the part that converts the parameters into the
format required by the driver). This results in that the prepare
function only function is to allocate the dev-flow.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Ori Kam
23c1d42c71 net/mlx5: split flow validation to dedicated function
In current implementation the validation logic reside in the same
function that calculates the size of the verbs spec and also create the
verbs spec.
This approach results in hard to maintain code which can't be shared.
also in current logic there is a use of parser entity that holds the
information between function calls. The main problem with this parser is
that it assumes the connection between different functions. For example
it assumes that the validation function was called and relevant values
were set.
This may result in an issue if and when we only call the validation
function, or call the apply function without the validation (Currently
according to RTE flow we must call validation before creating flow, but
if we want to change that to save time during flow creation, for example
the user validated some rule and just want to change the IP there is no
true reason the validate the rule again).

This commit address both of those issues by extracting the validation
logic into detected functions and remove the use of the parser object.
The side effect of those changes is that in some cases there will be a
need to traverse the item list again.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 18:53:49 +02:00
Anatoly Burakov
5282bb1c36 mem: allow memseg lists to be marked as external
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.

This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.

All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.

Mempools is a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-10-11 10:24:29 +02:00
Ori Kam
c322c0e558 net/mlx5: add bluefield VF support
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-09-28 01:41:01 +02:00
Nelio Laranjeiro
96d7c62a70 net/mlx5: support meson build
Compile Mellanox driver when its external dependencies are met.  A
glue version of the driver can still be requested by using the
-Denable_driver_mlx_glue=true

Meson will try to find the required external libraries.  When they are
not installed system wide, they can be provided though CFLAGS, LDFLAGS
and LD_LIBRARY_PATH environment variables, example (considering
RDMA-Core is installed in /tmp/rdma-core):

 # CLFAGS=-I/tmp/rdma-core/build/include \
   LDFLAGS=-L/tmp/rdma-core/build/lib \
   LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   meson output
 # LD_LIBRARY_PATH=/tmp/rdma-core/build/lib \
   ninja -C output install

Note: LD_LIBRARY_PATH before ninja is necessary when the meson
configuration has changed (e.g. meson configure has been called), in
such situation the LD_LIBRARY_PATH is necessary to invoke the
autoconfiguration script.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2018-09-28 01:41:01 +02:00
Ferruh Yigit
323e7b667f ethdev: make default behavior CRC strip on Rx
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.

PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.

Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2018-09-14 20:08:41 +02:00
Xueming Li
3afdf157fc net/mlx5: fix interrupt completion queue index wrapping
Rxq cq_ci was 16 bits while hardware is expecting to wrap
around 24 bits, this caused interrupt failure after burst of packets.

Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-09-10 13:59:03 +02:00
Shahaf Shuler
b8ac090835 net/mlx5: fix RSS flow action hash type selection
On the code after the below commits, the criteria to select the IPV4 or
IPV6 hash functions was the existence of some ETH_RSS_IPV4 RSS types on
the flow rule.

The check is wrong. For example ETH_RSS_NONFRAG_IPV4_TCP will not select
the IPV4 hash which will cause the packet to be spread in a bad way.

Fix it by adding the corresponding types needed for each hash selection.

Fixes: 592f05b29a ("net/mlx5: add RSS flow action")
Fixes: fd0b70316b ("net/mlx5: support inner RSS computation")
Cc: stable@dpdk.org

Reported-by: Yaroslav Brustinov <ybrustin@cisco.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-28 15:27:39 +02:00
Shahaf Shuler
f9de87187b net/mlx5: disable ConnectX-4 Lx Multi Packet Send by default
On ConnectX-4 Lx the Multi Packet Send (MPW) feature is considered
un-secure, as on some cases were the application provides incorrect mbufs
on the Tx burst the host or NIC can get stuck.

Hence, disabling the feature by default for this specific NIC.
Users can still enable this feature and enjoy the performance gain
(mostly for low number of cores) by using the txq_mpw_en devarg.

This patch will impact the out of the box performance of some application
using ConnectX-4 Lx for the sack of security and robustness.

Since we need different defaults based on the underlying device the mpw
field in the configuration struct was extended to contain also the
MLX5_ARG_UNSET option.

Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-28 15:27:39 +02:00
Adrien Mazarguil
dce1e4c204 net/mlx5: fix artificial L4 limitation on switch flow rules
Partial bit-masks are in fact supported on TCP/UDP source/destination
ports. Remove unnecessary check.

Fixes: 2bfc777e07 ("net/mlx5: add L2-L4 pattern items to switch flow rules")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-08-28 15:27:39 +02:00
Yongseok Koh
4996efd746 net/mlx5: fix minimum size of multi-packet Rx queue
The size of Rx queue is determined by dividing the number of descriptors by
the number of strides. As device can't support single slot queue, if the
number of descriptors is same as the number of strides, MPRQ shouldn't be
enabled. Otherwise, this will cause HW fault. For example, if rxd is set to
512 with testpmd on ConnectX-4 Lx, PMD can't receive more than 512 packets
because the minimum number of strides for ConnectX-4 Lx is 512. Users have
to configure larger number of descriptors in this case.

Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-09 00:56:50 +02:00
Yongseok Koh
b85b719ab7 net/mlx5: fix minimum number of Multi-Packet RQ buffers
If MPRQ is enabled, a PMD-private mempool is allocated. For ConnectX-4 Lx,
the minimum number of strides is 512 which ConnectX-5 supports 8. This
results in quite small number of elements for the MPRQ mempool. For
example, if the size of Rx ring is configured as 512, only one MPRQ buffer
can cover the whole ring. If there's only one Rx queue is configured. In
the following code in mlx5_mprq_alloc_mp(), desc is 1 and obj_num will be
36 as a result.

	desc *= 4;
	obj_num = desc + MLX5_MPRQ_MP_CACHE_SZ * priv->rxqs_n;

However, rte_mempool_create_empty() has a sanity check to refuse large
per-lcore cache size compared to the number of elements. Cache flush
threshold should not exceed the number of elements of a mempool. For the
above example, the threshold is 32 * 1.5 = 48 which is larger than 36 and
it fails to create the mempool.

Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-05 13:29:34 +02:00
Yongseok Koh
e845e9ceb4 net/mlx5: fix check for MPLS-in-GRE
Multiple tunnel isn't allowed but MPLS over GRE should be accepted.

Fixes: a4a5cd21d2 ("net/mlx5: add flow MPLS item")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-05 13:15:08 +02:00
Yongseok Koh
2547ee7458 net/mlx5: preserve allmulticast flag for flow isolation mode
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling allmulti
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as
rte_eth_allmulticast_enable/disable() fail to set the flag
(dev->data->all_multicast). The flag is used when starting traffic by
mlx5_traffic_enable(). When switching out of flow isolation mode, allmulti
mode will not be set even though it has been enabled.

Fixes: 0887aa7f27 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-05 08:47:41 +02:00
Yongseok Koh
24b068ad71 net/mlx5: preserve promiscuous flag for flow isolation mode
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling promiscuous
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as rte_eth_promiscuous_enable/disable()
fail to set the flag (dev->data->promiscuous). The flag is used when
starting traffic by mlx5_traffic_enable(). When switching out of flow
isolation mode, promiscuous mode will not be set even though it has been
enabled.

Fixes: 0887aa7f27 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-08-05 08:47:40 +02:00
Adrien Mazarguil
16ed22ba14 net/mlx5: fix VLAN pop action on switch flow rules
TC message for VLAN POP is broken due to an unfinished nested attribute.

Fixes: 7ac6778d50 ("net/mlx5: add VLAN item and actions to switch flow rules")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-08-02 13:47:57 +02:00
Adrien Mazarguil
eb19f1f6c8 net/mlx5: fix VLAN matching on switch flow rules
VLAN ID is not properly translated to TC due to swapped byte order.

Fixes: 7ac6778d50 ("net/mlx5: add VLAN item and actions to switch flow rules")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-08-02 13:46:39 +02:00
Moti Haimovsky
dd6eaddff7 net/mlx5: fix RSS flow configuration crash
This commit fixes a segmentation fault observed when configuring
mlx5 with RSS flow rule containing invalid queues indices such as
negative numbers, queue numbers bigger than the number Rx queues the
PMD or has no queues at all.

Fixes: 592f05b29a ("net/mlx5: add RSS flow action")

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
2018-08-02 13:22:35 +02:00
Ophir Munk
09e0fd260e net/mlx5: fix secondary process resource leakage
When running testpmd with an mlx5 device and then executing at testpmd
prompt in a raw: "port start all" followed by "port stop all"
a new file named /var/tmp/net_mlx5_<socket num> is created as a result
of creating a new unix domain socket (used for communication between
the primary and secondary processes).
When the new unix socket file is created the old unix socket file should
have been removed. This commit fixes it by closing the old unix socket
just before creating the new one in function mlx5_socket_init()

Fixes: f8b9a3bad4 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-08-02 13:15:27 +02:00
Shahaf Shuler
9da1db6bbf net/mlx5: fix VLAN filtering
The below commit has added a graph based expansion logic for RSS rule to
satisfy Verbs requirements. With this logic, for example, the rule:

flow create 0 ingress pattern eth / end actions rss queues 0 1 end types
ipv4-tcp ipv6-tcp end / end

will be expanded into the rules:

flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss queues 0 1
end types ipv4-tcp ipv6-tcp end / end

flow create 0 ingress pattern eth / ipv6 / tcp / end actions rss queues 0 1
end types ipv4-tcp ipv6-tcp end / end

flow create 0 ingress pattern eth / end actions queue index 0 / end

The below commit defined two graphs:
1. graph for the tunnel case which starts from the ETH item
2. graph for the non-tunnel case which starts from the ETH item

The graphs are ignoring the VLAN case. Hence rules with VLAN item will
fail to traverse the graph and it will result in flow rule creation error.

Adding the VLAN item to the existing graphs will not work as the flow
engine will reject any VLAN item without a specific vid.

To solve this case two new graphs were added (for the tunnel and
non-tunnel case) which contain the VLAN item and are being used only
when the VLAN item exists in the flow pattern.

Two cases left un-covered for the inner RSS:
1. The case were VLAN exists in the pattern as part of the inner headers
2. The case were VLAN exists in the pattern both in the outer and the
inner headers

Solving those cases will require to add two more graphs.
Holding a VLAN for the overlay network is not common, the subnets are
usually defined by the tunnel protocol, for example the VXLAN vni.
Hence adding those two graphs seems like an overkill at this point.
Based on needs one can add those to provide the full support.

Fixes: 592f05b29a ("net/mlx5: add RSS flow action")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-08-02 13:15:18 +02:00
Matan Azrad
2d3b832052 net/mlx5: fix packet type offload for tunnels
There are dedicated QP attributes, tunnel offload flag and mask, which
must be configured in order to allow part of the HW tunnel offloads.

So, if a QP is pointed by a tunnel flow, the above QP attributes
should be configured.

The mask configuration is wrongly only performed if an internal RSS was
configured by the user, while there is no reason to condition the
tunnel offloads in RSS configurations.

Consequently, some of the tunnel offloads was not performed by the HW
when a tunnel flow was configured, for example, the packet tunnel
types was not reported to the user.

Replace the internal RSS condition with the tunnel flow condition.

Fixes: df6afd377a ("net/mlx5: remove useless arguments in hrxq API")

Signed-off-by: Matan Azrad <matan@mellanox.com>
2018-08-02 12:34:18 +02:00
Timothy Redaelli
c7684b6be4 net/mlx5: avoid stripping the glue library
Stripping binaries at build time is usually a bad thing since it makes
impossible to generate (split) debug symbols and this can lead to a more
difficult debugging.

Fixes: 59b91bec12 ("net/mlx5: spawn rdma-core dependency plug-in")
Cc: stable@dpdk.org

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
2018-08-02 12:34:17 +02:00
Shahaf Shuler
ed4c7fd92a net/mlx5: fix flow count action for shared counter
According to commit fb8fd96d42 ("ethdev: add shared counter to flow
API") the counter id should be taken into account only when the shared
flag is set.

Fixes: 60bd8c9747 ("net/mlx5: add count flow action")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-08-02 12:34:17 +02:00
Yaroslav Brustinov
2854ba22c5 net/mlx5: fix linkage of glue lib with gcc 4.7.2
addressing a gcc 4.7.2 bug that cannot be reproduced with latter
versions:

"bin/ld: Warning: alignment 8 of symbol `mlx5_glue' in
src/dpdk/drivers/net/mlx5/mlx5_glue.c.21.o is smaller than 16 in
src/dpdk/drivers/net/mlx5/mlx5_rxq.c.21.o"

Fix it be forcing the alignment of the glue lib.

Fixes: 0e83b8e536 ("net/mlx5: move rdma-core calls to separate file")
Cc: stable@dpdk.org

Signed-off-by: Yaroslav Brustinov <ybrustin@cisco.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
3f8cb05df5 net/mlx5: fix invalid network interface index
Network interface indices being unsigned, an invalid index or error is
normally expressed through a zero value (see if_nametoindex()).

mlx5_ifindex() has a signed return type for negative values in case of
error. Since mlx5_nl.c does not check for errors, these may be fed back as
invalid interfaces indices to subsequent system calls. This usage would
have been correct if mlx5_ifindex() returned a zero value instead.

This patch makes mlx5_ifindex() unsigned for convenience.

Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Nelio Laranjeiro
919d53ad78 net/mlx5: fix count query when flow has not counter
Querying a counters on a flow without counter is ending with a
segmentation fault.

Fixes: 60bd8c9747 ("net/mlx5: add count flow action")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Yongseok Koh
24f653a7e8 net/mlx5: fix queue rollback when starting device
mlx5_rxq_start() and mlx5_rxq_stop() must be strictly paired because
internal reference counter is increased or decreased inside. Also,
mlx5_rxq_get() must be paired with mlx5_rxq_release().

Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Fixes: 6e78005a9b ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Yongseok Koh
c20d4a70ca net/mlx5: fix endless loop when clearing flow flags
If one of (*priv->rxqs)[] is null, the for loop can iterate infinitely as
idx can't be increased.

Fixes: cd24d52639 ("net/mlx5: add mark/flag flow action")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-07-26 14:05:52 +02:00
Nelio Laranjeiro
5366074b01 net/mlx5: fix route Netlink message overflow
Route Netlink message socket is wrongly initialized by registering to
the route link group.  This causes the socket to receive all link
message related to routes whereas the PMD do not expect to receive such
information.  In some situation it ends by filling the socket at a point
that any new message cannot be exchanged.
As the PMD is not expected to process such broadcast messages, the
parameter in the nl_group in the function is also remove.

Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")
Cc: stable@dpdk.org

Signed-off-by: Zijie Pan <zijie.pan@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-07-26 14:05:52 +02:00
Yongseok Koh
c618e7e82b net/mlx5: fix assert for Tx completion queue count
There should be at least one Tx CQE remained if Tx WQ and txq->elts[] have
available slots to send a packet because the size of Tx CQ is exactly
calculated from the size of other resources. As it is guaranteed, it is
checked by an assertion.

max_elts is checked after the assertion for Tx CQ. If no slot is available
in txq->elts[], the assertion would be wrong.

Fixes: 2eefbec531 ("net/mlx5: add missing sanity checks for Tx completion queue")
Fixes: 6ce84bd889 ("net/mlx5: add enhanced multi-packet send for ConnectX-5")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
2018-07-26 14:05:52 +02:00
Nelio Laranjeiro
f872b4b99d net/mlx5: fix representors detection
On systems where the required Netlink commands are not supported but
Mellanox OFED is installed, representors information must be retrieved
through sysfs.

Fixes: 26c08b979d ("net/mlx5: add port representor awareness")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-07-26 14:05:52 +02:00
Nelio Laranjeiro
2bc98393ac net/mlx5: fix TCI mask filter
In mlx5_traffic_enable() the TCI mask for the VLAN is wrong causing the
sub flow engine to reject the rule.

Fixes: 272733b5eb ("net/mlx5: use flow to enable unicast traffic")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-07-26 14:05:52 +02:00
Moti Haimovsky
79d0989213 net/mlx5: fix build with old kernels
This commit fixes compilation errors due to missing definitions
found when compiling mlx5 PMD from DPDK 17.11-LTS on Ubuntu 12.4
with kernel 3.15.

Fixes: 75ef62a943 ("net/mlx5: fix link speed capability information")
Fixes: 5bfc9fc112 ("net/mlx5: use static assert for compile-time sanity checks")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
21174f2a5c net/mlx5: add port ID pattern item to switch flow rules
This enables flow rules to match traffic coming from a different DPDK port
ID associated with the device (PORT_ID pattern item), mainly for the
convenience of applications that want to deal with a single port ID for all
flow rules associated with some physical device.

Testpmd example:

- Creating a flow rule on port ID 1 to consume all traffic from port ID 0
  and direct it to port ID 2:

  flow create 1 ingress transfer pattern port_id id is 0 / end actions
     port_id id 2 / end

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
7ac6778d50 net/mlx5: add VLAN item and actions to switch flow rules
This enables flow rules to explicitly match VLAN traffic (VLAN pattern
item) and perform various operations on VLAN headers at the switch level
(OF_POP_VLAN, OF_PUSH_VLAN, OF_SET_VLAN_VID and OF_SET_VLAN_PCP actions).

Testpmd examples:

- Directing all VLAN traffic received on port ID 1 to port ID 0:

  flow create 1 ingress transfer pattern eth / vlan / end actions
     port_id id 0 / end

- Adding a VLAN header to IPv6 traffic received on port ID 1 and directing
  it to port ID 0:

  flow create 1 ingress transfer pattern eth / ipv6 / end actions
     of_push_vlan ethertype 0x8100 / of_set_vlan_vid vlan_vid 42 /
     port_id id 0 / end

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
2bfc777e07 net/mlx5: add L2-L4 pattern items to switch flow rules
This enables flow rules to explicitly match supported combinations of
Ethernet, IPv4, IPv6, TCP and UDP headers at the switch level.

Testpmd example:

- Dropping TCPv4 traffic with a specific destination on port ID 2:

  flow create 2 ingress transfer pattern eth / ipv4 / tcp dst is 42 / end
     actions drop / end

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
9b33df8e0c net/mlx5: add fate actions to switch flow rules
This patch enables creation of rte_flow rules that direct matching traffic
to a different port (e.g. another VF representor) or drop it directly at
the switch level (PORT_ID and DROP actions).

Testpmd examples:

- Directing all traffic to port ID 0:

  flow create 1 ingress transfer pattern end actions port_id id 0 / end

- Dropping all traffic normally received by port ID 1:

  flow create 1 ingress transfer pattern end actions drop / end

Note the presence of the transfer attribute, which requests them to be
applied at the switch level. All traffic is matched due to empty pattern.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
8f9059ccee net/mlx5: add framework for switch flow rules
Because mlx5 switch flow rules are configured through Netlink (TC
interface) and have little in common with Verbs, this patch adds a separate
parser function to handle them.

- mlx5_nl_flow_transpose() converts a rte_flow rule to its TC equivalent
  and stores the result in a buffer.

- mlx5_nl_flow_brand() gives a unique handle to a flow rule buffer.

- mlx5_nl_flow_create() instantiates a flow rule on the device based on
  such a buffer.

- mlx5_nl_flow_destroy() performs the reverse operation.

These functions are called by the existing implementation when encountering
flow rules which must be offloaded to the switch (currently relying on the
transfer attribute).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Adrien Mazarguil
20b71e92ef net/mlx5: lay groundwork for switch offloads
With mlx5, unlike normal flow rules implemented through Verbs for traffic
emitted and received by the application, those targeting different logical
ports of the device (VF representors for instance) are offloaded at the
switch level and must be configured through Netlink (TC interface).

This patch adds preliminary support to manage such flow rules through the
flow API (rte_flow).

Instead of rewriting tons of Netlink helpers and as previously suggested by
Stephen [1], this patch introduces a new dependency to libmnl [2]
(LGPL-2.1) when compiling mlx5.

[1] https://mails.dpdk.org/archives/dev/2018-March/092676.html
[2] https://netfilter.org/projects/libmnl/

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Moti Haimovsky
6bf10ab69b net/mlx5: support 32-bit systems
This patch adds support for building and running mlx5 PMD on
32bit systems such as i686.

The main issue to tackle was handling the 32bit access to the UAR
as quoted from the mlx5 PRM:
QP and CQ DoorBells require 64-bit writes. For best performance, it
is recommended to execute the QP/CQ DoorBell as a single 64-bit write
operation. For platforms that do not support 64 bit writes, it is
possible to issue the 64 bits DoorBells through two consecutive
writes,
each write 32 bits, as described below:
* The order of writing each of the Dwords is from lower to upper
  addresses.
* No other DoorBell can be rung (or even start ringing) in the midst
 of an on-going write of a DoorBell over a given UAR page.

The last rule implies that in a multi-threaded environment, the access
to a UAR page (which can be accessible by all threads in the process)
must be synchronized (for example, using a semaphore) unless an atomic
write of 64 bits in a single bus operation is guaranteed. Such a
synchronization is not required for when ringing DoorBells on different
UAR pages.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-12 14:34:59 +02:00
Shahaf Shuler
06b1fe3f6d net/mlx5: fix build with rdma-core v19
The flow counter support introduced by
commit 9a761de8ea ("net/mlx5: flow counter support") was intend to
work only with MLNX_OFED_4.3 as the upstream rdma-core
libraries were lack such support.

On rdma-core v19 the support for the flow counters was added but with
different user APIs, hence causing compilation issues on the PMD.

This patch fix the compilation errors by forcing the flow counters
to be enabled only with MLNX_OFED APIs.
Once MLNX_OFED and rdma-core APIs will be aligned, a proper patch to
support the new API will be submitted.

Fixes: 9a761de8ea ("net/mlx5: flow counter support")
Cc: stable@dpdk.org

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2018-07-12 12:53:59 +02:00
Nelio Laranjeiro
60bd8c9747 net/mlx5: add count flow action
This is only supported by Mellanox OFED.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-12 12:12:27 +02:00