532 Commits

Author SHA1 Message Date
Ferruh Yigit
cd8c7c7ce2 ethdev: replace bus specific struct with generic dev
Public struct rte_eth_dev_info has a "struct rte_pci_device" field in it
although it is common for all ethdev in all buses.

Replacing pci specific struct with generic device struct and updating
places that are using pci device in a way to get this information from
generic device.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-14 00:41:44 +02:00
Nélio Laranjeiro
db209cc32a net/mlx5: add parameter for Netlink support in VF
All Netlink requests the PMD makes can also be issued through the iproute2
command line interface, enabling VF behavior configuration without having
to modify the application or reach PMD limits (e.g. MAC address number
limit).

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:41:44 +02:00
Nélio Laranjeiro
dd4bb90bc3 net/mlx5: use Netlink to enable promisc/allmulti mode
VF devices are not able to receive promisc or allmulti traffic unless they
explicitly request it through Netlink.  The request is then processed by
the PF, which enables the requested mode.

This requires the VF to be trusted by the PF.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:41:44 +02:00
Nélio Laranjeiro
ccdcba53a3 net/mlx5: use Netlink to add/remove MAC addresses
VF devices are not able to receive traffic unless they explicitly request
it through Netlink.  The request is then processed by the PF, which
adds/removes the MAC address to/from the VF table if the VF is trusted.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:41:44 +02:00
Yongseok Koh
f84411be9e net/mlx5: remove excessive data prefetch
In Enhanced Multi-Packet Send (eMPW), entire packet data is prefetched to
LLC if it isn't inlined. Even though this helps reduce jitter when HW
fetches data by DMA, it can thrash the LLC by evicting precious data.
And if the size of queue is large and there are many queues, this might not
be effective. Also, if application runs on a remote node from the PCIe
link, it may not be helpful and can even cause bad results.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:40:21 +02:00
Bin Huang
0915e287a6 net/mlx5: add packet type index for TCP ack
According to CQE format:
- l4_hdr_type:
     0 - None
     1 - TCP header was present in the packet
     2 - UDP header was present in the packet
     3 - TCP header was present in the packet with Empty
         TCP ACK indication. (TCP packet <ACK> flag is set,
         and packet carries no data)
     4 - TCP header was present in the packet with TCP ACK indication.
         (TCP packet <ACK> flag is set, and packet carries data).

A packet should be identified as TCP packet if l4_hdr_type is 1, 3 or 4.
Add corresponding idx of TCP ACK to ptype table.

previous discussion:
https://www.mail-archive.com/users@dpdk.org/msg02980.html
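
The classification rule above can be sketched in C as follows. This is a minimal illustration only: the enumerator and function names are hypothetical and are not taken from the mlx5 PMD; only the numeric values come from the CQE format quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* l4_hdr_type values as listed above; the names are hypothetical. */
enum sketch_l4_hdr_type {
	SKETCH_L4_NONE = 0,
	SKETCH_L4_TCP = 1,
	SKETCH_L4_UDP = 2,
	SKETCH_L4_TCP_EMPTY_ACK = 3,
	SKETCH_L4_TCP_ACK = 4,
};

/* Return true when the CQE l4_hdr_type denotes a TCP packet (1, 3 or 4). */
bool
sketch_l4_is_tcp(uint8_t l4_hdr_type)
{
	assert(l4_hdr_type <= SKETCH_L4_TCP_ACK);
	switch (l4_hdr_type) {
	case SKETCH_L4_TCP:
	case SKETCH_L4_TCP_EMPTY_ACK:
	case SKETCH_L4_TCP_ACK:
		return true;
	default:
		return false;
	}
}
```

A ptype lookup table would need a TCP entry at each of these three indexes, which is what adding the TCP ACK index achieves.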

Signed-off-by: Bin Huang <bin.huang@hxt-semitech.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-04-14 00:40:21 +02:00
Bruce Richardson
a11dfe9b65 net/mlx: fix warnings for unused compiler arguments
When linking the mlx glue code libraries using CC, the linker arguments in
LDFLAGS are not prefixed with -Wl. [The EXTRA_LDFLAGS are though.] This
leads to warning messages on build:

clang-5.0: warning: argument unused during compilation: '-e xport-dynamic'

Fix this by checking for $LINK_USING_CC in the Makefiles and prefixing the
LDFLAGS appropriately if set.

Fixes: 27cea11686ff ("net/mlx4: spawn rdma-core dependency plug-in")
Fixes: 59b91bec12c6 ("net/mlx5: spawn rdma-core dependency plug-in")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-04-14 00:40:21 +02:00
Anatoly Burakov
66cc45e293 mem: replace memseg with memseg lists
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on total amount of VA
memory that can be allocated by DPDK application.

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
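
The "whichever is the smaller one" rule for a single list can be sketched as below. This is an assumption-laden illustration, not the EAL code: the function name is hypothetical, and the two limits are passed as parameters rather than read from RTE_MAX_MEMSEG_PER_LIST and RTE_MAX_MEM_MB_PER_LIST.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of address space one list may cover: bounded both by a segment
 * count and by a size in megabytes, whichever gives the smaller result.
 */
uint64_t
sketch_list_len(uint64_t page_sz, uint64_t max_segs, uint64_t max_mem_mb)
{
	uint64_t by_segs, by_mem;

	assert(page_sz != 0);
	by_segs = page_sz * max_segs;  /* limit expressed in pages */
	by_mem = max_mem_mb << 20;     /* megabytes to bytes */
	return by_segs < by_mem ? by_segs : by_mem;
}
```

With small pages the segment count tends to be the binding limit, while with large pages the megabyte cap takes over.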

Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
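
The address-to-segment lookup described above amounts to one subtraction and one division. The sketch below assumes a simplified, hypothetical list structure (not the real rte_memseg_list layout) to show the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a memseg list. */
struct sketch_seg_list {
	uintptr_t base_va; /* start of the reserved address range */
	size_t page_sz;    /* size of every segment in this list */
	size_t n_segs;     /* number of segments the list can hold */
};

/* Return the segment index covering addr, or -1 if addr is outside. */
long
sketch_seg_index(const struct sketch_seg_list *msl, uintptr_t addr)
{
	size_t idx;

	assert(msl != NULL && msl->page_sz != 0);
	if (addr < msl->base_va)
		return -1;
	idx = (addr - msl->base_va) / msl->page_sz;
	return idx < msl->n_segs ? (long)idx : -1;
}
```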

This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets due to limited VA space, DPDK will no longer
spread memory to different sockets like before. Instead, it will
(by default) allocate all of the memory on socket where master
lcore is. To override this behavior, --socket-mem must be used.

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.

Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
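
For reference, locking only part of a file with fcntl() can look like the sketch below. It is a generic POSIX illustration, not the EAL implementation, and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take a non-blocking exclusive lock on bytes [off, off + len)
 * of an open file. Returns 0 on success, a negative errno otherwise.
 */
int
sketch_lock_range(int fd, off_t off, off_t len)
{
	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = off,
		.l_len = len,
	};

	assert(len > 0);
	if (fcntl(fd, F_SETLK, &fl) == -1)
		return -errno;
	return 0;
}
```

flock() has no equivalent of l_start and l_len, which is why it cannot cover the single-file segments case.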

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:55:39 +02:00
Anatoly Burakov
718e35999c net/mlx5: use virt2memseg instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:55:02 +02:00
Anatoly Burakov
8594a2026b net/mlx5: use memseg walk instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:48:12 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Aligning Mellanox SPDX copyrights to a single format.
In addition, switch to SPDX license tags in files which were missed.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Bruce Richardson
c022cb400e convert snprintf to strlcpy
Since we have support for the strlcpy function in DPDK, replace all
instances where a string is copied using snprintf.
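
The two idioms can be compared with the sketch below. Since strlcpy() is not available in every C library, the second function is a hypothetical helper that only mimics strlcpy() semantics for illustration; it is not the DPDK implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Old idiom: copy a string through the printf machinery. */
int
sketch_copy_snprintf(char *dst, const char *src, size_t size)
{
	return snprintf(dst, size, "%s", src);
}

/*
 * Hypothetical helper with strlcpy() semantics: copy at most size - 1
 * bytes, always NUL-terminate when size > 0, return strlen(src).
 */
size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	assert(dst != NULL || size == 0);
	if (size != 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Both leave the same truncated, NUL-terminated string in the destination; the second avoids format string parsing and states the intent directly.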

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Shahaf Shuler
e7041f5529 net/mlx5: fix RSS key length query
The RSS key length returned by rte_eth_dev_info_get command was taken
from the PMD private structure. This structure initialization was done
only after the port configuration.

Since Mellanox devices support only a 40B long RSS key, report this
fixed number instead.

Fixes: 29c1d8bb3e79 ("net/mlx5: handle a single RSS hash key for all protocols")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Shahaf Shuler
a1572312f7 net/mlx5: enforce RSS key length limitation
RSS hash key must be 40 Bytes long.
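
A length check of this kind could be sketched as follows. The macro and function names are hypothetical and the error code is an assumption; only the 40-byte requirement comes from this message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RSS_KEY_LEN 40 /* key length stated above */

/*
 * Copy a user RSS key into dst only when it has the required length.
 * Returns 0 on success, -EINVAL (an assumed error code) otherwise.
 */
int
sketch_rss_key_set(uint8_t dst[SKETCH_RSS_KEY_LEN], const uint8_t *key,
		   size_t key_len)
{
	assert(dst != NULL && key != NULL);
	if (key_len != SKETCH_RSS_KEY_LEN)
		return -EINVAL;
	memcpy(dst, key, SKETCH_RSS_KEY_LEN);
	return 0;
}
```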

Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Dahir Osman
66669155da net/mlx5: setup RSS regardless of queue count
In some environments it is desirable to have the NIC perform RSS
normally on the packet regardless of the number of queues configured.
The RSS hash result that is stored in the mbuf can then be used by
the application to make decisions about how to distribute workloads
to threads, secondary processes, or even virtual machines if the
application is a virtual switch.  This change to the mlx5 driver
aligns with how other drivers in the Intel family work.

Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Tested-by: Allain Legacy <allain.legacy@windriver.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
7b2207afe8 net/mlx5: fix icc build
Remove the second declaration of device_attr [1] inside the loop as well as
the query_device_ex() which has already been done outside of the loop.

[1] https://dpdk.org/ml/archives/dev/2018-March/091744.html

Fixes: 9a761de8ea14 ("net/mlx5: flow counter support")
Cc: stable@dpdk.org

Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-03-30 14:08:44 +02:00
Shahaf Shuler
b7059e6e43 net/mlx5: fix TSO enablement
TSO should be set if either of the TSO offload flags is requested.

Fixes: dbccb4cddcd2 ("net/mlx5: convert to new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
0b1edd21cd net/mlx5: refuse empty VLAN flow specification
The Verbs specification doesn't help to distinguish between packets having
a VLAN and those which do not, which results in flow rules that do not
behave as the user expects, e.g.

 flow create 0 ingress pattern eth / vlan / end action queue index 0 / end
 flow create 0 ingress pattern eth / end action queue index 1 / end

collide in the Verbs definition, as both rules match packets with or
without VLAN.
For this reason, the VLAN specification must not be empty, otherwise the
PMD has to refuse it.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
fca1301768 net/mlx5: improve flow error explanation
Fill the error context in the conversion function to give the user a better
explanation of why the conversion cannot be done.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
749365717f net/mlx5: change tunnel flow priority
Packet matching inner and outer flow rules are caught by the first one
added in the device as both flows are configured with the same priority.
To avoid such a situation, the inner flow can have a higher priority than
the outer ones as their pattern matching will otherwise collide.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
cfee94752b net/mlx5: fix link status to use wait to complete
Wait to complete is present to let the application get a correct status
when it requires it; it should not be ignored.

Fixes: e313ef4c2fe8 ("net/mlx5: fix link state on device start")
Fixes: cb8faed7dde8 ("mlx5: support link status update")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
7ba5320baa net/mlx5: fix link status behavior
This behavior is mixed between what should be handled by the application
and what is under PMD responsibility.

According to DPDK API:
- link_update() should only query the link status [1]
- link_set_{up,down}() should only set the link to the according status [1]
- dev_{start,stop}() should enable/disable traffic reception/emission [2]

On this PMD, the link status is retrieved from the associated net device
owned by the Linux Kernel. Even when this interface is down, it does not
mean that the PMD cannot send/receive traffic from the NIC; those two
pieces of information are unrelated. As long as the physical port is
active and has a link, the PMD can receive/send traffic on the wire.

According to DPDK API, calling the rte_eth_dev_start() even when the Linux
interface link is down is then possible and allowed, as the traffic will
flow between the DPDK application and the Physical port.

This also means that a synchronization between the Linux interface and the
DPDK application remains under the DPDK application responsibility.

To handle such synchronization, the application should follow this
scheme to start:

 rte_eth_link_get(port_id, &link);
 if (link.link_status == ETH_LINK_DOWN)
	rte_eth_dev_set_link_up(port_id);
 rte_eth_dev_start(port_id);

Taking into account the possible return values of each function.

and to stop:

 rte_eth_dev_stop(port_id);
 rte_eth_dev_set_link_down(port_id);

The application should also set the LSC interrupt callbacks to catch and
behave accordingly when the administrator sets the Linux device down/up.
The same callbacks are called when the link on the medium goes down/up.

[1] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev_core.h
[2] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.h#n1677

Fixes: c7bf62255edf ("net/mlx5: fix handling link status event")
Fixes: e313ef4c2fe8 ("net/mlx5: fix link state on device start")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
f47ba80080 net/mlx5: remove kernel version check
Kernel version check was introduced in
commit 3a49ffe38a95 ("net/mlx5: fix link status query")
due to a bug fixed by
commit ef09a7fc7620 ("net/mlx5: fix inconsistent link status query")

This patch restores the previous behavior as described in Linux API.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Yongseok Koh
264713ba10 net/mlx5: fix ARM build
rdma-core v16 has a bug. The following compilation error occurs on ARM
hosts.

In file included
from drivers/net/mlx5/mlx5_glue.h:16:0,
from drivers/net/mlx5/mlx5_glue.c:11:
/usr/include/infiniband/mlx5dv.h:144:2: error: unknown type name 'off_t'
off_t   uar_mmap_offset;
^

As a temporary fix, sys/types.h is included in PMD. This has been fixed in
rdma-core v17. This can be removed when all the Linux distros are shipped
with rdma-core v17 or back-ported fix. As of now, RedHat 7.5 is known to
have rdma-core v16.

Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Xueming Li
be939f60f4 net/mlx5: fix existing file removal
There is no guarantee that the file won't be removed by external
user/application between the stat() and remove() syscalls, remove() will
fail if the file no longer exists.
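
One way to avoid the stat()/remove() race is to call remove() directly and treat a missing file as success. The sketch below illustrates that idea under this assumption; the function name is hypothetical and this is not necessarily how the patch is written.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Remove path if it exists. A file that is already gone is not an
 * error. Returns 0 on success, a negative errno otherwise.
 */
int
sketch_remove_if_present(const char *path)
{
	assert(path != NULL);
	if (remove(path) != 0 && errno != ENOENT)
		return -errno;
	return 0;
}
```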

Fixes: f8b9a3bad467 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
a170a30d22 net/mlx5: use dynamic logging
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
0f99970b4a net/mlx5: use port id in PMD log
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
a6d83b6a92 net/mlx5: standardize on negative errno values
Set rte_errno systematically as well.
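
The convention can be illustrated with the sketch below. The variable sketch_errno is a hypothetical stand-in for rte_errno (which is per-lcore in DPDK), and the function is not part of the PMD.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical stand-in for rte_errno; a plain global for brevity. */
int sketch_errno;

/*
 * Open a file read-only. On failure, record the error code and return
 * it as a negative value; on success, return the file descriptor.
 */
int
sketch_open_ro(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd == -1) {
		sketch_errno = errno;
		return -sketch_errno;
	}
	return fd;
}
```

Callers can then test for `ret < 0` uniformly and still read the cause from the errno-like variable.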

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
925061b58b net/mlx5: change non failing function return values
These functions return int although they are not supposed to fail,
resulting in unnecessary checks in their callers.
Some return an error code where the result should be a boolean.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
af4f09f282 net/mlx5: prefix all functions with mlx5
This change removes the need to distinguish unlocked priv_*() functions
which are therefore renamed using a mlx5_*() prefix for consistency.

At the same time, all functions from mlx5 use a pointer to the ETH device
instead of the one to the PMD private data.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
7b2423cd2e net/mlx5: remove control path locks
In priv struct only the memory region needs to be protected against
concurrent access between the control plane and the data plane.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
0b3456e391 net/mlx5: remove useless empty lines
Some empty lines have been added in the middle of the code without any
reason.  This commit removes them.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
fb732b0a49 net/mlx5: add missing function documentation
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
c9e88d35da net/mlx5: normalize function prototypes
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
56f08e1671 net/mlx5: mark parameters with unused attribute
Replaces all (void)foo; by __rte_unused macro except when variables are
under #if statements.
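
The before/after shape of this change can be sketched as below. SKETCH_UNUSED is a hypothetical macro standing in for __rte_unused, assumed here to expand to the GCC/Clang unused attribute; the callbacks are illustrative only.

```c
#include <assert.h>

/* Hypothetical stand-in for __rte_unused (GCC/Clang attribute assumed). */
#define SKETCH_UNUSED __attribute__((__unused__))

/* Before: the unused parameter is silenced with a cast in the body. */
int
sketch_cb_old(int port_id, void *arg)
{
	(void)arg;
	assert(port_id >= 0);
	return port_id;
}

/* After: the parameter is marked unused in the prototype itself. */
int
sketch_cb_new(int port_id, void *arg SKETCH_UNUSED)
{
	assert(port_id >= 0);
	return port_id;
}
```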

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
3692c7ec9e net/mlx5: name parameters in function prototypes
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:44 +02:00
Nélio Laranjeiro
a61888c8f2 net/mlx5: fix sriov flag
priv_get_num_vfs() was used to help the PMD in prefetching the mbuf in
datapath when the PMD was behaving in VF mode.
This knowledge is no longer used.

Fixes: 528a9fbec6de ("net/mlx5: support ConnectX-5 devices")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:43 +02:00
Adrien Mazarguil
08c028d08c net/mlx: fix rdma-core glue path with EAL plugins
Glue object files are looked up in RTE_EAL_PMD_PATH by default when set and
should be installed in this directory.

During startup, EAL attempts to load them automatically like other plug-ins
found there. While normally harmless, dlopen() fails when rdma-core is not
installed, EAL interprets this as a fatal error and terminates the
application.

This patch requests glue objects to be installed in a different directory
to prevent their automatic loading by EAL since they are PMD helpers, not
actual DPDK plug-ins.

Fixes: f6242d0655cd ("net/mlx: make rdma-core glue path configurable")
Cc: stable@dpdk.org

Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Tested-by: Timothy Redaelli <tredaelli@redhat.com>
2018-03-30 14:08:43 +02:00
Yongseok Koh
24a8f52455 net/mlx5: fix disabling Tx packet inlining
Adding 'txq_inline=0' to PMD parameter should disable Tx packet inlining
but it doesn't work properly for Enhanced Multi-Packet Send.

Fixes: 6ce84bd88919 ("net/mlx5: add enhanced multi-packet send for ConnectX-5")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:42 +02:00
Shahaf Shuler
038e72511f net/mlx5: fix tunnel offloads cap query
The query for the tunnel stateless offloads is wrongly implemented
because of:

1. It was using the device id to query for the offloads.
2. It was using a compilation flag for Verbs which no longer exists.

The main reason was lack of proper API from Verbs.

Fix the query to use the rdma-core API. The capability returned from
rdma-core refers to both Tx and Rx sides.
Even though there are separate caps for GRE and VXLAN, the implementation
merges them into a single flag in order to simplify the checks on the data
path.

Fixes: 43e9d9794cde ("net/mlx5: support upstream rdma-core")
Fixes: f5fde5205101 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-03-30 14:08:42 +02:00
Nélio Laranjeiro
c55a166795 net/mlx5: fix flow creation with a single target queue
Adding a pattern targeting a single queue wrongly behaves as if it were an
RSS request, ending up creating several Verbs flow rules to match the RSS
configuration.

Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 14:08:42 +02:00
Adrien Mazarguil
fc40db9973 net/mlx: control netdevices through ioctl only
Several control operations implemented by these PMDs affect netdevices
through sysfs, itself subject to file system permission checks enforced by
the kernel, which limits their use for most purposes to applications
running with root privileges.

Since performing the same operations through ioctl() requires fewer
capabilities (only CAP_NET_ADMIN) and given the remaining operations are
already implemented this way, this patch standardizes on ioctl() and gets
rid of redundant code.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
2018-03-30 14:08:42 +02:00
Xueming Li
132c4c6e00 net/mlx5: add log on flow creation error
Dump an error message when flow creation fails.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-13 18:17:30 +01:00
Shahaf Shuler
c0ff2fb814 net/mlx5: revert multicast rule verbs flow type
This is to revert the following commits:
commit da646bd93888 ("net/mlx5: fix all multi verification code position")
commit 0a40a1363a4d ("net/mlx5: fix flow type for allmulti rules")

The last one introduced a bug in the following diff:
@@ -1262,6 +1274,7 @@ struct ibv_spec_header {
                eth.val.ether_type &= eth.mask.ether_type;
        }
        mlx5_flow_create_copy(parser, &eth, eth_size);
+       parser->allmulti = eth.val.dst_mac[0] & 1;
        return 0;
 }

As broadcast rules will be considered of type allmulti as well.
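
The problem with testing only the low bit of the first destination byte can be seen in the sketch below: that bit is set for multicast and for broadcast addresses alike. The helper names are hypothetical and not taken from the PMD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN 6

/* Group bit: set for both multicast and broadcast addresses. */
bool
sketch_mac_group_bit(const uint8_t mac[SKETCH_MAC_LEN])
{
	assert(mac != NULL);
	return (mac[0] & 1) != 0;
}

/* True only for ff:ff:ff:ff:ff:ff. */
bool
sketch_mac_is_broadcast(const uint8_t mac[SKETCH_MAC_LEN])
{
	static const uint8_t bcast[SKETCH_MAC_LEN] = {
		0xff, 0xff, 0xff, 0xff, 0xff, 0xff
	};

	assert(mac != NULL);
	return memcmp(mac, bcast, SKETCH_MAC_LEN) == 0;
}

/* True for group addresses that are not the broadcast address. */
bool
sketch_mac_is_multicast(const uint8_t mac[SKETCH_MAC_LEN])
{
	return sketch_mac_group_bit(mac) && !sketch_mac_is_broadcast(mac);
}
```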

The patch was originally intended to enable VF to receive all multicast
traffic by using the IBV_FLOW_ATTR_MC_DEFAULT flow type.
Since the support was removed from the kernel, there is no point in
fixing this issue, hence the revert.

Fixes: da646bd93888 ("net/mlx5: fix all multi verification code position")
Fixes: 0a40a1363a4d ("net/mlx5: fix flow type for allmulti rules")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-13 17:00:37 +01:00
Xueming Li
8c5bca92c9 net/mlx5: fix close after start failure
This patch fixes a primary socket assertion error during close on a device
that failed to start.

Fixes: f8b9a3bad467 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-13 16:55:49 +01:00
Shahaf Shuler
282da936f9 net/mlx5: revert support of IPv4 time-to-live filter
Neither upstream kernel nor MLNX_OFED supports such a filter.
There is no point announcing this feature.

Reverts commit 0fb2c9842b20 ("net/mlx5: support IPv4 time-to-live filter")

Fixes: 0fb2c9842b20 ("net/mlx5: support IPv4 time-to-live filter")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-08 18:42:14 +01:00
Nélio Laranjeiro
fbab400f61 net/mlx5: fix UAR remapping on non configured queues
priv_tx_uar_remap() is wrongly considering the queue is already configured
and thus present in the queue array of the device.

Fixes: f8b9a3bad467 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-06 14:35:07 +01:00
Nélio Laranjeiro
1f30a22358 net/mlx5: fix flow RSS configuration
An RSS configuration without a key is valid according to the
rte_eth_rss_conf API definition.

Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Cc: stable@dpdk.org

Reported-by: Yuanhan Liu <yliu@fridaylinux.org>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-02-06 14:35:07 +01:00
Adrien Mazarguil
f6242d0655 net/mlx: make rdma-core glue path configurable
Since rdma-core glue libraries are intrinsically tied to their respective
PMDs and used as internal plug-ins, their presence in the default search
path among other system libraries for the dynamic linker is not necessarily
desired.

This commit enables their installation and subsequent look-up at run time
in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also
be overridden by environment variables MLX[45]_GLUE_PATH.
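
The override order described above could be sketched as follows for the mlx5 case. The function name is hypothetical, and treating an empty environment variable as "not set" is an assumption of this sketch, not something this message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Pick the directory to search for the glue library: the environment
 * variable wins when set to a nonempty string (assumption), otherwise
 * the path configured at build time is used.
 */
const char *
sketch_glue_path(const char *configured_path)
{
	const char *env;

	assert(configured_path != NULL);
	env = getenv("MLX5_GLUE_PATH");
	if (env != NULL && env[0] != '\0')
		return env;
	return configured_path;
}
```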

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00
Adrien Mazarguil
6d5df2eaf6 net/mlx: version rdma-core glue libraries
When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.
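
The run-time half of this check can be pictured with the sketch below. The structure layout, the macro and the version strings are all hypothetical placeholders; the real glue structures carry many more members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical version string the PMD was built against. */
#define SKETCH_GLUE_VERSION "sketch-1"

/* Hypothetical, heavily reduced glue structure. */
struct sketch_glue {
	const char *version; /* dedicated version field */
};

/* Accept a loaded glue library only if its version matches exactly. */
bool
sketch_glue_compatible(const struct sketch_glue *glue)
{
	assert(glue != NULL);
	return glue->version != NULL &&
	       strcmp(glue->version, SKETCH_GLUE_VERSION) == 0;
}
```

Refusing a mismatching library up front turns a potential crash into a clean load failure.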

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00