Commit Graph

450 Commits

Author SHA1 Message Date
Adrien Mazarguil
76e9a55b5b ethdev: add transfer attribute to flow API
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
18aee2861a ethdev: add encap level to RSS flow API action
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.

When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.

This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.

Contrary to the change announced by commit 676b605182 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
929e331934 ethdev: add hash function to RSS flow API action
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.

Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
ac8d22de23 ethdev: flatten RSS configuration in flow API
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.

A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.

This patch updates all PMDs and example applications accordingly.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

[1] commit 676b605182 ("doc: announce ethdev API change for RSS
    configuration")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
19b3bc47c6 ethdev: fix C99 flexible arrays from flow API
This patch replaces C99-style flexible arrays in struct rte_flow_action_rss
and struct rte_flow_item_raw with standard pointers to the same data.

They proved difficult to use in the field (e.g. no possibility of static
initialization) and unsuitable for C++ applications.
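
As an illustration of the static initialization problem, here is a minimal
sketch. The struct and variable names (demo_rss_flex, demo_rss_ptr,
demo_queues) are hypothetical stand-ins, not the actual rte_flow definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an action using a C99 flexible array member:
 * the queue list lives at the end of the struct, so ISO C offers no
 * portable way to give it contents in a static initializer. */
struct demo_rss_flex {
	uint16_t num;
	uint16_t queue[]; /* storage must be allocated along with the struct */
};

/* Hypothetical stand-in for the pointer-based layout: the list is a
 * separate object that the struct merely points to. */
struct demo_rss_ptr {
	uint16_t num;
	const uint16_t *queue;
};

static const uint16_t demo_queues[] = { 0, 1, 2, 3 };

/* Plain static initialization works with the pointer layout. */
static const struct demo_rss_ptr demo_rss = {
	.num = sizeof(demo_queues) / sizeof(demo_queues[0]),
	.queue = demo_queues,
};

/* Return the last queue index listed in a pointer-based action. */
static uint16_t
demo_last_queue(const struct demo_rss_ptr *rss)
{
	assert(rss->num > 0);
	return rss->queue[rss->num - 1];
}
```

With the pointer layout the queue list can also be shared between several
actions, at the cost of keeping the pointed-to array alive for as long as the
action is in use.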

Affected PMDs and examples are updated accordingly.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
cc17feb904 ethdev: alter behavior of flow API actions
This patch makes the following changes to flow rule actions:

- List order now matters: actions are redefined as performed first to last
  instead of "all simultaneously".

- Repeated actions are now supported (e.g. specifying QUEUE multiple times
  now duplicates traffic among them). Previously only the last action of
  any given kind was taken into account.

- No more distinction between terminating/non-terminating/meta actions.
  Flow rules themselves are now defined as always terminating unless a
  PASSTHRU action is specified.

These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matters when combined.

Previously one would have to do so through multiple flow rules by combining
PASSTHRU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.

This breaks ABI compatibility for the following public functions:

- rte_flow_create()
- rte_flow_validate()

PMDs with rte_flow support are modified accordingly:

- bnxt: no change, implementation already forbids multiple actions and does
  not support PASSTHRU.

- e1000: no change, same as bnxt.

- enic: modified to forbid redundant actions, no support for default drop.

- failsafe: no change needed.

- i40e: no change, implementation already forbids multiple actions.

- ixgbe: same as i40e.

- mlx4: modified to forbid multiple fate-deciding actions and drop when
  unspecified.

- mlx5: same as mlx4, with other redundant actions also forbidden.

- sfc: same as mlx4.

- tap: implementation already complies with the new behavior except for
  the default pass-through modified as a default drop.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Yongseok Koh
96525b9e19 net/mlx4: fix alignment of memory region
The memory region is [start, end), so if the memseg of 'end' isn't
allocated yet, the returned memseg will have zero entries and this will
make 'end' zero (nil).

Fixes: c2fe582322 ("net/mlx4: use virt2memseg instead of iteration")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 15:54:56 +01:00
Ferruh Yigit
3fef0822ec drivers/net: update link status
Update link status related feature document items and minor updates in
some link status related functions.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 15:54:56 +01:00
Adrien Mazarguil
ef134c8daa net/mlx4: fix ignored RSS hash types
When an unsupported hash type is part of a RSS configuration structure, it
is silently ignored instead of triggering an error. This may lead
applications to assume that such types are accepted, while they are in fact
not part of the resulting flow rules.

Fixes: 078b8b452e ("net/mlx4: add RSS flow rule action support")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 15:54:56 +01:00
Adrien Mazarguil
cb43322fbd net/mlx4: fix RSS resource leak in case of error
When memory cannot be allocated for a flow rule, its RSS context reference
is not dropped.

Fixes: 078b8b452e ("net/mlx4: add RSS flow rule action support")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 15:54:56 +01:00
Olivier Matz
caccf8b318 ethdev: return diagnostic when setting MAC address
Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
return code is added to notify the caller (librte_ether) if an error
occurred in the PMD.

The new default MAC address is now copied in dev->data->mac_addrs[0]
only if the operation is successful.

The patch also updates all the PMDs accordingly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-14 00:43:30 +02:00
Ophir Munk
de1df14e6e net/mlx4: support CRC strip toggling
Previous to this commit mlx4 CRC stripping was executed by default and
there was no verbs API to disable it.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:43:30 +02:00
Ferruh Yigit
cd8c7c7ce2 ethdev: replace bus specific struct with generic dev
Public struct rte_eth_dev_info has a "struct rte_pci_device" field in it
although it is common to all ethdevs on all buses.

Replace the PCI specific struct with a generic device struct and update
places that use the PCI device so that they get this information from the
generic device.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-14 00:41:44 +02:00
Bruce Richardson
a11dfe9b65 net/mlx: fix warnings for unused compiler arguments
When linking the mlx glue code libraries using CC, the linker arguments in
LDFLAGS are not prefixed with -Wl. [The EXTRA_LDFLAGS are though.] This
leads to warning messages on build:

clang-5.0: warning: argument unused during compilation: '-e xport-dynamic'

Fix this by checking for $LINK_USING_CC in the Makefiles and prefixing the
LDFLAGS appropriately if set.

Fixes: 27cea11686 ("net/mlx4: spawn rdma-core dependency plug-in")
Fixes: 59b91bec12 ("net/mlx5: spawn rdma-core dependency plug-in")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-04-14 00:40:21 +02:00
Rami Rosen
4db261fc88 net/mlx4: fix a typo in header file
This patch fixes a trivial typo in mlx4 header file.

Fixes: 3d555728c9 ("net/mlx4: separate Rx/Tx definitions")
Cc: stable@dpdk.org

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-14 00:40:21 +02:00
Anatoly Burakov
66cc45e293 mem: replace memseg with memseg lists
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on the total amount of VA
memory that can be allocated by a DPDK application.

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.

Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
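
The address-based lookup described above can be sketched as follows. The
struct demo_memseg_list and its fields are hypothetical simplifications made
up for this example, not the actual EAL structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memseg list: a contiguous VA area split into
 * equally sized pages, one segment per page. */
struct demo_memseg_list {
	uintptr_t base_va; /* start of the reserved address range */
	size_t page_sz;    /* size of each page in bytes */
	size_t n_segs;     /* number of pages covered by the list */
};

/* Return the index of the segment containing addr, or -1 if addr lies
 * outside the list. */
static long
demo_addr_to_seg_idx(const struct demo_memseg_list *msl, uintptr_t addr)
{
	uintptr_t end = msl->base_va + msl->page_sz * msl->n_segs;

	assert(msl->page_sz != 0);
	if (addr < msl->base_va || addr >= end)
		return -1;
	/* Offset from the list base divided by page size gives the slot. */
	return (long)((addr - msl->base_va) / msl->page_sz);
}
```

Because every page in a list has the same size, the lookup is a subtraction
and a division rather than a walk over all segments.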

This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets due to limited VA space, DPDK will no longer
spread memory to different sockets like before. Instead, it will
(by default) allocate all of the memory on socket where master
lcore is. To override this behavior, --socket-mem must be used.

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.

Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
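
For reference, a minimal sketch of the byte-range locking that fcntl() allows
and flock() does not. The helper names demo_lock_region() and
demo_unlock_region() are hypothetical and do not reflect the actual EAL code.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to take a non-blocking write lock on bytes [off, off + len) of fd.
 * Returns 0 on success, -1 on failure (errno is set by fcntl()). */
static int
demo_lock_region(int fd, off_t off, off_t len)
{
	struct flock fl = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_start = off,
		.l_len = len,
	};

	assert(fd >= 0);
	return fcntl(fd, F_SETLK, &fl);
}

/* Release a lock previously taken on the same byte range. */
static int
demo_unlock_region(int fd, off_t off, off_t len)
{
	struct flock fl = {
		.l_type = F_UNLCK,
		.l_whence = SEEK_SET,
		.l_start = off,
		.l_len = len,
	};

	assert(fd >= 0);
	return fcntl(fd, F_SETLK, &fl);
}
```

Note that fcntl() record locks are owned by the process, so a conflict is only
reported when a different process holds an overlapping range.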

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:55:39 +02:00
Anatoly Burakov
c2fe582322 net/mlx4: use virt2memseg instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:55:00 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Align Mellanox SPDX copyrights to a single format.
In addition, switch files which were missed to SPDX license tags.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Bruce Richardson
c022cb400e convert snprintf to strlcpy
Since we have support for the strlcpy function in DPDK, replace all
instances where a string is copied using snprintf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Adrien Mazarguil
08c028d08c net/mlx: fix rdma-core glue path with EAL plugins
Glue object files are looked up in RTE_EAL_PMD_PATH by default when set and
should be installed in this directory.

During startup, EAL attempts to load them automatically like other plug-ins
found there. While normally harmless, dlopen() fails when rdma-core is not
installed; EAL interprets this as a fatal error and terminates the
application.

This patch requests glue objects to be installed in a different directory
to prevent their automatic loading by EAL since they are PMD helpers, not
actual DPDK plug-ins.

Fixes: f6242d0655 ("net/mlx: make rdma-core glue path configurable")
Cc: stable@dpdk.org

Reported-by: Timothy Redaelli <tredaelli@redhat.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Tested-by: Timothy Redaelli <tredaelli@redhat.com>
2018-03-30 14:08:43 +02:00
Adrien Mazarguil
fc40db9973 net/mlx: control netdevices through ioctl only
Several control operations implemented by these PMDs affect netdevices
through sysfs, itself subject to file system permission checks enforced by
the kernel, which limits their use for most purposes to applications
running with root privileges.

Since performing the same operations through ioctl() requires fewer
capabilities (only CAP_NET_ADMIN) and given the remaining operations are
already implemented this way, this patch standardizes on ioctl() and gets
rid of redundant code.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
2018-03-30 14:08:42 +02:00
Moti Haimovsky
0ab56bd30c net/mlx4: add CRC stripping capability
This patch updates mlx4 Rx offload capabilities to also indicate that
Rx CRC stripping is (always) supported.

Since the device does not support disabling CRC stripping the PMD
silently ignores such requests.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-08 18:42:14 +01:00
Adrien Mazarguil
f6242d0655 net/mlx: make rdma-core glue path configurable
Since rdma-core glue libraries are intrinsically tied to their respective
PMDs and used as internal plug-ins, their presence in the default search
path among other system libraries for the dynamic linker is not necessarily
desired.

This commit enables their installation and subsequent look-up at run time
in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also
be overridden by environment variables MLX[45]_GLUE_PATH.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00
Adrien Mazarguil
6d5df2eaf6 net/mlx: version rdma-core glue libraries
When built as separate objects, these libraries do not have unique names.
Since they do not maintain a stable ABI, loading an incompatible library
may result in a crash (e.g. in case multiple versions are installed).

This patch addresses the above by versioning glue libraries, both on the
file system (version suffix) and by comparing a dedicated version field
member in glue structures.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00
Adrien Mazarguil
747ac2b4d9 net/mlx: fix missing includes for rdma-core glue
For consistency since these includes are already pulled by others.

Fixes: 4eba244b78 ("net/mlx4: move rdma-core calls to separate file")
Fixes: 0e83b8e536 ("net/mlx5: move rdma-core calls to separate file")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00
Adrien Mazarguil
2a3b00973d net/mlx: add debug checks to glue structure
This code should catch mistakes early if a glue structure member is added
without a corresponding implementation in the library.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 14:35:07 +01:00
Moti Haimovsky
c7aaaecd41 net/mlx4: fix Rx offload non-fragmented indication
This patch fixes the missing RTE_PTYPE_L4_NONFRAG on non-fragmented
IP packets with unrecognized payload type.

Fixes: aee4a03fee ("net/mlx4: enhance Rx packet type offloads")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 12:51:30 +01:00
Adrien Mazarguil
ff20ecbf2a net/mlx4: fix drop flow resources leak
Resources allocated for drop flow rules are not freed properly. This causes
a memory leak and triggers an assertion failure on a reference counter when
compiled in debug mode.

This issue can be reproduced with testpmd by entering the following
commands:

 flow create 0 ingress pattern eth / end actions drop / end
 port start all
 port stop all
 port start all
 port stop all
 quit

The reason is additional references are taken when re-enabling existing
flow rules, a common occurrence when rehashing configuration.

Fixes: d3a7e09234 ("net/mlx4: allocate drop flow resources on demand")
Cc: stable@dpdk.org

Reported-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-05 13:42:53 +01:00
Olivier Matz
82092c8734 net/mlx4: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-02-01 02:33:04 +01:00
Adrien Mazarguil
27cea11686 net/mlx4: spawn rdma-core dependency plug-in
When mlx4 is not compiled directly as an independent shared object (e.g.
CONFIG_RTE_BUILD_SHARED_LIB not enabled for performance reasons), DPDK
applications inherit its dependencies on libibverbs and libmlx4 through
rte.app.mk.

This is an issue both when DPDK is delivered as a binary package (Linux
distributions) and for end users because rdma-core then propagates as a
mandatory dependency for everything.

Application writers relying on binary DPDK packages are not necessarily
aware of this fact and may end up delivering packages with broken
dependencies.

This patch therefore introduces an intermediate internal plug-in
hard-linked with rdma-core (to preserve symbol versioning) loaded by the
PMD through dlopen(), so that a missing rdma-core does not cause unresolved
symbols, allowing applications to start normally.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-31 20:57:29 +01:00
Adrien Mazarguil
4eba244b78 net/mlx4: move rdma-core calls to separate file
This lays the groundwork for externalizing rdma-core as an optional
run-time dependency instead of a mandatory one.

No functional change.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-01-31 20:57:29 +01:00
Moti Haimovsky
fc1b5ec522 net/mlx4: fix removal detection of stopped port
In failsafe, device start can be called for ports/devices that
have been plugged out.
The mlx4 PMD detects device removal by listening to device RMV
events. When the mlx4 port is stopped, the PMD no longer
listens to these events and therefore stops detecting device
removals.
This patch fixes the issue by moving installation of the interrupt
handler to device configuration, and toggling only the Rx-queue
interrupts on start/stop.

Fixes: a6e8b01c3c ("net/mlx4: compact interrupt functions")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-01-30 10:20:35 +01:00
Moti Haimovsky
643958cf91 net/mlx4: fix broadcast Rx
This patch fixes the issue of mlx4 not receiving broadcast packets
when configured to work in promiscuous or allmulticast modes.

Fixes: eacaac7bae ("net/mlx4: restore promisc and allmulti support")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-29 10:45:20 +01:00
Ophir Munk
a43fba2c1e net/mlx4: fix single port configuration
The mask of present mlx4 ports is calculated as follows:
conf.ports.present |= (UINT64_C(1) << device_attr.phys_port_cnt) - 1;

That is, an all-ones sequence (due to the subtraction of 1).
When retrieving the number of ports, 1 must be added back so that the value
is a power of 2 whose logarithm is the number of ports, as follows:
uint32_t ports = rte_log2_u32(conf->ports.present + 1);
If 1 was not added, in the case of one port, the number of ports would
be falsely calculated as 0.
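
The arithmetic can be checked with a small standalone sketch. The helpers
below are hypothetical; demo_log2() is a local replacement used here instead
of rte_log2_u32() and is only exercised with power of 2 inputs or the
single-port mask.

```c
#include <assert.h>
#include <stdint.h>

/* Build a bitmask with one bit set per physical port (n < 64). */
static uint64_t
demo_ports_mask(unsigned int n)
{
	assert(n < 64);
	return (UINT64_C(1) << n) - 1;
}

/* Floor of log2(v) for v > 0; local helper, not the DPDK function. */
static unsigned int
demo_log2(uint64_t v)
{
	unsigned int r = 0;

	assert(v != 0);
	while ((v >>= 1) != 0)
		r++;
	return r;
}

/* Recover the port count from an all-ones mask by adding 1 first. */
static unsigned int
demo_ports_count(uint64_t mask)
{
	return demo_log2(mask + 1);
}
```

For a single port the mask is 1: taking the logarithm of the mask itself
yields 0, while taking it of mask + 1 yields the expected 1.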

Fixes: 8264279967 ("net/mlx4: check max number of ports dynamically")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-29 10:44:44 +01:00
Ferruh Yigit
ffc905f3b8 ethdev: separate driver APIs
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.

There is no change in header content, and since ethdev.h is included by
ethdev_driver.h, nothing changes from the driver point of view; this is only
a logical grouping of APIs. From the application point of view, driver
specific APIs can no longer be accessed, and they should not be.

More PMD specific data structures still remain in ethdev.h because
inline functions in the header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-22 01:26:49 +01:00
Matan Azrad
cdf4ec6eaa net/mlx4: support a device removal check operation
Add support to get removal status of mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
2018-01-21 21:09:41 +01:00
Thomas Monjalon
cebe3d7b3d ethdev: remove useless parameter in callback process
The pointer to the user parameter of the callback registration is
automatically passed to the callback function.
There is no point in allowing a caller to change this user parameter.
That's why this parameter is always set to NULL by PMDs and set only
in ethdev layer before calling the callback function.

The history is that the user parameter was initially used
by the callback implementation to pass some information
between the application and the driver:
	c1ceaf3ad0 ("ethdev: add an argument to internal callback function")
Then a new parameter has been added to leave the user parameter
to its standard usage of context given at registration:
	d6af1a13d7 ("ethdev: add return values to callback process API")

The NULL parameter in the internal callback processing function
is now removed. It makes clear that the callback parameter is user
managed and opaque from a DPDK point of view.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-16 18:47:49 +01:00
Shahaf Shuler
597d2ce5b4 net/mlx4: convert to new Rx offloads API
Ethdev Rx offloads API has changed since:

commit ce17eddefc ("ethdev: introduce Rx queue offloads API")

This commit supports the new Rx offloads API.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-01-16 18:47:49 +01:00
Shahaf Shuler
842860050d net/mlx4: convert to new Tx offloads API
Ethdev Tx offloads API has changed since:

commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")

This commit supports the new Tx offloads API.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-01-16 18:47:49 +01:00
Moti Haimovsky
2ebf5f7e92 net/mlx4: verify Tx max sges
Max number of Tx scatter-gather entries is a property of the device
and is queried at init. This value has not changed in a while and
most probably will not change in the future. Therefore, and
in order to enhance Tx performance, the Tx max-sge value is hardcoded
in mlx4 PRM code.
This patch adds a verification that the above assumption still holds
and that the hardcoded value is still supported by the mlx4 hardware.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
ec82dddad0 net/mlx4: remove Tx completion elements counter
This counter tracked the descriptor elements waiting to be
completed and was used to know whether the completion function should be
called.

This completion check can be done using other element management
variables, which avoids maintaining this counter.

Remove this counter and base the completion check on the other element
management variables.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
50163aec51 net/mlx4: align Tx descriptors number
Using a power of 2 number of descriptors makes ring management easier
and allows using a mask operation instead of wraparound conditions.

Adjust Tx descriptor number to be power of 2 and change calculation to
use mask accordingly.
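
A minimal sketch of the technique, with hypothetical helper names. Rounding
up to the next power of 2 is an assumption made for this example; the commit
message does not state in which direction the PMD adjusts the value.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v must be in 1..2^31). */
static uint32_t
demo_align_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v != 0 && v <= (UINT32_C(1) << 31));
	while (p < v)
		p <<= 1;
	return p;
}

/* Map a free-running counter onto a ring slot with a mask; valid only
 * when ring_sz is a power of two. */
static uint32_t
demo_ring_slot(uint32_t counter, uint32_t ring_sz)
{
	assert(ring_sz != 0 && (ring_sz & (ring_sz - 1)) == 0);
	return counter & (ring_sz - 1);
}
```

The mask replaces an "if (index >= size) index = 0" style check, and it keeps
working when the 32-bit counter itself overflows since 2^32 is a multiple of
any power of 2 ring size.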

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
533871524a net/mlx4: mitigate Tx send entry size calculations
The previous code took a send queue entry size for stamping from the
send queue entry pointed to by the completion queue entry; these 2 reads
were done per packet in the completion stage.

The number of packets in a completion burst is a fixed size stored in the
Tx queue, so we can infer that each valid completion entry actually frees
the next fixed number of packets.

The descriptors ring holds the send queue entry, so we can infer the size
of all the completion burst packet entries by a simple calculation and
avoid per-packet calculations.

Adjust completion functions to free full completion bursts of packets
at once and avoid per-packet work queue entry reads and
calculations.

Save only the start of completion burst or Tx burst send queue entry
pointers in the appropriate descriptor element.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
78e81a9844 net/mlx4: merge Tx queue rings management
The Tx queue send ring was managed by Tx block head, tail, count and mask
variables, which were used to track the remaining send queue
space and the next positions of empty or completed work queue entries.

This method suffered from recalculating actual addresses per packet,
unnecessary Tx block based calculations and an expensive dual
management of Tx rings.

Move send queue ring calculation to be based on actual addresses while
managing it by descriptors ring indexes.

Add new work queue entry pointer to the descriptor element to hold the
appropriate entry in the send queue.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
673800facc net/mlx4: optimize Tx multi-segment case
An mlx4 Tx block can handle up to 4 data segments, or a control segment plus
up to 3 data segments. The first data segment in each Tx block other than
the first must check for Tx queue wraparound and must use an IO memory
barrier before writing the byte count.

The previous multi-segment code used "for" loop to iterate over all
packet segments and separated first Tx block data case by "if"
statements.

Using switch case and unconditional branches instead of a "for" loop
optimizes this case and avoids unnecessary checks for each data
segment; this hints the compiler to create an optimized jump table.

Optimize this case by using switch case and unconditional branches.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
818f1e7c23 net/mlx4: remove restamping from Tx error path
At error time, the first 4 bytes of each WQE Tx block have still not been
written, so there is no need to stamp them because they are already stamped.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
6e1a01b220 net/mlx4: remove unnecessary Tx wraparound checks
There is no need to check Tx queue wraparound for segments which are
not at the beginning of a Tx block. Especially relevant in a single
segment case.

Remove unnecessary aforementioned checks from Tx path.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
e3ecea72a8 net/mlx4: fix Tx packet drop application report
When an invalid lkey is sent to HW, HW sends an error notification in
the completion function.

The previous code would not crash but did not report anything to the
application in case of completion error, so the application could not know
that a packet was actually dropped in case of an invalid lkey.

Restore the lkey validation in the Tx path.

Fixes: 2eee458746 ("net/mlx4: remove error flows from Tx fast path")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
c2b3dba84a net/mlx4: revert workaround for broken Verbs
This workaround was needed to properly handle device removal with old
Mellanox OFED releases that are not supported by this PMD anymore.

Starting from rdma-core v16 this removal issue shouldn't happen when
setting MLX4_DEVICE_FATAL_CLEANUP environment variable to 1.

Set the aforementioned variable to 1.

Reverts: 5f4677c6ad ("net/mlx4: workaround verbs error after plug-out")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
55e8991e31 net/mlx4: restore inner VXLAN RSS support
Inner VXLAN RSS was supported and performed by default prior to the entire
mlx4 refactoring that occurred in DPDK 17.11, however so far the new Verbs
RSS API did not provide means to enable it. This will be addressed in
Linux 4.15 and in RDMA core.

Thanks to RSS capabilities, the PMD can now probe for its support and
enable it again by default.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
024e87bef4 net/mlx4: restore UDP RSS by probing capabilities
Until now, UDP RSS support could not be relied on due to a problem in the
Linux kernel implementation and mlx4 RSS capabilities were not reported at
all, hence the PMD had to make assumptions.

Since both issues will be addressed simultaneously in Linux 4.15 (related
patches already upstream) and likely backported afterward, UDP RSS support
can be enabled by probing RSS capabilities.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
27563725b1 net/mlx4: use function to get default RSS fields
Supported RSS hash fields are listed in function mlx4_conv_rss_hf() and
duplicated in mlx4_flow_prepare(); the latter are used when RSS is
requested without specifying any parameters.

This commit standardizes on mlx4_conv_rss_hf().

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
c7869af57e net/mlx4: fix documentation in private structure
A couple of structure fields are not Doxygen-friendly.

Fixes: 5db1d36408 ("net/mlx4: restore Tx checksum offloads")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Adrien Mazarguil
3058dd9bca net/mlx4: fix unnecessary include
Fixes: a2ce2121c0 ("net/mlx4: separate Tx configuration functions")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 18:47:49 +01:00
Raslan Darawsheh
8937c9f1de net/mlx4: store RSS hash result in mbufs
Add the RSS hash result from the CQE to the mbuf.
Also set PKT_RX_RSS_HASH in ol_flags.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 18:47:49 +01:00
Matan Azrad
46d7b08b91 net/mlx4: fix missing stamp during Tx completion
After processing completed packets, the owner bit of each TXBB comprised
in its WQEs must be invalidated. The loop stops short of processing the
last WQE.

Fixes: c3c977bbec ("net/mlx4: add Tx bypassing Verbs")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-11 15:54:16 +01:00
Shahaf Shuler
8ec50cd624 net/mlx4: fix rxq interrupt memory corruption
The intr_vec allocation size was wrong, causing memory corruption.

Fixes: 0a2ae70319 ("net/mlx4: fix Rx interrupts management")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-10 09:04:20 +00:00
Moti Haimovsky
78214fb882 net/mlx4: fix Rx packet type offloads
This patch improves the Rx packet type offload report when the device is
a virtual function device.
On these devices we observed that the L2 tunnel flag is also set for
non-tunneled packets, which leads to a complete misinterpretation of the
packet type being received.
This issue occurs because the driver does not set tunnel_mode to 0x7 for
virtual devices, so the value of the L2 tunnel flag is meaningless and
should be ignored.

Fixes: aee4a03fee ("net/mlx4: enhance Rx packet type offloads")

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-10 02:29:56 +00:00
Moti Haimovsky
aee4a03fee net/mlx4: enhance Rx packet type offloads
This patch enhances the Rx packet type offload to also report the L4
protocol information in the hw ptype filled by the PMD for each received
packet.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-07 12:47:13 +01:00
Adrien Mazarguil
0d03353077 net/mlx4: share memory region resources
Memory regions assigned to hardware and used during Tx/Rx are mapped to
mbuf pools. Each Rx queue creates its own MR based on the mempool
provided during queue setup, while each Tx queue looks up and registers
MRs for all existing mbuf pools instead.

Since most applications use few large mbuf pools (usually only a single
one per NUMA node) common to all Tx/Rx queues, the above approach wastes
hardware resources due to redundant MRs. This negatively affects
performance, particularly with large numbers of queues.

This patch therefore makes the entire MR registration common to all
queues using a reference count. A spinlock is added to protect against
asynchronous registration that may occur from the Tx side where new
mempools are discovered based on mbuf data.
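
The shared registration scheme described above could be sketched as
follows. This is a minimal illustration under stated assumptions, not the
PMD's implementation: the names mr_entry, mr_get and mr_put are
hypothetical, a C11 atomic_flag stands in for the driver's spinlock, and
the actual hardware registration is left out.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MR_MAX 8

/* Hypothetical cache entry: one registration per distinct pool. */
struct mr_entry {
	const void *pool;    /* key: the pool this registration covers */
	unsigned int refcnt; /* number of users; 0 means the slot is free */
};

static struct mr_entry mr_cache[MR_MAX];
static atomic_flag mr_lock = ATOMIC_FLAG_INIT;

static void
mr_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&mr_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void
mr_lock_release(void)
{
	atomic_flag_clear_explicit(&mr_lock, memory_order_release);
}

/*
 * Return the shared entry for a pool, creating it on first use.
 * Returns NULL when the cache is full.
 */
static struct mr_entry *
mr_get(const void *pool)
{
	struct mr_entry *free_slot = NULL;
	struct mr_entry *found = NULL;
	size_t i;

	mr_lock_acquire();
	for (i = 0; i < MR_MAX; ++i) {
		if (mr_cache[i].refcnt != 0 && mr_cache[i].pool == pool) {
			found = &mr_cache[i];
			break;
		}
		if (mr_cache[i].refcnt == 0 && free_slot == NULL)
			free_slot = &mr_cache[i];
	}
	if (found == NULL && free_slot != NULL) {
		/* Real code would register the memory with hardware here. */
		free_slot->pool = pool;
		found = free_slot;
	}
	if (found != NULL)
		++found->refcnt;
	mr_lock_release();
	return found;
}

/* Drop one reference; at zero the slot becomes reusable. */
static void
mr_put(struct mr_entry *mr)
{
	mr_lock_acquire();
	if (mr->refcnt > 0)
		--mr->refcnt;
	mr_lock_release();
}
```

Two queues calling mr_get() with the same pool receive the same entry
with a reference count of 2, so only one registration exists per pool
regardless of the number of queues.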

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 21:30:41 +01:00
Adrien Mazarguil
b3d197b435 net/mlx4: fix function prototypes
This is done for consistency with the rest of the code.

Fixes: 078b8b452e ("net/mlx4: add RSS flow rule action support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:37:10 +01:00
Matan Azrad
89ce4b02c6 net/mlx4: mitigate Tx path memory barriers
Replace most of the memory barriers with IO memory barriers since they
all target DRAM; this improves code efficiency on systems which enforce
store order between different addresses.

Only the doorbell register store should be protected by a memory barrier
since it targets the PCI memory domain.

Limit the IO memory barrier preceding the byte count store to systems
with a cache line size smaller than 64B (TXBB size).

This patch improves Tx performance by 0.2 MPPS for single-segment 64B
packets in a test with 1 queue and 1 core.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:09 +01:00
Matan Azrad
b68d92b45c net/mlx4: fix HW memory optimizations careless
Make all Rx/Tx HW negotiation memories volatile to be sure no compiler
optimization removes either load or store commands.

Fixes: c3c977bbec ("net/mlx4: add Tx bypassing Verbs")
Fixes: 9f57340a80 ("net/mlx4: restore Rx offloads")
Fixes: 6681b84503 ("net/mlx4: add Rx bypassing Verbs")
Fixes: 62e96ffb93 ("net/mlx4: fix no Rx interrupts")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:09 +01:00
Matan Azrad
dae76a678c net/mlx4: separate Tx segment cases
Optimize the single-segment case by processing it in a separate block,
which avoids checks, calculations and barriers relevant only to the
multi-segment case.

Call a dedicated function to handle the multi-segment case.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:09 +01:00
Matan Azrad
4d8e284df2 net/mlx4: remove duplicate handling in Tx burst
Remove the variable which counts the packets for completion, as it adds
no information beyond the packets counter.

Remove the "no space in elements ring" check, which is already covered
by the regular Tx flow.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:08 +01:00
Matan Azrad
afe67d2c99 net/mlx4: merge Tx path functions
Merge the tx_burst and mlx4_post_send functions to avoid checking the
remaining WQ space twice.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:08 +01:00
Matan Azrad
05be4516c9 net/mlx4: fix ring wraparound compiler hint
Remove the unlikely hint from the WQ wraparound check because wraparound
is an expected case.

Fixes: c3c977bbec ("net/mlx4: add Tx bypassing Verbs")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:08 +01:00
Ophir Munk
326d2cdf7b net/mlx4: associate MR to MP in a short function
Associate memory region to mempool (on data path) in a short function.
Handle the less common case of adding a new memory region to mempool
in a separate function.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:07 +01:00
Matan Azrad
2eee458746 net/mlx4: remove error flows from Tx fast path
Move unnecessary error flows to DEBUG mode.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-03 20:22:07 +01:00
Adrien Mazarguil
9e0207bf1a net/mlx4: fix missing include
Fixes: 76df01ff62 ("net/mlx4: separate debugging macros")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-02 20:05:43 +01:00
Adrien Mazarguil
e21bdfaa2c net/mlx4: fix queue index check on flow rules
Users are not prevented from creating flow rules targeting nonexistent
queues, which silently makes such rules drop-like.

While it can be thought of as a feature, reporting an error instead is
actually far more useful in order to catch common mistakes.

Fixes: 078b8b452e ("net/mlx4: add RSS flow rule action support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-01 22:17:06 +01:00
Adrien Mazarguil
a9b3568e73 net/mlx4: fix Rx after updating number of queues
When not in isolated mode, internal flow rules are automatically
maintained by the PMD to receive traffic according to global device
settings (MAC, VLAN, promiscuous mode and so on).

Since RSS support was added to the mix, it must also check whether Rx
queue configuration has changed when refreshing flow rules to prevent
the following from happening:

- With a smaller number of Rx queues, traffic is implicitly dropped
  since the existing RSS context cannot be re-applied.
- With a larger number of Rx queues, traffic remains balanced within the
  original (smaller) set of queues.

One workaround before this commit was to temporarily enter/leave
isolated mode to make it regenerate internal flow rules.

Fixes: 7d8675956f ("net/mlx4: add RSS support outside flow API")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-11-01 22:17:06 +01:00
Moti Haimovsky
62e96ffb93 net/mlx4: fix no Rx interrupts
This commit addresses the issue of Rx interrupts support with
the new Rx datapath introduced in DPDK version 17.11.
In order to generate an Rx interrupt an event queue is armed with the
consumer index of the Rx completion queue. Since version 17.11 this
index is handled by the PMD so it is now the responsibility of the
PMD to write this value when enabling Rx interrupts.

Fixes: 6681b84503 ("net/mlx4: add Rx bypassing Verbs")

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-27 01:05:55 +02:00
Moti Haimovsky
096134582c net/mlx4: introducing consumer index mask
This commit defines MLX4_CQ_DB_CI_MASK which is used when updating
the consumer index of the completion queue instead of the hardcoded
0xffffff used until now.
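
As a rough sketch of what such a mask does, the snippet below keeps only
the low 24 bits of a consumer index. The 0xffffff value comes from the
text above; the macro and function names are local stand-ins chosen for
this example, and any further processing the PMD applies before writing
the doorbell is not shown.

```c
#include <stdint.h>

/* Value taken from the commit message; the name is a local stand-in. */
#define CQ_DB_CI_MASK 0xffffffu

/* Hypothetical helper: keep only the low 24 bits of the consumer index. */
static uint32_t
cq_ci_masked(uint32_t cons_index)
{
	return cons_index & CQ_DB_CI_MASK;
}
```

A named constant documents the width of the field in one place instead
of repeating the literal at every update site.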

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-27 01:05:55 +02:00
Gaetan Rivet
c752998b5e pci: introduce library and driver
The PCI lib defines the types and methods needed to use PCI elements.

The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.

Move the relevant code out of the EAL to its expected place.

Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-26 23:17:31 +02:00
Gaetan Rivet
00a3d8104a ethdev: remove detachable device flag
This flag is not necessary at the ether layer anymore.
Buses are able to advertise their hotplug support. The ether layer can
rely upon this capability instead of a special flag.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-26 02:33:01 +02:00
Matan Azrad
a76bec521a net/mlx4: fix targetless internal rule creation
The faulty code allowed an internal rule to be created without any
target queue when rule creation occurred before queue creation.

For example, when the user calls rte_eth_dev_default_mac_addr_set after
probe and before dev_configure, mlx4 fails because the RSS queue number
is 0.

The fix prevents internal rules from being created before queues exist,
relying on their later creation before traffic starts.

Fixes: 7d8675956f ("net/mlx4: add RSS support outside flow API")
Fixes: bdcad2f484 ("net/mlx4: refactor internal flow rules")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-26 02:33:01 +02:00
Adrien Mazarguil
cad92582d2 net/mlx4: fix restriction on TCP/UDP flow rules
The code as currently written requires TCP/UDP source and destination
ports to be always specified.

No such restriction is enforced by hardware; all TCP and UDP traffic
can be matched by providing an empty mask for these fields.

Fixes: 680d5280c2 ("net/mlx4: refactor flow item validation code")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-26 02:33:01 +02:00
Adrien Mazarguil
5697a41421 net/mlx4: relax Rx queue configuration order
Various hardware limitations apply to RSS indirection tables, one of
them being they must be an exact 1:1 mapping of the configured Rx queue
indices.

While this restriction is enforced when creating RSS flow rules, it is
not the case when Rx queues themselves are created; underlying WQ
numbers are assigned in turn, not according to queue index.

Applications such as l3fwd-power that create Rx queues from highest to
lowest index (or any other non-sequential order) thus fail to get a
working RSS context.

This commit postpones WQ initialization to dev_start(), once all Rx
queues are configured in order to address this issue.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-26 02:33:01 +02:00
Adrien Mazarguil
0ef007c939 net/mlx4: fix indirection table error rollback
In case of error occurring while setting up indirection table and
related RSS context resources, intermediate objects are not cleaned up.

Moreover although unlikely, an error other than EINVAL (e.g. ENOMEM)
may be returned.

A description of mlx4_rss_attach()'s return value is also missing.

Fixes: 078b8b452e ("net/mlx4: add RSS flow rule action support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-26 02:33:01 +02:00
Adrien Mazarguil
14f2d6688c net/mlx4: fix useless flow rules synchronization
According to the original commit, Rx queues cannot be created nor
destroyed while the device is started. Synchronizing flow rules during
such events is unnecessary as it occurs later when starting the device.

Fixes: 7977082649 ("net/mlx4: drop live queue reconfiguration support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-26 02:33:00 +02:00
Adrien Mazarguil
ed4724c80d net/mlx4: use dedicated list iterator
Dumb unconditional iteration on flow rules should be performed using the
dedicated macro.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-26 02:33:00 +02:00
Olivier Matz
cbc12b0a96 mk: do not generate LDLIBS from directory dependencies
The list of libraries in LDLIBS was generated from the DEPDIRS-xyz
variable. This is valid when the subdirectory name matches the library
name, but it's not always the case, especially for PMDs.

This patch removes this feature and explicitly adds the proper
libraries in LDLIBS.

Some DEPDIRS-xyz variables become useless, remove them.

Reported-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
2017-10-24 02:14:57 +02:00
Adrien Mazarguil
cf2fdf7263 net/mlx4: fix missing initializers for old GCC
This patch works around compilation issues so far only seen on RHEL 7.2
using GCC 4.8.5:

 [...]/mlx4_rxq.c: In function `mlx4_rx_queue_setup':
 [...]/mlx4_rxq.c:473:3: error: missing initializer for field `ipackets' of
     `struct mlx4_rxq_stats' [-Werror=missing-field-initializers]

 [...]/mlx4_txq.c: In function `mlx4_tx_queue_setup':
 [...]/mlx4_txq.c:265:3: error: missing initializer for field `opackets' of
     `struct mlx4_txq_stats' [-Werror=missing-field-initializers]

Fixes: 7977082649 ("net/mlx4: drop live queue reconfiguration support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 12:29:14 +02:00
Moti Haimovsky
43d77c2295 net/mlx4: add loopback Tx from VF
This patch adds loopback functionality used when the chip is a VF in order
to enable packet transmission between VFs and PF.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Moti Haimovsky
9f57340a80 net/mlx4: restore Rx offloads
This patch adds hardware offloading support for IPV4, UDP and TCP checksum
verification, including inner/outer checksums on supported tunnel types.

It also restores packet type recognition support.

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Moti Haimovsky
5db1d36408 net/mlx4: restore Tx checksum offloads
This patch adds hardware offloading support for IPv4, UDP and TCP checksum
calculation, including inner/outer checksums on supported tunnel types.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Moti Haimovsky
6681b84503 net/mlx4: add Rx bypassing Verbs
This patch adds support for accessing the hardware directly when
handling Rx packets eliminating the need to use Verbs in the Rx data
path.

Rx scatter support: calculate the number of scatters on the fly
according to the maximum expected packet size.

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Moti Haimovsky
c3c977bbec net/mlx4: add Tx bypassing Verbs
Modify PMD to send single-buffer packets directly to the device
bypassing the Verbs Tx post and poll routines.

Tx gather support: add support for transmitting packets spanning
over multiple buffers.

Take into consideration the amount of entries a packet occupies
in the TxQ when setting the report-completion flag of the chip.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
7d8675956f net/mlx4: add RSS support outside flow API
Bring back support for automatic RSS with the default flow rules when not
in isolated mode. Balancing is done according to unspecified default
settings, as was the case before this entire rework.

Since the number of queues part of RSS contexts is limited to power of two
values, the number of configured queues is rounded down to its previous
power of two; extra queues are silently discarded. This does not prevent
dedicated flow rules from targeting them.
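
The rounding rule above can be illustrated with a small helper. This is a
sketch only; round_down_pow2 is a hypothetical name, not a function taken
from the PMD.

```c
#include <stdint.h>

/* Hypothetical helper: largest power of two <= n, or 0 when n is 0. */
static uint32_t
round_down_pow2(uint32_t n)
{
	uint32_t p = 1;

	if (n == 0)
		return 0;
	/* Comparing against n / 2 keeps p from overflowing. */
	while (p <= n / 2)
		p *= 2;
	return p;
}
```

With 6 configured queues, for instance, this yields 4, so under the rule
above queues 4 and 5 would be left out of the default RSS context.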

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
31c629c6f7 net/mlx4: disable UDP support in RSS flow rules
When part of the RSS hash calculation, UDP packets are discarded (not
received on any queue) likely due to an issue with the kernel
implementation.

Temporarily disable UDP RSS support until this issue is resolved.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
078b8b452e net/mlx4: add RSS flow rule action support
This patch dissociates single-queue indirection tables and hash QP objects
from Rx queue structures to relinquish their control to users through the
RSS flow rule action, while simultaneously allowing multiple queues to be
associated with RSS contexts.

Flow rules share identical RSS contexts (hashed fields, hash key, target
queues) to save on memory and other resources. The trade-off is some added
complexity due to reference counters management on RSS contexts.

The QUEUE action is re-implemented on top of an automatically-generated
single-queue RSS context.

The following hardware limitations apply to RSS contexts:

- The number of queues in a group must be a power of two.
- Queue indices must be consecutive, for instance the [0 1 2 3] set is
  allowed, however [3 2 1 0], [0 2 1 3] and [0 0 1 1 2 3 3 3] are not.
- The first queue of a group must be aligned to a multiple of the context
  size, e.g. if queues [0 1 2 3 4] are defined globally, allowed group
  combinations are [0 1] and [2 3]; groups [1 2] and [3 4] are not
  supported.
- RSS hash key, while configurable per context, must be exactly 40 bytes
  long.
- The only supported hash algorithm is Toeplitz.
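
The first three limitations can be expressed as a short check. The
function below is a hypothetical sketch written for this list, not code
from the PMD; it covers queue layout only and ignores the key length and
hash algorithm constraints.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: returns true when the queue list satisfies the
 * power-of-two, consecutive-indices and alignment constraints.
 */
static bool
rss_group_is_valid(const uint16_t *queues, unsigned int n)
{
	unsigned int i;

	/* Group size must be a nonzero power of two. */
	if (n == 0 || (n & (n - 1)) != 0)
		return false;
	/* First queue must be aligned to a multiple of the group size. */
	if (queues[0] % n != 0)
		return false;
	/* Indices must be consecutive and ascending. */
	for (i = 1; i < n; ++i)
		if (queues[i] != queues[0] + i)
			return false;
	return true;
}
```

Applied to the examples above, [0 1 2 3], [0 1] and [2 3] pass, while
[3 2 1 0], [0 2 1 3], [0 0 1 1 2 3 3 3], [1 2] and [3 4] are rejected.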

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
74d76e7ba5 net/mlx4: remove unnecessary check
Device operation callbacks are not supposed to handle a missing private
data structure.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
fc4e66649a net/mlx4: convert Rx path to work queues
Work queues (WQs) are lower-level than standard queue pairs (QPs). They are
dedicated to one traffic direction and have to be used in conjunction with
indirection tables and special "hash" QPs to get the same level of
functionality.

These extra objects however are the building blocks for RSS support brought
by subsequent commits, as a single "hash" QP can manage several WQs through
an indirection table according to a hash algorithm and other parameters.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
c64c58adc0 net/mlx4: allocate queues and mbuf rings together
Since live Tx and Rx queues cannot be reused anymore without being
destroyed first, mbuf ring sizes are fixed and known from the start.

This allows a single allocation for queue data structures and mbuf ring
together, saving space and bringing them closer in memory.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
7977082649 net/mlx4: drop live queue reconfiguration support
DPDK ensures that setup functions are never called on configured queues,
or only if they have previously been released.

PMDs therefore do not need to deal with the unexpected reconfiguration of
live queues which may fail with no easy way to recover. Dropping support
for this scenario greatly simplifies the code as allocation and setup steps
and checks can be merged.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
c8912bec52 net/mlx4: fix invalid errno value sign
Tx queue elements allocation function sets rte_errno properly and returns
its negative version. Reassigning this value to rte_errno is thus both
invalid and unnecessary.
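
The pattern at issue can be sketched with the standard errno variable
standing in for rte_errno. Function names here are hypothetical and only
illustrate the convention: the callee sets the error variable to a
positive code and returns its negative value, so the caller must not
write the return value back into the error variable.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator stub: sets errno and returns its negative value. */
static int
elts_alloc(size_t n)
{
	if (n == 0) {
		errno = EINVAL;
		return -errno;
	}
	return 0;
}

/* Caller propagates the return value and leaves errno untouched. */
static int
queue_setup(size_t n)
{
	int ret = elts_alloc(n);

	if (ret) {
		/* Wrong: "errno = ret;" would store a negative value. */
		return ret;
	}
	return 0;
}
```

After a failed queue_setup(0), the return value is -EINVAL while errno
still holds the positive EINVAL set by the callee.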

Fixes: 9d14b27308 ("net/mlx4: standardize on negative errno values")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
67e6cce675 net/mlx4: update Rx/Tx callbacks consistently
Although their "removed" version acts as a safety against unexpected bursts
while queues are being modified by the control path, these callbacks are
set per device instead of per queue. It makes sense to update them during
start/stop/close cycles instead of queue setup.

As a side effect, this commit addresses a bug left over from a prior
commit: bringing the link down causes the "removed" Tx callback to be used,
however the normal callback is not restored when bringing it back up,
preventing the application from sending traffic at all.

Updating callbacks for a link change is not necessary as bringing the
netdevice down is normally enough to prevent traffic from flowing in.

Fixes: 3f75a02719 ("net/mlx4: drop scatter/gather support")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
eacaac7bae net/mlx4: restore promisc and allmulti support
Implement promiscuous and all multicast through internal flow rules
automatically generated according to the configured mode.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
3e49f870c5 net/mlx4: add flow support for multicast traffic
Give users the ability to create flow rules that match all multicast
traffic. Like promiscuous flow rules, they come with restrictions such as
not allowing additional matching criteria.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
30695adbdd net/mlx4: add VLAN filter configuration support
This commit brings back VLAN filter configuration support without any
artificial limitation on the number of simultaneous VLANs that can be
configured (previously 127).

Also, since it no longer relies on fixed per-queue arrays for potential
Verbs flow handle storage, this version wastes a lot less memory
(previously 128 * 127 * pointer size, i.e. about 127 KiB per Rx queue with
8-byte pointers, only one of which actually had any use for this room: the
RSS parent queue).

The number of internal flow rules generated still depends on the number of
configured MAC addresses times that of configured VLAN filters though.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
1437784b03 net/mlx4: add MAC addresses configuration support
This commit brings back support for configuring up to 128 MAC addresses on
a port through internal flow rules automatically generated on demand.

Unlike its previous incarnation, the necessary extra flow rule for
broadcast traffic does not consume an entry from the MAC array anymore.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
680d5280c2 net/mlx4: refactor flow item validation code
Since flow rule validation and creation have been refactored into a common
two-pass function, having separate callback functions to validate and
convert individual items seems redundant.

The purpose of these item validation functions is to reject partial masks
as those are not supported by hardware, before handing over the item to a
separate function that performs basic sanity checks.

The current approach and related code have the following issues:

- Lack of flow handle context in validation code requires kludges such as
  the special treatment reserved to spec-less Ethernet pattern items.
- Lack of useful error reporting; users need as much help as possible to
  understand what they did wrong, particularly when they hit hardware
  limitations that aren't mentioned by the flow API. Preventing them from
  going berserk after getting a generic "item not supported" message for no
  apparent reason is mandatory.
- Generic checks should be performed by the caller, not by item-specific
  validation functions.
- Mask checks either missing or too lax in some cases (Ethernet, VLAN).

This commit addresses all the above by combining validation and conversion
callbacks as "merge" callbacks that take an additional error context
parameter. Also:

- Support for source MAC address matching is removed as it has no effect.
- Providing an empty mask no longer bypasses the Ethernet specification
  check that causes a rule to become promiscuous-like.
- VLAN VIDs must be matched exactly, as matching all VLAN traffic while
  excluding non-VLAN traffic is not supported.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
fee75e14f3 net/mlx4: simplify trigger code for flow rules
Since flow rules synchronization function mlx4_flow_sync() takes into
account the state of the device (whether it is started), trigger functions
mlx4_flow_start() and mlx4_flow_stop() are redundant. Standardize on
mlx4_flow_sync().

Use this opportunity to enhance this function with better error reporting
as the inability to start the device due to a problem with a flow rule
otherwise results in a nondescript error code.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
fc49cbb768 net/mlx4: generalize flow rule priority support
Since both internal and user-defined flow rules are handled by a common
implementation, flow rule priority overlaps are easier to detect. No need
to restrict their use to isolated mode only.

With this patch, only the lowest priority level remains inaccessible to
users outside isolated mode.

Also, the PMD no longer automatically assigns a fixed priority level to
user-defined flow rules, which means collisions between overlapping rules
matching a different number of protocol layers at a given priority level
won't be avoided anymore (e.g. "eth" vs. "eth / ipv4 / udp").

As a reminder, the outcome of overlapping rules for a given priority level
was, and still is, undefined territory according to API documentation.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
bdcad2f484 net/mlx4: refactor internal flow rules
When not in isolated mode, a flow rule is automatically configured by the
PMD to receive traffic addressed to the MAC address of the device. This
somewhat duplicates flow API functionality.

Remove legacy support for internal flow rules to instead handle them
through the flow API implementation.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
328bf8e5a3 net/mlx4: relax check on missing flow rule target
Creating a flow rule targeting a missing (unconfigured) queue is not
possible. However, nothing really prevents the destruction of a queue with
existing flow rules still pointing at it, except currently the port must be
in a stopped state in order to avoid crashing.

The problem is that the port cannot be restarted if flow rules cannot be
re-applied due to missing queues. This flexibility will be needed by
subsequent work on this PMD.

Given that a PMD cannot decide on its own to remove problematic
user-defined flow rules in order to restart a port, work around this
restriction by making the affected ones drop-like, i.e. rules targeting
nonexistent queues drop packets instead.
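
The fallback described above can be pictured with the following sketch. It
is not the mlx4 code: struct sketch_rxq, struct sketch_target and
sketch_resolve_target() are hypothetical names used only to illustrate how
a rule whose queue is missing could be turned into a drop-like rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Rx queue object; its contents do not matter here. */
struct sketch_rxq {
	int unused;
};

/* Hypothetical resolved target of a flow rule. */
struct sketch_target {
	bool drop;                    /* true when packets must be dropped */
	const struct sketch_rxq *rxq; /* destination queue when not dropping */
};

/*
 * Resolve the target of a rule pointing at queue_id. A queue that is out
 * of range or not configured (NULL) makes the rule drop-like instead of
 * causing a failure.
 */
static struct sketch_target
sketch_resolve_target(const struct sketch_rxq *const rxqs[], uint16_t rxqs_n,
		      uint16_t queue_id)
{
	struct sketch_target target = { .drop = true, .rxq = NULL };

	if (queue_id < rxqs_n && rxqs[queue_id]) {
		target.drop = false;
		target.rxq = rxqs[queue_id];
	}
	return target;
}
```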

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
d3a7e09234 net/mlx4: allocate drop flow resources on demand
Verbs QP and CQ resources for drop flow rules do not need to be permanently
allocated, only when at least one rule needs them.

Besides, struct rte_flow_drop is outside the mlx4 PMD name space and should
never have been defined there. struct rte_flow is currently the only
exception to this rule.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
100fe44b81 net/mlx4: merge flow creation and validation code
These functions share a significant amount of code and require extra
internal objects to parse and build flow rule handles.

All this can be simplified by relying directly on the internal rte_flow
structure definition, whose QP pointer (destination Verbs queue) is
replaced by a DPDK queue ID and other properties, making it more versatile
without increasing its size (at least on 64-bit platforms).

This commit also gets rid of a few unnecessary debugging messages.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
f1c9ac9f23 net/mlx4: add iovec-like allocation wrappers
These wrappers implement the ability to allocate room for several disparate
objects as a single contiguous allocation while complying with their
respective alignment constraints.

This is usually more efficient than allocating and freeing them
individually if they are not expected to be reallocated with rte_realloc().

A typical use case is when several objects that cannot be dissociated must
be allocated together, as shown in the following example:

 struct b {
     ...
     struct d *d;
 };

 struct a {
     ...
     struct b *b;
     struct c *c;
 };

 struct mlx4_malloc_vec vec[] = {
     { .size = sizeof(struct a), .addr = &ptr_a, },
     { .size = sizeof(struct b), .addr = &ptr_b, },
     { .size = sizeof(struct c), .addr = &ptr_c, },
     { .size = sizeof(struct d), .addr = &ptr_d, },
 };

 if (!mlx4_mallocv(NULL, vec, RTE_DIM(vec)))
     goto error;

 struct a *a = ptr_a;

 a->b = ptr_b;
 a->c = ptr_c;
 a->b->d = ptr_d;
 ...
 rte_free(a);
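
One way such a wrapper could work internally is sketched below. This is
not the mlx4 implementation: struct sketch_vec, its align field and
sketch_mallocv() are hypothetical names, plain malloc() stands in for the
DPDK allocator, and requested alignments are assumed to be powers of two
that do not exceed the alignment of max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical descriptor, loosely modeled on the example above. */
struct sketch_vec {
	size_t align; /* required alignment, power of two (0: default) */
	size_t size;  /* object size in bytes */
	void **addr;  /* where to store the address of the object */
};

/* Round v up to the next multiple of align (a power of two). */
static size_t
sketch_align_up(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Allocate all objects described by vec[] as a single block.
 * Returns the total size on success, 0 on failure. The first object sits
 * at the start of the block, so its address is the one to free().
 */
static size_t
sketch_mallocv(const struct sketch_vec *vec, unsigned int cnt)
{
	const size_t def = _Alignof(max_align_t);
	size_t total = 0;
	unsigned char *base;
	unsigned int i;

	/* First pass: compute the total size, padding included. */
	for (i = 0; i < cnt; ++i) {
		size_t align = vec[i].align ? vec[i].align : def;

		total = sketch_align_up(total, align) + vec[i].size;
	}
	if (!total)
		return 0;
	base = malloc(total);
	if (!base)
		return 0;
	/* Second pass: hand out addresses at the same offsets. */
	total = 0;
	for (i = 0; i < cnt; ++i) {
		size_t align = vec[i].align ? vec[i].align : def;

		total = sketch_align_up(total, align);
		*vec[i].addr = base + total;
		total += vec[i].size;
	}
	return total;
}
```

Computing offsets in a first pass keeps the allocation itself to a single
call, which is where the efficiency gain over separate allocations would
come from.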

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
28daff0785 net/mlx4: compact flow rule error reporting
Relying on rte_errno is not necessary where the return value of
rte_flow_error_set() can be used directly.

A related minor change is switching from RTE_FLOW_ERROR_TYPE_HANDLE to
RTE_FLOW_ERROR_TYPE_UNSPECIFIED when no rte_flow handle is involved in the
error, specifically when none is allocated yet.

This commit does not cause any functional change.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
267d07dacd net/mlx4: tidy up flow rule handling code
- Remove unnecessary casts.
- Replace consecutive if/else blocks with switch statements.
- Use proper big endian definitions for mask values.
- Make end marker checks of item and action lists less verbose since they
  are explicitly documented as being equal to 0.
- Remove unnecessary NULL check on action configuration structure.

This commit does not cause any functional change.
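
As an illustration of the shorter end marker checks, here is a minimal
sketch. The enum and structure below are stand-ins rather than the flow
API definitions; the only property they share with them is an END marker
equal to 0.

```c
#include <assert.h>

/* Stand-in item types; only the zero value of the END marker matters. */
enum sketch_item_type {
	SKETCH_ITEM_TYPE_END = 0,
	SKETCH_ITEM_TYPE_ETH,
	SKETCH_ITEM_TYPE_IPV4,
};

/* Stand-in pattern item. */
struct sketch_item {
	enum sketch_item_type type;
};

/* Count the items of a pattern, stopping at the END marker. */
static unsigned int
sketch_pattern_len(const struct sketch_item *item)
{
	unsigned int n = 0;

	/* Same as "item->type != SKETCH_ITEM_TYPE_END", only shorter. */
	for (; item->type; ++item)
		++n;
	return n;
}
```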

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
809d8a6cff net/mlx4: clarify flow objects naming scheme
In several instances, "items" refers either to a flow pattern or a single
item, and "actions" either to the entire list of actions or only one of
them.

The fact that the target of a rule (struct mlx4_flow_action) is also named
"action", and that item-processing objects (struct mlx4_flow_items) are
named "cur_item" ("token" in one instance), contributes to the confusion.

Use this opportunity to clarify related comments and remove the unused
valid_actions[] global, whose sole purpose is to be referred to by
item-processing objects as "actions".

This commit does not cause any functional change.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:48 +01:00
Adrien Mazarguil
a5171594fc net/mlx4: expose support for flow rule priorities
This PMD supports up to 4096 flow rule priority levels (0 to 4095).

Applications were not allowed to use them until now due to overlaps with
the default flows (e.g. MAC address, promiscuous mode).

This is not an issue in isolated mode when such flows do not exist.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Adrien Mazarguil
ed0cc677ad net/mlx4: enhance header files comments
Add missing comments and fix those not Doxygen-friendly.

Since the private structure definition is modified, use this opportunity to
add one remaining missing include required by one of its fields
(sys/queue.h for LIST_HEAD()).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Adrien Mazarguil
97561113a8 net/mlx4: remove Rx QP initializer function
There is no benefit in having this as a separate function.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Adrien Mazarguil
a9cfedf39d net/mlx4: replace bit-field type
Make clear it's 32-bit wide.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Adrien Mazarguil
5b4efcc6b9 ethdev: expose flow API error helper
rte_flow_error_set() is a convenient helper to initialize error objects.

Since there is no fundamental reason to prevent applications from using it,
expose it through the public interface after modifying its return value
from positive to negative. This is done for consistency with the rest of
the public interface.

Documentation is updated accordingly.
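
The benefit of a negative return value can be sketched as follows. The
types and functions below are stand-ins written for this illustration,
not the DPDK definitions: sketch_error_set() only mimics the behavior
described above, sketch_errno plays the role of a per-thread error
variable, and the priority limit is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in error object; NOT the DPDK structure definition. */
struct sketch_flow_error {
	int type;            /* hypothetical cause category */
	const void *cause;   /* object responsible for the error, if any */
	const char *message; /* human-readable description */
};

/* Stand-in for a per-thread error variable. */
static int sketch_errno;

/*
 * Fill in the error object (when provided), record the positive error
 * code and return its negative value so callers can return it directly.
 */
static int
sketch_error_set(struct sketch_flow_error *error, int code, int type,
		 const void *cause, const char *message)
{
	if (error) {
		error->type = type;
		error->cause = cause;
		error->message = message;
	}
	sketch_errno = code;
	return -code;
}

/* Caller: one statement both describes the error and returns it. */
static int
sketch_validate_priority(unsigned int priority,
			 struct sketch_flow_error *error)
{
	if (priority > 4095) /* arbitrary limit for this sketch */
		return sketch_error_set(error, ENOTSUP, 1, NULL,
					"priority out of range");
	return 0;
}
```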

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Matan Azrad
d5b0924ba6 ethdev: add return value to stats get dev op
The stats_get dev op has no return value, so a PMD cannot report an error
when retrieving statistics fails.

Since PCI devices can be removed and there is a delay between the
physical removal and the RMV interrupt, the user may get invalid stats
without any indication.

This patch changes the return type of stats_get from void to int.

The stats_get dev ops of all net PMDs are adjusted accordingly.
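
A callback taking advantage of the new return value might look like the
sketch below. The structures, the removed flag and the choice of -ENODEV
are assumptions made for this illustration; they are not taken from
ethdev or from any PMD.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in statistics structure. */
struct sketch_stats {
	uint64_t ipackets;
	uint64_t opackets;
};

/* Stand-in device state. */
struct sketch_dev {
	bool removed; /* set once the device is known to be gone */
	uint64_t rx;
	uint64_t tx;
};

/*
 * Retrieve statistics. Returns 0 on success or a negative errno value,
 * in which case the contents of stats are left untouched.
 */
static int
sketch_stats_get(const struct sketch_dev *dev, struct sketch_stats *stats)
{
	if (dev->removed)
		return -ENODEV; /* hypothetical choice of error code */
	memset(stats, 0, sizeof(*stats));
	stats->ipackets = dev->rx;
	stats->opackets = dev->tx;
	return 0;
}
```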

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:52:49 +01:00
Adrien Mazarguil
d84fb5eba1 net/mlx4: merge interrupt collector function
Since the interrupt handler is the only function relying on it, merging
them simplifies the code as there is no need for an API to return
collected events.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
258937a3fd net/mlx4: fix rescheduled link status check
Link status is sometimes inconsistent during an LSC event. When it occurs,
the PMD refrains from immediately notifying the application; instead, an
alarm is scheduled to check link status later and notify the application
once it has settled.

The problem is that subsequent link status checks are only performed if
additional LSC events occur in the meantime, which is not always the case.

Worse, since support for removal events was added, rescheduled link status
checks may consume them as well without notifying the application. With the
right timing, a link loss occurring just before a device removal event may
hide it from the application.

Fixes: 6dd7b7056d ("net/mlx4: support device removal event")
Fixes: 2d449f7c52 ("net/mlx4: fix assertion failure on link update")
Cc: stable@dpdk.org

Reported-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
ebada48456 net/mlx4: fix unhandled event debug message
When LSC or RMV events are received by the PMD but are not requested by the
application, a misleading debugging message implying the PMD does not
support them is shown.

Fixes: 6dd7b7056d ("net/mlx4: support device removal event")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
be65fdcbfb net/mlx4: rely on ethdev for Tx/Rx queue arrays
Allocation and management of Tx/Rx queue arrays is done by wrappers at the
ethdev level. The resulting information is copied to the private structure
while configuring the device, where it is managed separately by the PMD.

This is redundant and consumes space in the private structure.

Relying more on ethdev also means there is no need to protect the PMD
against burst function calls while closing the device anymore.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
dcafc2a64a net/mlx4: remove isolated mode constraint
Considering the remaining functionality, the only difference between
isolated and non-isolated mode is that a default MAC flow rule is present
with the latter.

The restriction on enabling isolated mode before creating any queues can
therefore be lifted.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
37491c7f8f net/mlx4: clean up includes and comments
Add missing includes and sort them, then update/remove comments around them
for consistency.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
655588afc8 net/mlx4: separate memory management functions
No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
686a64eb58 net/mlx4: rename private functions in flow API
While internal static functions do not cause link time conflicts, this
differentiates them from their mlx5 PMD counterparts while debugging.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
af745cd60b net/mlx4: group flow API handlers in common file
Only the common filter control operation callback needs to be exposed.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
5b4c63bdae net/mlx4: separate Rx configuration functions
Private functions are now prefixed with "mlx4_" to prevent them from
conflicting with their mlx5 PMD counterparts at link time.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
a2ce2121c0 net/mlx4: separate Tx configuration functions
Private functions are now prefixed with "mlx4_" to prevent them from
conflicting with their mlx5 PMD counterparts at link time.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
61cbdd4194 net/mlx4: separate device control functions
Private functions are now prefixed with "mlx4_" to prevent them from
conflicting with their mlx5 PMD counterparts at link time.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
7f45cb82da net/mlx4: separate Rx/Tx functions
This commit groups all data plane functions (Rx/Tx) into a separate file
and adjusts header files accordingly.

Private functions are now prefixed with "mlx4_" to prevent them from
conflicting with their mlx5 PMD counterparts at link time.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
3d555728c9 net/mlx4: separate Rx/Tx definitions
Except for a minor documentation update on internal structure definitions
to make them more Doxygen-friendly, there is no impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
b62579d4ce net/mlx4: separate interrupt handling
Private functions are now prefixed with "mlx4_" to prevent them from
conflicting with their mlx5 PMD counterparts at link time.

No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
a6e8b01c3c net/mlx4: compact interrupt functions
Link status (LSC) and removal (RMV) interrupts share a common handler and
are toggled simultaneously from common install/uninstall functions.

Four additional wrapper functions (two for each interrupt type) are
currently necessary because the PMD maintains an internal configuration
state for interrupts (priv->intr_conf).

This complexity can be avoided entirely since the PMD does not disable
interrupts configuration parameters in case of error anymore.

With this commit, only two functions are necessary to toggle interrupts
(including Rx) during start/stop cycles.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
635238bb65 net/mlx4: clean up interrupt functions prototypes
The naming scheme for these functions is overly verbose and not accurate
enough, with too many "handler" functions that are difficult to
differentiate (e.g. mlx4_dev_link_status_handler(),
mlx4_dev_interrupt_handler() and priv_dev_status_handler()).

This commit renames them and removes the unnecessary dev argument which can
be retrieved through the private structure where needed. Documentation is
updated accordingly.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
35d02c543a net/mlx4: refactor interrupt FD settings
File descriptors used for interrupts processing must be made non-blocking.

Doing so as soon as they are opened instead of waiting until they are
needed is more efficient as it avoids performing redundant system calls and
running through their associated error-handling code later on.
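
For reference, making a file descriptor non-blocking with the standard
POSIX interface can be sketched as follows. The helper name is
hypothetical and the negative errno return convention is an assumption of
this sketch; only fcntl() and its flags are standard.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Switch a file descriptor to non-blocking mode.
 * Returns 0 on success, a negative errno value otherwise.
 */
static int
sketch_fd_set_non_blocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags == -1)
		return -errno;
	/* Preserve existing status flags, only add O_NONBLOCK. */
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -errno;
	return 0;
}
```

Calling such a helper once, right after the descriptor is opened, leaves a
single place where these two system calls can fail.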

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
7446202428 net/mlx4: rename alarm field
Make clear this field is related to interrupt handling.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
63c2f23c85 net/mlx4: use a single interrupt handle
The reason one interrupt handle is currently used for RMV/LSC events and
another one for Rx traffic is that these come from distinct file
descriptors.

This can be simplified however as Rx interrupt file descriptors are stored
elsewhere and are registered separately.

Modifying the interrupt handle type to RTE_INTR_HANDLE_UNKNOWN has never
been necessary as disabling interrupts is actually done by unregistering
the associated callback (RMV/LSC) or emptying the EFD array (Rx). Instead,
make clear that the base handle file descriptor is invalid by setting it to
-1 when disabled.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
76df01ff62 net/mlx4: separate debugging macros
The new definitions also rely on the existing DPDK logging subsystem
instead of using fprintf() directly.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
4e7367d831 net/mlx4: use standard macro to get array size
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
a0a745b7f7 net/mlx4: remove mbuf macro definitions
These were originally used for compatibility between DPDK releases when
this PMD was built out of tree.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
3cf06cea7c net/mlx4: remove unnecessary wrapper functions
Wrapper functions whose main purpose was to take a lock on the private
structure are no longer needed since this lock does not exist anymore.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
e4dff4d83d net/mlx4: remove control path locks
Concurrent use of various control path functions (e.g. configuring a queue
and destroying it simultaneously) may lead to undefined behavior.

PMDs are not supposed to protect themselves from misbehaving applications,
and mlx4 is one of the few with internal locks on most control path
operations. This adds unnecessary complexity.

Leave this role to wrapper functions in ethdev.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
c76c88e1e0 net/mlx4: clean up coding style inconsistencies
This addresses badly formatted comments and needless empty lines before
refactoring functions into different files.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
9d14b27308 net/mlx4: standardize on negative errno values
Due to its reliance on system calls, the mlx4 PMD uses positive errno
values internally and negative ones at the ethdev API border. Although most
internal functions are documented, this mixed design is unusual and prone
to mistakes (e.g. flow API implementation uses negative values
exclusively).

Standardize on negative errno values and rely on rte_errno instead of
errno in all functions.
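
The resulting convention can be sketched as follows. The function and the
sketch_errno variable are hypothetical stand-ins (the latter playing the
role of rte_errno); the point is only that the positive code obtained
from a system call is stored once and its negative value is returned.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Stand-in for a per-thread error variable. */
static int sketch_errno;

/*
 * Open a file read-only.
 * Returns a file descriptor (>= 0) on success, otherwise a negative
 * errno value, with the positive code also stored in sketch_errno.
 */
static int
sketch_open_ro(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd == -1) {
		sketch_errno = errno; /* positive code is kept here */
		return -sketch_errno; /* negative value is returned */
	}
	return fd;
}
```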

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
d8fe7cdcfc net/mlx4: simplify link update function
Returning a different value when the current link status differs from the
previous one was probably useful at some point in the past but is now
meaningless; this value is ignored both internally (mlx4 PMD) and
externally (ethdev wrapper).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
5c5435192c net/mlx4: simplify Rx buffer handling
Thanks to the fact the PMD temporarily uses a slower interface for Rx,
removing the WR ID hack to instead store mbuf pointers directly makes the
code simpler at no extra cost.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
49040046f4 net/mlx4: revert fast verbs interface for Rx
This reverts commit acac55f164.

"Fast Verbs" is a nonstandard experimental interface that must be reverted
for compatibility reasons. Its replacement is slower but temporary;
performance will be restored by a subsequent commit through an enhanced
data path implementation. This one focuses on maintaining basic
functionality in the meantime.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
fd494d4fee net/mlx4: revert fast verbs interface for Tx
This reverts commit 9980f81dc2.

"Fast Verbs" is a nonstandard experimental interface that must be reverted
for compatibility reasons. Its replacement is slower but temporary;
performance will be restored by a subsequent commit through an enhanced
data path implementation. This one focuses on maintaining basic
functionality in the meantime.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
ca71a1a4ad net/mlx4: revert multicast echo prevention
This reverts commit 8b3ffe95e7.

Multicast loopback prevention is not part of the standard Verbs interface.
Remove it temporarily.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
72ba7fadbf net/mlx4: revert resource domain support
This reverts commit 3e49c148b7.

Resource domains are not part of the standard Verbs interface. The
performance improvement they bring will be restored later through a
different data path implementation.

This commit makes the PMD not rely on the non-standard QP allocation
interface.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
fa8551ffa8 net/mlx4: use standard QP attributes
The Verbs API used to set QP attributes is deprecated. Revert to the
standard API since it actually supports the remaining ones.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
056eaf2e6d net/mlx4: drop inline receive support
The Verbs API used to implement inline receive is deprecated.
Support will be added back after refactoring the PMD.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
3f75a02719 net/mlx4: drop scatter/gather support
The Verbs API used to implement Tx and Rx burst functions is deprecated.
Drop scatter/gather support to ease refactoring while maintaining basic
single-segment Rx/Tx functionality in the meantime.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:48 +02:00
Adrien Mazarguil
4e897255c8 net/mlx4: drop packet type recognition support
The Verbs API used to implement packet type recognition is deprecated.
Support will be added back after refactoring the PMD.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
22aadcc696 net/mlx4: drop checksum offloads support
The Verbs API used to implement Tx and Rx checksum offloads is deprecated.
Support for these will be added back after refactoring the PMD.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
4bd2aa1198 net/mlx4: drop RSS support
The Verbs RSS API used in this PMD is now obsolete. It is superseded by an
enhanced API with fewer constraints already used in the mlx5 PMD.

Drop RSS support in preparation for a major refactoring. The ability to
configure several Rx queues is retained; these can be targeted directly by
creating specific flow rules.

There is no need for "ignored" Rx queues anymore since their number is no
longer limited to powers of two.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
41f8001be6 net/mlx4: revert RSS parent queue refactoring
This reverts commit ff00a0dc56.

Support for several RSS parent queues was necessary to implement the RSS
flow rule action, dropped in a prior commit.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
56d1cd47ff net/mlx4: revert flow API RSS support
This reverts commit d7769c7c08.

Existing RSS features rely on experimental Verbs provided by Mellanox OFED.

In order to replace this dependency with standard distribution packages,
RSS support must be temporarily removed to be re-implemented using a
different API.

Removing support for the RSS flow rule action is the first step toward this
goal.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
8ca27e24e2 net/mlx4: drop MAC flows affecting all Rx queues
Configuring several Rx queues enables RSS, which causes an additional
special parent queue to be created to manage them.

MAC flows are associated with the queue supposed to receive packets; either
the parent one in case of RSS or the single orphan otherwise.

For historical reasons the current implementation supports another scenario
with multiple orphans, in which case MAC flows are configured on all of
them. This is harmless but useless since it cannot happen.

Removing this feature allows dissociating the remaining MAC flow from Rx
queues and storing it inside the private structure where it belongs.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
320dc09f63 net/mlx4: remove MAC address configuration support
Only the default port MAC address remains and is not configurable.
This is done in preparation for a major refactoring.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
3e641ae766 net/mlx4: remove VLAN filter support
This is done in preparation for a major refactoring.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
805170c49c net/mlx4: remove allmulti and promisc support
This is done in preparation for a major refactoring.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
586db08058 net/mlx4: remove Tx inline compilation option
This should be a run-time parameter.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
367b31cd5b net/mlx4: remove scatter mode compilation option
This option both sets the maximum number of segments for Rx/Tx packets and
whether scattered mode is supported at all. This commit removes the latter
as well as configuration file exposure since the most appropriate value
should be decided at run-time.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
31a76ab0df net/mlx4: remove soft counters compilation option
Software counters are mandatory since hardware counters are not
implemented.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
863f34f710 net/mlx4: remove useless code
Less code makes refactoring easier. No impact on functionality.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
44dbb413a0 net/mlx4: remove secondary process support
Current implementation is partial (Tx only), not convenient to use and
not of primary concern.

Remove this feature before refactoring the PMD.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
5d15f04365 net/mlx4: remove useless compilation checks
Verbs support for RSS, inline receive and extended device query calls has
not been optional for a while. Their absence is untested and is therefore
unsupported.

Remove the related compilation checks and assume Mellanox OFED is up to
date, as described in the documentation.

Use this opportunity to remove a few useless data path debugging messages
behind compilation checks on never defined macros.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
8264279967 net/mlx4: check max number of ports dynamically
Use maximum number reported by hardware capabilities as replacement for the
static check on MLX4_PMD_MAX_PHYS_PORTS.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
f2318196c7 net/mlx4: remove limitation on number of instances
The seemingly artificial limitation on the maximum number of instances for
this PMD is an historical leftover that predates its first public release.

It was used as a workaround to support multiple physical ports on a PCI
device exposing a single bus address when mlx4 was implemented directly as
an Ethernet device driver instead of a PCI driver spawning Ethernet
devices.

Getting rid of it simplifies device initialization.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Adrien Mazarguil
3b47f9ac25 net/mlx4: add consistency to copyright notices
Copyright lasts long enough not to require notices to be updated yearly.

The current approach of updating them occasionally while working on
unrelated tasks should be deprecated in favor of dedicated commits updating
all files at once when necessary.

Standardize on a single year per copyright owner.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-10-06 02:49:47 +02:00
Matan Azrad
5f4677c6ad net/mlx4: workaround verbs error after plug-out
The current mlx4 OFED version has a bug that makes ibv destroy functions
return an error once the device has been plugged out, even though the
resources were destroyed correctly.

As a result, the failsafe PMD aborts, in debug mode only, when it tries
to remove the device during the plug-out process.

This workaround adds an option that replaces all claim_zero assertions
with debugging messages. Note that this option also affects assertions
unrelated to ibv destroy functions.

The DPDK 18.02 release should work with Mellanox OFED-4.2, which will
include the Verbs fix for this bug; this patch can be removed at that
point.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-08-03 23:10:27 +02:00
Matan Azrad
8d0f80167d net/mlx4: fix probe failure report
The faulty code does not return an error when the probe function fails
to retrieve the device MAC address. As a result, the probe function may
return success even though the ETH dev is not allocated.

Hence the probe caller, for example the failsafe PMD, fails when it
tries to get the ETH dev after the device was plugged out while mlx4
was probing it.

The fix reports an error to the probe caller when priv_get_mac fails,
as well as in all other failure paths that were missing one.

This prevents the unexpected situation where the ETH device is missing
although the probe reported success.

Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Fixes: 001a520e41 ("net/mlx4: add port parameter")
Fixes: 7b06615392 ("mlx4: check if port is configured for ethernet")
Fixes: fec3608673 ("mlx4: query netdevice to get initial MAC address")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-31 19:58:41 +02:00
Matan Azrad
f3b10f1d24 net/mlx4: fix flow creation before start
The corrupted code causes segmentation fault when user creates
flow with drop action before device starting.

For example, failsafe PMD recreates all the flows before calling
dev_start in plug-in sequence and mlx4 allocated its flow drop
queue in dev_start.
Hence, when failsafe created flow with drop action after plug-in
event, mlx4 tried to dereference flow drop queue which was
uninitialized.

The fix adds a check that the drop QP is accessible and makes the
ibv_create_flow call conditional on the device being started.
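
The guard described above can be sketched roughly as follows. This is
not the mlx4 code: the structures, field names and return values are
hypothetical and only model "defer hardware programming until the
device is started, and never touch a missing drop queue".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the PMD private data and a flow. */
struct sketch_priv {
	bool started;  /* set once dev_start has run */
	void *drop_qp; /* drop queue, allocated in dev_start in this model */
};

struct sketch_flow {
	bool drop;  /* rule uses the drop action */
	bool in_hw; /* rule has been programmed into the device */
};

/*
 * Create a flow rule. Before the device is started the rule is only
 * recorded; after start, a drop rule requires a valid drop queue.
 * Returns 0 on success, -1 on failure.
 */
static int
sketch_flow_create(struct sketch_priv *priv, struct sketch_flow *flow)
{
	assert(priv != NULL && flow != NULL);
	if (!priv->started) {
		/* Nothing to dereference yet: keep the rule for later. */
		flow->in_hw = false;
		return 0;
	}
	if (flow->drop && priv->drop_qp == NULL)
		return -1; /* refuse instead of dereferencing NULL */
	/* The real driver would call ibv_create_flow() at this point. */
	flow->in_hw = true;
	return 0;
}
```

In this model a caller such as failsafe can recreate its rules before
start without crashing; the rules would be applied once the device
starts.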

Fixes: 642fe56a1b ("net/mlx4: use a single drop queue for all drop flows")
Fixes: 46d5736a70 ("net/mlx4: support basic flow items and actions")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-31 19:58:41 +02:00
Gaetan Rivet
ac075085c1 net/mlx4: advertise the detach capability
This PMD supports hotplug and can therefore be detached.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-31 14:08:25 +02:00
Vasily Philipov
d7769c7c08 net/mlx4: support flow API RSS action
This commit adds support for the flow API RSS action with the following
limitations:

 - Only supported when isolated mode is enabled.
 - The number of queues specified by the action (rte_flow_action_rss.num)
   must be a power of two.
 - Each queue index can be specified at most once in the configuration
   (rte_flow_action_rss.queue[]).
 - Because a queue can be associated with a single RSS context, it cannot
   be targeted by multiple RSS actions simultaneously.
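
The two list-level constraints above (power-of-two queue count, no
repeated queue index) can be checked along these lines. This is a
minimal sketch, not the PMD's validation code: the function name and
the 64-queue bound are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on queue indices, chosen for this sketch only. */
#define SKETCH_MAX_QUEUES 64

/*
 * Return true when a queue list satisfies both constraints: the count
 * is a power of two and no queue index appears more than once.
 */
static bool
sketch_rss_queues_valid(const uint16_t *queue, size_t num)
{
	uint64_t seen = 0; /* one bit per queue index already encountered */
	size_t i;

	assert(queue != NULL || num == 0);
	/* A power of two has exactly one bit set; zero is rejected. */
	if (num == 0 || (num & (num - 1)) != 0)
		return false;
	for (i = 0; i < num; ++i) {
		if (queue[i] >= SKETCH_MAX_QUEUES)
			return false; /* outside the range this sketch tracks */
		if (seen & (UINT64_C(1) << queue[i]))
			return false; /* duplicate index */
		seen |= UINT64_C(1) << queue[i];
	}
	return true;
}
```

For instance, the list {0, 1, 2, 3} passes, while {0, 1, 2} (three
queues) and {0, 1, 1, 3} (repeated index) are rejected.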

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-06 15:00:57 +02:00
Vasily Philipov
ff00a0dc56 net/mlx4: refactor RSS parent queue allocation
A special "parent" queue must be allocated in addition to a group of
standard Rx queues for RSS to work. This is done automatically outside of
isolated mode by the PMD when applications request several Rx queues.

Since each configured flow rule with the RSS action may target a different
set of queues, the PMD must have the ability to dynamically allocate
several parent queues, one per RSS group.

If isolated mode was requested, the default RSS parent queue is not
created.

Refactor RSS parent queue allocations (currently limited to a single
parent) in preparation for flow API RSS action support.

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-06 15:00:57 +02:00
Vasily Philipov
ae7954ddea net/mlx4: implement isolated mode from flow API
The user must request isolated mode before device configuration.

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-06 15:00:57 +02:00
Vasily Philipov
4be0621901 net/mlx4: fix mbuf poisoning in debug code
In debug mode, all mbuf ol_flags are temporarily enabled while sitting
in the Rx queue to detect otherwise silent data corruption. However,
some of them are special (indirect and control) and must be cleared
before returning mbufs to the pool to avoid crashing.

Fixes: 7fae69eeff ("mlx4: new poll mode driver")
CC: stable@dpdk.org

Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-07-06 15:00:57 +02:00
Adrien Mazarguil
3aec2d14d6 net/mlx: update C compliance standard
This commit addresses a compilation issue against Glibc >= 2.25, which
implements assert() through a nonstandard ({ }) construct. Such constructs
can normally not be used without __extension__ keyword when -pedantic is
enabled, as is the case when compiling mlx4 and mlx5 PMDs in debug mode.
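
As an illustration of the construct in question, the macro below uses a
({ }) statement expression and marks it with __extension__ so that
-pedantic does not complain. The macro and function names are
hypothetical, and the example assumes a compiler with GNU extensions
(GCC or Clang).

```c
#include <assert.h>

/*
 * Hypothetical macro for illustration only. The ({ ... }) statement
 * expression and __typeof__ are GNU extensions; the leading
 * __extension__ keyword silences pedantic diagnostics about them.
 */
#define SKETCH_MAX(a, b) \
	__extension__ ({ \
		__typeof__(a) a_ = (a); \
		__typeof__(b) b_ = (b); \
		a_ > b_ ? a_ : b_; \
	})

/* Return value, raised to floor when it is lower. */
static int
sketch_clamp_low(int value, int floor)
{
	int result = SKETCH_MAX(value, floor);

	assert(result >= floor);
	return result;
}
```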

While assert.h checks for the compiler ability to support GNU extensions,
Clang, unlike GCC, does not allow the above syntax when combining
-std=gnu99 with -pedantic.

Work around the missing keyword by moving these PMDs to a stricter compliance
standard without GNU extensions but properly checked by Glibc. Doing so is
supported on the DPDK side since includes have been cleaned up.

Even in C11, using types other than _Bool or signed/unsigned int for
bit-fields is an extension. Some GCC versions complain about that when
-pedantic checks are enabled.

The RTE_STD_C11 macro correctly prevented this issue with C99 but not with
C11 as it becomes a no-op. Forcing the extension keyword addresses it.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Tested-by: Yongseok Koh <yskoh@mellanox.com>
2017-07-06 15:00:57 +02:00
Adrien Mazarguil
2d449f7c52 net/mlx4: fix assertion failure on link update
The interrupt handler can sometimes be triggered for reasons other than a
link status event. An assertion failure happens when such events occur while
an asynchronous link status update is already scheduled.

Address this issue using the same approach as its mlx5 counterpart,
commit a9f2fbc42f ("net/mlx5: fix inconsistent link status").

Fixes: c4da6caa42 ("mlx4: handle link status interrupts")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-07-06 15:00:56 +02:00
Adrien Mazarguil
0a2ae70319 net/mlx4: fix Rx interrupts management
This commit addresses various issues that may lead to undefined behavior
when configuring Rx interrupts.

While failure to create a Rx queue completion channel in rxq_setup()
prevents that queue from being created, existing queues still have theirs.
Since the error handler disables dev_conf.intr_conf.rxq as well, subsequent
calls to rxq_setup() create Rx queues without interrupts. This leads to a
scenario where not all Rx queues support interrupts; missing checks on the
presence of completion channels may crash the application.

Considering that the PMD is not supposed to disable user-provided
configuration parameters (dev_conf.intr_conf.rxq), and that these can
change for subsequent rxq_setup() calls anyway, properly supporting a mixed
mode where not all Rx queues have interrupts enabled is a better approach.

To do so with a minimum set of changes, priv_intr_efd_enable() and
priv_create_intr_vec() are first refactored as a single
priv_rx_intr_vec_enable() function (same for their "disable" counterparts).
Since they had to be used together, there was no point in keeping them
separate.

Remaining changes:

- Always clean up before reconfiguring interrupts to avoid memory leaks.
- Always clean up when closing the device.
- Use malloc()/free() instead of their rte_*() counterparts since there is
  no need to store the vector in huge pages-backed memory.
- Allow more Rx queues than the size of the event file descriptor array as
  long as Rx interrupts are not requested on all of them.
- Properly clean up interrupt handle when disabling Rx interrupts (nb_efd
  and intr_vec reset to 0).
- Check completion channel presence while toggling Rx interrupts on a given
  queue.

Fixes: 9f05a4b818 ("net/mlx4: support user space Rx interrupt event")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
2017-07-06 15:00:56 +02:00
Adrien Mazarguil
3c560ec3ea net/mlx4: fix Rx interrupts with multiple ports
Several Ethernet device structures are allocated on top of a common PCI
device for mlx4 adapters with multiple ports. These inherit a common
interrupt handle from their parent PCI device, which prevents Rx interrupts
from working properly on all ports as their configuration is overwritten.

Use a local interrupt handle to address this issue.

Fixes: 9f05a4b818 ("net/mlx4: support user space Rx interrupt event")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
2017-07-06 15:00:56 +02:00
Adrien Mazarguil
1a3e40d5a4 net/mlx4: fix typos from prior commit
Fixes: 9f05a4b818 ("net/mlx4: support user space Rx interrupt event")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Moti Haimovsky <motih@mellanox.com>
2017-07-06 15:00:56 +02:00
Stephen Hemminger
463ced957c pci: increase domain storage to 32 bits
In some environments, the PCI domain can be larger than 16 bits.
For example, a PCI device passed through in Azure gets a synthetic domain
id which is internally generated based on GUID. The PCI standard does
not restrict the domain to 16 bits.

This change breaks ABI for APIs that expose the PCI address structure.

The printf format for PCI addresses remains unchanged, so that on most
systems (with only a 16-bit domain) the output is unchanged
and the domain is 4 characters wide.  For example: 0000:00:01.0
Only on systems with higher bits set will the domain take up more
space; example: 12000:00:01.0
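
The width behaviour described above follows from using a minimum
precision rather than a fixed field size. A small sketch, assuming a
format string of this shape (the macro and function names here are
hypothetical, not the DPDK definitions):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical format string: the domain is printed with at least four
 * hexadecimal digits and grows only when the value needs more.
 */
#define SKETCH_PCI_FMT "%.4" PRIx32 ":%.2" PRIx8 ":%.2" PRIx8 ".%" PRIx8

/* Format a PCI address into buf; returns the snprintf() result. */
static int
sketch_pci_format(char *buf, size_t len, uint32_t domain,
		  uint8_t bus, uint8_t devid, uint8_t function)
{
	assert(buf != NULL);
	return snprintf(buf, len, SKETCH_PCI_FMT,
			domain, bus, devid, function);
}
```

With domain 0 this yields 0000:00:01.0 (12 characters); with domain
0x12000 it yields 12000:00:01.0 (13 characters), matching the examples
in the message.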

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2017-07-06 01:28:02 +02:00
Bernard Iremonger
d6af1a13d7 ethdev: add return values to callback process API
Change the rte_eth_dev_callback_process function to return int,
and add a void *ret_param parameter.
The new parameter is used by ixgbe and i40e instead of abusing
the user data of the callback.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
2017-07-01 17:19:55 +02:00
Moti Haimovsky
9f05a4b818 net/mlx4: support user space Rx interrupt event
Implement Rx queue interrupt callbacks.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-06-12 10:41:29 +01:00
Ferruh Yigit
c0802544d9 drivers/net: add generic ethdev macro to get PCI device
Instead of having many PMDs define their own macro, define a generic one in
ethdev and use it in the PMDs.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2017-06-12 10:41:25 +01:00
Jerin Jacob
c6dfeecb15 eal: introduce macro for no inline
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-06-06 22:31:45 +02:00
Wei Dai
6d01e580ac ethdev: fix adding invalid MAC address
Some customers found that adding a MAC address to a VF can sometimes
fail, yet the address is still stored in dev->data->mac_addrs[]. This
can lead to errors in code that assumes any non-zero entry in
dev->data->mac_addrs[] is valid.
The following acknowledgements are from the maintainers of specific
NIC PMDs, each for the part they maintain.
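
The intended ordering (ask the driver first, store the address only on
success) can be sketched as below. Every name here is hypothetical; the
structure is a simplified model and not the ethdev API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN 6
#define SKETCH_MAX_MACS 4

/* Hypothetical, simplified device model. */
struct sketch_dev {
	uint8_t mac_addrs[SKETCH_MAX_MACS][SKETCH_MAC_LEN];
	/* Driver hook: returns 0 on success, a negative value on failure. */
	int (*mac_addr_add)(const uint8_t *addr, unsigned int index);
};

/* Stub driver hooks used to exercise both paths. */
static int
sketch_hw_ok(const uint8_t *addr, unsigned int index)
{
	(void)addr;
	(void)index;
	return 0;
}

static int
sketch_hw_fail(const uint8_t *addr, unsigned int index)
{
	(void)addr;
	(void)index;
	return -1;
}

/* Store addr at index only if the driver accepted it. */
static int
sketch_mac_addr_add(struct sketch_dev *dev, const uint8_t *addr,
		    unsigned int index)
{
	int ret;

	assert(dev != NULL && addr != NULL && dev->mac_addr_add != NULL);
	if (index >= SKETCH_MAX_MACS)
		return -1;
	ret = dev->mac_addr_add(addr, index);
	if (ret != 0)
		return ret; /* leave the table untouched on failure */
	memcpy(dev->mac_addrs[index], addr, SKETCH_MAC_LEN);
	return 0;
}
```

With this ordering a failed addition leaves the entry zeroed, so a
non-zero entry can again be treated as valid.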

This patch changes the ethdev internal API, so it should not be
backported to a stable/LTS release for now.

Fixes: af75078fec ("first public release")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2017-05-05 16:27:11 +02:00
Thomas Monjalon
3dcfe0390c pci: remove eal prefix
The PCI code will move to the bus drivers directory.
Rename functions from rte_eal_pci_ to rte_pci_
to prepare the move of the driver out of EAL.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2017-05-05 14:38:17 +02:00
Gaetan Rivet
6dd7b7056d net/mlx4: support device removal event
Extend the LSC event handling to support the device removal as well. The
Verbs library will send several related events, that can conflict
with the LSC event itself.

The event handling has thus been made capable of receiving and signaling
several event types at once.
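
One common way to signal several event types at once is to fold the
pending asynchronous events into a bit mask and act on each bit
afterwards. The sketch below assumes that approach; the event names and
the mapping to LSC/removal are hypothetical and do not reproduce the
Verbs or mlx4 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical asynchronous event kinds reported by the lower layer. */
enum sketch_async_event {
	SKETCH_ASYNC_PORT_ACTIVE,
	SKETCH_ASYNC_PORT_ERR,
	SKETCH_ASYNC_DEVICE_FATAL,
	SKETCH_ASYNC_OTHER,
};

/* Hypothetical flags for the events signalled to the application. */
#define SKETCH_EV_LSC (UINT32_C(1) << 0) /* link status change */
#define SKETCH_EV_RMV (UINT32_C(1) << 1) /* device removal */

/* Drain a burst of events and return every type that must be signalled. */
static uint32_t
sketch_collect_events(const enum sketch_async_event *events, size_t n)
{
	uint32_t mask = 0;
	size_t i;

	assert(events != NULL || n == 0);
	for (i = 0; i < n; ++i) {
		switch (events[i]) {
		case SKETCH_ASYNC_PORT_ACTIVE:
		case SKETCH_ASYNC_PORT_ERR:
			mask |= SKETCH_EV_LSC;
			break;
		case SKETCH_ASYNC_DEVICE_FATAL:
			mask |= SKETCH_EV_RMV;
			break;
		default:
			break; /* unrelated events are ignored */
		}
	}
	return mask;
}
```

A burst containing both a port error and a fatal device event then
produces a mask with both bits set, so neither notification is lost.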

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Elad Persiko <eladpe@mellanox.com>
2017-04-21 01:01:47 +02:00
Charles Myers
1193edaa82 net/mlx4: fix Rx after mbuf alloc failure
Fixes issue where mlx4 driver stops receiving packets when mbuf
allocation fails in mlx4_rx_burst().

This issue appears to occur because the code does not recycle the
existing mbuf into the sges array when mbuf allocation fails, as is done
in the code right above it, which handles (wc.status != IBV_WC_SUCCESS).

Copying the code from the above case fixes the issue.
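
The recycling idea can be modelled as follows. This is a simplified
sketch with hypothetical types and names, not the mlx4_rx_burst() code:
a slot must always hold a buffer, so when a replacement cannot be
allocated the old buffer stays in place and the packet is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive slot: must always own a buffer to be re-posted. */
struct sketch_slot {
	void *buf;
};

/* Allocator hook so that the failure path can be exercised. */
typedef void *(*sketch_alloc_fn)(void);

static char sketch_spare[64]; /* backing storage for the stub allocator */

static void *
sketch_alloc_ok(void)
{
	return sketch_spare;
}

static void *
sketch_alloc_fail(void)
{
	return NULL;
}

/*
 * Handle one completed slot. Returns the received buffer, or NULL when
 * the packet had to be dropped. Either way the slot still holds a
 * buffer afterwards and can be posted again.
 */
static void *
sketch_rx_one(struct sketch_slot *slot, sketch_alloc_fn alloc)
{
	void *fresh;
	void *received;

	assert(slot != NULL && slot->buf != NULL && alloc != NULL);
	fresh = alloc();
	if (fresh == NULL) {
		/* Recycle: keep the old buffer in the slot, drop the packet. */
		return NULL;
	}
	received = slot->buf;
	slot->buf = fresh;
	return received;
}
```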

Fixes: acac55f164 ("mlx4: use MOFED 3.0 fast verbs interface for Rx operations")
Cc: stable@dpdk.org

Signed-off-by: Charles Myers <charles.myers@spirent.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-04-19 15:37:37 +02:00
Jan Blunck
fdf91e0f2f drivers/net: do not use ethdev driver
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2017-04-18 19:05:46 +02:00
Qi Zhang
c23a1a3000 eal: clean up interrupt handle
The patch changes the prototype of the callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2017-04-06 21:15:55 +02:00
Ferruh Yigit
0c145b7eea drivers/net: remove unused DEPDIRS from makefiles
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-06 20:58:59 +02:00
Vasily Philipov
642fe56a1b net/mlx4: use a single drop queue for all drop flows
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-04-04 19:03:03 +02:00
Gaetan Rivet
9e09761b43 net/mlx4: fix returned values upon failed probing
Leave error messages in place, but return unambiguous values upon
probing errors.

Fixes: 66e1591687 ("mlx4: avoid init errors when kernel modules are not loaded")
Cc: stable@dpdk.org

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2017-04-04 18:59:51 +02:00