Commit Graph

278 Commits

Author SHA1 Message Date
Yaacov Hazan
c502d05197 net/mlx5: fix inconsistent return value in flow director
The return value in DPDK is negative errno on failure.
Since internal functions in mlx driver return positive
values need to negate this value when it returned to
dpdk layer.

Fixes: 76f5c99 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-09-30 12:27:18 +02:00
Sagi Grimberg
15a756b637 net/mlx5: fix possible NULL dereference in Rx path
The user is allowed to call ->rx_pkt_burst() even without free
mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
on the first iteration (where pkt is still NULL). This would cause us
to deref a NULL pkt (reset refcount and free).

Fix this by checking the pkt before freeing it.

Fixes: a1bdb71a32 ("net/mlx5: fix crash in Rx")

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-09-30 12:27:18 +02:00
Jan Viktorin
13a1317d3b pci: create device list and fallback on its members
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.

As of now no one is using this list, but can be used for moving over all
devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:34:03 +02:00
Jan Viktorin
2f3193cf0f pci: inherit common driver in PCI driver
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.

Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver has been removed.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:55 +02:00
David Marchand
6751f6deb7 ethdev: get rid of device type
Now that hotplug has been moved to eal, there is no reason to keep the
device type in this layer.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:39 +02:00
David Marchand
c830cb2954 drivers: use PCI registration macro
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.

Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
  steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.

Update documentation for replacing an example referring to
PMD_REGISTER_DRIVER.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2016-10-03 16:33:23 +02:00
Shreyansh Jain
af424af840 pci: replace devinit/devuninit with probe/remove
Probe and Remove are more appropriate names for PCI init and uninint
operations. This is a cosmetic change.

Only MLX* uses the PCI direct registration, bypassing PMD_* macro.
The callbacks for this too have been updated.

VDEV are left out. For them, init/uninit are more appropriate.

Suggested-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-10-03 16:33:01 +02:00
Pablo de Lara
2f45703c17 drivers: make driver names consistent
As discussed in the past release, driver names are modified
to be more consistent, and the future driver should follow
this new convention.

Driver names consist of:
"driver category"_"driver folder name"_"optional extra name".

For example:
- Crypto null driver       -> "crypto_null"
- Network IXGBE VF driver  -> "net_ixgbe_vf"

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2016-09-16 11:55:59 +02:00
Olga Shern
cd8ec06ac0 net/mlx5: work around gcc error
RHEL 7.1's GCC for POWER8 reports the following error in one rte_memcpy()
macro call by mlx5:

 error: array subscript is above array bounds [-Werror=array-bounds]

It appears to be a GCC bug which can be worked around by making parentheses
more explicit.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-15 23:01:15 +02:00
Nélio Laranjeiro
a1bdb71a32 net/mlx5: fix crash in Rx
Fixed issue could occur when Mbuf starvation happens in a middle of
reception of a segmented packet. In such a situation, the PMD has to
release all segments of that packet.  The end condition was wrong
causing it to free an Mbuf still handled by the NIC.

Fixes: 9964b965ad ("net/mlx5: re-add Rx scatter support")

Reported-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-08 22:51:10 +02:00
Maxime Leroy
0ac6484625 net/mlx5: fix packet type and offload flags on Rx
In mlx5 rx function, the packet_type and ol_flags mbuf fields are not
properly initialized when no rx offload feature is enabled (checksum, l2
tun checksum, vlan_strip, crc). Thus, these fields can have a value
different of 0 depending on their value when the mbuf was freed.

This can result in an incorrect application behavior if invalid
ol_flags/ptype are set, or memory corruptions if IND_ATTACHED_MBUF is
set in ol_flags.

Fixes: 081f7eae24 ("mlx5: process offload flags only when requested")

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-08 22:51:03 +02:00
Olivier Matz
3d04e05055 net/mlx5: fix API comment of link set function
Fixes: 62072098b5 ("mlx5: support setting link up or down")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-05 11:55:47 +02:00
Olivier Matz
33242e3e46 net/mlx: fix setting interface flags
According to the documentation, the function
priv_set_flags(priv, keep, flags) should not modify the flags
in "keep" mask.

So 'flags' argument should be masked with '~keep' before ORing
it with the previous flags value.

This avoids messing up the kernel interface flags when calling
priv_set_flags(priv, ~IFF_UP, ~IFF_UP) in priv_set_link():

  $ ip link
  26: eth0: BROADCAST,MULTICAST,NOARP,ALLMULTI,PROMISC,DEBUG,\
      DYNAMIC,AUTOMEDIA,PORTSEL,NOTRAILERS

Fixes: 7fae69eeff ("mlx4: new poll mode driver")
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")

Reported-by: Fengtian Guo <fengtian.guo@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-05 11:51:10 +02:00
Pablo de Lara
bae696ebd4 drivers: remove static driver names
Since now the PMD_REGISTER_DRIVER macro sets the driver names,
there is no need to have the rte_driver structure setting it
statically, as it will get overridden.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2016-07-10 14:51:09 +02:00
Thomas Monjalon
54266acc64 drivers: fix build with new register macro
Compilation fails because of some typos.

Fixes: cb6696d220 ("drivers: update registration macro usage")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-07-08 10:27:08 +02:00
Neil Horman
cb6696d220 drivers: update registration macro usage
Modify the PMD_REGISTER_DRIVER macro, adding a name argument to it.  The
addition of a name argument creates a token that can be used for subsequent
macros in the creation of unique symbol names to export additional bits of
information for use by the pmdinfogen tool.  For example:

PMD_REGISTER_DRIVER(ena_driver, ena);

registers the ena_driver struct as it always did, and creates a symbol
const char this_pmd_name0[] __attribute__((used)) = "ena";

which pmdinfogen can search for and extract.  The subsequent macro

DRIVER_REGISTER_PCI_TABLE(ena, ena_pci_id_map);

creates a symbol const char ena_pci_tbl_export[] __attribute__((used)) =
"ena_pci_id_map";

Which allows pmdinfogen to find the pci table of this driver

Using this pattern, we can export arbitrary bits of information.

pmdinfo uses this information to extract hardware support from an object
file and create a json string to make hardware support info discoverable
later.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Remy Horton <remy.horton@intel.com>
2016-07-06 23:21:40 +02:00
Ferruh Yigit
36351ea34b net/mlx: fix build with icc
Compilation errors:
mlx4:
  drivers/net/mlx4/mlx4.c(5409): error #188:
  enumerated type mixed with another type
          priv->intr_handle.type = 0;
                                 ^
mlx5:
  drivers/net/mlx5/mlx5_rxq.c(282): error #188:
  enumerated type mixed with another type
        enum hash_rxq_type type = 0;
                                  ^
and more same type of error.
Fix these by assigning enum values rather than integer values to the enum
variables

Fixes: c4da6caa42 ("mlx4: handle link status interrupts")
Fixes: 198a3c339a ("mlx5: handle link status interrupts")
Fixes: 0d2186743d ("mlx5: manage all special flow types at once")
Fixes: 612ad38209 ("mlx5: fix hash Rx queue type in RSS mode")
Fixes: 083c2dd317 ("mlx5: refactor special flows handling")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-28 11:49:09 +02:00
Nélio Laranjeiro
9964b965ad net/mlx5: re-add Rx scatter support
This commit brings back Rx scatter and related support by the MTU update
function. The maximum number of segments per packet is not a fixed value
anymore (previously MLX5_PMD_SGE_WR_N, set to 4 by default) as it caused
performance issues when fewer segments were actually needed as well as
limitations on the maximum packet size that could be received with the
default mbuf size (supporting at most 8576 bytes).

These limitations are now lifted as the number of SGEs is derived from the
MTU (which implies MRU) at queue initialization and during MTU update.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
cd9ea11454 net/mlx5: make Rx queue reinitialization safer
The primary purpose of rxq_rehash() function is to stop and restart
reception on a queue after re-posting buffers. This may fail if the array
that temporarily stores existing buffers for reuse cannot be allocated.

Update rxq_rehash() to work on the target queue directly (not through a
template copy) and avoid this allocation.

rxq_alloc_elts() is modified accordingly to take buffers from an existing
queue directly and update their refcount.

Unlike rxq_rehash(), rxq_setup() must work on a temporary structure but
should not allocate new mbufs from the pool while reinitializing an
existing queue. This is achieved by using the refcount-aware
rxq_alloc_elts() before overwriting queue data.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
c6cb3a8fb5 net/mlx5: remove redundant Rx queue initialization
Toggling RX checksum offloads is already done at initialization time. This
code does not belong in rxq_rehash().

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
a5bf6af931 net/mlx5: re-add Tx gather support
Compared to its previous incarnation, the software limit on the number of
mbuf segments is no more (previously MLX5_PMD_SGE_WR_N, set to 4 by
default) hence no need for linearization code and related buffers that
permanently consumed a non negligible amount of memory to handle oversized
mbufs.

The resulting code is both lighter and faster.

With the addition of this code, older GCC versions (such
as 4.8.5) may complain about 'wqe' variable being uninitialized, so
initialize it preemptively, even though it is not necessary to do so.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
c3d62cc953 net/mlx5: check remaining space while processing Tx burst
The space necessary to store segmented packets cannot be known in advance
and must be verified for each of them.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
c2a81fea90 net/mlx5: add debug information about Tx queues capabilities
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
230189d9ff net/mlx5: support multi-packet send
This feature enables the TX burst function to emit up to 5 packets using
only two work queue entries (WQEs) on devices that support it. Saves PCI
bandwidth and improves performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
2016-06-27 16:17:52 +02:00
Yaacov Hazan
2a66cf3789 net/mlx5: support inline send
Implement send inline feature which copies packet data directly into
work queue entries (WQEs) for improved latency. The maximum packet
size and the minimum number of Tx queues to qualify for inline send
are user-configurable.

This feature is effective when HW causes a performance bottleneck.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
c305090bba net/mlx5: replace countdown with threshold for Tx completions
Replacing the variable countdown (which depends on the number of
descriptors) with a fixed relative threshold known at compile time improves
performance by reducing the TX queue structure footprint and the amount of
code to manage completions during a burst.

Completions are now requested at most once per burst after threshold is
reached.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
99c12dcca6 net/mlx5: handle Rx CQE compression
Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.

Before decompression:

      0           1          2           6         7         8
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+
  | CQE64 |  |  CQE64  | | CQE64 |   | CQE64 | | CQE64 | | CQE64 |
  |-------|  |---------| |-------|   |-------| |-------| |-------|
  | ..... |  | cqe8[0] | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[1] | |       | . |       | |       | | ..... |
  | ..... |  | ....... | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[7] | |       |   |       | |       | | ..... |
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+

After decompression:

      0          1     ...     8
  +-------+  +-------+     +-------+
  | CQE64 |  | CQE64 |     | CQE64 |
  |-------|  |-------|     |-------|
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |     | ..... |
  +-------+  +-------+     +-------+

This patch does not perform the entire decompression step as it would be
really expensive, instead the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.

Intermediate empty CQE64 entries are handed back to HW without further
processing.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
1d88ba1719 net/mlx5: refactor Tx data path
Bypass Verbs to improve Tx performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
6218063b39 net/mlx5: refactor Rx data path
Bypass Verbs to improve RX performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
cdab90cb5c net/mlx5: add Tx/Rx burst function selection wrapper
These wrappers are meant to prevent code duplication later.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
e72dd09b61 net/mlx5: add support for configuration through kvargs
The intent is to replace the remaining compile-time options and environment
variables with a common mean of runtime configuration. This commit only
adds the kvargs handling code, subsequent commits will update the rest.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
51a50a3d9b net/mlx5: add definitions for data path without Verbs
These structures and macros extend those exposed by libmlx5 (in mlx5_hw.h)
to let the PMD manage work queue and completion queue elements directly.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
36271e7446 net/mlx5: update prerequisites for upcoming enhancements
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
0cdddf4d06 net/mlx5: split Rx queue structure
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure rxq_ctrl.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
21c8bb4928 net/mlx5: split Tx queue structure
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure txq_ctrl.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
0431c40f47 net/mlx5: remove inline Tx support
Inline TX will be fully managed by the PMD after Verbs is bypassed in the
data path. Remove the current code until then.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
97c784fb18 net/mlx5: remove configuration variable
There is no scatter/gather support anymore, CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N
has no purpose and can be removed.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
3f5ac8f292 net/mlx5: remove Rx scatter support
This is done in preparation of bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
573f54afae net/mlx5: remove Tx gather support
This is done in preparation of bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
491770fafc net/mlx5: split memory registration function
Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
1d1bc87026 net: fix PCI class id
Use RTE_PCI_DEVICE macro to set all fields rather than explicitly setting
them individually in the code. This shortens the code while helping to
future-proof against future changes to the rte_pci_id structure.

Fixes: 701c8d80c8 ("pci: support class id probing")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
85e347db87 net/mlx5: enhance SR-IOV detection
SR-IOV mode is currently set when dealing with VF devices. PF devices must
be taken into account as well if they have active VFs.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
96327064ab net/mlx5: cosmetic changes
Add consistency to mlx5_rxtx.h.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
6a707608dc net/mlx5: fix Rx VLAN stripping capability check
A hardware capability check is missing before enabling RX VLAN stripping
during queue setup.

Also, while dev_conf.rxmode.hw_vlan_strip is currently a single bit that
can be stored in priv->hw_vlan_strip directly, it should be interpreted as
a boolean value for safety.

Fixes: f3db948918 ("mlx5: support Rx VLAN stripping")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
78d9c95d91 net/mlx: retrieve mbuf size through proper function
No need to allocate a mbuf for that.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
f3b492d7b4 net/mlx: ensure MTU update is effective
There is no guarantee that the new MTU is effective after writing its value
to sysfs. Retrieve it to be sure.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Nélio Laranjeiro
6e0eab3802 net/mlx: remove unused memory region property
Memory regions are always local with raw Ethernet queues, drop the remote
property as it adds extra processing on the hardware side.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
80c4e7281e net/mlx: regenerate autoconf file automatically
Mellanox PMDs must be rebuilt if a Verbs update would cause the autoconf
file to differ.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:48 +02:00
Adrien Mazarguil
d06c608c01 net/mlx: fix compilation with glibc 2.20
Since _BSD_SOURCE was deprecated in favor of _DEFAULT_SOURCE in Glibc 2.19
and entirely removed in 2.20, various BSD ioctl macros are not exposed
anymore when _XOPEN_SOURCE is defined, and linux/if.h now conflicts with
net/if.h.

Add _DEFAULT_SOURCE and keep _BSD_SOURCE for compatibility with older
versions.

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2016-06-20 17:21:34 +02:00
Olivier Matz
b37b528d95 mbuf: add new Rx flags for stripped VLAN
The behavior of PKT_RX_VLAN_PKT was not very well defined, resulting in
PMDs not advertising the same flags in similar conditions.

Following discussion in [1], introduce 2 new flags PKT_RX_VLAN_STRIPPED
and PKT_RX_QINQ_STRIPPED that are better defined:

  PKT_RX_VLAN_STRIPPED: a vlan has been stripped by the hardware and its
  tci is saved in mbuf->vlan_tci. This can only happen if vlan stripping
  is enabled in the RX configuration of the PMD.

For now, the old flag PKT_RX_VLAN_PKT is kept but marked as deprecated.
It should be removed from applications and PMDs in a future revision.

This patch also updates the drivers. For PKT_RX_VLAN_PKT:

- e1000, enic, i40e, mlx5, nfp, vmxnet3: done, PKT_RX_VLAN_PKT already
  had the same meaning than PKT_RX_VLAN_STRIPPED, minor update is
  required.
- fm10k: done, PKT_RX_VLAN_PKT already had the same meaning than
  PKT_RX_VLAN_STRIPPED, and vlan stripping is always enabled on fm10k.
- ixgbe: modification done (vector and normal), the old flag was set
  when a vlan was recognized, even if vlan stripping was disabled.
- the other drivers do not support vlan stripping.

For PKT_RX_QINQ_PKT, it was only supported on i40e, and the behavior was
already correct, so we can reuse the same bit value for
PKT_RX_QINQ_STRIPPED.

[1] http://dpdk.org/ml/archives/dev/2016-April/037837.html,

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-06-15 17:18:57 +02:00
Olivier Matz
e89880957f mbuf: add prefetch helpers
Some architectures (ex: Power8) have a cache line size of 128 bytes,
so the drivers should not expect that prefetching the second part of
the mbuf with rte_prefetch0(&m->cacheline1) is valid.

This commit add helpers that can be used by drivers to prefetch the
rx or tx part of the mbuf, whatever the cache line size.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-05-24 11:21:14 +02:00
Olivier Matz
d1d914ebbc mempool: allocate in several memory chunks by default
Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
many specific behavior that was required when hugepages are not
enabled in case we are using rte_mempool_xmem_create().

This change requires to update how kni and mellanox drivers lookup for
mbuf memory. For now, this will only work if there is only one memory
chunk (like today), but we could make use of rte_mempool_mem_iter() to
support more memory chunks.

We can also remove RTE_MEMPOOL_OBJ_NAME that is not required anymore for
the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (it was the case before)
when there is no hugepages support (rte_mempool_create_xmem() has to be
used). This is fixed in next commit.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
84121f1971 mempool: store memory chunks in a list
Do not use paddr table to store the mempool memory chunks.
This will allow to have several chunks with different virtual addresses.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
d86046f0cf mempool: use the list to iterate the elements
Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of rte_mempool_obj_iter() API, that
does not need to take as many arguments as before. The previous function
is kept as a private function, and renamed in this commit. It will be
removed in a next commit of the patch series.

The only internal users of this function are the mellanox drivers. The
code is updated accordingly.

Introducing an API compatibility for this function has been considered,
but it is not easy to do without keeping the old code, as the previous
function could also be used to browse elements that were not added in a
mempool. Moreover, the API is already be broken by other patches in this
version.

The library version was already updated in
commit 213af31e09 ("mempool: reduce structure size if no cache needed")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
c2c66588e2 mempool: remove const qualifier when browsing pools
This commit removes the const qualifier for the mempool in
rte_mempool_walk() callback prototype.

Indeed, most functions that can be done on a mempool require a non-const
mempool pointer, except the dump and the audit. Therefore, the
mempool_walk() is more useful if the mempool pointer is not const.

This is required by next commit where the mellanox drivers use
rte_mempool_walk() to iterate the mempools, then rte_mempool_obj_iter()
to iterate the objects in each mempool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
fbfd99551c mbuf: add raw allocation function
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-17 08:31:33 +02:00
Thomas Monjalon
b2feed01d6 ethdev: add 100G link speed
The link speed configuration is now done with bitmaps so 100G speed
requires only a new bit flag.
The actual link speed is a number so its size must be increased from
16-bit to 32-bit.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Tested-by: Matej Vido <vido@cesnet.cz>
2016-04-01 21:38:34 +02:00
Marc Sune
82113036e4 ethdev: redesign link speed config
This patch redesigns the API to set the link speed/s configuration
of an ethernet port. Specifically:

- it allows to define a set of advertised speeds for
  auto-negociation.
- it allows to disable link auto-negociation (single fixed speed).
- default: auto-negociate all supported speeds.

A flag autoneg in struct rte_eth_link indicates if link speed was a
result of auto-negociation or was fixed by configuration.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
Tested-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Beilei Xing <beilei.xing@intel.com>
Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-04-01 21:38:34 +02:00
Marc Sune
e274f57322 ethdev: add speed capabilities
The speed capabilities of a device can be retrieved with
rte_eth_dev_info_get().

The new field speed_capa is initialized in the drivers without
taking care of device characteristics in this patch.
When the capabilities of a driver are accurate, the table in
overview.rst must be filled.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
2016-04-01 21:38:34 +02:00
Yaacov Hazan
da04298a11 mlx5: fix RETA table size
When the number of RX queues is not a power of two,
the RETA table is configured to its maximum size for
better balancing.

Testing showed that limiting its size to 256 improves performance
noticeably with little to no  impact on balancing results.

Fixes: ebb30ec64a ("mlx5: increase RETA table size")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-31 15:15:10 +02:00
Adrien Mazarguil
b185e63fdf mlx: fix double mbuf free in Tx queue clean up
Once freed, completed mbufs pointers are not set to NULL in the TX queue.
Clean up function must take this into account.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Fixes: 7fae69eeff ("mlx4: new poll mode driver")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-31 15:15:10 +02:00
Nélio Laranjeiro
fbd2319037 mlx5: fix null RSS key handling
Update function can be called with no key to enable or disable a RSS
protocol, or with a key to be applied to the desired protocols.

Fixes: 2f97422e77 ("mlx5: support RSS hash update and get")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 19:01:37 +01:00
Nélio Laranjeiro
c64ccc0eca mlx5: fix overwritten RSS configuration
RSS configuration provided by the application should not be used as storage
by the PMD.

Fixes: 2f97422e77 ("mlx5: support RSS hash update and get")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 19:01:37 +01:00
Yaacov Hazan
e192ef8034 mlx5: add VLAN insertion offload
VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:44 +01:00
Olga Shern
4d803a7246 mlx5: support HW packet padding
Environment variable MLX5_PMD_ENABLE_PADDING enables HW packet padding
in PCI bus transactions.

When packet size is cache aligned and CRC stripping is enabled, 4 fewer
bytes are written to the PCI bus. Enabling padding makes such packets
aligned again.

In cases where PCI bandwidth is the bottleneck, padding can improve
performance by 10%.

This is disabled by default since this can also decrease performance for
unaligned packet sizes.

Signed-off-by: Olga Shern <olgas@mellanox.com>

fix packet padding macro check

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:44 +01:00
Olga Shern
4d3267092a mlx5: add Rx CRC stripping configuration
Until now, CRC was always stripped by hardware. This feature can be
configured since MLNX_OFED >= 3.2.

Signed-off-by: Olga Shern <olgas@mellanox.com>
2016-03-25 18:56:44 +01:00
Or Ami
a48deada65 mlx5: allow operation in secondary processes
Secondary processes are expected to use queues and other resources
allocated by the primary, however Verbs resources can only be shared
between processes when inherited through fork().

This limitation can be worked around for TX by configuring separate queues
from secondary processes.

Signed-off-by: Or Ami <ora@mellanox.com>
2016-03-25 18:56:44 +01:00
Or Ami
62072098b5 mlx5: support setting link up or down
Add driver functions to set link state up or down.
Burst functions are updated to make sure applications cannot attempt to
send/receive after link is brought down.

Signed-off-by: Or Ami <ora@mellanox.com>
2016-03-25 18:56:44 +01:00
Jianfeng Tan
78a38edf66 ethdev: query supported packet types
Add a new API rte_eth_dev_get_supported_ptypes to query what packet types
can be filled by a given device. The device should be already started or
its PMD RX burst function already decided, since the packet types supported
may vary depending on RX function.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:43 +01:00
Yaacov Hazan
d0087d760f mlx5: fix Rx checksum offload in non L3/L4 packets
Change rxq_cq_to_ol_flags() to set checksum flags according to packet type,
so for non L3/L4 packets the mbuf chksum_bad flags will not be set.

Fixes: 67fa62bc67 ("mlx5: support checksum offload")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-03-16 18:52:18 +01:00
Adrien Mazarguil
188b8ccdfe mlx5: add VLAN filtering for broadcast and IPv6 multicast
Unlike promiscuous and allmulticast flows, those should remain
VLAN-specific.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:18 +01:00
Adrien Mazarguil
cc5436bccb mlx5: remove redundant debug message
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Adrien Mazarguil
0d2186743d mlx5: manage all special flow types at once
This commit adds helpers to remove redundant code.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Or Ami
1371f4df16 mlx5: check port is configured as ethernet device
If the port link layer is not Ethernet, notify the user.

Signed-off-by: Or Ami <ora@mellanox.com>
2016-03-16 18:52:17 +01:00
Or Ami
2f636ae565 mlx5: fix possible crash during initialization
RSS configuration should not be freed when priv is NULL.

Fixes: 2f97422e77 ("mlx5: support RSS hash update and get")

Signed-off-by: Or Ami <ora@mellanox.com>
2016-03-16 18:52:17 +01:00
Adrien Mazarguil
0dc02cca8b mlx: use aligned memory to register regions
The first and last memory pool elements are usually cache-aligned but not
page-aligned, particularly when using huge pages.

Hardware performance can be improved significantly by registering memory
regions starting and ending on page boundaries.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
a859e8a904 mlx5: free buffers immediately after completion
This lowers the amount of cache misses.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
d970e99260 mlx5: avoid lkey retrieval for inlined packets
Improves performance as the lkey is not needed by hardware in this case.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
081f7eae24 mlx5: process offload flags only when requested
Improve performance by processing offloads only when requested by the
application.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
e16820236d mlx5: remove one indirection level from Rx/Tx
Avoid dereferencing pointers twice to get to fast Verbs functions by
storing them directly in RX/TX queue structures.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
806af69386 mlx5: reorder Rx/Tx queue structure
Remove padding and move important fields to the beginning for better
performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
5e1d11ecf5 mlx5: prefetch next Tx mbuf header and data
This change improves performance noticeably.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
f3db948918 mlx5: support Rx VLAN stripping
Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
76f5c99e68 mlx5: support flow director
Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
1a56bdaceb mlx5: make flow steering rule generator more generic
Upcoming flow director support will reuse this function to generate filter
rules.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
0497ddaac5 mlx5: add special flows for broadcast and IPv6 multicast
Until now, broadcast frames were handled like unicast. Moving the related
flow to the special flows table frees up the related unicast MAC entry.

The same method is used to handle IPv6 multicast frames.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
083c2dd317 mlx5: refactor special flows handling
Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.

New function priv_rehash_flows() implements the logic to enable/disable
relevant flows in one place from any context.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Adrien Mazarguil
a94fda95ed mlx5: fix header generation in parallel builds
Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
David Marchand
86977fccb9 mlx5: support setting primary MAC address
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
ebb30ec64a mlx5: increase RETA table size
ConnectX-4 NICs can handle at most 512 entries in RETA table.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-03 20:39:47 +01:00
Panu Matilainen
948fd64bef mk: replace the combined library with a linker script
The physically linked-together combined library has been an increasing
source of problems, as was predicted when library and symbol versioning
was introduced. Replace the complex and fragile construction with a
simple linker script which achieves the same without all the problems,
remove the related kludges from eg mlx drivers.

Since creating the linker script is practically zero cost, remove the
config option and just create it always.

Based on a patch by Sergio Gonzales Monroy, linker script approach
initially suggested by Neil Horman.

Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-03-01 14:37:27 +01:00
Huawei Xie
693f715da4 remove extra parentheses in return statement
fix the error reported by checkpatch:
  "ERROR: return is not a function, parentheses are not required"

remove parentheses in return like:
  "return (logical expressions)"

remove parentheses in return a function like:
  "return (rte_mempool_lookup(...))"

Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
2016-02-10 15:47:50 +01:00
Nelio Laranjeiro
4f52bbfb0e mlx5: fix available entries in Tx rings
The number of available entries in TX rings is taken before performing
completion, effectively making rings smaller than they are and causing
TX performance issues under load.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Olga Shern
ecbfdbad6e mlx5: fix local protection error when Tx MP to MR cache is full
When MP to MR cache is full, the last (newest) MR is freed instead of the
first (oldest) one, causing local protection errors during TX.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Olga Shern
0a3b350da8 mlx5: fix Tx loss after initialization
Pre-registering mbuf memory pools when creating TX queues avoids costly
registrations later in the data path.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
34d0626306 mlx5: fix scattered Tx with too many segments
Buffers with too many segments are linearized to overcome
MLX5_PMD_SGE_WR_N, unfortunately the last segment is never sent.

Fixes: 3ee8444608 ("mlx5: support scattered Rx and Tx")

Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
8340392e9b mlx5: fix memory registration for indirect mbuf data
Indirect mbuf data may come from a different mempool which must be
registered separately as another memory region, otherwise such mbufs cannot
be sent.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
7efc807d48 mlx5: fix possible crash when clearing device statistics
A typo causes TX stats indices to be retrieved from RX queues.

Fixes: 87011737b7 ("mlx5: add software counters")

Reported-by: Nicolas Harnois <nicolas.harnois@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
612ad38209 mlx5: fix hash Rx queue type in RSS mode
Only seen since IPv6 RSS support was added, confusion about the purpose of
the hash_rxq_type_from_n() function has caused it to return invalid hash RX
queue types.

Refactor function for its intended purpose, rename it
hash_rxq_type_from_pos() and update comment with a better description.

Fixes: a76133214d ("mlx5: use separate indirection table for default hash Rx queue")

Reported-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-11-03 22:57:32 +01:00
Adrien Mazarguil
13d57bd5cd mlx5: fix build with debugging enabled
The following error occurs when CONFIG_RTE_LIBRTE_MLX5_DEBUG=y:

 drivers/net/mlx5/mlx5.c:381:4: error: ISO C forbids braced-groups within expressions

RTE_MIN() uses the non-standard ({ ... }) syntax to declare variables within
parentheses, which is rejected by -pedantic.

Since the RSS_INDIRECTION_TABLE_SIZE check is meant to go away as soon as
DPDK supports larger/variable indirection tables, put it in a separate
condition.

Fixes: 634efbc2c8 ("mlx5: support RETA query and update")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-03 22:57:21 +01:00
Bernard Iremonger
eeefe73f0a drivers: copy PCI device info to ethdev data
Use new function rte_eth_copy_pci_info.
Copy device info for the following pdevs:

bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-11-03 18:39:26 +01:00
Nelio Laranjeiro
634efbc2c8 mlx5: support RETA query and update
ConnectX-4 adapters do not have a constant indirection table size, which is
set at runtime from the number of RX queues. The maximum size is retrieved
using a hardware query and is normally 512.

Since the current RETA API cannot handle a variable size, any query/update
command causes it to be silently updated to RSS_INDIRECTION_TABLE_SIZE
entries regardless of the original size.

Also due to the underlying type of the configuration structure, the maximum
size is limited to RSS_INDIRECTION_TABLE_SIZE (currently 128, at most 256
entries).

A port stop/start must be done to apply the new RETA configuration.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-11-03 11:23:15 +01:00
Nelio Laranjeiro
198a3c339a mlx5: handle link status interrupts
Add interrupts handler for port status notification.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-01 11:23:45 +01:00
Yaacov Hazan
1f527a1152 mlx5: fix build with GCC < 4.6
Seen with GCC < 4.6:

 error: unknown field ‘tcp_udp’ specified in initializer
 error: extra brace group at end of initializer

Static initialization of anonymous structs/unions is a C11 feature
properly supported only since GCC 4.6.

Work around compilation errors with older versions by expanding
struct ibv_exp_flow_spec into struct hash_rxq_init.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-01 11:23:45 +01:00
Adrien Mazarguil
2bbd8c9f8c mlx5: enable multi packet send WR in Tx CQ
For adapters that support it, this flag improves performance outside of VF
context.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-01 11:23:45 +01:00
Adrien Mazarguil
ae018afe86 mlx5: support IPv6 RSS using experimental flows
Normal flows do not currently provide IPv6 support.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-11-01 11:23:45 +01:00
Nelio Laranjeiro
5b45c208f8 mlx5: disable useless flows in promiscuous mode
Only a single flow per hash RX queue is needed in promiscuous mode.
Disable others to free up hardware resources.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-11-01 11:23:44 +01:00
Nelio Laranjeiro
0573873d5b mlx5: use one RSS hash key per flow type
DPDK expects to have an RSS hash key per flow type (IPv4, IPv6, UDPv4,
etc.), to handle this the PMD must keep a table of hash keys to be able
to reconfigure the queues at each start/stop call.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-31 00:21:58 +01:00
Nelio Laranjeiro
2f97422e77 mlx5: support RSS hash update and get
First implementation of rss_hash_update and rss_hash_conf_get, those
functions still lack in functionality but are usable to change the RSS
hash key.  For now, the PMD does not handle an indirection table for
each kind of flow (IPv4, IPv6, etc.), the same RSS hash key is used
for all protocols.  This situation explains why the rss_hash_conf_get
returns the RSS hash key for all DPDK supported protocols and why the
hash key is set for all of them too.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-31 00:21:58 +01:00
Olga Shern
542c21fb0f mlx5: use alternate method to configure promisc and allmulti modes
Promiscuous and allmulticast modes were historically enabled by adding
specific flows with types IBV_FLOW_ATTR_ALL_DEFAULT or
IBV_EXP_FLOW_ATTR_MC_DEFAULT to each hash RX queue, but this method is
deprecated.

- Promiscuous mode is now enabled by omitting destination MAC addresses from
  basic flow specifications.
- Allmulticast mode is now enabled by using flow specifications that match
  the broadcast bit in destination MAC addresses.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Olga Shern
c74564e5b3 mlx5: define specific flow steering rules for each hash Rx QP
All hash RX QPs currently use the same flow steering rule (L2 MAC filtering)
regardless of their type (TCP, UDP, IPv4, IPv6), which prevents them from
being dispatched properly. This is fixed by adding flow information to the
hash RX queue initialization data and generating specific flow steering
rules for each of them.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Nelio Laranjeiro
95e16ef325 mlx5: adapt indirection table size depending on Rx queues number
Use the maximum size of the indirection table when the number of requested
RX queues is not a power of two, this help to improve RSS balancing.

A message informs users that balancing is not optimal in such cases.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-31 00:21:58 +01:00
Olga Shern
a76133214d mlx5: use separate indirection table for default hash Rx queue
The default hash RX queue handles packets that are not matched by more
specific types and requires its own indirection table of size 1 to work
properly.

This commit implements support for multiple indirection tables by grouping
their layout and properties in a static initialization table.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
ecc1c29df8 mlx5: refactor Rx code for the new verbs RSS API
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.

Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.

RSS hash key is not configurable yet.

Summary of changes:

- Individual DPDK RX queues do not store flow properties anymore, this info
  is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
  queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
  RSS) to remove that special case.
- Hash RX queues and indirection table are created/destroyed when device
  is started/stopped in addition to create/destroy flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
  buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
  it does not work correctly with WQs. Use the largest possible size
  instead, since CQ size does not have to be the same as the number of
  elements in the RX queue. This also applies to the maximum number of
  outstanding WRs in a WQ (max_recv_wr).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
aa7f63ab35 mlx5: get rid of the WR structure in Rx queue elements
Removing this structure reduces the size of SG and non-SG RX queue elements
significantly to improve performance.

An nice side effect is that the mbuf pointer is now fully stored in
struct rxq_elt instead of relying on the WR ID data offset hack.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
61bdf1e0e5 mlx5: use fast verbs interface for scattered Rx
This commit updates mlx5_rx_burst_sp() to use the fast verbs interface for
posting RX buffers just like mlx5_rx_burst(). Doing so avoids a loop in
libmlx5 and an indirect function call through libibverbs.

Note: recv_sg_list() is not implemented in the QP burst API, this commit is
only to prepare transition to the WQ-based API.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
67fa62bc67 mlx5: support checksum offload
This is the same implementation as mlx4.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:41:01 +01:00
Adrien Mazarguil
e9086978b2 mlx5: support VLAN filtering
All MAC RX flows must be updated with VLAN information when configuring a
VLAN filter.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:39:30 +01:00
Adrien Mazarguil
02d7543043 mlx5: support flow control
Like most other device control operations, those are handled by the related
kernel network device through syscalls.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:34:21 +01:00
Adrien Mazarguil
cb8faed7dd mlx5: support link status update
Link information is retrieved using ethtool ioctls.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:32:05 +01:00
Adrien Mazarguil
1bdbe1af99 mlx5: support promiscuous and allmulticast Rx modes
These modes require special non-MAC flows.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-30 22:31:22 +01:00
Adrien Mazarguil
87011737b7 mlx5: add software counters
Hardware counters are not supported yet.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:24:08 +01:00
Adrien Mazarguil
cf37ca9563 mlx5: support MTU configuration
Depending on the MTU and whether jumbo frames are enabled, RX queues may
switch between SG and non-SG modes for better performance.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:23:45 +01:00
Adrien Mazarguil
3ee8444608 mlx5: support scattered Rx and Tx
A dedicated RX callback is added to handle scattered buffers. For better
performance, it is only used when jumbo frames are enabled and MTU is larger
than a single mbuf.

On the TX path, scattered buffers are also handled in a separate function.
When there are more than MLX5_PMD_SGE_WR_N segments in a given mbuf, the
remaining segments are linearized in the last SGE entry.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:23:00 +01:00
Adrien Mazarguil
e60fbd5b24 mlx5: add device configure/start/stop
This commit adds the remaining missing callbacks to make mlx5 usable.
Like mlx4, device start and stop are implemented on top of MAC RX flows.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Francesco Santoro <francesco.santoro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
2015-10-30 22:22:12 +01:00
Adrien Mazarguil
3318aef7e7 mlx5: add MAC handling
This commit adds support for MAC flow steering rules mandatory for the RX
path as well as the related callbacks to add/remove MAC addresses.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
2015-10-30 22:21:51 +01:00
Adrien Mazarguil
2e22920b85 mlx5: support non-scattered Tx and Rx
RSS implementation with parent/child QPs comes from mlx4 and is temporary.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:21:07 +01:00
Adrien Mazarguil
771fa900b7 mlx5: introduce new driver for Mellanox ConnectX-4 adapters
In its current state, this driver implements the bare minimum to initialize
itself and Mellanox ConnectX-4 adapters without doing anything else
(no RX/TX for instance). It is disabled by default since it is based on the
mlx4 driver and also depends on libibverbs.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
2015-10-30 22:03:42 +01:00