Commit Graph

46 Commits

Author SHA1 Message Date
Yaacov Hazan
2a66cf3789 net/mlx5: support inline send
Implement send inline feature which copies packet data directly into
work queue entries (WQEs) for improved latency. The maximum packet
size and the minimum number of Tx queues to qualify for inline send
are user-configurable.

This feature is effective when HW causes a performance bottleneck.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-06-27 16:17:52 +02:00
Adrien Mazarguil
c305090bba net/mlx5: replace countdown with threshold for Tx completions
Replacing the variable countdown (which depends on the number of
descriptors) with a fixed relative threshold known at compile time improves
performance by reducing the TX queue structure footprint and the amount of
code to manage completions during a burst.

Completions are now requested at most once per burst after threshold is
reached.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
99c12dcca6 net/mlx5: handle Rx CQE compression
Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.

Before decompression:

      0           1          2           6         7         8
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+
  | CQE64 |  |  CQE64  | | CQE64 |   | CQE64 | | CQE64 | | CQE64 |
  |-------|  |---------| |-------|   |-------| |-------| |-------|
  | ..... |  | cqe8[0] | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[1] | |       | . |       | |       | | ..... |
  | ..... |  | ....... | |       | . |       | |       | | ..... |
  | ..... |  | cqe8[7] | |       |   |       | |       | | ..... |
  +-------+  +---------+ +-------+   +-------+ +-------+ +-------+

After decompression:

      0          1     ...     8
  +-------+  +-------+     +-------+
  | CQE64 |  | CQE64 |     | CQE64 |
  |-------|  |-------|     |-------|
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |  .  | ..... |
  | ..... |  | ..... |     | ..... |
  +-------+  +-------+     +-------+

This patch does not perform the entire decompression step as it would be
really expensive, instead the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.

Intermediate empty CQE64 entries are handed back to HW without further
processing.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
1d88ba1719 net/mlx5: refactor Tx data path
Bypass Verbs to improve Tx performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
6218063b39 net/mlx5: refactor Rx data path
Bypass Verbs to improve RX performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
36271e7446 net/mlx5: update prerequisites for upcoming enhancements
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
0cdddf4d06 net/mlx5: split Rx queue structure
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure rxq_ctrl.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
21c8bb4928 net/mlx5: split Tx queue structure
To keep the data path as efficient as possible, move fields only useful to
the control path into new structure txq_ctrl.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
0431c40f47 net/mlx5: remove inline Tx support
Inline TX will be fully managed by the PMD after Verbs is bypassed in the
data path. Remove the current code until then.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:52 +02:00
Nélio Laranjeiro
3f5ac8f292 net/mlx5: remove Rx scatter support
This is done in preparation of bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
573f54afae net/mlx5: remove Tx gather support
This is done in preparation of bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Nélio Laranjeiro
491770fafc net/mlx5: split memory registration function
Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-27 16:17:51 +02:00
Adrien Mazarguil
96327064ab net/mlx5: cosmetic changes
Add consistency to mlx5_rxtx.h.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Adrien Mazarguil
78d9c95d91 net/mlx: retrieve mbuf size through proper function
No need to allocate a mbuf for that.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Olivier Matz
d1d914ebbc mempool: allocate in several memory chunks by default
Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
many specific behavior that was required when hugepages are not
enabled in case we are using rte_mempool_xmem_create().

This change requires to update how kni and mellanox drivers lookup for
mbuf memory. For now, this will only work if there is only one memory
chunk (like today), but we could make use of rte_mempool_mem_iter() to
support more memory chunks.

We can also remove RTE_MEMPOOL_OBJ_NAME that is not required anymore for
the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (it was the case before)
when there is no hugepages support (rte_mempool_create_xmem() has to be
used). This is fixed in next commit.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
c2c66588e2 mempool: remove const qualifier when browsing pools
This commit removes the const qualifier for the mempool in
rte_mempool_walk() callback prototype.

Indeed, most functions that can be done on a mempool require a non-const
mempool pointer, except the dump and the audit. Therefore, the
mempool_walk() is more useful if the mempool pointer is not const.

This is required by next commit where the mellanox drivers use
rte_mempool_walk() to iterate the mempools, then rte_mempool_obj_iter()
to iterate the objects in each mempool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Yaacov Hazan
e192ef8034 mlx5: add VLAN insertion offload
VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:44 +01:00
Olga Shern
4d3267092a mlx5: add Rx CRC stripping configuration
Until now, CRC was always stripped by hardware. This feature can be
configured since MLNX_OFED >= 3.2.

Signed-off-by: Olga Shern <olgas@mellanox.com>
2016-03-25 18:56:44 +01:00
Or Ami
a48deada65 mlx5: allow operation in secondary processes
Secondary processes are expected to use queues and other resources
allocated by the primary, however Verbs resources can only be shared
between processes when inherited through fork().

This limitation can be worked around for TX by configuring separate queues
from secondary processes.

Signed-off-by: Or Ami <ora@mellanox.com>
2016-03-25 18:56:44 +01:00
Adrien Mazarguil
188b8ccdfe mlx5: add VLAN filtering for broadcast and IPv6 multicast
Unlike promiscuous and allmulticast flows, those should remain
VLAN-specific.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:18 +01:00
Adrien Mazarguil
0dc02cca8b mlx: use aligned memory to register regions
The first and last memory pool elements are usually cache-aligned but not
page-aligned, particularly when using huge pages.

Hardware performance can be improved significantly by registering memory
regions starting and ending on page boundaries.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
e16820236d mlx5: remove one indirection level from Rx/Tx
Avoid dereferencing pointers twice to get to fast Verbs functions by
storing them directly in RX/TX queue structures.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
806af69386 mlx5: reorder Rx/Tx queue structure
Remove padding and move important fields to the beginning for better
performance.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
f3db948918 mlx5: support Rx VLAN stripping
Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
76f5c99e68 mlx5: support flow director
Add support for flow director filters (RTE_FDIR_MODE_PERFECT and
RTE_FDIR_MODE_PERFECT_MAC_VLAN modes).

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
1a56bdaceb mlx5: make flow steering rule generator more generic
Upcoming flow director support will reuse this function to generate filter
rules.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
0497ddaac5 mlx5: add special flows for broadcast and IPv6 multicast
Until now, broadcast frames were handled like unicast. Moving the related
flow to the special flows table frees up the related unicast MAC entry.

The same method is used to handle IPv6 multicast frames.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
083c2dd317 mlx5: refactor special flows handling
Merge redundant code by adding a static initialization table to manage
promiscuous and allmulticast (special) flows.

New function priv_rehash_flows() implements the logic to enable/disable
relevant flows in one place from any context.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Olga Shern
0a3b350da8 mlx5: fix Tx loss after initialization
Pre-registering mbuf memory pools when creating TX queues avoids costly
registrations later in the data path.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Yaacov Hazan
1f527a1152 mlx5: fix build with GCC < 4.6
Seen with GCC < 4.6:

 error: unknown field ‘tcp_udp’ specified in initializer
 error: extra brace group at end of initializer

Static initialization of anonymous structs/unions is a C11 feature
properly supported only since GCC 4.6.

Work around compilation errors with older versions by expanding
struct ibv_exp_flow_spec into struct hash_rxq_init.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-01 11:23:45 +01:00
Adrien Mazarguil
ae018afe86 mlx5: support IPv6 RSS using experimental flows
Normal flows do not currently provide IPv6 support.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-11-01 11:23:45 +01:00
Nelio Laranjeiro
5b45c208f8 mlx5: disable useless flows in promiscuous mode
Only a single flow per hash RX queue is needed in promiscuous mode.
Disable others to free up hardware resources.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-11-01 11:23:44 +01:00
Nelio Laranjeiro
0573873d5b mlx5: use one RSS hash key per flow type
DPDK expects to have an RSS hash key per flow type (IPv4, IPv6, UDPv4,
etc.), to handle this the PMD must keep a table of hash keys to be able
to reconfigure the queues at each start/stop call.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-31 00:21:58 +01:00
Nelio Laranjeiro
2f97422e77 mlx5: support RSS hash update and get
First implementation of rss_hash_update and rss_hash_conf_get, those
functions still lack in functionality but are usable to change the RSS
hash key.  For now, the PMD does not handle an indirection table for
each kind of flow (IPv4, IPv6, etc.), the same RSS hash key is used
for all protocols.  This situation explains why the rss_hash_conf_get
returns the RSS hash key for all DPDK supported protocols and why the
hash key is set for all of them too.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-10-31 00:21:58 +01:00
Olga Shern
c74564e5b3 mlx5: define specific flow steering rules for each hash Rx QP
All hash RX QPs currently use the same flow steering rule (L2 MAC filtering)
regardless of their type (TCP, UDP, IPv4, IPv6), which prevents them from
being dispatched properly. This is fixed by adding flow information to the
hash RX queue initialization data and generating specific flow steering
rules for each of them.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Olga Shern
a76133214d mlx5: use separate indirection table for default hash Rx queue
The default hash RX queue handles packets that are not matched by more
specific types and requires its own indirection table of size 1 to work
properly.

This commit implements support for multiple indirection tables by grouping
their layout and properties in a static initialization table.

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
ecc1c29df8 mlx5: refactor Rx code for the new verbs RSS API
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.

Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.

RSS hash key is not configurable yet.

Summary of changes:

- Individual DPDK RX queues do not store flow properties anymore, this info
  is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
  queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
  RSS) to remove that special case.
- Hash RX queues and indirection table are created/destroyed when device
  is started/stopped in addition to create/destroy flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
  buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
  it does not work correctly with WQs. Use the largest possible size
  instead, since CQ size does not have to be the same as the number of
  elements in the RX queue. This also applies to the maximum number of
  outstanding WRs in a WQ (max_recv_wr).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
aa7f63ab35 mlx5: get rid of the WR structure in Rx queue elements
Removing this structure reduces the size of SG and non-SG RX queue elements
significantly to improve performance.

An nice side effect is that the mbuf pointer is now fully stored in
struct rxq_elt instead of relying on the WR ID data offset hack.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
67fa62bc67 mlx5: support checksum offload
This is the same implementation as mlx4.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:41:01 +01:00
Adrien Mazarguil
e9086978b2 mlx5: support VLAN filtering
All MAC RX flows must be updated with VLAN information when configuring a
VLAN filter.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:39:30 +01:00
Adrien Mazarguil
1bdbe1af99 mlx5: support promiscuous and allmulticast Rx modes
These modes require special non-MAC flows.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-30 22:31:22 +01:00
Adrien Mazarguil
87011737b7 mlx5: add software counters
Hardware counters are not supported yet.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:24:08 +01:00
Adrien Mazarguil
cf37ca9563 mlx5: support MTU configuration
Depending on the MTU and whether jumbo frames are enabled, RX queues may
switch between SG and non-SG modes for better performance.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:23:45 +01:00
Adrien Mazarguil
3ee8444608 mlx5: support scattered Rx and Tx
A dedicated RX callback is added to handle scattered buffers. For better
performance, it is only used when jumbo frames are enabled and MTU is larger
than a single mbuf.

On the TX path, scattered buffers are also handled in a separate function.
When there are more than MLX5_PMD_SGE_WR_N segments in a given mbuf, the
remaining segments are linearized in the last SGE entry.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:23:00 +01:00
Adrien Mazarguil
3318aef7e7 mlx5: add MAC handling
This commit adds support for MAC flow steering rules mandatory for the RX
path as well as the related callbacks to add/remove MAC addresses.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
2015-10-30 22:21:51 +01:00
Adrien Mazarguil
2e22920b85 mlx5: support non-scattered Tx and Rx
RSS implementation with parent/child QPs comes from mlx4 and is temporary.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:21:07 +01:00