Commit Graph

283 Commits

Author SHA1 Message Date
Adrien Mazarguil
78d9c95d91 net/mlx: retrieve mbuf size through proper function
No need to allocate a mbuf for that.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Nélio Laranjeiro
6e0eab3802 net/mlx: remove unused memory region property
Memory regions are always local with raw Ethernet queues, drop the remote
property as it adds extra processing on the hardware side.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-06-20 17:21:49 +02:00
Olivier Matz
b37b528d95 mbuf: add new Rx flags for stripped VLAN
The behavior of PKT_RX_VLAN_PKT was not very well defined, resulting in
PMDs not advertising the same flags in similar conditions.

Following discussion in [1], introduce 2 new flags PKT_RX_VLAN_STRIPPED
and PKT_RX_QINQ_STRIPPED that are better defined:

  PKT_RX_VLAN_STRIPPED: a vlan has been stripped by the hardware and its
  tci is saved in mbuf->vlan_tci. This can only happen if vlan stripping
  is enabled in the RX configuration of the PMD.

For now, the old flag PKT_RX_VLAN_PKT is kept but marked as deprecated.
It should be removed from applications and PMDs in a future revision.

This patch also updates the drivers. For PKT_RX_VLAN_PKT:

- e1000, enic, i40e, mlx5, nfp, vmxnet3: done, PKT_RX_VLAN_PKT already
  had the same meaning than PKT_RX_VLAN_STRIPPED, minor update is
  required.
- fm10k: done, PKT_RX_VLAN_PKT already had the same meaning than
  PKT_RX_VLAN_STRIPPED, and vlan stripping is always enabled on fm10k.
- ixgbe: modification done (vector and normal), the old flag was set
  when a vlan was recognized, even if vlan stripping was disabled.
- the other drivers do not support vlan stripping.

For PKT_RX_QINQ_PKT, it was only supported on i40e, and the behavior was
already correct, so we can reuse the same bit value for
PKT_RX_QINQ_STRIPPED.

[1] http://dpdk.org/ml/archives/dev/2016-April/037837.html,

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-06-15 17:18:57 +02:00
Olivier Matz
e89880957f mbuf: add prefetch helpers
Some architectures (ex: Power8) have a cache line size of 128 bytes,
so the drivers should not expect that prefetching the second part of
the mbuf with rte_prefetch0(&m->cacheline1) is valid.

This commit add helpers that can be used by drivers to prefetch the
rx or tx part of the mbuf, whatever the cache line size.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2016-05-24 11:21:14 +02:00
Olivier Matz
d1d914ebbc mempool: allocate in several memory chunks by default
Introduce rte_mempool_populate_default() which allocates
mempool objects in several memzones.

The mempool header is now always allocated in a specific memzone
(not with its objects). Thanks to this modification, we can remove
many specific behavior that was required when hugepages are not
enabled in case we are using rte_mempool_xmem_create().

This change requires to update how kni and mellanox drivers lookup for
mbuf memory. For now, this will only work if there is only one memory
chunk (like today), but we could make use of rte_mempool_mem_iter() to
support more memory chunks.

We can also remove RTE_MEMPOOL_OBJ_NAME that is not required anymore for
the lookup, as memory chunks are referenced by the mempool.

Note that rte_mempool_create() is still broken (it was the case before)
when there is no hugepages support (rte_mempool_create_xmem() has to be
used). This is fixed in next commit.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
84121f1971 mempool: store memory chunks in a list
Do not use paddr table to store the mempool memory chunks.
This will allow to have several chunks with different virtual addresses.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
d86046f0cf mempool: use the list to iterate the elements
Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of rte_mempool_obj_iter() API, that
does not need to take as many arguments as before. The previous function
is kept as a private function, and renamed in this commit. It will be
removed in a next commit of the patch series.

The only internal users of this function are the mellanox drivers. The
code is updated accordingly.

Introducing an API compatibility for this function has been considered,
but it is not easy to do without keeping the old code, as the previous
function could also be used to browse elements that were not added in a
mempool. Moreover, the API is already be broken by other patches in this
version.

The library version was already updated in
commit 213af31e09 ("mempool: reduce structure size if no cache needed")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
c2c66588e2 mempool: remove const qualifier when browsing pools
This commit removes the const qualifier for the mempool in
rte_mempool_walk() callback prototype.

Indeed, most functions that can be done on a mempool require a non-const
mempool pointer, except the dump and the audit. Therefore, the
mempool_walk() is more useful if the mempool pointer is not const.

This is required by next commit where the mellanox drivers use
rte_mempool_walk() to iterate the mempools, then rte_mempool_obj_iter()
to iterate the objects in each mempool.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-19 14:40:13 +02:00
Olivier Matz
fbfd99551c mbuf: add raw allocation function
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-17 08:31:33 +02:00
Adrien Mazarguil
b185e63fdf mlx: fix double mbuf free in Tx queue clean up
Once freed, completed mbufs pointers are not set to NULL in the TX queue.
Clean up function must take this into account.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Fixes: 7fae69eeff ("mlx4: new poll mode driver")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-31 15:15:10 +02:00
Yaacov Hazan
e192ef8034 mlx5: add VLAN insertion offload
VLAN insertion can be done in hardware when supported in Verbs. A software
fallback is provided otherwise. The software implementation is also used
when multi-packet send is enabled on a queue, as both features are mutually
exclusive.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:44 +01:00
Olga Shern
4d3267092a mlx5: add Rx CRC stripping configuration
Until now, CRC was always stripped by hardware. This feature can be
configured since MLNX_OFED >= 3.2.

Signed-off-by: Olga Shern <olgas@mellanox.com>
2016-03-25 18:56:44 +01:00
Jianfeng Tan
78a38edf66 ethdev: query supported packet types
Add a new API rte_eth_dev_get_supported_ptypes to query what packet types
can be filled by a given device. The device should be already started or
its PMD RX burst function already decided, since the packet types supported
may vary depending on RX function.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-25 18:56:43 +01:00
Yaacov Hazan
d0087d760f mlx5: fix Rx checksum offload in non L3/L4 packets
Change rxq_cq_to_ol_flags() to set checksum flags according to packet type,
so for non L3/L4 packets the mbuf chksum_bad flags will not be set.

Fixes: 67fa62bc67 ("mlx5: support checksum offload")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-03-16 18:52:18 +01:00
Adrien Mazarguil
0dc02cca8b mlx: use aligned memory to register regions
The first and last memory pool elements are usually cache-aligned but not
page-aligned, particularly when using huge pages.

Hardware performance can be improved significantly by registering memory
regions starting and ending on page boundaries.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
a859e8a904 mlx5: free buffers immediately after completion
This lowers the amount of cache misses.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
d970e99260 mlx5: avoid lkey retrieval for inlined packets
Improves performance as the lkey is not needed by hardware in this case.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
081f7eae24 mlx5: process offload flags only when requested
Improve performance by processing offloads only when requested by the
application.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
e16820236d mlx5: remove one indirection level from Rx/Tx
Avoid dereferencing pointers twice to get to fast Verbs functions by
storing them directly in RX/TX queue structures.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
5e1d11ecf5 mlx5: prefetch next Tx mbuf header and data
This change improves performance noticeably.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2016-03-16 18:52:17 +01:00
Yaacov Hazan
f3db948918 mlx5: support Rx VLAN stripping
Allows HW to strip the 802.1Q header from incoming frames and report it
through the mbuf structure.

This feature requires MLNX_OFED >= 3.2.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2016-03-16 18:52:17 +01:00
Nelio Laranjeiro
4f52bbfb0e mlx5: fix available entries in Tx rings
The number of available entries in TX rings is taken before performing
completion, effectively making rings smaller than they are and causing
TX performance issues under load.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Olga Shern
ecbfdbad6e mlx5: fix local protection error when Tx MP to MR cache is full
When MP to MR cache is full, the last (newest) MR is freed instead of the
first (oldest) one, causing local protection errors during TX.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Olga Shern
0a3b350da8 mlx5: fix Tx loss after initialization
Pre-registering mbuf memory pools when creating TX queues avoids costly
registrations later in the data path.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
34d0626306 mlx5: fix scattered Tx with too many segments
Buffers with too many segments are linearized to overcome
MLX5_PMD_SGE_WR_N, unfortunately the last segment is never sent.

Fixes: 3ee8444608 ("mlx5: support scattered Rx and Tx")

Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
8340392e9b mlx5: fix memory registration for indirect mbuf data
Indirect mbuf data may come from a different mempool which must be
registered separately as another memory region, otherwise such mbufs cannot
be sent.

Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")

Signed-off-by: Jesper Wramberg <jesper.wramberg@gmail.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-11-24 17:49:27 +01:00
Adrien Mazarguil
ecc1c29df8 mlx5: refactor Rx code for the new verbs RSS API
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.

Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.

RSS hash key is not configurable yet.

Summary of changes:

- Individual DPDK RX queues do not store flow properties anymore, this info
  is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
  queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
  RSS) to remove that special case.
- Hash RX queues and indirection table are created/destroyed when device
  is started/stopped in addition to create/destroy flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
  buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
  it does not work correctly with WQs. Use the largest possible size
  instead, since CQ size does not have to be the same as the number of
  elements in the RX queue. This also applies to the maximum number of
  outstanding WRs in a WQ (max_recv_wr).

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
aa7f63ab35 mlx5: get rid of the WR structure in Rx queue elements
Removing this structure reduces the size of SG and non-SG RX queue elements
significantly to improve performance.

An nice side effect is that the mbuf pointer is now fully stored in
struct rxq_elt instead of relying on the WR ID data offset hack.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
61bdf1e0e5 mlx5: use fast verbs interface for scattered Rx
This commit updates mlx5_rx_burst_sp() to use the fast verbs interface for
posting RX buffers just like mlx5_rx_burst(). Doing so avoids a loop in
libmlx5 and an indirect function call through libibverbs.

Note: recv_sg_list() is not implemented in the QP burst API, this commit is
only to prepare transition to the WQ-based API.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-31 00:21:58 +01:00
Adrien Mazarguil
67fa62bc67 mlx5: support checksum offload
This is the same implementation as mlx4.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:41:01 +01:00
Adrien Mazarguil
87011737b7 mlx5: add software counters
Hardware counters are not supported yet.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:24:08 +01:00
Adrien Mazarguil
3ee8444608 mlx5: support scattered Rx and Tx
A dedicated RX callback is added to handle scattered buffers. For better
performance, it is only used when jumbo frames are enabled and MTU is larger
than a single mbuf.

On the TX path, scattered buffers are also handled in a separate function.
When there are more than MLX5_PMD_SGE_WR_N segments in a given mbuf, the
remaining segments are linearized in the last SGE entry.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:23:00 +01:00
Adrien Mazarguil
2e22920b85 mlx5: support non-scattered Tx and Rx
RSS implementation with parent/child QPs comes from mlx4 and is temporary.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2015-10-30 22:21:07 +01:00