Commit Graph

594 Commits

Author SHA1 Message Date
David Marchand
5fd4d04969 net/mlx5: fix count in xstats
With the commit af4f09f282 ("net/mlx5: prefix all functions with mlx5"),
mlx5_xstats_get() is not compliant any longer with the api.
It always returns the caller max entries count while it should return how
many entries it wrote/wanted to write.

Fixes: af4f09f282 ("net/mlx5: prefix all functions with mlx5")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-23 00:35:01 +02:00
Gavin Hu
74572f23cd net/mlx5: fix build with clang on ARM
This patch adds a pair of "()" to embrace the argument
input to the function-like macro invocation.

drivers/net/mlx5/mlx5_rxtx_vec.c:37:
drivers/net/mlx5/mlx5_rxtx_vec_neon.h:170:24: error: too many arguments
provided to function-like macro invocation
	(uint16x8_t) { 0, 0, cs_flags, rte_cpu_to_be_16(len),

Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-21 00:55:39 +02:00
Shahaf Shuler
607fc8e4a9 net/mlx5: fix default RSS level
Using inner RSS by default for GRE leads to memory corruption as the
extra flow items added for the inner RSS are not counted in the flow
attributes buffer size.

Fixing by enforcing the default RSS level to be outer. This much
simplify the flow engine and more robust.
Future optimization for out of the box RSS can be done on subsequent
commits.

Fixes: d4a405186b ("net/mlx5: support tunnel RSS level")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-17 19:06:29 +02:00
Shahaf Shuler
34511c25d5 net/mlx5: fix build without tunnel RSS support
IBV_RX_HASH_INNER should be referenced only when having tunnel support
in the Verbs headers.

Fixes: 80f2d0ed7f ("net/mlx5: add hardware flow debug dump")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-17 12:31:42 +02:00
Matan Azrad
1f106da2bf net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
Add support for MPLS over GRE and MPLS over UDP tunnel types as
described in the next RFCs:
1. https://tools.ietf.org/html/rfc4023
2. https://tools.ietf.org/html/rfc7510
3. https://tools.ietf.org/html/rfc4385

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-17 12:31:42 +02:00
Shahaf Shuler
dd3331c6f1 net/mlx5: add Bluefield device id
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-17 12:31:42 +02:00
Shahaf Shuler
8fe576ad17 net/mlx5: fix flow director drop rule deletion crash
Drop flow rules are created on the ETH queue even though the parser layer
matches the flow rule layer (L3/L4)

Fixes: 6f2f4948b2 ("net/mlx5: fix flow director rule deletion crash")
Cc: stable@dpdk.org

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-05-17 12:31:42 +02:00
Andy Green
f11a4a7d8a net/mlx5: fix uninitialized variable in probing
Fixes: ccdcba53a3 ("net/mlx5: use Netlink to add/remove MAC addresses")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-15 22:29:22 +02:00
Yongseok Koh
c9ec2192ff net/mlx5: use correct field in a union structure
This is not a bug but it is better to use semantically correct field.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:23 +01:00
Yongseok Koh
0cfdc1808d net/mlx5: use coherent I/O memory barrier
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:22 +01:00
Yongseok Koh
5f44cfd011 net/mlx5: fix inlining segmented TSO packet
When a multi-segmented packet is inlined, data can be further inlined even
after the first segment. In case of TSO packet, extra inline data after TSO
header should be carried by an inline DSEG which has 4B inline header
recording the length of the inline data. If more than one segment is
inlined, the length doesn't count from the second segment. This will cause
a fault in HW and CQE will have an error, which is ignored by PMD.

Fixes: f895536be4 ("net/mlx5: enable inlining data from multiple segments")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:32:22 +01:00
Thomas Monjalon
fbe90cdd77 ethdev: add probing finish function
A new hook function is added and called inside the PMDs at the end
of the device probing:
	- in primary process, after allocating, init and config
	- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Yongseok Koh
7d6bf6b866 net/mlx5: add Multi-Packet Rx support
Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe
bandwidth by posting a single large buffer for multiple packets. Instead of
posting a buffer per a packet, one large buffer is posted in order to
receive multiple packets on the buffer. A MPRQ buffer consists of multiple
fixed-size strides and each stride receives one packet.

Rx packet is mem-copied to a user-provided mbuf if the size of Rx packet is
comparatively small, or PMD attaches the Rx packet to the mbuf by external
buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external
buffers will be allocated and managed by PMD.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
18bee13096 net/mlx5: add a function to rdma-core glue
mlx5dv_create_wq() is added for the Multi-Packet RQ (a.k.a Striding RQ).

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
3e1f82a1f1 net/mlx5: separate filling Rx flags
Filling in fields of mbuf becomes a separate inline function so that this
can be reused.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
974f1e7ef1 net/mlx5: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx5_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx5_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
d561b5dc13 net/mlx5: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Wei Dai
a4996bd89c ethdev: new Rx/Tx offloads API
This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.

This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().

In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
df428ceef4 net/mlx5: change device reference for secondary process
rte_eth_devices[] is not shared between primary and secondary process, but
a static array to each process. The reverse pointer of device (priv->dev)
is invalid. Instead, priv has the pointer to shared data of the device,
  struct rte_eth_dev_data *dev_data;

Two macros are added,
  #define PORT_ID(priv) ((priv)->dev_data->port_id)
  #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
95d7e115be net/mlx5: fix calculation of Tx TSO inline room size
rdma-core doesn't add up max_tso_header size to max_inline_data size. The
library takes bigger value between the two.

Fixes: 43e9d9794c ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:50 +01:00
Raslan Darawsheh
690de2850b net/mlx5: fix resource leak in case of error
If something went wrong in mlx5_pci_prob the allocated eth dev
will cause a memory leak.

This commit release the eth dev that was previously allocated.

Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:50 +01:00
Raslan Darawsheh
e9f4166014 net/mlx5: fix double free on error handling
When attr_ctx is NULL it will attempt to free the list of devices twice.
Avoid double freeing the list by directly going to error handling.

Fixes: 771fa900b7 ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:49 +01:00
Xueming Li
32d4246c90 net/mlx5: fix SW parser enabling
Fixes: 5f8ba81c42 ("net/mlx5: support generic tunnel offloading")

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:49 +01:00
Xueming Li
5afda2c6ac net/mlx5: fix SW parsing feature detection
Fixes: 5f8ba81c42 ("net/mlx5: support generic tunnel offloading")

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:49 +01:00
Nélio Laranjeiro
b7a7c97a40 net/mlx5: fix flow validation
Item spec and last are wrongly compared to the NIC capability causing a
validation failure when the mask is null.
This validation function should only verify the user is not configuring
unsupported matching fields.

Fixes: 2097d0d1e2 ("net/mlx5: support basic flow items and actions")
Cc: stable@dpdk.org

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-14 22:31:48 +01:00
Shahaf Shuler
012ad9944d net/mlx5: fix probe return value polarity
mlx5 prefixed function returns a negative errno value.
the error handler on mlx5_pci_probe is doing the same.

Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-14 22:31:48 +01:00
Shahaf Shuler
eac9cd58de net/mlx5: fix socket connection return value
Upon success, mlx5_socket_connect should return the fd descriptor of the
primary process

Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-14 22:31:48 +01:00
Shahaf Shuler
d11d651f6d net/mlx5: add Rx and Tx tuning parameters
A new ethdev API was exposed by
commit 3be82f5cc5 ("ethdev: support PMD-tuned Tx/Rx parameters")

Enabling the PMD to provide default parameters in case no strict request
from application in order to improve the out of the box experience.

While the current API lacks the means for the PMD to provide the best
possible value, providing the best default the PMD can guess.
The values are based on Mellanox performance report and depends on the
underlying NIC capabilities.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-14 22:31:48 +01:00
Shahaf Shuler
7d2e32f76c net/mlx5: fix ethtool link setting call order
According to ethtool_link_setting API recommendation ETHTOOL_GLINKSETTINGS
should be called before ETHTOOL_GSET as the later one deprecated.

Fixes: f47ba80080 ("net/mlx5: remove kernel version check")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-14 22:31:48 +01:00
Adrien Mazarguil
6f2f4948b2 net/mlx5: fix flow director rule deletion crash
Flow director rules matching traffic properties above layer 2 do not
target a fixed hash Rx queue (HASH_RXQ_ETH), it actually depends on the
highest protocol layer specified by each flow rule.

mlx5_fdir_filter_delete() makes this wrong assumption and causes a crash
when attempting to destroy flow rules with L3/L4 specifications.

Fixes: 4c3e9bcdd5 ("net/mlx5: support flow director")
Cc: stable@dpdk.org

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-02 19:28:48 +02:00
Declan Doherty
fb8fd96d42 ethdev: add shared counter to flow API
Add rte_flow_action_count action data structure to enable shared
counters across multiple flows on a single port or across multiple
flows on multiple ports within the same switch domain. Also this enables
multiple count actions to be specified in a single flow action.

This patch also modifies the existing rte_flow_query API to take the
rte_flow_action structure as an input parameter instead of the
rte_flow_action_type enumeration to allow querying a specific action
from a flow rule when multiple actions of the same type are specified.

This patch also contains updates for the bonding, failsafe and mlx5 PMDs
and testpmd application which are affected by this API change.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2018-04-27 18:00:57 +01:00
Xueming Li
bd315baecf net/mlx5: allow flow tunnel ID 0 with outer pattern
Tunnel w/o tunnel id pattern could match any non-tunneled packet,
this patch allowed tunnel w/o tunnel id pattern after proper outer spec.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
05dda761bd net/mlx5: introduce VXLAN-GPE tunnel type
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
80f2d0ed7f net/mlx5: add hardware flow debug dump
Dump verb flow detail including flow spec type and size for debugging
purpose.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
d4a405186b net/mlx5: support tunnel RSS level
Tunnel RSS level of flow RSS action offers user a choice to do RSS hash
calculation on inner or outer RSS fields. Testpmd flow command examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 0 / end

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
8486125353 net/mlx5: split flow RSS handling logic
This patch split out flow RSS hash field handling logic to dedicate
function.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
6ba07449ed net/mlx5: cleanup tunnel checksum offloads
Once tunnel packet type(RTE_PTYPE_TUNNEL_xxx) identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent checksum result of
inner headers, outer L3 and L4 header checksum are always valid as soon
as tunnel identified. If no tunnel identified, PKT_RX_IP_CKSUM_XXX and
PKT_RX_L4_CKSUM_XXX represent checksum result of outer L3 and L4
headers.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
3cc08bc6dd net/mlx5: support Rx tunnel type identification
This patch introduced tunnel type identification based on flow rules.
If flows of multiple tunnel types built on same queue, no tunnel type
will be returned. User application could use bits in flow mark as tunnel
type identifier.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
78a54648ff net/mlx5: support L3 VXLAN flow
This patch support L3 VXLAN, no inner L2 header comparing to standard
VXLAN protocol. L3 VXLAN using specific overlay UDP destination port to
discriminate against standard VXLAN, device parameter and FW has to be
configured to support it:
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
  sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
96c6c65a10 net/mlx5: support GRE tunnel flow
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Xueming Li
b43802b4bd net/mlx5: support 16 hardware priorities
This patch supports new 16 Verbs flow priorities by trying to create a
simple flow of priority 15. If 16 priorities not available, fallback to
traditional 8 priorities.

Verb priority mapping:
			8 priorities	>=16 priorities
Control flow:		4-7		8-15
User normal flow:	1-3		4-7
User tunnel flow:	0-2		0-3

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:56 +01:00
Adrien Mazarguil
76e9a55b5b ethdev: add transfer attribute to flow API
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
e58638c324 ethdev: fix TPID handling in flow API
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.

Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:

 Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
 rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]

Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:

 Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
 rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
 rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]

While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.

This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:

- bnxt: EtherType matching is supported with and without VLAN, but TPID
  matching is not and triggers an error.

- e1000: EtherType matching is only supported with the ETHERTYPE filter,
  which does not support VLAN matching, therefore no impact.

- enic: same as bnxt.

- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
  values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
  EtherType matching.

- ixgbe: same as e1000, with additional minor change to rely on the new
  E-Tag macro definition.

- mlx4: EtherType/TPID matching is not supported, no impact.

- mlx5: same as bnxt.

- mvpp2: same as bnxt.

- sfc: same as bnxt.

- tap: same as bnxt.

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")
Fixes: 99e7003831 ("net/ixgbe: parse L2 tunnel filter")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
18aee2861a ethdev: add encap level to RSS flow API action
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.

When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.

This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.

Contrary to the change announced by commit 676b605182 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
929e331934 ethdev: add hash function to RSS flow API action
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.

Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
ac8d22de23 ethdev: flatten RSS configuration in flow API
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.

A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.

This patch updates all PMDs and example applications accordingly.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

[1] commit 676b605182 ("doc: announce ethdev API change for RSS
    configuration")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
19b3bc47c6 ethdev: fix C99 flexible arrays from flow API
This patch replaces C99-style flexible arrays in struct rte_flow_action_rss
and struct rte_flow_item_raw with standard pointers to the same data.

They proved difficult to use in the field (e.g. no possibility of static
initialization) and unsuitable for C++ applications.

Affected PMDs and examples are updated accordingly.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
cc17feb904 ethdev: alter behavior of flow API actions
This patch makes the following changes to flow rule actions:

- List order now matters, they are redefined as performed first to last
  instead of "all simultaneously".

- Repeated actions are now supported (e.g. specifying QUEUE multiple times
  now duplicates traffic among them). Previously only the last action of
  any given kind was taken into account.

- No more distinction between terminating/non-terminating/meta actions.
  Flow rules themselves are now defined as always terminating unless a
  PASSTHRU action is specified.

These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matter when combined.

Previously one would have to do so through multiple flow rules by combining
PASSTRHU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.

This breaks ABI compatibility for the following public functions:

- rte_flow_create()
- rte_flow_validate()

PMDs with rte_flow support are modified accordingly:

- bnxt: no change, implementation already forbids multiple actions and does
  not support PASSTHRU.

- e1000: no change, same as bnxt.

- enic: modified to forbid redundant actions, no support for default drop.

- failsafe: no change needed.

- i40e: no change, implementation already forbids multiple actions.

- ixgbe: same as i40e.

- mlx4: modified to forbid multiple fate-deciding actions and drop when
  unspecified.

- mlx5: same as mlx4, with other redundant actions also forbidden.

- sfc: same as mlx4.

- tap: implementation already complies with the new behavior except for
  the default pass-through modified as a default drop.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Xueming Li
3d140329ca net/mlx5: allow max 192B TSO inline header length
Change max inline header length to 192B to allow IPv6 VXLAN TSO headers
and header with options that more than 128B.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-04-27 17:34:43 +01:00
Xueming Li
5f8ba81c42 net/mlx5: support generic tunnel offloading
This commit adds support for generic tunnel TSO and checksum offload.
PMD will compute the inner/outer headers offset according to the
mbuf fields. Hardware will do calculation based on offsets and types.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-04-27 17:34:43 +01:00