Commit Graph

163 Commits

Author SHA1 Message Date
Raslan Darawsheh
c8d5690060 net/tap: fix multi-process request
The structure was not initialized.

Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
2019-03-01 18:17:35 +01:00
Stephen Hemminger
04df418f0f net/tap: do not print pointer in info message
Printing pointer in log is uninformative (unless in a debugger),
instead print the assigned kernel device name which correlates
well with what TAP is doing.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
c8ae56e62d net/tap: get rid of global name variable
Having a global variable which is set to "TUN" or "TAP" during
probe is a potential bug if probing is ever done in different
processes or contexts. Let's fix it now by using existing enum
that has type of connection.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
b5235d61f3 net/tap: let kernel choose tun device name
Assigning tun and tap index in DPDK tap device driver is racy
and fails if used with primary/secondary. Instead use the kernel
feature of device wildcarding where if a name with %d is used
the kernel will fill in the next available device.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Reported-by: Haifeng Li <hfli@netitest.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
16f1c8abb5 net/tap: lower the priority of log messages
Any messages that normally occur during probe should be at DEBUG
level (not NOTICE). This reduces overall log clutter.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
7d0c832709 net/tap: check interface name in kvargs
If interface name is passed to remote or iface then check
the length and for invalid characters. This avoids problems where
name gets truncated or rejected by kernel.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
12ad0b6572 net/tap: allow full length names
The code for set_interface_name was incorrectly assuming that
space for null byte was necessary with snprintf/strlcpy.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
1d7490bc7c net/tap: use strlcpy for interface name
snprintf is not needed here, use strlcpy instead.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Ferruh Yigit
04d3f6bc97 net/tap: fix possible uninitialized variable access
Fixes: 7c25284e30 ("net/tap: add netlink back-end for flow API")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-12-21 16:22:41 +01:00
Bruce Richardson
1168a4fd19 net/tap: add buffer overflow checks before checksum
The checksum calculation APIs take only the packet headers pointers as
parameters, so they assume that the lengths reported in those headers
are correct. However, a malicious packet could claim to be far larger
than it is, so we need to check the header lengths in the driver before
calling the checksum API.

A better fix would be to allow the lengths to be passed into the API
function, but that would be an API break, so fixing in TAP driver for
now.

Fixes: 8ae3023387 ("net/tap: add Rx/Tx checksum offload support")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-12-21 16:22:41 +01:00
Vipin Varghese
126372ce72 net/tap: fix probe for multiq or flowq failure
In scenarios for multiq or flowq setup failure
`rte_eth_dev_probing_finish()` has to be invoked for successful device
registration.

Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 00:35:53 +01:00
Stephen Hemminger
e0a10f4691 net/tap: fix file descriptor check
Static analysis tools don't like the fact that fd could be zero
in the error path. This won't happen in real world because
stdin would have to be closed, then other error occurring.

Coverity issue: 14079
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 02:14:12 +01:00
Stephen Hemminger
cc02c97718 net/tap: fix file descriptor leak on error
If netlink socket setup fails the file descriptor was leaked.

Coverity issue: 257040
Fixes: 7c25284e30 ("net/tap: add netlink back-end for flow API")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 02:14:09 +01:00
Ferruh Yigit
b74fd6b842 add missing static keyword to globals
Some global variables can indeed be static, add static keyword to them.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2018-10-29 02:01:08 +01:00
Agalya Babu RadhaKrishnan
b077118a50 net/tap: disable in FreeBSD build with meson
Disabled tap build in FreeBSD because it is not supported
Added changes to enable tap build if it is Linux OS and
disable in FreeBSD.

Fixes: 095cae3668 ("net/tap: add in meson build")

Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-10-27 18:03:30 +02:00
Thomas Monjalon
662dbc322d ethdev: remove release function for secondary process
After previous changes, the function rte_eth_dev_release_port()
can be used for primary or secondary process as well.
The only difference with rte_eth_dev_release_port_secondary()
is the shared lock used in rte_eth_dev_release_port().

The function rte_eth_dev_release_port_secondary() was recently
added in 18.11 cycle.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:05 +02:00
Thomas Monjalon
e16adf08e5 ethdev: free all common data when releasing port
This is a clean-up of common ethdev data freeing.
All data freeing are moved to rte_eth_dev_release_port()
and done only in case of primary process.

It is probably fixing some memory leaks for PMDs which were
not freeing all data.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
c9aa56edec net/tap: access primary process queues from secondary
In the case the device is created by the primary process,
the secondary must request some file descriptors to attach the queues.
The file descriptors are shared via IPC Unix socket.

Thanks to the IPC synchronization, the secondary process
is now able to do Rx/Tx on a TAP created by the primary process.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
ed8132e7c9 net/tap: move fds of queues to be in process private
fd's cannot be shared between processes, and each process need to have
it's own fd's pointer.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
48b9dcd62a net/tap: add queue and port ids in queue structures
Port and queue ids are added to easily map the file
descriptors stored in each process private.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
9396ad3346 net/tap: fix reported number of Tx packets
When writev fails to send packets it doesn't update the
number of Tx packets, but it still num_tx is updated.

The value that should be returned is the actual number
of sent packets which is num_packets.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
CC: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-10-18 10:24:39 +02:00
Qi Zhang
4852aa8f6e drivers/net: enable hotplug on secondary process
Attach port from secondary should ignore devargs since the private
device is not necessary to support. Also previously, detach port on
a secondary process will mess primary process and cause the same
device can't be attached back again. A secondary process should use
rte_eth_dev_release_port_secondary to release a port.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-10-17 10:16:18 +02:00
Luca Boccassi
095cae3668 net/tap: add in meson build
Use same autoconf generation mechanism as the MLX4/5 PMDs

Signed-off-by: Luca Boccassi <bluca@debian.org>
2018-09-18 22:48:49 +02:00
Ferruh Yigit
323e7b667f ethdev: make default behavior CRC strip on Rx
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.

PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.

Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2018-09-14 20:08:41 +02:00
Matan Azrad
4278f8df47 net/tap: fix zeroed flow mask configurations
The rte_flow meaning of zero flow mask configuration is to match all
the range of the item value.
For example, the flow eth / ipv4 dst spec 1.2.3.4 dst mask 0.0.0.0
should much all the ipv4 traffic from the rte_flow API perspective.

>From some kernel perspectives the above rule means to ignore all the
ipv4 traffic (e.g. Ubuntu 16.04, 4.15.10).

Due to the fact that the tap PMD should provide the rte_flow meaning,
it is necessary to ignore the spec in case the mask is zero when it
forwards such like flows to the kernel.
So, the above rule should be translated to eth / ipv4 to get the
correct meaning.

Ignore spec configurations when the mask is zero.

Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-08-07 22:48:53 +02:00
Vipin Varghese
5a401fb85e net/tap: call probe finish for tun secondary
Invoke rte_eth_dev_probing_finish for rte_pmd_tun_probe.

Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-08-05 01:44:24 +02:00
Ferruh Yigit
d1c3ab220a drivers/net: fix crash in secondary process
Calling rte_eth_dev_info_get() on secondary process cause a crash
because eth_dev->device is not set properly.

Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-07-26 15:00:34 +02:00
Gage Eads
968eb52c49 net/tap: set queue started and stopped
Set the rx and tx queue state appropriately when the queues or device
are started or stopped.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-23 23:55:26 +02:00
Thomas Monjalon
f8e9989606 remove useless constructor headers
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, no need to declare before its definition.
The macro is used directly in the function definition.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2018-07-12 00:00:35 +02:00
Ophir Munk
050316a883 net/tap: support TSO (TCP Segment Offload)
This commit implements TCP segmentation offload in TAP.
librte_gso library is used to segment large TCP payloads (e.g. packets
of 64K bytes size) into smaller MTU size buffers.
By supporting TSO offload capability in software a TAP device can be used
as a failsafe sub device and be paired with another PCI device which
supports TSO capability in HW.

For more details on librte_gso implementation please refer to dpdk
documentation.
The number of newly generated TCP TSO segments is limited to 64.

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-03 01:35:58 +02:00
Ophir Munk
6546e76056 net/tap: calculate checksums of multi segs packets
Prior to this commit IP/UDP/TCP checksum offload calculations
were skipped in case of a multi segments packet.
This commit enables TAP checksum calculations for multi segments
packets.
The only restriction is that the first segment must contain
headers of layers 3 (IP) and 4 (UDP or TCP)

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-03 01:35:58 +02:00
Vipin Varghese
403232b817 net/tap: update tap index to unsigned
Updating the logic to reflect unsigned integer as index for TAP PMD.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-06-14 19:27:50 +02:00
Ophir Munk
42ec78eaeb net/tap: fix keep-alive queue not detached
The TAP keep-alive queue was created in order to keep the TAP device
in Linux even in case all of its Rx/Tx queues are released (in Linux
terminology: even in case all of the TAP device file descriptors are
closed), however, the keep-alive queue itself is attached to the TAP
device like all other Rx/Tx queues and therefore the kernel will
enqueue to it some Rx packets based on the kernel RSS distribution
rules. Those packets are unknown to the application and will remain
lost in the keep-alive queue.
All queues are attached by default to the TAP device after they are
created though TUNSETIFF ioctl call.
The fix is to detach the keep-alive queue after its creation through
TUNSETQUEUE ioctl call.

Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-25 17:07:40 +02:00
Vipin Varghese
d8dc42fab6 net/tap: fix vdev data sharing for tun
Enables TUN PMD sharing by attaching the port from the shared data.

Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Vipin Varghese
d1883bf1de net/tap: fix protocol field
The TX function is shared between TAP and TUN PMD. Checking TUN-TAP
type field will ensure the TAP PMD will always have protocol field
as 0.

Fixes: 204d026a39 ("net/tap: support tun")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
3101191c63 net/tap: fix device removal when no queue exist
TAP device is created following its first queue creation. Multiple
queues can be added or removed over time. In Linux terminology those
are file descriptors which are opened or closed over time. As long as
the number of opened file descriptors is positive - TAP device will
appear as a Linux device. In case all queues are released (the
equivalent of all file descriptors being closed) the TAP device will
be removed. This can lead to abnormalities in different scenarios
where the TAP device should exist even if all its queues are released.
In order to make TAP existence independent of its number of queues -
an extra file descriptor is opened on TAP creation and is closed on
TAP closure. Its only purpose is to serve as a keep-alive mechanism
for the TAP device.

Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
2f5045c51c net/tap: support RSS hash update
Add RSS hash update callback to eth_dev_ops.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 19:07:08 +02:00
Ophir Munk
2ef1c0da89 net/tap: fix isolation mode toggling
Running testpmd command "flow isolae <port> 0" (i.e. disabling flow
isolation) followed by command "flow isolate <port> 1" (i.e. enabling
flow isolation) may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: when disabling flow isolation we keep the local
rule to redirect packets on TX (TAP_REMOTE_TX index) while we add it
again when enabling flow isolation. As a result this rule is added
two times in a row which results in "File exists" error.
The fix is to identify the "File exists" error and silently ignore it.

Another issue occurs when enabling isolation mode several times in a
row in which case the same tc rules are added consecutively and
rte_flow structs are added to a linked list before removing the
previous rte_flow structs.
The fix is to act upon isolation mode command only when there is a
change from "0" to "1" (or vice versa).

Fixes: f503d26948 ("net/tap: support flow API isolated mode")
Cc: stable@dpdk.org

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 16:01:05 +02:00
Vipin Varghese
f5fd98c802 net/tap: add default name to tun
The change adds default name to reflect TUN PMD instance. if option
name is not passed, the default dtun is taken.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:32:23 +01:00
Thomas Monjalon
fbe90cdd77 ethdev: add probing finish function
A new hook function is added and called inside the PMDs at the end
of the device probing:
	- in primary process, after allocating, init and config
	- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Wei Dai
a4996bd89c ethdev: new Rx/Tx offloads API
This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.

This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().

In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:51 +01:00
Ophir Munk
8d54ede738 net/tap: report on supported RSS hash functions
Report on TAP supported RSS functions as part of dev_infos_get
callback: ETH_RSS_IP, ETH_RSS_UDP and ETH_RSS_TCP.
Known limitation: TAP supports all of the above hash functions together
and not in partial combinations.
Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations will be
verified.

[1] commit 8863a1fbfc ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:31:51 +01:00
Stephen Hemminger
1b3b7caeb1 net/tap: convert to dynamic logging
Use new logging macro to convert all calls to RTE_LOG() into
new dynamic log type.

Also fix whitespace.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 18:00:59 +01:00
Ophir Munk
ead63dd318 net/tap: return empty port offload capabilities
Fix internal report on port specific offload capabilities to be 0 (no
capabilities). Before this commit port capabilities were a clone of queue
capabilities, however the current TAP offload capabilities (e.g.
checksum calculation) are per queue and are not specific per port.
This commit fixes an internal validation check for new configured
queue offloads.
The port capability API keeps reporting all queue capabilities as port
capabilities.

Fixes: 95ae196ae1 ("net/tap: use new Rx offloads API")
Fixes: 818fe14a98 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-04-27 18:00:57 +01:00
Adrien Mazarguil
76e9a55b5b ethdev: add transfer attribute to flow API
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
e58638c324 ethdev: fix TPID handling in flow API
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.

Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:

 Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
 rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]

Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:

 Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
 rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
 rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]

While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.

This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:

- bnxt: EtherType matching is supported with and without VLAN, but TPID
  matching is not and triggers an error.

- e1000: EtherType matching is only supported with the ETHERTYPE filter,
  which does not support VLAN matching, therefore no impact.

- enic: same as bnxt.

- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
  values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
  EtherType matching.

- ixgbe: same as e1000, with additional minor change to rely on the new
  E-Tag macro definition.

- mlx4: EtherType/TPID matching is not supported, no impact.

- mlx5: same as bnxt.

- mvpp2: same as bnxt.

- sfc: same as bnxt.

- tap: same as bnxt.

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")
Fixes: 99e7003831 ("net/ixgbe: parse L2 tunnel filter")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
18aee2861a ethdev: add encap level to RSS flow API action
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.

When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.

This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.

Contrary to the change announced by commit 676b605182 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
929e331934 ethdev: add hash function to RSS flow API action
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.

Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
ac8d22de23 ethdev: flatten RSS configuration in flow API
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.

A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.

This patch updates all PMDs and example applications accordingly.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

[1] commit 676b605182 ("doc: announce ethdev API change for RSS
    configuration")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
cc17feb904 ethdev: alter behavior of flow API actions
This patch makes the following changes to flow rule actions:

- List order now matters, they are redefined as performed first to last
  instead of "all simultaneously".

- Repeated actions are now supported (e.g. specifying QUEUE multiple times
  now duplicates traffic among them). Previously only the last action of
  any given kind was taken into account.

- No more distinction between terminating/non-terminating/meta actions.
  Flow rules themselves are now defined as always terminating unless a
  PASSTHRU action is specified.

These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matter when combined.

Previously one would have to do so through multiple flow rules by combining
PASSTRHU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.

This breaks ABI compatibility for the following public functions:

- rte_flow_create()
- rte_flow_validate()

PMDs with rte_flow support are modified accordingly:

- bnxt: no change, implementation already forbids multiple actions and does
  not support PASSTHRU.

- e1000: no change, same as bnxt.

- enic: modified to forbid redundant actions, no support for default drop.

- failsafe: no change needed.

- i40e: no change, implementation already forbids multiple actions.

- ixgbe: same as i40e.

- mlx4: modified to forbid multiple fate-deciding actions and drop when
  unspecified.

- mlx5: same as mlx4, with other redundant actions also forbidden.

- sfc: same as mlx4.

- tap: implementation already complies with the new behavior except for
  the default pass-through modified as a default drop.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00