Commit Graph

207 Commits

Author SHA1 Message Date
Stephen Hemminger
12ad0b6572 net/tap: allow full length names
The code for set_interface_name was incorrectly assuming that
space for null byte was necessary with snprintf/strlcpy.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Stephen Hemminger
1d7490bc7c net/tap: use strlcpy for interface name
snprintf is not needed here, use strlcpy instead.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
2019-01-14 17:44:29 +01:00
Ferruh Yigit
04d3f6bc97 net/tap: fix possible uninitialized variable access
Fixes: 7c25284e30 ("net/tap: add netlink back-end for flow API")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-12-21 16:22:41 +01:00
Bruce Richardson
1168a4fd19 net/tap: add buffer overflow checks before checksum
The checksum calculation APIs take only the packet headers pointers as
parameters, so they assume that the lengths reported in those headers
are correct. However, a malicious packet could claim to be far larger
than it is, so we need to check the header lengths in the driver before
calling the checksum API.

A better fix would be to allow the lengths to be passed into the API
function, but that would be an API break, so fixing in TAP driver for
now.

Fixes: 8ae3023387 ("net/tap: add Rx/Tx checksum offload support")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-12-21 16:22:41 +01:00
Vipin Varghese
126372ce72 net/tap: fix probe for multiq or flowq failure
In scenarios for multiq or flowq setup failure
`rte_eth_dev_probing_finish()` has to be invoked for successful device
registration.

Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 00:35:53 +01:00
Stephen Hemminger
e0a10f4691 net/tap: fix file descriptor check
Static analysis tools don't like the fact that fd could be zero
in the error path. This won't happen in real world because
stdin would have to be closed, then other error occurring.

Coverity issue: 14079
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 02:14:12 +01:00
Stephen Hemminger
cc02c97718 net/tap: fix file descriptor leak on error
If netlink socket setup fails the file descriptor was leaked.

Coverity issue: 257040
Fixes: 7c25284e30 ("net/tap: add netlink back-end for flow API")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-11-14 02:14:09 +01:00
Ferruh Yigit
b74fd6b842 add missing static keyword to globals
Some global variables can indeed be static, add static keyword to them.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2018-10-29 02:01:08 +01:00
Agalya Babu RadhaKrishnan
b077118a50 net/tap: disable in FreeBSD build with meson
Disabled tap build in FreeBSD because it is not supported
Added changes to enable tap build if it is Linux OS and
disable in FreeBSD.

Fixes: 095cae3668 ("net/tap: add in meson build")

Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-10-27 18:03:30 +02:00
Thomas Monjalon
662dbc322d ethdev: remove release function for secondary process
After previous changes, the function rte_eth_dev_release_port()
can be used for primary or secondary process as well.
The only difference with rte_eth_dev_release_port_secondary()
is the shared lock used in rte_eth_dev_release_port().

The function rte_eth_dev_release_port_secondary() was recently
added in 18.11 cycle.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:05 +02:00
Thomas Monjalon
e16adf08e5 ethdev: free all common data when releasing port
This is a clean-up of common ethdev data freeing.
All data freeing are moved to rte_eth_dev_release_port()
and done only in case of primary process.

It is probably fixing some memory leaks for PMDs which were
not freeing all data.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
c9aa56edec net/tap: access primary process queues from secondary
In the case the device is created by the primary process,
the secondary must request some file descriptors to attach the queues.
The file descriptors are shared via IPC Unix socket.

Thanks to the IPC synchronization, the secondary process
is now able to do Rx/Tx on a TAP created by the primary process.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
ed8132e7c9 net/tap: move fds of queues to be in process private
fd's cannot be shared between processes, and each process need to have
it's own fd's pointer.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
48b9dcd62a net/tap: add queue and port ids in queue structures
Port and queue ids are added to easily map the file
descriptors stored in each process private.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-26 22:14:05 +02:00
Raslan Darawsheh
9396ad3346 net/tap: fix reported number of Tx packets
When writev fails to send packets it doesn't update the
number of Tx packets, but it still num_tx is updated.

The value that should be returned is the actual number
of sent packets which is num_packets.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
CC: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-10-18 10:24:39 +02:00
Qi Zhang
4852aa8f6e drivers/net: enable hotplug on secondary process
Attach port from secondary should ignore devargs since the private
device is not necessary to support. Also previously, detach port on
a secondary process will mess primary process and cause the same
device can't be attached back again. A secondary process should use
rte_eth_dev_release_port_secondary to release a port.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-10-17 10:16:18 +02:00
Luca Boccassi
095cae3668 net/tap: add in meson build
Use same autoconf generation mechanism as the MLX4/5 PMDs

Signed-off-by: Luca Boccassi <bluca@debian.org>
2018-09-18 22:48:49 +02:00
Ferruh Yigit
323e7b667f ethdev: make default behavior CRC strip on Rx
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.

PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.

Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2018-09-14 20:08:41 +02:00
Matan Azrad
4278f8df47 net/tap: fix zeroed flow mask configurations
The rte_flow meaning of zero flow mask configuration is to match all
the range of the item value.
For example, the flow eth / ipv4 dst spec 1.2.3.4 dst mask 0.0.0.0
should much all the ipv4 traffic from the rte_flow API perspective.

>From some kernel perspectives the above rule means to ignore all the
ipv4 traffic (e.g. Ubuntu 16.04, 4.15.10).

Due to the fact that the tap PMD should provide the rte_flow meaning,
it is necessary to ignore the spec in case the mask is zero when it
forwards such like flows to the kernel.
So, the above rule should be translated to eth / ipv4 to get the
correct meaning.

Ignore spec configurations when the mask is zero.

Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-08-07 22:48:53 +02:00
Vipin Varghese
5a401fb85e net/tap: call probe finish for tun secondary
Invoke rte_eth_dev_probing_finish for rte_pmd_tun_probe.

Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-08-05 01:44:24 +02:00
Ferruh Yigit
d1c3ab220a drivers/net: fix crash in secondary process
Calling rte_eth_dev_info_get() on secondary process cause a crash
because eth_dev->device is not set properly.

Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-07-26 15:00:34 +02:00
Gage Eads
968eb52c49 net/tap: set queue started and stopped
Set the rx and tx queue state appropriately when the queues or device
are started or stopped.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-23 23:55:26 +02:00
Thomas Monjalon
f8e9989606 remove useless constructor headers
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, no need to declare before its definition.
The macro is used directly in the function definition.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2018-07-12 00:00:35 +02:00
Ophir Munk
050316a883 net/tap: support TSO (TCP Segment Offload)
This commit implements TCP segmentation offload in TAP.
librte_gso library is used to segment large TCP payloads (e.g. packets
of 64K bytes size) into smaller MTU size buffers.
By supporting TSO offload capability in software a TAP device can be used
as a failsafe sub device and be paired with another PCI device which
supports TSO capability in HW.

For more details on librte_gso implementation please refer to dpdk
documentation.
The number of newly generated TCP TSO segments is limited to 64.

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-03 01:35:58 +02:00
Ophir Munk
6546e76056 net/tap: calculate checksums of multi segs packets
Prior to this commit IP/UDP/TCP checksum offload calculations
were skipped in case of a multi segments packet.
This commit enables TAP checksum calculations for multi segments
packets.
The only restriction is that the first segment must contain
headers of layers 3 (IP) and 4 (UDP or TCP)

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-07-03 01:35:58 +02:00
Vipin Varghese
403232b817 net/tap: update tap index to unsigned
Updating the logic to reflect unsigned integer as index for TAP PMD.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-06-14 19:27:50 +02:00
Ophir Munk
42ec78eaeb net/tap: fix keep-alive queue not detached
The TAP keep-alive queue was created in order to keep the TAP device
in Linux even in case all of its Rx/Tx queues are released (in Linux
terminology: even in case all of the TAP device file descriptors are
closed), however, the keep-alive queue itself is attached to the TAP
device like all other Rx/Tx queues and therefore the kernel will
enqueue to it some Rx packets based on the kernel RSS distribution
rules. Those packets are unknown to the application and will remain
lost in the keep-alive queue.
All queues are attached by default to the TAP device after they are
created though TUNSETIFF ioctl call.
The fix is to detach the keep-alive queue after its creation through
TUNSETQUEUE ioctl call.

Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-25 17:07:40 +02:00
Vipin Varghese
d8dc42fab6 net/tap: fix vdev data sharing for tun
Enables TUN PMD sharing by attaching the port from the shared data.

Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Vipin Varghese
d1883bf1de net/tap: fix protocol field
The TX function is shared between TAP and TUN PMD. Checking TUN-TAP
type field will ensure the TAP PMD will always have protocol field
as 0.

Fixes: 204d026a39 ("net/tap: support tun")

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
3101191c63 net/tap: fix device removal when no queue exist
TAP device is created following its first queue creation. Multiple
queues can be added or removed over time. In Linux terminology those
are file descriptors which are opened or closed over time. As long as
the number of opened file descriptors is positive - TAP device will
appear as a Linux device. In case all queues are released (the
equivalent of all file descriptors being closed) the TAP device will
be removed. This can lead to abnormalities in different scenarios
where the TAP device should exist even if all its queues are released.
In order to make TAP existence independent of its number of queues -
an extra file descriptor is opened on TAP creation and is closed on
TAP closure. Its only purpose is to serve as a keep-alive mechanism
for the TAP device.

Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-23 00:35:01 +02:00
Ophir Munk
2f5045c51c net/tap: support RSS hash update
Add RSS hash update callback to eth_dev_ops.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 19:07:08 +02:00
Ophir Munk
2ef1c0da89 net/tap: fix isolation mode toggling
Running testpmd command "flow isolae <port> 0" (i.e. disabling flow
isolation) followed by command "flow isolate <port> 1" (i.e. enabling
flow isolation) may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: when disabling flow isolation we keep the local
rule to redirect packets on TX (TAP_REMOTE_TX index) while we add it
again when enabling flow isolation. As a result this rule is added
two times in a row which results in "File exists" error.
The fix is to identify the "File exists" error and silently ignore it.

Another issue occurs when enabling isolation mode several times in a
row in which case the same tc rules are added consecutively and
rte_flow structs are added to a linked list before removing the
previous rte_flow structs.
The fix is to act upon isolation mode command only when there is a
change from "0" to "1" (or vice versa).

Fixes: f503d26948 ("net/tap: support flow API isolated mode")
Cc: stable@dpdk.org

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2018-05-17 16:01:05 +02:00
Vipin Varghese
f5fd98c802 net/tap: add default name to tun
The change adds default name to reflect TUN PMD instance. if option
name is not passed, the default dtun is taken.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:32:23 +01:00
Thomas Monjalon
fbe90cdd77 ethdev: add probing finish function
A new hook function is added and called inside the PMDs at the end
of the device probing:
	- in primary process, after allocating, init and config
	- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 22:31:53 +01:00
Wei Dai
a4996bd89c ethdev: new Rx/Tx offloads API
This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.

This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().

In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-14 22:31:51 +01:00
Ophir Munk
8d54ede738 net/tap: report on supported RSS hash functions
Report on TAP supported RSS functions as part of dev_infos_get
callback: ETH_RSS_IP, ETH_RSS_UDP and ETH_RSS_TCP.
Known limitation: TAP supports all of the above hash functions together
and not in partial combinations.
Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations will be
verified.

[1] commit 8863a1fbfc ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:31:51 +01:00
Stephen Hemminger
1b3b7caeb1 net/tap: convert to dynamic logging
Use new logging macro to convert all calls to RTE_LOG() into
new dynamic log type.

Also fix whitespace.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 18:00:59 +01:00
Ophir Munk
ead63dd318 net/tap: return empty port offload capabilities
Fix internal report on port specific offload capabilities to be 0 (no
capabilities). Before this commit port capabilities were a clone of queue
capabilities, however the current TAP offload capabilities (e.g.
checksum calculation) are per queue and are not specific per port.
This commit fixes an internal validation check for new configured
queue offloads.
The port capability API keeps reporting all queue capabilities as port
capabilities.

Fixes: 95ae196ae1 ("net/tap: use new Rx offloads API")
Fixes: 818fe14a98 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-04-27 18:00:57 +01:00
Adrien Mazarguil
76e9a55b5b ethdev: add transfer attribute to flow API
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
e58638c324 ethdev: fix TPID handling in flow API
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.

Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:

 Wire:     [ ETH TPID = A | VLAN EtherType = B | B DATA ]
 rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]

Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:

 Wire:       [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
 rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
 rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]

While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.

This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:

- bnxt: EtherType matching is supported with and without VLAN, but TPID
  matching is not and triggers an error.

- e1000: EtherType matching is only supported with the ETHERTYPE filter,
  which does not support VLAN matching, therefore no impact.

- enic: same as bnxt.

- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
  values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
  EtherType matching.

- ixgbe: same as e1000, with additional minor change to rely on the new
  E-Tag macro definition.

- mlx4: EtherType/TPID matching is not supported, no impact.

- mlx5: same as bnxt.

- mvpp2: same as bnxt.

- sfc: same as bnxt.

- tap: same as bnxt.

Fixes: b1a4b4cbc0 ("ethdev: introduce generic flow API")
Fixes: 99e7003831 ("net/ixgbe: parse L2 tunnel filter")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
18aee2861a ethdev: add encap level to RSS flow API action
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.

When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.

This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.

Contrary to the change announced by commit 676b605182 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
929e331934 ethdev: add hash function to RSS flow API action
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.

Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).

This breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:54 +01:00
Adrien Mazarguil
ac8d22de23 ethdev: flatten RSS configuration in flow API
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].

Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.

A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.

This patch updates all PMDs and example applications accordingly.

It breaks ABI compatibility for the following public functions:

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

[1] commit 676b605182 ("doc: announce ethdev API change for RSS
    configuration")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Adrien Mazarguil
cc17feb904 ethdev: alter behavior of flow API actions
This patch makes the following changes to flow rule actions:

- List order now matters, they are redefined as performed first to last
  instead of "all simultaneously".

- Repeated actions are now supported (e.g. specifying QUEUE multiple times
  now duplicates traffic among them). Previously only the last action of
  any given kind was taken into account.

- No more distinction between terminating/non-terminating/meta actions.
  Flow rules themselves are now defined as always terminating unless a
  PASSTHRU action is specified.

These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matter when combined.

Previously one would have to do so through multiple flow rules by combining
PASSTRHU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.

This breaks ABI compatibility for the following public functions:

- rte_flow_create()
- rte_flow_validate()

PMDs with rte_flow support are modified accordingly:

- bnxt: no change, implementation already forbids multiple actions and does
  not support PASSTHRU.

- e1000: no change, same as bnxt.

- enic: modified to forbid redundant actions, no support for default drop.

- failsafe: no change needed.

- i40e: no change, implementation already forbids multiple actions.

- ixgbe: same as i40e.

- mlx4: modified to forbid multiple fate-deciding actions and drop when
  unspecified.

- mlx5: same as mlx4, with other redundant actions also forbidden.

- sfc: same as mlx4.

- tap: implementation already complies with the new behavior except for
  the default pass-through modified as a default drop.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-27 18:00:53 +01:00
Ferruh Yigit
18869f97f1 drivers/net: fix link autoneg value for virtual PMDs
These drivers never attempt link speed negotiation. Change link_autoneg
value to ETH_LINK_FIXED to be more accurate and consistent between PMDs.

Fixes: 1e3a958f40 ("ethdev: fix link autonegotiation value")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-27 17:34:43 +01:00
Ferruh Yigit
6db80eb7c9 net/tap: fix icc build
build error:
.../dpdk/drivers/net/tap/rte_eth_tap.c(598):
error #279: controlling expression is constant
	RTE_ASSERT(!"unsupported request type: must not happen");

Although RTE_ASSERT helps debugging this issue when assert enabled,
constant expression in assert means this path can be taken during
runtime and there is no protection against it when assert is disabled.

Adding error log and error return back, replacing RTE_ASSERT.

Fixes: 7748a4b441 ("net/tap: add debug messages")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2018-04-27 17:34:42 +01:00
Vipin Varghese
8055b534d6 net/tap: fix protocol field for non-IP
When non IP packets are sent on TUN interface, the logic put Ipv6 as
protocol field in header. With the current patch, the check is modified
for ipv4, ipv6 and non ip.

Fixes: 204d026a39 ("net/tap: support tun")

Suggested-by: Ophir Munk <ophirmu@mellanox.com>
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-04-27 15:54:55 +01:00
Jianfeng Tan
ee27edbe0c drivers/net: share vdev data to secondary process
dpdk-procinfo, as a secondary process, cannot fetch stats for vdev.

This patch enables that by attaching the port from the shared data.
We also fill the eth dev ops, with only some ops works in secondary
process, for example, stats_get().

Note that, we still cannot Rx/Tx packets on the ports which do not
support multi-process.

Reported-by: Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-04-24 12:37:31 +02:00
Jianfeng Tan
5f19dee604 drivers/net: do not use private ethdev data
We introduced private rte_eth_dev_data to allow vdev to be created
both in primary process and secondary process(es). This is not
friendly to multi-process model, for example, it leads to port id
contention issue if two processes both find the data entry is free.

And to get stats of primary vdev in secondary, we must allocate
from the pre-defined array so that we can find it.

Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2018-04-24 12:33:51 +02:00
Olivier Matz
caccf8b318 ethdev: return diagnostic when setting MAC address
Change the prototype and the behavior of dev_ops->eth_mac_addr_set(): a
return code is added to notify the caller (librte_ether) if an error
occurred in the PMD.

The new default MAC address is now copied in dev->data->mac_addrs[0]
only if the operation is successful.

The patch also updates all the PMDs accordingly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-04-14 00:43:30 +02:00
Ferruh Yigit
cd8c7c7ce2 ethdev: replace bus specific struct with generic dev
Public struct rte_eth_dev_info has a "struct rte_pci_device" field in it
although it is common for all ethdev in all buses.

Replacing pci specific struct with generic device struct and updating
places that are using pci device in a way to get this information from
generic device.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-14 00:41:44 +02:00
Vipin Varghese
58f7db4396 net/tap: add tun log and documentation
The changes add TUN|TAP specific logs and documentation support.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-04-14 00:41:44 +02:00
Vipin Varghese
204d026a39 net/tap: support tun
The change adds functional TUN PMD logic to the existing TAP PMD.
TUN PMD can be initialized with 'net_tunX' where 'X' represents unique id.
PMD supports argument interface, while MAC address and remote are not
supported.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-04-14 00:41:44 +02:00
Pavan Nikhilesh
28aa16db4a net/tap: fix memcpy with incorrect size
Fix incorrect sizeof operation being used for getting mac addr size.

Found while compiling with arm64 clang.
drivers/net/tap/rte_eth_tap.c:1410:40: error: argument to 'sizeof' in
    'memcpy' call is the same pointer type 'struct ether_addr *' as the
    destination; expected 'struct ether_addr' or an explicit length
    [-Werror,-Wsizeof-pointer-memaccess]
       rte_memcpy(&pmd->eth_addr, mac_addr, sizeof(mac_addr));
                  ~~~~~~~~~~~~~~            ^~~~~~~~~~~~~~~~

Fixes: bcab6c1d27 ("net/tap: allow user MAC to be passed as args")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
2018-04-14 00:40:21 +02:00
Shahaf Shuler
5feecc57d9 align SPDX Mellanox copyrights
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 01:47:47 +02:00
Bruce Richardson
c022cb400e convert snprintf to strlcpy
Since we have support for the strlcpy function in DPDK, replace all
instances where a string is copied using snprintf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-04-04 17:33:08 +02:00
Vipin Varghese
bcab6c1d27 net/tap: allow user MAC to be passed as args
Allow TAP PMD to pass user desired MAC address as argument.
The argument value is processed as string delimited by  ':',
is parsed and converted to HEX MAC address after validation.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-03-30 14:08:43 +02:00
Ophir Munk
ba2f43464e net/tap: fix promiscuous rules double insertions
Running testpmd command "port stop all" followed by command "port start
all" may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: during the execution of "port start all" command
testpmd calls rte_eth_promiscuous_enable() while during the execution
of "port stop all" command testpmd does not call
rte_eth_promiscuous_disable().
As a result the TAP PMD is trying to add tc (traffic control command)
promiscuous rules to the remote netvsc device consecutively. From the
kernel point of view it is seen as an attempt to add the same rule more
than once. In recent kernels (e.g. version 4.13) this attempt is rejected
with a "File exists" error. In less recent kernels (e.g. version 4.4) the
same rule may have been successfully accepted twice, which is undesirable.

In the corrupted code every tc promiscuous rule included a different
handle number parameter. If instead an identical handle number is
used for all tc promiscuous rules - all kernels will reject the second
identical rule with a "File exists" error, which is easy to identify and
to silently ignore.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-14 15:42:39 +01:00
Ophir Munk
3382eb3a3a net/tap: add CRC stripping capability
CRC stripping is executed in the kernel outside of TAP PMD scope.
There is no prevention that the TAP PMD will report on Rx CRC
stripping capability.
In the corrupted code, TAP PMD did not report on this capability.
The fix enables TAP PMD to report that Rx CRC stripping is supported.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-02-13 18:17:30 +01:00
Moti Haimovsky
cb7e68da63 net/tap: fix cleanup on allocation failure
This patch complements the partial cleanup done inside
eth_dev_tap_create when the routine failed.
Such a failure left a non-functional device attached to the system.

Fixes: 050fe6e9ff ("drivers/net: use ethdev allocation helper for vdev")
Cc: stable@dpdk.org

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-08 13:18:07 +01:00
Ophir Munk
54aad2f231 net/tap: fix eBPF handling of non-RSS flows
The eBPF classifier (section "cls_q" in tap_bpf_program.c) is tracing
marked packets in which skb->cb[1] contains an RSS queue number, and
redirects those packets to the matched queue.
It is expected that skb->cb[1] has been previously set with a valid RSS
queue number during an eBPF action (section "l3_l4" in tap_bpf_program.c).
However, for non-RSS flows, skb->cb[1] may contain a random unset value,
which could falsely be interpreted as a valid RSS queue.
To avoid this potential error, tap_bpf_program.c has been updated as
follows:
1. After calculating the RSS queue number, it is added a unique offset in
order to uniquely identify it as a valid RSS queue number.
2. After matching an RSS queue to a packet, skb->cb[1] is set to 0.

Fixes: cdc07e83bb ("net/tap: add eBPF program file")
Fixes: aabe70df73 ("net/tap: add eBPF bytes code")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-05 19:56:04 +01:00
Ophir Munk
9abaad8d68 net/tap: fix multi segments capability
TAP device is supporting multi segments Tx, however this capability is
not reported when querying the TAP device.
This commit adds this capability report.

Fixes: 818fe14a98 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-02-05 19:56:04 +01:00
Zhiyong Yang
d98fac4dae net/tap: fix icc build
The following error is reported when compiling 18.02-rc2 using ICC,
"transfer of control bypasses initialization of".
The patch fixes the issue.

Fixes: 1911c5edc6 ("net/tap: fix eBPF RSS map key handling")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
2018-02-01 10:31:18 +01:00
Olivier Matz
acfe2bd440 net/tap: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-02-01 02:33:04 +01:00
Vipin Varghese
22e8c9f341 net/tap: remove speed argument
TAP is a virtual device created on Kernel. The speed of interface is
set by Kernel to a fixed static value. But this does not prevent using
RX or TX to rate limit. Hence removing the option from user arguments.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
1911c5edc6 net/tap: fix eBPF RSS map key handling
This commit addresses a case of RSS and non RSS flows mixture.
In the corrupted code, if a non-RSS flow is destroyed, it has an eBPF map
index 0 (initialized upon flow creation). This might unintentionally remove
RSS entry 0 from eBPF map.
To fix this issue, add an offset to the real index during a KEY_CMD_GET
operation, and subtract this offset during a KEY_CMD_RELEASE operation, in
order to restore the real index.
Thus, if a non RSS flow is falsely trying to release map entry 0 - The
offset subtraction will calculate the real map index as an
out-of-range value, and the release operation will be silently ignored.

Fixes: 036d721a82 ("net/tap: implement RSS using eBPF")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
da8841a713 net/tap: fix eBPF file descriptors leakage
When a user creates an RSS rule, the tap PMD dynamically allocates
a 'flow' data structure, and uploads BPF programs (represented by file
descriptors) to the kernel.
The kernel might reject the rule (due to filters overlap, for example)
in which case, flow memory should be freed and BPF file descriptors
should be closed.
In the corrupted code there were scenarios where BPF file descriptors
were not closed.
The fix is to add a new function - tap_flow_free(), which will make sure
to always close BPF file descriptors before freeing the flow allocated
memory.

Fixes: 036d721a82 ("net/tap: implement RSS using eBPF")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-31 20:57:29 +01:00
Ophir Munk
e56447980e net/tap: fix build on ARM32
This commit adds eBPF system call definitions for ARM architecture.
Old Linux header files may not define eBPF system call numbers.
In order to successful compile eBPF on all Linux platforms - the
missing ARM system call definition is explicitly added.

Fixes: b02d85e1 ("net/tap: add eBPF API")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-01-31 19:14:47 +01:00
Gowrishankar Muthukrishnan
3eaad8e530 net/tap: fix build on ppc
This patch defines __NR_bpf for powerpc architecture and hence,
fixes compiling tap driver for this architecture.

Fixes: b02d85e1 ("net/tap: add eBPF API")

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-30 14:28:14 +01:00
Moti Haimovsky
4870a8cdd9 net/tap: support Rx interrupt
This patch adds support for registering and waiting for Rx interrupts.
This allows applications to wait for Rx events from the PMD using the
DPDK rte_epoll subsystem.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-29 10:45:20 +01:00
Ophir Munk
e011bad02d net/tap: use local eBPF definitions
eBPF has a graceful approach: it must successfully compile on all Linux
distributions. If a specific kernel cannot support eBPF it will gracefully
refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere all the
required definitions for TAP have been locally added instead of including
file <linux/bpf.h>

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
2018-01-24 19:03:29 +01:00
Ferruh Yigit
ffc905f3b8 ethdev: separate driver APIs
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.

There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.

More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-22 01:26:49 +01:00
Ophir Munk
036d721a82 net/tap: implement RSS using eBPF
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.

TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.

TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.

eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.

TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
b02d85e19c net/tap: add eBPF API
This commit include BPF API to be used by TAP.

tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
aabe70df73 net/tap: add eBPF bytes code
File tap_bpf_insns.h was added. It includes  eBPF bytes code
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytes code is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
   clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
   -filetype=obj -o <tap_bpf_program.o>

clang version must be 3.7 and above
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel

2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using an BPF systm call. The external tool that generates the C arrays
will be added in separate commits.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
cdc07e83bb net/tap: add eBPF program file
File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.

Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
fb847dcf73 net/tap: support actions for different classifiers
Add a generic TC actions handling for TC actions: "mirred",
"gact", "skbedit". This will be useful when introducing
BPF actions, as it uses TCA_BPF_ACT instead of TCA_FLOWER_ACT

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Moti Haimovsky
95ae196ae1 net/tap: use new Rx offloads API
Ethdev Rx offloads API has changed since:
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
This commit adds support for the new Rx offloads API.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Moti Haimovsky
818fe14a98 net/tap: use new Tx offloads API
Ethdev Tx offloads API has changed since:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
This commit adds support for the new Tx offloads API.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Stephen Hemminger
d732ec19dc net/tap: remove unused kernel version definitions
The TAP device does not use these definitions in current version.
And kernel version is not the correct way to detect features.

Fixes: 7c2d03d65f ("net/tap: remove Linux version check")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-16 18:47:49 +01:00
Thomas Monjalon
1e3a958f40 ethdev: fix link autonegotiation value
There are 3 kind of link data in ethdev:
	- capabilities (rte_eth_dev_info)
	- configuration (rte_eth_conf)
	- status (rte_eth_link)

A bit-field is used for capabilities (rte_eth_dev_info.speed_capa) and
configuration (rte_eth_conf.link_speeds).
Bits are defined in ETH_LINK_SPEED_*.

Some numerical (ETH_SPEED_NUM_*) and boolean (ETH_LINK_*) values
are used for the link status (rte_eth_link.*).

There was a mistake in the comment of rte_eth_link.link_autoneg,
suggesting ETH_LINK_SPEED_[AUTONEG/FIXED] which are 0/1,
instead of ETH_LINK_[AUTONEG/FIXED] which are 1/0.

The drivers are fixed to use ETH_LINK_[AUTONEG/FIXED].

Fixes: 82113036e4 ("ethdev: redesign link speed config")

Suggested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-16 18:47:49 +01:00
Radu Nicolau
c5cf2a0429 net/tap: renamed netlink functions
Functions like nl_recev and nl_send name clash functions in the
libnl library (https://www.infradead.org/~tgr/libnl/).
All functions declared in tap_netlink.h were decorated with tap_
for consistency.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-16 18:47:49 +01:00
Bruce Richardson
5566a3e358 drivers: use SPDX tag for Intel copyright files
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-04 22:41:39 +01:00
Jianfeng Tan
d4a586d29e bus/vdev: move code from EAL into a new driver
Move the vdev bus from lib/librte_eal to drivers/bus.

As the crypto vdev helper function refers to data structure
in rte_vdev.h, so we move those helper function into drivers/bus
too.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-11-07 16:54:07 +01:00
Gaetan Rivet
00a3d8104a ethdev: remove detachable device flag
This flag is not necessary at the ether layer anymore.
Buses are able to advertise their hotplug support. The ether layer can
rely upon this capability instead of a special flag.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-26 02:33:01 +02:00
Olivier Matz
cbc12b0a96 mk: do not generate LDLIBS from directory dependencies
The list of libraries in LDLIBS was generated from the DEPDIRS-xyz
variable. This is valid when the subdirectory name match the library
name, but it's not always the case, especially for PMDs.

The patches removes this feature and explicitly adds the proper
libraries in LDLIBS.

Some DEPDIRS-xyz variables become useless, remove them.

Reported-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
2017-10-24 02:14:57 +02:00
Adrien Mazarguil
5b4efcc6b9 ethdev: expose flow API error helper
rte_flow_error_set() is a convenient helper to initialize error objects.

Since there is no fundamental reason to prevent applications from using it,
expose it through the public interface after modifying its return value
from positive to negative. This is done for consistency with the rest of
the public interface.

Documentation is updated accordingly.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Matan Azrad
d5b0924ba6 ethdev: add return value to stats get dev op
The stats_get dev op API doesn't include return value, so PMD cannot
return an error in case of failure at stats getting process time.

Since PCI devices can be removed and there is a time between the
physical removal to the RMV interrupt, the user may get invalid stats
without any indication.

This patch changes the stats_get API return value to be int instead of
void.

All the net PMDs stats_get dev ops are adjusted by this patch.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:52:49 +01:00
Matan Azrad
8073bc818c net/tap: allow RSS flow action
One of the main identified use cases for the tap PMD is to be used in
combination with the fail-safe PMD as a fallback for a physical device.

Fail-safe is very strict about making sure its current configuration is
properly applied to all slave devices, they get rejected otherwise in
order to maintain a consistent state.

The problem is that tap's RSS support is currently limited to the
default (non-Toeplitz) balancing performed by the kernel on all Rx
queues. While proper RSS support emulation in the tap PMD is a work in
progress, the lack of rte_flow counterpart prevents validation of the
above use case in the meantime.

Given that unlike most PMDs, tap is more about convenience than
performance, support for the RSS action can be temporarily faked with
a minimum amount of code and mostly correct behavior by treating it
like a QUEUE action. Traffic is directed to the first queue of the set.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:50 +02:00
Vipin Varghese
d8f759a0ea net/tap: fix unregistering callback with invalid fd
tap_intr_handle_set() called by tap_dev_start(), and if LSC is disabled
(dev_conf.intr_conf.lsc == 0), it tries to unregister interrupt
callback without checking the interrupt file descriptor.

Fixes: c0bddd3a05 ("net/tap: add link status notification")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:49 +02:00
Ophir Munk
f46900d038 net/tap: fix flow and port commands
This commit fixes two bugs related to tap devices. The first bug occurs
when executing in testpmd the following flow rule assuming tap device has
4 rx and tx pair queues
"flow create 0 ingress pattern eth / end actions queue index 5 / end"
This command will report on success and will print ""Flow rule #0 created"
although it should have failed as queue index number 5 does not exist

The second bug occurs when executing in testpmd "port start all" following
a port configuration. Assuming 1 pair of rx and tx queues an error is
reported: "Fail to start port 0"

Before this commit a fixed max number (16) of rx and tx queue pairs were
created on startup where the file descriptors (fds) of rx and tx pairs were
identical.  As a result in the first bug queue index 5 existed because the
tap device was created with 16 rx and tx queue pairs regardless of the
configured number of queues. In the second bug when tap device was started
tx fd was closed before opening it and executing ioctl() on it. However
closing the sole fd of the device caused ioctl to fail with "No such
device".

This commit creates the configured number of rx and tx queue pairs (up to
max 16) and assigns a unique fd to each queue. It was written to solve the
first bug and was found as the right fix for the second bug as well.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:48 +02:00
Raslan Darawsheh
7c2d03d65f net/tap: remove Linux version check
Remove checks of Linux kernel version
in order to support kernel with backported features.

the expected behavior with a kernel that doesn't support flower
and other bits is the following:
        -flow validate can return successfully
        -flow create using the same rule fails.

Using the "remote" feature without kernel flower does not fail silently.
The TAP instance is not initialized if the requested parameters cannot
be satisfied.

it has been tested on an old kernel without required support:

PMD: Kernel refused TC filter rule creation (2): No such file or directory
PMD: tap0 failed to create implicit rules.
PMD: Can't set up remote feature: No such file of directory(2)
PMD: TAP Unable to initialize net_tap0

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-19 11:09:13 +03:00
Thomas Monjalon
013b83b82e net/tap: add missing newlines in logs
Some logs are missing the newline character \n.

The logs using only line can be checked with this command:
	git grep 'RTE_LOG(.*".*[^n]"' drivers/net/tap/

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Fixes: 268483dc20 ("net/tap: add preliminary support for flow API")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Fixes: bf7b7f437b ("net/tap: create netdevice during probing")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-19 11:09:13 +03:00
Thomas Monjalon
4810d3af83 net/tap: restore state of remote device when closing
When exiting a DPDK application, the TAP remote was left
with the link down even if it was initially up.

The device flags of the remote netdevice are saved when probing,
and restored when calling the close function.
The remote state is not set down when calling the stop function anymore.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-06 15:00:57 +02:00
Pascal Mazon
f503d26948 net/tap: support flow API isolated mode
With this patch, it is possible to enable or disable the isolate
feature anytime, even immediately after a probe while the tap has not
been configured yet. It will do its job as soon as the netdevice gets
created.

A specific implicit flow rule is created with the lowest priority (all
other flow rules will be evaluated before), at the end of the list. If
isolated mode is enabled, the associated action will be to drop the
packet. Otherwise, the action would be passthrough.

In case of a remote netdevice, implicit rules on it will be removed in
isolated mode, to ensure only actual flow rules redirect packets to the
tap.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-06 15:00:56 +02:00
Ferruh Yigit
4be4659a93 drivers/net: use device name from device structure
Device name resides in two different locations, in rte_device->name and
in ethernet device private data.

For now, the copy in the ethernet device private data is required for
multi process support, the name is the how secondary process finds about
primary process device.

But for drivers there is no reason to use the copy in the ethernet
device private data.

This patch updates PMDs to use only rte_device->name.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-07-06 00:17:02 +02:00
Jerin Jacob
98a7ea332b fix typos using codespell utility
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2017-06-14 23:54:13 +02:00
Ferruh Yigit
740feaf349 ethdev: remove driver name from device private data
rte_driver->name has the driver name and all physical and virtual
devices has access to it.

Previously it was not possible for virtual ethernet devices to access
rte_driver->name field (because eth_dev used to keep only pci_dev),
and it was required to save driver name in the device private struct.

After re-works on bus and vdev, it is possible for all bus types to
access rte_driver.

It is able to remove the driver name from ethdev device private data and
use eth_dev->device->driver->name.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
2017-06-12 16:27:44 +01:00
Pascal Mazon
8ae3023387 net/tap: add Rx/Tx checksum offload support
This patch adds basic offloading support, widely expected in a PMD.

Verify IPv4 and UDP/TCP checksums upon packet reception, and set
ol_flags accordingly.

On Tx, set IPv4 and UDP/TCP checksums when required, considering
ol_flags.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:26 +01:00
Pascal Mazon
f96fe1ab35 net/tap: fix some flow collision
The following two flow rules (testpmd syntax) should not collide:
flow create 0 priority 1 ingress pattern eth / ipv4 / end actions drop / end
flow create 0 priority 1 ingress pattern eth / ipv6 / end actions drop / end

But the eth_type in the associated TC rule was set to either "ip" or
"ipv6".  For TC, they could thus not have the same priority.

Use ETH_P_ALL only in the TC message to make sure those rules can
coexist.

Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:26 +01:00