Commit Graph

235 Commits

Author SHA1 Message Date
Ophir Munk
036d721a82 net/tap: implement RSS using eBPF
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.

TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.

TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.

eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.

TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
b02d85e19c net/tap: add eBPF API
This commit include BPF API to be used by TAP.

tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
aabe70df73 net/tap: add eBPF bytes code
File tap_bpf_insns.h was added. It includes  eBPF bytes code
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytes code is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
   clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
   -filetype=obj -o <tap_bpf_program.o>

clang version must be 3.7 and above
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel

2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using an BPF systm call. The external tool that generates the C arrays
will be added in separate commits.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
cdc07e83bb net/tap: add eBPF program file
File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.

Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Ophir Munk
fb847dcf73 net/tap: support actions for different classifiers
Add a generic TC actions handling for TC actions: "mirred",
"gact", "skbedit". This will be useful when introducing
BPF actions, as it uses TCA_BPF_ACT instead of TCA_FLOWER_ACT

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Moti Haimovsky
95ae196ae1 net/tap: use new Rx offloads API
Ethdev Rx offloads API has changed since:
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
This commit adds support for the new Rx offloads API.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Moti Haimovsky
818fe14a98 net/tap: use new Tx offloads API
Ethdev Tx offloads API has changed since:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
This commit adds support for the new Tx offloads API.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2018-01-21 15:51:52 +01:00
Stephen Hemminger
d732ec19dc net/tap: remove unused kernel version definitions
The TAP device does not use these definitions in current version.
And kernel version is not the correct way to detect features.

Fixes: 7c2d03d65f ("net/tap: remove Linux version check")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-16 18:47:49 +01:00
Thomas Monjalon
1e3a958f40 ethdev: fix link autonegotiation value
There are 3 kind of link data in ethdev:
	- capabilities (rte_eth_dev_info)
	- configuration (rte_eth_conf)
	- status (rte_eth_link)

A bit-field is used for capabilities (rte_eth_dev_info.speed_capa) and
configuration (rte_eth_conf.link_speeds).
Bits are defined in ETH_LINK_SPEED_*.

Some numerical (ETH_SPEED_NUM_*) and boolean (ETH_LINK_*) values
are used for the link status (rte_eth_link.*).

There was a mistake in the comment of rte_eth_link.link_autoneg,
suggesting ETH_LINK_SPEED_[AUTONEG/FIXED] which are 0/1,
instead of ETH_LINK_[AUTONEG/FIXED] which are 1/0.

The drivers are fixed to use ETH_LINK_[AUTONEG/FIXED].

Fixes: 82113036e4 ("ethdev: redesign link speed config")

Suggested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-16 18:47:49 +01:00
Radu Nicolau
c5cf2a0429 net/tap: renamed netlink functions
Functions like nl_recev and nl_send name clash functions in the
libnl library (https://www.infradead.org/~tgr/libnl/).
All functions declared in tap_netlink.h were decorated with tap_
for consistency.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-01-16 18:47:49 +01:00
Bruce Richardson
5566a3e358 drivers: use SPDX tag for Intel copyright files
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-04 22:41:39 +01:00
Jianfeng Tan
d4a586d29e bus/vdev: move code from EAL into a new driver
Move the vdev bus from lib/librte_eal to drivers/bus.

As the crypto vdev helper function refers to data structure
in rte_vdev.h, so we move those helper function into drivers/bus
too.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-11-07 16:54:07 +01:00
Gaetan Rivet
00a3d8104a ethdev: remove detachable device flag
This flag is not necessary at the ether layer anymore.
Buses are able to advertise their hotplug support. The ether layer can
rely upon this capability instead of a special flag.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-26 02:33:01 +02:00
Olivier Matz
cbc12b0a96 mk: do not generate LDLIBS from directory dependencies
The list of libraries in LDLIBS was generated from the DEPDIRS-xyz
variable. This is valid when the subdirectory name match the library
name, but it's not always the case, especially for PMDs.

The patches removes this feature and explicitly adds the proper
libraries in LDLIBS.

Some DEPDIRS-xyz variables become useless, remove them.

Reported-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
2017-10-24 02:14:57 +02:00
Adrien Mazarguil
5b4efcc6b9 ethdev: expose flow API error helper
rte_flow_error_set() is a convenient helper to initialize error objects.

Since there is no fundamental reason to prevent applications from using it,
expose it through the public interface after modifying its return value
from positive to negative. This is done for consistency with the rest of
the public interface.

Documentation is updated accordingly.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2017-10-13 01:18:47 +01:00
Matan Azrad
d5b0924ba6 ethdev: add return value to stats get dev op
The stats_get dev op API doesn't include return value, so PMD cannot
return an error in case of failure at stats getting process time.

Since PCI devices can be removed and there is a time between the
physical removal to the RMV interrupt, the user may get invalid stats
without any indication.

This patch changes the stats_get API return value to be int instead of
void.

All the net PMDs stats_get dev ops are adjusted by this patch.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:52:49 +01:00
Matan Azrad
8073bc818c net/tap: allow RSS flow action
One of the main identified use cases for the tap PMD is to be used in
combination with the fail-safe PMD as a fallback for a physical device.

Fail-safe is very strict about making sure its current configuration is
properly applied to all slave devices, they get rejected otherwise in
order to maintain a consistent state.

The problem is that tap's RSS support is currently limited to the
default (non-Toeplitz) balancing performed by the kernel on all Rx
queues. While proper RSS support emulation in the tap PMD is a work in
progress, the lack of rte_flow counterpart prevents validation of the
above use case in the meantime.

Given that unlike most PMDs, tap is more about convenience than
performance, support for the RSS action can be temporarily faked with
a minimum amount of code and mostly correct behavior by treating it
like a QUEUE action. Traffic is directed to the first queue of the set.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:50 +02:00
Vipin Varghese
d8f759a0ea net/tap: fix unregistering callback with invalid fd
tap_intr_handle_set() called by tap_dev_start(), and if LSC is disabled
(dev_conf.intr_conf.lsc == 0), it tries to unregister interrupt
callback without checking the interrupt file descriptor.

Fixes: c0bddd3a05 ("net/tap: add link status notification")
Cc: stable@dpdk.org

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:49 +02:00
Ophir Munk
f46900d038 net/tap: fix flow and port commands
This commit fixes two bugs related to tap devices. The first bug occurs
when executing in testpmd the following flow rule assuming tap device has
4 rx and tx pair queues
"flow create 0 ingress pattern eth / end actions queue index 5 / end"
This command will report on success and will print ""Flow rule #0 created"
although it should have failed as queue index number 5 does not exist

The second bug occurs when executing in testpmd "port start all" following
a port configuration. Assuming 1 pair of rx and tx queues an error is
reported: "Fail to start port 0"

Before this commit a fixed max number (16) of rx and tx queue pairs were
created on startup where the file descriptors (fds) of rx and tx pairs were
identical.  As a result in the first bug queue index 5 existed because the
tap device was created with 16 rx and tx queue pairs regardless of the
configured number of queues. In the second bug when tap device was started
tx fd was closed before opening it and executing ioctl() on it. However
closing the sole fd of the device caused ioctl to fail with "No such
device".

This commit creates the configured number of rx and tx queue pairs (up to
max 16) and assigns a unique fd to each queue. It was written to solve the
first bug and was found as the right fix for the second bug as well.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-10-06 02:49:48 +02:00
Raslan Darawsheh
7c2d03d65f net/tap: remove Linux version check
Remove checks of Linux kernel version
in order to support kernel with backported features.

the expected behavior with a kernel that doesn't support flower
and other bits is the following:
        -flow validate can return successfully
        -flow create using the same rule fails.

Using the "remote" feature without kernel flower does not fail silently.
The TAP instance is not initialized if the requested parameters cannot
be satisfied.

it has been tested on an old kernel without required support:

PMD: Kernel refused TC filter rule creation (2): No such file or directory
PMD: tap0 failed to create implicit rules.
PMD: Can't set up remote feature: No such file of directory(2)
PMD: TAP Unable to initialize net_tap0

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-19 11:09:13 +03:00
Thomas Monjalon
013b83b82e net/tap: add missing newlines in logs
Some logs are missing the newline character \n.

The logs using only line can be checked with this command:
	git grep 'RTE_LOG(.*".*[^n]"' drivers/net/tap/

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Fixes: 268483dc20 ("net/tap: add preliminary support for flow API")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Fixes: bf7b7f437b ("net/tap: create netdevice during probing")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-19 11:09:13 +03:00
Thomas Monjalon
4810d3af83 net/tap: restore state of remote device when closing
When exiting a DPDK application, the TAP remote was left
with the link down even if it was initially up.

The device flags of the remote netdevice are saved when probing,
and restored when calling the close function.
The remote state is not set down when calling the stop function anymore.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-06 15:00:57 +02:00
Pascal Mazon
f503d26948 net/tap: support flow API isolated mode
With this patch, it is possible to enable or disable the isolate
feature anytime, even immediately after a probe while the tap has not
been configured yet. It will do its job as soon as the netdevice gets
created.

A specific implicit flow rule is created with the lowest priority (all
other flow rules will be evaluated before), at the end of the list. If
isolated mode is enabled, the associated action will be to drop the
packet. Otherwise, the action would be passthrough.

In case of a remote netdevice, implicit rules on it will be removed in
isolated mode, to ensure only actual flow rules redirect packets to the
tap.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-07-06 15:00:56 +02:00
Ferruh Yigit
4be4659a93 drivers/net: use device name from device structure
Device name resides in two different locations, in rte_device->name and
in ethernet device private data.

For now, the copy in the ethernet device private data is required for
multi process support, the name is the how secondary process finds about
primary process device.

But for drivers there is no reason to use the copy in the ethernet
device private data.

This patch updates PMDs to use only rte_device->name.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-07-06 00:17:02 +02:00
Jerin Jacob
98a7ea332b fix typos using codespell utility
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2017-06-14 23:54:13 +02:00
Ferruh Yigit
740feaf349 ethdev: remove driver name from device private data
rte_driver->name has the driver name and all physical and virtual
devices has access to it.

Previously it was not possible for virtual ethernet devices to access
rte_driver->name field (because eth_dev used to keep only pci_dev),
and it was required to save driver name in the device private struct.

After re-works on bus and vdev, it is possible for all bus types to
access rte_driver.

It is able to remove the driver name from ethdev device private data and
use eth_dev->device->driver->name.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
2017-06-12 16:27:44 +01:00
Pascal Mazon
8ae3023387 net/tap: add Rx/Tx checksum offload support
This patch adds basic offloading support, widely expected in a PMD.

Verify IPv4 and UDP/TCP checksums upon packet reception, and set
ol_flags accordingly.

On Tx, set IPv4 and UDP/TCP checksums when required, considering
ol_flags.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:26 +01:00
Pascal Mazon
f96fe1ab35 net/tap: fix some flow collision
The following two flow rules (testpmd syntax) should not collide:
flow create 0 priority 1 ingress pattern eth / ipv4 / end actions drop / end
flow create 0 priority 1 ingress pattern eth / ipv6 / end actions drop / end

But the eth_type in the associated TC rule was set to either "ip" or
"ipv6".  For TC, they could thus not have the same priority.

Use ETH_P_ALL only in the TC message to make sure those rules can
coexist.

Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:26 +01:00
Ferruh Yigit
dd2c630a5f drivers/net: remove unnecessary macro for unused variables
remove __rte_unused instances that are not required.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
460bf26fd8 net/tap: do not set remote MAC if not necessary
Check for the current MAC address on both the remote and the tap
netdevices before setting a new value.

While there, remove wrong empty lines and ensure tap_ioctl() return
value is negative, just like what is done throughout this code.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
bf7b7f437b net/tap: create netdevice during probing
This has three main benefits:
 - tun_alloc is now generic again for any queue,
 - mtu no longer needs to be handled in tap_setup_queue(),
 - an actual netdevice is created as soon as the device is probed.

On top of it, code in eth_dev_tap_create() has been reworked to have a
more logical behavior; initialization can now fail if a remote is
requested but cannot be set up.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
b32105eb1f net/tap: drop unnecessary nested block
This is cosmetic; the code is functionally equivalent.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
2e05e1d8c5 net/tap: remove unnecessary functions
These functions are only two lines each and are used only once.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
7748a4b441 net/tap: add debug messages
Print a detailed debug message inside tap_ioctl() directly. The caller
now only needs to check for return value.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
f6921783fe net/tap: add support for fixed MAC addresses
Support for a fixed MAC address for testing with the last octet
incrementing by one for each interface defined with the new 'mac=fixed'
string on the --vdev option. The default option is still to randomize
the MAC address for each tap interface.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-06-12 10:41:25 +01:00
Pascal Mazon
ec12df9504 net/tap: fix support for large Rx queues
Rx queues configured with more than 1023 descriptors cause readv() calls to
fail due to more iovec entries than permitted by the kernel. As a result,
no packets can be received.

Quietly limit internal Rx queue size to the maximum number of iovec entries
to fix this issue.

Fixes: 0781f5762c ("net/tap: support segmented mbufs")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-05-01 22:32:55 +02:00
Pascal Mazon
69ebb8ae17 net/tap: update driver param string
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-04-19 15:37:37 +02:00
Jan Blunck
050fe6e9ff drivers/net: use ethdev allocation helper for vdev
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2017-04-18 19:04:49 +02:00
Jan Blunck
5d2aa461cb vdev: use generic vdev struct for probe and remove
This is a preparation to embed the generic rte_device into the rte_eth_dev
also for virtual devices.

Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-04-14 15:41:50 +02:00
Qi Zhang
c23a1a3000 eal: clean up interrupt handle
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2017-04-06 21:15:55 +02:00
Ferruh Yigit
0c145b7eea drivers/net: remove unused DEPDIRS from makefiles
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-06 20:58:59 +02:00
Pascal Mazon
947d949de7 net/tap: fix max queues redefinition
The macro RTE_PMD_TAP_MAX_QUEUES was defined twice.
On machines with kernel < 3.8, IFF_MULTI_QUEUE didn't exist, and thus
both definitions used different values.

Fixes: cf56436611 ("net/tap: move private elements to external header")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-04-06 16:29:32 +02:00
Pascal Mazon
41f0e86033 net/tap: fix redirection rule after MAC change
This is necessary to ensure packets with the new MAC address as
destination get redirected to the tap device.

Also change the MAC address only if the current one is different from
the requested one.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-04-04 19:03:03 +02:00
Pascal Mazon
1b93b12a94 net/tap: fix null MAC address at init
Immediately after init (probing), the device MAC address is all zeroes.
It should be possible to get a correct MAC address as soon as that,
without need for a dev_configure().

With this patch, a MAC address is set in eth_dev_tap_create()
explicitly. It either comes from the remote if any was configured, or is
randomly generated. In any case, the device MAC address is guaranteed to
be the correct one when the tap netdevice actually gets created in
tun_alloc().

Fixes: f76d46b4ff ("net/tap: add MAC address management")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-04-04 19:03:03 +02:00
Pascal Mazon
6fc6de7e0e net/tap: update netlink error code management
Some errors received from the kernel are acceptable, such as a -ENOENT
for a rule deletion (the rule was already no longer existing in the
kernel). Make sure we consider return codes properly. For that,
nl_recv() has been simplified.

qdisc_exists() function is no longer needed as we can check whether the
kernel returned -EEXIST when requiring the qdisc creation. It's simpler
and faster.

Add a few messages for clarity when a netlink error occurs.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-04-04 19:03:03 +02:00
Pascal Mazon
f279f6f2bd net/tap: remove minimum packet size in Rx
With support for segmented packets, it is now possible to easily receive
packets of many sizes, given an adequate number of descriptors.

Remove limitation on the minimum size of mbuf: on reception, if a packet
won't fit in the queue's mbufs, it will be detected in the packet info
and the packet will be discarded.

Fixes: 0781f5762c ("net/tap: support segmented mbufs")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 19:03:02 +02:00
Pascal Mazon
6b890c2cd6 net/tap: remove unsupported UDP/TCP port mask in flow
Only full mask (0xffff) is accepted, there is no way to specify a mask
for layer 4 ports to the kernel using TC rules.

Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 19:03:02 +02:00
Pascal Mazon
c0bddd3a05 net/tap: add link status notification
As tap is a virtual device, there's no physical way a link can be cut.
However, it has an associated kernel netdevice and possibly a remote
netdevice too. These netdevices link status may change outside of the
DPDK scope, through an external command such as:

  ip link set dev tapX down

This commit implements link status notification through netlink.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:47 +02:00
Pascal Mazon
0b3e4ab9bf net/tap: improve link update
Reflect device link status according to the state of the tap netdevice
and the remote netdevice (if any). If both are UP and RUNNING, then the
device link status is set to ETH_LINK_UP, otherwise ETH_LINK_DOWN.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:47 +02:00
Pascal Mazon
2bc06869cd net/tap: add remote netdevice traffic capture
By default, a tap netdevice is of no use when not fed by a separate
process. The ability to automatically feed it from another netdevice
allows applications to capture any kind of traffic normally destined to
the kernel stack.

This patch implements this ability through a new optional "remote"
parameter.

Packets matching filtering rules created with the flow API are matched
on the remote device and redirected to the tap PMD, where the relevant
action will be performed.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:47 +02:00
Pascal Mazon
0781f5762c net/tap: support segmented mbufs
Support for segmented packets (scatter/gather) is mandatory for most
purposes, regardless of the MTU size. Tx packets are often the result of
mbuf concatenation, and an mbuf is not necessarily large enough for Rx
packets to fit in a single one.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:47 +02:00
Pascal Mazon
5025409c04 net/tap: do not send packets larger than MTU
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:47 +02:00
Pascal Mazon
de96fe68ae net/tap: add basic flow API patterns and actions
Supported flow rules are now mapped to TC rules on the tap netdevice.
The netlink message used for creating the TC rule is stored in struct
rte_flow. That way, by simply changing a metadata in it, we can require
for the rule deletion without further parsing.

Supported items:
- eth: src and dst (with variable masks), and eth_type (0xffff mask).
- vlan: vid, pcp, tpid, but not eid.
- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
- udp/tcp: src and dst port (0xffff) mask.

Supported actions:
- DROP
- QUEUE
- PASSTHRU

It is generally not possible to provide a "last" item. However, if the
"last" item, once masked, is identical to the masked spec, then it is
supported.

Only IPv4/6 and MAC addresses can use a variable mask. All other
items need a full mask (exact match).

Support for VLAN requires kernel headers >= 4.9, checked using
auto-config.sh.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:46 +02:00
Pascal Mazon
7c25284e30 net/tap: add netlink back-end for flow API
Each kernel netdevice may have queueing disciplines set for it, which
determine how to handle the packet (mostly on egress). That's part of
the TC (Traffic Control) mechanism.

Through TC, it is possible to set filter rules that match specific
packets, and act according to what is in the rule. This is a perfect
candidate to implement the flow API for the tap PMD, as it has an
associated kernel netdevice automatically.

Each flow API rule will be translated into its TC counterpart.

To leverage TC, it is necessary to communicate with the kernel using
netlink. This patch introduces a library to help that communication.

Inside netlink.c, functions are generic for any netlink messaging.
Inside tcmsgs.c, functions are specific to deal with TC rules.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:46 +02:00
Pascal Mazon
268483dc20 net/tap: add preliminary support for flow API
The flow API provides the ability to classify packets received by a tap
netdevice.

This patch only implements skeleton functions for flow API support, no
patterns are supported yet.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:46 +02:00
Pascal Mazon
cf56436611 net/tap: move private elements to external header
In the next patch, access to struct pmd_internals will be necessary in
tap_flow.c to store the flows.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:46 +02:00
Pascal Mazon
2d3ebfeea9 net/tap: add flow control management
A tap netdevice does not support flow control; ensure nothing but
RTE_FC_NONE mode can be set.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:41 +02:00
Pascal Mazon
0849ac3b61 net/tap: add packet type management
Advertise packet types supported by the librte_net.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
730651c27c net/tap: add multicast addresses management
A tap netdevice actually receives every packet, without any filtering
whatsoever. There is no need for any multicast address registration
to receive multicast packets.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
4a25714ad9 net/tap: add speed capabilities
Tap PMD is flexible, it supports any speed.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
e5dc143a42 net/tap: add MTU management
The MTU is assigned to the tap netdevice according to the argument, but
packet transmission and reception just write/read on an fd with the
default limit being the socket buffer size.

As a new rte_eth_dev_data is allocated during tap device init, ensure it
is set again dev->data->mtu.
Once the actual netdevice is created via tun_alloc(), make sure to apply
the desired MTU to the netdevice.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
f76d46b4ff net/tap: add MAC address management
As soon as the netdevice is created, update pmd->mac_addr with its
actual MAC address.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
73cf55c8bd net/tap: refactor ioctl calls
Create a socket for ioctl at tap device creation instead of opening it
and closing it every call to tap_link_set_flags().

Use a common tap_ioctl() function that can be extended for various uses
(such as MTU change, MAC address change, ...).

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Pascal Mazon
9d4aabe66a net/tap: remove NO-ARP setting
There is no reason not to support ARP on a tap netdevice. Remove
IFF_NOARP flags.
Focus on IFF_UP when a link status change is required.

Fixes: f457b472b1 ("net/tap: add link up and down operations")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-04-04 18:59:40 +02:00
Adrien Mazarguil
a0d8e807d9 net/tap: add Rx trigger
This commit adds a signal-based trigger to the Rx burst function in order
to avoid unnecessary system calls while Rx queues are empty.

Triggered Rx bursts put less pressure on the kernel, free up CPU resources
for applications and result in a noticeable performance improvement when
sharing CPU threads with other PMDs.

Measuring the traffic forwarding rate between two physical devices in
testpmd (IO mode, single thread, 64B packets) before and after adding two
tap PMD instances (4 ports total) that do not process any traffic and
comparing results yields:

Without Rx trigger:

 -15% (--burst=32)
 -62% (--burst=1)

With Rx trigger:

 -0.3% (--burst=32)
 -6% (--burst=1)

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:39 +02:00
Adrien Mazarguil
d3668acb7b net/tap: remove redundant syscall on Tx
Polling the Tx queue file descriptor before writing to it is not mandatory
since it is configured as non-blocking.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 18:59:39 +02:00
Pascal Mazon
66db3932dd net/tap: fix dev name look-up
Store the device name in dev->data->name, to have symmetrical behavior
between rte_pmd_tap_probe(name) and rte_pmd_tap_remove(name).

The netdevice name (linux interface name) is stored in the name field of
struct pmd_internals.

snprintf(data->name) has been moved closer to the rte_ethdev_allocate()
as it should use the same name.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 15:52:52 +02:00
Keith Wiles
8657878861 net/tap: fix possibly unterminated string
Calling strncpy with a maximum size argument of 16 bytes on destination
array "ifr.ifr_ifrn.ifrn_name" of size 16 bytes might leave the
destination string unterminated.

Coverity issue: 1407499
Fixes: 6b38b2725c ("net/tap: fix multi-queue support")
Cc: stable@dpdk.org

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
2017-04-04 15:52:50 +02:00
Olivier Matz
feb9f680cd mk: optimize directory dependencies
Before this patch, the management of dependencies between directories
had several issues:

- the generation of .depdirs, done at configuration is slow: it can take
  more than one minute on some slow targets (usually ~10s on a standard
  PC without -j).

- for instance, it is possible to express a dependency like:
  - app/foo depends on lib/librte_foo
  - and lib/librte_foo depends on app/bar
  But this won't work because the directories are traversed with a
  depth-first algorithm, so we have to choose between doing 'app' before
  or after 'lib'.

- the script depdirs-rule.sh is too complex.

- we cannot use "make -d" for debug, because the output of make is used for
  the generation of .depdirs.

This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.

After this commit, "make config" is almost immediate.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-03-27 23:28:43 +02:00
Keith Wiles
e1dbde9a1f net/tap: move closing file descriptors to close function
Remove closing fds code from pmd stop routine.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Keith Wiles
89a5bef09a net/tap: move link down before close
Fixes: f457b472b1 ("net/tap: add link up and down operations")

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Keith Wiles
6157fd377c net/tap: cleanup log messages
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Keith Wiles
6b38b2725c net/tap: fix multi-queue support
At the same time remove the code which created the first device queue
at probe time. Now all queues are created during queue setup calls.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Keith Wiles
f4f0b1bc13 net/tap: remove unused variable
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Keith Wiles
c1f2e8c78c net/tap: remove redundant file descriptor array
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
123c42a487 net/tap: support promiscuous and allmulticast
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
f457b472b1 net/tap: add link up and down operations
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
7835953870 net/tap: display name after parsing
The probe parses for user-defined iface name. Let's use that value.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
634dbc7aa7 net/tap: keep kernel-assigned MAC address
There's no point in having a different internal MAC address than the one
provided by the kernel when creating the netdevice.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
455c14e974 net/tap: skip overwritten file descriptor assignment
pmd->fds[0], pmd->rxq[0] and pmd->txq[0] are set a couple of lines after
the for loop that initializes them to -1.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:49 +01:00
Pascal Mazon
f48a8d6797 net/tap: fix log level of errors
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:48 +01:00
Pascal Mazon
542594d3fb net/tap: fix name
dev->data->name contains the device name, e.g. "net_tap0".
dev->data->dev_private->name contains the actual iface name,
e.g. "dtap0".

In any case, the name must to be consistent with the tun_alloc() call in
eth_dev_tap_create().

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2017-02-10 12:25:48 +01:00
Keith Wiles
0002ca582d net/tap: fix invalid queue file descriptor
Rx and Tx queues share the common tap file descriptor, but save this
value separately.

Setting up Rx/Tx queue sets up both queues, release_queue close the
tap file but update file descriptor only for that queue.

This makes other queue's file descriptor invalid.

As a workaround, prevent release_queue callback to be called by default.

This is done by separating Rx/Tx setup functions, so that each only
setup its own queue, this prevents rte_eth_rx/tx_queue_setup() calling
release_queue before setup_queue.

Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-30 22:18:27 +01:00
Keith Wiles
b0120a300b net/tap: fix build with old kernels
IFF_MULTI_QUEUE does not exist in older kernels:
drivers/net/tap/rte_eth_tap.c:143:19: error: ‘IFF_MULTI_QUEUE’ undeclared

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-01-20 18:37:22 +01:00
Keith Wiles
02f96a0a82 net/tap: add TUN/TAP device PMD
The PMD allows for DPDK and the host to communicate using a raw
device interface on the host and in the DPDK application. The device
created is a Tap device with a L2 packet header.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Aws Ismail <aismail@ciena.com>
Tested-by: Vasily Philipov <vasilyf@mellanox.com>
2017-01-17 19:40:50 +01:00