When exiting a DPDK application, the TAP remote was left
with the link down even if it was initially up.
The device flags of the remote netdevice are saved when probing,
and restored when calling the close function.
The remote state is not set down when calling the stop function anymore.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
With this patch, it is possible to enable or disable the isolate
feature anytime, even immediately after a probe while the tap has not
been configured yet. It will do its job as soon as the netdevice gets
created.
A specific implicit flow rule is created with the lowest priority (all
other flow rules will be evaluated before), at the end of the list. If
isolated mode is enabled, the associated action will be to drop the
packet. Otherwise, the action would be passthrough.
In case of a remote netdevice, implicit rules on it will be removed in
isolated mode, to ensure only actual flow rules redirect packets to the
tap.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Device name resides in two different locations, in rte_device->name and
in ethernet device private data.
For now, the copy in the ethernet device private data is required for
multi process support, the name is the how secondary process finds about
primary process device.
But for drivers there is no reason to use the copy in the ethernet
device private data.
This patch updates PMDs to use only rte_device->name.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
rte_driver->name has the driver name and all physical and virtual
devices has access to it.
Previously it was not possible for virtual ethernet devices to access
rte_driver->name field (because eth_dev used to keep only pci_dev),
and it was required to save driver name in the device private struct.
After re-works on bus and vdev, it is possible for all bus types to
access rte_driver.
It is able to remove the driver name from ethdev device private data and
use eth_dev->device->driver->name.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
This patch adds basic offloading support, widely expected in a PMD.
Verify IPv4 and UDP/TCP checksums upon packet reception, and set
ol_flags accordingly.
On Tx, set IPv4 and UDP/TCP checksums when required, considering
ol_flags.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
The following two flow rules (testpmd syntax) should not collide:
flow create 0 priority 1 ingress pattern eth / ipv4 / end actions drop / end
flow create 0 priority 1 ingress pattern eth / ipv6 / end actions drop / end
But the eth_type in the associated TC rule was set to either "ip" or
"ipv6". For TC, they could thus not have the same priority.
Use ETH_P_ALL only in the TC message to make sure those rules can
coexist.
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
remove __rte_unused instances that are not required.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Check for the current MAC address on both the remote and the tap
netdevices before setting a new value.
While there, remove wrong empty lines and ensure tap_ioctl() return
value is negative, just like what is done throughout this code.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
This has three main benefits:
- tun_alloc is now generic again for any queue,
- mtu no longer needs to be handled in tap_setup_queue(),
- an actual netdevice is created as soon as the device is probed.
On top of it, code in eth_dev_tap_create() has been reworked to have a
more logical behavior; initialization can now fail if a remote is
requested but cannot be set up.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Print a detailed debug message inside tap_ioctl() directly. The caller
now only needs to check for return value.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Support for a fixed MAC address for testing with the last octet
incrementing by one for each interface defined with the new 'mac=fixed'
string on the --vdev option. The default option is still to randomize
the MAC address for each tap interface.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Rx queues configured with more than 1023 descriptors cause readv() calls to
fail due to more iovec entries than permitted by the kernel. As a result,
no packets can be received.
Quietly limit internal Rx queue size to the maximum number of iovec entries
to fix this issue.
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
This is a preparation to embed the generic rte_device into the rte_eth_dev
also for virtual devices.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The macro RTE_PMD_TAP_MAX_QUEUES was defined twice.
On machines with kernel < 3.8, IFF_MULTI_QUEUE didn't exist, and thus
both definitions used different values.
Fixes: cf56436611 ("net/tap: move private elements to external header")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
This is necessary to ensure packets with the new MAC address as
destination get redirected to the tap device.
Also change the MAC address only if the current one is different from
the requested one.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Immediately after init (probing), the device MAC address is all zeroes.
It should be possible to get a correct MAC address as soon as that,
without need for a dev_configure().
With this patch, a MAC address is set in eth_dev_tap_create()
explicitly. It either comes from the remote if any was configured, or is
randomly generated. In any case, the device MAC address is guaranteed to
be the correct one when the tap netdevice actually gets created in
tun_alloc().
Fixes: f76d46b4ff ("net/tap: add MAC address management")
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Some errors received from the kernel are acceptable, such as a -ENOENT
for a rule deletion (the rule was already no longer existing in the
kernel). Make sure we consider return codes properly. For that,
nl_recv() has been simplified.
qdisc_exists() function is no longer needed as we can check whether the
kernel returned -EEXIST when requiring the qdisc creation. It's simpler
and faster.
Add a few messages for clarity when a netlink error occurs.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
With support for segmented packets, it is now possible to easily receive
packets of many sizes, given an adequate number of descriptors.
Remove limitation on the minimum size of mbuf: on reception, if a packet
won't fit in the queue's mbufs, it will be detected in the packet info
and the packet will be discarded.
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Only full mask (0xffff) is accepted, there is no way to specify a mask
for layer 4 ports to the kernel using TC rules.
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
As tap is a virtual device, there's no physical way a link can be cut.
However, it has an associated kernel netdevice and possibly a remote
netdevice too. These netdevices link status may change outside of the
DPDK scope, through an external command such as:
ip link set dev tapX down
This commit implements link status notification through netlink.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Reflect device link status according to the state of the tap netdevice
and the remote netdevice (if any). If both are UP and RUNNING, then the
device link status is set to ETH_LINK_UP, otherwise ETH_LINK_DOWN.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
By default, a tap netdevice is of no use when not fed by a separate
process. The ability to automatically feed it from another netdevice
allows applications to capture any kind of traffic normally destined to
the kernel stack.
This patch implements this ability through a new optional "remote"
parameter.
Packets matching filtering rules created with the flow API are matched
on the remote device and redirected to the tap PMD, where the relevant
action will be performed.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Support for segmented packets (scatter/gather) is mandatory for most
purposes, regardless of the MTU size. Tx packets are often the result of
mbuf concatenation, and an mbuf is not necessarily large enough for Rx
packets to fit in a single one.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Supported flow rules are now mapped to TC rules on the tap netdevice.
The netlink message used for creating the TC rule is stored in struct
rte_flow. That way, by simply changing a metadata in it, we can require
for the rule deletion without further parsing.
Supported items:
- eth: src and dst (with variable masks), and eth_type (0xffff mask).
- vlan: vid, pcp, tpid, but not eid.
- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
- udp/tcp: src and dst port (0xffff) mask.
Supported actions:
- DROP
- QUEUE
- PASSTHRU
It is generally not possible to provide a "last" item. However, if the
"last" item, once masked, is identical to the masked spec, then it is
supported.
Only IPv4/6 and MAC addresses can use a variable mask. All other
items need a full mask (exact match).
Support for VLAN requires kernel headers >= 4.9, checked using
auto-config.sh.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Each kernel netdevice may have queueing disciplines set for it, which
determine how to handle the packet (mostly on egress). That's part of
the TC (Traffic Control) mechanism.
Through TC, it is possible to set filter rules that match specific
packets, and act according to what is in the rule. This is a perfect
candidate to implement the flow API for the tap PMD, as it has an
associated kernel netdevice automatically.
Each flow API rule will be translated into its TC counterpart.
To leverage TC, it is necessary to communicate with the kernel using
netlink. This patch introduces a library to help that communication.
Inside netlink.c, functions are generic for any netlink messaging.
Inside tcmsgs.c, functions are specific to deal with TC rules.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The flow API provides the ability to classify packets received by a tap
netdevice.
This patch only implements skeleton functions for flow API support, no
patterns are supported yet.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
In the next patch, access to struct pmd_internals will be necessary in
tap_flow.c to store the flows.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
A tap netdevice does not support flow control; ensure nothing but
RTE_FC_NONE mode can be set.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
A tap netdevice actually receives every packet, without any filtering
whatsoever. There is no need for any multicast address registration
to receive multicast packets.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The MTU is assigned to the tap netdevice according to the argument, but
packet transmission and reception just write/read on an fd with the
default limit being the socket buffer size.
As a new rte_eth_dev_data is allocated during tap device init, ensure it
is set again dev->data->mtu.
Once the actual netdevice is created via tun_alloc(), make sure to apply
the desired MTU to the netdevice.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
As soon as the netdevice is created, update pmd->mac_addr with its
actual MAC address.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Create a socket for ioctl at tap device creation instead of opening it
and closing it every call to tap_link_set_flags().
Use a common tap_ioctl() function that can be extended for various uses
(such as MTU change, MAC address change, ...).
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
There is no reason not to support ARP on a tap netdevice. Remove
IFF_NOARP flags.
Focus on IFF_UP when a link status change is required.
Fixes: f457b472b1 ("net/tap: add link up and down operations")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit adds a signal-based trigger to the Rx burst function in order
to avoid unnecessary system calls while Rx queues are empty.
Triggered Rx bursts put less pressure on the kernel, free up CPU resources
for applications and result in a noticeable performance improvement when
sharing CPU threads with other PMDs.
Measuring the traffic forwarding rate between two physical devices in
testpmd (IO mode, single thread, 64B packets) before and after adding two
tap PMD instances (4 ports total) that do not process any traffic and
comparing results yields:
Without Rx trigger:
-15% (--burst=32)
-62% (--burst=1)
With Rx trigger:
-0.3% (--burst=32)
-6% (--burst=1)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Polling the Tx queue file descriptor before writing to it is not mandatory
since it is configured as non-blocking.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Store the device name in dev->data->name, to have symmetrical behavior
between rte_pmd_tap_probe(name) and rte_pmd_tap_remove(name).
The netdevice name (linux interface name) is stored in the name field of
struct pmd_internals.
snprintf(data->name) has been moved closer to the rte_ethdev_allocate()
as it should use the same name.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Calling strncpy with a maximum size argument of 16 bytes on destination
array "ifr.ifr_ifrn.ifrn_name" of size 16 bytes might leave the
destination string unterminated.
Coverity issue: 1407499
Fixes: 6b38b2725c ("net/tap: fix multi-queue support")
Cc: stable@dpdk.org
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Before this patch, the management of dependencies between directories
had several issues:
- the generation of .depdirs, done at configuration is slow: it can take
more than one minute on some slow targets (usually ~10s on a standard
PC without -j).
- for instance, it is possible to express a dependency like:
- app/foo depends on lib/librte_foo
- and lib/librte_foo depends on app/bar
But this won't work because the directories are traversed with a
depth-first algorithm, so we have to choose between doing 'app' before
or after 'lib'.
- the script depdirs-rule.sh is too complex.
- we cannot use "make -d" for debug, because the output of make is used for
the generation of .depdirs.
This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.
After this commit, "make config" is almost immediate.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Fixes: f457b472b1 ("net/tap: add link up and down operations")
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>