Reproduced with '--buildtype=debugoptimized' config,
compiler version: gcc (GCC) 12.0.0 20210509 (experimental)
There are multiple build errors, like:
In file included from ../drivers/net/tap/tap_flow.c:13:
In function ‘rte_jhash_2hashes’,
inlined from ‘rte_jhash’ at ../lib/hash/rte_jhash.h:284:2,
inlined from ‘tap_flow_set_handle’ at
../drivers/net/tap/tap_flow.c:1306:12,
inlined from ‘rss_enable’ at ../drivers/net/tap/tap_flow.c:1909:3,
inlined from ‘priv_flow_process’ at
../drivers/net/tap/tap_flow.c:1228:11:
../lib/hash/rte_jhash.h:238:9:
warning: ‘flow’ may be used uninitialized [-Wmaybe-uninitialized]
238 | __rte_jhash_2hashes(key, length, pc, pb, 1);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/tap/tap_flow.c: In function ‘priv_flow_process’:
../lib/hash/rte_jhash.h:81:1: note: by argument 1 of type ‘const void *’
to ‘__rte_jhash_2hashes.constprop’ declared here
81 | __rte_jhash_2hashes(const void *key, uint32_t length, uint32_t *pc,
| ^~~~~~~~~~~~~~~~~~~
../drivers/net/tap/tap_flow.c:1028:1: note: ‘flow’ declared here
1028 | priv_flow_process(struct pmd_internals *pmd,
| ^~~~~~~~~~~~~~~~~
Fix strict aliasing rule by using union.
Bugzilla ID: 690
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Since rte_flow is the only API for filtering operations,
the legacy driver interface filter_ctrl was too much complicated
for the simple task of getting the struct rte_flow_ops.
The filter type RTE_ETH_FILTER_GENERIC and
the filter operarion RTE_ETH_FILTER_GET are removed.
The new driver callback flow_ops_get replaces filter_ctrl.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The TAP driver does not initialize all the elements of the rte_flow
structure. This can lead to crash in rte_flow_destroy.
(gdb) where
flow=0x100e99280, error=0x0)
at drivers/net/tap/tap_flow.c:1514
(gdb) p remote_flow
$1 = (struct rte_flow *) 0x6b6b6b6b6b6b6b6b
Which is here:
static int
tap_flow_destroy_pmd(struct pmd_internals *pmd,
struct rte_flow *flow,
struct rte_flow_error *error)
{
struct rte_flow *remote_flow = flow->remote_flow;
...
if (remote_flow) {
remote_flow->msg.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
Simplest fix is to use rte_zmalloc() so remote_flow and other fields
are always set at zero.
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some global variables can indeed be static, add static keyword to them.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
fd's cannot be shared between processes, and each process need to have
it's own fd's pointer.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The rte_flow meaning of zero flow mask configuration is to match all
the range of the item value.
For example, the flow eth / ipv4 dst spec 1.2.3.4 dst mask 0.0.0.0
should much all the ipv4 traffic from the rte_flow API perspective.
>From some kernel perspectives the above rule means to ignore all the
ipv4 traffic (e.g. Ubuntu 16.04, 4.15.10).
Due to the fact that the tap PMD should provide the rte_flow meaning,
it is necessary to ignore the spec in case the mask is zero when it
forwards such like flows to the kernel.
So, the above rule should be translated to eth / ipv4 to get the
correct meaning.
Ignore spec configurations when the mask is zero.
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Running testpmd command "flow isolae <port> 0" (i.e. disabling flow
isolation) followed by command "flow isolate <port> 1" (i.e. enabling
flow isolation) may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists
Root cause analysis: when disabling flow isolation we keep the local
rule to redirect packets on TX (TAP_REMOTE_TX index) while we add it
again when enabling flow isolation. As a result this rule is added
two times in a row which results in "File exists" error.
The fix is to identify the "File exists" error and silently ignore it.
Another issue occurs when enabling isolation mode several times in a
row in which case the same tc rules are added consecutively and
rte_flow structs are added to a linked list before removing the
previous rte_flow structs.
The fix is to act upon isolation mode command only when there is a
change from "0" to "1" (or vice versa).
Fixes: f503d2694825 ("net/tap: support flow API isolated mode")
Cc: stable@dpdk.org
Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Use new logging macro to convert all calls to RTE_LOG() into
new dynamic log type.
Also fix whitespace.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.
It breaks ABI compatibility for the following public functions:
- rte_flow_copy()
- rte_flow_create()
- rte_flow_validate()
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.
Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:
Wire: [ ETH TPID = A | VLAN EtherType = B | B DATA ]
rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]
Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:
Wire: [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]
While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.
This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.
It breaks ABI compatibility for the following public functions:
- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()
Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:
- bnxt: EtherType matching is supported with and without VLAN, but TPID
matching is not and triggers an error.
- e1000: EtherType matching is only supported with the ETHERTYPE filter,
which does not support VLAN matching, therefore no impact.
- enic: same as bnxt.
- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
EtherType matching.
- ixgbe: same as e1000, with additional minor change to rely on the new
E-Tag macro definition.
- mlx4: EtherType/TPID matching is not supported, no impact.
- mlx5: same as bnxt.
- mvpp2: same as bnxt.
- sfc: same as bnxt.
- tap: same as bnxt.
Fixes: b1a4b4cbc0a8 ("ethdev: introduce generic flow API")
Fixes: 99e7003831c3 ("net/ixgbe: parse L2 tunnel filter")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
RSS hash types (ETH_RSS_* macros defined in rte_ethdev.h) describe the
protocol header fields of a packet that must be taken into account while
computing RSS.
When facing encapsulated (e.g. tunneled) packets, there is an ambiguity as
to whether these should apply to inner or outer packets. Applications need
the ability to tell exactly "where" RSS must be performed.
This is addressed by adding encapsulation level information to the RSS flow
action. Its default value is 0 and stands for the usual unspecified
behavior. Other values provide a specific encapsulation level.
Contrary to the change announced by commit 676b605182a5 ("doc: announce
ethdev API change for RSS configuration"), this patch does not affect
struct rte_eth_rss_conf but struct rte_flow_action_rss as the former is not
used anymore by the RSS flow action. ABI impact is therefore limited to
rte_flow.
This breaks ABI compatibility for the following public functions:
- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
By definition, RSS involves some kind of hash algorithm, usually Toeplitz.
Until now it could not be modified on a flow rule basis and PMDs had to
always assume RTE_ETH_HASH_FUNCTION_DEFAULT, which remains the default
behavior when unspecified (0).
This breaks ABI compatibility for the following public functions:
- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Since its inception, the rte_flow RSS action has been relying in part on
external struct rte_eth_rss_conf for compatibility with the legacy RSS API.
This structure lacks parameters such as the hash algorithm to use, and more
recently, a method to tell which layer RSS should be performed on [1].
Given struct rte_eth_rss_conf will never be flexible enough to represent a
complete RSS configuration (e.g. RETA table), this patch supersedes it by
extending the rte_flow RSS action directly.
A subsequent patch will add a field to use a non-default RSS hash
algorithm. To that end, a field named "types" replaces the field formerly
known as "rss_hf" and standing for "RSS hash functions" as it was
confusing. Actual RSS hash function types are defined by enum
rte_eth_hash_function.
This patch updates all PMDs and example applications accordingly.
It breaks ABI compatibility for the following public functions:
- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()
[1] commit 676b605182a5 ("doc: announce ethdev API change for RSS
configuration")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch makes the following changes to flow rule actions:
- List order now matters, they are redefined as performed first to last
instead of "all simultaneously".
- Repeated actions are now supported (e.g. specifying QUEUE multiple times
now duplicates traffic among them). Previously only the last action of
any given kind was taken into account.
- No more distinction between terminating/non-terminating/meta actions.
Flow rules themselves are now defined as always terminating unless a
PASSTHRU action is specified.
These changes alter the behavior of flow rules in corner cases in order to
prepare the flow API for actions that modify traffic contents or properties
(e.g. encapsulation, compression) and for which order matter when combined.
Previously one would have to do so through multiple flow rules by combining
PASSTRHU with priority levels, however this proved overly complex to
implement at the PMD level, hence this simpler approach.
This breaks ABI compatibility for the following public functions:
- rte_flow_create()
- rte_flow_validate()
PMDs with rte_flow support are modified accordingly:
- bnxt: no change, implementation already forbids multiple actions and does
not support PASSTHRU.
- e1000: no change, same as bnxt.
- enic: modified to forbid redundant actions, no support for default drop.
- failsafe: no change needed.
- i40e: no change, implementation already forbids multiple actions.
- ixgbe: same as i40e.
- mlx4: modified to forbid multiple fate-deciding actions and drop when
unspecified.
- mlx5: same as mlx4, with other redundant actions also forbidden.
- sfc: same as mlx4.
- tap: implementation already complies with the new behavior except for
the default pass-through modified as a default drop.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Aligning Mellanox SPDX copyrights to a single format.
In addition replace to SPDX licence files which were missed.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Running testpmd command "port stop all" followed by command "port start
all" may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists
Root cause analysis: during the execution of "port start all" command
testpmd calls rte_eth_promiscuous_enable() while during the execution
of "port stop all" command testpmd does not call
rte_eth_promiscuous_disable().
As a result the TAP PMD is trying to add tc (traffic control command)
promiscuous rules to the remote netvsc device consecutively. From the
kernel point of view it is seen as an attempt to add the same rule more
than once. In recent kernels (e.g. version 4.13) this attempt is rejected
with a "File exists" error. In less recent kernels (e.g. version 4.4) the
same rule may have been successfully accepted twice, which is undesirable.
In the corrupted code every tc promiscuous rule included a different
handle number parameter. If instead an identical handle number is
used for all tc promiscuous rules - all kernels will reject the second
identical rule with a "File exists" error, which is easy to identify and
to silently ignore.
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
The following error is reported when compiling 18.02-rc2 using ICC,
"transfer of control bypasses initialization of".
The patch fixes the issue.
Fixes: 1911c5edc6cd ("net/tap: fix eBPF RSS map key handling")
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
This commit addresses a case of RSS and non RSS flows mixture.
In the corrupted code, if a non-RSS flow is destroyed, it has an eBPF map
index 0 (initialized upon flow creation). This might unintentionally remove
RSS entry 0 from eBPF map.
To fix this issue, add an offset to the real index during a KEY_CMD_GET
operation, and subtract this offset during a KEY_CMD_RELEASE operation, in
order to restore the real index.
Thus, if a non RSS flow is falsely trying to release map entry 0 - The
offset subtraction will calculate the real map index as an
out-of-range value, and the release operation will be silently ignored.
Fixes: 036d721a8229 ("net/tap: implement RSS using eBPF")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
When a user creates an RSS rule, the tap PMD dynamically allocates
a 'flow' data structure, and uploads BPF programs (represented by file
descriptors) to the kernel.
The kernel might reject the rule (due to filters overlap, for example)
in which case, flow memory should be freed and BPF file descriptors
should be closed.
In the corrupted code there were scenarios where BPF file descriptors
were not closed.
The fix is to add a new function - tap_flow_free(), which will make sure
to always close BPF file descriptors before freeing the flow allocated
memory.
Fixes: 036d721a8229 ("net/tap: implement RSS using eBPF")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.
TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.
TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.
eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.
TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Add a generic TC actions handling for TC actions: "mirred",
"gact", "skbedit". This will be useful when introducing
BPF actions, as it uses TCA_BPF_ACT instead of TCA_FLOWER_ACT
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Functions like nl_recev and nl_send name clash functions in the
libnl library (https://www.infradead.org/~tgr/libnl/).
All functions declared in tap_netlink.h were decorated with tap_
for consistency.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_flow_error_set() is a convenient helper to initialize error objects.
Since there is no fundamental reason to prevent applications from using it,
expose it through the public interface after modifying its return value
from positive to negative. This is done for consistency with the rest of
the public interface.
Documentation is updated accordingly.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
One of the main identified use cases for the tap PMD is to be used in
combination with the fail-safe PMD as a fallback for a physical device.
Fail-safe is very strict about making sure its current configuration is
properly applied to all slave devices, they get rejected otherwise in
order to maintain a consistent state.
The problem is that tap's RSS support is currently limited to the
default (non-Toeplitz) balancing performed by the kernel on all Rx
queues. While proper RSS support emulation in the tap PMD is a work in
progress, the lack of rte_flow counterpart prevents validation of the
above use case in the meantime.
Given that unlike most PMDs, tap is more about convenience than
performance, support for the RSS action can be temporarily faked with
a minimum amount of code and mostly correct behavior by treating it
like a QUEUE action. Traffic is directed to the first queue of the set.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
This commit fixes two bugs related to tap devices. The first bug occurs
when executing in testpmd the following flow rule assuming tap device has
4 rx and tx pair queues
"flow create 0 ingress pattern eth / end actions queue index 5 / end"
This command will report on success and will print ""Flow rule #0 created"
although it should have failed as queue index number 5 does not exist
The second bug occurs when executing in testpmd "port start all" following
a port configuration. Assuming 1 pair of rx and tx queues an error is
reported: "Fail to start port 0"
Before this commit a fixed max number (16) of rx and tx queue pairs were
created on startup where the file descriptors (fds) of rx and tx pairs were
identical. As a result in the first bug queue index 5 existed because the
tap device was created with 16 rx and tx queue pairs regardless of the
configured number of queues. In the second bug when tap device was started
tx fd was closed before opening it and executing ioctl() on it. However
closing the sole fd of the device caused ioctl to fail with "No such
device".
This commit creates the configured number of rx and tx queue pairs (up to
max 16) and assigns a unique fd to each queue. It was written to solve the
first bug and was found as the right fix for the second bug as well.
Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
Fixes: bf7b7f437b49 ("net/tap: create netdevice during probing")
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Remove checks of Linux kernel version
in order to support kernel with backported features.
the expected behavior with a kernel that doesn't support flower
and other bits is the following:
-flow validate can return successfully
-flow create using the same rule fails.
Using the "remote" feature without kernel flower does not fail silently.
The TAP instance is not initialized if the requested parameters cannot
be satisfied.
it has been tested on an old kernel without required support:
PMD: Kernel refused TC filter rule creation (2): No such file or directory
PMD: tap0 failed to create implicit rules.
PMD: Can't set up remote feature: No such file of directory(2)
PMD: TAP Unable to initialize net_tap0
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Some logs are missing the newline character \n.
The logs using only line can be checked with this command:
git grep 'RTE_LOG(.*".*[^n]"' drivers/net/tap/
Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
Fixes: 268483dc2086 ("net/tap: add preliminary support for flow API")
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Fixes: bf7b7f437b49 ("net/tap: create netdevice during probing")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
With this patch, it is possible to enable or disable the isolate
feature anytime, even immediately after a probe while the tap has not
been configured yet. It will do its job as soon as the netdevice gets
created.
A specific implicit flow rule is created with the lowest priority (all
other flow rules will be evaluated before), at the end of the list. If
isolated mode is enabled, the associated action will be to drop the
packet. Otherwise, the action would be passthrough.
In case of a remote netdevice, implicit rules on it will be removed in
isolated mode, to ensure only actual flow rules redirect packets to the
tap.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
The following two flow rules (testpmd syntax) should not collide:
flow create 0 priority 1 ingress pattern eth / ipv4 / end actions drop / end
flow create 0 priority 1 ingress pattern eth / ipv6 / end actions drop / end
But the eth_type in the associated TC rule was set to either "ip" or
"ipv6". For TC, they could thus not have the same priority.
Use ETH_P_ALL only in the TC message to make sure those rules can
coexist.
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Some errors received from the kernel are acceptable, such as a -ENOENT
for a rule deletion (the rule was already no longer existing in the
kernel). Make sure we consider return codes properly. For that,
nl_recv() has been simplified.
qdisc_exists() function is no longer needed as we can check whether the
kernel returned -EEXIST when requiring the qdisc creation. It's simpler
and faster.
Add a few messages for clarity when a netlink error occurs.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Only full mask (0xffff) is accepted, there is no way to specify a mask
for layer 4 ports to the kernel using TC rules.
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
By default, a tap netdevice is of no use when not fed by a separate
process. The ability to automatically feed it from another netdevice
allows applications to capture any kind of traffic normally destined to
the kernel stack.
This patch implements this ability through a new optional "remote"
parameter.
Packets matching filtering rules created with the flow API are matched
on the remote device and redirected to the tap PMD, where the relevant
action will be performed.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Supported flow rules are now mapped to TC rules on the tap netdevice.
The netlink message used for creating the TC rule is stored in struct
rte_flow. That way, by simply changing a metadata in it, we can require
for the rule deletion without further parsing.
Supported items:
- eth: src and dst (with variable masks), and eth_type (0xffff mask).
- vlan: vid, pcp, tpid, but not eid.
- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
- udp/tcp: src and dst port (0xffff) mask.
Supported actions:
- DROP
- QUEUE
- PASSTHRU
It is generally not possible to provide a "last" item. However, if the
"last" item, once masked, is identical to the masked spec, then it is
supported.
Only IPv4/6 and MAC addresses can use a variable mask. All other
items need a full mask (exact match).
Support for VLAN requires kernel headers >= 4.9, checked using
auto-config.sh.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The flow API provides the ability to classify packets received by a tap
netdevice.
This patch only implements skeleton functions for flow API support, no
patterns are supported yet.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>