When trying to compile with glibc < 2.24 that doesn't
support SOL_NETLINK it will cause compilation failure:
drivers/net/tap/tap_netlink.c:70:17: error:
'SOL_NETLINK' undeclared (first use in this function)
setsockopt(fd, SOL_NETLINK, NETLINK_EXT_ACK, &one, sizeof(one));
The glibc commits adds the SOL_NETLINK support:
https://github.com/bminor/glibc/commit/f9b437d5efce93800b51ad2a437c8b1c9
Fixes: 647909bcf3 ("net/tap: use netlink extended ack support")
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
In recent Linux kernels, there is support for extended acknowledgment
to netlink messages. This is quite useful for diagnosing errors
in configuration in the kernel with TAP.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The tap_nl_recv() function does not need to use the full
complex recvmsg() system call, basic recv() will work here.
Ditto for tap_nl_send() full sendmsg is not needed.
Add logic to retry in case EINTR rather than forcing
error handling back in driver or worse to ethdev API.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The TAP driver does not initialize all the elements of the rte_flow
structure. This can lead to crash in rte_flow_destroy.
(gdb) where
flow=0x100e99280, error=0x0)
at drivers/net/tap/tap_flow.c:1514
(gdb) p remote_flow
$1 = (struct rte_flow *) 0x6b6b6b6b6b6b6b6b
Which is here:
static int
tap_flow_destroy_pmd(struct pmd_internals *pmd,
struct rte_flow *flow,
struct rte_flow_error *error)
{
struct rte_flow *remote_flow = flow->remote_flow;
...
if (remote_flow) {
remote_flow->msg.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
Simplest fix is to use rte_zmalloc() so remote_flow and other fields
are always set at zero.
Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The fd is possibly a negative value while it is passed as an
argument to function "close". Fix the check to the fd.
Fixes: ed8132e7c9 ("net/tap: move fds of queues to be in process private")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The nic's interrupt source has some active handler, which maybe call
tap_dev_intr_handler() to set link handler. We should cancel the link
handler before close fd to prevent executing the link handler. It
triggers segfault.
Call Trace:
0x00007f15e08dad99 in __rte_panic (Error adding fd %d epoll_ctl, %s\n")
0x00007f15e08e9b87 in eal_intr_thread_main ()
0x00007f15e249be15 in start_thread ()
0x00007f15d5322f9d in clone ()
Fixes: c0bddd3a05 ("net/tap: add link status notification")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When eth_dev_tap_create() is failed, nlsk_fd and ka_fd won't be closed
thus leading fds leak. Zero is a valid fd. Ultimately leads to a valid
fd was closed by mistake.
Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Fixes: cb7e68da63 ("net/tap: fix cleanup on allocation failure")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The internal structure is freed and set to NULL in the
rte_eth_dev_release_port() and zero is a valid fd. Ultimately
leads to a valid fd was closed by mistake.
Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Now the rxq->pool is mbuf concatenation, but its nb_segs is 1. When
conducting some sanity checks on the mbuf with debug enabled, it fails.
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
For the tap PMD, we should release mbufs and iovecs from the Rx queue
when closing device. In order to remove duplicated code,
rte_pmd_tap_remove() calls tap_dev_close().
Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
When the tap_write_mbufs() function return with break, mbuf was freed
without increasing num_packets, which could cause applications to free
the mbuf again. And the pmd_tx_burst() function should returns the
number of original packets it actually sent excluding tso mbufs.
Fixes: 9396ad3346 ("net/tap: fix reported number of Tx packets")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The assert checks is not necessary, the gso_ctx is always non-NULL.
Fixes: 050316a883 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The PMD logtype is legacy and drivers should use their own logtype.
Fixes: 050316a883 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
There is a common macro __rte_packed for packing structs,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a common macro __rte_aligned for alignment,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
The return check of function tap_lsc_intr_handle_set() is wrong, it should
be 0 or a positive number if success. So the intr_handle->intr_vec was not
been freed when tap_lsc_intr_handle_set() returned a positive number.
Fixes: 4870a8cdd9 ("net/tap: support Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Merge all versions in linker version script files to DPDK_20.0.
This commit was generated by running the following command:
:~/DPDK$ buildtools/update-abi.sh 20.0
Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Since the library versioning for both stable and experimental ABI's is
now managed globally, the LIBABIVER and version variables no longer
serve any useful purpose, and can be removed.
The replacement in Makefiles was done using the following regex:
^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)?
(LIBABIVER := numbers, optionally preceded by a comment and optionally
succeeded by an empty line)
The replacement for meson files was done using the following regex:
^(#.*\n)?version\s*=\s*\d+\n(\s*\n)?
(version = numbers, optionally preceded by a comment and optionally
succeeded by an empty line)
[David]: those variables are manually removed for the files:
- drivers/common/qat/Makefile
- lib/librte_eal/meson.build
[David]: the LIBABIVER is restored for the external ethtool example
library.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
When OS sends more packets than are being read with a single
'rte_eth_rx_burst' call, rx packets are getting stucked in the tap pmd
and are unable to receive, because trigger_seen is getting updated
and consecutive calls are not getting any packets.
Do not update trigger_seen unless less than a max number of packets were
received allowing next call to receive the rest.
Remove unnecessary compiler barrier.
Fixes: a0d8e807d9 ("net/tap: add Rx trigger")
Cc: stable@dpdk.org
Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Tested-by: Mariusz Drost <mariuszx.drost@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
Enabling/disabling of allmulticast mode is not always successful and
it should be taken into account to be able to handle it properly.
When correct return status is unclear from driver code, -EAGAIN is used.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Change return value of the callbacks from void to int. Make
implementations across all drivers return negative errno
values in case of error conditions.
Both callbacks are updated together because a large number of
drivers assign the same function to both callbacks.
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Enabling/disabling of promiscuous mode is not always successful and
it should be taken into account to be able to handle it properly.
When correct return status is unclear from driver code, -EAGAIN is used.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Change eth_dev_infos_get_t return value from void to int.
Make eth_dev_infos_get_t implementations across all drivers to return
negative errno values if case of error conditions.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The rte_vdev_drivers are declared twice.
The first one is not necessary.
Fixes: 740feaf349 ("ethdev: remove driver name from device private data")
Fixes: 204d026a39 ("net/tap: support tun")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
For each driver where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Transmit errors must not be reported in q_errors[] which is for
reception.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.
For secondary processes, IPC will be crucial, so leave all
of the error handling as is.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Since we change these macros, we might as well avoid triggering complaints
from checkpatch because of mixed case.
old=RTE_IPv4
new=RTE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
old=RTE_ETHER_TYPE_IPv4
new=RTE_ETHER_TYPE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
old=RTE_ETHER_TYPE_IPv6
new=RTE_ETHER_TYPE_IPV6
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
This source is compiled as a bpf program out of dpdk.
Revert to structures/macros coming from libc.
Fixes: 6d13ea8e8e ("net: add rte prefix to ether structures")
Fixes: 24ac604ef7 ("net: add rte prefix to IP defines")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When secondary to primary process synchronization occurs
there is no check for number of fds which could cause buffer overrun.
Bugzilla ID: 252
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When sending synchronous IPC requests, the caller must free the response
buffer if the request was successful and reply is no longer needed.
Fix the code to correctly
use the IPC API.
Bugzilla ID: 228
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
A successful call to rte_mp_request_sync does not guarantee that there
are any messages in the buffer, and this should be checked for before
accessing data in the message. Buffer can be empty if IPC is disabled or
if we decide to ignore replies.
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
For files that already have rte_string_fns.h included in them, we can
do a straight replacement of snprintf(..."%s",...) with strlcpy. The
changes in this patch were auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy-with-header.cocci --dir . --in-place
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
If the value _SC_IOV_MAX is missing, sysconf returns -1.
In this case, iov_max is set to a default value of 1024.
This should never happen except for redhat bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1504165
Fixes: ec12df9504 ("net/tap: fix support for large Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Oleg Polyakov <olegp123@walla.co.il>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Shifting signed 32-bit values by 31-bits has the potential for
unexpected outcomes as compiler can overwrite a bit.
Specified that values are unsigned.
Errors are observed from running cppcheck.
Bugzilla ID: 58
Fixes: 69e209be54 ("net/axgbe: add register map and related macros")
Fixes: b5bf771922 ("bnx2x: driver support routines")
Fixes: ed2ced6fe9 ("net/bnxt: check initialization before accessing stats")
Fixes: 6fda3f0ddd ("net/cxgbe: add API to program hardware MPS table")
Fixes: bdb244b969 ("e1000: whitespace changes")
Fixes: 5a32a257f9 ("e1000: more NICs in base driver")
Fixes: 2fe669f4bc ("net/nfp: support MAC address change")
Fixes: defb9a5dd1 ("nfp: introduce driver initialization")
Fixes: ec94dbc573 ("qede: add base driver")
Fixes: d2e7d931d0 ("net/qede/base: formatting changes")
Fixes: cdc07e83bb ("net/tap: add eBPF program file")
Cc: stable@dpdk.org
Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The structure was not initialized.
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Printing pointer in log is uninformative (unless in a debugger),
instead print the assigned kernel device name which correlates
well with what TAP is doing.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Having a global variable which is set to "TUN" or "TAP" during
probe is a potential bug if probing is ever done in different
processes or contexts. Let's fix it now by using existing enum
that has type of connection.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
Assigning tun and tap index in DPDK tap device driver is racy
and fails if used with primary/secondary. Instead use the kernel
feature of device wildcarding where if a name with %d is used
the kernel will fill in the next available device.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Reported-by: Haifeng Li <hfli@netitest.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
Any messages that normally occur during probe should be at DEBUG
level (not NOTICE). This reduces overall log clutter.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
If interface name is passed to remote or iface then check
the length and for invalid characters. This avoids problems where
name gets truncated or rejected by kernel.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>