Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.
For secondary processes, IPC will be crucial, so leave all
of the error handling as is.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Since we change these macros, we might as well avoid triggering complaints
from checkpatch because of mixed case.
old=RTE_IPv4
new=RTE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
old=RTE_ETHER_TYPE_IPv4
new=RTE_ETHER_TYPE_IPV4
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
old=RTE_ETHER_TYPE_IPv6
new=RTE_ETHER_TYPE_IPV6
git grep -lw $old | xargs sed -i -e "s/\<$old\>/$new/g"
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
This source is compiled as a bpf program out of dpdk.
Revert to structures/macros coming from libc.
Fixes: 6d13ea8e8e ("net: add rte prefix to ether structures")
Fixes: 24ac604ef7 ("net: add rte prefix to IP defines")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When secondary to primary process synchronization occurs
there is no check for number of fds which could cause buffer overrun.
Bugzilla ID: 252
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When sending synchronous IPC requests, the caller must free the response
buffer if the request was successful and reply is no longer needed.
Fix the code to correctly
use the IPC API.
Bugzilla ID: 228
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
A successful call to rte_mp_request_sync does not guarantee that there
are any messages in the buffer, and this should be checked for before
accessing data in the message. Buffer can be empty if IPC is disabled or
if we decide to ignore replies.
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
For files that already have rte_string_fns.h included in them, we can
do a straight replacement of snprintf(..."%s",...) with strlcpy. The
changes in this patch were auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy-with-header.cocci --dir . --in-place
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
If the value _SC_IOV_MAX is missing, sysconf returns -1.
In this case, iov_max is set to a default value of 1024.
This should never happen except for redhat bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1504165
Fixes: ec12df9504 ("net/tap: fix support for large Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Oleg Polyakov <olegp123@walla.co.il>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Shifting signed 32-bit values by 31-bits has the potential for
unexpected outcomes as compiler can overwrite a bit.
Specified that values are unsigned.
Errors are observed from running cppcheck.
Bugzilla ID: 58
Fixes: 69e209be54 ("net/axgbe: add register map and related macros")
Fixes: b5bf771922 ("bnx2x: driver support routines")
Fixes: ed2ced6fe9 ("net/bnxt: check initialization before accessing stats")
Fixes: 6fda3f0ddd ("net/cxgbe: add API to program hardware MPS table")
Fixes: bdb244b969 ("e1000: whitespace changes")
Fixes: 5a32a257f9 ("e1000: more NICs in base driver")
Fixes: 2fe669f4bc ("net/nfp: support MAC address change")
Fixes: defb9a5dd1 ("nfp: introduce driver initialization")
Fixes: ec94dbc573 ("qede: add base driver")
Fixes: d2e7d931d0 ("net/qede/base: formatting changes")
Fixes: cdc07e83bb ("net/tap: add eBPF program file")
Cc: stable@dpdk.org
Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The structure was not initialized.
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Printing pointer in log is uninformative (unless in a debugger),
instead print the assigned kernel device name which correlates
well with what TAP is doing.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Having a global variable which is set to "TUN" or "TAP" during
probe is a potential bug if probing is ever done in different
processes or contexts. Let's fix it now by using existing enum
that has type of connection.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
Assigning tun and tap index in DPDK tap device driver is racy
and fails if used with primary/secondary. Instead use the kernel
feature of device wildcarding where if a name with %d is used
the kernel will fill in the next available device.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Reported-by: Haifeng Li <hfli@netitest.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
Any messages that normally occur during probe should be at DEBUG
level (not NOTICE). This reduces overall log clutter.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
If interface name is passed to remote or iface then check
the length and for invalid characters. This avoids problems where
name gets truncated or rejected by kernel.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
The code for set_interface_name was incorrectly assuming that
space for null byte was necessary with snprintf/strlcpy.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
snprintf is not needed here, use strlcpy instead.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by Keith Wiles <keith.wiles@intel.com>
The checksum calculation APIs take only the packet headers pointers as
parameters, so they assume that the lengths reported in those headers
are correct. However, a malicious packet could claim to be far larger
than it is, so we need to check the header lengths in the driver before
calling the checksum API.
A better fix would be to allow the lengths to be passed into the API
function, but that would be an API break, so fixing in TAP driver for
now.
Fixes: 8ae3023387 ("net/tap: add Rx/Tx checksum offload support")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
In scenarios for multiq or flowq setup failure
`rte_eth_dev_probing_finish()` has to be invoked for successful device
registration.
Fixes: fbe90cdd77 ("ethdev: add probing finish function")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Static analysis tools don't like the fact that fd could be zero
in the error path. This won't happen in real world because
stdin would have to be closed, then other error occurring.
Coverity issue: 14079
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Some global variables can indeed be static, add static keyword to them.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Disabled tap build in FreeBSD because it is not supported
Added changes to enable tap build if it is Linux OS and
disable in FreeBSD.
Fixes: 095cae3668 ("net/tap: add in meson build")
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
After previous changes, the function rte_eth_dev_release_port()
can be used for primary or secondary process as well.
The only difference with rte_eth_dev_release_port_secondary()
is the shared lock used in rte_eth_dev_release_port().
The function rte_eth_dev_release_port_secondary() was recently
added in 18.11 cycle.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
This is a clean-up of common ethdev data freeing.
All data freeing are moved to rte_eth_dev_release_port()
and done only in case of primary process.
It is probably fixing some memory leaks for PMDs which were
not freeing all data.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
In the case the device is created by the primary process,
the secondary must request some file descriptors to attach the queues.
The file descriptors are shared via IPC Unix socket.
Thanks to the IPC synchronization, the secondary process
is now able to do Rx/Tx on a TAP created by the primary process.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
fd's cannot be shared between processes, and each process need to have
it's own fd's pointer.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Port and queue ids are added to easily map the file
descriptors stored in each process private.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When writev fails to send packets it doesn't update the
number of Tx packets, but it still num_tx is updated.
The value that should be returned is the actual number
of sent packets which is num_packets.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
CC: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Attach port from secondary should ignore devargs since the private
device is not necessary to support. Also previously, detach port on
a secondary process will mess primary process and cause the same
device can't be attached back again. A secondary process should use
rte_eth_dev_release_port_secondary to release a port.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.
PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.
Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
The rte_flow meaning of zero flow mask configuration is to match all
the range of the item value.
For example, the flow eth / ipv4 dst spec 1.2.3.4 dst mask 0.0.0.0
should much all the ipv4 traffic from the rte_flow API perspective.
>From some kernel perspectives the above rule means to ignore all the
ipv4 traffic (e.g. Ubuntu 16.04, 4.15.10).
Due to the fact that the tap PMD should provide the rte_flow meaning,
it is necessary to ignore the spec in case the mask is zero when it
forwards such like flows to the kernel.
So, the above rule should be translated to eth / ipv4 to get the
correct meaning.
Ignore spec configurations when the mask is zero.
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Calling rte_eth_dev_info_get() on secondary process cause a crash
because eth_dev->device is not set properly.
Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Set the rx and tx queue state appropriately when the queues or device
are started or stopped.
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
A constructor is usually declared with RTE_INIT* macros.
As it is a static function, no need to declare before its definition.
The macro is used directly in the function definition.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
This commit implements TCP segmentation offload in TAP.
librte_gso library is used to segment large TCP payloads (e.g. packets
of 64K bytes size) into smaller MTU size buffers.
By supporting TSO offload capability in software a TAP device can be used
as a failsafe sub device and be paired with another PCI device which
supports TSO capability in HW.
For more details on librte_gso implementation please refer to dpdk
documentation.
The number of newly generated TCP TSO segments is limited to 64.
Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Prior to this commit IP/UDP/TCP checksum offload calculations
were skipped in case of a multi segments packet.
This commit enables TAP checksum calculations for multi segments
packets.
The only restriction is that the first segment must contain
headers of layers 3 (IP) and 4 (UDP or TCP)
Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Updating the logic to reflect unsigned integer as index for TAP PMD.
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The TAP keep-alive queue was created in order to keep the TAP device
in Linux even in case all of its Rx/Tx queues are released (in Linux
terminology: even in case all of the TAP device file descriptors are
closed), however, the keep-alive queue itself is attached to the TAP
device like all other Rx/Tx queues and therefore the kernel will
enqueue to it some Rx packets based on the kernel RSS distribution
rules. Those packets are unknown to the application and will remain
lost in the keep-alive queue.
All queues are attached by default to the TAP device after they are
created though TUNSETIFF ioctl call.
The fix is to detach the keep-alive queue after its creation through
TUNSETQUEUE ioctl call.
Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>