Commit Graph

235 Commits

Author SHA1 Message Date
Stephen Hemminger
71f8bd4e60 net/tap: define offload capabilities constants
Since the offload values are always the same, these can
just be data instead of code.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2021-07-02 19:03:03 +02:00
Stephen Hemminger
d777cc7f21 net/tap: remove useless offload capability functions
Since these always return 0, they were doing nothing useful.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2021-07-02 19:03:03 +02:00
Olivier Matz
45a08ef55e net: introduce functions to verify L4 checksums
Since commit d5df2ae042 ("net: fix unneeded replacement of TCP
checksum 0"), the functions rte_ipv4_udptcp_cksum() and
rte_ipv6_udptcp_cksum() can return either 0x0000 or 0xffff when used to
verify a packet containing a valid checksum.

Since these functions should be used to calculate the checksum to set in
a packet, introduce 2 new helpers for checksum verification. They return
0 if the checksum is valid in the packet.

Use this new helper in net/tap driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-02 19:03:03 +02:00
Olivier Matz
375474b4d9 net/tap: fix Rx checksum flags on TCP packets
Since commit d5df2ae042 ("net: fix unneeded replacement of TCP
checksum 0"), the functions rte_ipv4_udptcp_cksum() or
rte_ipv6_udptcp_cksum() can return either 0x0000 or 0xffff when used to
verify a packet containing a valid checksum.

This new behavior broke the checksum verification in tap driver for TCP
packets: these packets are marked with PKT_RX_L4_CKSUM_BAD.

Fix this by checking the 2 possible values. A next commit will introduce
a checksum verification helper to simplify this a bit.

Fixes: d5df2ae042 ("net: fix unneeded replacement of TCP checksum 0")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-02 19:03:03 +02:00
Olivier Matz
a46bd97f60 net/tap: fix Rx checksum flags on IP options packets
When packet type is IPV4_EXT, the checksum is always marked as good in
the mbuf offload flags.

Since we know the header lengths, we can easily call
rte_ipv4_udptcp_cksum() in this case too.

Fixes: 8ae3023387 ("net/tap: add Rx/Tx checksum offload support")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-02 19:03:03 +02:00
Ferruh Yigit
a625ab89df net/tap: fix build with GCC 11
Reproduced with '--buildtype=debugoptimized' config,
compiler version: gcc (GCC) 12.0.0 20210509 (experimental)

There are multiple build errors, like:
In file included from ../drivers/net/tap/tap_flow.c:13:
In function ‘rte_jhash_2hashes’,
    inlined from ‘rte_jhash’ at ../lib/hash/rte_jhash.h:284:2,
    inlined from ‘tap_flow_set_handle’ at
	../drivers/net/tap/tap_flow.c:1306:12,
    inlined from ‘rss_enable’ at ../drivers/net/tap/tap_flow.c:1909:3,
    inlined from ‘priv_flow_process’ at
	../drivers/net/tap/tap_flow.c:1228:11:
../lib/hash/rte_jhash.h:238:9:
	warning: ‘flow’ may be used uninitialized [-Wmaybe-uninitialized]
  238 |         __rte_jhash_2hashes(key, length, pc, pb, 1);
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/tap/tap_flow.c: In function ‘priv_flow_process’:
../lib/hash/rte_jhash.h:81:1: note: by argument 1 of type ‘const void *’
	to ‘__rte_jhash_2hashes.constprop’ declared here
 81 | __rte_jhash_2hashes(const void *key, uint32_t length, uint32_t *pc,
    | ^~~~~~~~~~~~~~~~~~~
../drivers/net/tap/tap_flow.c:1028:1: note: ‘flow’ declared here
 1028 | priv_flow_process(struct pmd_internals *pmd,
      | ^~~~~~~~~~~~~~~~~

Fix strict aliasing rule by using union.

Bugzilla ID: 690
Fixes: de96fe68ae ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2021-05-12 14:54:16 +02:00
David Marchand
eeded2044a log: register with standardized names
Let's try to enforce the convention where most drivers use a pmd. logtype
with their class reflected in it, and libraries use a lib. logtype.

Introduce two new macros:
- RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is
  used in a component. It is associated to the default name provided
  by the build system,
- RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used,
  and then the passed name is appended to the default name,

RTE_LOG_REGISTER is left untouched for existing external users
and for components that do not comply with the convention.

There is a new Meson variable log_prefix to adapt the default name
for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes.

Note: achieved with below commands + reverted change on net/bonding +
edits on crypto/virtio, compress/mlx5, regex/mlx5

$ git grep -l RTE_LOG_REGISTER drivers/ |
  while read file; do
    pattern=${file##drivers/};
    class=${pattern%%/*};
    pattern=${pattern#$class/};
    drv=${pattern%%/*};
    case "$class" in
      baseband) pattern=pmd.bb.$drv;;
      bus) pattern=bus.$drv;;
      mempool) pattern=mempool.$drv;;
      *) pattern=pmd.$class.$drv;;
    esac
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

$ git grep -l RTE_LOG_REGISTER lib/ |
  while read file; do
    pattern=${file##lib/};
    pattern=lib.${pattern%%/*};
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-05-11 15:17:55 +02:00
Chengchang Tang
8f3ca7f9a8 net/tap: check ioctl on restore
After restoring the remote states, the return value of ioctl() is not
checked. Therefore, users cannot know whether the remote state is
restored successfully.

This patch add log for restoring failure.

Fixes: 4810d3af83 ("net/tap: restore state of remote device when closing")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-29 15:34:39 +02:00
Chengchang Tang
2ce7bc9634 net/tap: fix interrupt vector array size
The size of the current interrupt vector array is fixed to an integer.

This patch will create an interrupt vector array based on the number
of rxqs.

Fixes: 4870a8cdd9 ("net/tap: support Rx interrupt")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-26 17:30:42 +02:00
Bruce Richardson
4ad4b20a79 drivers: change indentation in build files
Switch from using tabs to 4 spaces for meson.build indentation.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-21 14:04:09 +02:00
Thomas Monjalon
fb7ad441d4 ethdev: replace callback getting filter operations
Since rte_flow is the only API for filtering operations,
the legacy driver interface filter_ctrl was too much complicated
for the simple task of getting the struct rte_flow_ops.

The filter type RTE_ETH_FILTER_GENERIC and
the filter operarion RTE_ETH_FILTER_GET are removed.
The new driver callback flow_ops_get replaces filter_ctrl.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-26 18:37:13 +01:00
Bruce Richardson
df96fd0d73 ethdev: make driver-only headers private
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
2021-01-29 20:59:09 +01:00
Thomas Monjalon
135155a836 build: align wording of non-support reasons
Reasons for building not supported generally start with lowercase
because printed as the second part of a line.

Other changes:
	- "linux" should be "Linux" with a capital letter.
	- ARCH_X86_64 may be simply x86_64.
	- aarch64 is preferred over arm64.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-11-20 16:05:35 +01:00
Michael Pfeiffer
8aab74c0b9 net/tap: allow all-zero checksum for UDP over IPv4
Unlike TCP, UDP checksums are optional and may be zero to indicate "not
set" [RFC 768] (except for IPv6, where this prohibited [RFC 8200]). Add
this special case to the checksum offload emulation in net/tap.

Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-13 19:43:27 +01:00
Yi Yang
c0d002aed9 gso: fix mbuf freeing responsibility
rte_gso_segment decreased refcnt of pkt by one, but
it is wrong if pkt is external mbuf, pkt won't be
freed because of incorrect refcnt, the result is
application can't allocate mbuf from mempool because
mbufs in mempool are run out of.

One correct way is application should call
rte_pktmbuf_free after calling rte_gso_segment to free
pkt explicitly. rte_gso_segment must not handle it, this
should be responsibility of application.

This commit changed rte_gso_segment in functional behavior
and return value, so the application must take appropriate
actions according to return values, "ret < 0" means it
should free and drop 'pkt', "ret == 0" means 'pkt' isn't
GSOed but 'pkt' can be transmitted as a normal packet,
"ret > 0" means 'pkt' has been GSOed into two or multiple
segments, it should use "pkts_out" to transmit these
segments. The application must free 'pkt' after call
rte_gso_segment when return value isn't equal to 0.

Fixes: 119583797b ("gso: support TCP/IPv4 GSO")
Cc: stable@dpdk.org

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-11-03 22:45:02 +01:00
Bruce Richardson
63b3907833 build: remove library name from version map file name
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2020-10-19 22:13:59 +02:00
Ferruh Yigit
f30e69b41f ethdev: add device flag to bypass auto-filled queue xstats
Queue stats are stored in 'struct rte_eth_stats' as array and array size
is defined by 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag.

As a result of technical board discussion, decided to remove the queue
statistics from 'struct rte_eth_stats' in the long term.

Instead PMDs should represent the queue statistics via xstats, this
gives more flexibility on the number of the queues supported.

Currently queue stats in the xstats are filled by ethdev layer, using
some basic stats, when queue stats removed from basic stats the
responsibility to fill the relevant xstats will be pushed to the PMDs.

During the switch period, temporary 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS'
device flag is created. Initially all PMDs using xstats set this flag.
The PMDs implemented queue stats in the xstats should clear the flag.

When all PMDs switch to the xstats for the queue stats, queue stats
related fields from 'struct rte_eth_stats' will be removed, as well as
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' flag.
Later 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag also can be
removed.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-16 23:27:15 +02:00
Ivan Ilchenko
62024eb827 ethdev: change stop operation callback to return int
Change eth_dev_stop_t return value from void to int.
Make eth_dev_stop_t implementations across all drivers to return
negative errno values if case of error conditions.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-16 22:26:41 +02:00
Thomas Monjalon
0607dadf98 ethdev: reset all when releasing a port
The function rte_eth_dev_release_port() is partially resetting
the struct rte_eth_dev. The drivers were completing this reset
with more pointers set to NULL in the close or remove operations.

More pointers are reset at ethdev level,
and some redundant assignments are removed from PMDs.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2020-10-16 22:26:41 +02:00
Michael Pfeiffer
9863627f52 net: add function to calculate IPv4 header length
Add a function to calculate the length of an IPv4 header as suggested
on the mailing list [1]. Call where appropriate.

[1] https://mails.dpdk.org/archives/dev/2020-October/184471.html

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-16 19:48:17 +02:00
Thomas Monjalon
fbd1913561 ethdev: remove old close behaviour
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.

The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
	- trigger event callback
	- reset state and few pointers
	- free all generic port resources

The private port resources must be released in the .dev_close callback.

The .remove callback should:
	- call .dev_close callback
	- call rte_eth_dev_release_port()
	- free multi-port device shared resources

Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
	ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.

* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.

* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.

* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.

* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.

* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:14 +02:00
Yunjian Wang
23d8e85e44 net/tap: release port upon close
The flag RTE_ETH_DEV_CLOSE_REMOVE is set so all port resources
can be freed by rte_eth_dev_close().

Freeing of private port resources is moved
from the ".remove(device)" to the ".dev_close(port)" operation.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:14 +02:00
Thomas Monjalon
b142387b07 ethdev: allow drivers to return error on close
The device operation .dev_close was returning void.
This driver interface is changed to return an int.

Note that the API rte_eth_dev_close() is still returning void,
although a deprecation notice is pending to change it as well.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:13 +02:00
Yunjian Wang
05a8e4a35d net/tap: free mempool when closing
When setup tx queues, we will create a mempool for the 'gso_ctx'.
The mempool is not freed when closing tap device. If free the tap
device and create it with different name, it will create a new
mempool. This maybe cause an OOM.

The snprintf function return value is not checked and the mempool
name may be truncated. This patch also fix it.

Fixes: 050316a883 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:10 +02:00
Stephen Hemminger
2cecd7f974 net/tap: avoid using SIGIO
SIGIO maybe used by application, instead choose another rt-signal.
Linux allows any signal to be used for signal based IO.
Search for an unused signal in the available rt-signal range.

Add more error checking for fcntl and signal handling.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:06 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Thomas Monjalon
4f86c0ba19 version: 20.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-08-12 11:32:16 +02:00
Jerin Jacob
9c99878aa1 log: introduce logtype register macro
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.

It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-07-03 15:52:51 +02:00
Raslan Darawsheh
53601aefc4 net/tap: fix build for glibc < 2.24
When trying to compile with glibc < 2.24 that doesn't
support SOL_NETLINK it will cause compilation failure:

drivers/net/tap/tap_netlink.c:70:17: error:
 'SOL_NETLINK' undeclared (first use in this function)
  setsockopt(fd, SOL_NETLINK, NETLINK_EXT_ACK, &one, sizeof(one));

The glibc commits adds the SOL_NETLINK support:
https://github.com/bminor/glibc/commit/f9b437d5efce93800b51ad2a437c8b1c9

Fixes: 647909bcf3 ("net/tap: use netlink extended ack support")

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-05-11 22:27:39 +02:00
Stephen Hemminger
647909bcf3 net/tap: use netlink extended ack support
In recent Linux kernels, there is support for extended acknowledgment
to netlink messages. This is quite useful for diagnosing errors
in configuration in the kernel with TAP.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 22:27:39 +02:00
Stephen Hemminger
6d81801d84 net/tap: simplify netlink send/receive functions
The tap_nl_recv() function does not need to use the full
complex recvmsg() system call, basic recv() will work here.

Ditto for tap_nl_send() full sendmsg is not needed.

Add logic to retry in case EINTR rather than forcing
error handling back in driver or worse to ethdev API.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 22:27:39 +02:00
Stephen Hemminger
8451387df2 net/tap: fix crash in flow destroy
The TAP driver does not initialize all the elements of the rte_flow
structure. This can lead to crash in rte_flow_destroy.

(gdb) where
    flow=0x100e99280, error=0x0)
    at drivers/net/tap/tap_flow.c:1514

(gdb) p remote_flow
$1 = (struct rte_flow *) 0x6b6b6b6b6b6b6b6b

Which is here:
static int
tap_flow_destroy_pmd(struct pmd_internals *pmd,
		     struct rte_flow *flow,
		     struct rte_flow_error *error)
{
	struct rte_flow *remote_flow = flow->remote_flow;
...
	if (remote_flow) {
		remote_flow->msg.nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;

Simplest fix is to use rte_zmalloc() so remote_flow and other fields
are always set at zero.

Fixes: 2bc06869cd ("net/tap: add remote netdevice traffic capture")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-05-11 22:27:39 +02:00
Yunjian Wang
13b698d11f net/tap: fix queues fd check before close
The fd is possibly a negative value while it is passed as an
argument to function "close". Fix the check to the fd.

Fixes: ed8132e7c9 ("net/tap: move fds of queues to be in process private")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-04-21 13:57:09 +02:00
Yunjian Wang
dc1a4d86c6 net/tap: fix unexpected link handler
The nic's interrupt source has some active handler, which maybe call
tap_dev_intr_handler() to set link handler. We should cancel the link
handler before close fd to prevent executing the link handler. It
triggers segfault.

Call Trace:
   0x00007f15e08dad99 in __rte_panic (Error adding fd %d epoll_ctl, %s\n")
   0x00007f15e08e9b87 in eal_intr_thread_main ()
   0x00007f15e249be15 in start_thread ()
   0x00007f15d5322f9d in clone ()

Fixes: c0bddd3a05 ("net/tap: add link status notification")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-04-21 13:57:08 +02:00
Yunjian Wang
cbf8909b3a net/tap: fix fd leak on creation failure
When eth_dev_tap_create() is failed, nlsk_fd and ka_fd won't be closed
thus leading fds leak. Zero is a valid fd. Ultimately leads to a valid
fd was closed by mistake.

Fixes: bf7b7f437b ("net/tap: create netdevice during probing")
Fixes: cb7e68da63 ("net/tap: fix cleanup on allocation failure")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:08 +02:00
Yunjian Wang
f9d5da4ab6 net/tap: fix file close on remove
The internal structure is freed and set to NULL in the
rte_eth_dev_release_port() and zero is a valid fd. Ultimately
leads to a valid fd was closed by mistake.

Fixes: 3101191c63 ("net/tap: fix device removal when no queue exist")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:08 +02:00
Yunjian Wang
cc6cf04f59 net/tap: fix check for mbuf number of segment
Now the rxq->pool is mbuf concatenation, but its nb_segs is 1. When
conducting some sanity checks on the mbuf with debug enabled, it fails.

Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:08 +02:00
Yunjian Wang
710aa42790 net/tap: fix mbuf and mem leak during queue release
For the tap PMD, we should release mbufs and iovecs from the Rx queue
when closing device. In order to remove duplicated code,
rte_pmd_tap_remove() calls tap_dev_close().

Fixes: 0781f5762c ("net/tap: support segmented mbufs")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:08 +02:00
Yunjian Wang
24cb500c17 net/tap: fix mbuf double free when writev fails
When the tap_write_mbufs() function return with break, mbuf was freed
without increasing num_packets, which could cause applications to free
the mbuf again. And the pmd_tx_burst() function should returns the
number of original packets it actually sent excluding tso mbufs.

Fixes: 9396ad3346 ("net/tap: fix reported number of Tx packets")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-21 13:57:08 +02:00
Yunjian Wang
252566dab5 net/tap: remove unused assert
The assert checks is not necessary, the gso_ctx is always non-NULL.

Fixes: 050316a883 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-04-21 13:57:06 +02:00
Stephen Hemminger
bd3b90d53a net/tap: do not use PMD log type
The PMD logtype is legacy and drivers should use their own logtype.

Fixes: 050316a883 ("net/tap: support TSO (TCP Segment Offload)")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-04-21 13:57:05 +02:00
Thomas Monjalon
ef5baf3486 replace packed attributes
There is a common macro __rte_packed for packing structs,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:16:46 +02:00
Thomas Monjalon
f35e5b3e07 replace alignment attributes
There is a common macro __rte_aligned for alignment,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-04-16 18:16:18 +02:00
Pavan Nikhilesh
acec04c4b2 build: disable experimental API check internally
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-04-14 16:22:34 +02:00
Yunjian Wang
c5f9911d34 net/tap: fix memory leak when unregister intr handler
The return check of function tap_lsc_intr_handle_set() is wrong, it should
be 0 or a positive number if success. So the intr_handle->intr_vec was not
been freed when tap_lsc_intr_handle_set() returned a positive number.

Fixes: 4870a8cdd9 ("net/tap: support Rx interrupt")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-02-05 09:51:19 +01:00
Pawel Modrak
85ff364f3b build: align symbols with global ABI version
Merge all versions in linker version script files to DPDK_20.0.

This commit was generated by running the following command:

:~/DPDK$ buildtools/update-abi.sh 20.0

Signed-off-by: Pawel Modrak <pawelx.modrak@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Anatoly Burakov
fbaf943887 build: remove individual library versions
Since the library versioning for both stable and experimental ABI's is
now managed globally, the LIBABIVER and version variables no longer
serve any useful purpose, and can be removed.

The replacement in Makefiles was done using the following regex:

	^(#.*\n)?LIBABIVER\s*:=\s*\d+\n(\s*\n)?

(LIBABIVER := numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

The replacement for meson files was done using the following regex:

	^(#.*\n)?version\s*=\s*\d+\n(\s*\n)?

(version = numbers, optionally preceded by a comment and optionally
succeeded by an empty line)

[David]: those variables are manually removed for the files:
- drivers/common/qat/Makefile
- lib/librte_eal/meson.build
[David]: the LIBABIVER is restored for the external ethtool example
library.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-20 23:05:39 +01:00
Marcin Smoczynski
d070c34131 net/tap: fix blocked Rx packets
When OS sends more packets than are being read with a single
'rte_eth_rx_burst' call, rx packets are getting stucked in the tap pmd
and are unable to receive, because trigger_seen is getting updated
and consecutive calls are not getting any packets.

Do not update trigger_seen unless less than a max number of packets were
received allowing next call to receive the rest.

Remove unnecessary compiler barrier.

Fixes: a0d8e807d9 ("net/tap: add Rx trigger")
Cc: stable@dpdk.org

Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Tested-by: Mariusz Drost <mariuszx.drost@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2019-10-23 16:43:08 +02:00
Ivan Ilchenko
ca041cd44f ethdev: change allmulticast callbacks to return status
Enabling/disabling of allmulticast mode is not always successful and
it should be taken into account to be able to handle it properly.

When correct return status is unclear from driver code, -EAGAIN is used.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2019-10-07 15:00:55 +02:00
Igor Romanov
9970a9ad07 ethdev: make stats and xstats reset callbacks return int
Change return value of the callbacks from void to int. Make
implementations across all drivers return negative errno
values in case of error conditions.

Both callbacks are updated together because a large number of
drivers assign the same function to both callbacks.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 15:00:54 +02:00