Some zero copy AF_XDP drivers eg. ice require that there are addresses
already in the fill queue before the socket is created. Otherwise you may
see log messages such as:
XSK buffer pool does not provide enough addresses to fill 2047 buffers on
Rx ring 0
This commit ensures that the addresses are available before creating the
socket, instead of after.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
The Rx queue setup can fail for many reasons eg. failure to setup the
custom program, failure to allocate or reserve fill queue buffers,
failure to configure busy polling etc. When a failure like one of these
occurs, if the xsk is already set up it should be deleted before
returning. This commit ensures this happens.
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Fixes: 288a85aef1 ("net/af_xdp: enable custom XDP program loading")
Fixes: 055a393626 ("net/af_xdp: prefer busy polling")
Fixes: 01fa83c94d ("net/af_xdp: workaround custom program loading")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
libbpf v0.7.0 deprecates the bpf_prog_load function. Use meson to detect
if libbpf >= v0.7.0 is linked and if so, use the recommended replacement
functions bpf_object__open_file and bpf_object__load.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Caught while trying --in-memory mode, some log messages in this driver
are not terminated with a newline:
rte_pmd_af_xdp_probe(): net_af_xdp: Failed to register multi-process IPC
callback: Operation not supportedvdev_probe(): failed to initialize
net_af_xdp device
Other locations in this driver had the same issue, fix all at once.
Fixes: f1debd77ef ("net/af_xdp: introduce AF_XDP PMD")
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Fixes: 9876cf8316 ("net/af_xdp: re-enable secondary process support")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
If EAL multiprocess feature has been disabled via rte_mp_disable()
function, AF_XDP driver may not be able to register its IPC callback.
Previously this leads to probe failure.
This commit adds a check for this condition so that AF_XDP can still be
used even if multiprocess is disabled.
Fixes: 9876cf8316 ("net/af_xdp: re-enable secondary process support")
Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Multiple PMDs have dummy/noop Rx/Tx packet burst functions.
These dummy functions are very simple, introduce a common function in
the ethdev and update drivers to use it instead of each driver having
its own functions.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Secondary process support had been disabled for the AF_XDP PMD because
there was no logic in place to share the AF_XDP socket file descriptors
between the processes. This commit introduces this logic using the IPC
APIs.
Rx and Tx are disabled in the secondary process due to memory mapping of
the AF_XDP rings being assigned by the kernel in the primary process only.
However other operations including retrieval of stats are permitted.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The below compile time defined style make the code not so readable, the
first function end block is after "#endif" segment.
#if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
xdp_umem_configure()
{
#else
xdp_umem_configure()
{
#endif
'shared code block'
}
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
AF_XDP support is deprecated in libbpf since v0.7.0 [1]. The libxdp library
now provides the functionality which once was in libbpf and which the
AF_XDP PMD relies on. This commit updates the AF_XDP meson build to use the
libxdp library if a version >= v1.2.2 is available. If it is not available,
only versions of libbpf prior to v0.7.0 are allowed, as they still contain
the required AF_XDP functionality.
libbpf still remains a dependency even if libxdp is present, as we use
libbpf APIs for program loading.
The minimum required kernel version for libxdp for use with AF_XDP is v5.3.
For the library to be fully-featured, a kernel v5.10 or newer is
recommended. The full compatibility information can be found in the libxdp
README.
v1.2.2 of libxdp includes an important fix required for linking with DPDK
which is why this version or greater is required. Meson uses pkg-config to
verify the version of libxdp on the system, so it is necessary that the
library is discoverable using pkg-config in order for the PMD to use it. To
verify this, you can run: pkg-config --modversion libxdp
[1] https://github.com/libbpf/libbpf/commit/277846bc6c15
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The get_shared_umem function is only called when the kernel
flag XDP_UMEM_UNALIGNED_CHUNK_FLAG is defined. Move the
function implementation and associated helper so that it only
gets compiled when that flag is set.
Fixes: 74b46340e2 ("net/af_xdp: support shared UMEM")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since v0.4.0, if the underlying kernel supports it, libbpf uses 'bpf
link' to manage the programs on the interfaces of the XDP sockets (xsks).
This is not compatible with the PMD's custom XDP program loading feature
which uses the netlink-based method for loading custom programs.
The conflict arises when libbpf searches for a custom program on the
interface using bpf link, but doesn't find one because the netlink
method was used. libbpf then proceeds to try to load the default program
on the interface, but fails due to the presence of the custom program.
To work around this, the PMD now uses the
XSK_LIBBPF_FLAGS__INHIBIT_PROG_LOAD flag which prevents libbpf from
attempting to search for or load a program. One repercussion is that
DPDK must now insert the xsk into the xsks_map as this was previously
handled by libbpf during the routines for program loading/probing.
Ideally, the PMD would use bpf link to load the custom program, however
at present there is no convenient and reliable way of detecting whether
the underlying kernel supports bpf link. Perhaps this may become
available in a future libbpf release, at which point we can switch the
PMD over to the new bpf link based method.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The commit ae70cc6e89 ("net/af_xdp: use BPF link for XDP programs")
caused compilation errors on kernels older than v5.8 due to absence of
the bpf_link_info struct and some definitions in the linux/bpf.h header.
Since relying on the reported kernel version is not a robust solution
and also since there doesn't appear to be a suitable definition in the
bpf header that the preprocessor could rely on to determine support for
bpf link, we will take a different approach to solving the issue that
the original patch attempted to solve. The next commit will address
this.
Fixes: ae70cc6e89 ("net/af_xdp: use BPF link for XDP programs")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Since v0.4.0, if the underlying kernel supports it, libbpf uses 'bpf
link' to manage the programs on the interfaces of the xsks. This has two
repercussions for the PMD.
1. In the case where the PMD asks libbpf to load the default XDP
program, the PMD no longer needs to remove it on teardown. This is
because bpf link handles the unloading under the hood.
2. In the case where the PMD loads a custom program, libbpf expects this
program to be linked via bpf link prior to creating the socket.
This patch introduces probes for the libbpf version and kernel support
for bpf link and orchestrates the loading and unloading of
programs according to the capabilities of the kernel and libbpf. The
libbpf version is checked with meson and pkg-config. The probe for
kernel support mirrors how it is implemented in libbpf. A bpf_link is
created and looked up on loopback device. If successful, bpf_link will
be used for the AF_XDP netdev.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Add 'RTE_ETH' namespace to all enums & macros in a backward compatible
way. The macros for backward compatibility can be removed in next LTS.
Also updated some struct names to have 'rte_eth' prefix.
All internal components switched to using new names.
Syntax fixed on lines that this patch touches.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Commit 1bb4a528c4 ("ethdev: fix max Rx packet length") clarified the
expected usage of the max_rx_pktlen and max_mtu values and implemented
some extra checks on these values to ensure they are sane. After this,
the AF_XDP PMD fails to initialise. The value for max_rx_pktlen which
represents the max size of the Ethernet frame was set to ETH_FRAME_LEN
(1514) and the max_mtu which represents the size of the payload was set
to the max size of the Ethernet frame. This did not make sense, as
naturally the maximum frame size should be greater than the payload
size.
Fix this by setting the max_rx_pktlen equal to the max size of the
Ethernet frame as expected, and the max MTU equal to the max_rx_pktlen
less the overhead which is set to the size of an Ethernet header plus
CRC.
Fixes: 1bb4a528c4 ("ethdev: fix max Rx packet length")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since the AF_XDP PMD does not work for secondary processes as reported
in Bugzilla 805, check for the process type at the beginning of probe
and return ENOTSUP if the process type is secondary.
It is planned that secondary processes will be supported by the PMD in
full in a future release by using rte_mp_msg to pass the state to the
secondary process that it requires in order to work.
Bugzilla ID: 805
Fixes: f1debd77ef ("net/af_xdp: introduce AF_XDP PMD")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Some drivers don't need Rx and Tx queue release callback, make them
optional. Clean up empty queue release callbacks for some drivers.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Call xsk_ring_prod__submit() before kick_tx() so that the kernel
consumer sees the updated state of Tx ring. Otherwise, Tx packets are
stuck in the ring until the next call to af_xdp_tx_zc().
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Start a new release cycle with empty release notes.
The ABI version becomes 22.0.
The map files are updated to the new ABI major number (22).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Prior to this change, two implementations of rx_syscall_handler
existed although only one was needed (for the zero copy path which
is only available from kernel 5.4 and onwards). Remove the second
definition from compat.h and move the first definition back to where
it is called in the Rx function. Doing this removes a build warning
on kernels before 5.4 which complained about the second function
being defined but not used.
Fixes: 2aa51cdd55 ("net/af_xdp: fix trigger for syscall on Tx")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Let's try to enforce the convention where most drivers use a pmd. logtype
with their class reflected in it, and libraries use a lib. logtype.
Introduce two new macros:
- RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is
used in a component. It is associated to the default name provided
by the build system,
- RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used,
and then the passed name is appended to the default name,
RTE_LOG_REGISTER is left untouched for existing external users
and for components that do not comply with the convention.
There is a new Meson variable log_prefix to adapt the default name
for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes.
Note: achieved with below commands + reverted change on net/bonding +
edits on crypto/virtio, compress/mlx5, regex/mlx5
$ git grep -l RTE_LOG_REGISTER drivers/ |
while read file; do
pattern=${file##drivers/};
class=${pattern%%/*};
pattern=${pattern#$class/};
drv=${pattern%%/*};
case "$class" in
baseband) pattern=pmd.bb.$drv;;
bus) pattern=bus.$drv;;
mempool) pattern=mempool.$drv;;
*) pattern=pmd.$class.$drv;;
esac
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
done
$ git grep -l RTE_LOG_REGISTER lib/ |
while read file; do
pattern=${file##lib/};
pattern=lib.${pattern%%/*};
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The recvfrom() syscall is only supported by AF_XDP sockets since
kernel 5.11. Only use it if busy polling is configured. We can
assume a kernel >= 5.11 is in use if busy polling is configured
so we can safely call recvfrom() in that case.
Fixes: 63e8989fe5 ("net/af_xdp: use recvfrom instead of poll syscall")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The send() syscall on the Tx path is not concerned with busy polling
and as such its invocation should not depend on whether or not it is
configured. Fix this by distinguishing the conditions necessary for
syscalls on the Rx and Tx paths individually.
Fixes: 055a393626 ("net/af_xdp: prefer busy polling")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Coverity complains that the return value of recvfrom() in the AF_XDP
datapath is not checked. We don't care about the return value because in
the case of an error we still return 0 from the receive function to
indicate no packets were received. So to make Coverity happy we cast the
return to 'void'.
Coverity issue: 369671
Fixes: 63e8989fe5 ("net/af_xdp: use recvfrom instead of poll syscall")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
This commit introduces support for preferred busy polling
to the AF_XDP PMD. This feature aims to improve single-core
performance for AF_XDP sockets under heavy load.
A new vdev arg is introduced called 'busy_budget' whose default
value is 64. busy_budget is the value supplied to the kernel
with the SO_BUSY_POLL_BUDGET socket option and represents the
busy-polling NAPI budget. To set the budget to a different value
eg. 256:
--vdev=net_af_xdp0,iface=eth0,busy_budget=256
Preferred busy polling is enabled by default provided a kernel with
version >= v5.11 is in use. To disable it, set the budget to zero.
The following settings are also strongly recommended to be used in
conjunction with this feature:
echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 200000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
.. where eth0 is the interface being used by the PMD.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
poll() is more expensive and requires more tuning
when used with the upcoming busy polling functionality.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Prior to this commit, the maximum batch sizes for zero-copy and
copy-mode rx and copy-mode tx were set to 32. Apart from zero-copy tx,
the user could never rx/tx any more than 32 packets at a time and
without inspecting the code the user wouldn't be aware of this.
This commit removes these upper limits placed on the user and instead
sets an internal batch size equal to the default ring size (2048).
Batches larger than this are still processed, however they are split
into smaller batches similar to how it's done in other drivers. This is
necessary because some arrays used during rx/tx need to be sized at
compile-time.
Allowing a larger batch size allows for fewer batches and thus larger
bulk operations, fewer ring accesses and fewer syscalls which should
yield improved performance.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Prior to this commit, if rte_pktmbuf_alloc_bullk failed during rx queue
setup the error was not returned to the user and they may incorrectly
assume that the rx queue had been successfully set up. This commit ensures
that the error is returned to the user.
Bugzilla ID: 643
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
Meson can use cmake as a fallback for detecting packages, and this can
lead to picking up 64-libs for 32-bit builds. To work around this, force
the use of pkg-config only for detecting libcrypto, zlib, jansson and
other package dependencies.
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Liron Himi <lironh@marvell.com>
Tested-by: Lee Daly <lee.daly@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Martin Spinler <spinler@cesnet.cz>
Allows i40e and mlx5 PMDs to compile on Windows and disable other drivers.
Disable few i40e warnings with Clang such as comparison of integers of
different signs and macro redefinitions.
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
While receiving packets, the max bunch number of mbufs are allocated
and if hardware does not receive the max bunch number packets, it
will free redundancy mbufs, this is low performance.
So optimize Rx performance, by allocating number of mbuf based on
result of xsk_ring_cons__peek, to avoid to redundancy allocation,
and free mbuf when receive packets.
And Rx cached_cons must be roll backed if fails to allocate mbuf.
Signed-off-by: RongQing Li <lirongqing@baidu.com>
Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Assignment of function parameter 'umem' removed.
Fixes: f0ce7af0e1 ("net/af_xdp: remove resources when port is closed")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
'uint64_t' is used to hold the pointer, for 32-bits build this
assumption is wrong and giving following build error:
rte_eth_af_xdp.c: In function ‘xdp_umem_configure’:
rte_eth_af_xdp.c:970:15:
error: cast to pointer from integer of different size
[-Werror=int-to-pointer-cast]
970 | base_addr = (void *)get_base_addr(mb_pool, &align);
| ^
Replacing the 'uint64_t' return type of the 'get_base_addr()' to the
'uintptr_t'.
Although not sure if the overall logic supports the 32-bits, using
'uintptr_t' should be safe both for 64/32 bits.
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
The multiplication of two u32 integers may cause an overflow with large
mempool sizes.
Fixes: 74b46340e2 ("net/af_xdp: support shared UMEM")
Signed-off-by: Martin Weiser <martin.weiser@allegro-packets.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Queue stats are stored in 'struct rte_eth_stats' as array and array size
is defined by 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag.
As a result of technical board discussion, decided to remove the queue
statistics from 'struct rte_eth_stats' in the long term.
Instead PMDs should represent the queue statistics via xstats, this
gives more flexibility on the number of the queues supported.
Currently queue stats in the xstats are filled by ethdev layer, using
some basic stats, when queue stats removed from basic stats the
responsibility to fill the relevant xstats will be pushed to the PMDs.
During the switch period, temporary 'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS'
device flag is created. Initially all PMDs using xstats set this flag.
The PMDs implemented queue stats in the xstats should clear the flag.
When all PMDs switch to the xstats for the queue stats, queue stats
related fields from 'struct rte_eth_stats' will be removed, as well as
'RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS' flag.
Later 'RTE_ETHDEV_QUEUE_STAT_CNTRS' compile time flag also can be
removed.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Change eth_dev_stop_t return value from void to int.
Make eth_dev_stop_t implementations across all drivers to return
negative errno values if case of error conditions.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
AF_XDP PMDs who wish to share a UMEM must have a unique context
(ctx) ie. netdev,qid tuple. For instance, the following will not
work since both PMDs' contexts are identical.
--vdev net_af_xdp0,iface=ens786f1,start_queue=0,shared_umem=1
--vdev net_af_xdp1,iface=ens786f1,start_queue=0,shared_umem=1
Supporting this scenario would require locks, which would impact
the performance of the more typical cases - xsks with different
netdev,qid tuples.
Fixes: 74b46340e2 ("net/af_xdp: support shared UMEM")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
strncpy may leave the destination buffer not NULL terminated so use
strlcpy instead.
Coverity issue: 362975
Fixes: 339b88c6a9 ("net/af_xdp: support multi-queue")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The new 'xdp_prog=<string>' vdev arg allows the user to specify the path to
a custom XDP program to be set on the device, instead of the default libbpf
one. The program must have an XSK_MAP of name 'xsks_map' which will allow
for the redirection of some packets to userspace and thus the PMD, using
some criteria defined in the program. This can be useful for filtering
purposes, for example if we only want a subset of packets to reach
userspace or to drop or process a subset of packets in the kernel.
Note: a netdev may only load one program.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Xuekun Hu <xuekun.hu@intel.com>
The secondary processes are not allowed to release shared resources.
Only process-private resources should be freed in a secondary process.
Most of the time, there is no process-private resource,
so the close operation is just forbidden in a secondary process.
After adding proper check in the port close functions,
some redundant checks in the device remove functions are dropped.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.
The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
- trigger event callback
- reset state and few pointers
- free all generic port resources
The private port resources must be released in the .dev_close callback.
The .remove callback should:
- call .dev_close callback
- call rte_eth_dev_release_port()
- free multi-port device shared resources
Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.
* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.
* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.
* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.
* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.
* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
The device operation .dev_close was returning void.
This driver interface is changed to return an int.
Note that the API rte_eth_dev_close() is still returning void,
although a deprecation notice is pending to change it as well.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Kernel v5.10 will introduce the ability to efficiently share a UMEM
between AF_XDP sockets bound to different queue ids on the same or
different devices. This patch integrates that functionality into the AF_XDP
PMD.
A PMD will attempt to share a UMEM with others if the shared_umem=1 vdev
arg is set. UMEMs can only be shared across PMDs with the same mempool, up
to a limited number of PMDs goverened by the size of the given mempool.
Sharing UMEMs is not supported for non-zero-copy (aligned) mode.
The benefit of sharing UMEM across PMDs is a saving in memory due to not
having to register the UMEM multiple times. Throughput was measured to
remain within 2% of the default mode (not sharing UMEM).
A version of libbpf >= v0.2.0 is required and the appropriate pkg-config
file for libbpf must be installed such that meson can determine the
version.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
While receiving packets, it is possible to fail to reserve
fill queue, since buffer ring is shared between tx and rx,
and maybe not available temporary. As a result both fill
queue and Rx queue will be empty.
Then kernel side will not be able to receive packets due to
empty fill queue, and dpdk will not be able to reserve fill
queue because dpdk doesn't have packets to receive, finally
deadlock will happen.
So move reserve fill queue before xsk_ring_cons__peek to fix it.
Cc: stable@dpdk.org
Signed-off-by: RongQing Li <lirongqing@baidu.com>
Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
The kernel expects the start address of the UMEM to be page size
aligned.
Since the mempool is not guaranteed to have such alignment, we have been
aligning the address to the start of the page the mempool is on. However
when passing the 'size' of the UMEM during it's creation we did not take
this into account.
This commit adds the amount by which the address was aligned to the size
of the UMEM.
Bugzilla ID: 532
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
The af_xdp rx function was returning a negative value on error, when an
unsigned value is expected. Fix this.
Fixes: d8a210774e ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>