When IOVA=VA, address translation for segmented packets is wrong: it
assumes that the address in mbuf->next is a physical address, not a
virtual address.
Fix the address translation so that it works in both PA and VA modes.
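A rough userspace sketch of the idea, not the KNI code itself: the
address of the next segment is translated by a helper chosen from the
IOVA mode. The structure, the field names and the flat base-offset
translations are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical translation context; every name here is illustrative. */
struct xlate_ctx {
    int iova_is_va;      /* non-zero when the application uses IOVA=VA */
    uintptr_t pa_base;   /* start of the region in physical address space */
    uintptr_t uva_base;  /* start of the same region in userspace VA space */
    uintptr_t kva_base;  /* start of the same region as seen by the kernel */
};

/* Physical address to kernel virtual address (flat offset, assumed). */
static uintptr_t pa_to_kva(const struct xlate_ctx *c, uintptr_t pa)
{
    return c->kva_base + (pa - c->pa_base);
}

/* Userspace virtual address to kernel virtual address (flat offset, assumed). */
static uintptr_t uva_to_kva(const struct xlate_ctx *c, uintptr_t uva)
{
    return c->kva_base + (uva - c->uva_base);
}

/* Translate the 'next' pointer of a segment according to the IOVA mode. */
static uintptr_t next_seg_kva(const struct xlate_ctx *c, uintptr_t next)
{
    return c->iova_is_va ? uva_to_kva(c, next) : pa_to_kva(c, next);
}
```

With the same region described in both modes, a 'next' value given as a
PA and one given as a userspace VA resolve to the same kernel address.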
Fixes: e73831dc6c ("kni: support userspace VA")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
There is no reason for the DPDK libraries to all have 'librte_' prefix on
the directory names. This prefix makes the directory names longer and also
makes it awkward to add features referring to individual libraries in the
build - should the lib names be specified with or without the prefix?
Therefore, we can just remove the library prefix and use the library's
unique name as the directory name, i.e. 'eal' rather than 'librte_eal'.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Switch from using tabs to 4 spaces for meson.build indentation, for the
basic infrastructure and tooling files, as well as doc and kernel
directories.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
KNI runs the userspace callback with the rtnl lock held. This does not
work with some devices that need to interact with the kernel interface in
the callback, such as Mellanox devices.
The solution is to release the rtnl lock before calling the userspace
callback, but this requires two considerations:
1. The rtnl lock needs to be released before taking 'kni->sync_lock';
otherwise it causes a deadlock with multiple KNI devices. See A. below
for the details of the deadlock condition.
2. Releasing the rtnl lock for the interface down event causes a
regression and a deadlock, so the rtnl lock cannot be released for the
interface down event. See B. below for the details.
As a solution, the interface down event is handled asynchronously, and for
all other events the rtnl lock is released before processing the callback.
A. KNI sync lock is being locked while rtnl is held.
If two threads are calling kni_net_process_request(),
then the first one will take the sync lock, release rtnl lock then sleep.
The second thread will try to lock sync lock while holding rtnl.
The first thread will wake, and try to lock rtnl, resulting in a
deadlock. The remedy is to release rtnl before locking the KNI sync
lock.
Since in between nothing is accessing Linux network-wise, no rtnl
locking is needed.
B. There is a race condition in __dev_close_many() processing the
close_list while the application terminates.
It looks like if two KNI interfaces are terminating,
and one releases the rtnl lock, the other takes it,
updating the close_list in an unstable state,
causing the close_list to become a circular linked list,
hence list_for_each_entry() will endlessly loop inside
__dev_close_many().
To summarize:
request != interface down : unlock rtnl, send request to user-space,
wait for response, send the response error code to caller in user-space.
request == interface down: send request to user-space, return immediately
with error code of 0 (success) to user-space.
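The summary above can be sketched as a small single-threaded model.
This is an illustration only: the request IDs, the state structure and
the boolean "locks" are hypothetical stand-ins, not the kernel module's
types or its real locking primitives.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request IDs; the real module defines its own. */
enum req_id { REQ_CHANGE_MTU, REQ_IF_UP, REQ_IF_DOWN };

struct sim_state {
    bool rtnl_held;                  /* models the rtnl lock held by the caller */
    bool sync_held;                  /* models kni->sync_lock */
    int user_result;                 /* error code the userspace callback returns */
    bool rtnl_held_during_callback;  /* recorded for inspection */
};

static int process_request(struct sim_state *s, enum req_id id)
{
    bool async = (id == REQ_IF_DOWN);
    int ret;

    if (!async)
        s->rtnl_held = false;        /* drop rtnl before taking the sync lock */

    s->sync_held = true;
    /* The request would be queued to userspace at this point. */
    s->rtnl_held_during_callback = s->rtnl_held;
    ret = async ? 0 : s->user_result; /* async: return success without waiting */
    s->sync_held = false;

    if (!async)
        s->rtnl_held = true;         /* take rtnl again before returning */
    return ret;
}
```

For an MTU change the model reports the userspace error code and shows
that rtnl was not held during the callback. For interface down it
returns 0 at once and rtnl stays held throughout.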
Fixes: 3fc5ca2f63 ("kni: initial import")
Cc: stable@dpdk.org
Signed-off-by: Elad Nachman <eladv6@gmail.com>
Adding async userspace requests which don't wait for the userspace
response and always return success. This is preparation to address a
regression in KNI.
Signed-off-by: Elad Nachman <eladv6@gmail.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Refactor the parameter that kni_net_process_request() takes. This is
preparation for addressing a user request processing deadlock problem.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Elad Nachman <eladv6@gmail.com>
The KNI linux module is using a custom target for building, which
doesn't take into account any cross compilation arguments. The arguments
in question are ARCH, CROSS_COMPILE (for gcc, clang) and CC, LD (for
clang). Get those from the cross file and pass them to the custom
target.
The user supplied path may not contain the 'build' directory, such as
when using cross-compiled headers, so only append that in the default
case (when no path is supplied in native builds) and use the unmodified
path from the user otherwise. Also modify the install path accordingly.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Like what was done for mainline kernel in commit 38ad54f3bc ("kni: fix
build with Linux 5.6"), a new parameter 'txqueue' has to be added to
'ndo_tx_timeout' ndo on RHEL 8.3 kernel.
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Christophe Grosse <christophe.grosse@6wind.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since the kernel module is not part of EAL anymore,
there is no need to have the common KNI header file in EAL.
The file rte_kni_common.h is moved to librte_kni.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
As decided in the Technical Board in November 2019,
the kernel module igb_uio is moved to the dpdk-kmods repository
in the /linux/igb_uio/ directory.
Minutes of Technical Board meeting:
https://mails.dpdk.org/archives/dev/2019-November/151763.html
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Starting from Linux 5.9 'get_user_pages_remote()' API doesn't get
'struct task_struct' parameter:
commit 64019a2e467a ("mm/gup: remove task_struct pointer for all gup code")
The change is reflected in KNI with a kernel version check.
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Remove the deprecated buf_physaddr union field from rte_mbuf.
It is replaced with buf_iova which is at the same offset.
The single field buf_physaddr in rte_kni_mbuf is also renamed.
This concludes a 3-year process of semantic change.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
A decision was made [1] to no longer support Make in DPDK. This patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Now that kernel modules aren't built by default, we can be more
strict with their build process, and fail the build if they were
requested to be built, but weren't.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Since the kernel modules are moved to kernel/ directory,
there is no need anymore for the sub-directory eal/ in
linux/, freebsd/ and windows/.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
The EAL API (with doxygen documentation) is moved from
common/include/ to include/, which makes more clear that
it is the global API for all environments and architectures.
Note that the arch-specific and OS-specific include files are not
in this global include directory, but include/generic/ should
cover the doxygen documentation for them.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
If contigmem is not able to allocate all of the
requested buffers, it frees whatever buffers were
able to be allocated up until that point.
But the pointers are not set to NULL in that case.
After the load fails, the FreeBSD kernel will
immediately call the contigmem unload handler, which
tries to free the buffers again since the pointers
were not set to NULL.
It's not clear that we should just rely on the unload
handler getting called after load failure. So let's
keep the existing cleanup code in the load handler,
but explicitly set the pointers to NULL after freeing
them.
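A userspace sketch of the pattern, with malloc()/free() standing in for
the kernel allocator. The buffer count, the function names and the
'fail_at' knob are assumptions for illustration and do not come from the
contigmem source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_BUFS 4  /* arbitrary count for this sketch */

static void *bufs[NUM_BUFS];

/* Free every allocated buffer and clear its slot, so a repeat call is a no-op. */
static void free_all(void)
{
    for (int i = 0; i < NUM_BUFS; i++) {
        if (bufs[i] != NULL) {
            free(bufs[i]);
            bufs[i] = NULL;  /* without this, a second call would double free */
        }
    }
}

/* Load handler; 'fail_at' simulates an allocation failure at that index. */
static int load(int fail_at)
{
    for (int i = 0; i < NUM_BUFS; i++) {
        bufs[i] = (i == fail_at) ? NULL : malloc(64);
        if (bufs[i] == NULL) {
            free_all();  /* clean up here, but leave the pointers NULL */
            return -1;
        }
    }
    return 0;
}

/* Unload handler; safe to call even right after a failed load. */
static void unload(void)
{
    free_all();
}
```

Calling unload() right after a failed load() is harmless here, because
every freed slot was reset to NULL.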
Fixes: 5f51eca224 ("contigmem: free allocated memory on error")
Cc: stable@dpdk.org
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The netuio driver will be hosted in a separate repository:
http://git.dpdk.org/dpdk-kmods/
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Narcisa Vasile <navasile@microsoft.com>
With the following Linux commit a new parameter 'txqueue' has been added
to 'ndo_tx_timeout' ndo:
commit 0290bd291cc0 ("netdev: pass the stuck queue to the timeout handler")
The change is reflected in KNI with a kernel version check.
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
FreeBSD 13 has changed the definition of vm_page_replace so we need
to have slightly different code paths around this function depending on
the BSD version.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
All global variables in the kernel module should share the same prefix
to avoid any symbol conflicts. Rename dflt_carrier to kni_default_carrier.
Fixes: 89397a01ce ("kni: set default carrier state of interface")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since kni no longer includes the ethtool code and so is faster to build, we
no longer need the console parameter to have incremental screen updates as
it builds. Therefore, we drop the keyword which removes the warning.
Fixes: b78f32cff9 ("kni: support meson build")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The 'get_user_pages_remote()' API was updated in kernel 4.10.0 [1],
but the check was added as > 4.9.0.
This logic is broken for 4.9.x kernels, because they satisfy the
> 4.9.0 check but still have the old API.
Fix the check to be >= 4.10.0.
[1]
commit 5b56d49fc31d ("mm: add locked parameter to get_user_pages_remote()")
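A small worked example of why the two comparisons differ. KVER() below
uses the same encoding as the kernel's KERNEL_VERSION() macro; the
helper names are made up for this sketch.

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION(a, b, c). */
#define KVER(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* The broken check: also true for every 4.9.x with x >= 1. */
static int check_old(int code)
{
    return code > KVER(4, 9, 0);
}

/* The fixed check: true only from 4.10.0 onwards. */
static int check_new(int code)
{
    return code >= KVER(4, 10, 0);
}
```

For a 4.9.1 kernel, check_old() is true while check_new() is false,
which is exactly the case that selected the wrong API.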
Fixes: d965af9e8a ("kni: increase kernel version requirement for VA")
Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
A build error was reported related to the selected 'get_user_pages_remote()'
kernel API:
.../kernel/linux/kni/kni_dev.h:113:8:
error: too few arguments to function ‘get_user_pages_remote’
ret = get_user_pages_remote(tsk, tsk->mm, iova, 1
^~~~~~~~~~~~~~~~~~~~~
Currently three versions of 'get_user_pages_remote()' are
supported, based on kernel version: < 4.9, = 4.9 and > 4.9.
These version-based checks do not work well with distro kernels,
which is the cause of the reported build error. The error was reported with
kernel version 4.8, which is using the API defined in > 4.9.
To get control of this, and possibly other related build
errors, increase the minimum supported kernel version for iova=va with
KNI to kernel version 4.9.
This leaves us with a single version of the kernel API, which is more
manageable.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Clang is the system compiler for FreeBSD and kernel module builds can fail
when built with gcc, e.g. when testing with test-meson-builds.sh.
Therefore, it's safer to always use clang to build the kmods since the
actual flags used are outside of DPDK's control and cannot be guaranteed to
work with all compilers.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Set the install path for the kernel modules as /boot/modules. This may
ease the integration with the official FreeBSD ports system as all
components should be correctly located in the staging directory after
running "ninja install"
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
This patch adds support for the kernel module to work in IOVA = VA mode by
providing address translation routines to convert userspace VA to
kernel VA.
KNI performance using PA is not changed by this patch.
But comparing KNI using PA to KNI using VA, the latter will have lower
performance due to the cost of the added translation.
This translation is implemented only for kernel versions starting from 4.6.0.
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Starting with kernel version 4.10, there are new min/max MTU values in
net_device structure, which are set to ETH_MIN_MTU and ETH_DATA_LEN by
default. We should be able to change these values to allow MTU more than
1500 to be set on KNI.
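A simplified userspace model of the range check described above. The
structure and function are hypothetical; only the default values (68
for ETH_MIN_MTU and 1500 for ETH_DATA_LEN) mirror the kernel headers.

```c
#include <assert.h>
#include <errno.h>

#define ETH_MIN_MTU  68    /* kernel default minimum MTU */
#define ETH_DATA_LEN 1500  /* kernel default maximum MTU for Ethernet */

/* Hypothetical stand-in for the relevant net_device fields. */
struct netdev_sketch {
    unsigned int mtu;
    unsigned int min_mtu;
    unsigned int max_mtu;  /* 0 means "no upper limit" in this model */
};

/* Reject an MTU outside [min_mtu, max_mtu], otherwise apply it. */
static int change_mtu(struct netdev_sketch *dev, unsigned int new_mtu)
{
    if (new_mtu < dev->min_mtu)
        return -EINVAL;
    if (dev->max_mtu > 0 && new_mtu > dev->max_mtu)
        return -EINVAL;
    dev->mtu = new_mtu;
    return 0;
}
```

With the defaults, a request for 9000 is rejected. Once max_mtu is
raised (65535 is an arbitrary value chosen for this example), the same
request succeeds.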
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds support to allow users to enable/disable allmulticast mode
for a kni interface.
This requirement comes from bugzilla 312; for more details, refer to:
https://bugs.dpdk.org/show_bug.cgi?id=312
Bugzilla ID: 312
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
build error:
kernel/linux/igb_uio/igb_uio.c:
In function ‘igbuio_pci_enable_interrupts’:
kernel/linux/igb_uio/igb_uio.c:230:6:
error: this statement may fall through
[-Werror=implicit-fallthrough=]
230 | if (pci_alloc_irq_vectors(udev->pdev, 1, 1, ....
kernel/linux/igb_uio/igb_uio.c:240:2: note: here
240 | case RTE_INTR_MODE_MSI:
| ^~~~
The build error is caused by a Linux kernel commit in 5.3 that enables the
"-Wimplicit-fallthrough=3" gcc flag:
Commit a035d552a93b ("Makefile: Globally enable fall-through warning")
To fix the error, either a gcc attribute can be provided [1] or a code
comment with a defined syntax needs to be provided [2]. Since the comments
already exist, they are updated slightly to match the required syntax and
fix the build error.
[1]
"__attribute__ ((fallthrough));"
[2]
[ \t.!]*([Ee]lse,? |[Ii]ntentional(ly)? )?
fall(s | |-)?thr(ough|u)[ \t.!]*(-[^\n\r]*)?
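A minimal sketch of the comment form in a switch statement. The enum
and the function are illustrative and are not taken from igb_uio; only
the "fall through" comment wording is the point here, since it matches
the pattern in [2].

```c
#include <assert.h>

/* Illustrative interrupt modes, ordered from most to least capable. */
enum intr_mode { MODE_MSIX, MODE_MSI, MODE_LEGACY };

/* Try the requested mode; on failure fall back to the next simpler one. */
static enum intr_mode pick_mode(enum intr_mode wanted, int msix_ok, int msi_ok)
{
    switch (wanted) {
    case MODE_MSIX:
        if (msix_ok)
            return MODE_MSIX;
        /* fall through */
    case MODE_MSI:
        if (msi_ok)
            return MODE_MSI;
        /* fall through */
    default:
        return MODE_LEGACY;
    }
}
```

When MSI-X is requested but unavailable, control deliberately continues
into the MSI case, and the comment tells the compiler that this is
intended.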
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
'kni_net_rx_lo_fifo()' can get segmented buffers. Using 'pkt_len' in
that case is wrong, and some values can cause a buffer overflow
in the destination mbuf data.
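To illustrate the difference between the two lengths, here is a
userspace sketch that copies a chain segment by segment using
'data_len' and checks the destination size. The 'struct seg' layout and
copy_chain() are stand-ins for illustration, not the real rte_kni_mbuf
or the code changed by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimal stand-in for a chained buffer. */
struct seg {
    const uint8_t *data;
    uint16_t data_len;  /* bytes held by this segment only */
    uint32_t pkt_len;   /* bytes in the whole chain, set in the first segment */
    struct seg *next;
};

/* Copy a chain into dst; returns bytes copied, or -1 if dst is too small. */
static long copy_chain(const struct seg *s, uint8_t *dst, size_t dst_len)
{
    size_t off = 0;

    for (; s != NULL; s = s->next) {
        if (s->data_len > dst_len - off)
            return -1;  /* would overflow the destination */
        memcpy(dst + off, s->data, s->data_len);
        off += s->data_len;
    }
    return (long)off;
}
```

Copying 'pkt_len' bytes from the first segment alone would read past
that segment's data. Here each memcpy() is bounded by the segment's own
'data_len' and by the space left in the destination.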
Fixes: d89a58dfe9 ("kni: support chained mbufs")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
va2pa depends on the offset between the physical address and the virtual
address of the current mbuf. It may compute the wrong physical address for
the next mbuf if that mbuf is allocated in another hugepage segment.
In rte_mempool_populate_default(), trying to allocate a whole block of
contiguous memory can fail. It then reserves memory in
several memzones that have different physical address and virtual address
offsets. rte_mempool_populate_default() is used by
rte_pktmbuf_pool_create().
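A worked example of the failure mode, with made-up addresses. The
'struct region' table and both helpers are hypothetical; the lookup
variant is just one way to show a correct translation and is not
claimed to be how this patch fixes the problem.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One contiguous region with its own VA-to-PA offset (illustrative). */
struct region {
    uintptr_t va;
    uintptr_t pa;
    size_t len;
};

/* Fragile: reuse the VA/PA offset of the current buffer for another address. */
static uintptr_t va2pa_by_offset(uintptr_t va, uintptr_t cur_va, uintptr_t cur_pa)
{
    return va - (cur_va - cur_pa);  /* unsigned wrap-around is well defined */
}

/* Look up the region that contains va and use that region's own offset. */
static uintptr_t va2pa_by_region(const struct region *r, size_t n, uintptr_t va)
{
    for (size_t i = 0; i < n; i++) {
        if (va >= r[i].va && va - r[i].va < r[i].len)
            return r[i].pa + (va - r[i].va);
    }
    return 0;  /* not found */
}
```

With two regions whose offsets differ, the offset taken from a buffer in
the first region gives a wrong physical address for a buffer in the
second one, while the per-region lookup gives the right one.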
Fixes: 8451269e6d ("kni: remove continuous memory restriction")
Cc: stable@dpdk.org
Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some applications use ethtool so add the minimum ethtool ops.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_kni does not follow standard style rules.
Noticed some extra \ line continuation etc.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The correct thing to do if the user gives bad data
is to return -EFAULT. Logging is also discouraged because
it could be used for a DoS attack.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Using void * instead of the proper type is an unsafe practice.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Several fields were either totally unused or set and never used.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since kernel 2.6.28 the network subsystem has provided
dev->stats for devices to use statistics handling and is the
default if no ndo_get_stats is provided.
This also allows for 64 bit (rather than just 32 bit)
statistics with KNI.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
netdev_alloc_skb is optimized to do any alignment or setup
of skb->dev that is required. The kernel has chosen to not pad
packets on x86 (for many years), because it is faster.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The netdev subsystem already handles the case where the
network device does not support ioctl.
If the device has no rx_mode hook, it is not called.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently kernel modules are installed into /usr/src instead of
/lib/modules when meson build system is used. This patch fixes that.
Cc: stable@dpdk.org
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Internal changes in the FreeBSD kernel mean that additional includes
are now necessary to build the kernel modules for DPDK. Tested with the
latest FreeBSD HEAD revision.
Bugzilla ID: 282
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
As there is no ethtool support in KNI anymore,
PCI related information is no longer needed.
Fixes: ea6b39b5b8 ("kni: remove ethtool support")
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The local variables for the error message aren't needed, since the messages
aren't used more than once, and the indent levels are now such that the
lines printing the message are not much longer than the lines defining the
variables to hold the messages themselves. Therefore the use of the
variables is pointless.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Since meson 0.46, meson has supported the subdir_done() function, which
allows us to abort processing of a file early. Using this we can reduce the
indentation in our files by eliminating unnecessary else blocks.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The check for meson version 0.44 is redundant since the minimum
supported version for the project as a whole is 0.47.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The current design requires kernel drivers, and they need to be probed by
Linux up to some level so that they can be used by DPDK for ethtool
support. This requires maintaining the Linux drivers in DPDK.
Also, ethtool support is limited and hard, if not impossible, to expand
to other PMDs.
Since KNI ethtool support is not commonly used, if used at all,
remove the support for the sake of simplicity and maintenance.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The type for MAC address should be unsigned.
Fixes: 1cfe212ed1 ("kni: support MAC address change")
Cc: stable@dpdk.org
Signed-off-by: Jie Pan <panjie5@jd.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
It allows applications running packet sockets over KNI interfaces to get
source Ethernet addresses of packets received using recvfrom function.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Build error seen with Linux kernel 5.1 and
when CONFIG_RTE_KNI_KMOD_ETHTOOL is enabled.
Build error:
kernel/linux/kni/igb_main.c:2352:18:
error: initialization of ... from incompatible pointer type ...
[-Werror=incompatible-pointer-types]
.ndo_fdb_add = igb_ndo_fdb_add,
^~~~~~~~~~~~~~~
ndo_fdb_add() is changed in Linux kernel version 5.1 and now requires
a new parameter, 'struct netlink_ext_ack *extack':
Linux Commit 87b0984ebfab ("net: Add extack argument to ndo_fdb_add()")
The ndo_fdb_add() parameters are updated with a compile-time Linux kernel
version check.
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>