Since the kernel module is not part of EAL anymore,
there is no need to have the common KNI header file in EAL.
The file rte_kni_common.h is moved to librte_kni.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch fixes (dereference after null check) coverity issue.
For this reason, we should add null check at the beginning of the
function and return error directly if the 'intr_handle' is null.
Coverity issue: 357695, 357751
Fixes: 05c4105738d8 ("trace: add interrupt tracepoints")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
The config options CONFIG_RTE_* are simple RTE_* defines with meson.
Now that make support is dropped, update the names in logs and comments.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
When introducing the new option CONFIG_RTE_MAX_MEM_MB_PER_TYPE,
some logs were referencing a wrong name: CONFIG_RTE_MAX_MEM_PER_TYPE.
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
When the memory for uevent.devname is allocated in dev_uev_parse(). It
is not freed when parse the subsystem layer fails in dev_uev_parse().
Before return, it is also not freed in dev_uev_handler(). These cause a
memory leak.
Fixes: 0d0f478d0483 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Replace master lcore with main lcore and
replace slave lcore with worker lcore.
Keep the old functions and macros but mark them as deprecated
for this release.
The "--master-lcore" command line option is also deprecated
and any usage will print a warning and use "--main-lcore"
as replacement.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Use the newer macros defined by meson in all DPDK source code, to ensure
there are no errors when the old non-standard macros are removed.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The existing definition of rte_epoll_wait retries if interrupted
by a signal. This behavior makes it hard to use rte_epoll_wait
for applications that want to use signals do do things like
exit polling loop and shutdown.
Since changing existing semantic might break applications, add
a new rte_epoll_wait_interruptible() function that does the
same thing as rte_epoll_wait but will return -1 and errno of EINTR
if it receives a signal.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Running dpdk-helloworld on Linux with lib numa present, but no kernel
support for NUMA (CONFIG_NUMA=n) causes rte_service_init() to fail with
EAL: error allocating rte services array.
alloc_seg() calls get_mempolicy to verify that the allocation
has happened on the correct socket, but receives ENOSYS from
the kernel and fails the allocation.
The allocated socket should only be verified if check_numa() is true.
Fixes: 2a96c88be83e ("mem: ease init in a docker container")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This is something we encountered while working in an OpenShift
environment with SELinux enabled.
In this environment, a DPDK application could create/write to hugepage
files but removing them was refused.
This resulted in dirty files being reused when starting a new DPDK
application and triggered random crashes / erratic behavior.
Getting a SELinux setup can be a challenge, and even more if you add
containers to the picture :-).
So here is a reproducer for the interested testers:
# cat >wrap.c <<EOF
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>
int unlink(const char *pathname)
{
static int (*orig)(const char *pathname) = NULL;
struct stat st;
if (orig == NULL)
orig = dlsym(RTLD_NEXT, "unlink");
if (strstr(pathname, "rtemap_") != NULL &&
stat(pathname, &st) == 0) {
fprintf(stderr, "### refused unlink for %s\n",
pathname);
errno = EACCES;
return -1;
}
fprintf(stderr, "### called unlink for %s\n", pathname);
return orig(pathname);
}
int unlinkat(int dirfd, const char *pathname, int flags)
{
static int (*orig)(int dirfd, const char *pathname, int flags) =
NULL;
struct stat st;
if (orig == NULL)
orig = dlsym(RTLD_NEXT, "unlinkat");
if (strstr(pathname, "rtemap_") != NULL &&
fstatat(dirfd, pathname, &st, flags) == 0) {
fprintf(stderr, "### refused unlinkat for %s\n",
pathname);
errno = EACCES;
return -1;
}
fprintf(stderr, "### called unlinkat for %s\n", pathname);
return orig(dirfd, pathname, flags);
}
EOF
# gcc -fPIC -shared -o libwrap.so wrap.c -ldl
# \rm /dev/hugepages/rtemap*
# # First run is fine
# LD_PRELOAD=libwrap.so dpdk-testpmd -w 0000:01:00.0 -- -i
[...]
Configuring Port 0 (socket 0)
Port 0: 24:6E:96:3C:52:D8
Checking link statuses...
Done
testpmd>
# # Second run we have dirty memory
# LD_PRELOAD=libwrap.so dpdk-testpmd -w 0000:01:00.0 -- -i
[...]
### refused unlinkat for rtemap_0
[...]
Port 0 is now not stopped
Please stop the ports first
Done
testpmd>
Removing hugepage files is done in multiple places and the memory
allocation code is complex.
This fix tries to do the minimum and avoids touching other paths.
If trying to remove the hugepage file before allocating a page fails,
the error is reported to the caller and the user will see a memory
allocation error log.
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The issue is that a file descriptor at 0 is a valid one. Currently
the file not found, the return value will be set to 0. As a result,
it is impossible to distinguish between a correct descriptor and a
failed return value. Fix it to return -ENOENT instead of 0.
Fixes: b758423bc4fe ("vfio: fix race condition with sysfs")
Fixes: ff0b67d1c868 ("vfio: DMA mapping")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Since rte_atomicXX APIs are not allowed to be used, use C11 builtins to
check if EAL is already initialized.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
This structure is not used in the public API.
Fixes: a753e53d517b ("eal: add device event monitor framework")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Remove the deprecated buf_physaddr union field from rte_mbuf.
It is replaced with buf_iova which is at the same offset.
The single field buf_physaddr in rte_kni_mbuf is also renamed.
This concludes a 3-year process of semantic change.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The debug message was poorly worded and did not include the
part that would be useful. I.e it never said what was being ignored.
Change it to print the message so that if udev changes format or
other subsystems need to be added then the necessary information
will be in the debug log.
Fixes: 0d0f478d0483 ("eal/linux: add uevent parse and process")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jeff Guo <jia.guo@intel.com>
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
pthread_setname_np refuses names larger than 16 bytes (\0 included).
Rather than return an error, truncate the name to this limit in the
rte_thread_setname helper.
Caught with ixgbe which creates control thread with name
"ixgbe-link-handler":
Configuring Port 0 (socket 0)
EAL: Cannot set name for ctrl thread
...
EAL: Cannot set name for ctrl thread
Port 0: link state change event
...
EAL: Cannot set name for ctrl thread
Port 0: link state change event
Note: before this change, the thread would keep its original name, which
meant in my test for the ixgbe handler either "dpdk-testpmd" or
"eal-intr-thread".
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The event status is defined as a volatile variable and shared between
threads. Use C11 atomic built-ins with explicit ordering instead of
rte_atomic ops which enforce unnecessary barriers on aarch64.
The event status has been cleaned up by the compare-and-swap operation
when we free the event data, so there is no need to set it to invalid
after that.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Add a helper to iterate all lcores.
The iterator callback is read-only wrt the lcores list.
Implement a dump function on top of this for debugging.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
DPDK allows calling some part of its API from a non-EAL thread but this
has some limitations.
OVS (and other applications) has its own thread management but still
want to avoid such limitations by hacking RTE_PER_LCORE(_lcore_id) and
faking EAL threads potentially unknown of some DPDK component.
Introduce a new API to register non-EAL thread and associate them to a
free lcore with a new NON_EAL role.
This role denotes lcores that do not run DPDK mainloop and as such
prevents use of rte_eal_wait_lcore() and consorts.
Multiprocess is not supported as the need for cohabitation with this new
feature is unclear at the moment.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce a helper responsible for initialising the per thread context.
We can then have a unified context for EAL and non-EAL threads and
remove copy/paste'd OS-specific helpers.
Per EAL thread CPU affinity setting is separated from the thread init.
It is to accommodate with Windows EAL where CPU affinity is not set at
the moment.
Besides, having affinity set by the master lcore in FreeBSD and Linux
will make it possible to detect errors rather than panic in the child
thread. But the cleanup when such an event happens is left for later.
A side-effect of this patch is that control threads can now use
recursive locks (rte_gettid() was not called before).
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
We have per lcore thread symbols scattered in OS implementations but
common code relies on them.
Move all of them in common.
RTE_PER_LCORE(_socket_id) and RTE_PER_LCORE(_cpuset) have public
accessors and are not exported through the library map, they can be
made static.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The Linux kernel module vfio-pci introduces the VF token to enable
SR-IOV support since 5.7.
The VF token can be set by a vfio-pci based PF driver and must be known
by the vfio-pci based VF driver in order to gain access to the device.
Since the vfio-pci module uses the VF token as internal data to provide
the collaboration between SR-IOV PF and VFs, so DPDK can use the same
VF token for all PF devices by specifying the related EAL option.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
The startup of VFIO is too noisy. Logging is expensive on some
systems, and distracting to the user.
It should not be logging at NOTICE level, reduce it to INFO level.
It really should be DEBUG here but that would hide it by default.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The 'group_status' has never been used and can be removed.
Fixes: 94c0776b1bad ("vfio: support hotplug")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Move common functions between Unix and Windows to eal_common_options.c.
Those functions are getter functions for rte_application_usage_hook.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Move common functions between Unix and Windows to eal_common_config.c.
Those functions are getter functions for IOVA,
configuration, Multi-process.
Move rte_config, internal_config, early_mem_config and runtime_dir
to be defined in the common file with getter functions.
Refactor the users of the config variables above to use
the getter functions.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Move common functions between Unix and Windows to eal_common_debug.c.
Those functions are rte_exit, __rte_panic and rte_dump_registers
which has the same implementation on Unix and Windows.
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
An issue has been observed where epoll file descriptor
list rebuilds every time an interrupt/alarm event is
received.
eal_intr_process_interrupts() should notify pipe fd only
if any source is removed from the source list i.e (rv > 0)
Fixes: 0c7ce182a760 ("eal: add pending interrupt callback unregister")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
EAL common timer doesn't compile under Windows.
Compilation log:
error LNK2019:
unresolved external symbol nanosleep referenced in function
rte_delay_us_sleep
error LNK2019:
unresolved external symbol get_tsc_freq referenced in function set_tsc_freq
error LNK2019:
unresolved external symbol sleep referenced in function set_tsc_freq
The reason was that some functions called POSIX functions.
The solution was to move POSIX dependent functions from common to Unix.
Signed-off-by: Fady Bader <fady@mellanox.com>
Reviewed-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Code in Linux EAL that supports dynamic memory allocation (as opposed to
static allocation used by FreeBSD) is not OS-dependent and can be reused
by Windows EAL. Move such code to a file compiled only for the OS that
require it. Keep Anatoly Burakov maintainer of extracted code.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
All supported OS create memory segment lists (MSL) and reserve VA space
for them in a nearly identical way. Move common code into EAL private
functions to reduce duplication.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Introduce OS-independent wrappers for memory management operations used
across DPDK and specifically in common code of EAL:
* rte_mem_map()
* rte_mem_unmap()
* rte_mem_page_size()
* rte_mem_lock()
Windows uses different APIs for memory mapping and reservation, while
Unices reserve memory by mapping it. Introduce EAL private functions to
support memory reservation in common code:
* eal_mem_reserve()
* eal_mem_free()
* eal_mem_set_dump()
Wrappers follow POSIX semantics limited to DPDK tasks, but their
signatures deliberately differ from POSIX ones to be more safe and
expressive. New symbols are internal. Being thin wrappers, they require
no special maintenance.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Introduce OS-independent wrappers in order to support common EAL code
on Unix and Windows:
* eal_file_open: open or create a file.
* eal_file_lock: lock or unlock an open file.
* eal_file_truncate: enforce a given size for an open file.
Implementation for Linux and FreeBSD is placed in "unix" subdirectory,
which is intended for common code between the two. These thin wrappers
require no special maintenance.
Common code supporting multi-process doesn't use the new wrappers,
because it is inherently Unix-specific and would impose excessive
requirements on the wrappers.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Initially, printf was used to indicate and error/warning resulting from
telemetry initialisation. This is now fixed to use EAL logs for
notices, and the unnecessary printf for an error is removed.
Fixes: eeb486f3ba65 ("eal: add telemetry as dependency")
Fixes: dd6275a424ac ("telemetry: fix error log output")
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The threads for listening on the telemetry sockets are control threads
and should be separated from those on the data plane. Since telemetry
cannot use the rte_ctrl_thread_create() API, as it does not depend on
EAL, we pass the ctrl thread cpu_set to telemetry init and use it
directly to ensure that telemetry cannot interfere with the data plane
threads.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
As Telemetry no longer uses rte_option, and was the only user of this
infrastructure, it can now be removed.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
This patch moves telemetry further down the build, and adds it as a
dependency for EAL. Telemetry V2 is now configured to build by default,
and the legacy support is built when the telemetry config flag is set.
Telemetry now has EAL flags, shown below:
"--telemetry" = Enables telemetry (this is default if no flags given)
"--no-telemetry" = Disables telemetry
When telemetry is enabled, it will attempt to open the new socket
version, and also the legacy support socket (this will depend on Jansson
external dependency and telemetry config flag, as before).
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
Currently, even though memory is mapped with PROT_NONE, this does not
cause it to be excluded from core dumps. This is counter-productive,
because in a lot of cases, this memory will go unused (e.g. when the
memory subsystem preallocates VA space but hasn't yet mapped physical
pages into it).
Use `madvise()` call with MADV_DONTDUMP/MADV_NOCORE to exclude the
unused memory from being dumped.
Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Commit 8a4baf06c17a ("mem: mark pages as not accessed when reserving VA")
has mapped the initialized memory with PROT_NONE, and when it's unmapped,
eal_memalloc.c should remmap the anonymous memory with PROT_NONE too.
Fixes: 8a4baf06c17a ("mem: mark pages as not accessed when reserving VA")
Cc: stable@dpdk.org
Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add the following interrupt related tracepoints.
- rte_eal_trace_intr_callback_register()
- rte_eal_trace_intr_callback_unregister()
- rte_eal_trace_intr_enable()
- rte_eal_trace_intr_disable()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the following thread related tracepoints.
- rte_eal_trace_thread_remote_launch()
- rte_eal_trace_thread_lcore_ready()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add following alarm related trace points.
- rte_eal_trace_alarm_set()
- rte_eal_trace_alarm_cancel()
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
This patch creates the following generic tracepoint for
generic tracing when there is no dedicated tracepoint is
available.
- rte_eal_trace_generic_void()
- rte_eal_trace_generic_u64()
- rte_eal_trace_generic_u32()
- rte_eal_trace_generic_u16()
- rte_eal_trace_generic_u8()
- rte_eal_trace_generic_i64()
- rte_eal_trace_generic_i32()
- rte_eal_trace_generic_i16()
- rte_eal_trace_generic_i8()
- rte_eal_trace_generic_int()
- rte_eal_trace_generic_long()
- rte_eal_trace_generic_float()
- rte_eal_trace_generic_double()
- rte_eal_trace_generic_ptr()
- rte_eal_trace_generic_str()
For example, if an application wishes to emit an int datatype,
it can call rte_eal_trace_generic_int(val) to emit the trace.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Connect the internal trace interface API to Linux EAL.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Common trace format(CTF) defines the metadata[1][2] for trace events,
This patch creates the metadata for the DPDK events in memory and
later this will be saved to trace directory on rte_trace_save()
invocation.
[1] https://diamon.org/ctf/#specification
[2] https://diamon.org/ctf/#examples
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Define eal_trace_init() and eal_trace_fini() EAL interface
functions that rte_eal_init() and rte_eal_cleanup() function can
use to initialize and finalize the trace subsystem.
eal_trace_init() function will add the following functionality if
trace is enabled through EAL command line param.
- Test for trace registration failure.
- Test for duplicate trace name registration.
- Generate UUID ver 4.
- Create a trace directory.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>