Commit Graph

172 Commits

Author SHA1 Message Date
Stanislaw Kardach
93cba71bdc eal/riscv: fix vector header for C++
rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.

Fixes: f22e705ebf ("eal/riscv: support RISC-V architecture")

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-06-15 09:12:16 +02:00
David Marchand
e5e613f05b eal: remove unused arch-specific headers for locks
MCS lock, PF lock and Ticket lock have no arch specific implementation,
there is no need for the extra redirection in headers.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
2022-06-08 15:44:20 +02:00
Michal Mazurek
f22e705ebf eal/riscv: support RISC-V architecture
Add all necessary elements for DPDK to compile and run EAL on SiFive
Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc)
core complex.

This includes:

- EAL library implementation for rv64imafdc ISA.
- meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define
  is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in the
  U74 core.

Compilation was tested on Ubuntu and Arch Linux using riscv64 toolchain.
Clang compilation currently not supported due to issues with missing
relocation relaxation.

Two rte_rdtsc() schemes are provided: stable low-resolution using rdtime
(default) and unstable high-resolution using rdcycle. User can override
the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 during compile time of
both DPDK and the application. The reasoning for this is as follows.
The RISC-V ISA mandates that clock read by rdtime has to be of constant
period and synchronized between all hardware threads within 1 tick
(chapter 10.1 in version 20191213 of RISC-V spec).
However this clock may not be of high-enough frequency for dataplane
uses. I.e. on HiFive Unmatched (FU740) it is 1MHz.
There is a high-resolution alternative in form of rdcycle which is
clocked at the core clock frequency. The drawbacks are that it may be
disabled during sleep (WFI), its frequency might change due to DVFS and
it is core-local and therefore cannot be used as a wall-clock. It can
however be used for micro-benchmarking user applications, similarly to
Aarch64's PMCCNTR PMU counter.

The platform is currently marked as linux-only because rte_cycles
implementation uses the timebase-frequency device-tree node read through
the proc file system. Such approach was chosen because Linux kernel
depends on the presence of this device-tree node.

The i40e PMD driver is disabled on RISC-V as the rv64gc ISA has no vector
operations.

The compilation of following modules has been disabled by this commit
and will be re-enabled in later commits as fixes are introduced:
net/ixgbe, net/memif, net/tap, example/l3fwd.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2022-06-08 11:26:20 +02:00
Tyler Retzlaff
ca04c78b62 eal: get/set thread priority per thread identifier
Add functions for setting and getting the priority of a thread.
Priorities on multiple platforms are similarly determined by a priority
value and a priority class/policy.

Currently in DPDK most threads operate at the OS-default priority level
but there are cases when increasing the priority is useful. For
example, high performance applications may require elevated priority
levels.

For these reasons, EAL will expose two priority levels which are named
suggestively "normal" and "realtime_critical" and are computed as
follows:

  On Linux, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * policy SCHED_OTHER
      * priority value:   (sched_get_priority_min(SCHED_OTHER) +
			   sched_get_priority_max(SCHED_OTHER))/2;
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * policy SCHED_RR
      * priority value: sched_get_priority_max(SCHED_RR);

  On Windows, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * class NORMAL_PRIORITY_CLASS
      * priority THREAD_PRIORITY_NORMAL
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * class REALTIME_PRIORITY_CLASS (when running with privileges)
      * class HIGH_PRIORITY_CLASS (when running without privileges)
      * priority THREAD_PRIORITY_TIME_CRITICAL

Note that on Linux the resulting priority value will be 0, in
accordance to the documentation that mention the value should be 0 for
SCHED_OTHER policy.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-07 13:33:14 +02:00
Mattias Rönnblom
0bee070907 eal: add seqlock
A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, suitable for
data structures shared across many cores and which are updated
relatively infrequently.

A seqlock permits multiple parallel readers. A spinlock is used to
serialize writers. In cases where there is only a single writer, or
writer-writer synchronization is done by some external means, the
"raw" sequence counter type (and accompanying rte_seqcount_*()
functions) may be used instead.

To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).

One way to think about seqlocks is that they provide means to perform
atomic operations on data objects larger than what the native atomic
machine instructions allow for.

DPDK seqlocks (and the underlying sequence counters) are not
preemption safe on the writer side. A thread preemption affects
performance, not correctness.

A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.

A reader will
  1. Load the sequence number (sn).
  2. Load, in arbitrary order, the seqlock-protected data.
  3. Load the sn again.
  4. Check if the first and second sn are equal, and even numbered.
     If they are not, discard the loaded data, and restart from 1.

The first three steps need to be ordered using suitable memory fences.

A writer will
  1. Take the spinlock, to serialize writer access.
  2. Load the sn.
  3. Store the original sn + 1 as the new sn.
  4. Perform load and stores to the seqlock-protected data.
  5. Store the original sn + 2 as the new sn.
  6. Release the spinlock.

Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.

The sn loads and stores must be atomic, but the data loads and stores
need not be.

The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.

For more information on seqlocks, see
https://en.wikipedia.org/wiki/Seqlock

Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
2022-06-07 13:33:14 +02:00
Duncan Bellamy
0615dd2aa1 eal/ppc: fix compilation for musl
musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase()

Signed-off-by: Duncan Bellamy <dunk@denkimushi.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2022-06-07 13:33:14 +02:00
Thomas Monjalon
b251bb7630 eal/ppc: undefine AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector".

The DPDK header file rte_altivec.h takes care of undefining "vector",
so the applications and dependencies are free to define the name "vector".

This is a compatibility breakage for applications which were using
the keyword "vector" for its AltiVec meaning.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-01 16:51:53 +02:00
Thomas Monjalon
64fcadeac0 avoid AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector",
so we will be able to #undef vector after including AltiVec header.

Later it may become possible to #undef vector in rte_altivec.h
with a compatibility breakage.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2022-05-25 11:49:39 +02:00
David Marchand
2f51bc9c27 eal/freebsd: fix use of newer cpuset macros
FreeBSD has updated its CPU macros to align more with the definitions
used on Linux[1]. Unfortunately, while this makes compatibility better
in future, it means we need to have both legacy and newer definition
support. Use a meson check to determine which set of macros are used.

[1] https://cgit.freebsd.org/src/commit/?id=e2650af157bc

Bugzilla ID: 1014
Fixes: c3568ea376 ("eal: restrict control threads to startup CPU affinity")
Fixes: b6be16acfe ("eal: fix control thread affinity with --lcores")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Daxue Gao <daxuex.gao@intel.com>
2022-05-24 12:48:11 +02:00
David Marchand
26d734b5d2 devargs: fix leak on hotplug failure
Caught by ASan, if a secondary process tried to attach a device with an
incorrect driver name, devargs was leaked.

Fixes: 64051bb1f1 ("devargs: unify scratch buffer storage")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-05-19 18:45:20 +02:00
Luc Pelletier
00901e4d1a eal/x86: fix unaligned access for small memcpy
Calls to rte_memcpy for 1 < n < 16 could result in unaligned
loads/stores, which is undefined behaviour according to the C
standard, and strict aliasing violations.

The code was changed to use a packed structure that allows aliasing
(using the __may_alias__ attribute) to perform the load/store
operations. This results in code that has the same performance as the
original code and that is also C standards-compliant.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2022-05-19 18:19:30 +02:00
Tyler Retzlaff
b70a9b7886 eal: get/set thread affinity per thread identifier
Implement functions for getting/setting thread affinity.
Threads can be pinned to specific cores by setting their
affinity attribute.

Windows error codes are translated to errno-style error codes.
The possible return values are chosen so that we have as
much semantic compatibility between platforms as possible.

note: convert_cpuset_to_affinity has a limitation that all cpus of
the set belong to the same processor group.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2022-05-19 16:45:07 +02:00
Tyler Retzlaff
56539289b8 eal: provide current thread identifier
Provide a portable type-safe thread identifier.
Provide rte_thread_self for obtaining current thread identifier.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
2022-05-19 16:45:07 +02:00
David Marchand
48ff13ef37 test/mem: disable ASan when accessing unallocated memory
As described in bugzilla, ASan reports accesses to all memory segment as
invalid, since those parts have not been allocated with rte_malloc.
Move __rte_no_asan to rte_common.h and disable ASan on a part of the test.

Bugzilla ID: 880
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-05-11 14:05:30 +02:00
Tianhao Chai
28c5d60072 eal: fix C++ include for device event and DMA
Currently the "extern C" section ends right before rte_dev_dma_unmap
and other DMA function declarations, causing some C++ compilers to
produce C++ mangled symbols to rte_dev_dma_unmap instead of C symbols.
This leads to build failures later when linking a final executable
against this object.

Fixes: a753e53d51 ("eal: add device event monitor framework")
Cc: stable@dpdk.org

Signed-off-by: Tianhao Chai <cth451@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-05-05 11:48:33 +02:00
Anatoly Burakov
4d8bdd8b56 malloc: fix ASan handling for unmapped memory
Currently, when we free previously allocated memory, we mark the area as
"freed" for ASan purposes (flag 0xfd). However, sometimes, freeing a
malloc element will cause pages to be unmapped from memory and re-backed
with anonymous memory again. This may cause ASan's "use-after-free"
error down the line, because the allocator will try to write into
memory areas recently marked as "freed".

To fix this, we need to mark the unmapped memory area as "available",
and fixup surrounding malloc element header/trailers to enable later
malloc routines to safely write into new malloc elements' headers or
trailers.

Bugzilla ID: 994
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-05-05 10:13:43 +02:00
Deepak Khandelwal
90bf3f89ed mem: skip attaching external memory in secondary process
Currently, EAL init in secondary processes will attach all fbarrays
in the memconfig to have access to the primary process's page tables.
However, fbarrays corresponding to external memory segments should
not be attached at initialization, because this will happen as part
of `rte_extmem_attach` [1] or `rte_malloc_heap_memory_attach` [2] calls.

1: https://doc.dpdk.org/api/rte__memory_8h.html#a2796da68de6825f8edf53759f8e4d230
2: https://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3

Fixes: ff3619d624 ("malloc: allow attaching to external memory chunks")
Cc: stable@dpdk.org

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Deepak Khandelwal <deepak.khandelwal@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-04-28 13:44:13 +02:00
Tyler Retzlaff
8001c0ddfe eal/windows: set main lcore affinity
Add missing code to affinitize main_lcore from lcore configuration.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-04-25 09:38:15 +02:00
Mattias Rönnblom
76076342ec eal: emit warning for unused trylock return value
Mark the trylock family of spinlock functions with
__rte_warn_unused_result.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-04-14 14:38:20 +02:00
Mattias Rönnblom
eb13e558f8 eal: add macro to warn for unused function return values
This patch adds a wrapper macro __rte_warn_unused_result for the
warn_unused_result function attribute.

Marking a function __rte_warn_unused_result will make the compiler
emit a warning in case the caller does not use the function's return
value.

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-04-14 14:38:20 +02:00
David Marchand
a95d70547c eal: factorize lcore main loop
All OS implementations provide the same main loop.
Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
between main and threads and factorize the rest as common code.
Thread id are now logged as string in a common format across OS.

Note:
- this change also fixes Windows EAL: worker threads cpu affinity was
  incorrectly reported in log.

- libabigail flags this change as breaking ABI in clang builds:
  1 function with some indirect sub-type change:

  [C] 'function int rte_eal_remote_launch(int (void*)*, void*, unsigned
      int)' at eal_common_launch.c:35:1 has some indirect sub-type
      changes:
    parameter 1 of type 'int (void*)*' changed:
      in pointed to type 'function type int (void*)' at rte_launch.h:31:1:
        entity changed from 'function type int (void*)' to 'typedef
          lcore_function_t' at rte_launch.h:31:1
        type size hasn't changed

  This is being investigated on libabigail side.
  For now, we don't have much choice but to waive reports on this symbol.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-04-14 13:59:50 +02:00
David Marchand
449e7dbc7b eal: cleanup lcore ID hand-over
So far, a worker thread has been using its thread_id to discover which
lcore has been assigned to it.

On the other hand, as noted by Tyler, the pthread API does not strictly
guarantee that a new thread won't start running eal_thread_loop before
pthread_create writes to &lcore_config[xx].thread_id.

Though all OS implementations supported in DPDK (recently) ensure this
property, it is more robust to have the main thread directly pass
the worker thread lcore.

Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-04-14 13:59:50 +02:00
David Marchand
1e230b9be8 eal/windows: add missing C++ include guards
Add missing 'extern "C"' to file.

Fixes: 1db72630da ("eal/windows: do not expose private facilities")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-04-08 10:49:39 +02:00
Tyler Retzlaff
e4e983b975 eal/windows: fix data race when creating threads
eal_thread_loop() uses lcore_config[i].thread_id,
which is stored upon the return from CreateThread().
Per documentation, eal_thread_loop() can start
before CreateThread() returns and the ID is stored.

Create lcore worker threads suspended and then subsequently resume to
allow &lcore_config[i].thread_id be stored before eal_thread_loop
execution.

Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2022-03-30 19:01:52 +02:00
Haiyue Wang
f6ecec2b91 eal/x86: remove atomic header include loop
Remove the x86 top atomic header include from the architecture related
header file, since this x86 top atomic header file has included them.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
2022-03-30 18:54:14 +02:00
Bruce Richardson
29fd052dcc eal/freebsd: add missing C++ include guards
Add missing 'extern "C"' to file.

Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2022-03-15 02:06:13 +01:00
Wenxuan Wu
4e3582ab5b eal/linux: fix device monitor stop return
The ret value in rte_dev_event_monitor_stop stands for whether the
monitor has been successfully closed, and should not bind with
rte_intr_callback_unregister, so once it goes to the right exit point of
rte_dev_event_monitor, the ret value should be set to 0.

Also, the refmonitor count has been carefully evaluated, the value
change from 1 to 0, so there is no potential memory leak failure.

Fixes: 1fef6ced07 ("eal/linux: allow multiple starts of event monitor")
Cc: stable@dpdk.org

Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
2022-03-07 19:21:18 +01:00
Madhuker Mythri
356a2aa054 devargs: fix crash with uninitialized parsing
The function rte_devargs_parse() previously was safe to call with
non-initialized devargs structure as parameter.

When adding the support for the global device syntax,
this assumption was broken.
Restore it by forcing memset as part of the call itself.

Bugzilla ID: 933
Fixes: b344eb5d94 ("devargs: parse global device syntax")
Cc: stable@dpdk.org

Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
2022-02-27 19:28:59 +01:00
Steve Yang
1a287fc9c9 eal/linux: fix illegal memory access in uevent handler
'recv()' fills the 'buf', later 'strlcpy()' used to copy from this buffer.
But as coverity warns 'recv()' doesn't guarantee that 'buf' is
null-terminated, but 'strlcpy()' requires it.

Enlarge 'buf' size to 'EAL_UEV_MSG_LEN + 1' and ensure the last one can
be set to 0 when received buffer size is EAL_UEV_MSG_LEN.

CID 375864:  Memory - illegal accesses  (STRING_NULL)
Passing unterminated string "buf" to "dev_uev_parse", which expects
a null-terminated string.

Coverity issue: 375864
Fixes: 0d0f478d04 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org

Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2022-02-27 19:12:34 +01:00
Brian Dooley
d7e9c02cca eal: add missing C++ guards
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.

Fixes: af75078fec ("first public release")
Fixes: 7f3aa08639 ("eal: introduce bit operations API")
Fixes: 166a743c53 ("compat: add infrastructure to support symbol versioning")
Fixes: 8f40ee0734 ("eal/x86: get hypervisor name")
Fixes: 75583b0d1e ("eal: add keep alive monitoring")
Fixes: 88701645c9 ("eal: move interrupt type out of igb_uio")
Fixes: f04519d809 ("lib: add missing include dependencies")
Fixes: f58880682c ("trace: implement register API")
Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org

Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-02-22 14:47:49 +01:00
Sean Morrissey
30a1de105a lib: remove unneeded header includes
These header includes have been flagged by the iwyu_tool
and removed.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
2022-02-22 13:10:39 +01:00
Michael Barker
ed57d08dfd eal: ignore gcc-compat warning in clang-only macro
When compiling with clang using -Wpedantic (or -Wgcc-compat) the use of
diagnose_if kicks up a warning:

.../include/rte_interrupts.h:623:1: error: 'diagnose_if' is a clang
extension [-Werror,-Wgcc-compat]
__rte_internal
^
.../include/rte_compat.h:36:16: note: expanded from macro '__rte_internal'
__attribute__((diagnose_if(1, "Symbol is not public ABI", "error"), \

This change ignores the '-Wgcc-compat' warning in the specific location
where the warning occurs.  It is safe to do in this circumstance as the
specific macro is only defined when using the clang compiler.

Signed-off-by: Michael Barker <mikeb01@gmail.com>
2022-02-12 14:37:47 +01:00
Stephen Hemminger
06c047b680 remove unnecessary null checks
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointer so the checks here are not necessary.

Remove redundant NULL pointer checks before free functions
found by nullfree.cocci

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-12 12:07:48 +01:00
Stephen Hemminger
a0cc7be20d mem: cleanup multiprocess resources
The mp action resources in malloc should be cleaned up via
rte_eal_cleanup.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
e8dc971b63 eal: cleanup multiprocess hotplug resources
When rte_eal_cleanup is called, hotplug should unregister the
resources associated with the multi-process server.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
6412941ae8 vfio: cleanup the multiprocess sync handle
When rte_eal_cleanup is called the rte_mp_action for VFIO
should be freed.

Fixes: edf73dd330 ("ipc: handle unsupported IPC in action register")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
6e858b4d92 ipc: end multiprocess thread during cleanup
When rte_eal_cleanup is called, all control threads should exit.
For the mp thread, this best handled by closing the mp_socket
and letting the thread see that.

This also fixes potential problems where the mp_socket gets
another hard error, and the thread runs away repeating itself
by reading the same error.

Fixes: 85d6815fa6 ("eal: close multi-process socket during cleanup")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
5f4eb82f3c log: close in cleanup stage
When application calls rte_eal_cleanup on shutdown,
the DPDK log should be closed and cleaned up.

This helps reduce false reports from tools like ASAN
and valgrind that track memory leaks.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-11 19:49:22 +01:00
Yunjian Wang
5f69ebbd85 mem: check allocation in dynamic hugepage init
The function malloc() could return NULL, the return value
need to be checked.

Fixes: 6f63858e55 ("mem: prevent preallocated pages from being freed")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-02-11 08:46:21 +01:00
Pavan Nikhilesh
264ff3f250 eal/arm: inline 128-bit atomic compare exchange with GCC
GCC [1] now assigns even register pairs for CASP, the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

1: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
2022-02-11 08:44:09 +01:00
Bruce Richardson
59144f6edd eal: fix C++ include
C++ files could not include some headers because:

* "new" is a keyword in C++, so can't be a variable name
* there is no automatic casting to/from void *

Fixes: 184104fc61 ("ticketlock: introduce fair ticket based locking")
Fixes: 032a7e5499 ("trace: implement provider payload")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2022-02-10 23:02:35 +01:00
Stephen Hemminger
6e97b5fc1a eal: move Unix filesystem functions into one file
Both Linux and FreeBSD have same code for creating runtime
directory and reading sysfs files. Put them in the new lib/eal/unix
subdirectory.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-09 19:12:53 +01:00
Stephen Hemminger
1835a22f34 support systemd service convention for runtime directory
Systemd.exec supports configuring the runtime directory of a service
via RuntimeDirectory=. This creates the directory with the necessary
permissions which actual service may not have if running in container.

The change to DPDK is to look for the environment RUNTIME_DIRECTORY
first and use that in preference to the fallback alternatives.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
2022-02-09 19:12:40 +01:00
Stephen Hemminger
36514d8dfa eal: remove size for setting runtime directory
The size argument to eal_set_runtime_dir is useless and was
being used incorrectly in strlcpy. It worked only because
all callers passed PATH_MAX which is same as sizeof the destination
runtime_dir.

Note: this is an internal API so no user exposed change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-02-09 16:42:31 +01:00
Srikanth Yalavarthi
f3ca33bb20 eal: add internal function to get base address
Added an internal helper to get OS-specific EAL mapping base address

This helper can be used by the drivers to program offload / accelerator
devices, where the base address can be used as a reference address by
the accelerator to access the host memory

An address can also be represented as an offset relative to the base
address using smaller data types

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 23:59:10 +01:00
Dmitry Kozlyuk
0dff3f26d6 eal: extend --huge-unlink for hugepage file reuse
Expose Linux EAL ability to reuse existing hugepage files
via --huge-unlink=never switch.
Default behavior is unchanged, it can also be specified
using --huge-unlink=existing for consistency.
Old --huge-unlink switch is kept,
it is an alias for --huge-unlink=always.
Add a test case for the --huge-unlink=never mode.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 21:32:53 +01:00
Dmitry Kozlyuk
32b4771cd8 eal/linux: allow hugepage file reuse
Linux EAL ensured that mapped hugepages are clean
by always mapping from newly created files:
existing hugepage backing files were always removed.
In this case, the kernel clears the page to prevent data leaks,
because the mapped memory may contain leftover data
from the previous process that was using this memory.
Clearing takes the bulk of the time spent in mmap(2),
increasing EAL initialization time.

Introduce a mode to keep existing files and reuse them
in order to speed up initial memory allocation in EAL.
Hugepages mapped from such files may contain data
left by the previous process that used this memory,
so RTE_MEMSEG_FLAG_DIRTY is set for their segments.
If multiple hugepages are mapped from the same file:
1. When fallocate(2) is used, all memory mapped from this file
   is considered dirty, because it is unknown
   which parts of the file are holes.
2. When ftruncate(3) is used, memory mapped from this file
   is considered dirty unless the file is extended
   to create a new mapping, which implies clean memory.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 21:32:53 +01:00
Dmitry Kozlyuk
52d7d91ed4 eal: refactor --huge-unlink storage
In preparation to extend --huge-unlink option semantics
refactor how it is stored in the internal configuration.
It makes future changes more isolated.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 21:32:53 +01:00
Dmitry Kozlyuk
2edd037c09 mem: add dirty malloc element support
EAL malloc layer assumed all free elements content
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:
1. EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before returning into the heap.

Clearing the memory can be as slow as around 14 GiB/s.
To save doing so, memalloc layer is allowed to return dirty memory.
Such segments being marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using the new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:

a) If the joint element contains dirty parts, it is dirty:

    dirty + freed + dirty = dirty  =>  no need to clean
            freed + dirty = dirty      the freed memory

   Dirty parts may be large (e.g. initial allocation),
   so clearing them could create unpredictable slowdown.

b) If the only dirty part of the joint element
   is the freed memory, the joint element can be made clean:

    clean + freed + clean = clean  =>  freed memory
    clean + freed         = clean      must be cleared
            freed + clean = clean
            freed         = clean

   This logic naturally reproduces the old behavior
   and always applies in modes when EAL memalloc layer
   returns only clean segments.

As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 21:32:53 +01:00
Weiguo Li
0ae7844fcd eal/windows: remove useless C++ include guard
Remove the incomplete cplusplus guard in internal header.

Fixes: 6e1ed4cbbe ("eal/windows: add dirent implementation")

Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2022-02-08 17:13:50 +01:00