Commit Graph

205 Commits

Author SHA1 Message Date
Fidaullah Noonari
f92b9ebed0 malloc: fix storage size for some allocations
The amount of memory to allocate from the system for heap expansion
was calculated in a way that may yield one page more than needed.
This could hit the allocation limit from the system or EAL.
The allocation would fail despite enough memory being available.

In response to mail:
https://inbox.dpdk.org/dev/CAEYuUWCnRZNwxiOHEeTHw0Gy9aFJRLZtvAG9g=smuUvUEMcFXg@mail.gmail.com/

A reproducer has been provided by Dmitry, see:
https://inbox.dpdk.org/dev/20220922015212.03bfde66@sovereign/

Fixes: 07dcbfe010 ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org

Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2022-09-26 11:40:20 +02:00
David Marchand
fbd59c8ecb dev: hide device object
Make rte_device opaque for non internal users.
This will make extending this object possible without breaking the ABI.

Some applications may have been dereferencing rte_device objects, mark
this object's accessors as stable.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
5b569f2ea8 dev: provide bus specific information
For diagnostic, it may be useful to provide a description of the device
with bus specific information.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
ec5ecd7e37 dev: introduce device accessors
Prepare for making the device object opaque by adding accessors.
Update existing "external" users.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
1acb7f5474 dev: hide driver object
Make rte_driver opaque for non internal users.
This will make extending this object possible without breaking the ABI.

Introduce a new driver header and move rte_driver definition.
Update drivers and library to use the internal header.

Some applications may have been dereferencing rte_driver objects, mark
this object's accessors as stable.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
97bbdba31d dev: introduce driver accessors
Prepare for making the driver object opaque by adding accessors.
Update existing "external" users.
Internal users may still dereference a rte_driver object.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
a04322f616 bus: hide bus object
Make rte_bus opaque for non internal users.
This will make extending this object possible without breaking the ABI.

Introduce a new driver header and move rte_bus definition and helpers.
Update drivers and library to use the internal header.

Some applications may have been dereferencing rte_bus objects, mark
this object's accessors as stable.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
148c51a3de bus: introduce accessors
Add helpers to get a rte_bus object details.
This will be used externally.
Internal users may still dereference a rte_bus object.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
770ebc060e bus: move IOVA definition from header
iova enum definition does not need to be defined as part of the bus API.
Move it to rte_eal.h.
With this step, rte_eal.h does not depend on rte_bus.h and rte_dev.h.
Fix existing code that was relying on these implicit inclusions.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
82ee04a855 devargs: remove dependency on bus header
We don't need to include rte_bus.h in rte_devargs.h.
Only a forward declaration of rte_bus and an inclusion of rte_dev.h are
needed.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
David Marchand
8f1d23ece0 eal: deprecate RTE_FUNC_PTR_* macros
Those macros have no real value and are easily replaced with a simple
if() block.

Existing users have been converted using a new cocci script.
Deprecate them.

Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-09-23 16:14:34 +02:00
David Marchand
99948194a8 dev: hide debug messages in device iterator
For any bus that does not support device iteration, rte_dev_iterator_init
both returned an error code and logged an error message.
An application (like testpmd) that only wants to list devices, would have
no choice but to inspect a bus object to avoid spewing error logs.

Make those log messages debug level, and remove the check in testpmd.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-23 16:14:34 +02:00
Abdullah Sevincer
aec322ce15 eal: export coremask parsing helper
DLB2 has a need to parse a user supplied coremask as part
of an optimization that associates optimal core/resource
pairs. Therefore eal_parse_coremask has been renamed
to rte_eal_parse_coremask and exported but kept internal.

Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-09-23 12:07:49 +02:00
Dmitry Kozlyuk
72b452c5f2 eal: remove unneeded includes from a public header
Do not include <ctype.h>, <errno.h>, and <stdlib.h> from <rte_common.h>,
because they are not used by this file.
Include the needed headers directly from the files that need them.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk
347623c9c7 eal: uninline some string formatting helper
There is no reason for rte_str_to_size() to be inline.
Move the implementation out of <rte_common.h>.
Export it as a stable ABI because it always has been public.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk
107dc0664c eal: deduplicate roundup code
RTE_CACHE_LINE_ROUNDUP() implementation repeated RTE_ALIGN_MUL_CEIL().
In other places RTE_CACHE_LINE_SIZE is assumed to be a power-of-2,
so define RTE_CACHE_LINE_ROUNDUP() using RTE_ALIGN_CEIL().

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2022-09-21 15:31:03 +02:00
Dmitry Kozlyuk
1a7374c956 eal: fix side effect in some pointer arithmetic macros
RTE_PTR_SUB(ptr, x) and RTE_PTR_ALIGN_FLOOR() worked incorrectly
if "ptr" was an expression:

    uint32_t arr[3];

    RTE_PTR_SUB(arr + 1, sizeof(arr[0]));
    // expected: (uint32_t *)((uintptr_t)(arr + 1) - 4) == arr
    // actual:   (uint32_t *)((uintptr_t) arr + 1  - 4) != arr

    RTE_PTR_ALIGN_FLOOR(arr + 2, sizeof(arr[0]));
    // expected: RTE_ALIGN_FLOOR((uintptr_t)(arr + 2), 4) == &arr[2]
    // actual:   RTE_ALIGN_FLOOR((uintptr_t) arr + 2,  4) == &arr[0]

Fix the macros and extend the relevant unit test.
Convert uses of a custom test failure macro to RTE_TEST_ASSERT*().

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2022-09-21 15:31:03 +02:00
David Marchand
72206323a5 version: 22.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 23.0.
The map files are updated to the new ABI major number (23).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Special handling of removed drivers is also dropped in check-abi.sh and
a note has been added in libabigail.abignore as a reminder.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2022-07-21 12:13:48 +02:00
Harry van Haaren
6550113be6 service: fix lingering active status
This commit fixes an issue where calling rte_service_lcore_stop()
would result in a service's "active on lcore" status becoming stale.

The stale status would result in rte_service_may_be_active() always
returning "1", indicating that the service is not certainly stopped.

This is fixed by ensuring the "active on lcore" status of each service
is set to 0 when an lcore is stopped.

Fixes: e30dd31847 ("service: add mechanism for quiescing")
Fixes: 8929de043e ("service: retrieve lcore active state")

Reported-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2022-07-05 16:24:43 +02:00
Stephen Hemminger
96c3a92809 eal: promote experimental sleep function
This has been around since 2018 release.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-26 15:00:36 +02:00
Stephen Hemminger
7f7efe829b interrupts: promote some experimental functions
These are functions related to interrupts that have been
in since 20.02 release or earlier.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-26 15:00:36 +02:00
Stephen Hemminger
218f2f9749 eal: promote some lcore experimental accessors
These API's have been around for a long time and by now are fixed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-26 15:00:36 +02:00
Stephen Hemminger
025e9bcd61 log: promote some experimental macros and function
The RTE_LOG_REGISTER is not experimental, and the experimental
tag was never enforced on these.

Make rte_log_can_log a fully supported function.
It was introduced nearly 2yrs ago.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-26 14:43:19 +02:00
Stephen Hemminger
448e01f1b5 lib: document free functions
Make sure all functions which use the convention that XXX_free(NULL)
is a nop are all documented.

The wording is chosen to match the documentation of free(3).
"If ptr is NULL, no operation is performed."

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
[David: squashed with other series updates, unified wording]
2022-06-24 14:50:34 +02:00
Stephen Hemminger
cb68a56343 remove passive voice in function description
Remove extraneous phrase "This API is used to" and use
active instead of passive voice when describing a function.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
[David: for raw/ioat and dmadev parts:]
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2022-06-24 14:05:54 +02:00
Don Wallwork
42fbb8e85d eal/linux: allocate worker lcore stacks in hugepages
Add support for using hugepages for worker lcore stack memory. The
intent is to improve performance by reducing stack memory related TLB
misses and also by using memory local to the NUMA node of each lcore.

EAL option '--huge-worker-stack[=stack-size-in-kbytes]' is added to allow
the feature to be enabled at runtime. If the size is not specified,
the system pthread stack size will be used.

Signed-off-by: Don Wallwork <donw@xsightlabs.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2022-06-23 22:36:33 +02:00
Fidaullah Noonari
ce2f7d472e malloc: fix allocation of almost hugepage size
If called to allocate memory of size is between multiple of hugepage
size minus malloc_header_len and hugepage size, rte_malloc fails.

This fix replaces malloc_elem_trailer_len with malloc_elem_overhead in
try_expand_heap() to include malloc_elem_header_len when calculating
n_seg.

Bugzilla ID: 800
Fixes: 07dcbfe010 ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org

Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2022-06-23 13:40:50 +02:00
Stephen Hemminger
0efcd352e2 eal/unix: make stack dump signal safe
rte_dump_stack() needs to be usable in situations when a bug is
encountered and from signal handlers (such as SEGV).

Glibc backtrace_symbols() calls malloc which makes it
dangerous in a signal handler that is handling errors that maybe
due to memory corruption. Additionally, rte_log() is unsafe because
syslog() is not signal safe; printf() is also documented as
not being safe.

This version formats message and uses writev for each line in a manner
similar to what glibc version of backtrace_symbols_fd() does. The
FreeBSD version of backtrace_symbols_fd() is not signal safe.

Sample output:

0: ./build/app/dpdk-testpmd (rte_dump_stack+0x2b) [560a6e9c002b]
1: ./build/app/dpdk-testpmd (main+0xad) [560a6decd5ad]
2: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xcd) [7fd43d3e27fd]
3: ./build/app/dpdk-testpmd (_start+0x2a) [560a6e83628a]

Bugzilla ID: 929

Acked-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-06-23 13:40:50 +02:00
David Marchand
11f61ea2f6 eal/x86: drop export of internal alignment macro
ALIGNMENT_MASK is only used internally.
Besides it lacks a DPDK-related prefix.
Hide it from external eyes.

Fixes: f5472703c0 ("eal: optimize aligned memcpy on x86")
Cc: stable@dpdk.org

Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-06-22 11:32:35 +02:00
Stephen Hemminger
0cd10724bf eal: provide pseudo-random floating point number
The PIE code and other applications can benefit from having a
fast way to get a random floating point value. This new function
is equivalent to drand() in the standard library.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-22 10:59:09 +02:00
Sean Morrissey
2ff3976e67 eal: remove unneeded header includes
These header includes have been flagged by the iwyu_tool
and removed.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
2022-06-21 16:46:56 +02:00
Chengwen Feng
d59a940667 trace: fix init with long file prefix
Bug scenario:
1. start testpmd:
  $ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* \
    --file-prefix=trace_autotest -- -i
2. then observed:
  EAL: eal_trace_init():93 failed to initialize trace [File exists]
  EAL: FATAL: Cannot init trace
  EAL: Cannot init trace
  EAL: Error - exiting with code: 1

The root cause it that the offset set wrong with long file-prefix and
then lead the strftime return failed.

At the same time, trace_session_name_generate() uses errno as the return
value, but the errno was not set if strftime returned zero.
A previously set errno (EEXIST or ENOENT from call to mkdir for creating
the runtime configuration directory) was returned in this case.
This is fragile and may lead to incorrect logic if errno was set
to 0 previously.
This also resulted in inaccurate prompting.
Set errno to ENOSPC if strftime return zero.

Fixes: 321dd5f8fa ("trace: add internal init and fini interface")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2022-06-21 11:11:00 +02:00
Chengwen Feng
a8f23b444d trace: fix crash when exiting
Bug scenario:
1. start testpmd:
  $ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
2. quit testpmd and then observed segment fault:
  Bye...
  Segmentation fault (core dumped)

The root cause is that rte_trace_save() and eal_trace_fini() access
the huge pages which were cleanup by rte_eal_memory_detach().

This patch moves rte_trace_save() and eal_trace_fini() before
rte_eal_memory_detach() to fix the bug.

Fixes: dfbc61a2f9 ("mem: detach memsegs on cleanup")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-06-21 11:11:00 +02:00
Stanislaw Kardach
93cba71bdc eal/riscv: fix vector header for C++
rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.

Fixes: f22e705ebf ("eal/riscv: support RISC-V architecture")

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-06-15 09:12:16 +02:00
David Marchand
e5e613f05b eal: remove unused arch-specific headers for locks
MCS lock, PF lock and Ticket lock have no arch specific implementation,
there is no need for the extra redirection in headers.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
2022-06-08 15:44:20 +02:00
Michal Mazurek
f22e705ebf eal/riscv: support RISC-V architecture
Add all necessary elements for DPDK to compile and run EAL on SiFive
Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc)
core complex.

This includes:

- EAL library implementation for rv64imafdc ISA.
- meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define
  is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in the
  U74 core.

Compilation was tested on Ubuntu and Arch Linux using riscv64 toolchain.
Clang compilation currently not supported due to issues with missing
relocation relaxation.

Two rte_rdtsc() schemes are provided: stable low-resolution using rdtime
(default) and unstable high-resolution using rdcycle. User can override
the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 during compile time of
both DPDK and the application. The reasoning for this is as follows.
The RISC-V ISA mandates that clock read by rdtime has to be of constant
period and synchronized between all hardware threads within 1 tick
(chapter 10.1 in version 20191213 of RISC-V spec).
However this clock may not be of high-enough frequency for dataplane
uses. I.e. on HiFive Unmatched (FU740) it is 1MHz.
There is a high-resolution alternative in form of rdcycle which is
clocked at the core clock frequency. The drawbacks are that it may be
disabled during sleep (WFI), its frequency might change due to DVFS and
it is core-local and therefore cannot be used as a wall-clock. It can
however be used for micro-benchmarking user applications, similarly to
Aarch64's PMCCNTR PMU counter.

The platform is currently marked as linux-only because rte_cycles
implementation uses the timebase-frequency device-tree node read through
the proc file system. Such approach was chosen because Linux kernel
depends on the presence of this device-tree node.

The i40e PMD driver is disabled on RISC-V as the rv64gc ISA has no vector
operations.

The compilation of following modules has been disabled by this commit
and will be re-enabled in later commits as fixes are introduced:
net/ixgbe, net/memif, net/tap, example/l3fwd.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2022-06-08 11:26:20 +02:00
Tyler Retzlaff
ca04c78b62 eal: get/set thread priority per thread identifier
Add functions for setting and getting the priority of a thread.
Priorities on multiple platforms are similarly determined by a priority
value and a priority class/policy.

Currently in DPDK most threads operate at the OS-default priority level
but there are cases when increasing the priority is useful. For
example, high performance applications may require elevated priority
levels.

For these reasons, EAL will expose two priority levels which are named
suggestively "normal" and "realtime_critical" and are computed as
follows:

  On Linux, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * policy SCHED_OTHER
      * priority value:   (sched_get_priority_min(SCHED_OTHER) +
			   sched_get_priority_max(SCHED_OTHER))/2;
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * policy SCHED_RR
      * priority value: sched_get_priority_max(SCHED_RR);

  On Windows, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * class NORMAL_PRIORITY_CLASS
      * priority THREAD_PRIORITY_NORMAL
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * class REALTIME_PRIORITY_CLASS (when running with privileges)
      * class HIGH_PRIORITY_CLASS (when running without privileges)
      * priority THREAD_PRIORITY_TIME_CRITICAL

Note that on Linux the resulting priority value will be 0, in
accordance to the documentation that mention the value should be 0 for
SCHED_OTHER policy.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2022-06-07 13:33:14 +02:00
Mattias Rönnblom
0bee070907 eal: add seqlock
A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, suitable for
data structures shared across many cores and which are updated
relatively infrequently.

A seqlock permits multiple parallel readers. A spinlock is used to
serialize writers. In cases where there is only a single writer, or
writer-writer synchronization is done by some external means, the
"raw" sequence counter type (and accompanying rte_seqcount_*()
functions) may be used instead.

To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).

One way to think about seqlocks is that they provide means to perform
atomic operations on data objects larger than what the native atomic
machine instructions allow for.

DPDK seqlocks (and the underlying sequence counters) are not
preemption safe on the writer side. A thread preemption affects
performance, not correctness.

A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.

A reader will
  1. Load the sequence number (sn).
  2. Load, in arbitrary order, the seqlock-protected data.
  3. Load the sn again.
  4. Check if the first and second sn are equal, and even numbered.
     If they are not, discard the loaded data, and restart from 1.

The first three steps need to be ordered using suitable memory fences.

A writer will
  1. Take the spinlock, to serialize writer access.
  2. Load the sn.
  3. Store the original sn + 1 as the new sn.
  4. Perform load and stores to the seqlock-protected data.
  5. Store the original sn + 2 as the new sn.
  6. Release the spinlock.

Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.

The sn loads and stores must be atomic, but the data loads and stores
need not be.

The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.

For more information on seqlocks, see
https://en.wikipedia.org/wiki/Seqlock

Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
2022-06-07 13:33:14 +02:00
Duncan Bellamy
0615dd2aa1 eal/ppc: fix compilation for musl
musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase()

Signed-off-by: Duncan Bellamy <dunk@denkimushi.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2022-06-07 13:33:14 +02:00
Thomas Monjalon
b251bb7630 eal/ppc: undefine AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector".

The DPDK header file rte_altivec.h takes care of undefining "vector",
so the applications and dependencies are free to define the name "vector".

This is a compatibility breakage for applications which were using
the keyword "vector" for its AltiVec meaning.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2022-06-01 16:51:53 +02:00
Thomas Monjalon
64fcadeac0 avoid AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector",
so we will be able to #undef vector after including AltiVec header.

Later it may become possible to #undef vector in rte_altivec.h
with a compatibility breakage.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2022-05-25 11:49:39 +02:00
David Marchand
2f51bc9c27 eal/freebsd: fix use of newer cpuset macros
FreeBSD has updated its CPU macros to align more with the definitions
used on Linux[1]. Unfortunately, while this makes compatibility better
in future, it means we need to have both legacy and newer definition
support. Use a meson check to determine which set of macros are used.

[1] https://cgit.freebsd.org/src/commit/?id=e2650af157bc

Bugzilla ID: 1014
Fixes: c3568ea376 ("eal: restrict control threads to startup CPU affinity")
Fixes: b6be16acfe ("eal: fix control thread affinity with --lcores")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Daxue Gao <daxuex.gao@intel.com>
2022-05-24 12:48:11 +02:00
David Marchand
26d734b5d2 devargs: fix leak on hotplug failure
Caught by ASan, if a secondary process tried to attach a device with an
incorrect driver name, devargs was leaked.

Fixes: 64051bb1f1 ("devargs: unify scratch buffer storage")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-05-19 18:45:20 +02:00
Luc Pelletier
00901e4d1a eal/x86: fix unaligned access for small memcpy
Calls to rte_memcpy for 1 < n < 16 could result in unaligned
loads/stores, which is undefined behaviour according to the C
standard, and strict aliasing violations.

The code was changed to use a packed structure that allows aliasing
(using the __may_alias__ attribute) to perform the load/store
operations. This results in code that has the same performance as the
original code and that is also C standards-compliant.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2022-05-19 18:19:30 +02:00
Tyler Retzlaff
b70a9b7886 eal: get/set thread affinity per thread identifier
Implement functions for getting/setting thread affinity.
Threads can be pinned to specific cores by setting their
affinity attribute.

Windows error codes are translated to errno-style error codes.
The possible return values are chosen so that we have as
much semantic compatibility between platforms as possible.

note: convert_cpuset_to_affinity has a limitation that all cpus of
the set belong to the same processor group.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2022-05-19 16:45:07 +02:00
Tyler Retzlaff
56539289b8 eal: provide current thread identifier
Provide a portable type-safe thread identifier.
Provide rte_thread_self for obtaining current thread identifier.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
2022-05-19 16:45:07 +02:00
David Marchand
48ff13ef37 test/mem: disable ASan when accessing unallocated memory
As described in bugzilla, ASan reports accesses to all memory segment as
invalid, since those parts have not been allocated with rte_malloc.
Move __rte_no_asan to rte_common.h and disable ASan on a part of the test.

Bugzilla ID: 880
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-05-11 14:05:30 +02:00
Tianhao Chai
28c5d60072 eal: fix C++ include for device event and DMA
Currently the "extern C" section ends right before rte_dev_dma_unmap
and other DMA function declarations, causing some C++ compilers to
produce C++ mangled symbols to rte_dev_dma_unmap instead of C symbols.
This leads to build failures later when linking a final executable
against this object.

Fixes: a753e53d51 ("eal: add device event monitor framework")
Cc: stable@dpdk.org

Signed-off-by: Tianhao Chai <cth451@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-05-05 11:48:33 +02:00
Anatoly Burakov
4d8bdd8b56 malloc: fix ASan handling for unmapped memory
Currently, when we free previously allocated memory, we mark the area as
"freed" for ASan purposes (flag 0xfd). However, sometimes, freeing a
malloc element will cause pages to be unmapped from memory and re-backed
with anonymous memory again. This may cause ASan's "use-after-free"
error down the line, because the allocator will try to write into
memory areas recently marked as "freed".

To fix this, we need to mark the unmapped memory area as "available",
and fixup surrounding malloc element header/trailers to enable later
malloc routines to safely write into new malloc elements' headers or
trailers.

Bugzilla ID: 994
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-05-05 10:13:43 +02:00
Deepak Khandelwal
90bf3f89ed mem: skip attaching external memory in secondary process
Currently, EAL init in secondary processes will attach all fbarrays
in the memconfig to have access to the primary process's page tables.
However, fbarrays corresponding to external memory segments should
not be attached at initialization, because this will happen as part
of `rte_extmem_attach` [1] or `rte_malloc_heap_memory_attach` [2] calls.

1: https://doc.dpdk.org/api/rte__memory_8h.html#a2796da68de6825f8edf53759f8e4d230
2: https://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3

Fixes: ff3619d624 ("malloc: allow attaching to external memory chunks")
Cc: stable@dpdk.org

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Deepak Khandelwal <deepak.khandelwal@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-04-28 13:44:13 +02:00