Extend test_ring_autotest with new test-cases for RTS/HTS sync modes.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce new test case to test MT peek API.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
For rings with producer/consumer in RTE_RING_SYNC_ST, RTE_RING_SYNC_MT_HTS
mode, provide an ability to split enqueue/dequeue operation
into two phases:
- enqueue/dequeue start
- enqueue/dequeue finish
That allows user to inspect objects in the ring without removing
them from it (aka MT safe peek).
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce new test case to test HTS ring mode under contention.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce head/tail sync mode for MT ring synchronization.
In that mode enqueue/dequeue operation is fully serialized:
only one thread at a time is allowed to perform given op.
Suppose to reduce stall times in case when ring is used on
overcommitted cpus (multiple active threads on the same cpu).
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce new test case to test RTS ring mode under contention.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce relaxed tail sync (RTS) mode for MT ring synchronization.
Aim to reduce stall times in case when ring is used on
overcommited cpus (multiple active threads on the same cpu).
The main difference from original MP/MC algorithm is that
tail value is increased not by every thread that finished enqueue/dequeue,
but only by the last one.
That allows threads to avoid spinning on ring tail value,
leaving actual tail value change to the last thread in the update queue.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
To make these preparations two main things are done:
- Change from *single* to *sync_type* to allow different
synchronisation schemes to be applied.
Mark *single* as deprecated in comments.
Add new functions to allow user to query ring sync types.
Replace direct access to *single* with appropriate function call.
- Move actual rte_ring and related structures definitions into a
separate file: <rte_ring_core.h>. It allows to refer contents
of <rte_ring_elem.h> from <rte_ring.h> without introducing a
circular dependency.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce stress test for ring enqueue/dequeue operations.
Performs the following pattern on each slave worker:
dequeue/read-write data from the dequeued objects/enqueue.
Serves as both functional and performance test of ring
enqueue/dequeue operations under high contention
(for both over committed and non-over committed scenarios).
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Add libatomic as a global dependency when compiling for 32-bit using
clang. As we need libatomic for 64-bit atomic ops.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When running make with CONFIG_RTE_BUILD_SHARED_LIB=n,
no shared library is built.
In this case, no need to run ABI check.
With meson, both shared and static libraries are always built.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: David Marchand <david.marchand@redhat.com>
Static builds can take a lot of space, so reduce the number of examples
built when testing those static builds.
As makefile-based build is close to end of life, completely skip examples
in case of static linkage with make.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Static builds can take a lot of space, so reduce the number of examples
built when doing those static builds.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Build fails because '__rte_unused' macro not defined in file, error
produced by 'i686-native-linux-gcc config' but it seems generic issue.
Build error:
.../examples/vm_power_manager/oob_monitor_nop.c:11:13:
error: expected ‘;’ before ‘static’
11 | __rte_unused static float
| ^~~~~~~
| ;
.../examples/vm_power_manager/oob_monitor_nop.c:12:14:
error: unknown type name ‘__rte_unused’
12 | apply_policy(__rte_unused int core)
| ^~~~~~~~~~~~
.../examples/vm_power_manager/oob_monitor_nop.c:18:21:
error: unknown type name ‘__rte_unused’
18 | add_core_to_monitor(__rte_unused int core)
| ^~~~~~~~~~~~
.../examples/vm_power_manager/oob_monitor_nop.c:24:26:
error: unknown type name ‘__rte_unused’
24 | remove_core_from_monitor(__rte_unused int core)
| ^~~~~~~~~~~~
Including 'rte_common.h' header which defines the macro for fix.
Fixes: f2fc83b40f06 ("replace unused attributes")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Compilation is broken on ppc:
CC otx2_rx.o
In file included from .../drivers/net/octeontx2/otx2_rx.c:5:0:
.../builds/ppc_64-power8-linux-gcc/include/rte_vect.h:29:17:
error: expected declaration specifiers or ‘...’ before numeric constant
} __rte_aligned(16) rte_xmm_t;
^~
compilation terminated due to -Wfatal-errors.
Fixes: f35e5b3e07b2 ("replace alignment attributes")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Thomas Monjalon <thomas@monjalon.net>
The PMDs bnx2x and nfp have a separate column for VF.
Such separation is unneeded because the features are the same.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The virtual PMDs bonding, KNI, null, ring, softnic and vdev_netvsc
have no real feature to advertise so they can be removed
from the (too) big matrix of ethdev features.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
It seems sphinx >= 2.0 is inserting a <p> tag in each table cell.
The feature table (matrix) style needs to be updated to avoid
cells being too big.
The margin, padding and line height are overridden.
The font size in percentage is replaced with an equivalent pixel size.
The border is explicit because it disappeared for th.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
When a log type is registered, the level can be picked
by matching saved options.
The check of fnmatch globbing result was reversed.
The same bug was already fixed in a similar function.
This one is acting in log type register function.
Note: this function rte_log_register_type_and_pick_level()
is not used a lot and could be merged with rte_log_register().
Fixes: 6ff0f81d0ef7 ("log: fix pattern matching")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The keyword __attribute__ will emit a warning,
because it is preferred to use or define a common __rte macro.
The centralized macros may help to control or workaround some compilers.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The new macro __rte_noreturn, for compiler hinting,
is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The new macro __rte_cold, for compiler hinting,
is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
The new macro __rte_used, forcing symbol to be generated,
is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a common macro __rte_unused, avoiding warnings,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a macro __rte_noinline, preventing function to be inlined,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a macro __rte_always_inline, forcing functions to be inlined,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a common macro __rte_packed for packing structs,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a common macro __rte_aligned for alignment,
which is now used where appropriate for consistency.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
The keyword alignas can be replaced with __rte_aligned macro
for consistency and allow compilers compatibility control.
The macro __rte_cache_aligned is a shortcut including __rte_aligned
and RTE_CACHE_LINE_SIZE constant.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The macros RTE_MARKER and __rte_cache_aligned can be used
for consistency for describing MEMIF_CACHELINE_ALIGN_MARK.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There is a macro RTE_FINI for destructors,
which is now used where appropriate for consistency.
The destructor function mlx5_pmd_socket_uninit does not need
to be declared separately in mlx5.h.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
rte_ipsec has a dependency on rte_hash
So we need the librte_hash to be compiled before librte_ipsec.
Add the DEPDIRs to make sure this.
Fixes: 3feb23609cae ("ipsec: add SAD create/destroy implementation")
Cc: stable@dpdk.org
Reported-by: Raslan Darawsheh <rasland@mellanox.com>
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Add redundant stack variable initialization to work around
false-positive warnings in older versions of GCC.
Fixes: 1f2b99e8d9b1 ("event/dsw: improve migration mechanism")
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Currently, in the case to use bitmap as resource allocator, after
bitmap creation, all the bitmap bits should be set to indicate the
bit available. Every time when allocate one bit, search for the set
bits and clear it to make it in use.
Add a new rte_bitmap_init_with_all_set() function to have a quick
fill up the bitmap bits.
Comparing with the case create the bitmap as empty and set the bitmap
one by one, the new function costs less cycles.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Meson is detecting the path /proc/sys/vm/nr_hugepages in the call to cat
in app/test/meson.build and then adding it as a build dependency.
This causes build loop if the timestamp of this file keeps changing.
It is fixed by hiding hugepage check in a shell script.
Fixes: 77784ef0fba8 ("test: allow no-huge mode for fast-tests")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The function add_stylesheet() is deprecated since sphinx 1.8.
It will be removed in sphinx 4.0.
It is replaced by add_css_file().
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Use c11 atomics with RELAXED ordering instead of rte_atomic ops which
enforce unnessary barriers on arm64.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Current l2fwd-event application statically configures adjacent ports as
destination ports for forwarding the traffic.
Add a config option to pass the forwarding port pair mapping which allows
the user to configure forwarding port mapping.
If no config argument is specified, destination port map is not
changed and traffic gets forwarded with existing mapping.
To align port/queue configuration of each lcore with destination port
map, port/queue configuration of each lcore gets modified when config
option is specified.
Ex: ./l2fwd-event -c 0xff -- -p 0x3f -q 2 --config="(0,3)(1,4)(2,5)"
With above config option, traffic received from portid = 0 gets forwarded
to port = 3 and vice versa, similarly traffic gets forwarded on other port
pairs (1,4) and (2,5).
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Andrzej Ostruszka <aostruszka@marvell.com>
DSW keeps an internal port load estimate, used by the load balancing
mechanism. As a side effect, it keeps track of the total number of
busy cycles since startup. This metric is indirectly exposed in the
form of DSW xstats' "port_<n>_event_proc_latency", which is the total
number of busy cycles divided by the total number of events processed
on a particular port.
An external application can take (event_latency * dequeued) to go back
to busy_cycles. One reason for doing this is to measure the port's
load during a longer time period, without resorting to sampling
"port_<n>_load". However, as the number dequeued events grows, a
rounding error in event_latency renders the application-calculated
busy_cycles inaccurate.
Thus, it makes sense to directly expose the number of busy cycles as a
DSW xstats, even though it might seem redundant.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
On dequeue, polling the control ring once is enough.
Fixes: f6257b22e767 ("event/dsw: add load balancing")
Cc: stable@dpdk.org
Suggested-by: Ola Liljedahl <ola.liljedahl@arm.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
DSW limits the rate of migrations on a per-port basis. Hence, as the
number of cores grows, so does the total migration capacity.
In high core-count systems, this allows for a situation where flows
are migrated to a lightly loaded port which recently already received
a number of new flows (from other ports). The processing load
generated by these new flows may not yet be reflected in the lightly
loaded port's load estimate. The result is that the previously lightly
loaded port is now overloaded.
This patch adds a rough estimate of the size of the inbound migrations
to a particular port, which can be factored into the migration logic,
avoiding the above problem.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Allowing moving multiple flows in one migration transaction, to
rebalance load more quickly.
Introduce a threshold to avoid migrating flows between ports with very
similar load.
Simplify logic for selecting which flow to migrate. The aim is now to
move flows in such a way that the receiving port is as lightly-loaded
as possible (after receiving the flow), while still migrating enough
flows from the source port to reduce its load. This is essentially how
legacy strategy work as well, but the code is more readable.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
To allow visualization of migrations, track the number flow
immigrations in "port_<N>_immigrations". The "port_<N>_migrations"
retains legacy semantics, but is renamed "port_<N>_emigrations".
Expose the number of events currently undergoing processing
(i.e. pending releases) at a particular port.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Reduce the maximum number of DSW flows from 32k to 8k, to be able
rebalance load faster.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>