This patch adds the implementation of the 128-bit atomic compare
exchange API on aarch64. Using 64-bit 'ldxp/stxp' instructions
can perform this operation. Moreover, on the LSE atomic extension
accelerated platforms, it is implemented by 'casp' instructions for
better performance.
Since the '__ARM_FEATURE_ATOMICS' flag only supports GCC-9, this
patch adds a new config flag 'RTE_ARM_FEATURE_ATOMICS' to enable
the 'cas' version on older version compilers.
For octeontx2, we make sure that the lse (and other) extensions are
enabled even if the compiler does not know of the octeontx2 target
cpu.
Since direct x0 register used in the code and cas_op_name() and
rte_atomic128_cmp_exchange() is inline function, based on parent
function load, it may corrupt x0 register aka break aarch64 ABI.
Define CAS operations as rte_noinline functions to avoid an ABI
break [1].
1: https://git.dpdk.org/dpdk/commit/?id=5b40ec6b9662
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This ensures secondary processes never have to calculate the TSC rate
themselves, which can be noticeable in VMs that don't have access to
arch-specific detection mechanism (such as CPUID leaf 0x15 or MSR 0xCE
on x86).
Since rte_mem_config is now internal to the EAL library, we can add
tsc_hz without ABI breakage concerns.
Reduces rte_eal_init() execution time in a secondary process from 165ms
to 66ms on my test system.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Thread unregister returns success while unregister not been performed.
This is due to incorrect thread registration status check.
Fix this issue by correcting bitmap check.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
For a valid service, the core mask of the service
is checked against the current core and the corresponding
entry in the active_on_lcore array is set or reset.
Upto 8 cores share the same cache line for their
service active_on_lcore array entries since each entry is a uint8_t.
Some number of these entries also share the cache line with
the internal_flags member of struct rte_service_spec_impl,
hence this false sharing also makes the service_valid() check
expensive.
Eliminate false sharing by moving the active_on_lcore array to
a per-core data structure. The array is now indexed by service id.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This code was added 7+ years ago in
commit fb022b85bae4 ("timer: check TSC reliability")
presumably when variant TSCs were still somewhat common.
But this code doesn't do anything except print a warning,
and the warning doesn't give any kind of advice to the user,
so let's just remove it.
While the warning has no functional meaning, the /proc/cpuinfo
parsing consumes a non-trivial amount of time which is especially
noticeable in secondary processes.
On my test system, it consumes 21ms out of the 66ms total execution
time for rte_eal_init() in a secondary process.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The rte_atomic64_exchange operation for ppc_64 incorrectly linked
back to a 32 bit generic operation (__atomic_exchange_4) rather than
the 64 bit generic operation (__atomic_exchange_8). As a result,
applications that used rte_eth_link_get_nowait() would only receive
the link speed, they would not receive the link state, link duplex,
or link autoneg properties.
Fixes: ff2863570fcc ("eal: introduce atomic exchange operation")
Cc: stable@dpdk.org
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This is a commonly used operation that surprisingly the
DPDK has not supported. The new rte_pktmbuf_copy does a
deep copy of packet. This is a complete copy including
meta-data.
It handles the case where the source mbuf comes from a pool
with larger data area than the destination pool. The routine
also has options for skipping data, or truncating at a fixed
length.
This patch also introduces internal inline to copy the
metadata fields of mbuf.
Add a test for this new function, based of the clone tests.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Cloning mbufs requires allocations and iteration
and therefore should not be an inline.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This copy part of this function is too big to be put inline.
The places it is used are only in special exception paths
where a highly fragmented mbuf arrives at a device that can't handle it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The test for cloning changed mbuf would generate an mbuf whose length
and segments count were invalid.
This would cause a crash if test was run with mbuf debugging enabled.
Fixes: 4ccd2bb3a9e2 ("app/test: enhance mbuf refcnt check")
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
No functional change.
Clarify the populate function to make future changes easier to
understand.
Rename the variables:
- to avoid negation in the name
- to have more understandable names
Remove useless variable (no_pageshift is equivalent to pg_sz == 0).
Remove duplicate affectation of "external" variable.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch adds support to allow users enable/disable allmulticast mode for
kni interface.
This requirement comes from bugzilla 312, more details can refer to:
https://bugs.dpdk.org/show_bug.cgi?id=312
Bugzilla ID: 312
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
EAL should always use rte_log instead of putting errors to
stderr (which maybe redirected to /dev/null in a daemon).
Also checks for null before rte_free are unnecessary.
Minor code consistency improvements.
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Have rte_eal_config_reattach clean up the mapped address which is a valid
address but not the one intended.
Coverity issue: 343439
Fixes: 4e8854ae89fa ("eal: do not panic on shared memory init")
Fixes: b149a7064261 ("eal/freebsd: add config reattach in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The code checks both rte_mp_request_sync() return code and that the number
of messages in the reply equals 1. If rte_mp_request_sync() succeeds but
there was more than one message, those messages would get leaked.
Found via code review by Anatoly Burakov of patches that used the vhost
code as a template for using rte_mp_request_sync().
Fixes: 83a73c5fef66 ("vfio: use generic multi-process channel")
Cc: stable@dpdk.org
Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
A while ago telemetry was added in 57ae0ec6 and it also added as-needed
to config/meson.build. This seems no more needed these days as due to other
build changes the ordering in buildlogs is:
[...] -lrte_telemetry [...] -Wl,--no-as-needed [...]
Which means telemetry no more benefits from --no-as-needed anyway.
Overlinking problems get triggered by the meson generated pkgconfig which
will have:
[...] -Wl,--no-as-needed <somelibsusedbydpdk>
This will overlink <somelibs> and in addition anything that follows
as it also doesn't wrap back to --as-needed. So if a projects includes
dpdk libs + <other> it will also consider <other> with --no-as-needed.
Ubuntu bug: https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1841759
Fixes: 57ae0ec62620 ("build: add dependency on telemetry to apps with meson")
Cc: stable@dpdk.org
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Luca Boccassi <bluca@debian.org>
RTE_BPF_ARG_PTR_STACK is used as internal program
arg type. Rename to RTE_BPF_ARG_RESERVED to
avoid exposing internal program type.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add branch and call operations.
jump_offset_* APIs used for finding the relative offset
to jump w.r.t current eBPF program PC.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Implement XADD eBPF instruction using STADD arm64 instruction.
If the given platform does not have atomics support,
use LDXR and STXR pair for critical section instead of STADD.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add mov, add, sub, mul, div and mod arithmetic
operations for immediate and source register variants.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add prologue and epilogue as per arm64 procedure call standard.
As an optimization the generated instructions are
the function of whether eBPF program has stack and/or
CALL class.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add build infrastructure and documentation
update for arm64 JIT support.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add options to set mbuf size and max packet size which allow the user to
enable jumbo frames and Rx/Tx scatter gather.
Arrange `struct evt_options` based on ascending order of data type to
make it more readable.
Packet mbuf size can be modified by using `--mbuf_sz=N`.
Max packet size can be modified by using `--max_pkt_sz=N`.
These options are only applicable `pipeline_atq` and `pipeline_queue`
tests.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
This patch add support for testing dpaa2 eventdev self test
for basic sanity for parallel and atomic queues.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The patch adds the break in the TX function, if it is failing
to send the packets out. Previously the system was trying
infinitely to send packet out.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
This patch removes the conditional compilation for
cryptodev event support from RTE_LIBRTE_SECURITY flag.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Test vector expect only one type of scheduling as default.
The old code is provide support scheduling types instead of default.
Fixes: 13370a3877a5 ("eventdev: fix inconsistency in queue config")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The existing code uses NULL as the cipher algo
for testing crypto event adapter.
DPAA1/DPAA2 do not support NULL algo. Hence changing
it to the most common algo AES-CBC, which is supported
by all crypto drivers implementing event crypto adapter.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
The longer mempool name size is causing error in
rte_mempool_create_empty for dpaa1
ret = snprintf(mz_name, sizeof(mz_name), RTE_MEMPOOL_MZ_FORMAT, name);
This patch reduce the size of mempool name string
Fixes: 24054e3640a2 ("test/crypto: use separate session mempools")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Octeontx2 SSO co-processor allows multiple ethernet device Rx queues
connected to a single Event device queue.
Fix the Rx adapter capabilities to allow application to configure
Rx queueus in n:1 ratio to event queues by adding
`RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ` as a capability.
Fixes: 37720fc1fba8 ("event/octeontx2: add Rx adapter")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
The sw PMD implements xstats reset by having the xstat get operations
return a value to the statistic's value at the last reset. The value at the
last reset is maintained in the per-xstat reset_value field, but the PMD
was setting reset_value = current - reset_value instead of reset_value =
current.
Fixes: c1ad03df7ad5 ("event/sw: support xstats")
Cc: stable@dpdk.org
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
If HW support is available, service core shall not come
into play by default. This shall be for both FWD/NEW modes.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Mismatch in algo or sec capability can cause session to fail.
This patch handle it and return error timely.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
The routine mlx5dv_query_devx_port() was called directly
instead of using the mlx5 glue thunk.
Fixes: d5c06b1b10ae ("net/mlx5: query vport index match mode and parameters")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
In LAG configuration the devices in the same switch domain
might be spawned on the base of different PCI devices, so
we should check all devices backed by mlx5 PMD whether they
belong to specified switch domain. When the new devices are
being created it is not possible to detect whether the
sibling devices created in the current probe() loop belong
to the driver, driver field is not filled yet (it will be
done on returned success of current probe()). This patch
updates the device scanning, allowing extra match on
current backing PCI device, is being used to create siblings.
Fixes: f7e95215ac7c ("net/mlx5: extend switch domain searching range")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The hardware may have limitations on maximal amount of
supported Tx descriptors building blocks (WQEBB). Application
requires the Tx queue must accept the specified amount of packets.
If inline data feature is engaged the packet may require more WQEBBs
and overall amount of blocks may exceed the hardware capabilities.
Application has to make a trade-off between Tx queue size and maximal
data inline size.
In case if the inline settings are not requested explicitly with
devarg keys the default values are used. This patch adjusts the
applied default values if large Tx queue size is requested and
default inline settings can not be satisfied due to hardware
limitations.
The explicitly requested inline setting may be aligned (enlarging
only) by configurations routines to provide better WQEBB filling,
this implicit alignment is the subject for adjustment either.
The warning message is emitted to the log if adjustment happens.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
When rules are being inserted from multiple cores, there are several
race conditions during rte_flow operations.
For example, when inserting rules from 2 cores simultaneously, both
the cores try to fetch a free available filter entry and they both
end up fetching the same entry. Both of them start overwriting the
same filter entry before sending to firmware, which results in wrong
rule being inserted to hardware.
Fix the races by adding spinlock to serialize the rte_flow operations.
Fixes: ee61f5113b17 ("net/cxgbe: parse and validate flows")
Fixes: 9eb2c9a48072 ("net/cxgbe: implement flow create operation")
Fixes: da23bc9d33f4 ("net/cxgbe: implement flow destroy operation")
Fixes: 8d3c12e19368 ("net/cxgbe: implement flow query operation")
Fixes: 86910379d335 ("net/cxgbe: implement flow flush operation")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Scattered receive is supported but not included in receive offload
capabilities. Fix by adding it and including in scattered receive
calculation.
Fixes: 9c1507d96ab8 ("net/bnxt: switch to the new offload API")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Use first receive queue assigned to VNIC as the default receive queue
when configuring Thor VNICs. This is necessary e.g. in order for flow
redirection to a specific receive queue to work correctly.
Fixes: f8168ca0e690 ("net/bnxt: support thor controller")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The required number of statistics contexts is computed as the sum
of the number of receive and transmit rings plus one for the async
completion ring. A statistics context is not actually required for
the async completion ring, so remove it from the calculation.
Fixes: bd0a14c99f65 ("net/bnxt: use dedicated CPR for async events")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Thor queue scaling is currently limited by the number of NQs that
can be allocated. Fix by using a common NQ for all receive/transmit
rings instead of allocating a separate NQ for each ring.
Fixes: f8168ca0e690 ("net/bnxt: support thor controller")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Class of Service (CoS) is a way to manage multiple types of
traffic over a network to offer different types of services
to applications. CoS classification (priority to cosqueue) is
determined by the user and configured through the PF driver.
DPDK driver queries this configuration and maps the cos queue
ids to different VNICs. This patch adds this support.
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Santoshkumar Karanappa Rastapur <santosh.rastapur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>