These header includes have been flagged by the iwyu_tool
and removed.
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Bug scenario:
1. start testpmd:
$ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* \
--file-prefix=trace_autotest -- -i
2. then observed:
EAL: eal_trace_init():93 failed to initialize trace [File exists]
EAL: FATAL: Cannot init trace
EAL: Cannot init trace
EAL: Error - exiting with code: 1
The root cause it that the offset set wrong with long file-prefix and
then lead the strftime return failed.
At the same time, trace_session_name_generate() uses errno as the return
value, but the errno was not set if strftime returned zero.
A previously set errno (EEXIST or ENOENT from call to mkdir for creating
the runtime configuration directory) was returned in this case.
This is fragile and may lead to incorrect logic if errno was set
to 0 previously.
This also resulted in inaccurate prompting.
Set errno to ENOSPC if strftime return zero.
Fixes: 321dd5f8fa ("trace: add internal init and fini interface")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Bug scenario:
1. start testpmd:
$ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
2. quit testpmd and then observed segment fault:
Bye...
Segmentation fault (core dumped)
The root cause is that rte_trace_save() and eal_trace_fini() access
the huge pages which were cleanup by rte_eal_memory_detach().
This patch moves rte_trace_save() and eal_trace_fini() before
rte_eal_memory_detach() to fix the bug.
Fixes: dfbc61a2f9 ("mem: detach memsegs on cleanup")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The P4 language requires marking a header as valid before any of the
header fields are written as opposed to after the writes are done.
Hence, the optimization of replacing the sequence of instructions to
generate a header by reading it from the table action data with a
single DMA internal instruction are reworked from "mov all + validate
-> dma" to "validate + mov all -> dma".
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fix comparison used to check against the maximum number of learner
table timeouts.
Fixes: e2ecc53582 ("pipeline: improve learner table timers")
Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Fix segmentation fault due to null pointer dereferencing inside the
"mirror" instruction when number of mirroring slots is set to 0. This
was taking place when the "mirror" instruction was used without the
mirror feature being properly configured, i.e. the API function
rte_swx_pipeline_mirroring_config was not called at initialization.
Fixes: dac0ecd909 ("pipeline: support packet mirroring")
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.
The generated assembly is identical to the code without the fix (checked
both on x86 and RISC-V).
Fixes: 406937f89f ("lpm: add scalar version of lookupx4")
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.
Fixes: f22e705ebf ("eal/riscv: support RISC-V architecture")
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The cmdline library cmdline_parse() function parses a command and
executes the action automatically too. The cmdline_valid_buffer function
also uses this function to validate commands, meaning that there is no
function to validate a command as ok without executing it.
To fix this omission, we extract the body of cmdline_parse into a new
static inline function with an extra parameter to indicate whether the
action should be performed or not. Then we create two wrappers around
that - a replacement for the existing cmdline_parse function where the
extra parameter is "true" to execute the command, and a new function
"cmdline_parse_check" which passes the parameter as "false" to perform
cmdline validation only.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Weiyuan Li <weiyuanx.li@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
A new event RTE_ETH_EVENT_RX_AVAIL_THRESH should be generated by HW
when number of available descriptors in Rx queue goes below the
threshold.
The threshold is defined as a percentage of an Rx queue size with valid
values from 0 to 99 (inclusive). Zero (default) value disables it.
There is no capability reporting for the feature. Application should
simply try to set required threshold value and handle result.
Add testpmd commands to control the threshold:
set port <port_id> rxq <rxq_id> avail_thresh <avail_thresh_num>
Signed-off-by: Spike Du <spiked@nvidia.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
MCS lock, PF lock and Ticket lock have no arch specific implementation,
there is no need for the extra redirection in headers.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
Add all necessary elements for DPDK to compile and run EAL on SiFive
Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc)
core complex.
This includes:
- EAL library implementation for rv64imafdc ISA.
- meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define
is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in the
U74 core.
Compilation was tested on Ubuntu and Arch Linux using riscv64 toolchain.
Clang compilation currently not supported due to issues with missing
relocation relaxation.
Two rte_rdtsc() schemes are provided: stable low-resolution using rdtime
(default) and unstable high-resolution using rdcycle. User can override
the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 during compile time of
both DPDK and the application. The reasoning for this is as follows.
The RISC-V ISA mandates that clock read by rdtime has to be of constant
period and synchronized between all hardware threads within 1 tick
(chapter 10.1 in version 20191213 of RISC-V spec).
However this clock may not be of high-enough frequency for dataplane
uses. I.e. on HiFive Unmatched (FU740) it is 1MHz.
There is a high-resolution alternative in form of rdcycle which is
clocked at the core clock frequency. The drawbacks are that it may be
disabled during sleep (WFI), its frequency might change due to DVFS and
it is core-local and therefore cannot be used as a wall-clock. It can
however be used for micro-benchmarking user applications, similarly to
Aarch64's PMCCNTR PMU counter.
The platform is currently marked as linux-only because rte_cycles
implementation uses the timebase-frequency device-tree node read through
the proc file system. Such approach was chosen because Linux kernel
depends on the presence of this device-tree node.
The i40e PMD driver is disabled on RISC-V as the rv64gc ISA has no vector
operations.
The compilation of following modules has been disabled by this commit
and will be re-enabled in later commits as fixes are introduced:
net/ixgbe, net/memif, net/tap, example/l3fwd.
Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
To allow other projects to easily use DPDK as a subproject, add in the
necessary dependency definitions. Slightly different definitions are
necessary for static and shared builds, since for shared builds the
drivers should not be linked in, and the internal meson dependency
objects are more complete.
To use DPDK as a subproject fallback i.e. use installed DPDK if present,
otherwise the shipped one, the following meson statement can be used:
libdpdk = dependency('libdpdk', fallback: ['dpdk', 'dpdk_dep'])
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ben Magistro <koncept1@gmail.com>
Tested-by: Ben Magistro <koncept1@gmail.com>
This patch replaces instances of zero-sized arrays i.e. those at the end
of structures with "[0]" with the more standard syntax of "[]".
Replacement was done using coccinelle script, with some revert and
cleanup of whitespace afterwards.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Add functions for setting and getting the priority of a thread.
Priorities on multiple platforms are similarly determined by a priority
value and a priority class/policy.
Currently in DPDK most threads operate at the OS-default priority level
but there are cases when increasing the priority is useful. For
example, high performance applications may require elevated priority
levels.
For these reasons, EAL will expose two priority levels which are named
suggestively "normal" and "realtime_critical" and are computed as
follows:
On Linux, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* policy SCHED_OTHER
* priority value: (sched_get_priority_min(SCHED_OTHER) +
sched_get_priority_max(SCHED_OTHER))/2;
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* policy SCHED_RR
* priority value: sched_get_priority_max(SCHED_RR);
On Windows, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* class NORMAL_PRIORITY_CLASS
* priority THREAD_PRIORITY_NORMAL
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* class REALTIME_PRIORITY_CLASS (when running with privileges)
* class HIGH_PRIORITY_CLASS (when running without privileges)
* priority THREAD_PRIORITY_TIME_CRITICAL
Note that on Linux the resulting priority value will be 0, in
accordance to the documentation that mention the value should be 0 for
SCHED_OTHER policy.
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, suitable for
data structures shared across many cores and which are updated
relatively infrequently.
A seqlock permits multiple parallel readers. A spinlock is used to
serialize writers. In cases where there is only a single writer, or
writer-writer synchronization is done by some external means, the
"raw" sequence counter type (and accompanying rte_seqcount_*()
functions) may be used instead.
To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).
One way to think about seqlocks is that they provide means to perform
atomic operations on data objects larger than what the native atomic
machine instructions allow for.
DPDK seqlocks (and the underlying sequence counters) are not
preemption safe on the writer side. A thread preemption affects
performance, not correctness.
A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.
A reader will
1. Load the sequence number (sn).
2. Load, in arbitrary order, the seqlock-protected data.
3. Load the sn again.
4. Check if the first and second sn are equal, and even numbered.
If they are not, discard the loaded data, and restart from 1.
The first three steps need to be ordered using suitable memory fences.
A writer will
1. Take the spinlock, to serialize writer access.
2. Load the sn.
3. Store the original sn + 1 as the new sn.
4. Perform load and stores to the seqlock-protected data.
5. Store the original sn + 2 as the new sn.
6. Release the spinlock.
Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.
The sn loads and stores must be atomic, but the data loads and stores
need not be.
The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.
For more information on seqlocks, see
https://en.wikipedia.org/wiki/Seqlock
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase()
Signed-off-by: Duncan Bellamy <dunk@denkimushi.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Telemetry commands are now registered through the dmadev library
for the gathering of DSA stats. The corresponding callback
functions for listing dmadevs and providing info and stats for a
specific dmadev are implemented in the dmadev library.
An example usage can be seen below:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 22.03.0-rc2", "pid": 2956551, "max_output_len": 16384}
Connected to application: "dpdk-dma"
--> /
{"/": ["/", "/dmadev/info", "/dmadev/list", "/dmadev/stats", ...]}
--> /dmadev/list
{"/dmadev/list": [0, 1]}
--> /dmadev/info,0
{"/dmadev/info": {"name": "0000:00:01.0", "nb_vchans": 1, "numa_node": 0,
"max_vchans": 1, "max_desc": 4096, "min_desc": 32, "max_sges": 0,
"capabilities": {"mem2mem": 1, "mem2dev": 0, "dev2mem": 0, ...}}}
--> /dmadev/stats,0,0
{"/dmadev/stats": {"submitted": 0, "completed": 0, "errors": 0}}
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
Tested-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Clarify that once an operation has completed, the output of that
operation is visible to all cores.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Vhost backend of different devices have different features.
Add an API to get vDPA device type, net device or blk device
currently, so users can set different features for different
kinds of devices.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Add support for VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG.
VHOST_USER_GET_CONFIG and VHOST_USER_SET_CONFIG message is only
supported by virtio blk VDPA device.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Get_config and set_config are necessary ops for blk device.
Add get_config and set_config ops to vDPA ops.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
In vhost_user_msg_handler(), if vhost message handling
failed, we should check whether the queue is locked and
release the lock before returning. Or, it will cause a
deadlock later.
Fixes: 7f31d4ea05 ("vhost: fix lock on device readiness notification")
Cc: stable@dpdk.org
Signed-off-by: Wenwu Ma <wenwux.ma@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
This patch implements asynchronous dequeue data path for vhost split
ring, a new API rte_vhost_async_try_dequeue_burst() is introduced.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch refactors copy_desc_to_mbuf() used by the sync
path to support both sync and async descriptor to mbuf filling.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch refactors vhost async enqueue path and dequeue path to use
the same function async_fill_seg() for preparing batch elements,
which simplifies the code without performance degradation.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch extracts the descriptors to buffers filling from
copy_desc_to_mbuf() into a dedicated function. Besides, enqueue
and dequeue path are refactored to use the same function
sync_fill_seg() for preparing batch elements, which simplifies
the code without performance degradation.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds runtime checks in unsafe Vhost async APIs,
to ensure the access lock is taken.
The detection won't work every time, as another thread
could take the lock, but it would help to detect misuse
of these unsafe API.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch adds statistics for packets in-flight submission
and completion, when Vhost async mode is used.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch adds statistics for IOTLB hits and misses.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch adds a new virtqueue statistic for guest
notifications. It is useful to deduce from hypervisor side
whether the corresponding guest Virtio device is using
Kernel Virtio-net driver or DPDK Virtio PMD.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch introduces new APIs for the application
to query and reset per-virtqueue statistics. The
patch also introduces generic counters.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
vq->async accesses must be protected with vq->access_lock.
Fixes: eb666d2408 ("vhost: fix async unregister deadlock")
Fixes: 0c0935c5f7 ("vhost: allow to check in-flight packets for async vhost")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The port ownership concept was introduced in ethdev in DPDK 18.02.
Not sure it is used by applications except those using failsafe or netvsc.
It can also be used by libraries or applications to sort out
how ports are controlled.
Hiding sub-ports controlled by failsafe or netvsc look to be enough
justification to promote this API as stable.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
This patch introduces the IPv4/IPv6 ECN modify field support, and
adds the testpmd CLI commands support.
Usage:
modify_field op set dst_type ipv4_ecn src_type ...
For example:
flow create 0 ingress group 1 pattern eth / ipv4 / end actions
modify_field op set dst_type ipv4_ecn src_type value src_value
0x03 width 2 / queue index 0 / end
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Secondary process needs to close device to release process private
resources. But secondary process should not be obliged to wait for
device stop before closing ethdev.
Fixes: febc855b35 ("ethdev: forbid closing started device")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for module EEPROM information format defined in
SFF-8636 Rev 2.7.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for module EEPROM information format defined in
SFF-8472 Rev 12.0
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for module EEPROM information format defined in
SFF-8079 Rev 1.7.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for SFF-8024 Rev 4.0 of pluggable I/O configuration
and some common utilities for SFF-8436/8636 and SFF-8472/8079.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add a new telemetry command /ethdev/module_eeprom to dump the module
EEPROM of each port. The format of module EEPROM information follows
the SFF(Small Form Factor) Committee specifications.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Currently, 'dev_started' is always set to be 0 when dev stop, whether
it succeeded or failed. This is unreasonable and this patch fixed it.
Fixes: 62024eb827 ("ethdev: change stop operation callback to return int")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
Whether it is allowed to call Rx/Tx functions for a stopped queue
was undocumented. Some PMDs make this behavior a no-op
either by explicitly checking the queue state
or by the way how their routines are implemented or HW works.
No-op behavior may be convenient for application developers.
But it also means that pollers of stopped queues
would go all the way down to PMD Rx/Tx routines, wasting cycles.
Some PMDs would do a check for the queue state on data path,
even though it may never be needed for a particular application.
Also, use cases for stopping queues or starting them deferred
do not logically require polling stopped queues.
Use case 1: a secondary that was polling the queue has crashed,
the primary is doing a recovery to free all mbufs.
By definition the queue to be restarted is not polled.
Use case 2: deferred queue start or queue reconfiguration.
The polling thread must be synchronized anyway,
because queue start and stop are non-atomic.
Prohibit calling Rx/Tx functions on stopped queues.
Fixes: 0748be2cf9 ("ethdev: queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
The rte_tel_data_alloc() may return NULL, so the caller should add
judgement for it.
Fixes: 083b0b310b ("ethdev: add common stats for telemetry")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
When xstats location is null in rte_eth_xstats_get() the return value
is not clearly specified. Some PMDs (eg. hns3/ipn3ke/mvpp2/axgbe) return
zero while others return the required number of elements.
In this patch, special parameter combinations are restricted:
1. highlight that xstats location may be null if and only if n is 0.
2. amend n parameter description to specify that if n is lower than
the required number of elements, the function returns the required
number of elements.
3. specify that if n is zero, the xstats must be NULL, the function
returns the required number of elements (a duplicate which should
help to not very attentive readers).
Add sanity check for null xstats and non-zero n case on API level to
make it unnecessary to care about it in drivers.
Fixes: ce757f5c9a ("ethdev: new method to retrieve extended statistics")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Currently, meter object supports only DSCP based on input color table,
The patch enhance that to support VLAN based input color table,
color table based on inner field for the tunnel use case, and
support for fallback color per meter if packet based on a different field.
All of the above features are exposed through capability and added
additional capability to specify the implementation supports
more than one input color table per ethdev port.
Suggested-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add new get/set API to allow the user or application to set the minimum
and maximum frequencies to use when scaling.
Previously, the frequency range was determined by the HW capabilities of
the CPU. With this new API, the user or application can constrain this
if required.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Add new get/set API for configuring 'pause_duration' which used to adjust
the pause mode callback duration.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Add new get/set APIs to configure emptypoll max which is used to
determine when a queue can go into sleep state.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Tested-by: David Hunt <david.hunt@intel.com>
Add an implementation of the rte_lpm_lookupx4() function for platforms
without support for vector operations.
This will be useful in the upcoming RISC-V port as well as any platform
which may want to start with a basic level of LPM support.
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
All other rte_lpm_lookup* functions take lpm argument as a const. As the
basic rte_lpm_lookup() performs the same function, it should also do
that.
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
- Added salt length and optional label.
Common parameters to PSS and OAEP padding for RSA.
- Changed RSA hash padding fields names.
Now it corresponds to the RSA documents.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Clarified where should output be stored of signature
decryption with padding none.
PMD is not able to know what padding algorithm was used,
therefore decrypted signature should be returned to the user.
- Removed incorrect big-endian constraints.
Not all data in RSA can be treated as big endian integer,
therefore some of the constraints were lifted.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- move RSA padding into separate struct.
More padding members should be added into padding,
therefore having separate struct for padding parameters will
make this more readable.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Clarified usage of RSA padding hash.
It was not specified how to use hash for PKCS1_5
padding. This could lead to incorrect implementation.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Added flags to rte_crypto_asym_op struct.
It may be shared between different algorithms.
- Added Diffie-Hellman padding flags.
Diffie-Hellman padding is used in certain protocols,
in others, leading zero bytes need to be stripped.
Even same protocol may use a different approach - most
glaring example is TLS1.2 - TLS1.3.
For ease of use, and to avoid additional copy
on certain occasions, driver should be able to return both.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Added key exchange public key verify option.
For some elliptic curves public point in DH exchange
needs to be checked, if it lays on the curve.
Modular exponentiation needs certain checks as well,
though mathematically much easier.
This commit adds verify option to asym_op operations.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Added elliptic curve Diffie-Hellman parameters.
Point multiplication allows the user to process every phase of
ECDH, but for phase 1, user should not really care about the generator.
The user does not even need to know what the generator looks like,
therefore setting ec xform would make this work.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Moved DH operation type to DH operation struct.
Operation type (PUBLIC_KEY_GENERATION, SHARED_SECRET) should
be free to choose for any operation. One xform/session should
be enough to perform both DH operations, if op_type would be xform
member, session would have to be created twice for the same
group. Similar problem would be observed in sessionless case.
Additionally, it will help extend DH to support Elliptic Curves.
- Changed order of Diffie-Hellman operation phases.
Now it corresponds with the order of operations.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Clarified usage of private key in Diffie-Hellman.
CSRNG capable device should generate private key and then
use it for public key generation.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Removed comment that stated DSA can be used with Diffie
Hellman ephemeral key.
DH and DSA integration allowed to use ephemeral keys for
random integer, but not for private keys.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- Separated key exchange enum from asym op type.
Key exchange and asymmetric crypto operations like signatures,
encryption/decryption should not share same operation enum as
its use cases are unrelated and mutually exclusive.
Therefore op_type was separate into:
1) operation type
2) key exchange operation type
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
- EC enum was renamed to rte_crypto_curve_id.
Elliptic curve enum name was incorrectly associated
with a group (it comes from the current tls registry name).
- Clarified comments about TLS deprecation.
Some curves included are deprecated with TLS 1.3.
Comments to address it were added.
- Clarified FFDH groups usage.
Elliptic curves IDs in TLS are placed in the same registry
as FFDH. Cryptodev does not assign specific groups, and
if specific groups would be assigned by DPDK, it cannot be
TLS SupportedGroups registry, as it would conflict with
other protocols like IPSec.
- Added IANA reference.
Only few selected curves are included in previously
referenced rfc8422. IANA reference is added instead.
- Removed UNKNOWN ec group.
There is no default value, and there is no UNKNOWN
elliptic curve.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
For getting event crypto metadata from crypto_op,
the new API rte_cryptodev_get_session_event_mdata is used
instead of getting userdata inside PMD.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Currently, crypto session userdata is used to set event crypto
metadata from the application and the driver is dereferencing it
in driver which is not correct. User data is meant to be opaque
to the driver.
To support this, new API is added to get and set event crypto
metadata. The new API, rte_cryptodev_set_session_event_mdata,
allows setting event metadata in session private data which is
filled inside PMD using a new cryptodev op. This operation
can be performed on any of the PMD supported sessions
(sym/asym/security).
For SW abstraction of event crypto adapter to be used by
eventdev library, a new field is added in asymmetric crypto
session for now and for symmetric case, current implementation
of using userdata is used. Symmetric cases cannot be fixed now,
as it will be ABI breakage which will be resolved in DPDK 22.11.
Signed-off-by: Volodymyr Fialko <vfialko@marvell.com>
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector".
The DPDK header file rte_altivec.h takes care of undefining "vector",
so the applications and dependencies are free to define the name "vector".
This is a compatibility breakage for applications which were using
the keyword "vector" for its AltiVec meaning.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
In pcap_tsc_to_ns(), delta * NSEC_PER_SEC will overflow approx 8
seconds after pcap_init is called when using a TSC with a frequency
of 2.5GHz.
To avoid the overflow, update the saved time and TSC value once
delta >= tsc_hz.
Fixes: 8d23ce8f5e ("pcapng: add new library for writing pcapng files")
Cc: stable@dpdk.org
Signed-off-by: Quentin Armitage <quentin@armitage.org.uk>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for hash functions that compute a signature for an array
of bytes read from a packet header or meta-data. Useful for flow
affinity-based load balancing.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Enable the pipeline to use the improved learner table timer operation
through the new "rearm" instruction.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Previously, on lookup hit, the hit key had its timer automatically
rearmed with the same timeout in order to prevent its expiration. Now,
a broader set of actions is available on lookup hit, which has to be
managed explicitly: the key can have its timer rearmed with the same
or with a different timeout, or the key timer can be left unmodified.
The latter option allows the key to expire naturally when the timer
eventually runs out, unless the key is hit again and its timer rearmed
at that point. Needed by the TCP connection tracking state machine.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add support for packet recirculation. The current packet is flagged
for recirculation using the new "recirculate" instruction; on TX, this
flag causes the packet to execute the full pipeline again as if it was
a new packet, except the packet meta-data is preserved. The new
"recircid" instruction can be used to read the pass number in case the
packet goes several times through the pipeline.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The packet mirroring is configured through slots and sessions, with
the number of slots and sessions set at init.
The new "mirror" instruction assigns one of the existing sessions to a
specific slot, which results in scheduling a mirror operation for the
current packet to be executed later at the time the packet is either
transmitted or dropped.
Several copies of the same input packet can be mirrored to different
output ports by using multiple slots.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add packet clone operation to the output ports in order to support
packet mirroring.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add support for arguments to the default action of regular and learner
tables at initialization time. Until now, only default actions with no
arguments were accepted in the .spec file.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Fix the emit instruction for the pathological case of all headers to
be emitted being invalid. In this case, the for loop was essentially
skipped and the last emitted header (or an invalid memory location)
getting corrupted by setting its size to 0 through the assignment to
ho->n_bytes right after the for loop.
Fixes: d60dbdc88a ("pipeline: create inline functions for emit instruction")
Cc: stable@dpdk.org
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Enable printing of the outer VLAN if flags indicate it is present.
Cc: stable@dpdk.org
Signed-off-by: Ben Magistro <koncept1@gmail.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Also mark some conditional functions as const.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
If a /32 route is entered in the RIB the code to traverse
will not see end of the tree. This is due to trying
to do a negative shift which is an undefined in C.
Fix by checking for max depth as is already done in rib6.
Fixes: 5a5793a5ff ("rib: add RIB library")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
According to RFC791,the options may appear or not in datagrams.
They must be implemented by all IP modules (host and gateways).
What is optional is their transmission in any particular datagram,
not their implementation. So we have to deal with it during the
fragmenting process.
Add some test data for the IPv4 header optional field fragmenting.
Signed-off-by: Huichao Cai <chcchc88@163.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Added new API for flag to enable or disable TC oversubscription
for best effort traffic class at subport level.
By default TC OV is enabled.
Signed-off-by: Marcin Danilewicz <marcinx.danilewicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
In theory ACL library allows fields with 8B long.
Though in practice they usually not used, not tested,
and as was revealed by Ido, this functionality is not working properly.
There are few places inside ACL build code-path that need to be addressed.
Bugzilla ID: 673
Fixes: dc276b5780 ("acl: new library")
Cc: stable@dpdk.org
Reported-by: Ido Goshen <ido@cgstowernetworks.com>
Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
Tested-by: Ido Goshen <ido@cgstowernetworks.com>
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector",
so we will be able to #undef vector after including AltiVec header.
Later it may become possible to #undef vector in rte_altivec.h
with a compatibility breakage.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
FreeBSD has updated its CPU macros to align more with the definitions
used on Linux[1]. Unfortunately, while this makes compatibility better
in future, it means we need to have both legacy and newer definition
support. Use a meson check to determine which set of macros are used.
[1] https://cgit.freebsd.org/src/commit/?id=e2650af157bc
Bugzilla ID: 1014
Fixes: c3568ea376 ("eal: restrict control threads to startup CPU affinity")
Fixes: b6be16acfe ("eal: fix control thread affinity with --lcores")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Daxue Gao <daxuex.gao@intel.com>
Caught by ASan, if a secondary process tried to attach a device with an
incorrect driver name, devargs was leaked.
Fixes: 64051bb1f1 ("devargs: unify scratch buffer storage")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Calls to rte_memcpy for 1 < n < 16 could result in unaligned
loads/stores, which is undefined behaviour according to the C
standard, and strict aliasing violations.
The code was changed to use a packed structure that allows aliasing
(using the __may_alias__ attribute) to perform the load/store
operations. This results in code that has the same performance as the
original code and that is also C standards-compliant.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Implement functions for getting/setting thread affinity.
Threads can be pinned to specific cores by setting their
affinity attribute.
Windows error codes are translated to errno-style error codes.
The possible return values are chosen so that we have as
much semantic compatibility between platforms as possible.
note: convert_cpuset_to_affinity has a limitation that all cpus of
the set belong to the same processor group.
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Provide a portable type-safe thread identifier.
Provide rte_thread_self for obtaining current thread identifier.
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
Merge CRC32 hash calculation public API implementation for x86 and Arm.
Select the best available CRC32 algorithm when unsupported algorithm
on a given CPU architecture is requested by an application.
Previously, if an application directly includes `rte_crc_arm64.h`
without including `rte_hash_crc.h` it will fail to compile.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Split x86 and SW hash crc intrinsics into separate files.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Extended eventdev queue QoS attributes to support weight and affinity.
If queues are of the same priority, events from the queue with highest
weight will be scheduled first. Affinity indicates the number of times,
the subsequent schedule calls from an event port will use the same event
queue. Schedule call selects another queue if current queue goes empty
or schedule count reaches affinity count.
To avoid ABI break, weight and affinity attributes are not yet added to
queue config structure and rely on PMD for managing it. New eventdev op
queue_attr_get can be used to get it from the PMD.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Added a new eventdev API rte_event_queue_attr_set(), to set event queue
attributes at runtime from the values set during initialization using
rte_event_queue_setup(). PMD's supporting this feature should expose the
capability RTE_EVENT_DEV_CAP_RUNTIME_QUEUE_ATTR.
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add function to quiesce any core specific resources consumed by
the event port.
When the application decides to migrate the event port to another lcore
or teardown the current lcore it may to call `rte_event_port_quiesce`
to make sure that all the data associated with the event port are released
from the lcore, this might also include any prefetched events.
While releasing the event port from the lcore, this function calls the
user-provided flush callback once per event.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adds telemetry support to get timer adapter info and timer adapter
statistics.
Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Caught by covscan:
1. dpdk-21.11/lib/eventdev/rte_event_eth_rx_adapter.c:3279:
logical_vs_bitwise: "~(*__ctype_b_loc()[(int)*params] & 2048 /*
(unsigned short)_ISdigit */)" is always 1/true regardless of the values
of its operand. This occurs as the logical second operand of "||".
2. dpdk-21.11/lib/eventdev/rte_event_eth_rx_adapter.c:3279: remediation:
Did you intend to use "!" rather than "~"?
While isdigit return value should be compared as an int to 0,
prefer ! since all of this file uses this convention.
Fixes: 814d017093 ("eventdev/eth_rx: support telemetry")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
The RTE_ETH_MQ_RX_RSS_FLAG flag is a switch to enable RSS. If the flag
is not set in dev_configure, RSS will be not configured and enabled.
However, RSS hash and reta can still be configured by ethdev ops to
enable RSS if the flag isn't set. The behavior is inconsistent.
Fixes: 99a2dd955f ("lib: remove librte_ prefix from directory names")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
This patch ensures virtqueue metadata are not being
modified while rte_vhost_vring_call() is executed.
Fixes: 6c299bb732 ("vhost: introduce vring call API")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Some message handlers do not expect any file descriptor attached as
ancillary data.
Provide a common way to enforce this by adding a accepts_fd boolean in
the message handler structure. When a message handler sets accepts_fd to
true, it is responsible for calling validate_msg_fds with a right
expected file descriptor count.
This will avoid leaking some file descriptor by mistake when adding
support for new vhost user message types.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Move message handler description and callbacks into a single array and
remove unneeded VHOST_USER_MAX and VHOST_SLAVE_MAX enums.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In async data path, when vring state changes or device is destroyed,
it is necessary to know the number of in-flight packets in DMA engine.
This patch provides a thread unsafe API to return the number of
in-flight packets for a vhost queue without using any lock.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As described in bugzilla, ASan reports accesses to all memory segment as
invalid, since those parts have not been allocated with rte_malloc.
Move __rte_no_asan to rte_common.h and disable ASan on a part of the test.
Bugzilla ID: 880
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Fix comments to reflect the hard expiry fields.
Fixes: ad7515a39f ("security: add SA lifetime configuration")
Cc: stable@dpdk.org
Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Currently the "extern C" section ends right before rte_dev_dma_unmap
and other DMA function declarations, causing some C++ compilers to
produce C++ mangled symbols to rte_dev_dma_unmap instead of C symbols.
This leads to build failures later when linking a final executable
against this object.
Fixes: a753e53d51 ("eal: add device event monitor framework")
Cc: stable@dpdk.org
Signed-off-by: Tianhao Chai <cth451@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Currently, when we free previously allocated memory, we mark the area as
"freed" for ASan purposes (flag 0xfd). However, sometimes, freeing a
malloc element will cause pages to be unmapped from memory and re-backed
with anonymous memory again. This may cause ASan's "use-after-free"
error down the line, because the allocator will try to write into
memory areas recently marked as "freed".
To fix this, we need to mark the unmapped memory area as "available",
and fixup surrounding malloc element header/trailers to enable later
malloc routines to safely write into new malloc elements' headers or
trailers.
Bugzilla ID: 994
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, EAL init in secondary processes will attach all fbarrays
in the memconfig to have access to the primary process's page tables.
However, fbarrays corresponding to external memory segments should
not be attached at initialization, because this will happen as part
of `rte_extmem_attach` [1] or `rte_malloc_heap_memory_attach` [2] calls.
1: https://doc.dpdk.org/api/rte__memory_8h.html#a2796da68de6825f8edf53759f8e4d230
2: https://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3
Fixes: ff3619d624 ("malloc: allow attaching to external memory chunks")
Cc: stable@dpdk.org
Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Deepak Khandelwal <deepak.khandelwal@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Mark the trylock family of spinlock functions with
__rte_warn_unused_result.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
This patch adds a wrapper macro __rte_warn_unused_result for the
warn_unused_result function attribute.
Marking a function __rte_warn_unused_result will make the compiler
emit a warning in case the caller does not use the function's return
value.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
All OS implementations provide the same main loop.
Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
between main and threads and factorize the rest as common code.
Thread id are now logged as string in a common format across OS.
Note:
- this change also fixes Windows EAL: worker threads cpu affinity was
incorrectly reported in log.
- libabigail flags this change as breaking ABI in clang builds:
1 function with some indirect sub-type change:
[C] 'function int rte_eal_remote_launch(int (void*)*, void*, unsigned
int)' at eal_common_launch.c:35:1 has some indirect sub-type
changes:
parameter 1 of type 'int (void*)*' changed:
in pointed to type 'function type int (void*)' at rte_launch.h:31:1:
entity changed from 'function type int (void*)' to 'typedef
lcore_function_t' at rte_launch.h:31:1
type size hasn't changed
This is being investigated on libabigail side.
For now, we don't have much choice but to waive reports on this symbol.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
So far, a worker thread has been using its thread_id to discover which
lcore has been assigned to it.
On the other hand, as noted by Tyler, the pthread API does not strictly
guarantee that a new thread won't start running eal_thread_loop before
pthread_create writes to &lcore_config[xx].thread_id.
Though all OS implementations supported in DPDK (recently) ensure this
property, it is more robust to have the main thread directly pass
the worker thread lcore.
Signed-off-by: David Marchand <david.marchand@redhat.com>
eal_thread_loop() uses lcore_config[i].thread_id,
which is stored upon the return from CreateThread().
Per documentation, eal_thread_loop() can start
before CreateThread() returns and the ID is stored.
Create lcore worker threads suspended and then subsequently resume to
allow &lcore_config[i].thread_id be stored before eal_thread_loop
execution.
Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Remove the x86 top atomic header include from the architecture related
header file, since this x86 top atomic header file has included them.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
When compiling on FreeBSD with clang and include checking enabled,
errors are emitted due to differences in how empty structs/unions are
handled in C and C++, as C++ structs cannot have zero size.
lib/eventdev/rte_eventdev.h:992:2: error:
union has size 0 in C, non-zero size in C++
Since the contents of the union are all themselves of zero size,
the actual union wrapper is unnecessary. We therefore remove it for C++
builds - though keep it for C builds for safety and clarity of
understanding the code. The alignment constraint on the union is
unnecessary in the case where the whole struct is aligned on a 16-byte
boundary, so we add that constraint to the overall structure to ensure
it applies for C++ code as well as C.
Fixes: 1cc44d4092 ("eventdev: introduce event vector capability")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When compiling on FreeBSD with clang and include checking enabled,
errors are emitted due to differences in how empty structs/unions are
handled in C and C++, as C++ structs cannot have zero size.
lib/cryptodev/rte_crypto.h:127:2: error:
union has size 0 in C, non-zero size in C++
Since the contents of the union are all themselves of zero size,
the actual union wrapper is unnecessary. We therefore remove it for C++
builds - though keep it for C builds for safety and clarity of
understanding the code.
Fixes: c0f87eb525 ("cryptodev: change burst API to be crypto op oriented")
Fixes: d2a4223c4c ("cryptodev: do not store pointer to op specific params")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Building with clang on FreeBSD with chkincs enabled, we get the
following error about a missing space:
lib/compressdev/rte_compressdev_internal.h:25:58: error:
invalid suffix on literal;
C++11 requires a space between literal and identifier [-Wreserved-user-defined-literal]
rte_log(RTE_LOG_ ## level, compressdev_logtype, "%s(): "fmt "\n", \
Adding in a space between the '"' and 'fmt' removes the error.
Fixes: ed7dd94f7f ("compressdev: add basic device management")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The headers rte_compressdev_pmd.h and rte_compressdev_internal.h are,
as the filenames suggest, headers for building drivers using the
compressdev APIs. As such they should be marked as
"driver_sdk_headers" rather than just "headers" in the meson.build
file.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Add missing 'extern "C"' to file.
Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This is something caught in UNH FreeBSD env.
For some reason [1], the pcap/bpf.h header started to define _BPF_H_.
It happens that the bpf_impl.h internal DPDK header uses this define as
an internal guard.
This triggers a build failure in bpf_convert.c which can't find
RTE_BPF_LOG macro.
Fix the include guard to use the filename and remove _.
1: https://github.com/the-tcpdump-group/libpcap/pull/1074
Fixes: 94972f35a0 ("bpf: add BPF loading and execution framework")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Even if unlikely, a buggy vhost-user master might attach fds to inflight
messages. Add checks like for other types of vhost-user messages.
Fixes: d87f1a1cb7 ("vhost: support inflight info sharing")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In function vhost_user_set_inflight_fd, queue number in inflight
message is used to access virtqueue. However, queue number could
be larger than VHOST_MAX_VRING and cause write OOB as this number
will be used to write inflight info in virtqueue structure. This
patch checks the queue number to avoid the issue and also make
sure virtqueues are allocated before setting inflight information.
Fixes: ad0a4ae491 ("vhost: checkout resubmit inflight information")
Cc: stable@dpdk.org
Reported-by: Wenxiang Qian <leonwxqian@gmail.com>
Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Memory allocated for CPU mapping the status flag
in the communication list should be aligned to the
GPU page size, which can be different of CPU page alignment.
The GPU page size is added to the GPU info,
and is used when creating a communication list.
Fixes: 9b8cae4d99 ("gpudev: use CPU mapping in communication list")
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
API documentation for "struct rte_eth_dev_info" was missing some fields
'device' & 'max_hash_mac_addrs',
because of syntax error in doxygen comment, fixing it.
Bugzilla ID: 954
Fixes: 88ac4396ad ("ethdev: add VMDq support")
Fixes: cd8c7c7ce2 ("ethdev: replace bus specific struct with generic dev")
Cc: stable@dpdk.org
Reported-by: Bruce Merry <bmerry@sarao.ac.za>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Following a rework, external message handlers were receiving a pointer
to a vhost_user message (as stated in the API), but lost the ability to
interact with fds attached to the message.
Restore the original layout and put a build check and reminders.
Bugzilla ID: 953
Fixes: 5e0099dc70 ("vhost: remove payload size limitation")
Reported-by: Fan Zhang <roy.fan.zhang@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Jakub Poczatek <jakub.poczatek@intel.com>
Acked-by: Jakub Poczatek <jakub.poczatek@intel.com>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
The symbols which are not listed in the version script
are exported by default.
Adding a local section with a wildcard make non-listed functions
and variables as hidden, as it should be in all version.map files.
These are the changes done in the shared libraries:
- DF .text Base auxiliary_add_device
- DF .text Base auxiliary_dev_exists
- DF .text Base auxiliary_dev_iterate
- DF .text Base auxiliary_insert_device
- DF .text Base auxiliary_is_ignored_device
- DF .text Base auxiliary_match
- DF .text Base auxiliary_on_scan
- DF .text Base auxiliary_scan
- DO .bss Base auxiliary_bus_logtype
- DO .data Base auxiliary_bus
- DO .bss Base gpu_logtype
There is no impact on regexdev library.
Because these local symbols were exported as non-internal
in DPDK 21.11, any change in these functions would break the ABI.
Exception rules are added for these experimental libraries,
so the ABI check will skip them until the next ABI version.
A check is added to avoid such miss in future.
Fixes: 1afce3086c ("bus/auxiliary: introduce auxiliary bus")
Fixes: 8b8036a66e ("gpudev: introduce GPU device class library")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The functions used by the drivers must be internal,
while the function and variables used in inline functions
must be experimental.
These are the changes done in the shared library:
- DF .text Base rte_regexdev_get_device_by_name
+ DF .text INTERNAL rte_regexdev_get_device_by_name
- DF .text Base rte_regexdev_register
+ DF .text INTERNAL rte_regexdev_register
- DF .text Base rte_regexdev_unregister
+ DF .text INTERNAL rte_regexdev_unregister
- DF .text Base rte_regexdev_is_valid_dev
+ DF .text EXPERIMENTAL rte_regexdev_is_valid_dev
- DO .bss Base rte_regex_devices
+ DO .bss EXPERIMENTAL rte_regex_devices
- DO .bss Base rte_regexdev_logtype
+ DO .bss EXPERIMENTAL rte_regexdev_logtype
Because these symbols were exported in the default section in DPDK 21.11,
any change in these functions would be seen as incompatible
by the ABI compatibility check.
An exception rule is added for this experimental library,
so the ABI check will skip it until the next ABI version.
Fixes: bab9497ef7 ("regexdev: introduce API")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ori Kam <orika@nvidia.com>
If rte_ethlink_get fails, the code can just not add speed
to the pcap file.
Coverity issue: 373664
Fixes: 8d23ce8f5e ("pcapng: add new library for writing pcapng files")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The node clone API parameter 'name' is the new node's postfix name, not
the final node name, so it makes no sense to check it. And the new name
will be checked duplicate when calling API '__rte_node_register'.
And update the test case to call clone API twice to check the real name
duplicate.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
'rte_pie_rt_data_init(NULL)' is not expected, and it's ought to
fail when this happen. The malloc inside the function didn't work.
So remove the malloc otherwise will lead to a memory leak.
Fixes: 44c730b0e3 ("sched: add PIE based congestion management")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The ret value in rte_dev_event_monitor_stop stands for whether the
monitor has been successfully closed, and should not bind with
rte_intr_callback_unregister, so once it goes to the right exit point of
rte_dev_event_monitor, the ret value should be set to 0.
Also, the refmonitor count has been carefully evaluated, the value
change from 1 to 0, so there is no potential memory leak failure.
Fixes: 1fef6ced07 ("eal/linux: allow multiple starts of event monitor")
Cc: stable@dpdk.org
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
The punctuation after the `global` keyword should be colon, not
semicolon. The default gcc linker accepts both colon and semicolon, but
the gold linker will report syntax error if we use semicolon after the
`global` keyword.
Fixes: 94c16e89d7 ("vhost: mark vDPA driver API as internal")
Cc: stable@dpdk.org
Signed-off-by: Peng Yu <penyu@amazon.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Socket ID is used and interpreted as integer, one of the possible
values for socket id is -1 (SOCKET_ID_ANY).
here socket_id is defined as unsigned 8 bit integer, so when putting
-1, it is interpreted as 255, which causes allocation errors when
trying to allocate from socket_id (255).
change socket_id from unsigned 8 bit integer to integer.
Fixes: ed7dd94f7f ("compressdev: add basic device management")
Cc: stable@dpdk.org
Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The function rte_devargs_parse() previously was safe to call with
non-initialized devargs structure as parameter.
When adding the support for the global device syntax,
this assumption was broken.
Restore it by forcing memset as part of the call itself.
Bugzilla ID: 933
Fixes: b344eb5d94 ("devargs: parse global device syntax")
Cc: stable@dpdk.org
Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
'recv()' fills the 'buf', later 'strlcpy()' used to copy from this buffer.
But as coverity warns 'recv()' doesn't guarantee that 'buf' is
null-terminated, but 'strlcpy()' requires it.
Enlarge 'buf' size to 'EAL_UEV_MSG_LEN + 1' and ensure the last one can
be set to 0 when received buffer size is EAL_UEV_MSG_LEN.
CID 375864: Memory - illegal accesses (STRING_NULL)
Passing unterminated string "buf" to "dev_uev_parse", which expects
a null-terminated string.
Coverity issue: 375864
Fixes: 0d0f478d04 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Coverity flags the fact that the tag values used in distributor are
32-bit, which means that when we use bit-manipulation to convert a tag
match/no-match to a bit in an array, we need to typecast to a 64-bit
type before shifting past 32 bits.
Coverity issue: 375808
Fixes: 08ccf3faa6 ("distributor: new packet distributor library")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Coverity flags that both elements of efd_online_group_entry
are used uninitialized. This is OK because this structure
is initially used for starting values, so any value is OK.
Coverity ID: 375868
Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")
Cc: stable@dpdk.org
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Queue-based flow rules management mechanism is suitable
not only for flow rules creation/destruction, but also
for speeding up other types of Flow API management.
Indirect action object operations may be executed
asynchronously as well. Provide async versions for all
indirect action operations, namely:
rte_flow_async_action_handle_create,
rte_flow_async_action_handle_destroy and
rte_flow_async_action_handle_update.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
A new, faster, queue-based flow rules management mechanism is needed for
applications offloading rules inside the datapath. This asynchronous
and lockless mechanism frees the CPU for further packet processing and
reduces the performance impact of the flow rules creation/destruction
on the datapath. Note that queues are not thread-safe and the queue
should be accessed from the same thread for all queue operations.
It is the responsibility of the app to sync the queue functions in case
of multi-threaded access to the same queue.
The rte_flow_async_create() function enqueues a flow creation to the
requested queue. It benefits from already configured resources and sets
unique values on top of item and action templates. A flow rule is enqueued
on the specified flow queue and offloaded asynchronously to the hardware.
The function returns immediately to spare CPU for further packet
processing. The application must invoke the rte_flow_pull() function
to complete the flow rule operation offloading, to clear the queue, and to
receive the operation status. The rte_flow_async_destroy() function
enqueues a flow destruction to the requested queue.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Treating every single flow rule as a completely independent and separate
entity negatively impacts the flow rules insertion rate. Oftentimes in an
application, many flow rules share a common structure (the same item mask
and/or action list) so they can be grouped and classified together.
This knowledge may be used as a source of optimization by a PMD/HW.
The pattern template defines common matching fields (the item mask) without
values. The actions template holds a list of action types that will be used
together in the same rule. The specific values for items and actions will
be given only during the rule creation.
A table combines pattern and actions templates along with shared flow rule
attributes (group ID, priority and traffic direction). This way a PMD/HW
can prepare all the resources needed for efficient flow rules creation in
the datapath. To avoid any hiccups due to memory reallocation, the maximum
number of flow rules is defined at the table creation time.
The flow rule creation is done by selecting a table, a pattern template
and an actions template (which are bound to the table), and setting unique
values for the items and actions.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
The flow rules creation/destruction at a large scale incurs a performance
penalty and may negatively impact the packet processing when used
as part of the datapath logic. This is mainly because software/hardware
resources are allocated and prepared during the flow rule creation.
In order to optimize the insertion rate, PMD may use some hints provided
by the application at the initialization phase. The rte_flow_configure()
function allows to pre-allocate all the needed resources beforehand.
These resources can be used at a later stage without costly allocations.
Every PMD may use only the subset of hints and ignore unused ones or
fail in case the requested configuration is not supported.
The rte_flow_info_get() is available to retrieve the information about
supported pre-configurable resources. Both these functions must be called
before any other usage of the flow API engine.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
This patch adds missing protection around vring_invalidate
and translate_ring_addresses calls in vhost_user_iotlb_msg.
Fixes: eefac9536a ("vhost: postpone device creation until rings are mapped")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When choosing IOVA as PA mode, IOVA is likely to be discontinuous,
which requires page by page mapping for DMA devices. To be consistent,
this patch implements page by page mapping instead of mapping at the
region granularity for both IOVA as VA and PA mode.
Fixes: 7c61fa08b7 ("vhost: enable IOMMU for async vhost")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch renames the host_phys_addr to host_iova in guest_page
struct. The host_phys_addr is iova, it depends on the DPDK
IOVA mode.
Fixes: e246896178 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The right size for a human readable MAC is RTE_ETHER_ADDR_FMT_SIZE.
While at it, the net library provides a helper for MAC address
formatting. Prefer it.
Fixes: 58b43c1ddf ("ethdev: add telemetry endpoint for device info")
Cc: stable@dpdk.org
Reported-by: Christophe Fontaine <cfontain@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch fixes misspelled RTE_RSA_KEY_TYPE_QT,
this will prevent checkpach from complaining wherever
change to RSA is being made.
Fixes: 26008aaed1 ("cryptodev: add asymmetric xform and op definitions")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
When creating the asymmetric session mempool, the maximum private
session size of all devices is used when creating the mempool
object size.
The return value for ``rte_cryptodev_asym_get_private_session_size``
is unsigned int, whereas the variable was uint8_t, leading to a
possible overflow issue.
To fix this, the variable for private session size is now changed to
unsigned int to match the function return type.
Fixes: 1f1e4b7cba ("cryptodev: use single mempool for asymmetric session")
Reported-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
eca_cryptodev_cdev_flush() is internal function and called with
valid range of cdevs.
crypto_cdev_info structure is allocated at adapter creation time
and retrieved from the adapter for a valid cdevs which cannot be NULL
and hence no need for NULL check.
Fixes: 2ae84b39ae ("eventdev/crypto: store operations in circular buffer")
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Remove the memcpy usage in queue config get function for
`event` variable which is 8 byte size and use direct copy.
Also provide vector information and event buffer size in the
queue config info.
Fixes: da781e6488 ("eventdev/eth_rx: support Rx queue config get")
Cc: stable@dpdk.org
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
The memory get from strdup should be freed when parameter parsing
finished, and also should be freed when error occurs.
Fixes: 814d017093 ("eventdev/eth_rx: support telemetry")
Fixes: 9e58318531 ("eventdev/eth_rx: support telemetry")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Removed RTE_SCHED_SUBPORT_TC_OV from rte_config.h.
Best effort traffic class oversubscription is always enabled.
Signed-off-by: Megha Ajmera <megha.ajmera@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Grinder configuration is now moved to sched library.
Number of grinders can also modified by specifying
RTE_SCHED_PORT_N_GRINDERS=N in CFLAGS, where N is number of grinders.
Signed-off-by: Megha Ajmera <megha.ajmera@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Remove RTE_SCHED_VECTOR flag from rte_config.h.
This flag is no longer useful. Only scalar version is supported.
Signed-off-by: Megha Ajmera <megha.ajmera@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
rte_gpu_mem_cpu_map() exposes a GPU memory area to the CPU.
In gpudev communication list this is useful to store the
status flag.
A communication list status flag allocated on GPU memory
and mapped for CPU visibility can be updated by CPU and polled
by a GPU workload.
The polling operation is more frequent than the CPU update operation.
Having the status flag in GPU memory reduces the GPU workload polling
latency.
If CPU mapping feature is not enabled, status flag resides in
CPU memory registered so it's visible from the GPU.
To facilitate the interaction with the status flag, this patch
provides also the set/get functions for it.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Update rte_gpu_mem_cpu_unmap() header documentation
and the test application to use GPU pointer when unmapping.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
With a one-line change to the lib meson.build file we can add the SDK
headers to the list of files to be checked using the chkincs binary.
Unfortunately, many of those SDK header depend upon headers in the PCI
and vdev bus drivers, so we need to update chkincs build to ensure those
dependencies are added. We also need to allow internal APIs to be
present in these SDK headers.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
C++ does not allow implicit conversion to/from void*,
so we need an explicit cast to allow the driver SDK header
to be included from C++ code.
Fixes: e489007a41 ("ethdev: add generic create/destroy ethdev APIs")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: 7a33572057 ("lib: remove C++ include guard from private headers")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: 7a33572057 ("lib: remove C++ include guard from private headers")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: d7280c9fff ("vhost: support selective datapath")
Fixes: 78639d5456 ("vhost: introduce async enqueue registration API")
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Fixes: 94c16e89d7 ("vhost: mark vDPA driver API as internal")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: 3fc5ca2f63 ("kni: initial import")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: dc39e2f359 ("eventdev: add ring structure for events")
Fixes: 7a33572057 ("lib: remove C++ include guard from private headers")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: ed7dd94f7f ("compressdev: add basic device management")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: dc276b5780 ("acl: new library")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: c5b7197f66 ("telemetry: move some functions to metrics library")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: 7a3f27cbf5 ("ethdev: add access to specific device info")
Fixes: dcd5c8112b ("ethdev: add PCI driver helpers")
Fixes: 7f0a669e7b ("ethdev: add allocation helper for virtual drivers")
Fixes: 7a33572057 ("lib: remove C++ include guard from private headers")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: 8877ac688b ("telemetry: introduce infrastructure")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: af75078fec ("first public release")
Fixes: 7f3aa08639 ("eal: introduce bit operations API")
Fixes: 166a743c53 ("compat: add infrastructure to support symbol versioning")
Fixes: 8f40ee0734 ("eal/x86: get hypervisor name")
Fixes: 75583b0d1e ("eal: add keep alive monitoring")
Fixes: 88701645c9 ("eal: move interrupt type out of igb_uio")
Fixes: f04519d809 ("lib: add missing include dependencies")
Fixes: f58880682c ("trace: implement register API")
Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
When checking C++ compatibility of SDK headers,
an error is detected by the compiler:
lib/dmadev/rte_dmadev_pmd.h:95:23: error:
‘RTE_DEV_NAME_MAX_LEN’ undeclared here (not in a function)
The header file rte_dev.h must be included.
Fixes: b36970f2e1 ("dmadev: introduce DMA device library")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
The internal function txa_service_queue_add() is returning 0
in case of error, correct this logic to return a negative value
to indicate failure.
Fixes: a3bbf2e097 ("eventdev: add eth Tx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
update rte_event_crypto_adapter_caps_get() to return
SW_CAP if PMD callback is not registered.
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Added checksum support for variable size headers such as IPv4 headers
with options.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>
The regular tables, selector tables and learner tables are all sharing
the table state array. The locations in this array were computed
incorrectly, leading to memory corruption issues.
Fixes: 4f59d37261 ("pipeline: support learner tables")
Cc: stable@dpdk.org
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
The checks for the table-only and default-only annotations were
incorrect, as they were using the pipeline action ID instead of the
table action ID for retrieving the table action info. These checks are
now corrected and pushed into the internal table_entry_check()
function.
Fixes: cd79e02058 ("pipeline: support action annotations")
Cc: stable@dpdk.org
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Detect when a jump instruction, either conditional or unconditional,
is jumping to itself, thus creating a loop, which is not allowed in
data plane code.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Harshad Narayane <harshad.suresh.narayane@intel.com>
An additional output port is now implicitly created for every pipeline
to serve as the packet drop port. Up to now, the drop port had to be
explicitly created for each pipeline.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Move the table type registration for the well known table types from
the application to the pipeline library.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
Move the port type registration for the well known port types from the
application to the pipeline library.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
The output port to be used as the drop port is now determined when the
drop instruction is executed as opposed to being statically determined
at instruction translation time and hardcoded in the opcode.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Yogesh Jangra <yogesh.jangra@intel.com>
This patch adds crypto uint typedef so adding comment
about byte-order becomes unnecessary.
It makes API comments more tidy, and consistent
with other asymmetric crypto APIs.
Additionally it reorganizes code that enums, externs
and forward declarations are moved to the top of the
header file making code more readable.
It removes also comments like co-prime constraint
from mod inv as it is natural mathematical constraint,
not PMD constraint.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
This commit replaces __extension__ attribute with
RTE_STD_C11 in anonymous unions.
It makes API consistent in terms of usage of C11
feature macro.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
This commit clarifies usage of random numbers in asymmetric
crypto API.
The user is now allowed to provide information to the PMD if random
number should be generated or should be read from user input.
If PMD does not support random number generation user should
always provide it, if PMD does not support user random,
rte_crypto_param.data accordingly should be set to NULL.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
This commit adds random number 'k' to DSA
op param struct for asymmetric crypto ops.
This parameter is crucial in stiuations where:
- PMD cannot generate random number
- User would like to provide random source
Additionally, it makes DSA consistent with ECDSA
in terms of 'k' which includes this parameter.
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Rather than the asym session create function returning a session on
success, and a NULL value on error, it is modified to now return int
values - 0 on success or -EINVAL/-ENOTSUP/-ENOMEM on failure.
The session to be used is passed as input.
This adds clarity on the failure of the create function, which enables
treating the -ENOTSUP return as TEST_SKIPPED in test apps.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
A user data field is added to the asymmetric session structure.
Relevant API added to get/set the field.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
The rte_cryptodev_asym_session structure is now moved to an internal
header. This will no longer be used directly by apps,
private session data can be accessed via get API.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Rather than using a session buffer that contains pointers to private
session data elsewhere, have a single session buffer.
This session is created for a driver ID, and the mempool element
contains space for the max session private data needed for any driver.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
ethdev has two interfaces, one interface between applications and
library, these APIs are declared in the rte_ethdev.h public header.
Other interface is between drivers and library, these functions are
declared in ethdev_driver.h and marked as internal.
But all functions are defined in rte_ethdev.c file. This patch moves
functions for drivers to its own file, ethdev_driver.c for cleanup, no
functional change in functions.
Some public APIs and driver helpers call common internal functions,
which were mostly static since both were in same file. To be able to
move driver helpers, common functions are moved to ethdev_private.c.
(ethdev_private.c is used for functions that are internal to the library
and shared by multiple .c files in the ethdev library.)
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Multiple PMDs have dummy/noop Rx/Tx packet burst functions.
These dummy functions are very simple, introduce a common function in
the ethdev and update drivers to use it instead of each driver having
its own functions.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Add flow pattern items and header format for matching optional fields
(checksum/key/sequence) in GRE header. And the flags in gre item should
be correspondingly set with the new added items.
Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
There are optional fields in GRE header(checksum/key/sequence), this
patch adds definition of structures of the optional fields.
Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Added the ethdev dump API which provides querying private info from device.
There exists many private properties in different PMD drivers, such as
adapter state, Rx/Tx func algorithm in hns3 PMD. The information of these
properties is important for debug. As the information is private, the new
API is introduced.
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds vDPA device cleanup callback to release resources on
vhost user connection close.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Since dmadev is introduced in 21.11, to avoid the overhead of vhost DMA
abstraction layer and simplify application logics, this patch integrates
dmadev in asynchronous data path.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This NULL check is unnecessary, 'eth_dev' is never NULL.
Fixes: 58b43c1ddf ("ethdev: add telemetry endpoint for device info")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Hardware IP reassembly may be incomplete for multiple reasons like
reassembly timeout reached, duplicate fragments, etc.
To save application cycles to process these packets again, a new
mbuf dynflag is added to show that the mbuf received is not
reassembled properly.
Now if this dynflag is set, application can retrieve corresponding
chain of mbufs using mbuf dynfield set by the PMD. Now, it will be
up to application to either drop those fragments or wait for more time.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
IP Reassembly is a costly operation if it is done in software.
The operation becomes even more costlier if IP fragments are encrypted.
However, if it is offloaded to HW, it can considerably save application
cycles.
Hence, a new offload feature is exposed in eth_dev ops for devices which
can attempt IP reassembly of packets in hardware.
- rte_eth_ip_reassembly_capability_get() - to get the maximum values
of reassembly configuration which can be set.
- rte_eth_ip_reassembly_conf_set() - to set IP reassembly configuration
and to enable the feature in the PMD (to be called before
rte_eth_dev_start()).
- rte_eth_ip_reassembly_conf_get() - to get the current configuration
set in PMD.
Now when the offload is enabled using rte_eth_ip_reassembly_conf_set(),
the resulting reassembled IP packet would be a typical segmented mbuf in
case of success.
And if reassembly of IP fragments is failed or is incomplete (if
fragments do not come before the reass_timeout, overlap, etc), the mbuf
dynamic flags can be updated by the PMD. This is updated in a subsequent
patch.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
This patch adds L2TPv2 control message and 5 types of data message
support for testpmd.
The added L2TPv2 message types are listed below:
1. L2TPv2 control
2. L2TPv2
3. L2TPv2 + length option
4. L2TPv2 + sequence option
5. L2TPv2 + offset option
6. L2TPv2 + length option + sequence option
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Ori Kam <orika@nvidia.com>
The fields of L2TPv2 common header were reversed in big endian and
little endian.
This patch fixes this error to ensure L2TPv2 can be parsed correctly.
For L2TP reference:
https://datatracker.ietf.org/doc/html/rfc2661#section-3.1
Fixes: 3a929df1f2 ("ethdev: support L2TPv2 and PPP procotol")
Cc: stable@dpdk.org
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch defines new RSS offload type for L2TPv2, which
is required when users want to distribute packets based on
the L2TPv2 session ID field.
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
When compiling with clang using -Wpedantic (or -Wgcc-compat) the use of
diagnose_if kicks up a warning:
.../include/rte_interrupts.h:623:1: error: 'diagnose_if' is a clang
extension [-Werror,-Wgcc-compat]
__rte_internal
^
.../include/rte_compat.h:36:16: note: expanded from macro '__rte_internal'
__attribute__((diagnose_if(1, "Symbol is not public ABI", "error"), \
This change ignores the '-Wgcc-compat' warning in the specific location
where the warning occurs. It is safe to do in this circumstance as the
specific macro is only defined when using the clang compiler.
Signed-off-by: Michael Barker <mikeb01@gmail.com>
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointer so the checks here are not necessary.
Remove redundant NULL pointer checks before free functions
found by nullfree.cocci
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
These functions all behave like libc free() and do
nothing if handed a NULL pointer. The code is already doing
this, this patch just documents the behavior.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The mp action resources in malloc should be cleaned up via
rte_eal_cleanup.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When rte_eal_cleanup is called, hotplug should unregister the
resources associated with the multi-process server.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When rte_eal_cleanup is called the rte_mp_action for VFIO
should be freed.
Fixes: edf73dd330 ("ipc: handle unsupported IPC in action register")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When rte_eal_cleanup is called, all control threads should exit.
For the mp thread, this best handled by closing the mp_socket
and letting the thread see that.
This also fixes potential problems where the mp_socket gets
another hard error, and the thread runs away repeating itself
by reading the same error.
Fixes: 85d6815fa6 ("eal: close multi-process socket during cleanup")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When application calls rte_eal_cleanup on shutdown,
the DPDK log should be closed and cleaned up.
This helps reduce false reports from tools like ASAN
and valgrind that track memory leaks.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The function malloc() could return NULL, the return value
need to be checked.
Fixes: 6f63858e55 ("mem: prevent preallocated pages from being freed")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
GCC [1] now assigns even register pairs for CASP, the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.
1: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
The virtio kernel header includes are already noted as being
incompatible with C++. We can ensure that the header is safe for
inclusion in C++ code by not including those headers during C++ builds.
While not ideal, this does ensure that all DPDK headers can be included
in C++ code without errors.
Fixes: f8904d5636 ("vhost: fix header for strict compilation flags")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Since C++ doesn't support automatic casting from void * to other types,
we need to explicitly add the casts to any header files in DPDK.
Fixes: ea7be0a038 ("lib/librte_table: add hash function headers")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
C++ does not have automatic casting to/from void pointers, so need
explicit cast if header is to be included in C++ code
Fixes: f901d9c826 ("ipsec: add helpers to group completed crypto-ops")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
C++ does not have automatic casting to/from void pointers, so need
explicit cast if header is to be included in C++ code
Fixes: 40d4f51403 ("graph: implement fastpath routines")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The eventdev headers had issues when used from C++
* Missing closing "}" for the extern "C" block
* No automatic casting to/from void *
Fixes: a6562f6d6f ("eventdev: introduce event timer adapter")
Fixes: 32e326869e ("eventdev: add tracepoints")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
C++ files could not include some headers because:
* "new" is a keyword in C++, so can't be a variable name
* there is no automatic casting to/from void *
Fixes: 184104fc61 ("ticketlock: introduce fair ticket based locking")
Fixes: 032a7e5499 ("trace: implement provider payload")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The stubs header is included as part of rte_stack.h for architectures
other than x86_64 and aarch64 (i.e. x86 32 bits and ppc).
Note: chkincs won't catch this issue since it checks headers from within
the source directory.
Fixes: 7911ba0473 ("stack: enable lock-free implementation for aarch64")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Enable the possibility to expose a GPU memory area and make it
accessible from the CPU.
GPU memory has to be allocated via rte_gpu_mem_alloc().
This patch allows the gpudev library to map (and unmap),
through the GPU driver, a chunk of GPU memory and to return
a memory pointer usable by the CPU to access the GPU memory area.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
FDs at the end of the VhostUserMessage structure limits the size
of the payload. Move them to an other englobing structure, before
the header & payload of a VhostUserMessage.
Also removes a reference to fds in the VHUMsg structure defined in
drivers/net/virtio/virtio_user/vhost_user.c
Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Async copy fails when looking up hpa in the gpa to hpa mapping table.
This happens because the gpa is matched exactly in the merged
mapping table, and the merge loses the mapping entries.
A new range comparison method is introduced to solve this issue.
Fixes: 6563cf9238 ("vhost: fix async copy on multi-page buffers")
Cc: stable@dpdk.org
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Based on device support and use-case need, there are two different ways
to enable PFC. The first case is the port level PFC configuration, in
this case, rte_eth_dev_priority_flow_ctrl_set() API shall be used to
configure the PFC, and PFC frames will be generated using based on VLAN
TC value.
The second case is the queue level PFC configuration, in this
case, Any packet field content can be used to steer the packet to the
specific queue using rte_flow or RSS and then use
rte_eth_dev_priority_flow_ctrl_queue_configure() to configure the
TC mapping on each queue.
Based on congestion selected on the specific queue, configured TC
shall be used to generate PFC frames.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add functions to call rte_raw_cksum_mbuf() to calculate IPv4/6
UDP/TCP checksum in mbuf which can be over multi-segments.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Aman Singh <aman.deep.singh@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
The PMDs would need a function to access the rte_eth_devices
without accessing the global rte_eth_device array.
Cc: stable@dpdk.org
Signed-off-by: Kumara Parameshwaran <kparameshwar@vmware.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Both Linux and FreeBSD have same code for creating runtime
directory and reading sysfs files. Put them in the new lib/eal/unix
subdirectory.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Systemd.exec supports configuring the runtime directory of a service
via RuntimeDirectory=. This creates the directory with the necessary
permissions which actual service may not have if running in container.
The change to DPDK is to look for the environment RUNTIME_DIRECTORY
first and use that in preference to the fallback alternatives.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
The size argument to eal_set_runtime_dir is useless and was
being used incorrectly in strlcpy. It worked only because
all callers passed PATH_MAX which is same as sizeof the destination
runtime_dir.
Note: this is an internal API so no user exposed change.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Added an internal helper to get OS-specific EAL mapping base address
This helper can be used by the drivers to program offload / accelerator
devices, where the base address can be used as a reference address by
the accelerator to access the host memory
An address can also be represented as an offset relative to the base
address using smaller data types
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Expose Linux EAL ability to reuse existing hugepage files
via --huge-unlink=never switch.
Default behavior is unchanged, it can also be specified
using --huge-unlink=existing for consistency.
Old --huge-unlink switch is kept,
it is an alias for --huge-unlink=always.
Add a test case for the --huge-unlink=never mode.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Linux EAL ensured that mapped hugepages are clean
by always mapping from newly created files:
existing hugepage backing files were always removed.
In this case, the kernel clears the page to prevent data leaks,
because the mapped memory may contain leftover data
from the previous process that was using this memory.
Clearing takes the bulk of the time spent in mmap(2),
increasing EAL initialization time.
Introduce a mode to keep existing files and reuse them
in order to speed up initial memory allocation in EAL.
Hugepages mapped from such files may contain data
left by the previous process that used this memory,
so RTE_MEMSEG_FLAG_DIRTY is set for their segments.
If multiple hugepages are mapped from the same file:
1. When fallocate(2) is used, all memory mapped from this file
is considered dirty, because it is unknown
which parts of the file are holes.
2. When ftruncate(3) is used, memory mapped from this file
is considered dirty unless the file is extended
to create a new mapping, which implies clean memory.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
In preparation to extend --huge-unlink option semantics
refactor how it is stored in the internal configuration.
It makes future changes more isolated.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
EAL malloc layer assumed all free elements content
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:
1. EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before returning into the heap.
Clearing the memory can be as slow as around 14 GiB/s.
To save doing so, memalloc layer is allowed to return dirty memory.
Such segments being marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using the new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:
a) If the joint element contains dirty parts, it is dirty:
dirty + freed + dirty = dirty => no need to clean
freed + dirty = dirty the freed memory
Dirty parts may be large (e.g. initial allocation),
so clearing them could create unpredictable slowdown.
b) If the only dirty part of the joint element
is the freed memory, the joint element can be made clean:
clean + freed + clean = clean => freed memory
clean + freed = clean must be cleared
freed + clean = clean
freed = clean
This logic naturally reproduces the old behavior
and always applies in modes when EAL memalloc layer
returns only clean segments.
As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
This private header contains an incomplete cplusplus guard,
just remove it.
Fixes: d35e61322d ("eventdev: move inline APIs into separate structure")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
On Windows, strerror returns just "Unknown error" for errnum greater
than MAX_ERRNO, while linux and freebsd returns "Unknown error <num>",
which is the current expectation for errno_autotest. Differentiate
the error string on Windows to remove a "duplicate error code" failure.
Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
UT memory_autotest on Windows has 2 failed cases on EAL APIs
eal_memalloc_get_seg_fd and eal_memalloc_get_seg_fd_offset. These 2
APIs are not supported on Windows yet. Should return ENOTSUP such that
in test_memory.c these 2 ENOTSUP cases will not be marked as failures,
same as other ENOTSUP cases.
Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org
Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Parameters count and esize are both unsigned int, and their product can
legaly exceed unsigned int and lead to runtime access violation.
Fixes: cc4b218790 ("ring: support configurable element size")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Liang Ma <liangma@liangbit.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The error value returned by rte_ring_create_elem() should be positive
integers. However, if the rte_ring_get_memsize_elem() function fails,
a negative number is returned and is directly used as the return value.
As a result, this will cause the external call to check the return
value to fail(like called by rte_mempool_create()).
Fixes: a182620042 ("ring: get size in memory")
Cc: stable@dpdk.org
Reported-by: Nan Zhou <zhounan14@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When enqueueing/dequeueing to/from the ring we try to optimize by manual
loop unrolling. The check for this optimization looks like:
if (likely(idx + n < size)) {
where 'idx' points to the first usable element (empty slot for enqueue,
data for dequeue). The correct comparison here should be '<=' instead
of '<'.
This is not a functional error since we fall back to the loop with
correct checks on indexes. Just a minor suboptimal behaviour for the
case when we want to enqueue/dequeue exactly the number of elements that
we have in the ring before wrapping to its beginning.
Fixes: cc4b218790 ("ring: support configurable element size")
Fixes: 286bd05bf7 ("ring: optimisations")
Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Sometimes OS tries to switch the core. So, bind the lcore thread
to a fixed core.
Implement affinity call on Windows similar to Linux.
Signed-off-by: Qiao Liu <qiao.liu@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
The generic header file was missing
in the list of files to install.
Fixes: 9667d97c25 ("pflock: add phase-fair reader writer locks")
Cc: stable@dpdk.org
Signed-off-by: Martijn Bakker <gladdyu@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch changes type from config to data for functions
called in the datapath.
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Same logging messages were used for both IOTLB cache
insertion failure and IOTLB pending insertion failure.
This patch differentiate them to ease logs analysis.
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch replaces multi-lines logs in multiple single-
line logs in order to ease logs filtering based on their
socket path.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch standardizes logging done in Virtio-net, so that
the Vhost-user socket path is always prepended to the logs.
It will ease log analysis when multiple Vhost-user ports
are in use.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch adds the Vhost socket path whenever possible in
order to make debugging possible when multiple Vhost
devices are in use. Some vhost-user layer functions are
modified to pass the device path down to the socket layer.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch adds the Vhost-user socket path to Vhost-user
layer logs in order to ease logs filtering.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch prepends Vhost logs with the Vhost-user socket
path when available to ease filtering logs for a given port.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch adds name of the device failing vDPA registration.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch adds IOTLB mempool name when logging debug
or error messages, and also prepends the socket path.
to all the logs.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch adds log for vring related info in handling of vhost message
VHOST_USER_SET_VRING_BASE, which will be useful in live migration case.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
In eth_dev_handle_port_info() allocated memory for rxq_state,
we should free it when error happens, otherwise it will lead
to memory leak.
Fixes: 58b43c1ddf ("ethdev: add telemetry endpoint for device info")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Old macros kept for backward compatibility, but this cause old macro
usage to sneak in silently.
Marking old macros as deprecated. Downside is this will cause some noise
for applications that are using old macros.
Fixes: 295968d174 ("ethdev: add namespace")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Any EAL memory allocation often goes through eal_get_virtual_area()
function, which will print a warning whenever the resulting allocation
didn't match the specified address requirements. This is useful for
when we have requested a specific base virtual address, to let the user
know that the mapping has deviated from that address.
However, on Linux, we also have a default base address that's there to
ensure better chances of successful secondary process initialization,
as well as higher likelihood of the virtual areas to fit inside the
IOMMU address width. Because of this default base address, there are
warnings printed even when no base address was explicitly requested,
which can be confusing to the user.
Emit this warning with debug level unless base address was explicitly
requested by the user.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch introduces new api for retrieving event port id
of eth rx adapter.
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
When event delivery is through internal port, stats are maintained
by HW and we should avoid reading SW data structures for stats.
Fix missing internal port checks.
Fixes: 995b150c1a ("eventdev/eth_rx: add queue stats API")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
While debugging running DPDK service in a container, it is
useful to see which file creation failed. Don't hide this
failure with DEBUG.
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Allow disabling of the cfgfile library in builds.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add port, table and pipeline libraries - collectively often known as
the "packet framework" - to the list of optional libraries, and
ensure tests can build with them disabled.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Add the flow_classify library to the list of optional libraries, and
ensure tests can build with it disabled.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Allow the 'node' library to be disabled in builds.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Align the code in lib/meson.build with that in drivers/meson.build to
enable recursive disabling of libraries, i.e. if library b depends on
library a, disable library b if a is disabled (either explicitly or
implicitly). This allows libraries to be optional even if other DPDK
libs depend on them, something that was not previously possible.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Similarly to rte_malloc, rte_gpu_mem_alloc accepts as
input the memory alignment size.
GPU driver should return GPU memory address aligned
with the input value.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Define a set of macros in the build configuration to allow C runtime
code to check the current OS environment. This saves the user having to
use ifdefs for e.g. disabling particular tests on Windows.
See included documentation changes for usage examples.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The generic RTE_FLOW_ACTION_TYPE_MODIFY_FIELD action was
introduced by [1]. This action provides an unified way
to perform various arithmetic and transfer operations over
packet network header fields and packet metadata.
[1] 73b68f4c54 ("ethdev: introduce generic modify flow action")
On other side there are a bunch of multiple legacy actions,
that can be superseded by the generic MODIFY_FIELD action:
RTE_FLOW_ACTION_TYPE_OF_SET_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_SET_NW_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_NW_TTL sfc
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_OUT
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_IN
RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC bnxt, cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV4_DST bnxt, cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_TP_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_TP_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_DEC_TTL mlx5, sfc
RTE_FLOW_ACTION_TYPE_SET_TTL mlx5
RTE_FLOW_ACTION_TYPE_SET_MAC_SRC cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_SET_MAC_DST cxgbe, mlx5
RTE_FLOW_ACTION_TYPE_INC_TCP_SEQ mlx5
RTE_FLOW_ACTION_TYPE_DEC_TCP_SEQ mlx5
RTE_FLOW_ACTION_TYPE_INC_TCP_ACK mlx5
RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV4_DSCP mlx5
RTE_FLOW_ACTION_TYPE_SET_IPV6_DSCP mlx5
RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_VID bnxt, cnxk, cxgbe, enic,
mlx5, octeontx2, sfc
RTE_FLOW_ACTION_TYPE_OF_SET_VLAN_PCP bnxt, cnxk, cxgbe, enic,
mlx5, octeontx2, sfc
RTE_FLOW_ACTION_TYPE_SET_TAG mlx5
RTE_FLOW_ACTION_TYPE_SET_META mlx5
This note deprecates the following RTE Flow actions,
as not supported by any of PMDs:
RTE_FLOW_ACTION_TYPE_OF_SET_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_DEC_MPLS_TTL
RTE_FLOW_ACTION_TYPE_OF_SET_NW_TTL
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_OUT
RTE_FLOW_ACTION_TYPE_OF_COPY_TTL_IN
The following actions are supposed to be deprecated in 22.07
and replaced by generic field modify action:
RTE_FLOW_ACTION_TYPE_OF_DEC_NW_TTL
RTE_FLOW_ACTION_TYPE_SET_IPV4_SRC
RTE_FLOW_ACTION_TYPE_SET_IPV4_DST
RTE_FLOW_ACTION_TYPE_SET_IPV6_SRC
RTE_FLOW_ACTION_TYPE_SET_IPV6_DST
RTE_FLOW_ACTION_TYPE_SET_TP_SRC
RTE_FLOW_ACTION_TYPE_SET_TP_DST
RTE_FLOW_ACTION_TYPE_DEC_TTL
RTE_FLOW_ACTION_TYPE_SET_TTL
RTE_FLOW_ACTION_TYPE_SET_MAC_SRC
RTE_FLOW_ACTION_TYPE_SET_MAC_DST
RTE_FLOW_ACTION_TYPE_INC_TCP_SEQ
RTE_FLOW_ACTION_TYPE_DEC_TCP_SEQ
RTE_FLOW_ACTION_TYPE_INC_TCP_ACK
RTE_FLOW_ACTION_TYPE_DEC_TCP_ACK
RTE_FLOW_ACTION_TYPE_SET_IPV4_DSCP
RTE_FLOW_ACTION_TYPE_SET_IPV6_DSCP
RTE_FLOW_ACTION_TYPE_SET_TAG
RTE_FLOW_ACTION_TYPE_SET_META
The VLAN set actions are interrelated to VLAN header insertion/removal
and supported by multiple PMDs and widely used by applications and
not supposed to be deprecated due to potential large impact on
drivers and applications.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Added a diagram to document meter library components
and added text for steps performed by the application to
configure the traffic meter and policing library.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Remove unnecessary rte_gpu_wmb from rte_gpu_comm_populate_list_pkts.
It causes a performance degradation in case of NVIDIA GPU V100.
This change doesn't affect any functionality as the status resides
in CPU registered memory.
Fixes: c7ebd65c13 ("gpudev: add communication list")
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Removing the use of driver following PMD as its unnecessary.
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Conor Fogarty <conor.fogarty@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Remove the use of double "the" as it does not make sense.
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Conor Fogarty <conor.fogarty@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The dump of dynamic fields and flags fails if the shm is already
allocated. Add a check to fix the issue.
Fixes: d4902ed31c ("mbuf: check shared memory before dumping dynamic space")
Cc: stable@dpdk.org
Signed-off-by: Alexander Bechikov <asb.tyum@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The gpudev functions free, register and unregister
return gracefully if input pointer is NULL or size 0,
as API doc was indicating no-op accepted values.
CUDA driver checks are removed because redundant
with the checks added in gpudev library.
Fixes: e818c4e2bf ("gpudev: add memory API")
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Multiple drivers are defining macros for VLAN header length, to remove
the redundancy defining macro in the ether header.
And updated drivers to use the new macro.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Jiawen Wu <jiawenwu@trustnetic.com>
This patch adds a comment for RTE_HASH_BUCKET_ENTRIES
explaining why a particular value was chosen.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
This library can be made optional.
drivers/gpu and app/test-gpudev depend on this library,
so they are automatically disabled if the lib is disabled.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
If the packet uses multiple descriptors and its descriptor indices are
wrapped, the first descriptor flag is not updated last, which may cause
virtio read the incomplete packet. For example, given a packet uses 64
descriptors, and virtio ring size is 256, and its descriptor indices are
224~255 and 0~31, current implementation will update 224~255 descriptor
flags earlier than 0~31 descriptor flags.
This patch fixes this issue by updating descriptor flags in one loop,
so that the first descriptor flag is always updated last.
Fixes: 873e8dad6f ("vhost: support packed ring in async datapath")
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This library can be made optional.
dumpcap and pdump applications depend on this library, check for
dependencies like what we have for examples.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
metrics, bitratestats, jobstats and latencystats libraries can be made
optional as they provide standalone features.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
GRO and GSO integration in testpmd is relatively self contained and easy
to extract.
Those libraries can be made optional as they provide standalone
features.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Update public macros to have RTE_IP_FRAG_ prefix.
Update DPDK components to use new names.
Keep obsolete macro for compatibility reasons.
Renamed experimental function ``rte_frag_table_del_expired_entries``to
``rte_ip_frag_table_del_expired_entries`` to comply with other public
API naming convention.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch fixes various issues:
- replace _mm512_set_epi8 with _mm512_set_epi32 due to the lack
of support by some compilers (at least, gcc 8),
- check if AVX512F is supported along with GFNI, this is done if the code
is built on a platform that supports GFNI, but does not support AVX512,
- fix compilation problems on 32bit arch due to lack of support for
_mm_extract_epi64() by implementing XOR folding with
_mm_extract_epi32() on 32-bit arch,
Fixes: 4fd8c4cb0d ("hash: add new Toeplitz hash implementation")
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Kai Ji <kai.ji@intel.com>
AVX512 was disabled when GNU binutils were missing or had a known bug,
even if LLVM binutils were used for the build,
because binutils-avx512-check.sh was invoked regardless and failed.
In particular, this was the case for FreeBSD with clang (default).
Run the check only when GNU binutils are used.
Fixes: 68b1f1cda5 ("build: check AVX512 rather than binutils version")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>