DLB2 needs to parse a user-supplied coremask as part
of an optimization that associates optimal core/resource
pairs. Therefore, eal_parse_coremask has been renamed
to rte_eal_parse_coremask and exported, but kept internal.
Signed-off-by: Abdullah Sevincer <abdullah.sevincer@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Do not flush the buffered packets unnecessarily when a burst was sent
since the last flush call.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Free the buffered packets as opposed to retrying to send them when the
output port is freed.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Drop packets that cannot be sent instead of retrying to send the same
packets potentially forever when the ring consumer is down.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Do not flush the buffered packets unnecessarily when a burst was sent
since the last flush call.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Free the buffered packets as opposed to retrying to send them when the
output port is freed.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Drop packets that cannot be sent instead of retrying to send the same
packets potentially forever when the Ethernet device is down.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The sink port is tasked to drop all packets, hence the packet and byte
counters should be named to reflect the drop operation.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add packet drop statistics counters for the output ports. Required by
the non-blocking output port behavior where the packets that cannot
be sent at the time of the operation are dropped as opposed to the
send operation being retried potentially forever for the same packets.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
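
To illustrate the non-blocking output port behavior and the drop counters
described above, here is a minimal sketch in plain C. Every name in it
(struct pkt, struct out_port, out_port_send, tx_accept_two) is hypothetical
and not taken from the DPDK port library; the "freed" flag merely stands in
for releasing a packet buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet: only what the sketch needs. */
struct pkt {
	uint32_t len;	/* packet length in bytes */
	int freed;	/* set when the buffer would be released */
};

/* Hypothetical send callback: returns how many of the n packets were accepted. */
typedef unsigned int (*tx_burst_fn)(struct pkt **pkts, unsigned int n);

struct out_port {
	tx_burst_fn tx_burst;
	uint64_t n_pkts;	/* packets sent */
	uint64_t n_bytes;	/* bytes sent */
	uint64_t n_pkts_drop;	/* packets dropped because they could not be sent */
	uint64_t n_bytes_drop;	/* bytes dropped */
};

/* One send attempt; whatever is not accepted is dropped, never retried. */
static void
out_port_send(struct out_port *p, struct pkt **pkts, unsigned int n)
{
	uint64_t bytes_total = 0, bytes_drop = 0;
	unsigned int n_sent, i;

	assert(p != NULL && p->tx_burst != NULL);

	/* Read lengths first: accepted packets belong to the consumer afterwards. */
	for (i = 0; i < n; i++)
		bytes_total += pkts[i]->len;

	n_sent = p->tx_burst(pkts, n);
	assert(n_sent <= n);

	/* The tail of the burst was refused: account for it and free it. */
	for (i = n_sent; i < n; i++) {
		bytes_drop += pkts[i]->len;
		pkts[i]->freed = 1;
	}

	p->n_pkts += n_sent;
	p->n_bytes += bytes_total - bytes_drop;
	p->n_pkts_drop += n - n_sent;
	p->n_bytes_drop += bytes_drop;
}

/* Stub consumer for experiments: accepts at most two packets per call. */
static unsigned int
tx_accept_two(struct pkt **pkts, unsigned int n)
{
	(void)pkts;
	return n < 2 ? n : 2;
}
```

With the stub consumer, a burst of three packets results in two packets
counted as sent and one counted as dropped, and the call returns after a
single send attempt.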
Do not include <ctype.h>, <errno.h>, and <stdlib.h> from <rte_common.h>,
because they are not used by this file.
Include the needed headers directly from the files that need them.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There is no reason for rte_str_to_size() to be inline.
Move the implementation out of <rte_common.h>.
Export it as a stable ABI because it always has been public.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
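
For readers unfamiliar with this kind of helper, the following is a rough
sketch of a size-string parser that accepts K/M/G suffixes. It is an
illustration only, under the assumption that suffixes are powers of 1024;
str_to_size_sketch is a hypothetical name, and this is not the DPDK
implementation of rte_str_to_size(). Note that it needs <ctype.h>,
<errno.h> and <stdlib.h>, which it includes directly.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse "123", "0x10", "4K", "2M", "1G"; 0 on error. */
static uint64_t
str_to_size_sketch(const char *str)
{
	unsigned long long size;
	unsigned int shift = 0;
	char *end;

	assert(str != NULL);

	while (isspace((unsigned char)*str))
		str++;
	if (*str == '-')
		return 0;	/* negative sizes are rejected */

	errno = 0;
	size = strtoull(str, &end, 0);	/* base 0: decimal, octal or hex */
	if (errno != 0)
		return 0;

	/* Assumption of this sketch: suffixes are binary multiples. */
	switch (*end) {
	case 'G': case 'g':
		shift = 30;
		break;
	case 'M': case 'm':
		shift = 20;
		break;
	case 'K': case 'k':
		shift = 10;
		break;
	default:
		break;
	}
	return (uint64_t)size << shift;
}
```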
The RTE_CACHE_LINE_ROUNDUP() implementation duplicated RTE_ALIGN_MUL_CEIL().
In other places, RTE_CACHE_LINE_SIZE is assumed to be a power of 2,
so define RTE_CACHE_LINE_ROUNDUP() using RTE_ALIGN_CEIL().
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
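
The difference between the two rounding styles mentioned above can be
sketched as follows. The function names are hypothetical stand-ins, not the
DPDK macros: one rounds up to any multiple using division, the other uses a
mask and is only valid when the alignment is a power of 2.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of mul; works for any non-zero mul. */
static inline uint64_t
align_mul_ceil(uint64_t v, uint64_t mul)
{
	assert(mul != 0);
	return ((v + mul - 1) / mul) * mul;
}

/* Round v up to a multiple of align; align must be a power of 2. */
static inline uint64_t
align_ceil_pow2(uint64_t v, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}
```

For a power-of-2 alignment such as 64, both functions return the same
value, which is why the mask-based form is sufficient for a cache line
size.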
There is no point in such a call and UBSan complains about a call to
memcpy with a null pointer as second arg.
When building with -Db_sanitize=undefined, Clang gives the following
warning:
../lib/bpf/bpf_load.c:37:20: runtime error: null pointer passed as
argument 2, which is declared to never be null
A check of the sz before calling memcpy fixes that.
Signed-off-by: Henning Schild <henning.schild@siemens.com>
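
The kind of guard described above can be sketched like this. copy_if_any is
a hypothetical helper, not the code in bpf_load.c: it skips memcpy()
entirely when the size is zero, so a null source pointer is never passed to
it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy sz bytes, tolerating src == NULL when sz == 0. */
static void *
copy_if_any(void *dst, const void *src, size_t sz)
{
	if (sz != 0) {
		/* Only a non-empty copy requires valid pointers. */
		assert(dst != NULL && src != NULL);
		memcpy(dst, src, sz);
	}
	return dst;
}
```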
__rte_raw_cksum() (used by rte_raw_cksum() among others) accessed its
data through an uint16_t pointer, which allowed the compiler to assume
the data was 16-bit aligned. This in turn would, with certain
architectures and compiler flag combinations, result in code with SIMD
load or store instructions with restrictions on data alignment.
This patch keeps the old algorithm, but data is read using memcpy()
instead of direct pointer access, forcing the compiler to always
generate code that handles unaligned input. The __may_alias__ GCC
attribute is no longer needed.
The data on which the Internet checksum functions operate is almost
always 16-bit aligned, but there are exceptions. In particular, the
PDCP protocol header may (literally) have an odd size.
Performance impact seems to range from none to a very slight
regression.
Bugzilla ID: 1035
Fixes: 6006818cfb ("net: new checksum functions")
Cc: stable@dpdk.org
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
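
The memcpy()-based access pattern described above can be sketched as
follows. This is an illustrative, simplified one's-complement sum with
hypothetical function names, not the __rte_raw_cksum() source; the point is
only that each 16-bit word is loaded with memcpy(), so the buffer may start
at any address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Accumulate 16-bit words of buf into sum without assuming alignment. */
static uint32_t
raw_cksum_sum(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	const unsigned char *end;

	assert(buf != NULL || len == 0);
	if (len == 0)
		return sum;

	end = p + (len & ~(size_t)1);
	for (; p != end; p += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p, sizeof(v));	/* safe for unaligned p */
		sum += v;
	}
	if (len & 1) {
		uint16_t left = 0;

		memcpy(&left, end, 1);	/* trailing odd byte */
		sum += left;
	}
	return sum;
}

/* Fold the 32-bit accumulator into 16 bits. */
static uint16_t
raw_cksum_reduce(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

Summing the same bytes from an odd address and from the start of an array
gives the same result, since the load never depends on the alignment of the
pointer.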
A mempool consumes 3 memzones (with the default ring mempool driver).
The default DPDK configuration allows RTE_MAX_MEMZONE (2560) memzones.
Assuming there are no other memzones, this means that we can have a
maximum of 853 mempools.
In the vhost library, the IOTLB cache code so far was requesting a
mempool per vq, which means that at the maximum, the vhost library
could request mempools for 426 qps (two vqs per qp).
This limit was recently reached on big systems with a lot of virtio
ports (and multiqueue in use).
While the limit on mempool count could be something we fix at the DPDK
project level, there is no reason to use mempools for the IOTLB cache:
- the IOTLB cache entries do not need to be DMA-able and are only used
by the current process (in multiprocess context),
- getting/putting objects from/in the mempool is always associated with
some other locks, so some level of lock contention is already present.
We can convert to a malloc'd pool with objects put in a free list
protected by a spinlock.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
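
A malloc'd pool with a spinlock-protected free list, as proposed above, can
be sketched as follows. This is not the vhost IOTLB code: all names are
hypothetical, and a C11 atomic_flag is used as a stand-in spinlock so that
the sketch does not depend on DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct pool_node {
	struct pool_node *next;
};

struct obj_pool {
	atomic_flag lock;		/* stand-in spinlock */
	struct pool_node *free_list;	/* singly linked list of free objects */
	void *mem;			/* backing allocation */
};

static void
pool_lock(struct obj_pool *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin */
}

static void
pool_unlock(struct obj_pool *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Allocate n objects of at least obj_size bytes and put them all on the list. */
static int
pool_init(struct obj_pool *p, size_t obj_size, unsigned int n)
{
	const size_t unit = sizeof(struct pool_node);
	unsigned int i;
	char *mem;

	assert(p != NULL && n > 0);
	/* Each slot must hold a node and stay pointer-aligned. */
	obj_size = (obj_size + unit - 1) / unit * unit;
	if (obj_size == 0)
		obj_size = unit;

	mem = calloc(n, obj_size);
	if (mem == NULL)
		return -1;

	atomic_flag_clear(&p->lock);
	p->mem = mem;
	p->free_list = NULL;
	for (i = 0; i < n; i++) {
		struct pool_node *node = (struct pool_node *)(mem + i * obj_size);

		node->next = p->free_list;
		p->free_list = node;
	}
	return 0;
}

/* Take one object, or NULL when the pool is empty. */
static void *
pool_get(struct obj_pool *p)
{
	struct pool_node *node;

	pool_lock(p);
	node = p->free_list;
	if (node != NULL)
		p->free_list = node->next;
	pool_unlock(p);
	return node;
}

/* Return an object previously obtained with pool_get(). */
static void
pool_put(struct obj_pool *p, void *obj)
{
	struct pool_node *node = obj;

	pool_lock(p);
	node->next = p->free_list;
	p->free_list = node;
	pool_unlock(p);
}

static void
pool_fini(struct obj_pool *p)
{
	free(p->mem);
	p->mem = NULL;
	p->free_list = NULL;
}
```

Such a pool consumes no memzone at all, which is the property that matters
for the limit discussed above.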
Having a back reference to the index of the vq in the dev->virtqueue[]
array makes it possible to unify the internal API by passing only dev
and vq.
It also allows displaying the vq index in log messages.
Remove virtqueue index checks where unneeded (like in static helpers
called from a loop on all available virtqueues).
Move virtqueue index validity checks as early as possible.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
translate_ring_addresses and numa_realloc may change a virtio device and
virtio queue. Callers of those helpers must be extra careful and refresh
any reference to old data.
Change the prototypes of those functions as a way to hint about this
issue and always ask for an indirect pointer.
Besides, when reallocating the device and queue, the code already made
sure it will return a pointer to a valid device. The checks on such
returned pointer can be removed.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
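
The idea of using an indirect pointer as a hint can be sketched as below.
The struct and function are hypothetical simplifications, not the vhost
code: because the helper takes a pointer to the caller's pointer, the
caller's reference is updated in place and cannot silently go stale.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified virtqueue. */
struct vq {
	int index;
	int numa_node;
};

/* May replace *pvq with a new object; *pvq is always valid on return. */
static void
vq_numa_realloc(struct vq **pvq, int node)
{
	struct vq *old, *new_vq;

	assert(pvq != NULL && *pvq != NULL);
	old = *pvq;
	if (old->numa_node == node)
		return;

	new_vq = malloc(sizeof(*new_vq));
	if (new_vq == NULL)
		return;	/* allocation failed: keep using the old object */

	*new_vq = *old;
	new_vq->numa_node = node;
	free(old);
	*pvq = new_vq;	/* caller's pointer now refers to the new object */
}
```

A caller writes vq_numa_realloc(&vq, node) and keeps using vq afterwards;
any other copy of the old pointer must be refreshed from it.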
translate_ring_addresses (via numa_realloc) may change a virtio device and
virtio queue.
The virtqueue object must be refreshed before accessing the lock.
Fixes: 04c27cb673 ("vhost: fix unsafe vring addresses modifications")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Previously, the pipeline build operation was done based on the
specification file (typically produced by the P4 compiler), then the C
code with optimized functions for the pipeline actions and
instructions was generated, built into a shared object library, loaded
and installed into the pipeline in a completely hardcoded and
non-customizable way.
Now, this process is split into three explicit stages:
i) code generation (specification file -> C file);
ii) code build (C file -> shared object library);
iii) code installation (library load into the pipeline).
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Previously, the C code generation for the pipeline was hidden under
the hood; now, we make this an explicit API operation. Besides the
functions for the pipeline actions and the pipeline instructions,
the generated C source code now includes the pipeline specification
structure required for the pipeline configuration operations.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add specification data structure and API for the pipeline I/O ports
and related pipeline configuration such as packet mirroring.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add support to export the pipeline specification data structure to a C
source code file.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Rework the specification file-based pipeline build operation to first
parse the specification file into the previously introduced pipeline
specification data structure, then use this structure to configure
and build the pipeline.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add specification data structure for the entire pipeline.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Move all the pipeline object specification data structures to an
internal header file.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add a unique name to every pipeline. This enables the library to
maintain a list of the existing pipeline objects, which can be
queried by the application.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Some NIC drivers support the MBUF_FAST_FREE offload (the device supports
an optimization for fast release of mbufs; when set, the application
must guarantee that, per queue, all mbufs come from the same mempool,
have refcnt = 1, and are direct and non-segmented).
To accommodate this offload, add this API.
Add some test data for this API.
Signed-off-by: Huichao Cai <chcchc88@163.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
This patch aims at supporting the unlikely case where a
Virtio-net header spans more than two descriptors.
CVE-2022-2132
Fixes: fd68b4739d ("vhost: use buffer vectors in dequeue path")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
This patch discards descriptor chains whose size is smaller
than or equal to the Virtio-net header size.
Indeed, such descriptor chain sizes mean there is no
packet data.
This patch also has the advantage of requesting the exact
packet sizes for the mbufs.
CVE-2022-2132
Fixes: 62250c1d09 ("vhost: extract split ring handling from Rx and Tx functions")
Fixes: c3ff0ac70a ("vhost: improve performance by supporting large buffer")
Fixes: 84d5204310 ("vhost: support async dequeue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The "offset" and "n_bits" fields were generated incorrectly, hence the
output C file was producing compilation errors when the "recircid"
instruction was used.
Fixes: 5ec76d29dc ("pipeline: support packet recirculation")
Cc: stable@dpdk.org
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Enable CMAN (RED or PIE) at init from the profile
configuration file.
By default, the CMAN code is enabled but not in use when
no RED or PIE profile is configured.
Signed-off-by: Marcin Danilewicz <marcinx.danilewicz@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Start a new release cycle with empty release notes.
The ABI version becomes 23.0.
The map files are updated to the new ABI major number (23).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Special handling of removed drivers is also dropped in check-abi.sh and
a note has been added in libabigail.abignore as a reminder.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Dirty page logging is only required in the vhost enqueue direction for
live migration. This patch removes the unnecessary dirty page logging
in the vhost dequeue direction, which would otherwise result in a
performance drop. Some if-else checks are also optimized to improve
performance.
Fixes: 6d823bb302 ("vhost: prepare sync for descriptor to mbuf refactoring")
Fixes: b6eee3e834 ("vhost: fix sync dequeue offload")
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Xingguang He <xingguang.he@intel.com>
A packet type of RTE_PTYPE_L4_FRAG (0x300) has the bits of both
RTE_PTYPE_L4_TCP (0x100) and RTE_PTYPE_L4_UDP (0x200) set. A fragmented
packet as defined in rte_mbuf_ptype.h cannot be recognized as another
L4 type, hence the GRO layer should not use IS_IPV4_TCP_PKT or
IS_IPV4_UDP_PKT for RTE_PTYPE_L4_FRAG. Instead, if the packet type is
RTE_PTYPE_L4_FRAG, the IP header should be parsed to recognize the
appropriate IP type and invoke the respective GRO handler.
Fixes: 1ca5e67408 ("gro: support UDP/IPv4")
Cc: stable@dpdk.org
Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
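
The bit overlap described above can be shown with a small sketch. The three
type values are the ones quoted in the commit message; the 0xf00 mask, the
helper names and the use of the IP protocol numbers 6 (TCP) and 17 (UDP)
are assumptions of this illustration, not the GRO library code.

```c
#include <assert.h>
#include <stdint.h>

#define PTYPE_L4_TCP	0x00000100u
#define PTYPE_L4_UDP	0x00000200u
#define PTYPE_L4_FRAG	0x00000300u
#define PTYPE_L4_MASK	0x00000f00u	/* assumed width of the L4 field */

/* Bit test: also true for FRAG, because 0x300 contains the 0x100 bit. */
static int
has_tcp_bits(uint32_t ptype)
{
	return (ptype & PTYPE_L4_TCP) == PTYPE_L4_TCP;
}

/* Field comparison: true only when the whole L4 field equals TCP. */
static int
is_l4_tcp(uint32_t ptype)
{
	return (ptype & PTYPE_L4_MASK) == PTYPE_L4_TCP;
}

/* For a fragment, decide the L4 type from the IP header protocol field. */
static uint32_t
frag_l4_from_ip_proto(uint32_t ptype, uint8_t ip_proto)
{
	assert((ptype & PTYPE_L4_MASK) == PTYPE_L4_FRAG);

	switch (ip_proto) {
	case 6:
		return PTYPE_L4_TCP;
	case 17:
		return PTYPE_L4_UDP;
	default:
		return 0;	/* no handler for this protocol */
	}
}
```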
This commit fixes an issue where calling rte_service_lcore_stop()
would result in a service's "active on lcore" status becoming stale.
The stale status would result in rte_service_may_be_active() always
returning "1", indicating that the service may still be active.
This is fixed by ensuring the "active on lcore" status of each service
is set to 0 when an lcore is stopped.
Fixes: e30dd31847 ("service: add mechanism for quiescing")
Fixes: 8929de043e ("service: retrieve lcore active state")
Reported-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
We recently improved the log messages in the vhost library, adding some
context that helps filtering for a given vhost-user device.
However, some parts of the code were missed, and some later code changes
broke this new convention (fixes were sent prior to this patch).
Change the VHOST_LOG_CONFIG/DATA helpers and always ask for a string
used as context. This should help limit regressions on this topic.
Most of the time, the context is the vhost-user device socket path.
For the rest, when no vhost-user device can be associated, generic
names were chosen:
- "dma", for vhost-user async DMA operations,
- "device", for vhost-user device creation and lookup,
- "thread", for thread management.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
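
Making the context a mandatory argument, as described above, can be
sketched as follows. format_with_ctx and the "(ctx) " prefix layout are
hypothetical; this is not the VHOST_LOG_CONFIG/DATA definition, only an
illustration of a helper that cannot be called without a context string.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Format "(ctx) message" into out; returns the length that was needed. */
static int
format_with_ctx(char *out, size_t size, const char *ctx, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	assert(out != NULL && size > 0 && ctx != NULL && fmt != NULL);

	n = snprintf(out, size, "(%s) ", ctx);
	if (n < 0 || (size_t)n >= size)
		return n;	/* error, or prefix alone filled the buffer */

	va_start(ap, fmt);
	m = vsnprintf(out + n, size - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}
```

A call such as format_with_ctx(buf, sizeof(buf), "dma", "ch %d", 2)
produces a message that starts with the context in parentheses, so log
output can be filtered per device or per generic name.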
Those messages were missed when adding socket context.
Fix this.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Device information in the log messages was dropped.
Fixes: 52ade97e36 ("vhost: fix physical address mapping")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch checks the return value of rte_dma_info_get()
called in rte_vhost_async_dma_configure().
Coverity issue: 379066
Fixes: 53d3f4778c ("vhost: integrate dmadev in asynchronous data-path")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch fixes the missing virtio net header copy in sync
dequeue path caused by refactoring, which affects dequeue
offloading.
Fixes: 6d823bb302 ("vhost: prepare sync for descriptor to mbuf refactoring")
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
In the virtio blk vDPA live migration use case, before the live
migration process, QEMU sets the call fd on the vDPA back-end. QEMU
and the vDPA back-end stand by until live migration starts.
During the live migration process, QEMU sets the kick fd and a new
call fd. However, once the kick fd is set on the vDPA back-end, the
vDPA back-end configures the device and the data path starts. The new
call fd then triggers a kind of "re-configuration", which causes
I/O drops.
After this patch, the vDPA back-end configures the device only after
both the kick fd and the call fd are set, which ensures that no I/O
is dropped.
This patch only impacts the virtio blk vDPA device and does not impact
the net device.
Fixes: 7015b65771 ("vdpa/ifc: add block device SW live-migration")
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add spinlock protection in queue delete function.
This protects the data path while the queue delete operation
is in progress.
Fixes: a3bbf2e097 ("eventdev: add eth Tx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
These are functions related to interrupts that have been
present since the 20.02 release or earlier.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
These APIs have been around for a long time and by now are fixed.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
RTE_LOG_REGISTER is not experimental, and the experimental
tag was never enforced on these.
Make rte_log_can_log a fully supported function.
It was introduced nearly two years ago.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ray Kinsella <mdr@ashroe.eu>