User may want to configure same TC value across multiple queues, but
for that all queues should have a common TL3 where this TC value will
get configured.
Changed the pfc_tc_cq_map/pfc_tc_sq_map array indexing to qid and store
TC values in the array. As multiple queues may have same TC value.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Current PFC implementation does not support VFs.
This patch enables PFC on VFs too.
Also fix the config of aura.bp to be based on number
of buffers(aura.limit) and corresponding shift
value(aura.shift).
Fixes: cb4bfd6e7bdf ("event/cnxk: support Rx adapter")
Cc: stable@dpdk.org
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Avoid enabling CPT backpressure due to errata where
backpressure would block requests from even other
CPT LF's. Also allow CQ size >=16K.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Use computed value for WQE skip instead of a hard-coded value.
WQE skip needs to be number of 128B lines to accommodate rte_mbuf.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
The header file "roc_io.h" uses the "__plt_always_inline" macro but
don't include "roc_platform.h" to get the definition of it. This
inclusion is not necessary for compilation, but the lack of it can
confuse some indexers - such as those in eclipse, which reports the
lines:
"static __plt_always_inline uint64_t"
as possible definitions of a variable called "uint64_t". This confusion
leads to uint64_t being flagged as an unknown type in all other parts of
the project being indexed, e.g. across all of DPDK code.
Adding in the include of roc_platform.h makes it clear to the indexer
that those lines are part of a function definition, and that allows
eclipse to correctly recognise uint64_t as a type from stdint.h
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add ROC API to free the given MCAM entry. If the MCAM
entry has flow counter associated, this API will clear
and free the flow counter.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Fix the subsystem device ID for CN103XX.
Fixes: dd462f68f04a ("common/cnxk: support CN103XX platform")
Cc: stable@dpdk.org
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
When counting the batch allocated pointers in cnxk mempool driver,
currently it always waits for in-flight batch operations to finish.
Add a provision to make this waiting optional.
Signed-off-by: Ashwin Sekhar T K <asekhar@marvell.com>
Return with error on fail to initialize ROC model.
Fixes: 014a9e222bac ("common/cnxk: add model init and IO handling API")
Cc: stable@dpdk.org
Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
This add the support to dump NIX inline outbound CPT LF
registers.
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Update cnxk platform documentation to use
-Dplatform meson option for native builds.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
When dumping flow data, read hardware MCAM entry corresponding
to the flow and print that data also.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
Calling this telemetry callback with no argument caused a crash.
Fixes: 41cc645c214f ("net/cnxk: add inline IPsec telemetry for CN9K")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Added new platform layer macros for pointer operations,
bitwise operations, spinlock and 32 bit read and write.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Adding changes to accommodate the following requirements
while masking the channel number.
1. For CN10K device, channel number should not be masked
for first pass rules with RTE_FLOW_ACTION_TYPE_SECURITY
action. And channel number should be masked for all
other flow rules.
2. For CN9K device channel number should not be masked.
Fixes: 4968b362b639 ("common/cnxk: support CPT second pass flow rules")
Cc: stable@dpdk.org
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
During initialization, reset RX DMAC control register by
sending mbox message NIC_MBOX_MSG_RESET_XCAST to PF.
Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
Moving the logic of link polling to VF from PF. Now VF
is supposed to poll for the link status, rather PF alerting
VF about any link change.
Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
Segmentation fault has been observed while closing the ethernet
port. Reason for the segfault is, eth port close also shuts down
event device while other ethernet port is still using the event
device.
Fixes: da6c687471a3 ("net/octeontx: add start and stop support")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
The crossbuild-essential-<arch> packages contain all necessary
dependencies to cross-compile binaries for a given architecture
including C and C++ compilers. Therefore use those instead of listing
packages directly. This way C++ compiler is also installed and C++
include checks will be checked in CI for ARM and PowerPC.
Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Through some mixup all cross-files for ARM and PowerPC platforms were
using C Preprocessor (cpp) instead of GCC (g++).
This caused meson to fail detecting the C++ compiler presence and
therefore disabling some targets (i.e. C++ include file checks).
Fixes: e53a5299d219 ("build: support vendor specific ARM cross builds")
Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
If called to allocate memory of size is between multiple of hugepage
size minus malloc_header_len and hugepage size, rte_malloc fails.
This fix replaces malloc_elem_trailer_len with malloc_elem_overhead in
try_expand_heap() to include malloc_elem_header_len when calculating
n_seg.
Bugzilla ID: 800
Fixes: 07dcbfe0101f ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
rte_dump_stack() needs to be usable in situations when a bug is
encountered and from signal handlers (such as SEGV).
Glibc backtrace_symbols() calls malloc which makes it
dangerous in a signal handler that is handling errors that maybe
due to memory corruption. Additionally, rte_log() is unsafe because
syslog() is not signal safe; printf() is also documented as
not being safe.
This version formats message and uses writev for each line in a manner
similar to what glibc version of backtrace_symbols_fd() does. The
FreeBSD version of backtrace_symbols_fd() is not signal safe.
Sample output:
0: ./build/app/dpdk-testpmd (rte_dump_stack+0x2b) [560a6e9c002b]
1: ./build/app/dpdk-testpmd (main+0xad) [560a6decd5ad]
2: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xcd) [7fd43d3e27fd]
3: ./build/app/dpdk-testpmd (_start+0x2a) [560a6e83628a]
Bugzilla ID: 929
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
copy_data was returning a pointer to an increased (off by one) descriptor.
Subsequent calls to copy_data in the library were then failing.
Fix this by incrementing the descriptor only if there is some left data
to copy.
Fixes: 4414bb67010d ("vhost/crypto: fix build with GCC 12")
Cc: stable@dpdk.org
Reported-by: Jakub Poczatek <jakub.poczatek@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Jakub Poczatek <jakub.poczatek@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
In multi-process, the secondary process will remap PCI during
initialization, but the mapping is not removed in the uninit path,
the device is not closed, and the device busy error will be reported
when the device is hotplugged.
This patch unmaps PCI device at secondary process uninitialization
based on virtio_rempa_pci.
Fixes: 36a7a2e7a53f ("net/virtio: move PCI device init in dedicated file")
Cc: stable@dpdk.org
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
GCC 12 raises the following warning:
In file included from ../lib/mempool/rte_mempool.h:46,
from ../lib/mbuf/rte_mbuf.h:38,
from ../lib/vhost/vhost_crypto.c:7:
../lib/vhost/vhost_crypto.c: In function ‘rte_vhost_crypto_fetch_requests’:
../lib/eal/x86/include/rte_memcpy.h:371:9: warning: array subscript 1 is
outside array bounds of ‘struct virtio_crypto_op_data_req[1]’
[-Warray-bounds]
371 | rte_mov32((uint8_t *)dst + 3 * 32, (const uint8_t *)src + 3 * 32);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../lib/vhost/vhost_crypto.c:1178:42: note: while referencing ‘req’
1178 | struct virtio_crypto_op_data_req req;
| ^~~
Split this function and separate the per descriptor copy.
This makes the code clearer, and the compiler happier.
Note: logs for errors have been moved to callers to avoid duplicates.
Fixes: 3c79609fda7c ("vhost/crypto: handle virtually non-contiguous buffers")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Split the virtqs virt-queue resource between
the configuration threads.
Also need pre-created virt-queue resource
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
Steer table also need to support dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The driver creates a direct MR object of
the HW for each VM memory region,
which maps the VM physical address to
the actual physical address.
Later, after all the MRs are ready,
the driver creates an indirect MR to group all the direct MRs
into one virtual space from the HW perspective.
Create direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.
This optimization accelerrate the LM process and
reduce its time by 5%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The configuration threads tasks need a container to
support multiple tasks assigned to a thread in parallel.
Use rte_ring container per thread to manage
the thread tasks without locks.
The caller thread from the user context opens a task to
a thread and enqueue it to the thread ring.
The thread polls its ring and dequeue tasks.
That’s why the ring should be in multi-producer
and single consumer mode.
Anatomic counter manages the tasks completion notification.
The threads report errors to the caller by
a dedicated error counter per task.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The LM process includes a lot of objects creations and
destructions in the source and the destination servers.
As much as LM time increases, the packet drop of the VM increases.
To improve LM time need to parallel the configurations for mlx5 FW.
Add internal multi-thread management in the driver for it.
A new devarg defines the number of threads and their CPU.
The management is shared between all the devices of the driver.
Since the event_core also affects the datapath events thread,
reduce the priority of the datapath event thread to
allow fast configuration of the devices doing the LM.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The driver used a single global lock for any synchronization
needed for the datapath and control path.
It is better to group the critical sections with
the other ones that should be synchronized.
Replace the global lock with the following locks:
1.virtq locks(per virtq) synchronize datapath polling and
parallel configurations on the same virtq.
2.A doorbell lock synchronizes doorbell update,
which is shared for all the virtqs in the device.
3.A steering lock for the shared steering objects updates.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
dev_config operation is called in LM progress.
LM time is very critical because all
the VM packets are dropped directly at that time.
Move the virtq creation to probe time and
only modify the configuration later in
the dev_config stage using the new ability
to modify virtq.
This optimization accelerates the LM process and
reduces its time by 70%.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A virtq configuration can be modified after the virtq creation.
Added the following modifiable fields:
1.address fields: desc_addr/used_addr/available_addr
2.hw_available_index
3.hw_used_index
4.virtio_q_type
5.version type
6.queue mkey
7.feature bit mask: tso_ipv4/tso_ipv6/tx_csum/rx_csum
8.event mode: event_mode/event_qpn_or_msix
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>