In rte_flow, if a counter action is before a meter which has
non-termination policy, the counter value only includes packets not
being dropped.
This patch fixes this issue by differentiating the order of counter and
non-termination meter:
1. counter + meter, counts all packets hitting this flow.
2. meter + counter, only counts packets not being dropped.
Fixes: 51ec04dc7b ("net/mlx5: connect meter policy to created flows")
Cc: stable@dpdk.org
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Users can probe primary or secondary PCIe id when bonding is
configured.
1. -a 0a:00.0,representor=pf[0-1]vf[0-1], PMD probes 5 ports
totally: bonding device plus 4 representor ports.
2. -a 0a:00.1,representor=pf[0-1]vf[0-1], PMD only probes 2
representor ports.
Under the 2nd condition, bonding IB device doesn't have the same
PCIe id and PMD needs to check bonding relationship otherwise
probe failure.
Fixes: 6856efa54e ("net/mlx5: fix PF leak on PCI probing failure")
Cc: stable@dpdk.org
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When txq_inline_max is too large and an mbuf is multi-segment
it may be impossible to inline data and build a valid WQE,
because WQE length would be larger then HW can represent.
It is impossible to detect misconfiguration at startup,
because the condition depends on the mbuf composition.
The check on the data path to prevent the error
treated the length limit as expressed in 64B units,
while the calculated length and limit are in 16B units.
Fix the condition to avoid subsequent TxQ failure and recovery.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
MR end for a mempool chunk may be calculated incorrectly.
For example, for chunk with addr=1.5M and len=1M with 2M page size
the range would be [0, 2M), while the proper result is [0, 4M).
Fix the calculation.
Fixes: 690b2a88c2 ("common/mlx5: add mempool registration facilities")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Multi-Packet Rx queue uses PMD-managed buffers to store packets.
These buffers are externally attached to user mbufs.
This conflicts with the feature that allows using user-managed
externally attached buffers in an application.
Fall back to SPRQ in case external buffers mempool is configured.
The limitation is already documented in mlx5 guide.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The netvsc should use RTE_MBUF_F_TX_L4_MASK and check the masked value
to decide the correct way to calculate checksums.
Not checking for RTE_MBUF_F_TX_L4_MASK results in incorrect RNDIS
packets sent to VSP and incorrect checksums calculated by the VSP.
Fixes: 4e9c73e96e ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
This patch adds quanta size configuration support.
Quanta size should between 256 and 4096, and be a product of 64.
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
In the Rx bulk path, packets which are taken from the HW ring, are first
copied to the stage data structure and then later copied from the stage
to the rx_pkts array. For the number of packets requested immediately
by the receiving function, this two-step process adds extra overhead
that is not necessary.
Instead, put requested number of packets directly into the rx_pkts array
and only stage excess packets. On N1SDP with 1 core/port, l3fwd saw up
to 4% performance improvement. On x86, no difference in performance was
observed.
Signed-off-by: Kathleen Capella <kathleen.capella@arm.com>
Suggested-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When parsing raw flow pattern in FDIR, the input parameter spec and
mask are used directly and the original value will be changed. It
will cause error if these values are used in other functions. In this
patch, temporary variables are created to store the spec and mask.
Fixes: 25be39cc17 ("net/ice: enable protocol agnostic flow offloading in FDIR")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Junfeng Guo <junfeng.guo@intel.com>
Not necessary to create / destroy a parser instance for every raw packet
rule. A global parser instance will be created in ice_flow_init and be
destroyed in ice_flow_uninit.
Also, ice_dev_udp_tunnel_port_add has been hooked to perform corresponding
parser configure. This also fix the issue that RSS engine can't support
VXLAN inner through raw packet filter.
Fixes: 1b9c68120a ("net/ice: enable protocol agnostic flow offloading in RSS")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Xu Ting <ting.xu@intel.com>
The function ice_xmit_pkts_vec_avx2_offload was left out in the list
of tx functions for ice_tx_burst_mode_get.
Fixes: 52ccdcf2fd ("net/ice: add AVX2 offload Tx")
Cc: stable@dpdk.org
Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Suggested-by: Michael Rossberg <michael.rossberg@tu-ilmenau.de>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Errors from i40e_flow_parse_fdir_pattern() can bubble up to
rte_flow_create. If rte_flow_error is not initialized a caller may
dereference error->message. This may be uninitialized memory, leading
to a segemntation fault.
Fixes: 4a072ad434 ("net/i40e: fix flow director config after flow validate")
Cc: stable@dpdk.org
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Previously, each time a burst of packets is received, SW reads HW
register and assembles it and the timestamp from descriptor together to
get the complete 64 bits timestamp.
This patch optimizes the algorithm. The SW only needs to check the
monotonicity of the low 32bits timestamp to avoid crossing borders.
Each time before SW receives a burst of packets, it should check the
time difference between current time and last update time to avoid
the low 32 bits timestamp cycling twice.
The patch proved a 50% ~ 70% single core performance improvement on a
main stream Xeon server, this fix the performance gap for some use cases.
Fixes: f9c561ffbc ("net/ice: fix performance for Rx timestamp")
Cc: stable@dpdk.org
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
For i40e_xmit_pkts_vec_xx function, it checks nb_pkts to ensure nb_pkts
does not cross rs_thresh.
However, in i40e_xmit_fixed_burst_vec_xx function, this check will be
performed again. To improve code, delete this redundant check.
Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
When setup Rx queue, the rxdid would be changed if it's
"IAVF_RXDID_LEGACY_0/1", that caused the scan HW ring used the wrong
function 'iavf_rx_scan_hw_ring_flex_rxd()'.
Ignore the rxdid changed when equals "IAVF_RXDID_LEGACY_0/1".
Fixes: 0ed16e0131 ("net/iavf: fix function pointer in multi-process")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Replace the SMP barrier with atomic thread fence for iavf hw ring scan
in the bulk Rx path.
This patch introduces a change to the iavf driver that was already added
to the i40e driver [1] as part of the adoption of the use of compiler
atomics.
[1]Commit 8649e23566 ("net/i40e: replace SMP barrier with thread fence
in Rx")
Signed-off-by: Kathleen Capella <kathleen.capella@arm.com>
Reviewed-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Some XGS-PON SFPs have been observed ACKing I2C reads and returning
uninitialized garbage while their uC boots. This can lead to the SFP ID
code marking an otherwise working SFP module as unsupported if a bogus
ID value is read while its internal PHY/microcontroller is still
booting.
Retry the ID read several times looking not just for NAK, but also for a
valid ID field.
Since the device isn't NAKing the transaction, the existing longer retry
code in ixgbe_read_i2c_byte_generic_int() doesn't apply here.
Signed-off-by: Stephen Douthit <stephend@silicom-usa.com>
Signed-off-by: Jeff Daly <jeffd@silicom-usa.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
256 queues can be allowed now. This patch improves the code
to support 256 queues for per PF.
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
test_bpf_convert is being conditionally registered depending on the
presence of RTE_HAS_LIBPCAP except the UT unconditionally lists it as a
test to run.
When the UT runs test_bpf_convert test-dpdk can't find the registration
and assumes the DPDK_TEST environment variable hasn't been defined
resulting in test-dpdk dropping to interactive mode and subsequently
waiting for the remainder of the UT fast-test timeout period before
reporting the test as having timed out.
* unconditionally register test_bpf_convert,
* if ! RTE_HAS_LIBPCAP provide a stub test_bpf_convert that reports the
test is skipped similar to that done with the test_bpf test.
Fixes: 2eccf6afbe ("bpf: add function to convert classic BPF to DPDK BPF")
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Mark the trylock family of spinlock functions with
__rte_warn_unused_result.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
This patch adds a wrapper macro __rte_warn_unused_result for the
warn_unused_result function attribute.
Marking a function __rte_warn_unused_result will make the compiler
emit a warning in case the caller does not use the function's return
value.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
The conditional rte_spinlock_trylock() was used as if it is an
unconditional lock operation in a number of places.
Fixes: cc7e8ae84f ("examples/bond: add example application for link bonding mode 6")
Cc: stable@dpdk.org
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
All OS implementations provide the same main loop.
Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
between main and threads and factorize the rest as common code.
Thread id are now logged as string in a common format across OS.
Note:
- this change also fixes Windows EAL: worker threads cpu affinity was
incorrectly reported in log.
- libabigail flags this change as breaking ABI in clang builds:
1 function with some indirect sub-type change:
[C] 'function int rte_eal_remote_launch(int (void*)*, void*, unsigned
int)' at eal_common_launch.c:35:1 has some indirect sub-type
changes:
parameter 1 of type 'int (void*)*' changed:
in pointed to type 'function type int (void*)' at rte_launch.h:31:1:
entity changed from 'function type int (void*)' to 'typedef
lcore_function_t' at rte_launch.h:31:1
type size hasn't changed
This is being investigated on libabigail side.
For now, we don't have much choice but to waive reports on this symbol.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
So far, a worker thread has been using its thread_id to discover which
lcore has been assigned to it.
On the other hand, as noted by Tyler, the pthread API does not strictly
guarantee that a new thread won't start running eal_thread_loop before
pthread_create writes to &lcore_config[xx].thread_id.
Though all OS implementations supported in DPDK (recently) ensure this
property, it is more robust to have the main thread directly pass
the worker thread lcore.
Signed-off-by: David Marchand <david.marchand@redhat.com>
if dpdmux objects created by restool tools with
the argument "--default-if=<if-id-number>", this
function would change it to 1
Fixes: 1def64c2d7 ("net/dpaa2: add dpdmux initialization and configuration")
Cc: stable@dpdk.org
Signed-off-by: Tianli Lai <laitianli@tom.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The NULL pointer check is clearly unneeded.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Build DPDK with Fedora 35 containers.
GHA container support does not allow caching images and docker hub
seems to limit image pulls.
On the other hand, the Fedora project hub does not seem to limit them,
so prefer this hub.
Nevertheless, let's try to be good citizens and cache (once a day) a
prepared image for subsequent builds.
This preparation is done in a first prepare-container-images job.
The rpm-container-builds job then depends on it with a 'needs:' tag.
Differences with builds in Ubuntu GHA vm images:
- tasks are run as root in containers, no need for sudo,
- compiler must be explicitly installed,
- GHA artifacts can't contain a ':' in their name, and must be filtered,
- environment variables are not inherited and must be passed explicitly,
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
eal_thread_loop() uses lcore_config[i].thread_id,
which is stored upon the return from CreateThread().
Per documentation, eal_thread_loop() can start
before CreateThread() returns and the ID is stored.
Create lcore worker threads suspended and then subsequently resume to
allow &lcore_config[i].thread_id be stored before eal_thread_loop
execution.
Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Remove the x86 top atomic header include from the architecture related
header file, since this x86 top atomic header file has included them.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
These were implicit from DPDK script but adding
separate reference to make it explicit.
Separate sections for API and PMDs
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Niklas has been appointed the new maintainer for the NFP PMD.
Update the MAINTAINERS file to reflect this.
Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com>
Suppression rules are being added during the life of an ABI and cleaned
when bumping the major version.
Sort and document those rules to avoid pruning rules that should be kept.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Start a new release cycle with empty release notes.
Bump version and ABI minor.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Aaron Conole <aconole@redhat.com>
An errata exists where users may see reduced power savings when using
PMD Power Management. This issue occurs when compiling DPDK applications
with GCC-9 on platforms with TSX enabled. In rte_power_monitor_multi(),
the function may return without successfully starting the RTM
transaction (the _xbegin() fails).
Signed-off-by: David Hunt <david.hunt@intel.com>
Compile failed with cflag optimization=1 on Ubuntu20.04 with GCC10.3,
it reported vendor_id and dev_id may be used uninitialized in function
ifpga_rawdev_fill_info().
Actually it's not the truth, the variables are initialized in function
ifpga_get_dev_vendor_id(). To avoid such compile error, the variables
are initialized when they are defined.
Fixes: 9c006c45d0 ("raw/ifpga: scan PCIe BDF device tree")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Blank line added to the final telemetry example for the
cryptography device library as to fix the example
rendering.
Fixes: 1c559ee846 ("cryptodev: add telemetry endpoint for capabilities")
Cc: stable@dpdk.org
Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
The docs mention modifications and additions to the cross file,
but there is no demonstration of how those should look like.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Numactl cross compilation doesn't work with clang, remove it
and fix the GCC cross compiler executable name.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
The newer versions have an extra -none- in the name.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Remove CFLAGS and LDFLAGS since Meson doesn't support them well enough.
Add Meson alternatives: -Dc_args and -Dc_link_args on the command line
and in cross files.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
From Documentation/admin-guide/kernel-parameters.txt, specifically the
last sentence:
nohz_full= [KNL,BOOT,SMP,ISOL]
The argument is a cpu list, as described above.
In kernels built with CONFIG_NO_HZ_FULL=y, set
the specified list of CPUs whose tick will be stopped
whenever possible. The boot CPU will be forced outside
the range to maintain the timekeeping. Any CPUs
in this list will have their RCU callbacks offloaded,
just as if they had also been called out in the
rcu_nocbs= boot parameter.
The kernel or-s the nohz_full cpumask into the rcu_nocbs cpumask at
startup, and uses that.
Signed-off-by: Tudor Brindus <me@tbrindus.ca>
The document roadmap section was missing any mention of the individual
drivers guides which are important for users. Add them to list.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>