Add assertions to help debug in case of counter double alloc/free.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
The __hws_cnt_r2rcpy() function copies elements from one zero-copy ring
to another zero-copy ring in place.
This routine needs to consider the situation that the address was given
by source and destination could be both wrapped.
It uses 4 different "n" local variables to manage it:
- n: Number of elements to copy in total.
- n1: Number of elements to copy from ptr1, it is the minimal value
from source/dest n1 field.
- n2: Number of elements to copy from src->ptr1 to dst->ptr2 or from
src->ptr2 to dst->ptr1, this variable is 0 when both source and
dest n1 field are equal.
- n3: Number of elements to copy from src->ptr2 to dst->ptr2.
The function copies the first n1 elements. If n2 isn't zero it copies
more elements and check whether n3 is zero.
This logic is wrong since n3 may be bigger than zero even when n2 is
zero. This scenario is commonly happening in counters when the internal
mlx5 service thread copies elements from the reset ring into the reuse
ring.
This patch changes the function to copy n3 regardless of n2 value.
Fixes: 4d368e1da3 ("net/mlx5: support flow counter action for HWS")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
The HWS counter has 2 different identifiers:
1. Type "cnt_id_t" which represents the counter inside caches and in
the flow structure. This index cannot be zero and is mostly called
"cnt_id".
2. Internal index, the index in counters array with type "uint32_t".
mostly it is called "iidx".
The second ID is calculated from the first using "mlx5_hws_cnt_iidx()"
function.
When a direct counter is allocated, if the queue cache is not empty, the
counter represented by cnt_id is popped from the cache. This counter may
be invalid according to the query_gen field. Thus, the "iidx" is parsed
from cnt_id and if it is valid, it is used to update the fields of the
counter structure.
When this counter is invalid, all the cache is flashed and new counters
are fetched into the cache. After fetching, another counter represented
by cnt_id is taken from the cache.
Unfortunately, for updating fields like "in_used" or "age_idx", the
function wrongly may use the old "iidx" coming from an invalid cnt_id.
Update the "iidx" in case of an invalid counter popped from the cache.
Fixes: 4d368e1da3 ("net/mlx5: support flow counter action for HWS")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
Counter management structure has array of counter pools. This array is
invalid in management structure initialization and grows on demand.
The resizing include:
1. Allocate memory for the new size.
2. Copy the existing data to the new memory.
3. Move the pointer to the new memory.
4. Free the old memory.
The third step can be performed before for this function, and compiler
may do that, but another thread might read the pointer before coping and
read invalid data or even crash.
This patch allocates memory for this array once in management structure
initialization and limit the counters number by 16M.
Fixes: 3aa279157f ("net/mlx5: synchronize flow counter pool creation")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
As the queue-based aging API has been integrated[1], the flow aging
action support in HWS steering code can be enabled now.
[1]: https://patchwork.dpdk.org/project/dpdk/cover/
20221026214943.3686635-1-michaelba@nvidia.com/
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The use of rte_atomic functions is deprecated and is not
required in HWS code. HWS refcounts are used only during
control and always under lock.
Fixes: f8c8a6d844 ("net/mlx5/hws: add action object")
Signed-off-by: Alex Vesker <valex@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In this patch, we removed the necessity of the version files and
you don't need to update these files for each release, you can just
remove them.
Suggested-by: Ferruh Yigit <ferruh.yigit@amd.com>
Signed-off-by: Abdullah Ömer Yamaç <omer.yamac@ceng.metu.edu.tr>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@amd.com>
A recent commit added an explicit dependency check on common/mlx5.
For consistency, query dpdk_conf instead of the list of common drivers.
The lists *_drivers should be used only for printing.
Fixes: 3df380f617 ("common/mlx5: fix disabling build")
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Add support for queue operations:
- rx_queue_release
- tx_queue_release
Previous gve_tx_queue_release and gve_rx_queue_release functions are
only used internally to release Rx/Tx queue related resources.
But when the queues or ports are required to re-config, both of the dev
ops tx_queue_release and ops rx_queue_release will be checked and then
called.
Without these two dev ops, the Rx/Tx queue struct will be set as NULL
directly.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@amd.com>
The RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload can't be used in bonding
mode Broadcast and mode 8023AD. Currently, bonding driver forcibly removes
from the dev->data->dev_conf.txmode.offloads and processes as success in
bond_ethdev_configure(). But this still cause that rte_eth_dev_configure()
fails to execute because of the failure of validating Tx offload in the
eth_dev_validate_offloads(). So this patch moves the modification of txmode
offlaods to the stage of adding slave device to report the correct txmode
offloads.
Fixes: 18c41457cb ("net/bonding: fix mbuf fast free usage")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Some capabilities (like, rx_offload_capa and tx_offload_capa) of bonding
device in dev_info is zero when no slave is added. And its capability will
be updated when add a new slave device.
The capability to update dynamically may introduce some problems if not
handled properly. For example, the reconfig() is called to initialize
bonding port configurations when create a bonding device. The global
tx_mode is assigned to dev_conf.txmode. The DEV_TX_OFFLOAD_MBUF_FAST_FREE
which is the default value of global tx_mode.offloads in testpmd is removed
from bonding device configuration because of zero rx_offload_capa.
As a result, this offload isn't set to bonding device.
Generally, port configurations of bonding device must be within the
intersection of the capability of all slave devices. If use original port
configurations, the removed capabilities because of adding a new slave may
cause failure when re-initialize bonding device.
So port configurations of bonding device need to be updated because of the
added and removed capabilities. In addition, this also helps to ensure
consistency between testpmd and bonding device.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Min Hu (Connor) <humin29@huawei.com>
Currently, by default, bond4 will first try to enable allmulti and
then enable promiscuous if fail to enable allmulti. On reception,
whether unicast and multicast packets should be dropped depends on
which mode has been enabled on the bonding interface.
In fact, if MAC address of packets in mac_addrs array of bonding
interface, these packets should not be dropped. However, now only
check the default MAC address, which will cause the packets with
MAC added by the '.mac_addr_add' are dropped.
Fixes: 68218b87c1 ("net/bonding: prefer allmulti to promiscuous for LACP")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Change the advertised link speed from 10G to 100G as the memory
interfaces can reach higher throughput than 10G with large packets.
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Normally, the Rx/Tx offload capability of bonding interface is
the intersection of the capability of all slave devices. And
Rx/Tx offloads configuration of slave device comes from bonding
interface. But now there is a risk that slave device retains its
previous offload configurations which is not within the offload
configurations of bond interface.
Fixes: 57b156540f ("net/bonding: fix offloading configuration")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
The driver had once been broken by patch [1] looking to have
a non-zero "nb_max" value in a use case not involving adding
any back-end ports. That was addressed afterwards ([2]). But,
as per report [3], similar test cases exist which attempt to
setup Rx queues on a void bond before attaching any back-end
ports. Rx queue setup, in turn, involves device info get API
invocation, and one of the checks on received data causes an
exception (division by zero). The "nb_align" value is indeed
zero at that time, but, as explained in [2], such test cases
are totally incorrect since a bond device must have at least
one back-end port plugged before any ethdev APIs can be used.
Once again, to avoid any problems with fixing the test cases,
this patch adjusts the bond PMD itself to workaround the bug.
[1] commit 5be3b40fea ("net/bonding: fix values of descriptor limits")
[2] commit d03c0e83cc ("net/bonding: fix descriptor limit reporting")
[3] https://bugs.dpdk.org/show_bug.cgi?id=1118
Bugzilla ID: 1118
Fixes: d03c0e83cc ("net/bonding: fix descriptor limit reporting")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
Tested-by: Weiyuan Li <weiyuanx.li@intel.com>
As per report [1], the previous patch for device
configure code apparently overlooks the corner
case of manually adding back-end devices to
the bond using testpmd CLI. The problem is
in removing back-end ports on re-configure
instead of just stopping them. Fix that.
[1] https://bugs.dpdk.org/show_bug.cgi?id=1119
Bugzilla ID: 1119
Fixes: 339f1ba513 ("net/bonding: make configure method re-entrant")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Tested-by: Dukai Yuan <dukaix.yuan@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Fix the check logic of the index of the array, which
caused the out of bounds write problem.
Coverity issue: 381616
Fixes: c55abf6141 ("net/nfp: support RSS on VXLAN inner layer")
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
If tx_pkts is NULL, nb_pkts must be 0. Coverity doesn't know
this so it thinks this is a forward-NULL violation.
Make things more clear by checking for nb_pkts instead.
Coverity issue: 381614
Coverity issue: 381619
Fixes: e86a6fcc7c ("net/ionic: add optimized non-scattered Rx/Tx")
Signed-off-by: Andrew Boyer <andrew.boyer@amd.com>
(uint16_t * uint16_t) promoted to uint64_t has a sign extension
problem reported by Coverity. Cast one arg to uint64_t first
to eliminate the sign extension.
Coverity issue: 381617
Coverity issue: 381618
Fixes: 7b20fc2f3c ("net/ionic: overhaul Rx for performance")
Signed-off-by: Andrew Boyer <andrew.boyer@amd.com>
The pointers 'rxq' and 'txq' are dereferenced before the null check.
Fixed the logic in this patch.
Fixes: 4bec2d0b55 ("net/gve: support queue operations")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@amd.com>
With some higher GCC/CLANG version, it is not recommended to use a
structure with a tailing flexible array inside another structure.
Accessing this array may be considered as a risk to corrupt the
following field even if it is by intention.
The error below was observed:
drivers/net/mlx5/linux/mlx5_ethdev_os.c: In function 'mlx5_get_flag_dropless_rq':
drivers/net/mlx5/linux/mlx5_ethdev_os.c:1679:42: error:
invalid use of structure with flexible array member [-Werror=pedantic]
1679 | struct ethtool_sset_info hdr;
| ^~~
Changing it to memory dynamic allocation method will help to get
rid of this complain.
Fixes: e848218741 ("net/mlx5: check delay drop settings in kernel driver")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Add support of AVX512 vector data path for single queue model.
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add Tx offloading support:
- support TSO for single queue model and split queue model.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add Rx offloading support:
- support CHKSUM and RSS offload for split queue model
- support CHKSUM offload for single queue model
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Enable write back on ITR expire, then packets can be received one by
one.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add basic Tx support in split queue mode and single queue mode.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add basic Rx support in split queue mode and single queue mode.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add support for tx_queue_setup ops.
In the single queue model, the same descriptor queue is used by SW to
post buffer descriptors to HW and by HW to post completed descriptors
to SW.
In the split queue model, "RX buffer queues" are used to pass
descriptor buffers from SW to HW while Rx queues are used only to
pass the descriptor completions, that is, descriptors that point
to completed buffers, from HW to SW. This is contrary to the single
queue model in which Rx queues are used for both purposes.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Support device init and add the following dev ops:
- dev_configure
- dev_close
- dev_infos_get
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Add support to generate ipad and opad for MD5.
Skip the call to additional command WRITE_SA during SA creation.
Instead use the software defined function to generate opad and ipad.
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
The variable mlx5_config may be used by other mlx5 drivers
and should be always initialized.
By moving its initialization (with configuration file generation),
it is made consistent for Linux and Windows builds.
And the check of mlx5_config in net/mlx5 is moved at the top of
net/mlx5/hws/meson.build so HWS requirements are in the right context.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
If the dependency common/mlx5 is explicitly disabled,
but net/mlx5 is not explicitly disabled,
Meson will read the full recipe of net/mlx5
and will fail when accessing a variable from common/mlx5:
drivers/net/mlx5/meson.build:76:4: ERROR: Unknown variable "mlx5_config".
The solution is to stop parsing net/mlx5 if common/mlx5 is disabled.
The deps array must be defined before stopping, in order to automatically
disable the build of net/mlx5 and print the reason.
The same protection is applied to other mlx5 drivers,
so it will allow using the variable mlx5_config in future.
Fixes: 22681deead ("net/mlx5/hws: enable hardware steering")
Reported-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Alex Vesker <valex@nvidia.com>
The mlx5_is_thread_alive function always returns false
(terminated) regardless to the actual thread state.
Fixed to return the correct thread state.
Bugzilla ID: 1089
Fixes: 5d55a494f4 ("net/mlx5: split multi-thread flow handling per OS")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Based on the hard limits configured in the SA context,
PMD passes corresponding event subtype to the application
to notify hard expiry event
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Removed unnecessary datapointer(DPTR) update and remove ESN update
from microcode command word 0 based on the latest microcode.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Vidya Sagar Velumuri <vvelumuri@marvell.com>
In outbound inline case, use NIX Tx offset instead of
NIX Tx address for cn10kb as per new instruction format.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Fix later skip to include mbuf priv data as mbuf->buf_addr
is populated based on calculation including per-mbuf priv area.
Fixes: 706eeae607 ("net/cnxk: add multi-segment Rx for CN10K")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Add support to override soft expiry poll frequency via devargs.
Also provide helper API to indicate reassembly support on a chip
and documentation for devargs that are already present.
Fixes: 780b9c8924 ("net/cnxk: support zero AURA for inline meta")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>