Ensure that Rx/Tx/Async completion entry fields are accessed
only after the completion's valid flag has been loaded and
verified. This is needed for correct operation on systems that
use relaxed memory consistency models.
Fixes: 2eb53b134a ("net/bnxt: add initial Rx code")
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Add support for rte_flow_item_raw to parse custom L2 and L3
protocols.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add roc API for parsing custom L2 and L3 protocols.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
Set link status to down and don't fetch link status from kernel
when device in stopped state.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Until hierarchy committed TM hardware resources are not allocated
for node.
This patch check for status of HW resources before reading statistics.
Fixes: 1e25d57fae ("net/octeontx2: add TM stats and shaper profile")
Cc: stable@dpdk.org
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Set link status to down and don't fetch link status from kernel
when device in stopped state.
Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Preallocation of MCAM entries is not valid anymore since the
AF side MCAM allocation scheme has changed. This patch disables
preallocation by changing the default MCAM preallocation size
from 8 to 1.
Fixes: 168c59cfe4 ("net/octeontx2: add flow MCAM utility functions")
Cc: stable@dpdk.org
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
In the inline inound path, a custom header would be present at L3 which
has sequence number & SPI. L2 need to be adjusted such that the eventual
packet would have L3 after L2. Remove assumption of L2 type in this
handling.
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
vlan_strip and vlan_extend features need to return "unsupported"
error value.
Fixes: ff0b8b10dc ("net/mvpp2: support VLAN offload")
Cc: stable@dpdk.org
Signed-off-by: Meir Levi <mlevi4@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Need to set configure flag to allow create and commit mrvl tm
hierarchy tree. tm configuration depends on parameters that are
being set in port configure stage, e.g. nb_tx_queues.
This also aligned with the tm api description.
Fixes: 429c394417 ("net/mvpp2: support traffic manager")
Cc: stable@dpdk.org
Signed-off-by: Dana Vardi <danat@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
ethtool_cmd_speed return uint32 and after the arithmetic
operation in mrvl_get_max_rate func the result is out of range.
Fixes: 429c394417 ("net/mvpp2: support traffic manager")
Cc: stable@dpdk.org
Signed-off-by: Dana Vardi <danat@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
The replenishment scheme for the vectorized MPRQ Rx burst aims
to improve the cache locality by allocating new mbufs only when
there are almost no mbufs left: one burst gap between allocated
and consumed indexes.
This gap is not big enough to accommodate a corner case when we
have a very aggressive CQE compression with multiple regular CQEs
at the beginning and 64 zipped CQEs at the end.
Need to keep in mind this case and extend the replenishment
threshold by MLX5_VPMD_RX_MAX_BURST (64) to avoid mbuf overflow.
Fixes: 5fc2e5c27d ("net/mlx5: fix mbuf overflow in vectorized MPRQ")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
IPV6_FRAG_EXT item is missed for RSS expansion which causes wrongly
expanded flows:
flow create 0 ingress pattern eth / ipv6 / udp dst is 250 / vxlan-gpe /
ipv6 / ipv6_frag_ext / end actions rss level 2 types ip end / end
Different from other items, IPV6_FRAG_EXT hasn't next field because HW
only support to do hash of UDP/TCP for non-fragment.
This MLX5_EXPANSION_IPV6_FRAG_EXT node in RSS expansion graph only helps
RSS expansion function to locate right node in graph from which start
to expand.
Fixes: 0e5a0d8f75 ("net/mlx5: support match on IPv6 fragment extension")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Some RSS expandable items are missing which leads to the expanded
rte flow rules with wrong patterns.
Fix by adding missed items.
Fixes: d91093b9a2 ("net/mlx5: fix RSS pattern expansion")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Query MLX5 port hardware if it is capable to offload IPv4
IHL field.
Provide flow rules capability to match on IPv4 IHL field.
Minimal HCA firmware version required to offload IPv4 IHL is
xx_30_2000.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
As hrxq struct has the indirect table pointer, while matching the
hrxq, better to use the hrxq indirect table instead of searching
from the list.
This commit optimizes the hrxq indirect table matching.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit changes the index pool memory release configuration
to 0 when memory reclaim mode is not required.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, all the hash list tables are allocated during start up.
Since different applications may only use dedicated limited actions,
optimized the hash list table allocate on demand will save initial
memory.
This commit optimizes hash list table allocate on demand.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit enables the tag and header modify action indexed
pool per-core cache in non-reclaim memory mode.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
With the new per core optimization to the list, the hash bucket size
can be tuned to a more accurate number.
This commit adjusts the hash bucket size.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Modify header actions are allocated by mlx5_malloc which has a big
overhead of memory and allocation time.
One of the action types under the modify header object is SET_TAG,
The SET_TAG action is commonly not reused by the flows and each flow has
its own value.
Hence, the mlx5_malloc becomes a bottleneck in flow insertion rate in
the common cases of SET_TAG.
Use ipool allocator for SET_TAG action.
Ipool allocator has less overhead of memory and insertion rate and has
better synchronization mechanism in multithread cases.
Different ipool is created for each optional size of modify header
handler.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
This commit supports the list non-lcore operations with
an extra sub-list and lock.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, hash list uses the cache list as bucket list. The list
in the buckets have the same name, ctx and callbacks. This wastes
the memory.
This commit abstracts all the name, ctx and callback members in the
list to a constant struct and others to the inconstant struct, uses
the wrapper functions to satisfy both hash list and cache list can
set the list constant and inconstant struct individually.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, the list's local cache instance memory is allocated with
the list. As the local cache instance array size is RTE_MAX_LCORE,
most of the cases the system will only have very limited cores.
allocate the instance memory individually per core will be more
economic to the memory.
This commit changes the instance array to pointer array, allocate
the local cache memory only when the core is to be used.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Using the mlx5 list utility object in the hlist buckets.
This patch moves the list utility object to the common utility, creates
all the clone operations for all the hlist instances in the driver.
Also adjust all the utility callbacks to be generic for both list and
hlist.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
This commit optimizes to call the list callback functions with global
context directly.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Without lcores_share flag, mlx5 PMD was sharing the rdma-core objects
between all lcores.
Having lcores_share flag disabled, means each lcore will have its own
objects, which will eventually lead to increased insertion/deletion
rates.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Hash list is planned to be implemented with the cache list code.
This commit moves the list utility to common directory.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, the list memory was allocated by the list API caller.
Move it to be allocated by the create API in order to save consistence
with the hlist utility.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
The atomic operation in the list utility no need a barriers because the
critical part are managed by RW lock.
Relax them.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
When a cache entry is allocated by lcore A and is released by lcore B,
the driver should synchronize the cache list access of lcore A.
The design decision is to manage a counter per lcore cache that will be
increased atomically when the non-original lcore decreases the reference
counter of cache entry to 0.
In list register operation, before the running lcore starts a lookup in
its cache, it will check the counter in order to free invalid entries in
its cache.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
The mlx5 internal list utility is thread safe.
In order to synchronize list access between the threads, a RW lock is
taken for the critical sections.
The create\remove\clone\clone_free operations are in the critical
sections.
These operations are heavy and make the critical sections heavy because
they are used for memory and other resources allocations\deallocations.
Moved out the operations from the critical sections and use generation
counter in order to detect parallel allocations.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
When mlx5 list object is accessed by multiple cores, the list lock
counter is all the time written by all the cores what increases cache
misses in the memory caches.
In addition, when one thread accesses the list for add\remove\lookup
operation, all the other threads coming to do an operation in the list
are stuck in the lock.
Add per lcore cache to allow thread manipulations to be lockless when
the list objects are mostly reused.
Synchronization with atomic operations should be done in order to
allow threads to unregister an entry from other thread cache.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
The internal mlx5 list tool is used mainly when the list objects need to
be synchronized between multiple threads.
The "cache" term is used in the internal mlx5 list API.
Next enhancements on this tool will use the "cache" term for per thread
cache management.
To prevent confusing, remove the current "cache" term from the API's
names.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
Define the types of the modify header action fields to be with the
minimum size needed for the optional values range.
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
The flow list is used to save the create flows and to be used only
when port closes all the flows need to be flushed.
This commit takes advantage of the index pool foreach operation to
flush all the allocated flows.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This commit supports the index pool non-lcore operations with
an extra cache and lcore lock.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In some cases, application may want to know all the allocated
index in order to apply some operations to the allocated index.
This commit adds the indexed pool functions to support foreach
operation.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
For object which wants efficient index allocate and free, local
cache will be very helpful.
Two level cache is introduced to allocate and free the index more
efficient. One as local and the other as global. The global cache
is able to save all the allocated index. That means all the allocated
index will not be freed. Once the local cache is full, the extra
index will be flushed to the global cache. Once local cache is empty,
first try to fetch more index from global, if global is still empty,
allocate new trunk with more index.
This commit adds new local cache mechanism for indexed pool.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Some ipool instances in the driver are used as ID\index allocator and
added other logic in order to work with limited index values.
Add a new configuration for ipool specify the maximum index value.
The ipool will ensure that no index bigger than the maximum value is
provided.
Use this configuration in ID allocator cases instead of the current
logics. This patch add the maximum ID configurable for the index pool.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
MR btree len is a constant during Rx replenish.
Moved retrieve of the value out of loop to reduce data loads.
Slight performance uplift was measured on both N1SDP and x86.
Suggested-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Mask of entries after the compressed CQE is covered by invalid mask of
non-compressed valid CQEs. Hence remove redundant calculation on mask.
The change showed slight performance uplift on N1SDP.
Fixes: 570acdb1da ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add a new testpmd pattern field 'last_rsvd' that supports the
last 8-bits matching of VXLAN header.
The examples for the "last_rsvd" pattern field are as below:
1. ...pattern eth / ipv4 / udp / vxlan last_rsvd is 0x80 / end ...
This flow will exactly match the last 8-bits to be 0x80.
2. ...pattern eth / ipv4 / udp / vxlan last_rsvd spec 0x80
vxlan mask 0x80 / end ...
This flow will only match the MSB of the last 8-bits to be 1.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Raslan Darawsheh <rasland@nvidia.com>
This adds matching on the reserved field of VXLAN
header (the last 8-bits). The capability from rdma-core
is detected by creating a dummy matcher using misc5
when the device is probed.
For non-zero groups and FDB domain, the capability is
detected from rdma-core, meanwhile for NIC domain group
zero it's relying on the HCA_CAP from FW.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Raslan Darawsheh <rasland@nvidia.com>
The new flow item allows PMD to offload IPv4 IHL field for matching,
if hardware supports that operation.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
For the newly attached ports (with "port attach" command) the
default offloads settings, configured from application command
line, were not applied, causing port start failure following
the attach.
For example, if scattering offload was configured in command
line and rxpkts was configured for multiple segments, the newly
attached port start was failed due to missing scattering offload
enable in the new port settings. The missing code to apply
the offloads to the new device and its queues is added.
The new local routine init_config_port_offloads() is introduced,
embracing the shared part of port offloads initialization code.
Fixes: c9cce42876 ("ethdev: remove deprecated attach/detach functions")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Aman Deep Singh <aman.deep.singh@intel.com>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
MAC PAUSE can take effect on a single TC or multiple TCs, depending on the
hardware. For example, the Kunpeng 920 supports MAC pause in a single TC,
and the Kunpeng 930 supports MAC pause in multiple TCs. This patch
supports MAC PAUSE in multiple TC for some hardware.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Since the HW limitation for VF, the VLAN filter is default enabled, and
is not allowed to be closed. Now, the limitation has been removed in
Kunpeng930 network engine, so this patch add support for VF to modify the
VLAN filter state.
A capabilities bit is added to differentiate between different platforms
and achieve compatibility. When the VF runs on an incomatible platform or
an incompatible kernel-mode driver version is used, the VF behavior is
the same as that before.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>