To allow or disallow configuring rules with identical
patterns, the new device argument 'allow_duplicate_pattern'
is introduced.
If allowed, such rules are inserted successfully and only the
first rule takes effect.
If disallowed, the first rule is inserted and subsequent rules
are rejected.
The default is to allow.
Set it to 0 to disallow, for example:
-a <PCI_BDF>,allow_duplicate_pattern=0
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This adds validation when creating a policy with a meter action.
Currently, the meter action is only allowed for the green color in a
policy, and at most 8 meters are supported in one meter hierarchy.
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, when creating a meter policy, a source port_id match item is
always added in the switch domain. So if one meter is used by another
port, it does not work correctly.
This issue is solved as follows:
1. If the policy fate action is port_id, add the source port_id match
item; the meter cannot be shared by another port.
2. If the policy fate action is not port_id, do not add the source
port_id match; the meter can be shared by another port.
This fix enables one meter to be shared by different ports. The user can
create a meter flow using a port_id match item to make the meter
shared by another port.
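As an illustrative sketch only (assuming meter 1 has already been
created on port 0), such a flow could look like this in testpmd:
flow create 0 ingress transfer pattern port_id id is 1 / eth / end actions meter mtr_id 1 / port_id id 1 / end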
Fixes: afb4aa4f12 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
ConnectX-4 and ConnectX-4 Lx NICs require all L2 headers of transmitted
packets to be inlined. By default, only the first 18 bytes are inlined,
which is insufficient if additional encapsulation is used, like Q-in-Q.
Thus, the default settings caused such traffic to be dropped on Tx.
Document a recommendation to increase inlined data size in such cases.
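As a hedged illustration (assuming the existing txq_inline_min devarg
is used for this tuning and that a Q-in-Q L2 header takes 22 bytes),
the inlined data size could be raised like this:
-a <PCI_BDF>,txq_inline_min=22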
Fixes: 505f1fe426 ("net/mlx5: add Tx devargs")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The WINOF2 2.70 Windows kernel driver allows DevX rule creation
for the TCP and IPv6 item types.
Add these types to the supported items in mlx5_flow_os_item_supported
to allow them to be created in the PMD.
Add a description of the new rule support in the WINOF2 2.70 Windows
kernel driver to the mlx5 driver guide.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The tool lstopo from the hwloc package can provide a graphical
or textual view.
In its textual form, the option --merge gives a shorter summary
which fits the DPDK need well.
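For example (assuming hwloc is installed):
lstopo-no-graphics --merge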
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Support for the new RTE_FLOW_ITEM_TYPE_INTEGRITY item
was added to the 21.02 release notes by mistake.
Support for Sub-Function representors was missing
from the release notes and the mlx5 guide.
Fixes: 79f8952783 ("net/mlx5: support integrity flow item")
Fixes: cb95feefdd ("net/mlx5: support sub-function representor")
Signed-off-by: Asaf Penso <asafp@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
After a connection tracking context is created, it can be used between
two ports. For each port, a flow for one direction of traffic is
created.
The context can only be shared between the owner port and the peer
port that was specified at creation time. In the current implementation,
only the owner port can update or query the context.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Allocate a CT context from the management pools and create the DR
actions for both directions by default.
If there is no available connection tracking action, a new pool is
created with a fixed-size bulk allocation. Right now, all the
resources are tracked in a linked list.
The ASO connection tracking context associated with these actions
needs to be updated via WQE before being used for steering.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Based on the capacity, 3 registers could be used. Due to the register
allocation, only REG_C_3, used for the meter color, can be reused
right now.
Hence, no more than one ASO action can be supported in the same flow.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
During startup, the ASO connection tracking offload capability is
queried via the HCA_CAP_QUERY command. If the HW doesn't support ASO
CT, the value is 0 by default. In that case the subsequent
initialization is skipped and the creation of the CT object returns
a failure directly.
CT creation should also check this capability. With an old driver,
the pre-processing macro should be used in order to make the
compilation pass.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
MLX5 PMD supports the following integrity filters for outer and
inner network headers:
- l3_ok
- l4_ok
- ipv4_csum_ok
- l4_csum_ok
`level` values 0 and 1 reference outer headers.
`level` > 1 references inner headers.
Flow rule items supplied by the application must explicitly specify
the network headers referenced by the integrity item. For example:
flow create 0 ingress
pattern
integrity level is 0 value mask l3_ok value spec l3_ok /
eth / ipv6 / end …
or
flow create 0 ingress
pattern
integrity level is 0 value mask l4_ok value spec 0 /
eth / ipv4 proto is udp / end …
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Existing API supports counter action to count traffic of a single flow.
The user can share the count action among different flows using the
shared flag and the same counter ID in the count action configuration.
Recent patch [1] introduced the indirect action API.
Using this API, an action can be created as indirect, unattached to any
flow rule.
Multiple flows can then be created using the same indirect action.
The new API also supports query operation of an indirect action.
The new API is more efficient because the driver gets its own handle
for the count action instead of managing a mapping between the user ID
and the driver handle.
Support create, query and destroy indirect action operations for the
flow count action.
The application uses the indirect action query operation to query this
count action.
In the meantime, the old sharing mechanism (with the sharing flag)
continues to be supported, and the user can choose how to
share the counter.
The new indirect action API is only supported with DevX, so sharing a
counter action with Verbs can only be done through the old mechanism.
[1] https://mails.dpdk.org/archives/dev/2020-July/174110.html
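A hedged testpmd sketch of the indirect count usage (action id 3 is an
arbitrary example value):
flow indirect_action 0 create action_id 3 ingress action count / end
flow create 0 ingress pattern eth / end actions indirect 3 / queue index 0 / end
flow indirect_action 0 query 3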
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently, meter algorithms only support byte units for meter profiles.
Using the ASO feature, the driver can support metering in per-packet units.
Add support for packet units in meter profiles.
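As a hedged example, assuming the testpmd meter profile command accepts
a trailing packet_mode flag, a packet-based profile could be added with:
add port meter profile srtcm_rfc2697 0 0 3000 3000 0 1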
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Currently an ASO meter must be followed by a policy table, so this adds
support for connecting the meter and the policy table.
There are several cases to consider:
1. For a non-termination policy, connect the meter to the default
policy table.
2. For a non-RSS termination policy, simply get the policy
table id and connect the meter to it.
3. For an RSS termination policy, the flow needs to be split due
to the RSS info in the policy; translate each sub-flow using that RSS,
then create the sub policy table to be connected.
4. For a termination policy, if there are no actions modifying the
packet before the meter, there is no need to use set_tag to save the
meter id in a register. Only add a new flow in the drop table using the
same match criteria as the suffix flow, to save a cache miss.
Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Meter statistics currently use one counter per policer action,
i.e. 4 counters per meter in total.
This causes cache misses
and degrades data forwarding performance.
To optimize this, support a pass counter for green
and a drop counter for red,
i.e. two counters per meter in total.
Also use the global drop statistics for
all meter drop actions.
Limitations:
1. The yellow counter is not supported and returns 0.
2. All meter colors with a drop action are
counted only by the global drop statistics.
3. The red color must be used with a drop action.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Previous implementations support dumping all the flows. Add a new
argument rte_flow to rte_flow_dev_dump to dump a single flow.
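A hedged testpmd sketch, assuming the flow dump command accepts a rule
id in addition to dumping all flows:
flow dump 0 rule 0 /tmp/rule0.txt
flow dump 0 all /tmp/all_rules.txt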
Signed-off-by: Haifei Luo <haifeil@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Update the documentation for push/pop VLAN support. In E-Switch
mode, both push VLAN on ingress traffic and pop VLAN on egress traffic
are supported.
Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Reviewed-by: Asaf Penso <asafp@nvidia.com>
Add support for NVGRE encap as a sample action
and validate it.
Signed-off-by: Salem Sol <salems@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for VXLAN encap as a sample action
and validate it.
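A hedged testpmd sketch (outer header values are arbitrary examples):
set vxlan ip-version ipv4 vni 4 udp-src 4 udp-dst 4789 ip-src 10.0.0.1 ip-dst 10.0.0.2 eth-src 11:11:11:11:11:11 eth-dst 22:22:22:22:22:22
set sample_actions 0 vxlan_encap / port_id id 0 / end
flow create 0 ingress transfer pattern eth / end actions sample ratio 1 index 0 / port_id id 1 / end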
Signed-off-by: Salem Sol <salems@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Modification of the 802.1Q Tag Identifier, VXLAN Network
Identifier or GENEVE Network Identifier is not supported.
Reject attempt to modify these fields via the MODIFY_FIELD
action and document this mlx5 driver limitation.
Fixes: 641dbe4fb0 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
To probe representors from different kernel bonding PFs, 2 separate
devargs had to be specified, like this:
-a 03:00.0,representor=pf0vf[0-3] -a 03:00.0,representor=pf1vf[0-3]
This patch supports a range or list in the PF section of the devargs,
so the shorter alternative to the above is:
-a 03:00.0,representor=pf[0-1]vf[0-3]
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
To probe a representor on the 2nd PF of a kernel bonding device, the
PF1 BDF had to be specified in the devarg:
<PF1_BDF>,representor=0
When closing the bonding device, all representors have to be closed
together, which implies that all representors have to use the primary
PF of the bonding device. So after probing a representor port on the
2nd PF, when locating the newly probed device using the device
argument, the filter used the 2nd PF as the PCI address and failed to
locate the new device.
A conflict arises when using the current representor devargs:
- PCI BDF is used to specify the representor owner PF.
- PCI BDF is used to locate the probed representor device.
- The PMD uses the primary PCI BDF as the PCI device.
To resolve such conflicts, a new representor syntax is introduced here:
<primary BDF>,representor=pfXvfY
All representors must use the primary PF as the owner PCI device; the
PMD internally locates the owner PCI address by checking the
representor "pfX" part. To EAL, all representors are registered to the
primary PCI device; the 2nd PF is hidden from EAL, thus all searches
are consistent.
As with VF representors, the HPF (host PF on BlueField) uses the same
syntax to probe, for example: representor=pf1vf[0-3,-1]
This patch also adds the PF index into the kernel bonding representor
port name:
<BDF>_<ib_name>_representor_pf<X>vf<Y>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adds support for SF representors. Similar to VF
representors, the switch port name of an SF representor in the
phys_port_name sysfs key is "pf<x>sf<y>".
The device representor argument is "representor=sf[list]"; list members
can be a mix of single instances and ranges. Example:
representor=sf[0,2,4,8-12,-1]
To probe VF representors and SF representors together, they need to be
separated into 2 device arguments:
-a <BDF>,representor=vf[list] -a <BDF>,representor=sf[list]
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Update the documentation for the supported sample actions in the NIC Rx
and E-Switch steering flows.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The tool dpdk-hugepages.py, added in DPDK 20.11,
is referenced in the guides instead of more complicated commands.
The original Linux commands are kept in linux_gsg/sys_reqs.rst
and nics/build_and_test.rst.
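For example (a hedged sketch of typical usage):
dpdk-hugepages.py --setup 2G
dpdk-hugepages.py --show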
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Replace testpmd with dpdk-testpmd in all commands,
because when compiling through meson, dpdk-testpmd is the default
application name.
Signed-off-by: Sarosh Arif <sarosh.arif@emumba.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The zero value in the flow MARK action is reported in the Rx datapath
as tagged with a zero FDIR ID. Once a packet is marked in the flow
engine, it is always reported as tagged. For metadata only, the zero
value means there is "no metadata" in the packet, and the metadata
flag is not set in that case.
Fixes: 3ceeed9f78 ("doc: update flow mark action in mlx5 guide")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This sets the correct minimal requirements for these features:
- Buffer Split offload is supported/verified on ConnectX-5
- Tx scheduling requires ConnectX-6DX and depends on firmware version
Fixes: cb7b0c24c8 ("doc: update hardware offloads support in mlx5 guide")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Asaf Penso <asafp@nvidia.com>
Verbs cannot be used to configure the newly introduced miniCQE formats
for Flow Tag and L3/L4 Header compression. Support for these formats
has been added to the DevX configuration only, and the Rx queue
descriptor has been updated with the CQE compression format information
only in that path as well. But the datapath relies on this info no
matter which method is used for Rx queue configuration. Set the proper
CQE compression format information in the Verbs configuration to fix
the miniCQE parsing logic.
Fixes: 54c2d46b16 ("net/mlx5: support flow tag and packet header miniCQEs")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for the new MODIFY_FIELD action to the Mellanox PMD.
This is the generic API that allows manipulating any packet
header field by copying data from another packet field or
mark, metadata, tag, or an immediate value (or pointer to it).
Since the API is generic and covers a lot of actions under its
umbrella, it makes sense to implement all the mechanics gradually
in order to move to this API for any packet field manipulations
in the future. This is the first step of RTE flows consolidation.
The modify field RTE flow action supports three operations: set,
add and sub. This patch brings to life only the "set" operation.
Support is provided for any packet header field as well as
meta/tag/mark; an immediate value can also be used as a source.
There are a few limitations in this first version of the API support:
- encapsulation levels are not supported, just outermost header
can be manipulated for now.
- offsets can only be 4-byte aligned: 32, 64 and 96 for IPv6.
- the special ITEM_START ID is not supported as crossing packet header
field boundaries is not allowed yet.
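A hedged testpmd sketch of the "set" operation, writing an immediate
value into the 32-bit metadata field (values are arbitrary examples):
flow create 0 ingress pattern eth / end actions modify_field op set dst_type meta dst_level 0 dst_offset 0 src_type value src_value 0x1234 width 32 / queue index 0 / end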
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adds support for the mbuf fast free offload to the
transmit datapath. This offload allows freeing the mbufs on
transmit completion in the most efficient way. It requires
that all mbufs were allocated from the same pool, have
a reference counter value of 1, and have no externally
attached buffers.
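A hedged testpmd sketch of enabling the offload on port 0:
port stop 0
port config 0 tx_offload mbuf_fast_free on
port start 0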
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Some limitations are added for the MARK action value range.
Fixes: 2d241515eb ("net/mlx5: add devarg for extensive metadata support")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Currently, the maximum flow priority exposed to the user in non-root
tables is 4, which is not enough for matching flows by priority,
such as LPM: for one IPv4 address, we need 32 priorities, one for each
bit of the 32-bit mask length.
The PMD will manage 3 sub-priorities per user priority according to L2,
L3 and L4. The internal priority is 16 bits, so the user can use
priorities from 0 to 21843.
These enlarged flow priorities are only used for ingress or egress
flow groups greater than 0 and for any transfer flow group.
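For example, a flow in a non-root group can now use a high priority
value (a hedged testpmd sketch):
flow create 0 ingress group 1 priority 21843 pattern eth / ipv4 dst is 10.0.0.1 / end actions queue index 0 / end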
Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
When the E-Switch flow contains both a modify action and a sample
action with ratio=1, and the modify action is placed after the sample
action, the modification should only take effect after the sample.
The MLX5 PMD detects the above case and splits the E-Switch flow
into two sub-flows, similar to the sample flow split done before:
- the prefix sub-flow with all actions preceding the sample and the
sample action itself, also appending the new jump action after the
sample in the prefix sub-flow;
- the suffix sub-flow with the modify action and the other actions
following the sample action.
The flow is split as below:
Original flow: items / actions pre / sample / modify / actions sfx
prefix sub flow -
items / actions pre / set_tag action / sample / jump
suffix sub flow -
tag_item / modify / actions sfx
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
mlx5 E-Switch mirroring is implemented as a multi-destination array in
one steering table. The array currently supports only port ID as a
destination action.
This patch adds jump action support to the array as one of the
destinations.
Packets can be mirrored to the port and jump to the next table in
the same destination array, allowing the handling to continue in the
new table.
For example:
set sample_actions 0 port_id id 1 / end
flow create 0 ingress transfer pattern eth / end actions
sample ratio 1 index 0 / jump group 1 / end
flow create 1 ingress transfer group 1 pattern eth / end actions
set_mac_dst mac_addr 00:aa:bb:cc:dd:ee / port_id id 2 / end
As a result, all matched ingress packets are mirrored
to port id 1 and go to group 1. In group 1, packets are modified
with the new destination MAC and sent to port id 2.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The PMD validates the RSS action in the sample sub-actions list,
then translates it into an rdma-core action to be used as the sample
path destination.
If the RSS action is in both the sample sub-actions list and the
original flow, the RSS level and RSS types in the sample sub-actions
list should be consistent with the original flow list, since the
expansion items for RSS should be the same for both actions.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
GENEVE TLV option matching flows must be created
using a translation function.
This function checks whether a DevX object for the match
was already created and either creates the object
or updates its reference counter.
Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adds the translation function which
sets the qfi and PDU type.
The next extension header field, which indicates the following
extension header type, is set to 0x85 - a PDU session
container.
Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The data-path code doesn't take 'rxq_cqe_pad_en' into account and uses
a padded CQE whenever the system cache-line size is 128B.
This makes the argument redundant.
Remove it.
Fixes: bc91e8db12 ("net/mlx5: add 128B padding of Rx completion entry")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In DPDK 20.11 the following offload features were added:
* Buffer Split
* Sampling
* Tunnel offload
* 2-port hairpin
* RSS shared action
* Age shared action
Update the relevant tables with OFED/rdma-core/NIC versions.
Signed-off-by: Asaf Penso <asafp@nvidia.com>
The mlx5 PMD supports various Rx burst functions.
Each function is enabled differently and supports different features.
Signed-off-by: Asaf Penso <asafp@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Replace -w / --pci-whitelist with -a / --allow options
and --pci-blacklist with --block.
The -b short option remains unchanged.
Allow the old options for now, but print a nag
warning since old options are deprecated.
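For example:
dpdk-testpmd -w 0000:03:00.0    (deprecated, prints a warning)
dpdk-testpmd -a 0000:03:00.0    (new equivalent)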
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
CQE compression allows us to save PCI bandwidth and improve
performance by compressing several CQEs together into a miniCQE.
But the miniCQE size is only 8 bytes and this limits the ability
to successfully keep the compression session in case of various
traffic patterns.
The current miniCQE format only keeps the compression session alive
in case of uniform traffic with the Hash RSS as the only difference.
There are requests to keep the compression session in case of tagged
traffic by RTE Flow Mark Id and mixed UDP/TCP and IPv4/IPv6 traffic.
Add 2 new miniCQE formats in order to achieve the best performance
for these traffic patterns: Flow Tag and Packet Header miniCQEs.
The existing rxq_cqe_comp_en devarg is modified to specify the
desired miniCQE format. Specifying 2 selects Flow Tag format
for better compression rate in case of RTE Flow Mark traffic.
Specifying 3 selects Checksum format (existing format for MPRQ).
Specifying 4 selects L3/L4 Header format for better compression
rate in case of mixed TCP/UDP and IPv4/IPv6 traffic.
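For example, to select the Flow Tag miniCQE format:
-a <PCI_BDF>,rxq_cqe_comp_en=2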
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
To support multi-thread flow insertion, this patch removes the shared
data lock, since all resources should support concurrent protection.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The fields ``has_vlan`` and ``has_more_vlan`` were added to rte_flow by
patch [1].
Using these fields, the application can match all the VLAN options with
a single flow: any, VLAN only and non-VLAN only.
Add support for these fields.
Additionally, add support for QinQ packet matching.
VLAN/QinQ limitations are listed in the driver documentation.
[1] https://patches.dpdk.org/patch/80965/
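A hedged testpmd sketch of matching only VLAN-tagged packets:
flow create 0 ingress pattern eth has_vlan is 1 / vlan / end actions queue index 1 / end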
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Hairpin between two ports is now supported by the mlx5 PMD.
The supported scenarios and limitations are listed in "mlx5.rst".
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>