Add Protocol Agnostic Flow (raw flow) support for flow subscription
in AVF.
For example, testpmd creates a flow subscription raw packet rule:
rule: eth + ipv4 src is 1.1.1.1 dst is 2.2.2.2
cmd: flow create 0 ingress pattern raw pattern spec \
00000000000000000000000008004500001400000000000000000101010102020202 \
pattern mask \
0000000000000000000000000000000000000000000000000000FFFFFFFFFFFFFFFF \
/ end actions port_representor port_id 0 / end
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Fix an error check where the return code was assigned to a
unsigned integer which can hide negative error codes.
Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
In case rte_eth_dma_zone_reserve fails in ice_tx_queue_setup
ice_tx_queue_release is called on 0 allocated but not initialized
txq struct.
This may happen on ENOMEM condition, size exhaustion of
memconfig->memzones array as well as some others.
Fixes: edec6dd83824 ("net/ice: remove redundant functions")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jonak <tomasz@graphiant.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Passing tainted expression "rss_meta->proto_hdrs.count" to
"iavf_refine_proto_hdrs", which uses it as a loop boundary.
Replace tainted expression with a temp variable to avoid the
tainted scalar coverity warning.
Coverity issue: 381131
Fixes: f30157d988cf ("net/iavf: support PPPoL2TPv2oUDP RSS Hash")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When the promiscuous mode is enabled on a VF, the IXGBE_VMOLR_VPE
bit (VLAN Promiscuous Enable) is set. This means that the VF will
receive packets whose VLAN is not the same as the VLAN of the VF.
For instance, in this situation:
┌────────┐ ┌────────┐ ┌────────┐
│ │ │ │ │ │
│ │ │ │ │ │
│ VF0├────┤VF1 VF2├────┤VF3 │
│ │ │ │ │ │
└────────┘ └────────┘ └────────┘
VM1 VM2 VM3
vf 0: vlan 1000
vf 1: vlan 1000
vf 2: vlan 1001
vf 3: vlan 1001
If we tcpdump on VF3, we see all the packets, even those transmitted
on vlan 1000.
This behavior prevents to bridge VF1 and VF2 in VM2, because it will
create a loop: packets transmitted on VF1 will be received by VF2 and
vice-versa, and bridged again through the software bridge.
This patch remove the activation of VLAN Promiscuous when a VF enables
the promiscuous mode. However, the IXGBE_VMOLR_UPE bit (Unicast
Promiscuous) is kept, so that a VF receives all packets that has the
same VLAN, whatever the destination MAC address.
A similar patch was accepted in Linux kernel (see link).
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7bb0fb7c63df
Fixes: 0355c379b71f ("net/ixgbe: support VF promiscuous by PF driver")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenjun Wu <wenjun1.wu@intel.com>
After a VF requested to remove the promiscuous flag on an interface, the
broadcast packets are not received anymore. This breaks some protocols
like ARP.
In ixgbe_update_vf_xcast_mode(), we should keep the IXGBE_VMOLR_BAM
bit (Broadcast Accept) on promiscuous removal. This flag is already set
by default in ixgbe_vf_reset_event() on VF reset.
A similar patch was accepted in Linux kernel (see link).
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=803e9895ea2b
Fixes: 0355c379b71f ("net/ixgbe: support VF promiscuous by PF driver")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Wenjun Wu <wenjun1.wu@intel.com>
Config load functions updated to support 100G rates for subport and pipes.
Added new parse function to convert string to unsigned long long.
Added error checks.
Fixed format warnings.
Signed-off-by: Megha Ajmera <megha.ajmera@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This library has no maintainer and, for now, nobody expressed interest
in taking over.
Mark this experimental library as deprecated and announce plan for
removal in v23.11.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Bernard is no longer with Intel and is no longer involved in the DPDK
community, so remove him from the MAINTAINERS file.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
When adding buffer split feature to mlx in DPDK 20.11,
it has been forgotten to fill the feature matrix.
Fixes: 6c8f7f1c1877 ("net/mlx5: report Rx buffer split capabilities")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@nvidia.com>
The "mlx5_os_parse_eth_devargs()" function parses the ETH devargs into a
specific structure called "eth_da".
It gets structure called "devargs" as a member of EAL device containing
the relevant information.
When "devargs" structure is invalid, the function avoids parsing it.
However, when it valid but its field "args" is invalid, the function
tries to parse it and dereference to NULL pointer.
This patch adds check to avoid this NULL dereferencing.
Fixes: 919488fbfa71 ("net/mlx5: support Sub-Function")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The common MLX5 probe function parses first the devargs and save them in
a dictionary.
It gets structure called "devargs" as a member of EAL device containing
all needed information.
When "devargs" structure is invalid, the function avoids parsing it.
However, when it valid but its field "args" is invalid, the function
tries to parse it and dereference to NULL pointer.
This patch adds check to avoid this NULL dereferencing.
Fixes: a729d2f093e9 ("common/mlx5: refactor devargs management")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
For the flows with multiple tunnel layers and containing
tunnel decap and modify actions, for example:
... / vxlan / eth / ipv4 proto is 4 / end
actions raw_decap / modify_field / ...
(note: proto 4 means we have the IP-over-IP tunnel in VXLAN payload)
We have added the multiple tunnel layers validation rejecting
the flows like above mentioned one.
The hardware supports the above match combination till the inner
IP-over-IP header (not including the last one), both for IP-over-IPv4
and IP-over-IPv6, so we should not blindly reject. Also, for the modify
actions following the decap we should set the layer attributes correctly.
This patch reverts the below code changes to support the match, and
adjusts the layers update in case of decap with outer tunnel header.
Fixes: fa06906a48ee ("net/mlx5: fix IPIP multi-tunnel validation")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for port_representor item, it will match on traffic
originated from representor port specified in the pattern. This item
is supported in FDB steering domain only (in the flow with transfer
attribute).
For example, below flow will redirect the destination of traffic from
ethdev 1 to ethdev 2.
testpmd> ... pattern eth / port_representor port_id is 1 / end actions
represented_port ethdev_port_id 2 / ...
To handle abovementioned item, Tx queue matching is added in the driver,
and the flow will be expanded to number of the Tx queues. If the spec of
port_representor is NULL, the flow will not be expanded and match on
traffic from any representor port.
Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adds the creation of control flow rules required to receive
default traffic (based on port configuration) with HWS.
Control flow rules are created on port start and destroyed on port stop.
Handling of destroying these rules was already implemented before that
patch.
Control flow rules are created if and only if flow isolation mode is
disabled and the creation process goes as follows:
- Port configuration is collected into a set of flags. Each flag
corresponds to a certain Ethernet pattern type, defined by
mlx5_flow_ctrl_rx_eth_pattern_type enumeration. There is a separate
flag for VLAN filtering.
- For each possible Ethernet pattern type and:
- For each possible RSS action configuration:
- If configuration flags do not match this combination, it is
omitted.
- A template table is created using this combination of pattern
and actions template (templates are fetched from hw_ctrl_rx
struct stored in the port's private data).
- Flow rules are created in this table.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In some E-Switch use cases, applications want to receive all traffic
on a single port. Since currently, flow API does not provide a way to
match traffic forwarded to any port representor, this patch adds
support for controlling representor matching on ingress flow rules.
Representor matching is controlled through a new device argument
repr_matching_en.
- If representor matching is enabled (default setting),
then each ingress pattern template has an implicit REPRESENTED_PORT
item added. Flow rules based on this pattern template will match
the vport associated with the port on which the rule is created.
- If representor matching is disabled, then there will be no implicit
item added. As a result ingress flow rules will match traffic
coming to any port, not only the port on which the flow rule is
created.
Representor matching is enabled by default, to provide an expected
default behavior.
This patch enables egress flow rules on representors when E-Switch is
enabled in the following configurations:
- repr_matching_en=1 and dv_xmeta_en=4
- repr_matching_en=1 and dv_xmeta_en=0
- repr_matching_en=0 and dv_xmeta_en=0
When representor matching is enabled, the following logic is
implemented:
1. Creating an egress template table in group 0 for each port. These
tables will hold default flow rules defined as follows:
pattern SQ
actions MODIFY_FIELD (set available bits in REG_C_0 to
vport_meta_tag)
MODIFY_FIELD (copy REG_A to REG_C_1, only when
dv_xmeta_en == 4)
JUMP (group 1)
2. Egress pattern templates created by an application have an implicit
MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which
matches available bits of REG_C_0.
3. Egress flow rules created by an application have an implicit
MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which
matches vport_meta_tag placed in available bits of REG_C_0.
4. Egress template tables created by an application, which are in
group n, are placed in group n + 1.
5. Items and actions related to META are operating on REG_A when
dv_xmeta_en == 0 or REG_C_1 when dv_xmeta_en == 4.
When representor matching is disabled and extended metadata is disabled,
no changes to the current logic are required.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adds support for fdb_def_rule_en device argument to HW
Steering, which controls:
- the creation of the default FDB jump flow rule.
- the ability of the user to create transfer flow rules in the root
table.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The queue based rte_flow_async_action_* functions work the same as
queue based async flow functions. The operations can be pushed
asynchronously, and so is the pull.
This commit adds the async action missing push and pull support.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for AGE action for HW steering.
This patch includes:
1. Add new structures to manage aging.
2. Initialize all of them in configure function.
3. Implement per second aging check using CNT background thread.
4. Enable AGE action in flow create/destroy operations.
5. Implement a queue-based function to report aged flow rules.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add the ability to create an indirect action handle for METER_MARK.
It allows sharing one Meter between several different actions.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This commit adds the support of connection tracking to HW steering as
SW steering did before.
The difference from SW steering implementation is that it takes
advantage of HW steering bulk action allocation support, in HW
steering only one single CT pool is needed.
An indexed pool is introduced to record allocated actions from bulk and
CT action state etc. Once one CT action is allocated from bulk, one
indexed object will also be allocated from the indexed pool, similar to
deallocating. That makes mlx5_aso_ct_action can also be managed by that
indexed pool, no need to be reserved from mlx5_aso_ct_pool. The single
CT pool is also saved to mlx5_aso_ct_action struct directly.
The ASO operation functions are shared with SW steering implementation.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch adapts mlx5 PMD to changes in mlx5dr API regarding the action
templates. It changes the following:
1. Actions template creation:
- Flow actions types are translated to mlx5dr action types in order
to create mlx5dr_action_template object.
- An offset is assigned to each flow action. This offset is used to
predetermine the action's location in the rule_acts array passed
on the rule creation.
2. Template table creation:
- Fixed actions are created and put in the rule_acts cache using
predetermined offsets
- mlx5dr matcher is parametrized by action templates bound to
template table.
- mlx5dr matcher is configured to optimize rule creation based on
passed rule indices.
3. Flow rule creation:
- mlx5dr rule is parametrized by the action template on which these
rule's actions are based.
- Rule index hint is provided to mlx5dr.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This commit adds HW steering counter action support.
The pool mechanism is the basic data structure for the HW steering
counter.
The HW steering's counter pool is based on the rte_ring of zero-copy
variation.
There are two global rte_rings:
1. free_list:
Store the counters indexes, which are ready for use.
2. wait_reset_list:
Store the counters indexes, which are just freed from the user and
need to query the hardware counter to get the reset value before
this counter can be reused again.
The counter pool also supports cache per HW steering's queues, which are
also based on the rte_ring of zero-copy variation.
The cache can be configured in size, preload, threshold, and fetch size,
they are all exposed via device args.
The main operations of the counter pool are as follows:
- Get one counter from the pool:
1. The user call _get_* API.
2. If the cache is enabled, dequeue one counter index from the local
cache:
2. A: if the dequeued one from the local cache is still in reset
status (counter's query_gen_when_free is equal to pool's query
gen):
I. Flush all counters in the local cache back to global
wait_reset_list.
II. Fetch _fetch_sz_ counters into the cache from the global
free list.
III. Fetch one counter from the cache.
3. If the cache is empty, fetch _fetch_sz_ counters from the global
free list into the cache and fetch one counter from the cache.
- Free one counter into the pool:
1. The user calls _put_* API.
2. Put the counter into the local cache.
3. If the local cache is full:
A: Write back all counters above _threshold_ into the global
wait_reset_list.
B: Also, write back this counter into the global wait_reset_list.
When the local cache is disabled, _get_/_put_ cache directly from/into
global list.
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This commit adds meter action for HWS steering.
HW steering meter is based on ASO. The number of meters will
be used by flows should be specified in advance in the flow
configure API.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The new mode 4 of devarg "dv_xmeta_en" is added for HWS only. In this
mode, the Rx / Tx metadata with 32b width copy between FDB and NIC is
supported.
The mark is only supported in NIC and there is no copy supported.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch implements creating and caching of port action for use with
HW Steering FDB flows.
Actions are created on flow template API configuration and created
only on the port designated as the master. Attaching and detaching ports
in the same switching domain causes an update to the port actions cache
by, respectively, creating and destroying actions.
A new devarg fdb_def_rule_en is being added and it's used to control
the default dedicated E-Switch rules that are created by the PMD
implicitly or not, and PMD sets this value to 1 by default.
If set to 0, the default E-Switch rule will not be created and the user
can create the specific E-Switch rules on the root table if needed.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
This patch introduces support for modify_field rte_flow actions in HWS
mode that includes:
- Ingress and egress domains,
- SET and ADD operations,
- usage of arbitrary bit offsets and widths for packet and metadata
fields.
This is implemented in two phases:
1. On flow table creation the hardware commands are generated, based
on rte_flow action templates, and stored alongside action template.
2. On flow rule creation/queueing the hardware commands are updated with
values provided by the user. Any masks over immediate values, provided
in action templates, are applied to these values before enqueueing rules
for creation.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
As the rte_flow_async API defines the action mask with a field value
not being 0 means the action will be used as shared in all the flows
in the table.
The header reformat action with the action mask field not being 0 will
be created as constant shared action. For encapsulation header reformat
action, there are two kinds of encapsulation data, raw_encap_data
and rte_flow_item encap_data. Both of these two kinds of data can be
identified from the action mask conf as constant or not.
Examples:
1. VXLAN encap (encap_data: rte_flow_item)
action conf (eth/ipv4/udp/vxlan_hdr)
a. action mask conf (eth/ipv4/udp/vxlan_hdr)
- items are constant.
b. action mask conf (NULL)
- items will change.
2. RAW encap (encap_data: raw)
action conf (raw_data)
a. action mask conf (not NULL)
- encap_data constant.
b. action mask conf (NULL)
- encap_data will change.
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In the flow_dv_hashfields_set() function, while item_flags was 0,
the code went directly to the first if and the else case would
never have a chance to be checked. This caused the IPv6 and TCP hash
fields in the else case would never be set.
This commit adds the dedicated HW steering hash field set function
to generate the RSS hash fields.
Fixes: 3a2f674b6aa8 ("net/mlx5: add queue and RSS HW steering action")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In the function flow_get_drv_type(), attr will be read in non-HWS mode.
In case the user calls the HWS API in SWS mode, the attr should be
placed in HWS functions or it will cause a crash.
Fixes: c40c061a022e ("net/mlx5: add basic flow queue operation")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The debug layer is used to generate a debug CSV file
containing details of the context, table, matcher, rules
and other useful debug information.
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Action objects are used for executing different HW actions
over packets. Each action contains the HW resources and parameters
needed for action use over the HW when creating a rule.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
HWS rule objects reside under the matcher, each rule holds
the configuration for the packet fields to match and the
set of actions to execute over the packet that has the requested
fields.
Rules can be created asynchronously in parallel over multiple
queues to different matchers with each rule configured to the HW.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
HWS matcher resides under the table object, each table can
have multiple chained matches with different attributes.
Each matcher represents a combination of match and action templates,
and can contain multiple configurations based on the templates.
Packets are steered from the table to the matcher and from there to
other objects.
The matcher allows efficient HW packet field matching and action
execution based on the configuration done to it.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
HWS table resides under the context object, each context can
have multiple tables with different steering types RX/TX/FDB.
The table is not only a logical object but it is also represented
in the HW, packets can be steered to the table, and from there
to other tables.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Context is the first mlx5dr object created, all sub objects
table, matcher, rule, and action are created using the context.
The context holds the capabilities and the send queues used for
configuring the offloads to the HW.
Signed-off-by: Alex Vesker <valex@nvidia.com>
Definers are HW objects that are used for matching, rte items
are translated to definers, each definer holds the fields and
bit-masks used for HW flow matching. The definer layer is used
for finding the most efficient definer for each set of items.
In addition to definer creation we also calculate the field
copy (fc) array used for efficient items to WQE conversion.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
HWS configures flows to the HW using a QP, each WQE has
the details of the flow we want to offload. The send layer
allocates the resources needed to send the request to the HW
as well as managing the queues, getting completions and
handling failures.
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
HWS needs to manage different types of device memory in
an efficient and quick way. For this, memory pools are
being used.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
This adds the command layer which is used to communicate with
the FW, to query capabilities and allocate FW resources needed
for HWS.
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Alex Vesker <valex@nvidia.com>
Add missing glue support for the HWS mlx5dr layer.
The new glue functions are needed for mlx5dv to create
matcher and action, which are used as the kernel root
table as well as for capabilities query like device
name and port info.
Signed-off-by: Alex Vesker <valex@nvidia.com>
This stores the available tags that can be used by the
application in a global array that will be used to
transfer the TAG item directly from the ID to the REG_C_x
since these can't be changed after startup.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Add new fields to flow table capabilities to query the capabilities to
set, add, and copy to REG_C_x.
The set capability is queried and saved for future usage.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
This adds conversion functions between both ethdev port_id and IB
context to internal corresponding tag/mask values.
Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>