numam-dpdk

Author	SHA1	Message	Date
Jiawei Wang	bfa87e21bd	net/mlx5: fix tunnel header with IPIP offload For the flows with multiple tunnel layers and containing tunnel decap and modify actions, for example: ... / vxlan / eth / ipv4 proto is 4 / end actions raw_decap / modify_field / ... (note: proto 4 means we have the IP-over-IP tunnel in VXLAN payload) We have added the multiple tunnel layers validation rejecting the flows like above mentioned one. The hardware supports the above match combination till the inner IP-over-IP header (not including the last one), both for IP-over-IPv4 and IP-over-IPv6, so we should not blindly reject. Also, for the modify actions following the decap we should set the layer attributes correctly. This patch reverts the below code changes to support the match, and adjusts the layers update in case of decap with outer tunnel header. Fixes: `fa06906a48` ("net/mlx5: fix IPIP multi-tunnel validation") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:45 +02:00
Sean Zhang	707d5e7d79	net/mlx5: support flow matching on representor ID Add support for the port_representor item, which matches traffic originating from the representor port specified in the pattern. This item is supported in the FDB steering domain only (in flows with the transfer attribute). For example, the flow below redirects traffic originating from ethdev 1 to ethdev 2. testpmd> ... pattern eth / port_representor port_id is 1 / end actions represented_port ethdev_port_id 2 / ... To handle this item, Tx queue matching is added in the driver, and the flow is expanded to the number of Tx queues. If the spec of port_representor is NULL, the flow is not expanded and matches traffic from any representor port. Signed-off-by: Sean Zhang <xiazhang@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:44 +02:00
Gregory Etelson	31b29e0c7f	net/mlx5: fix RSS expansion buffer size Increase expansion buffer size to accumulate more RSS types. Fixes: `3f02c7ff68` ("net/mlx5: fix RSS expansion for inner tunnel VLAN") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-26 13:33:44 +02:00
Dariusz Sosnowski	9fa7c1cddb	net/mlx5: create control flow rules with HWS This patch adds the creation of control flow rules required to receive default traffic (based on port configuration) with HWS. Control flow rules are created on port start and destroyed on port stop. Handling of destroying these rules was already implemented before that patch. Control flow rules are created if and only if flow isolation mode is disabled and the creation process goes as follows: - Port configuration is collected into a set of flags. Each flag corresponds to a certain Ethernet pattern type, defined by mlx5_flow_ctrl_rx_eth_pattern_type enumeration. There is a separate flag for VLAN filtering. - For each possible Ethernet pattern type and: - For each possible RSS action configuration: - If configuration flags do not match this combination, it is omitted. - A template table is created using this combination of pattern and actions template (templates are fetched from hw_ctrl_rx struct stored in the port's private data). - Flow rules are created in this table. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:43 +02:00
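The creation loop described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: `CtrlFlag`, `ETH_PATTERNS`, `RSS_TYPES`, and `plan_ctrl_tables` are hypothetical names, the VLAN filtering flag is left out for brevity, and how an RSS configuration is judged to "match" is an assumption.

```python
from enum import Flag, auto


class CtrlFlag(Flag):
    """Hypothetical stand-ins for the flags collected from port configuration."""
    PROMISC = auto()
    ALL_MULTICAST = auto()
    BROADCAST = auto()
    UNICAST_MAC = auto()


# Hypothetical mapping: Ethernet pattern type -> flag that requests it.
ETH_PATTERNS = {
    "all": CtrlFlag.PROMISC,
    "all_mcast": CtrlFlag.ALL_MULTICAST,
    "bcast": CtrlFlag.BROADCAST,
    "dmac": CtrlFlag.UNICAST_MAC,
}

# Hypothetical list of RSS action configurations.
RSS_TYPES = ("non_ip", "ipv4", "ipv4_udp", "ipv6")


def plan_ctrl_tables(flags, enabled_rss, isolated):
    """Return the (pattern, rss) combinations that would get a template table."""
    if isolated:
        # Control rules are created only when flow isolation is disabled.
        return []
    tables = []
    for pattern, needed in ETH_PATTERNS.items():
        for rss in RSS_TYPES:
            # Combinations not requested by the configuration are omitted.
            if not (flags & needed) or rss not in enabled_rss:
                continue
            tables.append((pattern, rss))
    return tables
```

In this sketch each returned pair stands for one template table; the flow rules would then be inserted into that table.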
Dariusz Sosnowski	483181f7b6	net/mlx5: support device control of representor matching In some E-Switch use cases, applications want to receive all traffic on a single port. Since currently, flow API does not provide a way to match traffic forwarded to any port representor, this patch adds support for controlling representor matching on ingress flow rules. Representor matching is controlled through a new device argument repr_matching_en. - If representor matching is enabled (default setting), then each ingress pattern template has an implicit REPRESENTED_PORT item added. Flow rules based on this pattern template will match the vport associated with the port on which the rule is created. - If representor matching is disabled, then there will be no implicit item added. As a result ingress flow rules will match traffic coming to any port, not only the port on which the flow rule is created. Representor matching is enabled by default, to provide an expected default behavior. This patch enables egress flow rules on representors when E-Switch is enabled in the following configurations: - repr_matching_en=1 and dv_xmeta_en=4 - repr_matching_en=1 and dv_xmeta_en=0 - repr_matching_en=0 and dv_xmeta_en=0 When representor matching is enabled, the following logic is implemented: 1. Creating an egress template table in group 0 for each port. These tables will hold default flow rules defined as follows: pattern SQ actions MODIFY_FIELD (set available bits in REG_C_0 to vport_meta_tag) MODIFY_FIELD (copy REG_A to REG_C_1, only when dv_xmeta_en == 4) JUMP (group 1) 2. Egress pattern templates created by an application have an implicit MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which matches available bits of REG_C_0. 3. Egress flow rules created by an application have an implicit MLX5_RTE_FLOW_ITEM_TYPE_TAG item prepended to the pattern, which matches vport_meta_tag placed in available bits of REG_C_0. 4. Egress template tables created by an application, which are in group n, are placed in group n + 1. 5. Items and actions related to META are operating on REG_A when dv_xmeta_en == 0 or REG_C_1 when dv_xmeta_en == 4. When representor matching is disabled and extended metadata is disabled, no changes to the current logic are required. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:43 +02:00
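Points 4 and 5 of the commit above, together with the list of supported configurations, reduce to a few small lookups. The sketch below only restates those rules; the function names are hypothetical and do not exist in the PMD.

```python
# Configurations listed in the commit message as enabling egress rules
# on representors: (repr_matching_en, dv_xmeta_en).
SUPPORTED_EGRESS_CONFIGS = {(1, 4), (1, 0), (0, 0)}


def egress_rules_supported(repr_matching_en, dv_xmeta_en):
    """True if the devarg combination is one of the three listed ones."""
    return (repr_matching_en, dv_xmeta_en) in SUPPORTED_EGRESS_CONFIGS


def translate_egress_group(app_group, repr_matching_en):
    """With representor matching on, application group n lands in group n + 1,
    because group 0 holds the implicit default rules."""
    return app_group + 1 if repr_matching_en else app_group


def meta_register(dv_xmeta_en):
    """Register that META items and actions operate on."""
    if dv_xmeta_en == 0:
        return "REG_A"
    if dv_xmeta_en == 4:
        return "REG_C_1"
    raise ValueError("mode not described in the commit message")
```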
Dariusz Sosnowski	26e1eaf2da	net/mlx5: support device control for E-Switch default rule This patch adds support for fdb_def_rule_en device argument to HW Steering, which controls: - the creation of the default FDB jump flow rule. - the ability of the user to create transfer flow rules in the root table. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:43 +02:00
Gregory Etelson	a3778a4784	net/mlx5: support flow integrity in HWS group 0 - Reformat flow integrity item translation for HWS code. - Support flow integrity bits in HWS group 0. - Update integrity item translation to match positive semantics only. Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:42 +02:00
Suanming Mou	478ba4bbe6	net/mlx5: support async flow action push and pull The queue based rte_flow_async_action_* functions work the same as queue based async flow functions. The operations can be pushed asynchronously, and so is the pull. This commit adds the async action missing push and pull support. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:42 +02:00
Michael Baum	04a4de756e	net/mlx5: support flow age action with HWS Add support for AGE action for HW steering. This patch includes: 1. Add new structures to manage aging. 2. Initialize all of them in configure function. 3. Implement per second aging check using CNT background thread. 4. Enable AGE action in flow create/destroy operations. 5. Implement a queue-based function to report aged flow rules. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:41 +02:00
Alexander Kozyrev	48fbb0e93d	net/mlx5: support flow meter mark indirect action with HWS Add the ability to create an indirect action handle for METER_MARK. It allows sharing one Meter between several different actions. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:41 +02:00
Gregory Etelson	773ca0e91b	net/mlx5: support VLAN push/pop/modify with HWS Add PMD implementation for HW steering VLAN push, pop, and modify flow actions. HWS VLAN push flow action is triggered by a sequence of mandatory OF_PUSH_VLAN, OF_SET_VLAN_VID, and optional OF_SET_VLAN_PCP flow action commands. The commands must be arranged in the exact order: OF_PUSH_VLAN / OF_SET_VLAN_VID [ / OF_SET_VLAN_PCP ]. In masked HWS VLAN push flow action template ALL the above flow actions must be masked. In non-masked HWS VLAN push flow action template ALL the above flow actions must not be masked. Example: flow actions_template <port id> create \ actions_template_id <action id> \ template \ of_push_vlan / \ of_set_vlan_vid \ [ / of_set_vlan_pcp ] / end \ mask \ of_push_vlan ethertype 0 / \ of_set_vlan_vid vlan_vid 0 \ [ / of_set_vlan_pcp vlan_pcp 0 ] / end\ flow actions_template <port id> create \ actions_template_id <action id> \ template \ of_push_vlan ethertype <E>/ \ of_set_vlan_vid vlan_vid <VID>\ [ / of_set_vlan_pcp <PCP>] / end \ mask \ of_push_vlan ethertype <type != 0> / \ of_set_vlan_vid vlan_vid <vid_mask != 0>\ [ / of_set_vlan_pcp vlan_pcp <pcp_mask != 0> ] / end\ HWS VLAN pop flow action is triggered by OF_POP_VLAN flow action command. HWS VLAN pop action template is always non-masked. Example: flow actions_template <port id> create \ actions_template_id <action id> \ template of_pop_vlan / end mask of_pop_vlan / end HWS VLAN VID modify flow action is triggered by a standalone OF_SET_VLAN_VID flow action command. HWS VLAN VID modify action template can be either masked or non-masked. Example: flow actions_template <port id> create \ actions_template_id <action id> \ template of_set_vlan_vid / end mask of_set_vlan_vid vlan_vid 0 / end flow actions_template <port id> create \ actions_template_id <action id> \ template of_set_vlan_vid vlan_vid 0x101 / end \ mask of_set_vlan_vid vlan_vid 0xffff / end Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:41 +02:00
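The ordering and masking constraints for the VLAN push sequence can be expressed as a small validity check. This is an illustrative sketch only: `valid_vlan_push` is a hypothetical helper, and each action is modelled as a `(name, masked)` pair rather than a real `rte_flow_action`.

```python
# Required order; the third action is optional.
PUSH_SEQ = ("of_push_vlan", "of_set_vlan_vid", "of_set_vlan_pcp")


def valid_vlan_push(actions):
    """actions: list of (name, masked) pairs forming one VLAN push sequence."""
    names = tuple(name for name, _ in actions)
    # Exact order: OF_PUSH_VLAN / OF_SET_VLAN_VID [ / OF_SET_VLAN_PCP ].
    if names not in (PUSH_SEQ[:2], PUSH_SEQ):
        return False
    # Either ALL actions are masked or NONE of them is.
    return len({masked for _, masked in actions}) == 1
```

For example, `valid_vlan_push([("of_push_vlan", True), ("of_set_vlan_vid", False)])` is rejected because the template mixes masked and non-masked actions.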
Suanming Mou	463170a7c9	net/mlx5: support connection tracking with HWS This commit adds connection tracking support to HW steering, as SW steering had before. Unlike the SW steering implementation, it takes advantage of HW steering bulk action allocation, so HW steering needs only a single CT pool. An indexed pool is introduced to record the actions allocated from the bulk, the CT action state, etc. When a CT action is allocated from the bulk, an indexed object is also allocated from the indexed pool, and likewise on deallocation. This allows mlx5_aso_ct_action to be managed by that indexed pool as well, with no need to reserve it from mlx5_aso_ct_pool. The single CT pool is also saved to the mlx5_aso_ct_action struct directly. The ASO operation functions are shared with the SW steering implementation. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:40 +02:00
Dariusz Sosnowski	f1fecffa88	net/mlx5: support Direct Rules action template API This patch adapts mlx5 PMD to changes in mlx5dr API regarding the action templates. It changes the following: 1. Actions template creation: - Flow actions types are translated to mlx5dr action types in order to create mlx5dr_action_template object. - An offset is assigned to each flow action. This offset is used to predetermine the action's location in the rule_acts array passed on the rule creation. 2. Template table creation: - Fixed actions are created and put in the rule_acts cache using predetermined offsets - mlx5dr matcher is parametrized by action templates bound to template table. - mlx5dr matcher is configured to optimize rule creation based on passed rule indices. 3. Flow rule creation: - mlx5dr rule is parametrized by the action template on which these rule's actions are based. - Rule index hint is provided to mlx5dr. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:40 +02:00
Xiaoyu Min	4d368e1da3	net/mlx5: support flow counter action for HWS This commit adds HW steering counter action support. The pool mechanism is the basic data structure for the HW steering counter. The HW steering's counter pool is based on the rte_ring of zero-copy variation. There are two global rte_rings: 1. free_list: Stores the counter indexes that are ready for use. 2. wait_reset_list: Stores the counter indexes that were just freed by the user and need a hardware counter query to get the reset value before the counter can be reused. The counter pool also supports a cache per HW steering queue, which is also based on the rte_ring of zero-copy variation. The cache can be configured in size, preload, threshold, and fetch size, which are all exposed via device args. The main operations of the counter pool are as follows: - Get one counter from the pool: 1. The user calls the _get_* API. 2. If the cache is enabled, dequeue one counter index from the local cache: 2. A: if the dequeued one from the local cache is still in reset status (counter's query_gen_when_free is equal to pool's query gen): I. Flush all counters in the local cache back to global wait_reset_list. II. Fetch _fetch_sz_ counters into the cache from the global free list. III. Fetch one counter from the cache. 3. If the cache is empty, fetch _fetch_sz_ counters from the global free list into the cache and fetch one counter from the cache. - Free one counter into the pool: 1. The user calls the _put_* API. 2. Put the counter into the local cache. 3. If the local cache is full: A: Write back all counters above _threshold_ into the global wait_reset_list. B: Also, write back this counter into the global wait_reset_list. When the local cache is disabled, _get_/_put_ operate directly on the global lists. Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:39 +02:00
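The get/put flow of the counter pool can be modelled in a few lines. The sketch below is a single-queue toy model under stated assumptions: plain `deque`s stand in for the zero-copy `rte_ring`s, `CounterPool` and `reset_done` are hypothetical names, and the background query is reduced to bumping a generation number.

```python
from collections import deque


class CounterPool:
    """Toy model of the pool: one local cache in front of two global lists."""

    def __init__(self, n_counters, cache_size, threshold, fetch_sz):
        self.free_list = deque(range(n_counters))  # ready for use
        self.wait_reset_list = deque()             # freed, awaiting HW query
        self.cache = deque()                       # local per-queue cache
        self.cache_size = cache_size
        self.threshold = threshold
        self.fetch_sz = fetch_sz
        self.query_gen = 0                         # pool's query generation
        self.gen_when_free = {}                    # counter -> gen at free time

    def _refill(self):
        # Fetch up to fetch_sz counters from the global free list.
        for _ in range(min(self.fetch_sz, len(self.free_list))):
            self.cache.append(self.free_list.popleft())

    def get(self):
        if not self.cache:
            self._refill()
        if not self.cache:
            return None
        idx = self.cache.popleft()
        if self.gen_when_free.get(idx) == self.query_gen:
            # Still in reset status: flush the cache, then refetch.
            self.wait_reset_list.append(idx)
            self.wait_reset_list.extend(self.cache)
            self.cache.clear()
            self._refill()
            if not self.cache:
                return None
            idx = self.cache.popleft()
        return idx

    def put(self, idx):
        self.gen_when_free[idx] = self.query_gen
        if len(self.cache) < self.cache_size:
            self.cache.append(idx)
            return
        # Cache full: write back everything above the threshold, plus idx.
        while len(self.cache) > self.threshold:
            self.wait_reset_list.append(self.cache.pop())
        self.wait_reset_list.append(idx)

    def reset_done(self):
        # Assumed stand-in for the background query finishing a pass.
        self.query_gen += 1
        self.free_list.extend(self.wait_reset_list)
        self.wait_reset_list.clear()
```

A counter that is freed and requested again before the next query pass is diverted to `wait_reset_list`, so the caller never receives a counter whose reset value is still unknown.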
Alexander Kozyrev	24865366e4	net/mlx5: support flow meter action for HWS This commit adds the meter action for HW steering. The HW steering meter is based on ASO. The number of meters that will be used by flows should be specified in advance in the flow configure API. Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:39 +02:00
Bing Zhao	ddb68e4733	net/mlx5: add extended metadata mode for HWS The new mode 4 of devarg "dv_xmeta_en" is added for HWS only. In this mode, the Rx / Tx metadata with 32b width copy between FDB and NIC is supported. The mark is only supported in NIC and there is no copy supported. Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:38 +02:00
Dariusz Sosnowski	1939eb6f66	net/mlx5: support flow port action with HWS This patch implements creating and caching of port action for use with HW Steering FDB flows. Actions are created on flow template API configuration and created only on the port designated as the master. Attaching and detaching ports in the same switching domain causes an update to the port actions cache by, respectively, creating and destroying actions. A new devarg fdb_def_rule_en is being added and it's used to control the default dedicated E-Switch rules that are created by the PMD implicitly or not, and PMD sets this value to 1 by default. If set to 0, the default E-Switch rule will not be created and the user can create the specific E-Switch rules on the root table if needed. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:38 +02:00
Suanming Mou	0f4aa72b99	net/mlx5: support flow modify field with HWS This patch introduces support for modify_field rte_flow actions in HWS mode that includes: - Ingress and egress domains, - SET and ADD operations, - usage of arbitrary bit offsets and widths for packet and metadata fields. This is implemented in two phases: 1. On flow table creation the hardware commands are generated, based on rte_flow action templates, and stored alongside action template. 2. On flow rule creation/queueing the hardware commands are updated with values provided by the user. Any masks over immediate values, provided in action templates, are applied to these values before enqueueing rules for creation. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:38 +02:00
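The two-phase handling of modify_field described above can be pictured as "compile once, fill per rule". The following sketch is only a model of that split; `ModifyCmd`, `compile_template`, and `fill_rule` are hypothetical names, and the real hardware command layout is not represented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModifyCmd:
    """Hypothetical pre-built command stored alongside the action template."""
    op: str          # "set" or "add"
    offset: int      # bit offset inside the destination field
    width: int       # number of bits to modify
    imm_mask: int    # mask over the immediate value, from the template
    value: int = 0   # filled in at rule creation time


def compile_template(op, offset, width, imm_mask):
    # Phase 1: done once, when the flow table is created.
    return ModifyCmd(op=op, offset=offset, width=width, imm_mask=imm_mask)


def fill_rule(cmd, user_value):
    # Phase 2: per rule, the template mask is applied to the user's value
    # before the rule is enqueued. The stored template is left untouched.
    return replace(cmd, value=user_value & cmd.imm_mask)
```

Keeping the template command immutable and copying it per rule mirrors the idea that the expensive part is done at table creation and only the value changes per rule.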
Suanming Mou	7f6daa490d	net/mlx5: add shared header reformat The rte_flow_async API defines that an action mask with a non-zero field value means the action is shared by all the flows in the table. A header reformat action whose action mask field is not 0 is therefore created as a constant shared action. For the encapsulation header reformat action, there are two kinds of encapsulation data, raw_encap_data and rte_flow_item encap_data. For both kinds, the action mask conf identifies whether the data is constant or not. Examples: 1. VXLAN encap (encap_data: rte_flow_item) action conf (eth/ipv4/udp/vxlan_hdr) a. action mask conf (eth/ipv4/udp/vxlan_hdr) - items are constant. b. action mask conf (NULL) - items will change. 2. RAW encap (encap_data: raw) action conf (raw_data) a. action mask conf (not NULL) - encap_data constant. b. action mask conf (NULL) - encap_data will change. Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:37 +02:00
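The four example cases in the commit above follow one rule: a non-NULL mask conf makes the encapsulation data constant and therefore shareable. A minimal sketch of that decision, with `plan_reformat` as a hypothetical helper and `None` standing in for a NULL mask conf:

```python
def plan_reformat(action_conf, mask_conf):
    """Decide whether a header reformat action can be shared table-wide."""
    if mask_conf is not None:
        # Constant data: build the action once and share it across all rules.
        return {"shared": True, "data": action_conf}
    # Data changes per rule: it is supplied when each rule is created.
    return {"shared": False, "data": None}
```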
Suanming Mou	b206c558f7	net/mlx5: fix IPv6 and TCP RSS hash fields In the flow_dv_hashfields_set() function, when item_flags was 0, the code went directly to the first if branch and the else case never had a chance to be checked. As a result, the IPv6 and TCP hash fields in the else case were never set. This commit adds a dedicated HW steering hash field set function to generate the RSS hash fields. Fixes: `3a2f674b6a` ("net/mlx5: add queue and RSS HW steering action") Cc: stable@dpdk.org Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:37 +02:00
Suanming Mou	4fb0ef2976	net/mlx5: fix steering engine type check In the function flow_get_drv_type(), attr will be read in non-HWS mode. In case the user calls the HWS API in SWS mode, the attr should be placed in HWS functions or it will cause a crash. Fixes: `c40c061a02` ("net/mlx5: add basic flow queue operation") Cc: stable@dpdk.org Signed-off-by: Suanming Mou <suanmingm@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:36 +02:00
Alex Vesker	22681deead	net/mlx5/hws: enable hardware steering Replace stub implementation of HWS with mlx5dr code. Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:36 +02:00
Hamdan Igbaria	78580cf4e7	net/mlx5/hws: add debug layer The debug layer is used to generate a debug CSV file containing details of the context, table, matcher, rules and other useful debug information. Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:36 +02:00
Erez Shitrit	f8c8a6d844	net/mlx5/hws: add action object Action objects are used for executing different HW actions over packets. Each action contains the HW resources and parameters needed for action use over the HW when creating a rule. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:35 +02:00
Alex Vesker	405242c52d	net/mlx5/hws: add rule object HWS rule objects reside under the matcher, each rule holds the configuration for the packet fields to match and the set of actions to execute over the packet that has the requested fields. Rules can be created asynchronously in parallel over multiple queues to different matchers with each rule configured to the HW. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:35 +02:00
Alex Vesker	c467608215	net/mlx5/hws: add matcher object HWS matcher resides under the table object, each table can have multiple chained matches with different attributes. Each matcher represents a combination of match and action templates, and can contain multiple configurations based on the templates. Packets are steered from the table to the matcher and from there to other objects. The matcher allows efficient HW packet field matching and action execution based on the configuration done to it. Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Erez Shitrit <erezsh@nvidia.com>	2022-10-26 13:33:34 +02:00
Alex Vesker	394cc7ba40	net/mlx5/hws: add table object HWS table resides under the context object, each context can have multiple tables with different steering types RX/TX/FDB. The table is not only a logical object but it is also represented in the HW, packets can be steered to the table, and from there to other tables. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:34 +02:00
Alex Vesker	b0290e56dd	net/mlx5/hws: add context object Context is the first mlx5dr object created, all sub objects table, matcher, rule, and action are created using the context. The context holds the capabilities and the send queues used for configuring the offloads to the HW. Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:34 +02:00
Alex Vesker	c55c2bf353	net/mlx5/hws: add definer layer Definers are HW objects that are used for matching, rte items are translated to definers, each definer holds the fields and bit-masks used for HW flow matching. The definer layer is used for finding the most efficient definer for each set of items. In addition to definer creation we also calculate the field copy (fc) array used for efficient items to WQE conversion. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:33 +02:00
Alex Vesker	3eb748869d	net/mlx5/hws: add send layer HWS configures flows to the HW using a QP, each WQE has the details of the flow we want to offload. The send layer allocates the resources needed to send the request to the HW as well as managing the queues, getting completions and handling failures. Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:33 +02:00
Erez Shitrit	b4dd7bcb0d	net/mlx5/hws: add pool and buddy HWS needs to manage different types of device memory in an efficient and quick way. For this, memory pools are being used. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:33 +02:00
Erez Shitrit	365cdf5f8c	net/mlx5/hws: add command layer This adds the command layer which is used to communicate with the FW, to query capabilities and allocate FW resources needed for HWS. Signed-off-by: Erez Shitrit <erezsh@nvidia.com> Signed-off-by: Alex Vesker <valex@nvidia.com>	2022-10-26 13:33:32 +02:00
Bing Zhao	8a89038f40	net/mlx5: provide available tag registers This stores the available tags that can be used by the application in a global array that will be used to transfer the TAG item directly from the ID to the REG_C_x since these can't be changed after startup. Signed-off-by: Bing Zhao <bingz@nvidia.com>	2022-10-26 13:33:31 +02:00
Dariusz Sosnowski	5bd0e3e671	net/mlx5: add port to metadata conversion This adds conversion functions between both ethdev port_id and IB context to internal corresponding tag/mask values. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com>	2022-10-26 13:33:31 +02:00
Suanming Mou	75a00812b1	net/mlx5: add hardware steering item translation This provides shared item translation code for hardware steering root table flows as they still work under FW steering mode. Signed-off-by: Suanming Mou <suanmingm@nvidia.com>	2022-10-26 13:33:30 +02:00
Suanming Mou	cd4ab74206	net/mlx5: split flow item matcher and value translation This splits the item matcher and value translation to make the code reusable for the new steering mode. Signed-off-by: Suanming Mou <suanmingm@nvidia.com>	2022-10-26 13:33:30 +02:00
Suanming Mou	e64fd460b7	net/mlx5: split flow item translation This splits flow item translation code to a dedicated function to share the item translation code with hardware steering mode. Signed-off-by: Suanming Mou <suanmingm@nvidia.com>	2022-10-26 13:33:29 +02:00
Michael Savisko	7f6e276b02	net/mlx5: support flow action send to kernel Introduce mlx5_get_send_to_kernel_priority() function which returns value of priority which must be used to jump back to table 0 in order to send traffic to kernel. This function returns lowest priority. Add flow_dv_translate_action_send_to_kernel() function which will allocate rdma-core send_to_kernel action object. Called from flow_dv_translate(). Fail translation of RTE_FLOW_ACTION_TYPE_SEND_TO_KERNEL action in HW steering. Signed-off-by: Michael Savisko <michaelsav@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-26 13:33:29 +02:00
Michael Savisko	f31a141e64	net/mlx5: add send to kernel action resource holder Add new structure mlx5_send_to_kernel_action which will hold together allocated action resource and a reference to used table. A new structure member of this type added to struct mlx5_dev_ctx_shared. The member will be initialized upon first created send_to_kernel action and will be reused for all future actions of this type. Release of these resources will be done when all shared DR resources are being released in mlx5_os_free_shared_dr(). Change function flow_dv_tbl_resource_release() from static to external. Signed-off-by: Michael Savisko <michaelsav@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-26 13:33:29 +02:00
Michael Savisko	25c4d6dfae	net/mlx5: add flow action stub for send to kernel Add new mlx5 action flag MLX5_FLOW_ACTION_SEND_TO_KERNEL. Add element MLX5_FLOW_FATE_SEND_TO_KERNEL in enum mlx5_flow_fate_type. For that purpose field 'fate_action' in structure mlx5_flow_handle must be expanded from 3 bits to 4 bits. Signed-off-by: Michael Savisko <michaelsav@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-26 13:33:28 +02:00
Michael Savisko	80f998da1d	common/mlx5: add send to kernel flow action Add new glue callback dr_create_flow_action_send_to_kernel. Default callback invokes mlx5dv_dr_action_create_dest_root_table(). Add static inline mlx5_flow_os_create_flow_action_send_to_kernel(), which calls dr_create_flow_action_send_to_kernel glue callback. Define HAVE_MLX5DV_DR_ACTION_CREATE_DEST_ROOT_TABLE macro if function mlx5dv_dr_action_create_dest_root_table exists in infiniband/mlx5dv.h Signed-off-by: Michael Savisko <michaelsav@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-26 13:33:28 +02:00
Dong Zhou	4df7f801ff	net/mlx5: fix thread workspace memory leak The thread workspace push/pop should be paired. In the "flow_list_create" routine, if error happened the workspace pop was missed. This patch shares the workspace pop for all return paths. Fixes: `0064bf4318` ("net/mlx5: fix nested flow creation") Cc: stable@dpdk.org Signed-off-by: Dong Zhou <dongzhou@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-26 13:33:27 +02:00
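The fix above amounts to routing every return path through one shared pop. A minimal model of that pattern is shown below; `WorkspaceStack` and this `flow_list_create` are hypothetical stand-ins written for illustration, not the driver functions.

```python
class WorkspaceStack:
    """Hypothetical per-thread workspace stack."""

    def __init__(self):
        self.depth = 0

    def push(self):
        self.depth += 1

    def pop(self):
        self.depth -= 1


def flow_list_create(ws, fail):
    ws.push()
    try:
        if fail:
            raise RuntimeError("flow creation failed")
        return 1  # placeholder flow index
    finally:
        # Shared pop: runs on success and on every error path alike.
        ws.pop()
```

In C the same effect is usually achieved with a single exit label that performs the pop before returning.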
Dariusz Sosnowski	f2d43ff54d	net/mlx5: allow hairpin Rx queue in locked memory This patch adds a capability to place hairpin Rx queue in locked device memory. This capability is equivalent to storing hairpin RQ's data buffers in locked internal device memory. Hairpin Rx queue creation is extended with requesting that RQ is allocated in locked internal device memory. If allocation fails and force_memory hairpin configuration is set, then hairpin queue creation (and, as a result, device start) fails. If force_memory is unset, then PMD will fallback to allocating memory for hairpin RQ in unlocked internal device memory. To allow such allocation, the user must set HAIRPIN_DATA_BUFFER_LOCK flag in FW using mlxconfig tool. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-08 18:30:50 +02:00
Dariusz Sosnowski	7274b41756	net/mlx5: allow hairpin Tx queue in host memory This patch adds a capability to place hairpin Tx queue in host memory managed by DPDK. This capability is equivalent to storing hairpin SQ's WQ buffer in host memory. Hairpin Tx queue creation is extended with allocating a memory buffer of proper size (calculated from required number of packets and WQE BB size advertised in HCA capabilities). force_memory flag of hairpin queue configuration is also supported. If it is set and: - allocation of memory buffer fails, - or hairpin SQ creation fails, then device start will fail. If it is unset, PMD will fallback to creating the hairpin SQ with WQ buffer located in unlocked device memory. Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-08 18:30:50 +02:00
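Both hairpin commits above share the same force_memory decision: try the requested placement, then either fail or fall back. A hedged sketch of that control flow, where `create_hairpin_queue` and the two allocator callbacks are hypothetical and `MemoryError` stands in for any allocation or creation failure:

```python
def create_hairpin_queue(alloc_requested, alloc_fallback, force_memory):
    """Return the queue from the requested placement, or fall back if allowed."""
    try:
        return alloc_requested()
    except MemoryError:
        if force_memory:
            # force_memory set: creation (and thus device start) fails.
            raise
        # force_memory unset: fall back to unlocked device memory.
        return alloc_fallback()
```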
Shun Hao	9b7fcf395c	net/mlx5: fix meter profile delete after disable If a meter's profile is changed after the meter is disabled, deleting the old profile fails. This patch fixes this by decreasing the old profile's reference count when the profile is changed. Fixes: `63ffeb2ff2` ("net/mlx5: support meter profile update") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:55 +02:00
Shun Hao	530dc58073	net/mlx5: fix meter ID tag for meter hierarchy Currently, when a flow uses a meter hierarchy, a tag action is always applied to set the first meter's meter ID, so as to update the first meter's drop count. But the case where the first meter has no drop count was not considered. This patch fixes it: in a hierarchy, if the first meter doesn't have a drop count, the meter ID tag action is not added. No change for non-hierarchy meters. Fixes: `e8146c63` ("net/mlx5: support represented port item in flow rules") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:54 +02:00
Shun Hao	ca7e6051e7	net/mlx5: limit meter flow when matching all ports If there's no param in represented_port item, it will be treated as matching all ports by default. But there's some limitation when using it with meter hierarchy. This patch adds the limitation that when matching all ports, the meter hierarchy should not contain any meter having drop count. Fixes: `e8146c63` ("net/mlx5: support represented port item in flow rules") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:54 +02:00
Shun Hao	33d506b9e5	net/mlx5: fix meter hierarchy with represented port item There is a new item type represented_port, and currently using a meter hierarchy in a flow with a represented_port match fails. This patch fixes the failure by adding support for the represented_port item in the meter hierarchy flow split. Fixes: `e8146c63` ("net/mlx5: support represented port item in flow rules") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:54 +02:00
Jiawei Wang	9f71a297da	net/mlx5: fix modify action with tunnel decapsulation The driver splits the flow with sample action into two sub-flows, sub prefix flow and sub suffix flow. In the case of tunnel flow including a decap action, the driver should translate the inner as outer for actions coming after the decap action. In the case of flow splitting, the packet layers, used to detect the attributes, are inherited from the prefix flow to the suffix flow but the driver wrongly didn't handle the decap adjustment and the inner layers didn't shift to the outer. This patch adjusts the inherited layers in case of decap. Fixes: `6e77151286` ("net/mlx5: fix match information in meter") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:53 +02:00
Raja Zidane	130bb7da53	net/mlx5: fix Tx check for hardware descriptor length If hardware descriptor (WQE) length exceeds one the HW can handle, the Tx queue failure occurs. PMD does the length check but there was a bug - the length limit was expressed in 16B units (WQEBB segments), while the calculated WQE length and limit were in 64B units (WQEBBs). Fix the condition to avoid subsequent Tx queue failure. Fixes: `18a1c20` ("net/mlx5: implement Tx burst template") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-02 09:13:53 +02:00
Viacheslav Ovsiienko	d15bfd2930	net/mlx5: fix inline length exceeding descriptor limit The hardware descriptor (WQE) length field is 6 bits wide and we have the native limitation for the overall descriptor length. To improve the PCIe bandwidth the packet data can be inlined into the descriptor. If the PMD was configured to inline a large amount of data, there might not be enough space remaining in the descriptor to specify all the packet data segments, and the PMD rejected the problematic packets. The patch tries to adjust the inline data length conservatively to avoid the error. Fixes: `18a1c20044` ("net/mlx5: implement Tx burst template") Fixes: `e2259f93ef` ("net/mlx5: fix Tx when inlining is impossible") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>	2022-10-02 09:13:52 +02:00
Viacheslav Ovsiienko	166f185fef	net/mlx5: fix single not inline packet storing The mlx5 PMD can inline packet data into transmitting descriptor (WQE) and free mbuf immediately as data no longer needed, for non-inline packets the mbuf pointer should be stored in elts array for coming freeing on send completion. There was an optimization on storing pointers in batch and there was missed storing mbuf for single packet if non-inline was explicitly requested by flag. Fixes: `cacb44a099` ("net/mlx5: add no-inline Tx flag") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-02 09:13:52 +02:00
Viacheslav Ovsiienko	37d6fc30c1	net/mlx5: fix check for orphan wait descriptor The mlx5 PMD supports the send scheduling feature, which allows sending packets at a specified moment of time. To do that, the PMD pushes a special wait descriptor (WQE) to the hardware queue and then pushes the descriptor for packet data as usual. If the queue is close to full or there are not enough elts buffers to store the mbufs being sent, the data descriptors might not be pushed, and an orphan wait WQE (not followed by the data) might reside in the queue on tx_burst routine exit. To avoid orphan wait WQEs there was a check for enough free space in the queue WQE buffer and enough free elts in the queue mbuf storage. This check was incomplete and did not cover all the cases for Enhanced Multi-Packet Write descriptors. Fixes: `2f827f5ea6` ("net/mlx5: support scheduling on send routine template") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-02 09:13:52 +02:00
Bassam Zaid AlKilani	6fa815722e	net/mlx5: fix flow matching priority for ESP item ESP is one of IPSec protocols over both IPv4 and IPv6 and is considered a tunnel layer that cannot be followed by any other layer. Taking that into consideration, ESP is considered as a 4 layer. Not defining ESP's priority will make it match with the same priority as its prior IP layer, which has a layer 3 priority. This will lead to issues in matching and will match the packet with the first matching rule even if it doesn't have an esp layer in its pattern, disregarding any following rules that could have an esp item and can be actually a more accurate match since it will have a longer matching criterion. This is fixed by defining the priority for the ESP item to have a layer 4 priority, making the match be for the rule with the more accurate and longer matching criteria. Fixes: `18ca4a4ec7` ("net/mlx5: support ESP SPI match and RSS hash") Cc: stable@dpdk.org Signed-off-by: Bassam Zaid AlKilani <bzalkilani@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Raslan Darawsheh <rasland@nvidia.com>	2022-10-02 09:13:51 +02:00
Michael Baum	593f913a8e	net/mlx5: fix LRO requirements check One of the conditions to allow LRO offload is the DV configuration. The function incorrectly checks the DV configuration before initializing it by the user devarg; hence, LRO cannot be allowed. This patch moves this check to mlx5_shared_dev_ctx_args_config, where DV configuration is initialized. Fixes: `c4b8620135` ("net/mlx5: refactor to detect operation by DevX") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Reported-by: Gal Shalom <galshalom@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-10-02 09:13:51 +02:00
Long Li	bc5d8fdb70	net/mlx5: fix Verbs FD leak in secondary process FDs passed from rte_mp_msg are duplicated to the secondary process and need to be closed. Fixes: `9a8ab29b84` ("net/mlx5: replace IPC socket with EAL API") Cc: stable@dpdk.org Signed-off-by: Long Li <longli@microsoft.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-10-02 09:13:50 +02:00
Ivan Malov	5e3779b7ab	ethdev: remove deprecated flow item physical port Such deprecation was commenced in DPDK 21.11. Since then, no parties have objected. Remove. The patch breaks ABI. Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com>	2022-09-27 10:26:51 +02:00
David Marchand	a04322f616	bus: hide bus object Make rte_bus opaque for non internal users. This will make extending this object possible without breaking the ABI. Introduce a new driver header and move rte_bus definition and helpers. Update drivers and library to use the internal header. Some applications may have been dereferencing rte_bus objects, mark this object's accessors as stable. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	1f37cb2bb4	bus/pci: make driver-only headers private The pci bus interface is for drivers only. Mark as internal and move the header in the driver headers list. While at it, cleanup the code: - fix indentation, - remove unneeded reference to bus specific singleton object, - remove unneeded list head structure type, - reorder the definitions and macro manipulating the bus singleton object, - remove inclusion of rte_bus.h and fix the code that relied on implicit inclusion, Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Acked-by: Rosen Xu <rosen.xu@intel.com>	2022-09-23 16:14:34 +02:00
David Marchand	b3f89090d6	bus/auxiliary: make driver-only headers private The auxiliary bus interface is for drivers only. Mark as internal and move the header in the driver headers list. While at it, cleanup the code: - fix indentation, - remove unneeded reference to bus specific singleton object, - remove unneeded list head structure type, - reorder the definitions and macro manipulating the bus singleton object, - remove inclusion of rte_bus.h and fix the code that relied on implicit inclusion, Signed-off-by: David Marchand <david.marchand@redhat.com>	2022-09-23 16:14:34 +02:00
Matan Azrad	60b254e392	net/mlx5: fix Rx queue recovery mechanism The local variables are getting inconsistent in data receiving routines after queue error recovery. Receive queue consumer index is getting wrong, need to reset one to the size of the queue (as RQ was fully replenished in recovery procedure). In MPRQ case, also the local consumed strd variable should be reset. CVE-2022-28199 Fixes: `88c0733535` ("net/mlx5: extend Rx completion with error handling") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Signed-off-by: Matan Azrad <matan@nvidia.com>	2022-08-29 12:53:49 +02:00
David Marchand	72206323a5	version: 22.11-rc0 Start a new release cycle with empty release notes. The ABI version becomes 23.0. The map files are updated to the new ABI major number (23). The ABI exceptions are dropped and CI ABI checks are disabled because compatibility is not preserved. Special handling of removed drivers is also dropped in check-abi.sh and a note has been added in libabigail.abignore as a reminder. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2022-07-21 12:13:48 +02:00
Raja Zidane	5ddb903824	net/mlx5: reject negative integrity item configuration Negative integrity item refers to the condition when the item value mask is set, but the value spec is cleared: ... integrity value mask l4_ok value spec 0 ... ethdev library defines integrity bits `l3_ok` and `l4_ok` as accumulators for all hardware L3 and L4 integrity verifications respectively. Hardware `l3_ok` and `l4_ok` integrity bits refer to L3 and L4 network headers only. Integrity bits `l3_ok` and `l4_ok` are not compatible between ethdev library and hardware. PMD translations for ethdev `l3_ok` are: IPv4: `l3_ok` and `l3_csum_ok` IPv6: `l3_ok` ethdev `l4_ok` is translated into PMD `l4_ok` and `l4_csum_ok` bits. Positive IPv4 `l3_ok` flow item configuration is translated into a single matcher that ANDs the corresponding hardware bits. Negative IPv4 `l3_ok` is translated into 2 hardware conditions where each condition probes a single integrity bit: ethdev::l3_ok is 0 => MLX5::l3_ok is 0 OR MLX5::l3_csum_ok is 0 MLX5 hardware does not do OR condition in flow rule item. Negative IPv4 `l3_ok` must be translated into 2 flow rules. Similarly negative ethdev `l4_ok` condition is also translated into 2 hardware rules. Current PMD roadmap does not allow implicit flow rule split. Bugzilla ID: 948 Cc: stable@dpdk.org Suggested-by: Raja Zidane <rzidane@nvidia.com> Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-07-05 20:04:02 +02:00
Michael Baum	740a28366c	net/mlx5: add test for external Rx queue Add mlx5 internal test for map and unmap external RxQs. This patch adds to testpmd app a runtime function to test the mapping API. testpmd> mlx5 port (port_id) ext_rxq map (sw_queue_id) (hw_queue_id) testpmd> mlx5 port (port_id) ext_rxq unmap (sw_queue_id) Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@nvidia.com>	2022-07-05 20:02:57 +02:00
Michael Baum	85d9252e55	net/mlx5: add test for remote PD and CTX Add mlx5 internal option in testpmd similar to run-time function "port attach" which adds another parameter named "socket" for attaching port and add 2 devargs before. The arguments are "cmd_fd" and "pd_handle" using to import device created out of PMD. Testpmd application import it using IPC, and updates the devargs list before attaching. These arguments were added in the commit `9d936f4f1a` ("common/mlx5: support remote PD and CTX") The syntax is: testpmd> mlx5 port attach (identifier) socket=(path) Where "path" is the IPC socket path agreed on the remote process. Signed-off-by: Michael Baum <michaelba@nvidia.com> Reviewed-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@nvidia.com>	2022-07-05 20:01:33 +02:00
Yunjian Wang	a73b78554a	net/mlx5: fix stack buffer overflow in drop action The mlx5_drop_action_create function use mlx5_malloc for allocating 'hrxq', but don't allocate for 'rss_key'. This is wrong and it can cause buffer overflow. Detected with address sanitizer: 0 (/usr/lib64/libasan.so.4+0x7b8e2) 1 in mlx5_devx_tir_attr_set ../drivers/net/mlx5/mlx5_devx.c:765 2 in mlx5_devx_hrxq_new ../drivers/net/mlx5/mlx5_devx.c:800 3 in mlx5_devx_drop_action_create ../drivers/net/mlx5/mlx5_devx.c:1051 4 in mlx5_drop_action_create ../drivers/net/mlx5/mlx5_rxq.c:2846 5 in mlx5_dev_spawn ../drivers/net/mlx5/linux/mlx5_os.c:1743 6 in mlx5_os_pci_probe_pf ../drivers/net/mlx5/linux/mlx5_os.c:2501 7 in mlx5_os_pci_probe ../drivers/net/mlx5/linux/mlx5_os.c:2647 8 in mlx5_os_net_probe ../drivers/net/mlx5/linux/mlx5_os.c:2722 9 in drivers_probe ../drivers/common/mlx5/mlx5_common.c:657 10 in mlx5_common_dev_probe ../drivers/common/mlx5/mlx5_common.c:711 11 in mlx5_common_pci_probe ../drivers/common/mlx5/mlx5_common_pci.c:150 12 in rte_pci_probe_one_driver ../drivers/bus/pci/pci_common.c:269 13 in pci_probe_all_drivers ../drivers/bus/pci/pci_common.c:353 14 in pci_probe ../drivers/bus/pci/pci_common.c:380 15 in rte_bus_probe ../lib/eal/common/eal_common_bus.c:72 16 in rte_eal_init ../lib/eal/linux/eal.c:1286 17 in main ../app/test-pmd/testpmd.c:4112 Fixes: `0c762e81da` ("net/mlx5: share Rx queue drop action code") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-23 17:25:07 +02:00
Shun Hao	92b3c68e9f	net/mlx5: fix metering on E-Switch Manager When meter is used by E-Switch Manager port, there's an error that cannot get correct port ID. This patch fixes this by using specific parsing process to get port ID for E-Switch Manager. Fixes: `3c481324ba` ("net/mlx5: fix meter flow direction check") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:06 +02:00
Shun Hao	68e9925c30	net/mlx5: add limitation for E-Switch Manager match For BF with old FW which doesn't expose the E-Switch Manager vport ID, E-Switch Manager port matching works correctly only when BF is in embedded CPU mode. This patch adds the limitation description. Fixes: `a564038699` ("net/mlx5: support E-Switch manager egress traffic match") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:06 +02:00
Gregory Etelson	57d6a458a8	net/mlx5: fix RSS expansion for patterns with ICMP item MLX5 PMD RSS expansion implementation added L4 UDP or TCP headers after ICMP. For example: ETH / IPv4 / ICMP expanded into ETH / IPv4 / ICMP / {UDP \| TCP} ETH / IPv6 / ICMPv6 expanded into ETH / IPv6 / ICMPv6 / {UDP \| TCP} The patch updates PMD expansion scheme to handle ICMP and ICMPv6 types as non-expandable for RSS. Fixes: `c7870bfe09` ("ethdev: move RSS expansion code to mlx5 driver") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:05 +02:00
Spike Du	f41a5092e6	app/testpmd: add host shaper command Add command line options to support host shaper configure. - Command syntax: mlx5 set port <port_id> host_shaper avail_thresh_triggered <0\|1> rate <rate_num> - Example commands: To enable avail_thresh_triggered on port 1 and disable current host shaper: testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 1 rate 0 To disable avail_thresh_triggered and current host shaper on port 1: testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 0 The rate unit is 100Mbps. To disable avail_thresh_triggered and configure a shaper of 5Gbps on port 1: testpmd> mlx5 set port 1 host_shaper avail_thresh_triggered 0 rate 50 Add sample code to handle rxq available descriptor threshold event, it delays a while so that rxq empties, then disables host shaper and rearms available descriptor threshold event. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:04 +02:00
Spike Du	2235fcda12	net/mlx5: add API to configure host port shaper Host port shaper can be configured with QSHR (QoS Shaper Host Register). Add check in build files to enable this function or not. The host shaper configuration affects all the ethdev ports belonging to the same host port. Host shaper can configure shaper rate and lwm-triggered for a host port. The shaper limits the rate of traffic from host port to wire port. If lwm-triggered is enabled, a 100Mbps shaper is enabled automatically when one of the host port's Rx queues receives available descriptor threshold event. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:04 +02:00
Spike Du	5c9f3294e6	net/mlx5: support Rx descriptor threshold event Add mlx5 specific available descriptor threshold configuration and query handler. In mlx5 PMD, available descriptor threshold is also called LWM (limit watermark). While the Rx queue fullness reaches the LWM limit, the driver catches an HW event and invokes the user callback. The query handler finds the next Rx queue with pending LWM event if any, starting from the given Rx queue index. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:02 +02:00
Spike Du	25025da3c5	net/mlx5: handle Rx descriptor LWM event When LWM meets RQ WQE, the kernel driver raises an event to SW. Use devx event_channel to catch this and to notify the user. Allocate this channel per shared device. The channel has a cookie that informs the specific event port and queue. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:25:00 +02:00
Spike Du	72d7efe464	common/mlx5: share interrupt management There is a lot of duplicated code creating and initializing rte_intr_handle. Add a new mlx5_os API to do this, and replace all related PMD code with this API. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:24:59 +02:00
Spike Du	7158e46cb9	net/mlx5: support descriptor LWM for Rx queue Add LWM (Limit WaterMark) field to Rxq object which indicates the percentage of Rx queue size used by HW to raise descriptor event to the user. Allow LWM setting in modify_rq command. Allow the LWM configuration dynamically by adding RDY2RDY state change. Signed-off-by: Spike Du <spiked@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-23 17:23:29 +02:00
Ali Alnubani	bae645a23a	net/mlx5: fix build with clang 14 Use fgets instead of fscanf to resolve the following warning reported by clang 14.0.0 in Fedora 37 (Rawhide): drivers/net/mlx5/linux/mlx5_ethdev_os.c:1137:52: error: 'fscanf' may overflow; destination buffer in argument 3 has size 16, but the corresponding specifier may require size 17 [-Werror,-Wfortify-source] ret = fscanf(file, "%" RTE_STR(IF_NAMESIZE) "s", port_name); Fixes: `63d1db710f` ("net/mlx5: fix unlimited parsing of switch info") Cc: stable@dpdk.org Signed-off-by: Ali Alnubani <alialnu@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-23 17:23:28 +02:00
Sean Zhang	6431068d0f	net/mlx5: support field modification in meter rules This patch introduces MODIFY_FIELD action support in meter. User can create meter policy with MODIFY_FIELD action in green/yellow action. For example: testpmd> add port meter policy 0 21 g_actions modify_field op set dst_type ipv4_ecn src_type value src_value 3 width 2 / ... Signed-off-by: Sean Zhang <xiazhang@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-23 17:23:26 +02:00
Sean Zhang	76d5756122	net/mlx5: support modifying ECN field This patch is to support modify ECN field in IPv4/IPv6 header. Signed-off-by: Sean Zhang <xiazhang@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-23 17:23:25 +02:00
Sean Zhang	e8146c63c3	net/mlx5: support represented port item in flow rules Add support for represented_port item in pattern. And if the spec and mask both are NULL, translate function will not add source vport to matcher. For example, testpmd starts with PF, VF-rep0 and VF-rep1, below command will redirect packets from VF0 and VF1 to wire: testpmd> flow create 0 ingress transfer group 0 pattern eth / represented_port / end actions represented_port ethdev_id is 0 / end Signed-off-by: Sean Zhang <xiazhang@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-23 17:23:23 +02:00
Raja Zidane	fb96caa56a	net/mlx5: support ESP item on Windows ESP item is not supported on Windows, yet it is expanded from the expansion graph when trying to create default flow to RSS all packets. Support ESP item match (without ability to match on SPI field on Windows). Split ESP validation per OS. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-05 17:04:48 +02:00
Michael Baum	2192599c75	net/mlx5: fix entry size in construct data ipool The mlx5_action_construct_data structure memory is managed by ipool named acts_ipool. The size of one entry in this ipool is mistakenly defined as size of rte_flow_hw structure. This size is used to reset in the allocated part. When the size is incorrect it resets memory that does not belong to it. This patch defines the correct size. Fixes: `f13fab2392` ("net/mlx5: add flow jump action") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-05 17:04:46 +02:00
Geoffrey Le Gourriérec	eadc35df59	net/mlx5: fix statistics read on Linux This patch encompasses a few fixes carried by a previous patch that aimed to support bonding device stats counting. - If mlx5_os_read_dev_stat fails, it returns 1 instead of a negative value, causing mlx5_xstats_get to return an invalid number of counters. Since this error is not blocking, do not mess ret value with mlx5_os_read_dev_stat returned value. This allows avoiding the very annoying log: "n_xstats != n_xstats_names => skipping" - Invert the check for mlx5_os_read_dev_stat(), currently leading us to store the result if the function failed, and use a backup value if it succeeded, which is the opposite of what we actually want. Revert to the original (correct) test. - Add missing test on _mlx5_os_read_dev_counters() to prevent using trash stats values. Fixes: `7ed15acdcd` ("net/mlx5: improve xstats of bonding port") Cc: stable@dpdk.org Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Signed-off-by: Geoffrey Le Gourriérec <geoffrey.le_gourrierec@6wind.com> Tested-by: Bassam Zaid AlKilani <bzalkilani@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-02 17:01:11 +02:00
Rongwei Liu	2bd03a4361	net/mlx5: add Rx drop counters to xstats Add two kinds of Rx drop counters to DPDK xstats which are physical port scope. 1. rx_prio[0-7]_buf_discard The number of unicast packets dropped due to lack of shared buffer resources. 2. rx_prio[0-7]_cong_discard The number of packets that is dropped by the Weighted Random Early Detection (WRED) function. Prio[0-7] is determined by VLAN PCP value which is 0 by default. Both counters are retrieved from kernel ethtool API which calls PRM command finally. Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-06-01 09:49:44 +02:00
Raja Zidane	1485d961e2	net/mlx5: fix Tx recovery When an error occurs in a Tx queue and it is moved to the ERROR state, it is not recoverable: during recovery its state cannot be modified to INIT. To modify the state from RESET to INIT, the port must be passed in the modify attributes, and in the ERROR to READY modification path it was not provided. Provide the port number when changing state from RESET to INIT. Fixes: `3a87b964ed` ("net/mlx5: create Tx queues with DevX") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>	2022-06-01 09:49:42 +02:00
Shun Hao	96ca87da4f	net/mlx5: validate yellow meter action Yellow meter action support is added in meter hierarchy validation. If one color uses meter action, the other can only use NULL action or the same meter action. And only shared meter is supported. Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-01 09:49:41 +02:00
Shun Hao	3dc7afa2fa	net/mlx5: support yellow meter action for hierarchy tag rule When a hierarchy meter is shared by other ports, it's needed to iterate all meter policies in hierarchy to create tag rules, to set packet with next meter ID, which will be used by related meter drop count. This patch adds the tag rule for yellow support in hierarchy, so both green/yellow policy flows can set the correct meter ID. Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-01 09:49:38 +02:00
Shun Hao	bf62fb7693	net/mlx5: support yellow meter action in hierarchy This patch adds the support of meter action for yellow meter policy flow, so can use meter action for both green and yellow policy flows in meter hierarchy. Currently must use the same meter within one meter policy. Packets passing green/yellow policy flow will have previous meter color of green/yellow in subsequent meter. Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-01 09:49:36 +02:00
Shun Hao	6b838de3d5	net/mlx5: support previous meter color aware This patch adds the support for previous color aware for meter. Start_color setting is set to UNDEFINED when creating meter object that is color aware. Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-06-01 09:49:30 +02:00
Thomas Monjalon	64fcadeac0	avoid AltiVec keyword vector The AltiVec header file is defining "vector", except in C++ build. The keyword "vector" may conflict easily. As a rule, it is better to use the alternative keyword "__vector", so we will be able to #undef vector after including AltiVec header. Later it may become possible to #undef vector in rte_altivec.h with a compatibility breakage. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>	2022-05-25 11:49:39 +02:00
Raja Zidane	18ca4a4ec7	net/mlx5: support ESP SPI match and RSS hash In packets with ESP header, the inner IP will be encrypted, and its fields cannot be used for RSS hashing. So, ESP packets can be hashed only by the outer IP layer. So, when using RSS on ESP packets, hashing may not be efficient, because the fields used by the hash functions are only the outer IPs, causing all traffic belonging to all tunnels between a given pair of GWs to land on one core. Adding the SPI hash field can extend the spreading of IPsec packets. Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-05-15 09:38:59 +02:00
Shun Hao	fc7211097d	net/mlx5: fix no-green metering with RSS When a meter with an RSS action is used, there might be several sub-flows using different sub-policies in the flow splitting stage. If there's no green action, the same sub-policy is wrongly always used for all sub-flows, so some resources are overwritten and cannot be released, leading to an assert during port close. This patch fixes this issue by checking both green and yellow queue index when getting a blank sub-policy, to avoid the incorrect resource overwrite. Fixes: `b38a12272b` ("net/mlx5: split meter color policy handling") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-05-11 09:45:23 +02:00
Michael Baum	7a99336865	net/mlx5: fix LRO configuration in drop Rx queue The driver wrongly set the LRO configurations to the TIR of the DevX drop queue even when LRO is not supported. Actually, the LRO configuration is not relevant to the drop queue at all. This causes failure in the initialization of the device, which doesn't support LRO where the drop queue is created. Probably, the drop queue creation by DevX missed the fact that LRO is set by default in the TIR creation function and didn't unset it in the drop queue case like other cases that unset LRO. Move the default LRO configuration to unset it and set it only in the case of all the TIR queues configured with LRO. Fixes: `bc5bee028e` ("net/mlx5: create drop queue using DevX") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-26 11:52:18 +02:00
Michael Baum	a213b86821	net/mlx5: fix LRO validation in Rx setup The mlx5_rx_queue_setup() function gets the LRO offload from the user. When LRO is configured, the LRO flag in rxq_data is set to 1. This patch adds validation to make sure that LRO is supported. Fixes: `17ed314` ("net/mlx5: allow LRO per Rx queue") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-26 11:52:18 +02:00
Dariusz Sosnowski	d2fa2632a4	net/mlx5: fix RSS hash types adjustment When an indirect action was created with an RSS action configured to hash on both source and destination L3 addresses (or L4 ports), it caused shared hrxq to be configured to hash only on destination address (or port). This patch fixes this behavior by refining RSS types specified in configuration before calculating hash types used for hrxq. Refining RSS types removes _SRC_ONLY and _DST_ONLY flags if they are both set. Fixes: `212d17b6a6` ("net/mlx5: fix missing shared RSS hash types") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-04-26 11:51:56 +02:00
Raja Zidane	773a7de21a	net/mlx5: fix Rx/Tx stats concurrency Queue statistics are being continuously updated in Rx/Tx burst routines while handling traffic. In addition to that, statistics can be reset (written with zeroes) on statistics reset in other threads, causing a race condition, which in turn could result in wrong stats. The patch provides an approach with reference values, allowing the actual counters to be writable within Rx/Tx burst threads only, and updating reference values on stats reset. Fixes: `87011737b7` ("mlx5: add software counters") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-04-21 12:50:26 +02:00
Dariusz Sosnowski	26f22fa64e	net/mlx5: fix GTP handling in header modify action GTP items were ignored during conversion of modify header actions. This caused modify TTL action to generate a wrong modify header command when tunnel and inner headers used different IP versions. This patch adds GTP item handling to modify header action conversion. Fixes: `04233f36c7` ("net/mlx5: fix layer type in header modify action") Cc: stable@dpdk.org Signed-off-by: Dariusz Sosnowski <dsosnowski@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>	2022-04-21 12:47:44 +02:00
Adham Masarwah	cb91f12f4a	net/mlx5: support MTU settings on Windows Mlx5Devx library has new APIs for setting and getting MTU. Added new glue functions that wrap the new mlx5devx lib APIs. Implemented the os_ethdev callbacks to use the new glue functions in Windows. Signed-off-by: Adham Masarwah <adham@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-21 12:47:43 +02:00
Adham Masarwah	3014718fd2	net/mlx5: support promiscuous modes on Windows Support setting promiscuous modes by calling the new API in the Mlx5DevX library. Added new glue API for Windows which will be used to communicate with the Windows driver to enable/disable PROMISC or ALLMC. Signed-off-by: Adham Masarwah <adham@nvidia.com> Tested-by: Idan Hackmon <idanhac@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-21 12:47:42 +02:00
Michael Baum	d37b0b4d77	net/mlx5: remove redundant check for hairpin queue The mlx5_rxq_is_hairpin() function checks whether RxQ type is Hairpin. It is done by reading a flag in Rx control structure coming from mlx5_rxq_ctrl_get() function. The function verifies that the queue index is valid even though it has been checked within the mlx5_rxq_ctrl_get() function. This patch removes the redundant check. Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-21 12:47:41 +02:00
Michael Baum	1573b07284	net/mlx5: restrict Rx queue array access to boundary The mlx5_rxq_get() function gets an RxQ index and returns the RxQ priv accordingly. When it gets an invalid index, it accesses out of array bounds, which might cause undefined behavior. This patch adds a check for invalid indexes before accessing the array. Fixes: `0cedf34da7` ("net/mlx5: move Rx queue reference count") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>	2022-04-21 12:47:41 +02:00
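
The boundary check described in commit 1573b07284 can be illustrated with a minimal C sketch. This is not the driver's code: the structure names, field names, and the function `rxq_get_sketch` are hypothetical stand-ins, and the only assumption carried over from the commit message is that an invalid index must be rejected before the array is read.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-queue private structure. */
struct rxq_priv_sketch {
	uint16_t idx;
};

/* Hypothetical stand-in for the port private data. */
struct port_priv_sketch {
	uint16_t rxqs_n;                    /* number of configured Rx queues */
	struct rxq_priv_sketch **rxq_privs; /* array of rxqs_n entries, may be NULL */
};

/*
 * Return the Rx queue entry for idx, or NULL when the port data is missing
 * or idx falls outside the array. The range check happens before the array
 * is dereferenced, so an invalid index never causes an out-of-bounds read.
 */
static struct rxq_priv_sketch *
rxq_get_sketch(const struct port_priv_sketch *priv, uint16_t idx)
{
	if (priv == NULL || priv->rxq_privs == NULL || idx >= priv->rxqs_n)
		return NULL;
	return priv->rxq_privs[idx];
}
```

Callers of such a helper would treat a NULL return as "no such queue" instead of relying on the index having been validated elsewhere.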


2531 Commits