Commit Graph

1959 Commits

Author SHA1 Message Date
Jiawei Wang
ae2927cd26 net/mlx5: extend skip scale flag
The sampling feature introduces the scale flow group with factor,
then the scaled table value can be used for the normal path table
due to this table be created implicitly.

But if the input group value already be scaled, for example the
group value of sampling suffix flow, then use 'skip_scale" flag
to skip the scale twice in the translation action.

Consider the flow with jump action and this jump action could be
created implicitly, PMD may only scale the original flow group
value or scale the jump group value or both, so extend the
'skip_scale' flag to two bits:
If bit0 of 'skip_scale' flag is set to 1, then skip the scale the
original flow group;
If bit1 of 'skip_scale' flag is set to 1, then skip the scale the
jump flow group.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-29 18:16:07 +01:00
Jiawei Wang
6a951567c1 net/mlx5: support E-Switch mirroring and jump in one flow
mlx5 E-Switch mirroring is implemented as multiple destination array in
one steering table. The array currently supports only port ID as
destination actions.

This patch adds the jump action support to the array as one of
destination.
The packets can be mirrored to the port and jump to the next table in
the same destination array allowing to continue handling in the new
table.

For example:
    set sample_actions 0 port_id id 1 / end
    flow create 0 ingress transfer pattern eth / end actions
    sample ratio 1 index 0 / jump group 1 / end
    flow create 1 ingress transfer group 1 pattern eth / end actions
    set_mac_dst mac_addr 00:aa:bb:cc:dd:ee / port_id id 2 / end

The flow results all the matched ingress packets are mirrored
to port id 1 and go to group 1. In the group 1, packets are modified
with the destination mac and sent to port id 2.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-29 18:16:07 +01:00
Bruce Richardson
df96fd0d73 ethdev: make driver-only headers private
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
2021-01-29 20:59:09 +01:00
Jiawei Wang
bd49d1d343 net/mlx5: handle RSS action in sample
PMD validates the rss action in the sample sub-actions list,
then translates into rdma-core action and it will be used for sample
path destination.

If the RSS action is in both sample sub-actions list and original flow,
the rss level and rss type in the sample sub-actions list should be
consistent with the original flow list, since the expanding items
for RSS should be the same for both actions.

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 13:49:41 +01:00
Jiawei Wang
6c2a3a9049 net/mlx5: fix unnecessary checking for RSS action
RSS action is valid only in NIC-RX domain, this fix bypass
the function that getting RSS action from the flow action list
under no NIC-RX domain.

Fixes: e745f90007 ("net/mlx5: optimize flow RSS struct")
Cc: stable@dpdk.org

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:32 +01:00
Shiri Kuzin
e440d6cf58 net/mlx5: add GENEVE TLV option flow translation
The GENEVE TLV option matching flows must be created
using a translation function.

This function checks whether we already created a Devx
object for the matching and either creates the objects
or updates the reference counter.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:16 +01:00
Shiri Kuzin
f7239fce69 net/mlx5: add GENEVE TLV option flow validation
This patch adds validation routine for the GENEVE
header TLV option.

The GENEVE TLV option match must include all fields
with full masks due to NIC does not support masking
on option class, type and length.

The option data length must be non zero and provided
data pattern should be zero neither due to hardware
limitations.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:16 +01:00
Shiri Kuzin
f15f0c3806 net/mlx5: create GENEVE TLV option management
Currently firmware supports the only TLV object per device
to match on the GENEVE header option.

This patch adds the simple TLV object management to the mlx5 PMD.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:16 +01:00
Shiri Kuzin
06cd4cf63f net/mlx5: add GTP PSC item translation
This patch adds the translation function which
sets the qfi, PDU type.

The next extension header which indicates the following
extension header type is set to 0x85 - a PDU session
container.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:13 +01:00
Shiri Kuzin
2c9f961703 net/mlx5: add GTP PSC flow validation
In this patch we add validation routine for
GTP PSC extension header.

The GTP PSC extension header must follow the
GTP item.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-19 03:30:13 +01:00
Michael Baum
6e0a3637d8 net/mlx5: move Rx RQ creation to common
Using common function for Rx RQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
389ab7f5fd net/mlx5: move ASO SQ creation to common
Using common function for ASO SQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
74e918604a net/mlx5: move Tx SQ creation to common
Using common function for Tx SQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
71011bd56b net/mlx5: move rearm and clock queue SQ creation to common
Using common function for DevX SQ creation for rearm and clock queue.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
5cd33796dd net/mlx5: move Rx CQ creation to common
Using common function for Rx CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
5f04f70ccf net/mlx5: move Tx CQ creation to common
Using common function for Tx CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
c7d41d98a7 net/mlx5: move ASO CQ creation to common
Use common function for ASO CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
a7787bb0b7 net/mlx5: move rearm and clock queue CQ creation to common
Using common function for CQ creation at rearm queue and clock queue.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
0e8273176e net/mlx5: fix leak on ASO SQ creation failure
In ASO SQ creation, the PMD allocates umem buffer for SQ.

When umem buffer allocation fails, the MR and CQ memory are not freed
what caused a memory leak.

Free it.

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
4a7f979af2 net/mlx5: remove CQE padding device argument
The data-path code doesn't take care on 'rxq_cqe_pad_en' and use padded
CQE for any case when the system cache-line size is 128B.

This makes the argument redundant.

Remove it.

Fixes: bc91e8db12 ("net/mlx5: add 128B padding of Rx completion entry")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Michael Baum
a2521c8f98 common/mlx5: fix completion queue entry size configuration
According to the current data-path implementation in the PMD the CQE
size must follow the cache-line size.
So, the configuration of the CQE size should be depended in
RTE_CACHE_LINE_SIZE.

Wrongly, part of the CQE creations didn't follow it exactly what caused
an incompatibility between HW and SW in the data-path when working in
128B cache-line size systems.

Adjust the rule for any CQE creation.
Remove the cqe_size attribute from the DevX CQ creation command and set
it inside the command translation according to the cache-line size.

Fixes: 79a7e409a2 ("common/mlx5: prepare support of packet pacing")
Fixes: 5cd0a83f41 ("common/mlx5: support more fields in DevX CQ create")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Dekel Peled
19e13263ed net/mlx5: fix hairpin flow split decision
Previously, the identification of hairpin queue was done using
mlx5_rxq_get_type() function.
Recent patch replaced it with use of mlx5_rxq_get_hairpin_conf(),
and check of the return value conf != NULL.
The case of return value is NULL (queue is not hairpin) was not handled.
As result, non-hairpin flows were wrongly handled.
This patch adds the required check for return value is NULL.

Fixes: 509f8470de ("net/mlx5: do not split hairpin flow in explicit mode")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-13 19:45:30 +01:00
Tal Shnaiderman
5d55a494f4 net/mlx5: split multi-thread flow handling per OS
multi-threaded flows feature uses pthread function pthread_key_create
but for Windows the destruction option in the function is unimplemented.

To resolve it, Windows will implement destruction mechanism to cleanup
mlx5_flow_workspace object for each terminated thread.

Linux flow will keep the current behavior.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Khoa To <khot@microsoft.com>
2021-01-13 19:45:30 +01:00
Ophir Munk
bd935fe3e6 net/mlx5: wrap sampling actions per OS
Wrap glue calls dr_create_flow_action_sampler() and
dr_create_flow_action_dest_array() as OS-specific functions.
This is a follow up on
commit b293fbf967 ("net/mlx5: add OS specific flow actions operations")

On Windows, the sampling actions wrappers currently return ENOTSUP.
Using configuration definitions HAVE_MLX5_DR_CREATE_ACTION_FLOW_SAMPLE and
HAVE_MLX5_DR_CREATE_ACTION_DEST_ARRAY the missing sampling DV structs
are added as stubs to windows/mlx5_glue.h file.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:34:52 +01:00
Michael Baum
b689b00dd2 net/mlx5: fix leak on Tx queue creation failure
In Tx queue creation, there are two validations for the Tx
configuration.

When one of them fails, the MR btree memory was not freed what caused a
memory leak.

Free it.

Fixes: f6d9ab4e76 ("net/mlx5: check Tx queue size overflow")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:34:52 +01:00
Michael Baum
e28e6c63a9 net/mlx5: fix leak on Rx queue creation failure
In Rx queue creation, there are some validations for the Rx
configuration.

When one of them fails, the MR btree memory was not freed what caused a
memory leak.

Free it.

Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:34:52 +01:00
Shiri Kuzin
d362e6f6ac net/mlx5: fix VXLAN decap on non-VXLAN flow
The vxlan_decap action performs decapsulation of the VXLAN tunnel.

Currently we can create a flow with vxlan_decap without
matching on VXLAN header.

To solve this issue this patch adds validation verifying
that the VXLAN item was detected when specifying
vxlan_decap action.

Fixes: 49d6465af3 ("net/mlx5: add VXLAN decap action to Direct Verbs")
Cc: stable@dpdk.org

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
2021-01-08 16:34:52 +01:00
Suanming Mou
8e61555657 net/mlx5: fix shared RSS and mark actions combination
In order to allow mbuf mark ID update in Rx data-path, there is a
mechanism in the PMD to enable it according to the rte_flows.
When a flow with mark ID and RSS/QUEUE action exists, all the relevant
Rx queues will be enabled to report the mark ID.

When shared RSS action is combined with mark action, the PMD mechanism
misses the Rx queues updates.

This commit handles the shared RSS case in the mechanism too.

Fixes: e1592b6c4d ("net/mlx5: make Rx queue thread safe")
Cc: stable@dpdk.org

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:34:51 +01:00
Tal Shnaiderman
16047bd015 net/mlx5: fix comparison sign in flow engine
The clang compiler warns on size mismatches of several
comparisons.

warning: comparison of integers of different signs

To resolve those the right types is used/cast to.

Fixes: 3e8edd0ef8 ("net/mlx5: update metadata register ID query")
Fixes: e554b672aa ("net/mlx5: support flow tag")
Fixes: c8f0abe7f8 ("net/mlx5: fix meter color register consideration")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
084de7a108 net/mlx5: skip IPv6 broadcast flow creation failure
IPv6 broadcast flow creation is unsupported in Windows.
do not fail on IPv6 broadcast flow creation on this mast
to avoid entire default rules creation failure.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
62d5b30bf3 net/mlx5: wrap flow domain sync per OS
use OS functions for flow_dv_sync_domain to compile
Windows.

mlx5_os_flow_dr_sync_domain is unsupported for Windows.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
ef65067c46 net/mlx5: initialize context list mutex dynamically
The mutex mlx5_dev_ctx_list_mutex was initialized with
PTHREAD_MUTEX_INITIALIZER global macro however this macro
is not supported on Windows OS shim implementation of pthreads
in DPDK.

Moved the init of this mutex to RTE_INIT to support this mutex
on both OSs.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
adc65dee4f net/mlx5: use OS-independent code in ASO feature
Modify the ASO feature to use OS independent code
not to break Windows build.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
28743807e8 net/mlx5: fix device name size on Windows
Windows Devx interface name is the same as device name with
different size then IF_NAMESIZE. To support it MLX5_NAMESIZE
is defined with IF_NAMESIZE value for Linux and MLX5_FS_NAME_MAX
value for Windows.

Fixes: e9c0b96e35 ("net/mlx5: move Linux ifname function")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
b012b4ce72 net/mlx5: unify operations for all OS
There are three types of eth_dev_ops: primary, secondary and isolate
represented in three callback tables per OS.  In this commit the OS
specific eth dev tables are unified into shared tables in file mlx5.c.
Starting from this commit all operating systems must implement the same
eth dev APIs. In case an OS does not support an API - it can return in
its implementation an error ENOTSUP.

Fixes: 042f5c94fd ("net/mlx5: refactor device operations for Linux")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
f1ae0b3590 net/mlx5: enable more shared code on Windows
Use macro HAVE_INFINIBAND_VERBS_H to successfully compile files both
under Linux and Windows (or any non Linux in general). Under Windows
this macro:
1. Hides Verbs references.
2. Exposes required DV structs that are under ifdefs related to rdma
core.

Linux code under definitions such as #ifdef HAVE_IBV_FLOW_DV_SUPPORT is
required unconditionally under Windows however those definitions are
never effective without rdma-core presence. Therefore update the #ifdef
condition to consider HAVE_INFINIBAND_VERBS_H as well (undefined macro
when running without an rdma-core library).

For example:
-#ifdef HAVE_IBV_FLOW_DV_SUPPORT
+#if defined(HAVE_IBV_FLOW_DV_SUPPORT) || !defined(HAVE_INFINIBAND_VERBS_H)

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
1d194496b9 net/mlx5: create flow rule on Windows
This commit implements mlx5_flow_os_create_flow() API. It is equivalent
to Linux rdma-core implementation. The API receives the matcher mask,
matcher value and an array of actions. They are copied into a PRM-like
struct devx_fs_rule_add_in. Then glue API devx_fs_rule_add() is called.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
68e28591ee net/mlx5: create flow action dest TIR object on Windows
This commit implements mlx5_flow_os_create_flow_action_dest_devx_tir()
API as the Linux rdma-core equivalent. Missing rdma-core parameters are
added to file mlx5_win_defs.h. The action TIR id and type
(MLX5_FLOW_CONTEXT_DEST_TYPE_TIR) are saved in the action struct.  The
action struct will be added to array of actions and will be used later
by the flow creation API.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
03e1f7f760 net/mlx5: create flow matcher object on Windows
This commit implements the mlx5_flow_os_create_flow_matcher() API. It is
the Linux rdma-core equivalent implementation. Missing rdma-core
parameters (e.g. struct mlx5dv_flow_match_parameters) are added to file
mlx5_win_defs.h. The API allocates space to hold the PRM bits in PRM
fte_match_param format and copy the DV translated PRM bits into the
matcher struct. This matcher struct will be used later by the flow
creation API.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
882595159f net/mlx5: introduce flow support on Windows
This patch adds the initial flow framework under Windows OS. It supports
a subset of filters (ETH, IPV4, UDP) and a QUEUE action.  It is based on
DevX mechanism to send commands to the NIC through the kernel. It does
not support steering rules (i.e. writing directly to the NIC memory).
The Windows framework uses the existing DV framework where file
mlx5_flow_dv.c remains intact.

Steps involved in flow creation:
1. Create a domain (RX, TX, FDB). Since domains are created by steering
rules and not with DevX, Windows does not require a domain object (this
means switch dev mode which requires an FDB domain is not supported).
2. Create a table object. Windows only supports table 0. The call to
mlx5_flow_os_create_flow_tbl() silently returns successfully.
3. Create a matcher object. A matcher struct is created by calling
mlx5_flow_os_create_flow_matcher().  The matcher validation and
translation are part of the DV implementation. The matcher bits that
were created by DV in standard PRM format are copied into the matcher
struct.
4. Create an action object. The call to
mlx5_flow_os_create_flow_action_dest_devx_tir() creates an action struct
with the TIR type and id.  This struct will be a parameter later in a
call to flow creation.  All other action calls (e.g. packet reformat,
header modification, jump to flow table, etc) return with a non
supported error.
5. Create the flow. The call to mlx5_flow_os_create_flow() receives the
matcher struct, action struct, and copy them into Windows specific
fs_rule struct, then it calls glue API devx_fs_rule_add().

Details on additional APIs:
* mlx5_flow_os_get_type() is called during flow type selection. In
Windows it constantly returns MLX5_FLOW_TYPE_DV.
* mlx5_flow_os_item_supported() is called before starting DV items
validation or translation. It filters out the OS non supported items in
advance.
* mlx5_flow_os_action_supported() is called before starting DV actions
validation or translation. It filters out the OS non supported actions
in advance.
* mlx5_flow_adjust_priority() is an OS stub for flow priority
adjustment. Windows only supports flow priority 0.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
8801972313 net/mlx5: fix flow operation wrapper per OS
Wrap glue call dv_create_flow_action_dest_devx_tir() with an OS API.

Fixes: b293fbf967 ("net/mlx5: add OS specific flow actions operations")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
14020ad53d net/mlx5: wrap default miss flow action per OS
Wrap glue call dr_create_flow_action_default_miss() with an OS API. This
commit is a follow up on [1].

[1]
commit d4d85aa6f1 ("common/mlx5: add default miss action")
commit b293fbf967 ("net/mlx5: add OS specific flow actions operations")

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
c604d92af7 net/mlx5: wrap adjust flow priority per OS
mlx5_flow_adjust_priority() is used to adjust priorities according to
priorities levels. It is Verbs based and it is called from shared code
(mlx5_flow_dv.c). Therefore, wrap it in an OS API.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
e7252614cd net/mlx5: support VF PCI address on Windows
Support VF BDF scanning by checking both the BDF and raw BDF provided by
DevX. In Linux a PCI address is formatted as: domain, bus, device,
function (DBDF).  This is right for both a PF and a VF. In Windows a PF
also has a DBDF format, but the domain is always 0, while a VF has a
special "domain" called "Virtual PCI Bus, Serial" (for example: "Virtual
PCI Bus Slot 2 Serial 2") or segment.  The full VF format under Windows
is called raw DBF.  Windows special domain must be considered and DevX
must be called to support it.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
93f4ece91a net/mlx5: spawn ethdev ports on Windows
This commit implements mlx5_dev_spawn() API which allocates an eth
device (struct rte_eth_dev) for each PCI device. When working with
representors virtual functions (as in Linux), one PCI device may spawn
several eth devices: the master device for the main physical function
(PF) and several representors for the virtual functions (VFs).  However,
currently Windows does not work in switch dev mode, therefore, no VFs
are created and no representors are spawned. In this case one eth device
is created per one PCI main port.  In addition to device creation - the
device configuration must be correctly set. The device arguments
(devargs - set by the user) are parsed but they may be overridden by
Windows limitations or hardware configurations. Some associated network
parameters are stored in eth device (e.g. ifindex, MAC address, MTU) and
some callback (e.g. burst functions) are set.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
980826dc6f net/mlx5: probe on Windows
This commit implements mlx5_os_pci_probe API under Windows. It does all
required initializations then it gets the PCI device list using glue API
get_device_list().  Next, all non MLX5 matched devices are filtered out.
The supported NIC types are: CONNECTX4VF, CONNECTX4LXVF, CONNECTX5VF,
CONNECTX5EXVF, CONNECTX5BFVF, CONNECTX6VF, MELLANOX_CONNECTX6DXVF.  Each
device in the list is assigned with default configuration parameters,
most of them are 0. The default dv_flow_en parameter value is 1 (which
means Windows match and action flows are based on DV code). Next for
each PCI device call mlx5_dev_spawn() to create an eth device (struct
rte_ethdev). The implementation of device spawn is in the follow up
commit.  Finally, the device list is free.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
61b37ff74b net/mlx5: open device on Windows
This commit implements mlx5_os_open_device() API. It calls glue API
open_device() then glue API query_device() to fill in 'struct
mlx5_context' with data for later usage.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
d266956b77 net/mlx5: support getting PDN on Windows
Implement OS function call to get pdn.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Ophir Munk
60f522c0e4 net/mlx5: add VLAN stubs on Windows
This commit adds stubs to VLAN VM operations.  It is the Windows
equivalent implementation of [1].  The Linux implementation was based on
Netlink APIs which are not supported in Windows.

[1]
commit 7af10d29a4 ("net/mlx5/linux: refactor VLAN")

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
165e5d07ce net/mlx5: support device removed query on Windows
This commit implements mlx5_is_removed() API. A new glue call
'init_shutdown_event' is added to support the new API.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
49c797f8ab net/mlx5: support getting interface name on Windows
This commit copies the interface name as saved in the device context
since its creation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
07cae8ffab net/mlx5: support getting MTU on Windows
This commit implements API mlx5_get_mtu(). It returns the MTU size as
saved in the device context since its creation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
99d7c45cf8 net/mlx5: support clock read on Windows
This commit adds a new glue function query_rt_values to support the new
API mlx5_read_clock().

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:08 +01:00
Tal Shnaiderman
6fbd73709e net/mlx5: support link update on Windows
Add support for mlx5_link_update() to get link speed and link state.
Other parameters are currently hard-coded.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
b653ce1dae net/mlx5: add stubs on Windows
This commits adds ethdev stubs. These APIs are called from shared code
that must compile under Linux and Windows. The following stubs are added:
mlx5_set_mtu
mlx5_os_read_dev_counters
mlx5_intr_callback_unregister
mlx5_os_get_stats_n
mlx5_os_stats_init
mlx5_set_link_down
mlx5_set_link_up
mlx5_dev_get_flow_ctrl
mlx5_dev_set_flow_ctrl
mlx5_get_module_info
mlx5_get_module_eeprom

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
93cc4f0dfe net/mlx5: support getting MAC on Windows
This commits implements API mlx5_get_mac().  It returns the MAC address
saved in the device context since its creation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
c2bf5433fb net/mlx5: add stubs for MP requests on Windows
Windows supports the primary process with no secondary process control.
This commit adds stubs for requests to start/stop the data-path to the
secondary process and for requests to start/stop a queue of the primary
process.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
38e8684aa7 net/mlx5: add memory region callbacks on Windows
This commit is the Windows part implementation of
commit d5ed8aa944 ("net/mlx5: add memory region callbacks in per-device cache")

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
d0b3ef1a6e net/mlx5: add macros for file name and path
ibdev_name and ibdev_path sizes are defined in Windows DevX
differently from the sizes used in Linux with
IBV_SYSFS_NAME_MAX and IBV_SYSFS_PATH_MAX.

Added MLX5_FS_NAME_MAX and MLX5_FS_NAME_PATH in mlx5_os.h for both OSs.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
d1be572d49 net/mlx5: refactor ops for Windows
There are two types of eth_dev_ops used under Windows: primary and
isolate mode. Their function calls initialization is added to the OS
specific file mlx5_os.c. Secondary process eth_dev_ops is nullified.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
d36bb662df net/mlx5: support adding MAC address on Windows
Get the list of MAC addresses and verify if the input mac parameter
already exists. If not - return -ENOTSUP (as Windows does not support
adding new MAC addresses). If the MAC address exists (EEXIST) return 0
(the equivalent of Linux implementation of this API).

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
59f102074f net/mlx5: add Windows stubs
mlx5_os_set_nonblock_channel_fd
mlx5_os_dev_shared_handler_install
mlx5_os_dev_shared_handler_uninstall
mlx5_os_read_dev_stat
mlx5_os_mac_addr_flush
mlx5_os_mac_addr_remove
mlx5_os_vf_mac_addr_modify
mlx5_os_set_promisc
mlx5_os_set_allmulti

Set struct mlx5_flow_driver_ops mlx5_flow_verbs_drv_ops with NULL
pointers.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
1137ecee26 net/mlx5: implement device attribute getter on Windows
This commit is the Windows implementation of mlx5_os_get_dev_attr() API.
It follows the commit in [1]. A new file named mlx5_os.c is added under
windows directory as its Linux counterpart file: linux/mlx5_os.c.

[1].
commit e85f623e13 ("net/mlx5: remove attributes dependency on Verbs")

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
db12615b42 net/mlx5: prepare MR prototypes for DevX
Currently MR operations are Verbs based. This commit updates MR
operations prototypes such that DevX MR operations callbacks can be used
as well.  Rename 'struct mlx5_verbs_ops' as 'struct mlx5_mr_ops' and
move it to shared file mlx5.h.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
981746264e common/mlx5: wrap event channel functions per OS
Wrap the API to create/destroy event channel and to subscribe an event
with OS calls. In Linux those calls are implemented by glue functions
while in Windows they are not supported.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
223f2c2162 net/mlx5: fix flow action destroy wrapper
Glue function destroy_flow_action() was wrapped by OS specific operation
mlx5_flow_os_destroy_flow_action(). It was skipped in file mlx5.c.

Fixes: b293fbf967 ("net/mlx5: add OS specific flow actions operations")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
07a99de886 net/mlx5: wrap glue reg/dereg UMEM per OS
Wrap glue calls for UMEM registration and deregistration with generic OS
calls since each OS (Linux or Windows) has a different glue API
parameters.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
1cb210abdd net/mlx5: wrap glue alloc/dealloc PD per OS
Wrap glue calls alloc_pd() and dealloc_pd() with generic OS calls.  In
Linux - protection domain allocations are implemented by Verbs glue API
while in Windows it is by DevX API.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
9b9890e20d net/mlx5: move static asserts to global scope
Some Windows compilers consider static_assert() as calls to another
function rather than a compiler directive which allows checking type
information at compile time.  This only occurs if the static_assert call
appears inside another function scope. To solve it move the
static_assert calls to global scope in the files where they are used.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
ffe262d675 net/mlx5: do not define static assert in Windows
In Linux 'static_assert' is defined in file mlx5_defs.h:

 #ifndef HAVE_STATIC_ASSERT
    #define static_assert _Static_assert
 #endif

The same definition can originate from Linux file /usr/include/assert.h.

In Windows static_assert is used while _Static_assert is unknown.
Therefore update the definition condition to be:

 #if !defined(HAVE_STATIC_ASSERT) && !defined(RTE_EXEC_ENV_WINDOWS)
    #define static_assert _Static_assert
 #endif

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
bd4a263560 net/mlx5: define MPRQ functions as static inline
Functions mlx5_check_mprq_support(), mlx5_rxq_mprq_enabled(),
mlx5_mprq_enabled() are moved from source file mlx5_rxq.c to header file
mlx5_rxtx.h and their type is updated to 'static __rte_always_inline'.
Previously the functions were declared as 'inline' in the source file
which was reported as 'unresolved external symbol' error by some Windows
linkers.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
20698c9f15 net/mlx5: replace Linux sleep
Replace Linux API usleep() and nanosleep() with rte_delay_us_sleep().
The replacement occurs in shared files compiled under different
operating systems.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
b492e28882 net/mlx5: fix freeing packet pacing
Packet pacing is allocated under condition #ifdef HAVE_MLX5DV_PP_ALLOC.
In a similar way - free packet pacing index under the same condition.
This update is required to successfully compile under operating systems
which do not support packet pacing.

Fixes: aef1e20ebe ("net/mlx5: allocate packet pacing context")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
ed7f6255c9 net/mlx5: remove Linux files from Windows compilation
This commit removes Linux files flow_verbs.c and mlx5_rxtx_vec.c
from Windows compilation.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Ophir Munk
1f29d15ec9 net/mlx5: extend device attributes getter
This commit adds device attributes parameters to be reported by
mlx5_os_get_dev_attr(): max_cqe, max_mr, max_pd, max_srq, max_srq_wr

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Tal Shnaiderman
09a5e9777e net/mlx5: fix constant array size
Before this commit the PMD used:
   const int elt_n = 8
   const int *stack[elt_n];

In Windows clang compiler complains:
net/mlx5/mlx5_flow.c:215:19: error: variable length array folded
to constant array as an extension [-Werror,-Wgnu-folding-constant]

Fix it by using a constant macro definition instead of a variable:
   #define MLX5_RSS_EXP_ELT_N 8
   const int *stack[MLX5_RSS_EXP_ELT_N];

Fixes: c7870bfe09 ("ethdev: move RSS expansion code to mlx5 driver")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Gregory Etelson
187f942b2d net/mlx5: fix tunnel rules validation on VF representor
MLX5 PMD implicitly adds vxlan_decap flow action to tunnel offload
match type rules. However, VXLAN decap action on VF representors is
not supported on MLX5 PMD hardware.

The patch rejects attempt to create tunnel offload flow rules on VF
representor.

Refer commit 9c4971e523 ("net/mlx5: update VLAN and encap actions validation")

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-08 16:03:07 +01:00
Viacheslav Ovsiienko
ddb0384346 net/mlx5: fix buffer split offload advertising
The buffer split Rx offload is not compatible with Multi-Packet
Receiving Queue (MPRQ) Rx offload, hence, the buffer split
offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT and other related
values should be advertised only if there is no MPRQ engaged.

Fixes: 6c8f7f1c18 ("net/mlx5: report Rx buffer split capabilities")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Asaf Penso <asafp@nvidia.com>
2021-01-08 16:03:05 +01:00
Alexander Kozyrev
ac340e1fe5 net/mlx5: fix mbuf freeing in vectorized MPRQ
Wrong index is used to find mbufs belonging to an application in
the rxq_free_elts_sprq() function in the case of vectorized MPRQ.
elts_ci points to the last allocated mbuf in this case, not rq_ci.
Use this field to avoid double free of mbuf and segmentation fault.

Fixes: 0f20acbf5e ("net/mlx5: implement vectorized MPRQ burst")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-08 16:03:05 +01:00
Gregory Etelson
3ab5a3a7ac net/mlx5: fix Direct Verbs flow descriptor allocation
Initialize flow descriptor tunnel member during flow creation.
Prevent access to stale data and pointers when flow descriptor is
reallocated after release.
Fix flow index validation.

Fixes: e7bfa3596a ("net/mlx5: separate the flow handle resource")
Fixes: 8bb81f2649 ("net/mlx5: use thread specific flow workspace")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-01-08 16:03:05 +01:00
Suanming Mou
495b2ed40a net/mlx5: optimize tunnel offload index pool
Currently, when creating the index pool, if the trunk size is not
configured, the index pool default trunk size will be 4096.

The maximum tunnel offload supported now is 256(MLX5_MAX_TUNNELS),
create the index pool with trunk size 4096 wastes the memory.

This commits changes the tunnel offload index pool trunk size to
MLX5_MAX_TUNNELS to save the memory.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Reviewed-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:04 +01:00
Suanming Mou
f5b0aed2df net/mlx5: optimize hash list entry memory
Currently, the hash list saves the hash key in the hash entry. And the
key is mostly used to get the bucket index only.

Save the entire 64 bits key to the entry will not be a good option if
the key is only used to get the bucket index. Since 64 bits costs more
memory for the entry, mostly the signature data in the key only uses
32 bits. And in the unregister function, the key in the entry causes
extra bucket index calculation.

This commit saves the bucket index to the entry instead of the hash key.
For the hash list like table, tag and mreg_copy which save the signature
data in the key, the signature data is moved to the resource data struct
itself.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:04 +01:00
Suanming Mou
d14cbf3db1 net/mlx5: optimize hash list synchronization
Since all the hash table operations are related with one dedicated
bucket, the hash table lock and gen_cnt can be allocated per-bucket.

Currently, the hash table uses one global lock to protect all the
buckets, that global lock avoids the buckets to be operated at one
time, it hurts the hash table performance. And the gen_cnt updated
by the entire hash table causes incorrect redundant list research.

This commit optimized the lock and gen_cnt to bucket solid allows
different bucket entries can be operated more efficiently.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:04 +01:00
Dekel Peled
4165bfd20d net/mlx5: fix shared age action validation
Previous patch added support of shared age action.
This feature is supported on group 1 and higher, and validation was
added accordingly.
On FDB table the group 0 is skipped to improve performance.
As a result the mentioned validation is not relevant for transfer rules.
This patch adds the required check to ensure proper validation.

Fixes: f9bc5274a6 ("net/mlx5: allow age modes combination")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:04 +01:00
Viacheslav Ovsiienko
81c3b97735 net/mlx5: fix Verbs memory allocation callback
The rdma-core library uses callbacks to allocate and free memory
from DPDK. The memory allocation callback used the complicated
and incorrect way to get the NUMA socket ID from the context.
The context was wrong that might result in wrong socket ID
and allocating memory from wrong node.

The callbacks are assigned once as Infinibande device context
is created allowing early access to shared DPDK memory for all
Verbs internal objects need that.

Fixes: 36dabcea78 ("net/mlx5: use anonymous Direct Verbs allocator argument")
Fixes: 2eb4d0107a ("net/mlx5: refactor PCI probing on Linux")
Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:04 +01:00
Thomas Monjalon
72bf1b5d80 net/mlx5: fix flow shared action destroy error code
In the function rte_flow_shared_action_destroy(),
the errno ETOOMANYREFS has been replaced with EBUSY in the
commit dc328d1c55 ("ethdev: rename a flow shared action error code").

Another occurrence of ETOOMANYREFS, added later by mistake,
is replaced with EBUSY errno.

Fixes: fa7ad49e96 ("net/mlx5: fix shared RSS action update")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
2020-11-25 13:55:05 +01:00
Gregory Etelson
bb5d49c616 net/mlx5: fix tunnel offload freeing
PMD did not remove tunnel offload object from tunnels database before
it released the object memory. As the result, the tunnels database
become corrupted and subsequent search operations triggered PMD crash.
The patch removes tunnel offload object from the tunnels database when
the object is not in-use by PMD any more.

Fixes: bc1d90a3cf ("net/mlx5: fix build with Direct Verbs disabled")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-25 13:54:20 +01:00
Matan Azrad
e98f479df9 net/mlx5: reduce log level in hash list registration
In mlx5 internal hash list tool, there is a log print when an entry
allocation is failed: Can't allocate hash list entry.

Some initialization checks triggers hash list registration in order to
check some capabilities. Here, the failure in registration doesn't
lead to failure in the initialization flow, that is why the log level
can be lower.

Move the entry allocation failure log to debug level.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Asaf Penso <asafp@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-24 23:30:21 +01:00
Gregory Etelson
6e09c7bb6f net/mlx5: fix DevX resources freeing
Invalid memory release order of DevX resources caused PMD crash.

1. SQ and CQ memory must be unregistered with DevX before it is freed.
2. SQ objects reference to a CQ ones. Hence, SQ should be destroyed in
   advance of CQ it references to.

Fixes: 6deb19e1b2 ("net/mlx5: separate Rx queue object creations")
Fixes: 88f2e3f18c ("net/mlx5: rearrange SQ and CQ creation in DevX module")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-24 23:17:19 +01:00
Gregory Etelson
5882bde88d net/mlx5: fix representor interrupts handler
Representor is a port in DPDK that is connected to a VF in such a way
that assuming there are no offload flows, each packet that is sent
from the VF will be received by the corresponding representor. While
each packet that is sent to a representor will be received by the VF.
This is very useful in case of SRIOV mode, where the first packet that
is sent by the VF will be received by the DPDK application which will
decide if this flow should be offloaded to the E-Switch.

Representor shares interrupts handler with host PF over the PCI
address. Therefore, after PF completes its interrupts handler
initialization, no additional actions required for representor.

Fixes: 26c08b979d ("net/mlx5: add port representor awareness")
Cc: stable@dpdk.org

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 18:27:08 +01:00
Andrey Vesnovaty
fa7ad49e96 net/mlx5: fix shared RSS action update
The shared RSS action update was not operational due to lack
of kernel driver support of TIR object modification.
This commit introduces the workaround to support shared RSS
action modify using an indirect queue table update instead of
touching TIR object directly.
Limitations: the only supported RSS property to update is queues, the
rest of the properties ignored.

Fixes: d2046c09aa ("net/mlx5: support shared action for RSS")

Signed-off-by: Andrey Vesnovaty <andreyv@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 16:40:03 +01:00
Tonghao Zhang
5ea8356ec8 net/mlx5: check register available for metadata action
If user don't set the dv_xmeta_en to 1 or 2,
in the flow_dv_convert_action_set_meta function:

- flow_dv_get_metadata_reg may return the REG_NONE,
  when MLX5_METADATA_FDB enabled for metadata set action.

- reg_to_field(REG_NONE) returns MLX5_MODI_OUT_NONE,
  that is invalid and rdma-core fails.

The rdma-core calltrace:
    dr_action_create_modify_action
    dr_actions_convert_modify_header
    dr_action_modify_sw_to_hw
    dr_action_modify_sw_to_hw_set
    dr_ste_get_modify_hdr_hw_field

Fixes: fcc8d2f716 ("net/mlx5: extend flow metadata support")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 15:37:05 +01:00
Alexander Kozyrev
5fc2e5c27d net/mlx5: fix mbuf overflow in vectorized MPRQ
Changing the allocation scheme to improve mbufs locality caused mbufs
overrun in some cases. Revert the previous replenish logic back.
Calculate a number of unused mbufs and replenish max this number of mbufs.

Mark the last 4 mbufs as fake mbufs to prevent overflowing into consumed
mbufs in the future. Keep the consumed index and the produced index 4 mbufs
apart for this purpose.

Replenish some mbufs only in case the consumed index is within the
replenish threshold of the produced index in order to retain the cache
locality for the vectorized MPRQ routine.

Fixes: 5c68764377 ("net/mlx5: improve vectorized MPRQ descriptors locality")

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 15:37:03 +01:00
Viacheslav Ovsiienko
b15af1573a net/mlx5: make Tx scheduling xstats names compliant
xstats names for Tx packet scheduling should be compliant with [1]

[1] http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html?highlight=xstats#extended-statistics-api

Bugzilla ID: 558

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 15:37:02 +01:00
Viacheslav Ovsiienko
1101809b43 net/mlx5: make ethernet xstats names compliant
xstats names for simple stats are mostly standardized in ethdev drivers
and should be compliant with [1]

[1] http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html?highlight=xstats#extended-statistics-api

Bugzilla ID: 558

Reported-by: Igor Ryzhov <iryzhov@nfware.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-22 15:37:00 +01:00
Benoît Ganne
1688c580e8 net/mlx5: allow unknown link speed
mlx5 PMD refuses to update link state if link speed is defined but
status is down or if link speed is undefined but status is up, even if
the ioctl() succeeded.
This prevents application to detect link up/down event, especially when
the link speed is not correctly detected.

Commit [1] allowed returning unknown link speed, so now PMD allows
the return of unknown link speed in the above case.

Due to some old kernel driver bug, link speed wasn't detected properly.

[1] http://git.dpdk.org/dpdk/commit/?id=810b17d116f03

Signed-off-by: Benoît Ganne <bganne@cisco.com>
Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-22 15:36:56 +01:00
Xiaoyu Min
388dd1c9a6 net/mlx5: fix encap/decap limit for hairpin flow split
The rte_flow_item_eth and rte_flow_item_vlan items are refined.
The structs do not exactly represent the packet bits captured on the
wire anymore.
Should use real header instead of the whole struct.

Replace the rte_flow_item_* with the existing corresponding rte_*_hdr.

Fixes: 09315fc838 ("ethdev: add VLAN attributes to ethernet and VLAN items")
Fixes: f9210259ca ("net/mlx5: fix raw encap/decap limit")

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-22 17:07:27 +01:00
Thomas Monjalon
dc328d1c55 ethdev: rename a flow shared action error code
In the experimental function rte_flow_shared_action_destroy()
introduced in DPDK 20.11, the errno ETOOMANYREFS was used.
This errno is not always available on Windows,
so it is preferred using EBUSY instead.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-20 21:10:05 +01:00
Dekel Peled
a2999c7bfe common/mlx5: move to formal ASO action API
Existing code uses the previous API offered by rdma-core in order
to create ASO Flow Hit action.

A general API is now formally released, to create ASO action of any
type. This patch moves the MLX5 PMD code to use the formal API.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Dekel Peled
31ef2982fa net/mlx5: fix input register for ASO object
Existing code uses the hard-coded value REG_C_5 as input for function
mlx5dv_dr_action_create_flow_hit().

This patch updates function mlx5_flow_get_reg_id() to return the
selected REG_C value for ASO Flow Hit operation.
The returned value is used, after reducing offset REG_C_0, as input
for function mlx5dv_dr_action_create_flow_hit().

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Dekel Peled
7ad0b6d91f net/mlx5: fix memory leak on ASO age close
Recent patch introduced the use of ASO flow hit action for age action.
The relevant management struct uses dynamically allocated memory.
This memory was not freed on closing.

This patch adds memory freeing as needed.

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Raslan Darawsheh
3ea12cad71 common/mlx5: fix name for ConnectX VF device ID
Starting ConnectX-6 Dx, the VF device ID is generic
and not per chip.

https://pci-ids.ucw.cz/v2.2/pci.ids
101e  ConnectX Family mlx5Gen Virtual Function

This means that all will have the same VF device ID.

Fixes: 5fc66630be ("net/mlx5: add ConnectX6-DX device ID")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Gregory Etelson
1db3678dbd net/mlx5: fix restore info in non-tunnel traffic
Tunnel offload API provides applications with ability to restore
packet outer headers after partial offload. Exact feature execution
depends on hardware abilities and PMD implementation. Hardware that is
supported by MLX5 PMD places a mark on a packet after partial offload.
PMD decodes that mark and provides application with required
information.
Application can call the restore API for packets that are part of
offloaded tunnel and not. It's up to a PMD to provide correct
information.
Current MLX5 tunnel offload implementation does not allow applications
to use flow MARK actions. It is restricted to tunnel offload use only.
This fault was triggered by application that did not activate tunnel
offload and called the restore API with a marked packet. The PMD tried
to decode the mark value and crashed. The patch decodes mark value
only if tunnel offload is active.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Bing Zhao
2451574a49 net/mlx5: fix eCPRI item value with mask
When creating a flow with eCPRI item, the mask and the value are both
needed in order to build the matching criteria.

In the current implementation, the unused value bits clear operation
was missed when filling the mask and value fields. For the value, the
bits not required were not masked with the mask provided. Indeed,
this action is not mandatory. But when creating a flow in the root
table, the kernel driver got involved and a check would prevent this
flow from being created. The same flow could be created successfully
with the userspace rdma-core on the non-root tables.

An AND operation needs to be added to clear the unused bits in the
value when building the matching criteria. Then the same flow can be
created successfully no matter with kernel driver or with rdma-core.

Fixes: daa38a8924 ("net/mlx5: add flow translation of eCPRI header")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Suanming Mou
01c05ee0e4 net/mlx5: fix sample and mirror flow action deletion
The sample and mirror action objects are maintained on the list
shared between the ports belonging to the same multiport Infiniband
device(between representors).

The actions in the NIC steering domains might contain the references
to the sub-flow action objects created over the given port. The action
deletion might happen in the context of the different port and on the
deletion of referenced objects the incorrect port might be specified.
To avoid this we should save the port on what the sub-flow actions
were created and then use this saved port for sub-flow action release.

This commit saves the create device in the sample and mirror actions
struct to avoid using the incorrect port device in releasing.

Fixes: 1978414169 ("net/mlx5: make sample and mirror action thread safe")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Reviewed-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Suanming Mou
f15b82cdf8 net/mlx5: fix header reformat action hash key
Currently, header reformat action uses the hash list 32-bit key
generated in header reformat register function directly. The key will
not be recalculated in the hash list function.

As the 64-bit key is composed of the 32-bit attributes and 32-bit
reformat buffer csum, the hash list function only gets 32-bit key
directly will take the attribute part only, csum part will be ignored.
For different header reformat actions, the attributes can be the same,
while the buffer will be different. Only take the attribute part causes
lots of the conflicts.

This commits adds the attribute part and the significant different csum
part for the key.

Fixes: f961fd490f ("net/mlx5: make header reformat action thread safe")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Viacheslav Ovsiienko
f9210259ca net/mlx5: fix raw encap/decap limit
The MLX5_ENCAPSULATION_DECISION_SIZE constant is used
to check the raw encap/decap actions for the raw header
size. The header is constructed of the rte_xxx_hdr
structures instead of rte items. Hence, constant
must be defined with rte_xxx_hdr structure sizes.

Fixes: 50f576d657 ("net/mlx5: fix VLAN actions in meter")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Suanming Mou <suanmingm@nvidia.com>
2020-11-20 21:10:05 +01:00
Xueming Li
e6818853c0 net/mlx5: set representor to first PF in bonding mode
When the representor device was set to PF1 in bonding mode, iterating
device iterator that looking for representors by bonding device failed
to match PF0 pci address with PF1 address. So detaching PF bonding
device only detached all representors on PF0.

This patch registers all representors of PF1 with PF0 as PCI device.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Xueming Li
38c6dc2039 net/mlx5: fix flow index type
Fix assertion check warnings.

Fixes: 8bb81f2649 ("net/mlx5: use thread specific flow workspace")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Alexander Kozyrev
af01eeb755 net/mlx5: fix WQE counter assert in free completion queue
The following assertion fails in case RTE_ENABLE_ASSERT is enabled:
PANIC in mlx5_tx_handle_completion():
assert "(txq->fcqs[txq->cq_ci & txq->cqe_m] >> 16)
	== cqe->wqe_counter" failed

The free completion queue only contains an expected WQE counter if
RTE_LIBRTE_MLX5_DEBUG is enabled as well. Thus enabling
RTE_ENABLE_ASSERT alone causes the assert to fail.

Compile the assert conditionally only if RTE_ENABLE_ASSERT is enabled.

Fixes: 0afacb04f5 ("common/mlx5: remove NDEBUG")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Viacheslav Ovsiienko
aaf34de5d8 net/mlx5: add wire vport hint
The kernel can use two approaches to distinguish the E-Switch
source vport in the packet metadata - either with dedicated
source_port field or register C0. To eliminate the extra source
vport matching in the hardware the source_port field can be
set to specific values (0xFFFF) for the wire source port.

This match can be applied to recognize wire port only in FDB
domain. Missing the register C0 match in the NIC Rx domain causes
incorrect representor steering within shared IB device ports
and must be always specified (if kernel uses this approach).

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Didier Pallard
7c085d3a4a net/mlx5: fix Rx descriptor status
Three bugs in rx_queue_count function:
- One entry may contain several segments, so 'used' must be multiplied
  by number of segments per entry to properly reflect the queue usage.
- The number of cqes is equals to (1U << rxq->elts_n) - 1 in SPRQ mode.
  The range returned by rx_queue_count should be the number of entries
  used in queue, so it ranges from 0 to max number of entries
  in queue, not this number minus one.
- For MPRQ mode, we need to take into account of the number of strd.

Fixes: 8788fec1f2 ("net/mlx5: implement descriptor status API")
Cc: stable@dpdk.org

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Maxime Leroy
9ae9720d7f net/mlx5: fix Rx queue count calculation
The commit d2d5760552 ("net/mlx5: fix Rx queue count calculation") is
incorrect because the count calculation is wrong for the next cqe:

Example:

 Compressed Set of packets 1  |   Compressed Set of packets 2
C | a | e0 | e1 | e2 | e3 | e4 | e5 | C | a | e0

There are 2 compressed set of packets in the first queue. For the first
set, n is computed correctly.

But for the second, n is not computed properly. Because the zip context
is for the first set. The  second set is not yet decompressed, so
there are no context.

To fix the issue, we should only use the zip context for the first CQEs
series.

Fixes: d2d5760552 ("net/mlx5: fix Rx queue count calculation")
Cc: stable@dpdk.org

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Xiaoyu Min
765633655c net/mlx5: fix RSS queue type validation
When the RSS queues' types are not uniformed, i.e, mixed with normal Rx
queue and hairpin queue, PMD accept this flow after commit[1] instead of
rejecting it.

This because commit[1] creates Rx queue object as DevX type via DevX API
instead of IBV type via Verbs, in which the latter will check the queues'
type when creating Verbs ind table but the former doesn't check when
creating DevX ind table.

However, in any case, logically PMD should check whether the input
configuration of RSS action is reasonable or not, which should
include queues' type check as well as the others.

So add the check of RSS queues' type in validation function to fix issue.

[1]:
commit 6deb19e1b2 ("net/mlx5: separate Rx queue object creations")

Fixes: 63bd16292c ("net/mlx5: support RSS on hairpin")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Alexander Kozyrev
ff2deada2e net/mlx5: fix Rx packet padding config via DevX
Received packets can be aligned to the size of the cache line on
PCI transactions. This could improve performance by avoiding
partial cache line writes in exchange for increased PCI bandwidth.

This feature is supposed to be controlled by the rxq_pkt_pad_en
devarg and it is true for an RxQ created via the Verbs API.
But in the DevX API case, it is erroneously controlled by the
rxq_cqe_pad_en devarg instead, which is in charge of the CQE
padding instead and should not control the RxQ creation.

Fix DevX RxQ creation by using the proper configuration flag for
Rx packet padding that is being set by the rxq_pkt_pad_en devarg.

Fixes: dc9ceff73c ("net/mlx5: create advanced RxQ via DevX")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-20 21:10:05 +01:00
Gregory Etelson
a0e4728c43 net/mlx5: fix crash in tunnel offload setup
The new flow table resource management API triggered a PMD crash in
tunnel offload mode, when tunnel match flow rule was inserted before
tunnel set rule.

Reason for the crash was double flow table registration. The table was
registered by the tunnel offload code for the first time and once
more by PMD code, as part of general table processing. The table
counter was decremented only once during the rule destruction and
caused a resource leak that triggered the crash.

The patch updates PMD registration with tunnel offload parameters and
removes table registration in tunnel related code.

Fixes: afd7a62514 ("net/mlx5: make flow table cache thread safe")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Gregory Etelson
868d2e342c net/mlx5: fix tunnel offload hub multi-thread protection
The original patch was removing active tunnel offload objects from a
tunnels db list without checking its reference counter value.
That action was leading to a PMD crash.

Current patch isolates tunnels db list into a separate API. That API
manages MT protection of the tunnel offload db.

Fixes: 5b38d8cd46 ("net/mlx5: make tunnel hub list thread safe")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Gregory Etelson
9cac7ded37 net/mlx5: fix tunnel offload object allocation
The original patch allocated tunnel offload objects with invalid
indexes. As the result, PMD tunnel object allocation failed.

In this patch indexed pool provides both an index and memory for a new
tunnel offload object.
Also tunnel offload ipool moved to dv enabled code only.

Fixes: 4ae8825c50 ("net/mlx5: use indexed pool as id generator")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Gregory Etelson
eab3ca4858 net/mlx5: fix structure passing method in function call
Tunnel offload implementation introduced 64 bit-field flow_grp_info
structure. Since the structure size is 64 bits, the code passed that
type by value in function calls.

The patch changes that structure passing method to reference.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Gregory Etelson
bc1d90a3cf net/mlx5: fix build with Direct Verbs disabled
Tunnel offload API is implemented for Direct Verbs environment only.
Current patch re-arranges tunnel related functions for compilation in
non Direct Verbs setups to prevent compilation failures.  The patch
does not introduce new functions.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Gregory Etelson
8b11f9aa7b net/mlx5: fix tunnel offload callback names
Fix mlx5_flow_tunnel_action_release and mlx5_flow_tunnel_item_release
callback names to match tunnel offload names pattern.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:04 +01:00
Michael Baum
2a87415cc9 net/mlx5/linux: fix probing adjustment depending on DevX
Bonding adjustment is done only when DEVX_PORT is supported in the
rdma-core.

Some bonding condition was done even when DEVX_PORT is not supported.

Remove it.

Fixes: 2eb4d0107a ("net/mlx5: refactor PCI probing on Linux")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 01:36:19 +01:00
Michael Baum
436973b0d1 net/mlx5: fix leak on ASO age pools resize failure
In ASO age pools resize, the PMD starts ASO data-path.

When starting ASO data-path is failed, the pools memory was not freed
what caused a memory leak.

Free it.

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 01:36:19 +01:00
Michael Baum
41217cec2f net/mlx5: fix leak on Rx queue creation failure
In Rx queue creation, there is a validation for the Rx configuration.

When scatter offload validation for buffer split is failed, the Rx queue
object memory was not freed what caused a memory leak.

Free it.

Fixes: a0a45e8af7 ("net/mlx5: configure Rx queue for buffer split")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 01:36:19 +01:00
Michael Baum
8b3799531b net/mlx5: remove unused calculation in RSS expansion
The RSS flow expansion get a memory buffer to fill the new patterns of
the expanded flows.
This memory management saves the next address to write into the buffer
in a dedicated variable.

The calculation for the next address was wrongly also done when all the
patterns were ready.

Remove it.

Fixes: 4ed05fcd44 ("ethdev: add flow API to expand RSS flows")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 01:36:19 +01:00
Xueming Li
0064bf4318 net/mlx5: fix nested flow creation
If xmedata mode 1 enabled and create a flow with RSS and mark action,
there was an error that rdma-core failed to create RQT due to wrong
queue definition. This was due to mixed flow creation in thread specific
flow workspace.

This patch introduces nested flow workspace(context data), each flow
uses dedicate flow workspace, pop and restore workspace when nested flow
creation done, the original flow with continue with original flow
workspace. The total number of thread specific flow workspace should be
2 due to only one nested flow creation scenario so far.

Fixes: 8bb81f2649 ("net/mlx5: use thread specific flow workspace")
Fixes: 3ac3d8234b ("net/mlx5: fix index when creating flow")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 23:36:14 +01:00
Xueming Li
733bbf518f net/mlx5: fix Unix socket path
mlx_steering_dump_parser.py tool failed to dump flow due to socket file
name changed.

Change socket file name back to make it consistent.

Fixes: e4b7b8d082 ("common/mlx5: fix PCI driver name")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 23:36:14 +01:00
Suanming Mou
0080b100dd net/mlx5: fix detection of counter offset support
Currently, the counter offset support is discovered by creating the
rule with invalid offset counter and jump action in root table. If
the rule creation fails with EINVAL errno, that mean counter offset
is not supported in root table.

However, jump action may not be supported in some rdma-core version.
In this case, the discover code will not work properly.

This commits changes the jump action to generic drop action. That
makes the discover code to be more compatible.

Fixes: 994829e695 ("net/mlx5: remove single counter container")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 23:36:14 +01:00
Bing Zhao
0746dcabfd net/mlx5: fix hairpin unbind
In the implementation of mlx5_hairpin_unbind, a copy-paste error was
inside. If a single peer Rx port needed to be unbound, it would be
bound again by mistake.

All the hardware resources were released when stopping the device and
no mess of the configuration was introduced. But when trying to unbind
the ports again, the issue would appear.

The typo of the function call is fixed. If there is no hairpin queue
bound between two ports, the unbinding process should be considered
successful.

Fixes: 37cd4501e8 ("net/mlx5: support two ports hairpin mode")

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 23:36:14 +01:00
Xiaoyu Min
6f921f61d4 net/mlx5: validate MPLSoGRE with GRE key
Currently PMD only accept flow which item_mpls directly follow item_gre,
means to match the GRE header without GRE optional field key in MPLSoGRE
encapsulation.

However, for the MPLSoGRE, the GRE header could have the optional field
(i.e, key) according to the RFC. So PMD need to accept this.

Add MLX5_FLOW_LAYER_GRE_KEY into allowed prev_layer to fix

Fixes: a7a0365565 ("net/mlx5: match GRE key and present bits")
Cc: stable@dpdk.org

Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 23:36:14 +01:00
Bing Zhao
1d1f909cf5 net/mlx5: fix check of eCPRI previous layer
Based on the specification, eCPRI can only follow ETH (VLAN) layer
or UDP layer. When creating a flow with eCPRI item, this should be
checked and invalid layout of the layers should be rejected.

Fixes: c7eca23657 ("net/mlx5: add flow validation of eCPRI header")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 23:36:14 +01:00
Alexander Kozyrev
70a3ee6bb7 net/mlx5: fix Rx descriptors info for MPRQ
The number of descriptors configured is returned to a user
via the rxq_info_get API. This number is incorrect for MPRQ.
For SPRQ this number matches the number of mbufs allocated.
For MPRQ we have fewer external MPRQ buffers that can hold
multiple packets in strides of this big buffer. Take that
into account and return the number of MPRQ buffers multiplied
by the number of strides in this case.

Fixes: 26f1bae837 ("net/mlx5: add Rx/Tx burst mode info")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 23:36:14 +01:00
Alexander Kozyrev
5c68764377 net/mlx5: improve vectorized MPRQ descriptors locality
There is a performance penalty for the replenish scheme
used in vectorized Rx burst for both MPRQ and SPRQ.
Mbuf elements are being filled at the end of the mbufs
array and being replenished at the beginning. That leads
to an increase in cache misses and the performance drop.
The more Rx descriptors are used the worse the situation.

Change the allocation scheme for vectorized MPRQ Rx burst:
allocate new mbufs only when consumed mbufs are almost
depleted (always have one burst gap between allocated and
consumed indices). Keeping a small number of mbufs allocated
improves cache locality and improves performance a lot.

Unfortunately, this approach cannot be applied to SPRQ Rx
burst routine. In MPRQ Rx burst we simply copy packets from
external MPRQ buffers or attach these buffers to mbufs.
In SPRQ Rx burst we allow the NIC to fill mbufs for us.
Hence keeping a small number of allocated mbufs will limit
NIC ability to fill as many buffers as possible. This fact
offsets the advantage of better cache locality.

Fixes: 0f20acbf5e ("net/mlx5: implement vectorized MPRQ burst")

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 23:36:14 +01:00
Suanming Mou
fabf8a3724 net/mlx5: fix shared RSS action release
As shared RSS action will be shared by multiple flows, the action
is created as global standalone action and managed only by the
relevant shared action management functions.

Currently, hrxqs will be created by shared RSS action or general
queue action. For hrxqs created by shared RSS action, they should
also only be released with shared RSS action. It's not correct to
release the shared RSS action hrxqs as general queue actions do
in flow destroy.

This commit adds a new fate action type for shared RSS action to
handle the shared RSS action hrxq release correctly.

Fixes: e1592b6c4d ("net/mlx5: make Rx queue thread safe")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 19:43:26 +01:00
Xueming Li
c3ba8ecb76 net/mlx5: fix missing meter packet
For transfer flow with meter, packet was passed without applying flow
action. The group level was multiplied by 10 for group level 65531.

This patch fixes this issue by correcting suffix table group level
calculation.

Fixes: 3e8f3e51fd ("net/mlx5: fix meter table definitions")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 19:43:26 +01:00
Viacheslav Ovsiienko
27b095072b net/mlx5: fix Tx queue completion on stop
The Tx queue completion production index was not reset
on Tx queue stop and there were completions remaining
from the previous queue run. This caused the wrong
completion queue operating and overall Tx queue malfunction
on queue restart.

Fixes: 161d103b23 ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 19:43:25 +01:00
Viacheslav Ovsiienko
70d83ebbbb net/mlx5: fix Rx queue completion index consistency
The Rx queue completion consumer index got temporary
wrong value pointing to the midst of the compressed CQE
session. If application crashed at the moment the next
queue restart caused handling wrong CQEs pointed by index
and losing consuming index synchronization, that made
reliable queue restart impossible.

Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 19:43:25 +01:00
Suanming Mou
3770feb827 net/mlx5: fix hash list entry assert
The entry variable assert in the mlx5_hlist_register() function is not
correct. Remove the invalid entry variable.

Fixes: e69a59227d ("net/mlx5: support concurrent access for hash list")

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 19:43:25 +01:00
Dekel Peled
58df16e08c net/mlx5: fix use of local array for global error
Recent patch uses a local string array as input for function
rte_flow_error_set().
This stack memory may be later used by other code sections,
overwriting the desired error string.

This patch implements an error string for the specific case
requested, of ICMP item not supported in Verbs flow engine.

Fixes: d51475d1bf ("net/mlx5: support item type error message in flow Verbs")

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 19:43:25 +01:00
Jiawei Wang
9ade91dfe8 net/mlx5: fix group value of sample suffix flow
mlx5 PMD split the sampling flow into prefix flow and suffix
flow. On the sample action translation function, the scaled
group value of suffix flow be attached into sample object and
saved into sample resource.

mlx5 PMD fetched the group value from the sample resource to
create the suffix flow. On the mlx5_flow_group_to_table
function the group value of suffix flow was scaled with table
factor again and translated into HW table. That caused the
incorrect group value of sample suffix flow.

The fix introduces a 'skip_scale' flag and sets it to 1 for the
sample suffix flow creation. On the mlx5_flow_group_to_table
function skips the scale with table factor to use the correct
group value.

Fixes: 4ec6360de3 ("net/mlx5: implement tunnel offload")

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 19:43:25 +01:00
Dong Zhou
cdbdcd46a2 net/mlx5: fix switch port id when representor in bonding
In the bonding configurations the port switch id for representors was
composed of pf index in bonding as the 1 MSB and the representor's index
as the remaining 15 LSBs. The special corner case for the host PF
representor on BF setups with representor id 0xFFFF was missed as well.

The new switch port id consists of 4 MSBs for the pf bonding index and
the remaining 12 LSBs for the representor index. The switch port id
ranges for each type of representors are as follows:

Uplink representor(AKA master): 0xFFFF
Host PF representor: 0x<pf_bond>FFF
VF representor: 0x<pf_bond>[0-FFE]

Fixes: bee57a0a35 ("net/mlx5: update switch port id in bonding configuration")
Cc: stable@dpdk.org

Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 19:42:40 +01:00
Dekel Peled
105d214965 net/mlx5: fix aging queue doorbell ringing
Recent patch introduced a new SQ for ASO flow hit management.
This SQ uses two WQEBB's for each WQE.
The SQ producer index is 16 bits wide.

The enqueue loop posts new WQEs to the ASO SQ, using WQE index for
the SQ management.
This 16 bits index multiplied by 2 was wrongly used also for SQ
doorbell ringing.
The multiplication caused the SW index overlapping to be out of sync
with the hardware index, causing it to get stuck.

This patch separates the WQE index management from the doorbell index
management.
So, for each WQE index incrementation by 1, the doorbell index is
incremented by 2.

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 16:26:54 +01:00
Bing Zhao
ebda282cbb net/mlx5: fix eCPRI common header endianness
The input header of a RTE flow item is with network byte order. In
the host with little endian, the bit field order are the same as the
byte order.
When checking the eCPRI message type, the wrong field will be selected.
Fixing to use correct field.

Fixes: daa38a8924 ("net/mlx5: add flow translation of eCPRI header")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 16:26:54 +01:00
Matan Azrad
876b5d52a3 net/mlx5: fix Tx queue stop state
The Tx queue stop API doesn't call the PMD callback when the state of
the queue is stopped.
The drivers should update the state to be stopped when the queue stop
callback is done successfully or when the port is stopped.
The drivers should update the state to be started when the queue start
callback is done successfully or when the port is started.

The driver wrongly didn't update the state as started when the port
start callback was done which kept the state as stopped.
Following call to a queue stop API was not completed by ethdev layer
because the state is already stopped.

Move the state update from the Tx queue setup to the port start
callback.

Fixes: 161d103b23 ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 16:26:54 +01:00
Matan Azrad
95940894f3 net/mlx5: fix Tx queue reference count check
The Txq refcnt 1 value means that there is no real reference to the
queue and only the control configurations are saved in the struct.

The patch below wrongly didn't consider it and caused a leak in the Txq
object resource.

Revert the specific update in the refcnt.

Fixes: b5c8b3e70c ("net/mlx5: use C11 atomics for RxQ/TxQ refcounts")

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 16:26:54 +01:00
Jiawei Wang
992e6df3da common/mlx5: free MR resource on device DMA unmap
mlx5 PMD created the MR (Memory Region) resource on the
mlx5_dma_map call to make the memory available for DMA
operations. On the mlx5_dma_unmap call the MR resource
was not freed but inserted to MR Free list for further
garbage collection.
Actual MR resource destroying happened on device stop
call. That caused the runtime out of memory in case of
application performed multiple DMA map/unmap calls.

The fix immediately frees the MR resource on mlx5_dma_unmap
call not engaging the list. The export for mlx5_mr_free
function from common PMD part is added as well.

Fixes: 989e999d93 ("net/mlx5: support PCI device DMA map and unmap")
Cc: stable@dpdk.org

Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-13 16:26:54 +01:00
Tal Shnaiderman
8178d9be73 net/mlx5: fix SQ resources release in error flow
Fix in error flow in which the function
mlx5_txq_release_devx_sq_resources is called twice by setting the
release object to NULL after the first call

The incorrect flow was introduced in the work done on generic
object creation.

Once an error flow inside mlx5_txq_create_devx_sq_resources
occurs the function will call mlx5_txq_release_devx_sq_resources
however the released pointers are not set to NULL after the release
calls and undefined memory is released in the same call in
mlx5_txq_release_devx_resources.

This results in calls to MLX5_FREE with
an already released memory addresses and assert in mlx5_release_dbr:

EAL: Error: Invalid memory
EAL: Error: Invalid memory

PANIC in mlx5_txq_release_devx_sq_resources():
assert "(mlx5_release_dbr(&txq_obj->txq_ctrl->priv->dbrpgs,
 mlx5_os_get_umem_id (txq_obj->sq_dbrec_page->umem),
 txq_obj->sq_dbrec_offset)) == 0" failed

The fix is setting the released pointers to NULL after the first release
calls.

Fixes: 86d259cec8 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 16:26:54 +01:00
Ophir Munk
6c3dc9ebaa net/mlx5: fix Rx queue object allocation with MPRQ
The space for extra buffer pointers used by MPRQ routines was not
allocated in Rx queue object creation structure causing memory
corruption.
The fix allocates the extra memory for the pointers in case MPRQ is
engaged.

Fixes: a0a45e8af7 ("net/mlx5: configure Rx queue for buffer split")

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-13 16:26:53 +01:00
Viacheslav Ovsiienko
eb63ec0e56 net/mlx5: fix UAR used by ASO queues
The dedicated UAR was allocated for the ASO queues.
The shared UAR created for Tx queues can be used instead.

Fixes: f935ed4b64 ("net/mlx5: support flow hit action for aging")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 10:56:30 +01:00
Viacheslav Ovsiienko
9cc0e99c81 common/mlx5: share UAR allocation routine
This patch introduces the routine to allocate the UAR (User
Access Region) with various memory mapping types. The origin
patch being fixed provided the UAR allocation workaround
for the mlx5 net PMD only. As it was found the other mlx5
based drivers - vdpa and regex are affected by the issue
as well and must be fixed.

Fixes: a0bfe9d56f ("net/mlx5: fix UAR memory mapping type")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 10:56:30 +01:00