Commit Graph

604 Commits

Author SHA1 Message Date
Shun Hao
bf62fb7693 net/mlx5: support yellow meter action in hierarchy
This patch adds support for the meter action in yellow meter policy
flows, so the meter action can be used for both green and yellow policy
flows in a meter hierarchy.
Currently the same meter must be used within one meter policy. Packets
passing the green/yellow policy flow carry the previous meter color of
green/yellow into the subsequent meter.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-06-01 09:49:36 +02:00
Shun Hao
6b838de3d5 net/mlx5: support previous meter color aware
This patch adds support for previous-color awareness in meters.
The start_color setting is set to UNDEFINED when creating a meter object
that is color aware.

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-06-01 09:49:30 +02:00
Shun Hao
9267617bb0 net/mlx5: fix meter policy creation assert
Meter policy creation does not belong to the flow rule creation
process, so the thread workspace was not initialized and using it
triggered an assertion error.

This patch removes the incorrect use of the thread workspace in meter
policy creation and adds a flag in the policy instead. When a flow rule
is created, this flag is used to set the mark flag in the thread
workspace.

Fixes: 082becbf1f ("net/mlx5: fix mark enabling for Rx")
Cc: stable@dpdk.org

Signed-off-by: Shun Hao <shunh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-03-07 11:43:11 +01:00
Bing Zhao
cff6aad7af net/mlx5: remove unused reference counter
In the previous implementation, a counter was used to record the number
of references to a table resource: the creation of the table, the jumps
to the table, and the matchers created on the table. Before releasing
the table resource via the driver, it was necessary to ensure that no
reference to this table remained.

After the optimization of resource management, the reference count is
now kept in the hash list entry as a unified solution for all resource
management.

There is no need to keep the "refcnt" field in the table resource
structure, so it is removed to avoid unnecessary memory overhead.

Fixes: afd7a62514 ("net/mlx5: make flow table cache thread safe")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-03-07 11:43:10 +01:00
Dmitry Kozlyuk
17f95513ad net/mlx5: fix link status change detection
Sometimes net/mlx5 devices did not detect link status change to "up".

Each shared device was monitoring IBV_EVENT_PORT_{ACTIVE,ERR}
and queried the link status upon receiving the event.
IBV_EVENT_PORT_ACTIVE is delivered when the logical link status
(UP flag) is set, but the physical link status (RUNNING flag)
may be down at that time, in which case the new link status
would be erroneously considered down.

The IBV interface is insufficient for this task, so monitor interface
events using Netlink instead.

Fixes: 198a3c339a ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-03-01 16:54:07 +01:00
Michael Baum
80f872ee02 net/mlx5: add external Rx queue mapping API
An external queue is a queue that has been created and is managed
outside the PMD. The queue's owner might use the PMD to generate flow
rules using these external queues.

When the queue is created in hardware, it is given a 32-bit ID. In
contrast, the queue index in the PMD is represented by 16 bits. To
enable the use of the PMD to generate flow rules, the queue owner must
provide a mapping between the HW index and a 16-bit index corresponding
to the ethdev API.

This patch adds an API to insert/cancel a mapping between a HW queue ID
and an ethdev queue ID.
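A minimal usage sketch, assuming the PMD-specific helpers exported in
rte_pmd_mlx5.h; the function names and signatures below are inferred
from this description and are assumptions, not a verified reference:

#include <stdint.h>
#include <rte_pmd_mlx5.h>

/* Map a 32-bit HW RQ index to a 16-bit ethdev-level queue index so that
 * flow rules created through the PMD can reference the external queue.
 * All values are placeholders; the PMD may restrict which ethdev-level
 * indexes are allowed for external queues.
 */
static int
external_rxq_example(uint16_t port_id, uint32_t hw_queue_id,
                     uint16_t ethdev_queue_idx)
{
    int ret;

    ret = rte_pmd_mlx5_external_rx_queue_id_map(port_id, ethdev_queue_idx,
                                                hw_queue_id);
    if (ret != 0)
        return ret;
    /* ... create flow rules targeting ethdev_queue_idx here ... */
    return rte_pmd_mlx5_external_rx_queue_id_unmap(port_id,
                                                   ethdev_queue_idx);
}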

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-25 17:33:31 +01:00
Sean Zhang
5c4d491791 net/mlx5: support matching GRE optional fields
This patch adds matching on the optional fields (checksum/key/sequence)
of the GRE header. Matching on the checksum and sequence fields requires
rdma-core support with the misc5 and tunnel_header 0-3 capabilities.

For patterns without checksum or sequence specified, misc keeps being
used for the matching as before; for patterns with checksum or sequence,
the capability is validated first and then misc5 is used for the
matching.

Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-25 16:34:08 +01:00
Suanming Mou
fe3620aaba net/mlx5: add header reformat HW steering action
The HW steering header reformat action can work in bulk mode. In
this case, when the table is created, a bulk of header reformat
actions is allocated at the low level. Afterwards, when a flow is
created, it is enough to simply specify the action index within the
bulk and the encapsulation data for the action.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:23 +01:00
Suanming Mou
1deadfd709 net/mlx5: add HW mark action
The mark action is covered by the tag action internally. When it is
added, the HW adds a tag to the packet. The mark value can be set as
fixed or dynamic, as the action mask indicates.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:22 +01:00
Suanming Mou
3a2f674b6a net/mlx5: add queue and RSS HW steering action
This commit adds the queue and RSS actions. Similar to the jump action,
dynamic ones are added to the action construct list.

Since the queue and RSS actions in a template should not be destroyed
during a port restart, the actions are created with a standalone
indirect table, as an indirect action does. When the port stops, the
indirect table is detached from the action; when the port starts, the
indirect table is attached back to the action.

One more change is made to accelerate action creation. Currently the
mlx5_hrxq_get() function returns the object index instead of the object
pointer. This introduces an extra conversion from the index to the
object by calling mlx5_ipool_get() in most cases, and that extra
conversion hurts multi-thread performance since mlx5_ipool_get() takes
the global lock internally. As the hash Rx queue object itself also
contains the index, returning the object directly achieves better
performance without the global lock.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:22 +01:00
Suanming Mou
f13fab2392 net/mlx5: add flow jump action
The jump action connects different levels of flow tables and allows
packet handling in a chain of flows.

A new action construct data struct is also added in this commit to
handle not only the dynamic jump action but also the other generic
dynamic actions. An action with an empty mask configuration means a
dynamic action, and the dedicated action is created with the flow
action configuration during flow creation. In the dynamic action case,
the action is appended to the table template's action list during
table creation.
When creating the flows, the action list is traversed and the dynamic
action configuration details are picked from the flow actions as the
action construct data struct describes, then the dedicated dynamic
actions are created.

This commit adds the jump action and the generic dynamic action
construct mechanism.

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:21 +01:00
Suanming Mou
c40c061a02 net/mlx5: add basic flow queue operation
HW steering uses an asynchronous queue-based flow rule management
mechanism. The matcher and part of the actions are prepared during
flow table creation. Some remaining actions are constructed during
flow creation if needed.

A flow postpone attribute bit describes whether flow management
should be applied to the HW directly. An extra push function is
provided to force-push all the cached flows to the HW.

Once the flows have been applied to the HW, the pull function is
called to fetch the results of the queued creation/destruction
operations.

The DR rule flow memory is kept in the PMD layer instead of being
allocated from the HW steering layer. When destroying the flow, the
flow rule memory can only be freed after the CQE is received.

The HW queue job descriptor is introduced to convey the flow
information and operation type between the flow insertion/destruction
calls and the pull function.

This commit adds the basic flow queue operations for:
rte_flow_async_create();
rte_flow_async_destroy();
rte_flow_push();
rte_flow_pull();
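A hedged usage sketch of these four calls, with signatures as understood
from the DPDK 22.03 rte_flow API; the template table, pattern and
actions arrays are assumed to be prepared elsewhere:

#include <rte_common.h>
#include <rte_flow.h>

static int
insert_async_flow(uint16_t port_id, uint32_t queue_id,
                  struct rte_flow_template_table *table,
                  const struct rte_flow_item pattern[],
                  const struct rte_flow_action actions[])
{
    const struct rte_flow_op_attr op_attr = { .postpone = 1 };
    struct rte_flow_op_result res[8];
    struct rte_flow_error error;
    struct rte_flow *flow;
    int n;

    /* Enqueue the creation; postpone=1 keeps it cached in the queue. */
    flow = rte_flow_async_create(port_id, queue_id, &op_attr, table,
                                 pattern, 0, actions, 0, NULL, &error);
    if (flow == NULL)
        return -1;
    /* Force-push all cached operations of this queue to the HW. */
    if (rte_flow_push(port_id, queue_id, &error) != 0)
        return -1;
    /* Poll the completions of the queued operations. */
    n = rte_flow_pull(port_id, queue_id, res, RTE_DIM(res), &error);
    return n < 0 ? -1 : 0;
}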

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:20 +01:00
Suanming Mou
d1559d66ed net/mlx5: add table management
A flow table is a group of flows with the same matching criteria
and the same actions defined for them. The table defines rules
that have the same matching fields but different matching
values. For example, when matching on a 5-tuple, the table will be
(IPv4 source + IPv4 dest + s_port + d_port + next_proto)
while the values for each rule will differ.

The templates' relevant matching criteria and action instances
are created at table creation time and saved in the table.
As the table attributes indicate the supported number of flows, the
flow memory is also allocated at the same time.

This commit adds the table management functions.
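A hedged sketch of creating such a table from one pattern template and
one actions template; the attribute and field names follow the rte_flow
template API as understood here, and the group and size are arbitrary:

#include <rte_flow.h>

static struct rte_flow_template_table *
create_table(uint16_t port_id,
             struct rte_flow_pattern_template *pt,
             struct rte_flow_actions_template *at,
             struct rte_flow_error *error)
{
    const struct rte_flow_template_table_attr attr = {
        .flow_attr = { .group = 1, .ingress = 1 },
        .nb_flows = 1 << 16, /* pre-allocate flow memory for 64K rules */
    };

    return rte_flow_template_table_create(port_id, &attr,
                                          &pt, 1, &at, 1, error);
}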

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:19 +01:00
Suanming Mou
836b5c9b5e net/mlx5: add action template management
The action template holds a list of action types that will be
used together in the same rule. The template's action instances
are created only when the template is bound to the dedicated
group, and the created actions are saved per individual group
for best performance. The actions in a group are not shared
with each other unless shared actions are specified.

This commit adds the action template management which stores the
flow action templates.
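A hedged sketch of such a template: the jump group is fixed by a fully
set mask, while the mark value is left dynamic (zero mask) and is taken
from the per-flow actions later. API and structure names are as
understood from the rte_flow template API:

#include <stdint.h>
#include <rte_flow.h>

static struct rte_flow_actions_template *
create_actions_template(uint16_t port_id, struct rte_flow_error *error)
{
    const struct rte_flow_actions_template_attr attr = { .ingress = 1 };
    const struct rte_flow_action_jump jump = { .group = 2 };
    const struct rte_flow_action_jump jump_mask = { .group = UINT32_MAX };
    const struct rte_flow_action_mark mark_mask = { .id = 0 };
    const struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK }, /* value given per flow */
        { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };
    const struct rte_flow_action masks[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark_mask },
        { .type = RTE_FLOW_ACTION_TYPE_JUMP, .conf = &jump_mask },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    return rte_flow_actions_template_create(port_id, &attr,
                                            actions, masks, error);
}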

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:18 +01:00
Suanming Mou
42431df924 net/mlx5: add pattern template management
The pattern template defines flows that have the same matching
fields but different matching values.
For example, when matching on a 5-tuple TCP flow, the template will be
(eth(null) + IPv4(source + dest) + TCP(s_port + d_port)) while
the values for each rule will differ.

Since a pattern template can be used in different domains, the
items are only cached at the pattern template creation stage; when
the template is bound to a dedicated table, the HW criteria are
created and saved to the table. A pattern template can be used by
multiple tables, but different tables create the same criteria and
do not share the matcher with each other, in order to have better
performance.

This commit adds pattern template management.
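A hedged sketch of the 5-tuple TCP template mentioned above, where the
masks select the matched fields and the values come per rule at flow
creation time; API and structure names are as understood from the
rte_flow template API:

#include <stdint.h>
#include <rte_flow.h>

static struct rte_flow_pattern_template *
create_pattern_template(uint16_t port_id, struct rte_flow_error *error)
{
    const struct rte_flow_pattern_template_attr attr = { .ingress = 1 };
    const struct rte_flow_item_ipv4 ipv4_mask = {
        .hdr = { .src_addr = UINT32_MAX, .dst_addr = UINT32_MAX },
    };
    const struct rte_flow_item_tcp tcp_mask = {
        .hdr = { .src_port = UINT16_MAX, .dst_port = UINT16_MAX },
    };
    const struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH }, /* eth(null): no fields */
        { .type = RTE_FLOW_ITEM_TYPE_IPV4, .mask = &ipv4_mask },
        { .type = RTE_FLOW_ITEM_TYPE_TCP, .mask = &tcp_mask },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };

    return rte_flow_pattern_template_create(port_id, &attr, pattern,
                                            error);
}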

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:18 +01:00
Suanming Mou
b401400db2 net/mlx5: add port flow configuration
Hardware steering is the backend supporting the rte_flow_async API in
the mlx5 PMD. The port configuration function creates the queues and
the needed flow management resources.

The PMD layer configuration function allocates the queues' context
and a per-queue job descriptor pool. The job descriptor pool size
is equal to the queue size, and job descriptors are popped from the
pool with a LIFO strategy to convey the flow information during
flow insertion/destruction. Then, while polling the queued operation
result, the flow information is extracted from the job descriptor
and the descriptor is pushed back to the LIFO pool.

This commit creates the flow port queues and the job descriptor pools.
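A hedged sketch of the corresponding application-side call, assuming
the rte_flow_configure() entry point of the asynchronous flow API; the
field names and sizes below are illustrative only:

#include <rte_common.h>
#include <rte_flow.h>

static int
configure_flow_queues(uint16_t port_id, struct rte_flow_error *error)
{
    const struct rte_flow_port_attr port_attr = { .nb_counters = 1024 };
    const struct rte_flow_queue_attr queue_attr = { .size = 256 };
    const struct rte_flow_queue_attr *queue_attr_list[] = { &queue_attr };

    /* One flow queue; its size also bounds the job descriptor pool. */
    return rte_flow_configure(port_id, &port_attr,
                              RTE_DIM(queue_attr_list), queue_attr_list,
                              error);
}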

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:17 +01:00
Suanming Mou
d84c3cf766 net/mlx5: introduce hardware steering enable routine
The new hardware steering engine relies on using dedicated steering WQEs
instead of writing to the low-level steering table entries directly.
In this first implementation the hardware steering engine supports the
new queue-based Flow API; the existing synchronous non-queue-based Flow
API is not supported.

A new dv_flow_en value 2 is added to manage the mlx5 PMD steering engine:

dv_flow_en	rte_flow API	rte_flow_async API
------------------------------------------------
 0		support		not support
 1		support		not support
 2		not support	support

This commit introduces the extra dv_flow_en = 2 to select the new
flow initialization and management operation routines.
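A hedged sketch of selecting the new engine through devargs; the PCI
address is a placeholder and only the dv_flow_en=2 key comes from this
description:

#include <rte_common.h>
#include <rte_eal.h>

int
main(void)
{
    /* dv_flow_en=2: only the queue-based rte_flow_async API is usable. */
    char arg0[] = "app";
    char arg1[] = "-a";
    char arg2[] = "0000:03:00.0,dv_flow_en=2"; /* placeholder address */
    char *argv[] = { arg0, arg1, arg2 };

    if (rte_eal_init(RTE_DIM(argv), argv) < 0)
        return -1;
    /* ... rte_flow_configure() and the asynchronous flow calls ... */
    return rte_eal_cleanup();
}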

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 22:10:17 +01:00
Viacheslav Ovsiienko
2f5122dfc4 net/mlx5: configure Tx queue with send on time offload
The wait-on-time configuration flag is copied to the Tx queue
structure for performance considerations. The timestamp
mask is prepared and stored in the queue structure as well.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-24 13:46:56 +01:00
Haifei Luo
9b57df5575 net/mlx5: refactor getting counter action pointer
Previously, the API flow_dv_query_count_ptr was defined to get a
counter's action pointer. This DV function was called directly, while
the better way is through the callback.

Add one argument to the API mlx5_counter_query and the related callback
counter_query. The added argument is for the counter's action pointer.

Signed-off-by: Haifei Luo <haifeil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-02-23 15:57:34 +01:00
Michael Baum
a729d2f093 common/mlx5: refactor devargs management
Improve the devargs handling in two aspects:
 - Parse the devargs string only once.
 - Return an error and report unknown keys.

The common driver parses the devargs string once into a dictionary, then
provides it to all the drivers' probe functions. Each driver marks in it
which keys it has used, then the common driver receives the updated
dictionary and reports unknown devargs.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:56 +01:00
Michael Baum
45a6df804a net/mlx5: separate per port configuration
Add a configuration structure for the port (ethdev). This structure
contains all configurations coming from devargs which are oriented to
the port. It is a field of the mlx5_priv structure and is updated in
the spawn function for each port.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:54 +01:00
Michael Baum
c4b8620135 net/mlx5: refactor to detect operation by DevX
Add an inline function indicating whether HW object operations can be
created by DevX. It makes the code more readable.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:53 +01:00
Michael Baum
a13ec19c19 net/mlx5: add shared device context config structure
Add a configuration structure for the shared device context. This
structure contains all configurations coming from devargs which are
oriented to the device. It is a field of the shared device context (SH)
structure, and is updated once in the mlx5_alloc_shared_dev_ctx()
function.
This structure cannot be changed when probing again, so a function is
added to prevent it. The mlx5_probe_again_args_validate() function
creates a temporary IB context configuration structure according to the
new devargs attached in the probe-again, then checks that the temporary
structure matches the existing IB context configuration structure.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:52 +01:00
Michael Baum
87af0d1e1b net/mlx5: concentrate all device configurations
Move all device configuration to be performed by the mlx5_os_cap_config()
function instead of the spawn function.
In addition, move all relevant fields from the mlx5_dev_config structure
to mlx5_dev_cap.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:51 +01:00
Michael Baum
91d1cfafc9 net/mlx5: rearrange device attribute structure
Rearrange the mlx5_os_get_dev_attr() function so that it first executes
the queries and only then updates the fields.
In addition, its name is changed in preparation for expanding its
operations to configure the capabilities inside it.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:50 +01:00
Michael Baum
cf004fd33a net/mlx5: add E-Switch mode flag
This patch adds to the SH structure a flag which indicates whether
E-Switch mode is enabled.
When "dv_esw_en" is configured from devargs, it is enabled only in
E-Switch mode. So, once dv_esw_en has been configured, it is enough to
check whether "dv_esw_en" is valid.
This patch also removes the E-Switch mode check where "dv_esw_en" is
checked too.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:49 +01:00
Michael Baum
cf8971db65 net/mlx5: share counter config function
The mlx5_flow_counter_mode_config function exists for both Linux and
Windows with the same name and content.
This patch moves its implementation to the folder shared between the
operating systems, removing the duplication.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:48 +01:00
Michael Baum
e3032e9c73 net/mlx5: share realtime timestamp configure
The realtime timestamp configuration works the same for Linux and
Windows. This patch moves it to a function implemented in the folder
shared between the operating systems, removing the duplication.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:47 +01:00
Michael Baum
6dc0cbc6c6 net/mlx5: remove DevX flag duplication
The shared device context structure has a field named "devx" which
indicates whether DevX is supported.
The common configuration structure also has a field named "devx" with
the same meaning.

There is no need for this duplication, because there is a reference to
the common structure from within the shared device context structure.

This patch removes the field from the shared device context structure
and uses the common config structure instead.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:44 +01:00
Michael Baum
538205614f net/mlx5: remove HCA attribute structure duplication
The HCA attribute structure is a field of the net configuration
structure. It is also a field of the common configuration structure.

There is no need for this duplication, because there is a reference to
the common structure from within the net structures.

This patch removes it from the net configuration structure and uses the
common config structure instead.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:43 +01:00
Michael Baum
cfe0639b30 net/mlx5: remove redundant check of devargs
The device arguments are parsed and updated twice during spawning:
first before creating the shared device context, and again later after
updating a default value for one of the arguments.

This patch consolidates them into a single parsing pass and updates the
default values before it.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:42 +01:00
Michael Baum
dec50e58f7 net/mlx5: remove declaration duplications
These 4 functions are implemented in the mlx5_ethdev.c file:
 - mlx5_dev_infos_get
 - mlx5_fw_version_get
 - mlx5_dev_set_mtu
 - mlx5_hairpin_cap_get

In the mlx5.h file they are declared twice: first under the mlx5.c
section and again under the mlx5_ethdev.c section.

This patch removes the redundant declarations.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-02-21 11:36:41 +01:00
Raja Zidane
082becbf1f net/mlx5: fix mark enabling for Rx
To optimize the datapath, the mlx5 PMD checked for the mark action on
flow creation and flagged the possible destination Rx queues (through
queue/RSS actions), then enabled the mark action logic only for the
flagged Rx queues.

The mark action did not work if no queue/RSS action was in the same
flow, even when the user used multi-group logic to manage the flows.
So, if the mark action is performed in group X and the packet is moved
to group Y > X before being forwarded to the Rx queues, SW did not get
the mark ID into the mbuf.

Flag the Rx datapath to report the mark action for any queue when the
driver detects the first mark action after the dev_start operation.
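For reference, a hedged sketch of how an application reads the reported
mark on the Rx side once the datapath flags it; these are the standard
mbuf fields used by the MARK action, not anything mlx5-specific:

#include <stdint.h>
#include <rte_mbuf.h>

/* Return the flow mark carried by the packet, or 0 if none was set. */
static inline uint32_t
get_flow_mark(const struct rte_mbuf *m)
{
    if (m->ol_flags & RTE_MBUF_F_RX_FDIR_ID)
        return m->hash.fdir.hi;
    return 0;
}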

Fixes: 8e61555657 ("net/mlx5: fix shared RSS and mark actions combination")
Cc: stable@dpdk.org

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-01-26 17:41:11 +01:00
Michael Baum
34776af600 net/mlx5: fix MPRQ stride devargs adjustment
In Multi-Packet RQ creation, the user can choose the number of strides
and their size in bytes, using a specific devarg for each of these two
parameters.
These two parameters determine the size of the WQE, which is their
product.

If the user selects values that are not in the supported range, the PMD
changes them to default values. However, apart from the range
limitations on each parameter individually, there is also a minimum
value for their product. When the user selects values whose product is
lower than the minimum, no adjustment is made and the creation of the
WQE fails.

This patch adds an adjustment in these cases as well. When the user
selects values whose product is lower than the minimum, they are
replaced with the default values.
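A hedged illustration of the rule described above, not the driver code:
with the mprq_log_stride_num and mprq_log_stride_size devargs expressed
as log2 values, the WQE-size check reduces to a sum of logs, and a
too-small product falls back to the defaults. The limits and defaults
here are placeholders:

#include <stdint.h>

static void
adjust_mprq(uint32_t *log_stride_num, uint32_t *log_stride_size,
            uint32_t log_min_wqe_size,
            uint32_t def_log_stride_num, uint32_t def_log_stride_size)
{
    /* WQE size = stride_num * stride_size = 1 << (sum of the logs). */
    if (*log_stride_num + *log_stride_size < log_min_wqe_size) {
        /* Product below the device minimum: fall back to defaults. */
        *log_stride_num = def_log_stride_num;
        *log_stride_size = def_log_stride_size;
    }
}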

Fixes: ecb160456a ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-12-05 12:22:09 +01:00
Michael Baum
0947ed380f net/mlx5: improve stride parameter names
In the striding RQ management there are two important parameters: the
size of a single stride in bytes and the number of strides.

Both the data-path structure and the config structure keep the log of
the above parameters. However, their names do not mention that the
value is a log, which may be misleading, as if the fields represented
the values themselves.

This patch updates their names to describe the values more accurately.

Fixes: ecb160456a ("net/mlx5: add device parameter for MPRQ stride size")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-12-05 12:22:09 +01:00
Josh Soref
7be78d0279 fix spelling in comments and strings
The tool comes from https://github.com/jsoref

Signed-off-by: Josh Soref <jsoref@gmail.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2022-01-11 12:16:53 +01:00
Rongwei Liu
0888c011d5 net/mlx5: fix meter policy validation
When a user specifies a meter policy like "g_actions queue / end
y_actions queue / r_action drop / end", the validation logic missed
setting the meter policy mode and it took a random value from the stack.

Define ALL policy modes for the mentioned cases.

Fixes: 4b7bf3ffb4 ("net/mlx5: support yellow in meter policy validation")
Cc: stable@dpdk.org

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Bing Zhao <bingz@nvidia.com>
2021-11-10 15:44:42 +01:00
Michael Baum
8451e165b8 net/mlx5: workaround MR creation for flow counter
Due to kernel driver / FW issues in direct MKEY creation using the DevX
API, this patch changes the counter MR creation to use the wrapped mkey
API.

Fixes: 5382d28c21 ("net/mlx5: accelerate DV flow counter transactions")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
2021-11-10 15:50:44 +01:00
Michael Baum
5dfa003db5 common/mlx5: fix post doorbell barrier
The rdma-core library can map the doorbell register in two ways,
depending on the environment variable "MLX5_SHUT_UP_BF":

  - as regular cached memory, when the variable is either missing or
    set to zero. This type of mapping may cause significant doorbell
    register write latency and requires an explicit memory write
    barrier to mitigate this issue and prevent write combining.

  - as non-cached memory, when the variable is present and set to a
    non-zero value. This type of mapping may cause a performance impact
    under heavy load conditions but the explicit write memory barrier
    is not required and it may improve core performance.

The UAR creation function maps a doorbell in one of the above ways
according to the system. At run time, it always adds an explicit memory
barrier after the write.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.

The commit [1] solved this problem for a Tx queue. At run time, it
checks the mapping type and issues the memory barrier after writing to
a Tx doorbell register only if it is needed. The mapping type is
extracted directly from the uar_mmap_offset field in the queue
properties.

This patch shares this code between the drivers and extends the above
solution for each of them.

[1] commit 8409a28573
    ("net/mlx5: control transmit doorbell register mapping")

Fixes: f8c97babc9 ("compress/mlx5: add data-path functions")
Fixes: 8e196c08ab ("crypto/mlx5: support enqueue/dequeue operations")
Fixes: 4d4e245ad6 ("regex/mlx5: support enqueue")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
b6e9c33c82 net/mlx5: remove duplicated reference of Tx doorbell
The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmaps it to its own virtual address.
The primary doorbell references were saved in two shared memory
locations: the TxQ structure and a dedicated doorbell array.

Remove the doorbell reference from the TxQ structure and have the
primary processes take the UAR information from the primary doorbell
array.

Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Bing Zhao
e848218741 net/mlx5: check delay drop settings in kernel driver
The delay drop is the common feature managed on per device basis
and the kernel driver is responsible one for the initialization and
rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on, the rearming will be handled once received a timeout
event. Or else, the delay drop will be deactivated after the first
timeout occurs and all the Rx queues won't have this feature.

The PMD is trying to query this flag and warn the application when
some queues are created with delay drop but the flag is off.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-05 17:04:53 +01:00
Bing Zhao
febcac7b46 net/mlx5: support Rx queue delay drop
For the Ethernet RQs, if all receive descriptors are exhausted, the
packets being received will be dropped. This behavior prevents slow or
malicious software entities on the host from affecting the network.
For hairpin cases, however, even though no software is involved in the
packet forwarding from the Rx to the Tx side, some hiccup in the
hardware or back pressure from the Tx side may still cause the
descriptors to be exhausted. In certain scenarios it may be preferable
to configure the device to avoid such packet drops, assuming the
posting of descriptors will resume shortly.

To support this, a new devarg "delay_drop" is introduced. By default,
the delay drop is enabled for hairpin Rx queues and disabled for
standard Rx queues. The value is used as a bit mask:
  - bit 0: enablement for standard Rx queues
  - bit 1: enablement for hairpin Rx queues
This attribute is applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly. A warning log is used to notify the
application when this occurs.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-05 17:04:53 +01:00
Xueming Li
09c2555303 net/mlx5: support shared Rx queue
This patch introduces the shared RxQ. All shared Rx queues with the
same group and queue ID share the same rxq_ctrl. The rxq_ctrl and
rxq_data are shared; all queues from different member ports share the
same WQ and CQ, essentially one Rx WQ, and mbufs are filled into this
singleton WQ.

The shared rxq_data is set into the device Rx queues of all member
ports as the RxQ object used for receiving packets. Polling the queue
of any member port returns packets of any member; mbuf->port is used to
identify the source port.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:50 +01:00
Xueming Li
5cf0707fc7 net/mlx5: remove Rx queue data list from device
The Rx queue data list (priv->rxqs) can be replaced by the Rx queue
list (priv->rxq_privs); remove it and replace accesses with the
universal wrapper API.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:49 +01:00
Xueming Li
5ceb3a02b0 net/mlx5: move Rx queue DevX resource
To support the shared Rx queue, move the DevX RQ, which is a per-queue
resource, to the Rx queue private data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:48 +01:00
Xueming Li
4cda06c3c3 net/mlx5: split Rx queue into shareable and private
To prepare for the shared Rx queue, split the RxQ data into shareable
and private parts.
Struct mlx5_rxq_priv holds per-queue data.
Struct mlx5_rxq_ctrl holds shared queue resources and data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:45 +01:00
Viacheslav Ovsiienko
6dac7d7ff2 net/mlx5: translate flex item pattern into matcher
The matcher is a steering engine entity that represents
the flow pattern for the hardware to match. In order to
provide matching on the flex item pattern, the appropriate
matcher fields should be configured with values and masks
accordingly.

The flex-item-related matcher fields are an array of eight
32-bit fields to match against the data captured by the sample
registers of the configured flex parser. One packet field,
presented in the item pattern, can be split between several
sample registers, and multiple fields can be combined into a
single sample register to optimize hardware resource usage
(the number of sample registers is limited), depending on field
modes, widths and offsets. The actual mapping is complicated
and controlled by special translation data, built by the PMD
on flex item creation.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:40 +01:00
Viacheslav Ovsiienko
b293e8e49d net/mlx5: translate flex item configuration
RTE Flow flex item configuration should be translated
into actual hardware settings:

  - translate header length and next protocol field samplings
  - translate data field sampling: similar fields with the same
    mode and matching-related parameters are relocated and
    grouped to be covered with a minimal number of hardware
    sampling registers (each register can cover an arbitrary
    neighbouring 32 bits in the packet, aligned to a byte
    boundary, and fields with smaller lengths or segments of
    bigger fields can be combined)
  - input and output links translation
  - preparing data for parsing the flex item pattern on flow creation

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:39 +01:00
Gregory Etelson
9086ac093a net/mlx5: add flex parser DevX object management
The DevX flex parsers can be shared between representors
within the same IB context. We should put the flex parser
objects into the shared list and engage the standard
mlx5_list_xxx API to manage them.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:38 +01:00
Viacheslav Ovsiienko
db25cadc08 net/mlx5: add flex item operations
This patch is a preparation step for implementing the
flex item feature in the driver and it provides:

  - external entry point routines for flex item
    creation/deletion

  - flex item object management over the ports.

The flex item object keeps information about
the item created over the port: a reference counter
to track whether the item is in use by some active
flows, and the pointer to the underlying shared DevX
object, providing all the data needed to translate
the flow flex pattern into matcher fields according
to the hardware configuration.

Not too many flex items are supposed to be
created on a port, so the design is optimized
for flow insertion rate rather than memory savings.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:38 +01:00