17095 Commits

Author SHA1 Message Date
Raja Zidane
56faca3bc1 crypto/mlx5: fix freeing on probing failure
When calling device close, unset dek is called which destroys a hash list.
In case of error during dev probe, close is called when dek hlist is not
initialized.
Ensure non null list destroy.

Fixes: 90646d6c6e22 ("crypto/mlx5: support basic operations")
Cc: stable@dpdk.org

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-08 19:46:28 +01:00
Raja Zidane
54feeab1c0 common/mlx5: fix DevX queue size overflow
The HW QP/SQ/RQ/CQ queue sizes may be bigger than 64KB.
The width of the variable handled the queue size is 16 bits
which cannot contain the maximum queue size.
Replace the size type to be uint32_t.

Fixes: 9dab4d62b4dc ("common/mlx5: share DevX CQ creation")
Fixes: 38f537635c15 ("common/mlx5: share DevX SQ creation")
Fixes: f9213ab12cf9 ("common/mlx5: share DevX queue pair operations")
Cc: stable@dpdk.org

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-08 19:46:28 +01:00
Harman Kalra
49fdb0ae0d net/mlx4: fix crash on allocation failure
This patch fixes coverity issue by adding a NULL check.

Coverity issue: 373687
Fixes: d61138d4f0e2 ("drivers: remove direct access to interrupt handle")

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-11-08 17:32:42 +01:00
Harman Kalra
aedd054c5c drivers: check interrupt file descriptor validity
This patch fixes coverity issue by adding a check for negative value to
avoid bad bit shift operation and other invalid use of file descriptors.

Coverity issue: 373717, 373697, 373685
Coverity issue: 373723, 373720, 373719, 373718, 373715, 373714, 373713
Coverity issue: 373710, 373707, 373706, 373705, 373704, 373701, 373700
Coverity issue: 373698, 373695, 373692, 373690, 373689
Coverity issue: 373722, 373721, 373709, 373702, 373696
Fixes: d61138d4f0e2 ("drivers: remove direct access to interrupt handle")

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-11-08 17:32:42 +01:00
Elena Agostini
8b8036a66e gpudev: introduce GPU device class library
In heterogeneous computing system, processing is not only in the CPU.
Some tasks can be delegated to devices working in parallel.

The new library gpudev is for dealing with GPGPU computing devices
from a DPDK application running on the CPU.

The infrastructure is prepared to welcome drivers in drivers/gpu/.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-11-08 17:20:52 +01:00
Raja Zidane
f66898ebd0 common/mlx5: fix MMO configuration in DevX queue pair
The QP extension valid bit was not set in the QP creation for MMO
configuration.
That caused the QP not to be connected to the GGA MMO engines,
and any MMO WQE job got CQE with an error.
Set the QP ext bit when MMO is configured.

Fixes: ddda0006188a ("common/mlx5: add MMO configuration for DevX queue pair")

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-08 09:36:27 +01:00
Raja Zidane
4b99fe0577 common/mlx5: fix HCA capabilities PRM alignment
0x20 reserved bytes were missed in the HCA cap PRM structure before the
newly added fields for MMO QP capabilities.
That caused reading MMO QP caps incorrectly.
Add the reserved fields in the HCA cap structure.

Fixes: cbc4c13a255e ("common/mlx5: update MMO HCA capabilities")

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-08 09:29:36 +01:00
Radha Mohan Chintakuntla
ebc539271e dma/cnxk: add statistics
Add the stats function to get the DMA statistics.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2021-11-08 00:08:45 +01:00
Radha Mohan Chintakuntla
3340c3e227 dma/cnxk: add scatter-gather copy
Add the copy_sg function that will do the multiple DMA transfers of
different sizes and different source/destination as well.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2021-11-08 00:08:45 +01:00
Radha Mohan Chintakuntla
b56f1e2dad dma/cnxk: add channel operations
Add functions for the dmadev vchan setup and DMA operations.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2021-11-08 00:08:45 +01:00
Radha Mohan Chintakuntla
53f6d7328b dma/cnxk: create and initialize device on PCI probing
This patch creates and initializes a dmadev device on pci probe.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2021-11-08 00:08:45 +01:00
Radha Mohan Chintakuntla
b6e395692b common/cnxk: add DPI DMA support
Add base support as ROC(Rest of Chip) API which will be used by PMD
dmadev driver.

This patch adds routines to init, fini, configure the DPI DMA device
found in Marvell's CN9k or CN10k SoC families.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2021-11-07 23:29:58 +01:00
Chengwen Feng
569e850b4b dma/hisilicon: support multi-process
This patch add multi-process support for Kunpeng DMA devices.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2021-11-07 20:02:27 +01:00
Chengwen Feng
2db4f0b823 dma/hisilicon: add data path
This patch add data path functions for Kunpeng DMA devices.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2021-11-07 20:02:27 +01:00
Chengwen Feng
3c5f5f03a0 dma/hisilicon: add control path
This patch add control path functions for Kunpeng DMA devices.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2021-11-07 20:02:24 +01:00
Chengwen Feng
9e16317a38 dma/hisilicon: add probing
This patch add dmadev instances create during the PCI probe, and
destroy them during the PCI remove. Internal structures and HW
definitions was also included.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2021-11-07 20:01:52 +01:00
Chengwen Feng
4d0d4cf327 dma/hisilicon: introduce driver skeleton
Add the basic device probe and remove functions and initial
documentation for new hisilicon DMA drivers.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2021-11-07 19:54:19 +01:00
Michael Baum
5dfa003db5 common/mlx5: fix post doorbell barrier
The rdma-core library can map doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":

  - as regular cached memory, the variable is either missing or set to
    zero. This type of mapping may cause the significant doorbell
    register writing latency and requires an explicit memory write
    barrier to mitigate this issue and prevent write combining.

  - as non-cached memory, the variable is present and set to not "0"
    value. This type of mapping may cause performance impact under
    heavy loading conditions but the explicit write memory barrier is
    not required and it may improve core performance.

The UAR creation function maps a doorbell in one of the above ways
according to the system. In run time, it always adds an explicit memory
barrier after writing to.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.

The commit [1] solved this problem for a Tx queue. In run time, it
checks the mapping type and provides the memory barrier after writing to
a Tx doorbell register if it is needed. The mapping type is extracted
directly from the uar_mmap_offset field in the queue properties.

This patch shares this code between the drivers and extends the above
solution for each of them.

[1] commit 8409a28573d3
    ("net/mlx5: control transmit doorbell register mapping")

Fixes: f8c97babc9f4 ("compress/mlx5: add data-path functions")
Fixes: 8e196c08ab53 ("crypto/mlx5: support enqueue/dequeue operations")
Fixes: 4d4e245ad637 ("regex/mlx5: support enqueue")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
b6e9c33c82 net/mlx5: remove duplicated reference of Tx doorbell
The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmap it to its own virtual address.
The primary doorbell references were saved in two shared memory
locations: the TxQ structure and a dedicated doorbell array.

Remove the doorbell reference from the TxQ structure and move the
primary processes to take the UAR information from the primary doorbell
array.

Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
b4371d3d56 common/mlx5: fix doorbell mapping configuration
UAR mapping type can be affected by the devarg tx_db_nc, which can cause
setting the environment variable MLX5_SHUT_UP_BF.
So, the MLX5_SHUT_UP_BF value and the UAR mapping parameter affect the
UAR cache mode.

Wrongly, the devarg was considered for the MLX5_SHUT_UP_BF but not for
the UAR mapping parameter in all the drivers except the net.

Take the tx_db_nc devarg into account for all the drivers.

Fixes: ca1418ce3910 ("common/mlx5: share device context object")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
3f0e54fe00 common/mlx5: fix UAR allocation diagnostics messages
Depending on kernel capabilities and rdma-core version the mapping of
UAR (User Access Region) of desired memory caching type (non-cached or
write combining) might fail. The PMD implements the flexible strategy
of UAR mapping, alternating the type of caching to succeed.

During this process the failure diagnostics messages are emitted.
These messages are merely diagnostics ones and the logging level should
be adjusted to DEBUG.

Fixes: 9cc0e99c81ab0 ("common/mlx5: share UAR allocation routine")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
d1325200ac common/mlx5: remove unreachable branch in UAR allocation
The User Access Region (UAR) provides access to the hardware resources
like Doorbell Register from userspace.
It means the resources should be mapped by the kernel to some virtual
address range. There two types of memory mapping are supported by mlx5
kernel driver:

 MLX5DV_UAR_ALLOC_TYPE_NC - non-cached, all writes promoted directly to
			    hardware.
 MLX5DV_UAR_ALLOC_TYPE_BF - "BlueFlame", all writes might be cached by
			    CPU, and will be flushed to hardware
			    explicitly with memory barriers.

The supported mapping types depend on the platform (x86/ARM/etc), kernel
version, driver version, virtualization environment (hypervisor), etc.

In UAR allocation, if the system supports the allocation with non-cached
mapping, the first attempt is performed with MLX5DV_UAR_ALLOC_TYPE_NC.
Then, if this fails, the next attempt is done with
MLX5DV_UAR_ALLOC_TYPE_BF.

However, the function adds a condition for the case where the first
attempt was performed with MLX5DV_UAR_ALLOC_TYPE_BF, a condition that is
unattainable since the first attempt was always performed with
MLX5DV_UAR_ALLOC_TYPE_NC.

Remove the unreachable code.

Fixes: 9cc0e99c81ab0 ("common/mlx5: share UAR allocation routine")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:21:03 +01:00
Michael Baum
801b4885c5 crypto/mlx5: fix login release in probing and removal
The probe function creates DevX object named login and saves pointer to
it in priv structure.

The remove function releases first the priv structure and then releases
the login object.
However, the pointer to login object is field of priv structure, which
is invalid.

Release the login object and then release the priv structure.

Fixes: debb27ea3442 ("crypto/mlx5: create login object using DevX")
Cc: stable@dpdk.org

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 16:20:35 +01:00
Michael Baum
204891763c common/mlx5: make multi-process MR management port-agnostic
In the multi-process mechanism, there are things that the secondary
process does not perform itself but asks the primary process to perform
for it.
There is a special API for communication between the processes that
receives parameters necessary for the specific action required as well
as a special structure called mp_id that contains the port number of the
processes through which the initial process finds the relevant ETH
device for the processes.

One of the operations performed through this mechanism is the creation
of a memory region, where the secondary process sends the virtual
address as a parameter and the mp_id structure with the port number
inside it.
However, once the memory area management is shared between the drivers
and either port number or ETH device is no longer relevant to them, it
seems unnecessary to continue communicating between the processes
through the mp_id variable.

In this patch we will remove the use of the above structure for all MR
management, and add to the specific parameter of operations a pointer to
the common device that contains everything needed to create/register MR.

Fixes: 9f1d636f3ef08 ("common/mlx5: share MR management")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 14:12:08 +01:00
Michael Baum
334ed198ab common/mlx5: remove redundant parameter in MR search
Memory region management has recently been shared between drivers,
including the search for caches in the data plane.
The initial search in the local linear cache of the queue, usually
yields a result and one should not continue searching in the next level
caches.

The function that searches in the local cache gets the pointer to a
device as a parameter, that is not necessary for its operation
but for subsequent searches (which, as mentioned, usually do not
happen).
Transferring the device to a function and maintaining it, takes some
time and causes some impact on performance.

Add the pointer to the device as a field of the mr_ctrl structure. The
field will be updated during control path and will be used only when
needed in the search.

Fixes: fc59a1ec556b ("common/mlx5: share MR mempool registration")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 14:11:16 +01:00
Michael Baum
6a4e438576 common/mlx5: fix MR search inlining
Memory region management has recently been shared between drivers,
including the search for caches in the data plane.
The initial search in the local linear cache of the queue,
usually yields a result and one should not continue searching
in the next layer caches.

Prior to cache sharing the local linear cache lookup function
was defined with "static inline" attributes,
those were missed in routine commoditizing step
and this caused performance degradation.

Set the common function as static inline.

Fixes: fc59a1ec556b ("common/mlx5: share MR mempool registration")

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-11-07 14:04:47 +01:00
Andrew Rybchenko
5e973b3fa1 common/sfc_efx: fix debug compilation control
efsys.h belongs to common/sfc_efx and common driver debug
toggle should be used instead of net/sfc toggle.

Fixes: 5e111ed87999 ("net/sfc: introduce common driver library")
Cc: stable@dpdk.org

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-11-05 22:04:55 +01:00
Bing Zhao
e848218741 net/mlx5: check delay drop settings in kernel driver
The delay drop is the common feature managed on per device basis
and the kernel driver is responsible one for the initialization and
rearming.

By default, the timeout value is set to activate the delay drop when
the driver is loaded.

A private flag "dropless_rq" is used to control the rearming. Only
when it is on, the rearming will be handled once received a timeout
event. Or else, the delay drop will be deactivated after the first
timeout occurs and all the Rx queues won't have this feature.

The PMD is trying to query this flag and warn the application when
some queues are created with delay drop but the flag is off.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-05 17:04:53 +01:00
Bing Zhao
febcac7b46 net/mlx5: support Rx queue delay drop
For the Ethernet RQs, if there all receiving descriptors are
exhausted, the packets being received will be dropped. This behavior
prevents slow or malicious software entities at the host from
affecting the network. While for hairpin cases, even if there is no
software involved during the packet forwarding from Rx to Tx side,
some hiccup in the hardware or back pressure from Tx side may still
cause the descriptors to be exhausted. In certain scenarios it may be
preferred to configure the device to avoid such packet drops,
assuming the posting of descriptors will resume shortly.

To support this, a new devarg "delay_drop" is introduced. By default,
the delay drop is enabled for hairpin Rx queues and disabled for
standard Rx queues. This value is used as a bit mask:
  - bit 0: enablement of standard Rx queue
  - bit 1: enablement of hairpin Rx queue
And this attribute will be applied to all Rx queues of a device.

The "rq_delay_drop" capability in the HCA_CAP is checked before
creating any queue. If the hardware capabilities do not support
this delay drop, all the Rx queues will still be created without
this attribute, and the devarg setting will be ignored even if it
is specified explicitly. A warning log is used to notify the
application when this occurs.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-05 17:04:53 +01:00
Jiawen Wu
b4ce1520c9 net/txgbe: fix link process in KR mode
Set the 'present' parameter to 0 by default. It is configured by hardware,
users can set it to 1 for manual configuration.

Fixes: f611dada1af8 ("net/txgbe: update link setup process of backplane NICs")
Cc: stable@dpdk.org

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2021-11-05 15:10:21 +01:00
Jie Wang
8cc79a1636 net/i40e: fix forward outer IPv6 VXLAN
Testpmd forwards packets in checksum mode that it need to calculate
the checksum of each layer's protocol. Then it will fill flags and
header length into mbuf.

In process_outer_cksums, HW calculates the outer checksum if
tx_offloads contains outer UDP checksum otherwise SW calculates
the outer checksum.

When tx_offloads contains outer UDP checksum or outer IPv4 checksum,
mbuf will be filled with correct header length.

This patch added outer UDP checksum in tx_offload_capa and
I40E_TX_OFFLOAD_MASK, when we set csum hw outer-udp on that the
engine can forward outer IPv6 VXLAN packets.

Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")
Cc: stable@dpdk.org

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
2021-11-05 05:31:22 +01:00
Viacheslav Ovsiienko
25ed2ebff1 net/mlx5: support shared Rx queue port data path
When receive packet, mlx5 PMD saves mbuf port number from
RxQ data.

To support shared RxQ, save port number into RQ context as user index.
Received packet resolve port number from CQE user index which derived
from RQ context.

Legacy Verbs API doesn't support RQ user index setting, still read from
RxQ port number.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:51 +01:00
Xueming Li
09c2555303 net/mlx5: support shared Rx queue
This patch introduces shared RxQ. All shared Rx queues with same group
and queue ID share the same rxq_ctrl. Rxq_ctrl and rxq_data are shared,
all queues from different member port share same WQ and CQ, essentially
one Rx WQ, mbufs are filled into this singleton WQ.

Shared rxq_data is set into device Rx queues of all member ports as
RxQ object, used for receiving packets. Polling queue of any member
ports returns packets of any member, mbuf->port is used to identify
source port.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:50 +01:00
Xueming Li
5cf0707fc7 net/mlx5: remove Rx queue data list from device
Rx queue data list(priv->rxqs) can be replaced by Rx queue
list(priv->rxq_privs), removes it and replaces with universal wrapper
API.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:49 +01:00
Xueming Li
5ceb3a02b0 net/mlx5: move Rx queue DevX resource
To support shared RX queue, moves DevX RQ which is per queue resource to
Rx queue private data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:48 +01:00
Xueming Li
5db77fef78 net/mlx5: remove port info from shareable Rx queue
To prepare for shared Rx queue, removes port info from shareable Rx
queue control.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:47 +01:00
Xueming Li
44126bd9d0 net/mlx5: move Rx queue hairpin info to private data
Hairpin info of Rx queue can't be shared, moves to private queue data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:47 +01:00
Xueming Li
0cedf34da7 net/mlx5: move Rx queue reference count
Rx queue reference count is counter of RQ, used to count reference to RQ
object. To prepare for shared Rx queue, this patch moves it from
rxq_ctrl to Rx queue private data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:46 +01:00
Xueming Li
4cda06c3c3 net/mlx5: split Rx queue into shareable and private
To prepare shared Rx queue, splits RxQ data into shareable and private.
Struct mlx5_rxq_priv is per queue data.
Struct mlx5_rxq_ctrl is shared queue resources and data.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:45 +01:00
Xueming Li
53232e3b05 net/mlx5: clean Rx queue code
This patch removes unused Rx queue code.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:45 +01:00
Xueming Li
fdb67b84a5 net/mlx5: fix Rx queue memory allocation return value
If error happened during Rx queue mbuf allocation, boolean value
returned. From description, return value should be error number.

This patch returns negative error number.

Fixes: 0f20acbf5eda ("net/mlx5: implement vectorized MPRQ burst")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:44 +01:00
Xueming Li
056c87d07d common/mlx5: support receive memory pool
The hardware Receive Memory Pool (RMP) object holds the destination for
incoming packets/messages that are routed to the RMP through RQs. RMP
enables sharing of memory across multiple Receive Queues. Multiple
Receive Queues can be attached to the same RMP and consume memory
from that shared poll. When using RMPs, completions are reported to the
CQ pointed to by the RQ, user index that set in RQ creation time is
carried to completion entry.

This patch enables RMP based RQ, RMP is created when mlx5_devx_rq.rmp is
set.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:43 +01:00
Xueming Li
ee16071164 common/mlx5: support basic receive memory pool
The hardware Receive Memory Pool (RMP) object holds the destination for
incoming packets/messages that are routed to the RMP through RQs. RMP
enables sharing of memory across multiple Receive Queues. Multiple
Receive Queues can be attached to the same RMP and consume memory
from that shared poll. When using RMPs, completions are reported to the
CQ pointed to by the RQ, and this Completion Queue can be shared as
well.

This patch adds DevX supports of PRM RMP object.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:43 +01:00
Xueming Li
9c777ccfbb common/mlx5: introduce user index field in completion
On ConnectX devices the completion entry provides the dedicated 24-bit
field, that is filled up with some static value assigned at the
Receiving Queue creation moment. This patch declares this field. This is
a preparation step for supporting shared RQs and the field is supposed
to provide actual port index while handling the shared receiving
queue(s).

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:42 +01:00
Xueming Li
68fa62924d net/mlx5: fix Altivec Rx
This patch fixes stale field reference.

Fixes: a18ac6113331 ("net/mlx5: add metadata support to Rx datapath")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:41 +01:00
Gregory Etelson
a23e9b6e3e net/mlx5: handle flex item in flows
Provide flex item recognition, validation and translation
in flow patterns. Track the flex item referencing.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:41 +01:00
Viacheslav Ovsiienko
6dac7d7ff2 net/mlx5: translate flex item pattern into matcher
The matcher is an steering engine entity that represents
the flow pattern to hardware to match. It order to
provide match on the flex item pattern the appropriate
matcher fields should be configured with values and masks
accordingly.

The flex item related matcher fields is an array of eight
32-bit fields to match with data captured by sample registers
of configured flex parser. One packet field, presented in
item pattern can be split between several sample registers,
and multiple fields can be combined together into single
sample register to optimize hardware resources usage
(number os sample registers is limited), depending on field
modes, widths and offsets. Actual mapping is complicated
and controlled by special translation data, built by PMD
on flex item creation.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:40 +01:00
Viacheslav Ovsiienko
b293e8e49d net/mlx5: translate flex item configuration
RTE Flow flex item configuration should be translated
into actual hardware settings:

  - translate header length and next protocol field samplings
  - translate data field sampling, the similar fields with the
    same mode and matching related parameters are relocated
    and grouped to be covered with minimal amount of hardware
    sampling registers (each register can cover arbitrary
    neighbour 32 bits (aligned to byte boundary) in the packet
    and we can combine the fields with smaller lengths or
    segments of bigger fields)
  - input and output links translation
  - preparing data for parsing flex item pattern on flow creation

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:39 +01:00
Gregory Etelson
9086ac093a net/mlx5: add flex parser DevX object management
The DevX flex parsers can be shared between representors
within the same IB context. We should put the flex parser
objects into the shared list and engage the standard
mlx5_list_xxx API to manage ones.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:38 +01:00
Viacheslav Ovsiienko
db25cadc08 net/mlx5: add flex item operations
This patch is a preparation step of implementing
flex item feature in driver and it provides:

  - external entry point routines for flex item
    creation/deletion

  - flex item objects management over the ports.

The flex item object keeps information about
the item created over the port - reference counter
to track whether item is in use by some active
flows and the pointer to underlying shared DevX
object, providing all the data needed to translate
the flow flex pattern into matcher fields according
hardware configuration.

There is not too many flex items supposed to be
created on the port, the design is optimized
rather for flow insertion rate than memory savings.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-11-04 22:55:38 +01:00