The query for the tunnel stateless offloads is wrongly implemented
because of:
1. It was using the device id to query for the offloads.
2. It was using a compilation flag for Verbs which no longer exits.
The main reason was lack of proper API from Verbs.
Fixing the query to use rdma-core API. The capability returned from
rdma-core refer to both Tx and Rx sides.
Eventhough there is a separate cap for GRE and VXLAN, implementation merge
them into a single flag in order to simplify the checks on the data
path.
Fixes: 43e9d9794cde ("net/mlx5: support upstream rdma-core")
Fixes: f5fde5205101 ("net/mlx5: add hardware checksum offload for tunnel packets")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This lays the groundwork for externalizing rdma-core as an optional
run-time dependency instead of a mandatory one.
No functional change.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
When no memory is available on the same numa node than the device, the
initialization of the device fails. However, the use case where the
cores and memory are on a different socket than the device is valid,
even if not optimal.
To fix this issue, this commit introduces an infrastructure to select
the socket on which to allocate the verbs objects based on the ethdev
configuration and the object type, rather than the PCI numa node.
Fixes: 1e3a39f72d5d ("net/mlx5: allocate verbs object into shared memory")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.
More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Move device configuration and features capabilities to its own structure.
This structure is filled by mlx5_pci_probe(), outside of this function
it should be treated as *read only*.
This configuration struct will be used for the Tx/Rx queue setup to
select the Tx/Rx queue parameters based on the user configuration and
device capabilities.
In addition it will be used by the burst selection function to decide
on the best pkt burst to be used.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Since the secondary process has its own devops, function which cannot be
called by the secondary don't need anymore to verify which process is
calling it.
Fixes: 87ec44ce1651 ("net/mlx5: add operations for secondary process")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_get_priv() is barely use across the driver. To avoid mixing access,
this function is definitely removed.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
While the PMD avoids from creating hash RXQ with no hash fields and
array of queues after the port was already started, it lacks such
protection when re-creating the flows after the port restarts.
This may lead to inconsistent behavior for flows depending if they were
created before or after the port start.
Fixes: 8086cf08b2f0 ("net/mlx5: handle RSS hash configuration in RSS flow")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Indirection table size must be in log to communicate with verbs when the
number of queue is not a power of two, the maximum indirection table
size is use, but not converted to log2. This makes a memory corruption.
Fixes: 4c7a0f5ff876 ("net/mlx5: make indirection tables shareable")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Interrupt disable can be called when the interrupt vector is not yet
allocated. Such case ends up with segmentation fault.
Fixing it by adding verification for interrupt vector validity.
Fixes: 09cb5b581762 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The size of Rx completion queue should be doubled if compression is enabled
in case of non-vectorized Rx.
Fixes: 523f5a742102 ("net/mlx5: fix configuration of Rx CQE compression")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
With the upstream rdma-core, to enable Rx CQE compression,
mlx5dv_create_cq() in Direct Verbs has to be used instead of regular
Verbs call (ibv_create_cq()). And if the size of CQE is 128 bytes,
compression is supported only by certain devices. Thus, it has to be
decided by checking the capability bits.
Fixes: 43e9d9794cde ("net/mlx5: support upstream rdma-core")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
From this commit the RSS support becomes un-available until it is
replaced by the generic flow implementation.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Since RSS configuration can also be used by flow API, there is no more
necessity to keep a list of RSS configurable for each protocol.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
RSS hash configuration is currently ignored by the PMD, this commits
removes the RSS feature.
This functionality will be added in a later commit.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
RSS hash configuration is currently ignored by the PMD, this commits
removes the RSS feature on promiscuous mode.
This functionality will be added in a later commit.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
RSS hash configuration is currently ignored by the PMD, this commits
removes the RSS feature on promiscuous mode.
This functionality will be added in a later commit.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
drivers/net/mlx5/mlx5_rxq.c:606:6: error: comparison of constant 4
with expression of type 'enum hash_rxq_flow_type' is always true
[-Werror,-Wtautological-constant-out-of-range-compare]
i != (int)RTE_DIM((*priv->hash_rxqs)[0].special_flow);
~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Clang expects to have an index going upto special_flow size which is
defined by MLX5_MAX_SPECIAL_FLOWS and value is 4. Comparing to an
unrelated enum where index my be lower cause this compilation issue.
Fixes: 36351ea34b92 ("net/mlx: fix build with icc")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Hash Rx queue is an high level queue providing the RSS hash algorithm,
key and indirection table to spread the packets. Those objects can be
easily shared between several Verbs flows. This commit bring this
capability to the PMD.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Indirection table in verbs side resides in a list of final work queues
to spread the packets according to an higher level queue. This
indirection table can be shared among the hash Rx queues which points
to them.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Use the same design for DPDK queue as for Verbs queue for symmetry, this
also helps in fixing some issues like the DPDK release queue API which
is not expected to fail. With such design, the queue is released when
the reference counters reaches 0.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Move verbs object to their own functions to allocate/release them
independently from the DPDK queue. At the same time a reference counter
is added to help in issues detections when the queue is being release
but still in use somewhere else (flows for instance).
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch introduce the Memory region as a shared object where users
should get a reference to it by calling the priv_mr_get() or
priv_mr_new() to create the memory region. This last one will
register the memory pool in the kernel driver and retrieve the
associated memory region.
This should help to reduce the memory consumption cause by registering
multiple times the same memory pool.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This flag is already present in the Ethernet device.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Generic flow API should be use for flow steering as is provides a better
and easier way to configure flows.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
rte_eth_dev created by primary process were not available in secondary
process, it was not possible to use the primary process local memory
object from a secondary process.
This patch modify the reference of primary rte_eth_dev object, use
local rte_eth_dev secondary process instead.
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This removes the dependency on specific Mellanox OFED libraries by
using the upstream rdma-core and linux upstream community code.
Both rdma-core upstream and Mellanox OFED are Linux user-space packages:
1. Rdma-core is Linux upstream user-space package.(Generic)
2. Mellanox OFED is Mellanox's Linux user-space package.(Proprietary)
The difference between the two are the APIs towards the kernel.
Support for x86-32 is removed due to issues in rdma-core library.
ICC compilation will be supported as soon as the following patch is
integrated in rdma-core:
https://marc.info/?l=linux-rdma&m=150643474705690&w=2
Signed-off-by: Shachar Beiser <shacharbe@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Secondary process is a copy/paste of the mlx4 drivers, it was never
tested and it even segfault at the secondary process start in the
mlx5_pci_probe().
This makes more sense to wipe this non working feature to re-write a
working and functional version.
Fixes: a48deada651b ("mlx5: allow operation in secondary processes")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Those are useless since DPDK headers have been cleaned up.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Vector PMD returns buffers to the application without setting the pointers
in the Rx queue to null nor allocating them. When the PMD cleanup the ring
it needs to take a special care to those pointers to not free the mbufs
before the application have used them nor if the application have already
freed them.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
To use the vector, it needs to add to the PMD Rx mbuf ring four extra mbuf
to avoid memory corruption. This additional mbuf are added on dev_start()
whereas all other mbuf are allocated on queue setup.
This patch brings this allocation back to the same place as other mbuf
allocation.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch prepare the merge of fake mbuf allocation needed by the vector
code with rxq_alloc_elts() where all mbuf of the queues should be
allocated.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Unlike mlx5_rx_burst(), mlx5_rx_burst_vec() doesn't replace completed
buffers one by one right after completion is processed but replenishes
multiple buffers later with rte_mempool_get_bulk(). Therefore, there could
be some buffer addresses left in the SW ring (rxq->elts[]) which have
already been delivered to application. As PMD doesn't own such buffers, it
must not be freed by PMD. "Trimming" is needed before cleanup.
A problem can be seen when quitting testpmd when
CONFIG_RTE_LIBRTE_MBUF_DEBUG=y and CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y
Trimming should be as simple as possible, it shouldn't touch any indexes
and buffer allocation isn't necessary.
Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Changing the MTU is not related to changing the number of segments,
activating or not the multi-segment support should be handled by the
application.
Fixes: 9964b965ad69 ("net/mlx5: re-add Rx scatter support")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
To make vectorized burst routines enabled, it is required to run on x86_64
architecture. If all the conditions are met, the vectorized burst functions
are enabled automatically. The decision is made individually on RX and TX.
There's no PMD option to make a selection.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The callbacks are global to a device but the selection is made every queue
configuration, which is redundant.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit addresses various issues that may lead to undefined behavior
when configuring Rx interrupts.
While failure to create a Rx queue completion channel in rxq_ctrl_setup()
prevents that queue from being created, existing queues still have theirs.
Since the error handler disables dev_conf.intr_conf.rxq as well, subsequent
calls to rxq_ctrl_setup() create Rx queues without interrupts. This leads
to a scenario where not all Rx queues support interrupts; missing checks on
the presence of completion channels may crash the application.
Considering that the PMD is not supposed to disable user-provided
configuration parameters (dev_conf.intr_conf.rxq), and that these can
change for subsequent rxq_ctrl_setup() calls anyway, properly supporting a
mixed mode where not all Rx queues have interrupts enabled is a better
approach.
To do so with a minimum set of changes, priv_intr_efd_enable() and
priv_create_intr_vec() are first refactored as a single
priv_rx_intr_vec_enable() function (same for their "disable" counterparts).
Since they had to be used together, there was no point in keeping them
separate.
Remaining changes:
- Always clean up before reconfiguring interrupts to avoid memory leaks.
- Always clean up when closing the device.
- Use malloc()/free() instead of their rte_*() counterparts since there is
no need to store the vector in huge pages-backed memory.
- Allow more Rx queues than the size of the event file descriptor array as
long as Rx interrupts are not requested on all of them.
- Properly clean up interrupt handle when disabling Rx interrupts (nb_efd
and intr_vec reset to 0).
- Check completion channel presence while toggling Rx interrupts on a given
queue.
Fixes: 3c7d44af252a ("net/mlx5: support user space Rx interrupt event")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A negative return value is documented for that function in case of error.
Fixes: 3c7d44af252a ("net/mlx5: support user space Rx interrupt event")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Not exposing Rx interrupts callbacks when this feature is unsupported is
less intrusive than having two different versions for these functions.
Fixes: 3c7d44af252a ("net/mlx5: support user space Rx interrupt event")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
These functions do not belong to the data path. Their prototypes are also
misplaced.
Fixes: 3c7d44af252a ("net/mlx5: support user space Rx interrupt event")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>