.. SPDX-License-Identifier: BSD-3-Clause
   Copyright 2019 Mellanox Technologies, Ltd

.. include:: <isonum.txt>

MLX5 vDPA driver
================

The MLX5 vDPA (vhost data path acceleration) driver library
(**librte_vdpa_mlx5**) provides support for **Mellanox ConnectX-6**,
**Mellanox ConnectX-6 Dx** and **Mellanox BlueField** families of
10/25/40/50/100/200 Gb/s adapters as well as their virtual functions (VF) in
SR-IOV context.

.. note::

   This driver is enabled automatically when using the "meson" build system,
   which will detect dependencies.

Design
------

For security reasons and robustness, this driver only deals with virtual
memory addresses. The way resource allocations are handled by the kernel,
combined with hardware specifications that allow handling virtual memory
addresses directly, ensures that DPDK applications cannot access random
physical memory (or memory that does not belong to the current process).

The PMD can use libibverbs and libmlx5 to access the device firmware
or directly the hardware components.
There are different levels of objects and bypassing abilities
to get the best performance:

- Verbs is a complete high-level generic API
- Direct Verbs is a device-specific API
- DevX allows access to firmware objects
- Direct Rules manages flow steering at the low-level hardware layer

Enabling librte_vdpa_mlx5 causes DPDK applications to be linked against
libibverbs.

A Mellanox mlx5 PCI device can be probed by either the net/mlx5 driver or the
vdpa/mlx5 driver, but not by both in parallel. Hence, the user should select
the driver with the ``class`` parameter in the device argument list.
By default, the mlx5 device will be probed by the net/mlx5 driver.
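
For example, the class selection can be done on the EAL command line.
The sketch below assumes the vdpa example application (built as
``dpdk-vdpa``) and uses a placeholder PCI address:

.. code-block:: console

   dpdk-vdpa -a 0000:01:00.0,class=vdpa -- -i
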

Supported NICs
--------------

* Mellanox\ |reg| ConnectX\ |reg|-6 200G MCX654106A-HCAT (2x200G)
* Mellanox\ |reg| ConnectX\ |reg|-6 Dx EN 25G MCX621102AN-ADAT (2x25G)
* Mellanox\ |reg| ConnectX\ |reg|-6 Dx EN 100G MCX623106AN-CDAT (2x100G)
* Mellanox\ |reg| ConnectX\ |reg|-6 Dx EN 200G MCX623105AN-VDAT (1x200G)
* Mellanox\ |reg| BlueField SmartNIC 25G MBF1M332A-ASCAT (2x25G)

Prerequisites
-------------

- Mellanox OFED version: **5.0**,
  see the :doc:`../../nics/mlx5` guide for more Mellanox OFED details.

Compilation option
~~~~~~~~~~~~~~~~~~

The meson option ``ibverbs_link`` is **shared** by default,
but can be configured to have the following values:

- ``dlopen``

  Build PMD with additional code to make it loadable without hard
  dependencies on **libibverbs** nor **libmlx5**, which may not be installed
  on the target system.

  In this mode, their presence is still required for it to run properly;
  however, their absence won't prevent a DPDK application from starting (with
  DPDK shared build disabled) and they won't show up as missing with
  ``ldd(1)``.

  It works by moving these dependencies to a purpose-built rdma-core "glue"
  plug-in which must be installed in a directory whose name is based
  on ``RTE_EAL_PMD_PATH`` suffixed with ``-glue``.

  This option has no performance impact.

- ``static``

  Embed the static flavor of the dependencies **libibverbs** and **libmlx5**
  in the PMD shared library or the executable static binary.

.. note::

   The default armv8a configuration of the meson build sets
   ``RTE_CACHE_LINE_SIZE`` to 128, which degrades performance.
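
As an illustration, the option can be chosen when configuring the build
(the build directory name is arbitrary):

.. code-block:: console

   meson setup build -Dibverbs_link=dlopen
   ninja -C build
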

Run-time configuration
~~~~~~~~~~~~~~~~~~~~~~

- **ethtool** operations on related kernel interfaces also affect the PMD.

Driver options
^^^^^^^^^^^^^^


- ``class`` parameter [string]

  Select the class of the driver that should probe the device.
  ``vdpa`` for the mlx5 vDPA driver.

- ``event_mode`` parameter [int]

  - 0, Completion queue scheduling will be managed by a timer thread which
    automatically adjusts its delays to the coming traffic rate.

  - 1, Completion queue scheduling will be managed by a timer thread with a
    fixed delay time.

  - 2, Completion queue scheduling will be managed by interrupts. Each CQ
    burst arms the CQ in order to get an interrupt event in the next traffic
    burst.

  - Default mode is 1.

- ``event_us`` parameter [int]

  Per-mode microseconds parameter, relevant only for event modes 0 and 1:

  - 0, A nonzero value to set the timer step in microseconds. The timer
    thread's dynamic delay changes in steps of this value. Default value is
    1us.

  - 1, A value to set the fixed timer delay in microseconds. Default value
    is 0us.

- ``no_traffic_time`` parameter [int]

  A nonzero value defines the traffic-off time, in polling cycle time units,
  that moves the driver to no-traffic mode. In this mode the polling is
  stopped and interrupts are configured on the device in order to notify the
  driver of new traffic. Default value is 16.

- ``event_core`` parameter [int]

  CPU core number to set the polling thread affinity to; defaults to the
  control plane CPU.

- ``hw_latency_mode`` parameter [int]

  The completion queue moderation mode:

  - 0, HW default.

  - 1, Latency is counted from the first packet completion report.

  - 2, Latency is counted from the last packet completion.

- ``hw_max_latency_us`` parameter [int]

  - 1 - 4095, The maximum time in microseconds that a packet completion
    report can be delayed.

  - 0, HW default.

- ``hw_max_pending_comp`` parameter [int]

  - 1 - 65535, The maximum number of pending packet completions in an HW
    queue.

  - 0, HW default.
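
Combining several of the options above, a device argument list might look as
follows (a sketch; the PCI address and chosen values are placeholders):

.. code-block:: console

   dpdk-vdpa -a 0000:01:00.0,class=vdpa,event_mode=2,event_core=2,no_traffic_time=32 -- -i
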

Error handling
^^^^^^^^^^^^^^

Upon potential hardware errors, the mlx5 PMD tries to recover, and gives up
if it fails 3 times in 3 seconds; the virtq is then put in the disabled
state. The user should check the log for error information, or query the
vDPA statistics counters to get the error type and count.
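
The statistics counters mentioned above can be read through the generic vDPA
API. The sketch below is illustrative only: it assumes a device probed under
the placeholder name ``0000:01:00.0``, queries virtq 0, and keeps error
handling minimal:

.. code-block:: c

   #include <inttypes.h>
   #include <stdio.h>

   #include <rte_common.h>
   #include <rte_vdpa.h>

   /* Sketch: print the statistics counters of virtq 0 on a vDPA device.
    * "0000:01:00.0" is a placeholder for the probed PCI address. */
   static void
   print_vdpa_stats(void)
   {
           struct rte_vdpa_device *dev;
           struct rte_vdpa_stat_name names[16];
           struct rte_vdpa_stat stats[16];
           int n, i;

           dev = rte_vdpa_find_device_by_name("0000:01:00.0");
           if (dev == NULL)
                   return;

           /* Retrieve the counter names, then the values for virtq 0. */
           n = rte_vdpa_get_stats_names(dev, names, RTE_DIM(names));
           if (n <= 0)
                   return;
           if (n > (int)RTE_DIM(stats))
                   n = RTE_DIM(stats);
           n = rte_vdpa_get_stats(dev, 0, stats, n);
           for (i = 0; i < n; i++)
                   printf("%s: %" PRIu64 "\n",
                          names[stats[i].id].name, stats[i].value);
   }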