net/mlx5: add Tx devargs

This patch introduces new mlx5 PMD devarg options:

- txq_inline_min - specifies the minimal amount of data to be inlined into
  the WQE during Tx operations. NICs may require this minimal data amount
  to operate correctly. The exact value may depend on the NIC operation
  mode, requested offloads, etc.

- txq_inline_max - specifies the maximal packet length to be completely
  inlined into the WQE Ethernet Segment for the ordinary SEND method. If a
  packet is larger than the specified value, the packet data is not copied
  by the driver at all and the data buffer is addressed with a pointer. If
  the packet length is less than or equal to the value, all packet data is
  copied into the WQE.

- txq_inline_mpw - specifies the maximal packet length to be completely
  inlined into the WQE for the Enhanced MPW method.

Driver documentation is also updated.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
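
As a quick usage illustration, here is a minimal sketch of passing the new
keys through DPDK's hot-plug API. The PCI address is hypothetical and the
values simply echo defaults mentioned in the documentation below; the same
devargs string can also be given on the EAL command line with the -w option.

#include <rte_dev.h>
#include <rte_debug.h>

/* Minimal sketch: attach an mlx5 port with the new Tx inline devargs.
 * 0000:03:00.0 is a hypothetical PCI address; 18/290/188 mirror defaults
 * documented below and are shown only as example values. */
static void
probe_with_tx_inline_args(void)
{
	if (rte_dev_probe("0000:03:00.0,txq_inline_min=18,"
			  "txq_inline_max=290,txq_inline_mpw=188") != 0)
		rte_panic("cannot probe mlx5 port\n");
}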
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
@@ -351,24 +351,102 @@ Run-time configuration
 - ``txq_inline`` parameter [int]

   Amount of data to be inlined during TX operations. This parameter is
-  deprecated and ignored, kept for compatibility issue.
+  deprecated and converted to the new parameter ``txq_inline_max`` providing
+  partial compatibility.

 - ``txqs_min_inline`` parameter [int]

-  Enable inline send only when the number of TX queues is greater or equal
+  Enable inline data send only when the number of TX queues is greater than or equal
   to this value.
-  This option should be used in combination with ``txq_inline`` above.
+  This option should be used in combination with ``txq_inline_max`` and
+  ``txq_inline_mpw`` below and does not affect the ``txq_inline_min``
+  settings (see below).
-  On ConnectX-4, ConnectX-4 LX, ConnectX-5, ConnectX-6 and BlueField without
-  Enhanced MPW:
+  If this option is not specified, the default value 16 is used for BlueField
+  and 8 for other platforms.
-        - Disabled by default.
-        - In case ``txq_inline`` is set recommendation is 4.
+  Data inlining consumes CPU cycles, so this option is intended to enable
+  inlining automatically when there are enough Tx queues, which means enough
+  CPU cores, so that PCI bandwidth becomes the critical resource and the CPU
+  is no longer expected to be the bottleneck.
-  On ConnectX-5, ConnectX-6 and BlueField with Enhanced MPW:
+  Copying data into the WQE improves latency and can improve PPS performance
+  when PCI back pressure is detected, and may be useful for scenarios
+  involving heavy traffic on many queues.
-        - Set to 8 by default.
+  Because additional software logic is necessary to handle this mode, this
+  option should be used with care, as it may lower performance when back
+  pressure is not expected.

+- ``txq_inline_min`` parameter [int]
+
+  Minimal amount of data to be inlined into the WQE during Tx operations. NICs
+  may require this minimal data amount to operate correctly. The exact value
+  may depend on the NIC operation mode, requested offloads, etc.
+  If the ``txq_inline_min`` key is present, the specified value (which may be
+  aligned by the driver in order not to exceed the limits and to provide
+  better descriptor space utilization) is used, and it is guaranteed that the
+  requested number of data bytes is inlined into the WQE regardless of other
+  inline settings. This key may also update the ``txq_inline_max`` value
+  (whether defaulted or specified explicitly in devargs) to reserve space for
+  the inline data.
+  If the ``txq_inline_min`` key is not present, the value may be queried by
+  the driver from the NIC via DevX if this feature is available. If DevX is
+  not enabled/supported, the value 18 (covering an L2 header including VLAN)
+  is set for ConnectX-4, the value 58 (covering L2-L4 headers, required by
+  configurations over E-Switch) is set for ConnectX-4 Lx, and 0 is set by
+  default for ConnectX-5 and newer NICs. If a packet is shorter than the
+  ``txq_inline_min`` value, the entire packet is inlined.
+  For ConnectX-4 and ConnectX-4 Lx NICs the driver does not allow setting
+  this value below 18 (minimal L2 header, including VLAN).
+  Please note that this minimal data inlining disengages the eMPW feature
+  (Enhanced Multi-Packet Write), because the latter does not support partial
+  packet inlining. This is not very critical, since minimal data inlining is
+  mostly required by ConnectX-4 and ConnectX-4 Lx, and these NICs do not
+  support the eMPW feature anyway.

+- ``txq_inline_max`` parameter [int]
+
+  Specifies the maximal packet length to be completely inlined into the WQE
+  Ethernet Segment for the ordinary SEND method. If a packet is larger than
+  the specified value, the packet data is not copied by the driver at all and
+  the data buffer is addressed with a pointer. If the packet length is less
+  than or equal to the value, all packet data is copied into the WQE. This
+  may significantly improve PCI bandwidth utilization for short packets but
+  requires extra CPU cycles.
+  The data inline feature is controlled by the number of Tx queues: if the
+  number of Tx queues is larger than the ``txqs_min_inline`` key parameter,
+  the inline feature is engaged; if there are not enough Tx queues (which
+  means not enough CPU cores, so CPU resources are scarce), data inlining is
+  not performed by the driver. Setting ``txqs_min_inline`` to zero always
+  enables data inlining.
+  The default ``txq_inline_max`` value is 290. The specified value may be
+  adjusted by the driver in order not to exceed the limit (930 bytes) and to
+  provide better WQE space filling without gaps; the adjustment is reflected
+  in the debug log.

+- ``txq_inline_mpw`` parameter [int]
+
+  Specifies the maximal packet length to be completely inlined into the WQE
+  for the Enhanced MPW method. If a packet is larger than the specified
+  value, the packet data is not copied and the data buffer is addressed with
+  a pointer. If the packet length is less than or equal to the value, all
+  packet data is copied into the WQE. This may significantly improve PCI
+  bandwidth utilization for short packets but requires extra CPU cycles.
+  The data inline feature is controlled by the number of Tx queues: if the
+  number of Tx queues is larger than the ``txqs_min_inline`` key parameter,
+  the inline feature is engaged; if there are not enough Tx queues (which
+  means not enough CPU cores, so CPU resources are scarce), data inlining is
+  not performed by the driver. Setting ``txqs_min_inline`` to zero always
+  enables data inlining.
+  The default ``txq_inline_mpw`` value is 188. The specified value may be
+  adjusted by the driver in order not to exceed the limit (930 bytes) and to
+  provide better WQE space filling without gaps; the adjustment is reflected
+  in the debug log. Since multiple packets may be included in the same WQE
+  with the Enhanced Multi-Packet Write method and the overall WQE size is
+  limited, it is not recommended to specify large values for
+  ``txq_inline_mpw``.

 - ``txqs_max_vec`` parameter [int]
@@ -376,47 +454,34 @@ Run-time configuration
   equal to this value. This parameter is deprecated and ignored, kept
   for compatibility issue to not prevent driver from probing.

-- ``txq_mpw_en`` parameter [int]
-
-  A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and
-  enhanced multi-packet send (Enhanced MPS) for ConnectX-5, ConnectX-6 and
-  BlueField. MPS allows the TX burst function to pack up multiple packets in
-  a single descriptor session in order to save PCI bandwidth and improve
-  performance at the cost of a slightly higher CPU usage. When
-  ``txq_inline`` is set along with ``txq_mpw_en``, TX burst function tries
-  to copy entire packet data on to TX descriptor instead of including
-  pointer of packet only if there is enough room remained in the
-  descriptor. ``txq_inline`` sets per-descriptor space for either pointers
-  or inlined packets. In addition, Enhanced MPS supports hybrid mode -
-  mixing inlined packets and pointers in the same descriptor.
-  This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO,
-  DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``.
-  When those offloads are requested the MPS send function will not be used.
-  It is currently only supported on the ConnectX-4 Lx, ConnectX-5, ConnectX-6
-  and BlueField families of adapters.
-  On ConnectX-4 Lx the MPW is considered un-secure hence disabled by default.
-  Users which enable the MPW should be aware that application which provides
-  incorrect mbuf descriptors in the Tx burst can lead to serious errors in
-  the host including, on some cases, NIC to get stuck.
-  On ConnectX-5, ConnectX-6 and BlueField the MPW is secure and enabled by
-  default.

 - ``txq_mpw_hdr_dseg_en`` parameter [int]

   A nonzero value enables including two pointers in the first block of TX
+  descriptor. The parameter is deprecated and ignored, kept for compatibility
+  reasons.
-  Effective only when Enhanced MPS is supported. Disabled by default.

 - ``txq_max_inline_len`` parameter [int]

   Maximum size of packet to be inlined. This limits the size of packet to
   be inlined. If the size of a packet is larger than configured value, the
   packet isn't inlined even though there's enough space remained in the
   descriptor. Instead, the packet is included with pointer. This parameter
-  is deprecated.
+  is deprecated and converted directly to ``txq_inline_mpw`` providing full
+  compatibility. Valid only if the eMPW feature is engaged.

+- ``txq_mpw_en`` parameter [int]
+
+  A nonzero value enables Enhanced Multi-Packet Write (eMPW) for ConnectX-5,
+  ConnectX-6 and BlueField. eMPW allows the TX burst function to pack up
+  multiple packets in a single descriptor session in order to save PCI
+  bandwidth and improve performance at the cost of slightly higher CPU usage.
+  When ``txq_inline_mpw`` is set along with ``txq_mpw_en``, the TX burst
+  function copies the entire packet data into the TX descriptor instead of
+  including a pointer to the packet.
+  The Enhanced Multi-Packet Write feature is enabled by default if the NIC
+  supports it and can be disabled by explicitly specifying 0 for the
+  ``txq_mpw_en`` option. Also, if minimal data inlining is requested by a
+  non-zero ``txq_inline_min`` option or reported by the NIC, the eMPW feature
+  is disengaged.

 - ``tx_vec_en`` parameter [int]
@@ -424,12 +489,6 @@ Run-time configuration
   NICs if the number of global Tx queues on the port is less than
   ``txqs_max_vec``. The parameter is deprecated and ignored.
-  This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO,
-  DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``.
-  When those offloads are requested the MPS send function will not be used.
-  Enabled by default on ConnectX-5, ConnectX-6 and BlueField.

 - ``rx_vec_en`` parameter [int]

   A nonzero value enables Rx vector if the port is not configured in
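
Taken together, the thresholds documented above reduce to a simple per-packet
decision on the ordinary SEND path. A rough sketch follows, with a
hypothetical helper name and simplified policy rather than the driver's
actual datapath code:

#include <stdint.h>

/* Illustrative only: how txq_inline_min/max and txqs_min_inline combine
 * for the SEND path. Returns how many leading bytes to copy into the WQE. */
static inline uint32_t
tx_inline_length(uint32_t pkt_len, uint32_t nb_txq, int txqs_min_inline,
		 uint32_t txq_inline_min, uint32_t txq_inline_max)
{
	if (pkt_len <= txq_inline_min)
		return pkt_len;        /* shorter than the minimum: inline all */
	if (txqs_min_inline > 0 && nb_txq < (uint32_t)txqs_min_inline)
		return txq_inline_min; /* too few queues: only the NIC-required bytes */
	if (pkt_len <= txq_inline_max)
		return pkt_len;        /* short packet: copy it entirely into the WQE */
	return txq_inline_min;         /* long packet: pointer plus the minimal inline */
}

With ``txqs_min_inline`` set to zero the queue-count check never fires, which
matches the "always enables data inlining" wording above.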

diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
@@ -116,6 +116,7 @@ New Features
   * Added support for IP-in-IP tunnel.
   * Accelerate flows with count action creation and destroy.
   * Accelerate flows counter query.
+  * Improved Tx datapath performance with enabled HW offloads.

 * **Updated Solarflare network PMD.**

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
@@ -72,6 +72,15 @@
 /* Device parameter to configure inline send. Deprecated, ignored. */
 #define MLX5_TXQ_INLINE "txq_inline"

+/* Device parameter to limit packet size to inline with ordinary SEND. */
+#define MLX5_TXQ_INLINE_MAX "txq_inline_max"
+
+/* Device parameter to configure minimal data size to inline. */
+#define MLX5_TXQ_INLINE_MIN "txq_inline_min"
+
+/* Device parameter to limit packet size to inline with Enhanced MPW. */
+#define MLX5_TXQ_INLINE_MPW "txq_inline_mpw"
+
 /*
  * Device parameter to configure the number of TX queues threshold for
  * enabling inline send.
@@ -1006,7 +1015,15 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	} else if (strcmp(MLX5_RXQS_MIN_MPRQ, key) == 0) {
 		config->mprq.min_rxqs_num = tmp;
 	} else if (strcmp(MLX5_TXQ_INLINE, key) == 0) {
-		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
+		DRV_LOG(WARNING, "%s: deprecated parameter,"
+			" converted to txq_inline_max", key);
+		config->txq_inline_max = tmp;
+	} else if (strcmp(MLX5_TXQ_INLINE_MAX, key) == 0) {
+		config->txq_inline_max = tmp;
+	} else if (strcmp(MLX5_TXQ_INLINE_MIN, key) == 0) {
+		config->txq_inline_min = tmp;
+	} else if (strcmp(MLX5_TXQ_INLINE_MPW, key) == 0) {
+		config->txq_inline_mpw = tmp;
 	} else if (strcmp(MLX5_TXQS_MIN_INLINE, key) == 0) {
 		config->txqs_inline = tmp;
 	} else if (strcmp(MLX5_TXQS_MAX_VEC, key) == 0) {
@@ -1016,7 +1033,9 @@ mlx5_args_check(const char *key, const char *val, void *opaque)
 	} else if (strcmp(MLX5_TXQ_MPW_HDR_DSEG_EN, key) == 0) {
 		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
 	} else if (strcmp(MLX5_TXQ_MAX_INLINE_LEN, key) == 0) {
-		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
+		DRV_LOG(WARNING, "%s: deprecated parameter,"
+			" converted to txq_inline_mpw", key);
+		config->txq_inline_mpw = tmp;
 	} else if (strcmp(MLX5_TX_VEC_EN, key) == 0) {
 		DRV_LOG(WARNING, "%s: deprecated parameter, ignored", key);
 	} else if (strcmp(MLX5_RX_VEC_EN, key) == 0) {
@@ -1064,6 +1083,9 @@ mlx5_args(struct mlx5_dev_config *config, struct rte_devargs *devargs)
 		MLX5_RX_MPRQ_MAX_MEMCPY_LEN,
 		MLX5_RXQS_MIN_MPRQ,
 		MLX5_TXQ_INLINE,
+		MLX5_TXQ_INLINE_MIN,
+		MLX5_TXQ_INLINE_MAX,
+		MLX5_TXQ_INLINE_MPW,
 		MLX5_TXQS_MIN_INLINE,
 		MLX5_TXQS_MAX_VEC,
 		MLX5_TXQ_MPW_EN,
@@ -2026,6 +2048,9 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
 		.hw_padding = 0,
 		.mps = MLX5_ARG_UNSET,
 		.rx_vec_en = 1,
+		.txq_inline_max = MLX5_ARG_UNSET,
+		.txq_inline_min = MLX5_ARG_UNSET,
+		.txq_inline_mpw = MLX5_ARG_UNSET,
 		.txqs_inline = MLX5_ARG_UNSET,
 		.vf_nl_en = 1,
 		.mr_ext_memseg_en = 1,
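
For readers unfamiliar with the plumbing above, a self-contained sketch of
the same ``rte_kvargs`` pattern that ``mlx5_args()``/``mlx5_args_check()``
follow; the names here are simplified stand-ins, not the driver's:

#include <stdlib.h>
#include <string.h>
#include <rte_kvargs.h>

#define ARG_UNSET (-1) /* same idea as MLX5_ARG_UNSET above */

/* Callback in the style of mlx5_args_check(): route one key to its field. */
static int
args_check(const char *key, const char *val, void *opaque)
{
	int *txq_inline_max = opaque;

	if (strcmp(key, "txq_inline_max") == 0)
		*txq_inline_max = atoi(val); /* the driver uses strtoul() with error checks */
	return 0;
}

/* Parse a "txq_inline_max=290,..." style string, the way mlx5_args() walks
 * every key in its params[] list. */
static int
parse_txq_inline_max(const char *devargs_str)
{
	int value = ARG_UNSET;
	struct rte_kvargs *kvlist = rte_kvargs_parse(devargs_str, NULL);

	if (kvlist == NULL)
		return ARG_UNSET;
	rte_kvargs_process(kvlist, "txq_inline_max", args_check, &value);
	rte_kvargs_free(kvlist);
	return value;
}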

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
@@ -198,6 +198,7 @@ struct mlx5_dev_config {
 	unsigned int cqe_comp:1; /* CQE compression is enabled. */
 	unsigned int cqe_pad:1; /* CQE padding is enabled. */
 	unsigned int tso:1; /* Whether TSO is supported. */
+	unsigned int tx_inline:1; /* Engage TX data inlining. */
 	unsigned int rx_vec_en:1; /* Rx vector is enabled. */
 	unsigned int mr_ext_memseg_en:1;
 	/* Whether memseg should be extended for MR creation. */
@@ -223,6 +224,9 @@ struct mlx5_dev_config {
 	unsigned int ind_table_max_size; /* Maximum indirection table size. */
 	unsigned int max_dump_files_num; /* Maximum dump files per queue. */
 	int txqs_inline; /* Queue number threshold for inlining. */
+	int txq_inline_min; /* Minimal amount of data bytes to inline. */
+	int txq_inline_max; /* Max packet size for inlining with SEND. */
+	int txq_inline_mpw; /* Max packet size for inlining with eMPW. */
 	struct mlx5_hca_attr hca_attr; /* HCA attributes. */
 };
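
All three new fields store the raw devargs values, with ``MLX5_ARG_UNSET``
(the -1 sentinel seen in ``mlx5_pci_probe()`` above) meaning "not provided".
A minimal sketch, with an assumed helper name, of how such a sentinel is
typically resolved against the documented defaults:

#define MLX5_ARG_UNSET (-1) /* sentinel: devarg was not provided */

/* Hypothetical helper: fall back to a default when the user set nothing,
 * e.g. resolve_devarg(config->txq_inline_max, 290) for the SEND default
 * or resolve_devarg(config->txq_inline_mpw, 188) for the eMPW default. */
static inline int
resolve_devarg(int configured, int def)
{
	return configured == MLX5_ARG_UNSET ? def : configured;
}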