common/mlx5: fix queue pair ack timeout configuration

The vDPA driver creates two QPs (one queue pair includes one send
queue and one receive queue) per virtio queue to deliver traffic
events from the NIC to SW.
The two QPs (called FW QP and SW QP) are created as loopback QPs,
and the FW QP's SQ is connected to the SW QP's RQ internally.

When a packet is received or sent, HW sends a WQE through the FW QP's
SQ, and SW then gets a CQE from the CQ of the SW QP.

At large scale and under heavy traffic, the SQ's request may fail
to get an ACK from the RQ HW because HW is busy.
The SQ retries the request up to qpc.retry_count times, each time
waiting 4.096 us * 2^(ack_timeout) for the response. If the RQ HW
still does not respond, the SQ goes to an error state.
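
To make the formula above concrete, here is a minimal sketch that
evaluates it for the old value (14) and the new value (16). The helper
names are hypothetical and not part of the mlx5 driver; the 5-bit
range check and the "retry_count * per-attempt wait" bound are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Per-attempt ACK wait in microseconds: 4.096 us * 2^ack_timeout.
 * Hypothetical helper; assumes ack_timeout is a 5-bit field (0..31).
 */
static double
ack_timeout_us(unsigned int ack_timeout)
{
	assert(ack_timeout < 32);
	return 4.096 * (double)(UINT64_C(1) << ack_timeout);
}

/*
 * Rough bound on the total wait before the SQ gives up, assuming
 * (as the text above words it) retry_count waits of equal length.
 */
static double
retry_budget_us(unsigned int ack_timeout, unsigned int retry_count)
{
	return (double)retry_count * ack_timeout_us(ack_timeout);
}
```

Under these assumptions, ack_timeout = 14 gives about 67.1 ms per
attempt and ack_timeout = 16 gives about 268.4 ms; with retry_count = 7
the rough budgets are about 0.47 s and 1.88 s respectively.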

Raise ack_timeout from 14 to 16. 16 is an empirical value; it should
be neither too high nor too low.
Too high a value makes the QP wait too long when a packet is dropped.
Too low a value makes the QP go to an error state (retry-exceeded)
easily.

Fixes: 15c3807e86ab ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Yajun Wu 2022-02-14 08:03:19 +02:00 committed by Raslan Darawsheh
parent dbbdeb8b47
commit 05b54bf089

@@ -2279,7 +2279,7 @@ mlx5_devx_cmd_modify_qp_state(struct mlx5_devx_obj *qp, uint32_t qp_st_mod_op,
case MLX5_CMD_OP_RTR2RTS_QP:
qpc = MLX5_ADDR_OF(rtr2rts_qp_in, &in, qpc);
MLX5_SET(rtr2rts_qp_in, &in, qpn, qp->id);
-MLX5_SET(qpc, qpc, primary_address_path.ack_timeout, 14);
+MLX5_SET(qpc, qpc, primary_address_path.ack_timeout, 16);
MLX5_SET(qpc, qpc, log_ack_req_freq, 0);
MLX5_SET(qpc, qpc, retry_count, 7);
MLX5_SET(qpc, qpc, rnr_retry, 7);