net/i40e: relax barrier in Tx for NEON

To keep ordering of mixed accesses, 'DMB OSH' is sufficient.
'DSB' inside the I40E_PCI_REG_WRITE is overkill.[1]

This patch fixes this by replacing the 'DSB' with just-sufficient
barriers in the normal PMD and vPMD.

It showed 7% performance uplift on ThunderX2 and 4% on Arm N1SDP.
The test case is the RFC2544 zero-loss test running testpmd.

[1] http://inbox.dpdk.org/dev/CALBAE1M-ezVWCjqCZDBw+MMDEC4O9qf0Kpn89EMdGDajepKoZQ@mail.gmail.com

Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Gavin Hu 2020-04-14 00:40:24 +08:00 committed by Ferruh Yigit
parent 746664d546
commit 6b50c489a3

@@ -72,8 +72,9 @@ i40e_rxq_rearm(struct i40e_rx_queue *rxq)
 	rx_id = (uint16_t)((rxq->rxrearm_start == 0) ?
 			     (rxq->nb_rx_desc - 1) : (rxq->rxrearm_start - 1));

+	rte_cio_wmb();
 	/* Update the tail pointer on the NIC */
-	I40E_PCI_REG_WRITE(rxq->qrx_tail, rx_id);
+	I40E_PCI_REG_WRITE_RELAXED(rxq->qrx_tail, rx_id);
 }

 static inline void
@@ -564,7 +565,8 @@ i40e_xmit_fixed_burst_vec(void *tx_queue, struct rte_mbuf **tx_pkts,

 	txq->tx_tail = tx_id;

-	I40E_PCI_REG_WRITE(txq->qtx_tail, txq->tx_tail);
+	rte_cio_wmb();
+	I40E_PCI_REG_WRITE_RELAXED(txq->qtx_tail, tx_id);

 	return nb_pkts;
 }
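
The ordering pattern the patch relies on (fill the descriptors, issue one
lightweight write barrier, then ring the tail doorbell with a store that
carries no barrier of its own) can be sketched in portable C as below. This
is only an illustration under stated assumptions, not the driver code: every
demo_* name is hypothetical, the tail "register" is a plain volatile word
rather than real MMIO, and a C11 release fence stands in for rte_cio_wmb(),
so it does not emit the same instruction as the 'DMB OSH' class barrier
discussed in the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* hypothetical ring size, must be a power of two */

/* Hypothetical stand-in for a Tx queue: descriptors live in normal memory,
 * the tail doorbell is modelled as a volatile word instead of device MMIO. */
struct demo_txq {
	uint64_t desc[DEMO_RING_SIZE];
	volatile uint32_t tail_reg;
	uint16_t tx_tail;
};

/* Assumption: a C11 release fence as a portable stand-in for rte_cio_wmb().
 * It orders the earlier descriptor stores before the later doorbell store. */
static inline void demo_cio_wmb(void)
{
	atomic_thread_fence(memory_order_release);
}

/* Stand-in for a relaxed register write: a plain store, no barrier inside. */
static inline void demo_reg_write_relaxed(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
}

/* Post nb_pkts descriptors, then publish the new tail exactly once. */
static uint16_t demo_xmit_burst(struct demo_txq *txq, const uint64_t *pkts,
				uint16_t nb_pkts)
{
	uint16_t tx_id = txq->tx_tail;
	uint16_t i;

	assert(nb_pkts <= DEMO_RING_SIZE);

	for (i = 0; i < nb_pkts; i++) {
		txq->desc[tx_id] = pkts[i];
		tx_id = (uint16_t)((tx_id + 1) & (DEMO_RING_SIZE - 1));
	}
	txq->tx_tail = tx_id;

	/* One explicit barrier between the descriptor writes and the doorbell,
	 * instead of a heavier barrier hidden inside the register write. */
	demo_cio_wmb();
	demo_reg_write_relaxed(&txq->tail_reg, tx_id);

	return nb_pkts;
}
```

Calling demo_xmit_burst() on a zero-initialised queue with three packets
leaves tail_reg at 3. The point of the split is that the barrier becomes
visible at the call site and can be chosen to be exactly as strong as the
descriptor-then-doorbell ordering requires.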