numam-dpdk/drivers/net
Bruce Richardson 0b6493fbb0 i40e: improve performance of vector PMD
An analysis of the i40e code using Intel® VTune™ Amplifier 2016 showed
that the code was unexpectedly causing stalls due to "Loads blocked by
Store Forwards". This can occur when a load from memory has to wait
because a prior store to the same address was of a smaller size than the
load, i.e. the stored value cannot be forwarded directly to the load.
[See ref: https://software.intel.com/en-us/node/544454]

These stalls are due to the way in which the data_len values are handled
in the driver. The lengths are extracted using vector operations, but those
16-bit lengths are then assigned using scalar operations, i.e. 16-bit
stores.

These regular 16-bit stores actually have two effects in the code:
* they cause the "Loads blocked by Store Forwards" issues reported
* they also cause the previous loads in the RX function to actually be a
  load followed by a store to an address on the stack, because the 16-bit
  assignment can't be done to an xmm register.

By converting the 16-bit store operations into a sequence of SSE blend
operations, we can ensure that the descriptor loads only occur once, and
avoid both the additional stores and loads from the stack, as well as the
stalls due to the blocked loads.
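
The difference between the two approaches can be sketched as follows. This
is a minimal illustration, not the driver's code: the lane position
(LEN_LANE), the function names, and the use of _mm_set1_epi16 to produce the
length vector are all assumptions made for the example. The attribute syntax
assumes GCC or Clang on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <smmintrin.h>  /* SSE4.1, for _mm_blend_epi16 */

/* Hypothetical layout: lane 7 of a 16-byte descriptor holds the length. */
enum { LEN_LANE = 7 };

/* Scalar variant: a 16-bit store into a 128-bit value held in memory.
 * A later 128-bit load of *desc overlaps this narrower store, which is
 * the pattern that store-to-load forwarding cannot satisfy directly. */
static void set_len_scalar(__m128i *desc, uint16_t len)
{
    memcpy((char *)desc + LEN_LANE * sizeof(len), &len, sizeof(len));
}

/* Blend variant: the value stays in a register. Bit 7 of the immediate
 * (0x80) takes lane 7 from len_vec and every other lane from desc. */
__attribute__((target("sse4.1")))
static __m128i set_len_blend(__m128i desc, __m128i len_vec)
{
    return _mm_blend_epi16(desc, len_vec, 0x80);
}

/* Returns 1 when both variants produce the same eight 16-bit lanes. */
__attribute__((target("sse4.1")))
static int blend_matches_scalar(const uint16_t lanes[8], uint16_t len)
{
    uint16_t out_a[8], out_b[8];
    __m128i a = _mm_loadu_si128((const __m128i *)lanes);
    __m128i b = a;

    set_len_scalar(&a, len);
    /* Assumption: broadcast the length; only lane 7 is actually used. */
    b = set_len_blend(b, _mm_set1_epi16((short)len));

    _mm_storeu_si128((__m128i *)out_a, a);
    _mm_storeu_si128((__m128i *)out_b, b);
    assert(out_b[LEN_LANE] == len);
    return memcmp(out_a, out_b, sizeof(out_a)) == 0;
}
```

Both variants yield the same lanes; the difference is only in how the value
gets there. Whether a given compiler really emits a narrow store for the
scalar variant depends on the surrounding code and optimization level, so
the generated assembly should be checked before drawing conclusions.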

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
2016-05-06 15:51:22 +02:00
af_packet ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
bnx2x ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
bonding eal: add assert macro for debug 2016-05-02 15:31:17 +02:00
cxgbe ethdev: remove deprecated statistics 2016-04-20 13:49:31 +02:00
e1000 ethdev: remove deprecated statistics 2016-04-20 13:49:31 +02:00
ena eal: add assert macro for debug 2016-05-02 15:31:17 +02:00
enic eal: add assert macro for debug 2016-05-02 15:31:17 +02:00
fm10k fm10k: fix packet type for multi-segment packets 2016-05-06 15:51:22 +02:00
i40e i40e: improve performance of vector PMD 2016-05-06 15:51:22 +02:00
ixgbe ixgbe: fix bit shift overflow in VMDQ pool setup 2016-05-06 15:51:22 +02:00
mlx4 ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
mlx5 ethdev: add 100G link speed 2016-04-01 21:38:34 +02:00
mpipe ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
nfp ethdev: remove deprecated statistics 2016-04-20 13:49:31 +02:00
null ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
pcap ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
qede qede: add DCBX support 2016-05-06 15:51:22 +02:00
ring ethdev: redesign link speed config 2016-04-01 21:38:34 +02:00
szedata2 ethdev: add 100G link speed 2016-04-01 21:38:34 +02:00
vhost vhost: enable guest notification only on enabled queues 2016-04-07 19:24:17 +02:00
virtio virtio: use zeroed memory for simple Tx header 2016-04-06 12:27:57 +02:00
vmxnet3 eal: add assert macro for debug 2016-05-02 15:31:17 +02:00
xenvirt eal: add assert macro for debug 2016-05-02 15:31:17 +02:00
Makefile qede: enable PMD build 2016-05-06 15:51:22 +02:00