numam-dpdk

Zhihong Wang f5472703c0 eal: optimize aligned memcpy on x86

This patch optimizes rte_memcpy for well-aligned cases, where both
dst and src addresses are aligned to the maximum MOV width. It
introduces a dedicated function called rte_memcpy_aligned to handle
the aligned cases with a simplified instruction stream. The existing
rte_memcpy is renamed to rte_memcpy_generic. The selection between
the two is done at the entry of rte_memcpy.
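
The entry-point selection described above can be sketched as follows.
This is a minimal illustration, not the patch itself: every sketch_*
name is hypothetical, the 32-byte mask assumes an AVX2-width MOV (the
real value depends on the build target), and both copy paths are
stand-ins that simply call the C library memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: the maximum MOV width is 32 bytes (AVX2), so mask = 32 - 1. */
#define SKETCH_ALIGNMENT_MASK 0x1F

/* Returns 1 only when both addresses are multiples of the MOV width. */
static inline int
sketch_both_aligned(const void *dst, const void *src)
{
	return (((uintptr_t)dst | (uintptr_t)src) & SKETCH_ALIGNMENT_MASK) == 0;
}

/* Stand-in for the aligned path; a real one would use a simplified
 * instruction stream with no alignment fix-up. */
static inline void *
sketch_memcpy_aligned(void *dst, const void *src, size_t n)
{
	assert(sketch_both_aligned(dst, src));
	return memcpy(dst, src, n);
}

/* Stand-in for the generic path, which also handles unaligned copies. */
static inline void *
sketch_memcpy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Entry point: one OR and one AND decide which path is taken. */
static inline void *
sketch_memcpy(void *dst, const void *src, size_t n)
{
	if (sketch_both_aligned(dst, src))
		return sketch_memcpy_aligned(dst, src, n);
	return sketch_memcpy_generic(dst, src, n);
}
```

OR-ing the two addresses before masking lets a single test cover both
pointers, so the check adds little work to the entry of the copy.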

The existing rte_memcpy is for generic cases: it handles unaligned
copies and makes stores aligned, and it even makes loads aligned for
microarchitectures like Ivy Bridge. However, alignment handling comes
at a price: it adds extra load/store instructions, which can sometimes
cause complications.

Take DPDK Vhost memcpy with the Mergeable Rx Buffer feature as an
example: the copy is aligned and remote, and it is accompanied by a
header write that is also remote. In this case the memcpy instruction
stream should be simplified to reduce extra loads/stores, and thereby
reduce the probability of pipeline stalls caused by full load/store
buffers, so that the actual memcpy instructions are issued and the H/W
prefetcher goes to work as early as possible.

This patch was tested on Ivy Bridge, Haswell and Skylake. It provides
up to 20% gain for Virtio Vhost PVP traffic, with packet sizes ranging
from 64 to 1500 bytes.

The test can also be conducted without a NIC, by setting up loopback
traffic between Virtio and Vhost. For example, modify the macro
TXONLY_DEF_PACKET_LEN to the requested packet size in testpmd.h,
rebuild and start testpmd in both host and guest, then run "start" on
one side and "start tx_first 32" on the other.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>

2017-01-17 16:40:05 +01:00

app

mbuf: add a function to linearize a packet

2017-01-15 19:30:00 +01:00

buildtools

pmdinfogen: fix null dereference

2017-01-06 11:40:30 +01:00

config

ethdev: add Tx preparation

2017-01-04 20:40:15 +01:00

devtools

devtools: skip capitalization check for commit prefixes

2017-01-13 17:03:43 +01:00

doc

net/virtio: setup Rx queue interrupts

2017-01-17 09:26:54 +01:00

drivers

net/virtio: unmap queue/irq when closing

2017-01-17 09:26:59 +01:00

examples

examples/l3fwd-power: fix stop and close on signal

2017-01-17 09:27:16 +01:00

lib

eal: optimize aligned memcpy on x86

2017-01-17 16:40:05 +01:00

tools: move to usertools

2017-01-04 21:17:32 +01:00

pkg

tools: move to usertools

2017-01-04 21:17:32 +01:00

usertools

tools: move to usertools

2017-01-04 21:17:32 +01:00

.gitattributes

improve git diff

2016-11-13 15:25:12 +01:00

.gitignore

doc: generate NIC overview table from ini files

2016-08-03 18:42:17 +02:00

GNUmakefile

pmdinfogen: add buildtools and pmdinfogen utility

2016-07-06 22:34:39 +02:00

LICENSE.GPL

doc: GPL/LGPL licenses

2013-07-25 14:43:06 +02:00

LICENSE.LGPL

doc: fix file format (dos to unix)

2013-09-06 11:43:07 +02:00

MAINTAINERS

tools: move to usertools

2017-01-04 21:17:32 +01:00

Makefile

remove trailing whitespaces

2014-06-11 00:29:34 +02:00

README

doc: add readme file

2015-12-13 22:06:58 +01:00

README

DPDK is a set of libraries and drivers for fast packet processing.
It supports many processor architectures and both FreeBSD and Linux.

The DPDK uses the Open Source BSD license for the core libraries and
drivers. The kernel components are GPLv2 licensed.

Please check the doc directory for release notes,
API documentation, and sample application information.

For questions and usage discussions, subscribe to: users@dpdk.org
Report bugs and issues to the development mailing list: dev@dpdk.org