Add support for setting the hash lookup table size according
to the hardware capability.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add support for LSC interrupts from bonded devices to the link
bonding library, with supporting unit tests in the test application.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
This is a Linux-specific virtual PMD driver backed by an AF_PACKET
socket. This implementation uses mmap'ed ring buffers to limit copying
and user/kernel transitions. The PACKET_FANOUT_HASH behavior of
AF_PACKET is used for frame reception. In the current implementation,
Tx and Rx queues are always paired, and therefore are always equal
in number -- changing this would be a Simple Matter Of Programming.
Interfaces of this type are created with a command line option like
"--vdev=eth_af_packet0,iface=...". There are a number of options available
as arguments:
- Interface is chosen by "iface" (required)
- Number of queue pairs set by "qpairs" (optional, default: 1)
- AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
- AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
- AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[Thomas: disable because of incompatibility with some kernels]
set link-up and set link-down were not included
in the help command.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch fixes two occurrences where a call to strncmp had the closing
parenthesis in the wrong place. It changes this form:
if (strncmp(X,Y,sizeof(X) != 0))
which does a comparison of length 1, to
if (strncmp(X,Y,sizeof(X)) != 0)
which does the correct length comparison and then compares the result to
zero in the "if" part, as the author presumably originally intended.
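The difference between the two forms can be illustrated with a minimal
sketch. The struct and helper names below are hypothetical and are not
the code touched by this patch; the buggy length is computed in a
separate variable only to make the evaluation explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size name buffer, standing in for X in the text. */
struct name {
	char s[8];
};

/* Buggy form: "sizeof(X) != 0" is a comparison that evaluates to 1,
 * so strncmp() only looks at the first character. */
static int differs_buggy(const struct name *x, const char *y)
{
	size_t n = (sizeof(x->s) != 0); /* n == 1 */

	if (strncmp(x->s, y, n))
		return 1;
	return 0;
}

/* Fixed form: compare up to sizeof(X) characters, then test the result. */
static int differs_fixed(const struct name *x, const char *y)
{
	assert(y != NULL);
	return strncmp(x->s, y, sizeof(x->s)) != 0;
}
```

With these helpers, "port0" and "port1" are reported as equal by the
buggy form because they share their first character, while the fixed
form reports them as different.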
Reported-by: Larry Wang <liang-min.wang@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
eal_flags and multiprocess unit tests use --file-prefix option
which is not supported in FreeBSD, so it has been removed
if compiled for this OS.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Some features of the cmdline were broken in FreeBSD as a result of
termios not being compiled.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It eliminates a race between threads using rte_alarm_cancel() and
rte_alarm_set().
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
During initialization of rte_table_hash_ext and rte_table_hash_lru, a
contiguous region of memory is allocated to store meta data, buckets,
extended buckets, keys, stack of keys, stack of extended buckets and
data entries. The size of each region depends on the hash table
configuration.
The address of each region is calculated using offsets relative to the
beginning of the memory region. Without this patch, the offsets
contain the size of the table meta data (sizeof(struct
rte_table_hash)). These addresses are stored in pointers which are
used when entries are added or deleted and lookups are performed.
Instead of adding these offsets to the address of the beginning of the
memory region, they are added to the address of the end of the meta
data (= address of the beginning of the memory region + sizeof(struct
rte_table_hash)). The resulting addresses are off by sizeof(struct
rte_table_hash) bytes. As a consequence, memory past the allocated
region can be accessed by the add, delete and lookup operations.
This patch corrects the address calculation by not including the size
of the meta data in the offsets.
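The corrected calculation can be sketched as follows. This is a
simplified illustration, not the library code: struct table_meta, its
fields and the two-region layout are assumptions made for the example.
The offsets start at 0 because they are added to the address of the end
of the meta data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified header; the real struct rte_table_hash differs. */
struct table_meta {
	uint8_t *buckets;
	uint8_t *keys;
	size_t total_size;
	uint8_t memory[]; /* regions start right after the meta data */
};

static struct table_meta *table_create(size_t bucket_bytes, size_t key_bytes)
{
	/* Offsets are relative to the end of the meta data, so the size of
	 * the meta data must not be included in them. */
	size_t bucket_offset = 0;
	size_t key_offset = bucket_offset + bucket_bytes;
	size_t total = sizeof(struct table_meta) + bucket_bytes + key_bytes;
	struct table_meta *t = calloc(1, total);

	if (t == NULL)
		return NULL;
	t->total_size = total;
	t->buckets = &t->memory[bucket_offset];
	t->keys = &t->memory[key_offset];
	return t;
}

/* Returns 1 if the end of the last region lies inside the allocation. */
static int table_in_bounds(const struct table_meta *t, size_t key_bytes)
{
	const uint8_t *base = (const uint8_t *)t;

	assert(t != NULL);
	return t->keys + key_bytes <= base + t->total_size;
}
```

If sizeof(struct table_meta) were added to bucket_offset as well, every
region pointer would be shifted by that amount and the last region would
extend past the allocation.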
Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
During initialization of rte_table_hash_ext and rte_table_hash_lru,
t->data_size_shl is calculated. This member contains the number of
bits to shift left during calculation of the location of entries in
the hash table. To determine the number of bits to shift left, the
size of the entry (as provided to the rte_table_hash_ext_create and
rte_table_hash_lru_create) has to be used instead of the size of the
key.
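As a rough illustration of why the entry size matters, the sketch below
derives a shift count from a size and uses it to locate an entry. The
helper names are hypothetical, and the sketch assumes the size is a
non-zero power of two; it does not reproduce the library code.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits to shift left so that (index << shift) == index * size.
 * Assumes size is a non-zero power of two. */
static uint32_t size_to_shift(uint32_t size)
{
	uint32_t shift = 0;

	assert(size != 0 && (size & (size - 1)) == 0);
	while ((UINT32_C(1) << shift) < size)
		shift++;
	return shift;
}

/* Byte offset of entry 'index' in an array of entries of 'entry_size'. */
static uint32_t entry_offset(uint32_t index, uint32_t entry_size)
{
	return index << size_to_shift(entry_size);
}
```

With 16-byte entries and 8-byte keys, deriving the shift from the key
size would place entry 3 at byte 24 instead of byte 48.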
Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
If a key is not found in a bucket and the bucket has been extended,
the extended buckets also have to be checked for potentially matching
keys. The extended buckets are checked at the end of the lookup. In
most cases, this logic is skipped as it is uncommon to have buckets in
an extended state.
In case the lookup is performed with fewer than 5 packets, an
unoptimized version is run instead (the optimized version requires at
least 5 packets). The extended buckets should also be checked in this
case instead of simply ignoring the extended buckets.
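A minimal sketch of a lookup that also walks the extended buckets is
shown below. The bucket layout, the use of key 0 as an empty slot and
the function name are assumptions made for the example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEYS_PER_BUCKET 4

/* Hypothetical bucket layout; key 0 marks an empty slot in this sketch. */
struct bucket {
	uint32_t keys[KEYS_PER_BUCKET];
	int values[KEYS_PER_BUCKET];
	struct bucket *next; /* extended bucket, or NULL */
};

/* Returns 1 and stores the value on a hit, 0 on a miss. */
static int bucket_lookup(const struct bucket *b, uint32_t key, int *value)
{
	assert(value != NULL);
	if (key == 0)
		return 0;
	/* Check the first bucket, then every extended bucket in the chain. */
	for (; b != NULL; b = b->next) {
		for (size_t i = 0; i < KEYS_PER_BUCKET; i++) {
			if (b->keys[i] == key) {
				*value = b->values[i];
				return 1;
			}
		}
	}
	return 0;
}
```

A lookup that stopped after the first bucket would miss any key stored
in an extended bucket.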
Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
When an entry is deleted from an extensible rte_table_hash, the bucket
that stored the entry can become empty. If this is the case, the
bucket needs to be removed from the chain of buckets.
During removal of the bucket, the chain should be updated first. If
the bucket that will be removed is cleared first, the chain is broken
and the information to update the chain is lost.
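The ordering can be sketched with a plain singly linked chain. The
struct and function below are hypothetical stand-ins for the table
internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical extended bucket: only the fields needed for the sketch. */
struct ext_bucket {
	struct ext_bucket *next;
	uint32_t n_keys;
};

/* Unlink 'victim' from the chain starting at *head, then clear it.
 * Returns 0 on success, -1 if the bucket is not in the chain. */
static int chain_remove(struct ext_bucket **head, struct ext_bucket *victim)
{
	struct ext_bucket **link = head;

	assert(head != NULL && victim != NULL);
	while (*link != NULL && *link != victim)
		link = &(*link)->next;
	if (*link == NULL)
		return -1;
	/* Update the chain first: this step still needs victim->next. */
	*link = victim->next;
	/* Only now is it safe to clear the bucket. */
	memset(victim, 0, sizeof(*victim));
	return 0;
}
```

Swapping the last two steps would overwrite victim->next before it is
read, so the buckets after the victim would be cut off from the chain.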
Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The 1.7 DPDK_Prog_Guide document in MSWord has been converted to rst format for
use with Sphinx. There is an rst file for each chapter and an index.rst file
which contains the table of contents.
The top level index file has been modified to include this guide.
This document contains some png image files. If any of these png files are modified
they should be replaced with an svg file.
This is the sixth document from a set of 6 documents.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
A new sample app that shows the usage of the distributor library. This
app works as follows:
* An RX thread runs which pulls packets from each ethernet port in turn
and passes those packets to workers using a distributor component.
* The workers take the packets in turn, and determine the output port
for those packets using basic l2forwarding doing an xor on the source
port id.
* The RX thread takes the returned packets from the workers and enqueues
those packets into an rte_ring structure.
* A TX thread pulls the packets off the rte_ring structure and then
sends each packet out the output port specified previously by the worker.
* Command-line option support provided only for portmask.
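The worker's port selection can be sketched as below. The text only
says that an xor is done on the source port id; the xor operand of 1,
which pairs ports 0/1, 2/3 and so on, is an assumption of this sketch,
as is the function name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pick the output port by flipping the lowest bit
 * of the input port id, so that adjacent ports forward to each other. */
static uint8_t output_port(uint8_t in_port)
{
	uint8_t out = (uint8_t)(in_port ^ 1);

	/* Applying the same xor again always yields the input port. */
	assert((uint8_t)(out ^ 1) == in_port);
	return out;
}
```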
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Data_ring is a pre-mapped guest ring buffer that vmxnet3
backend has access to directly without a need for buffer
address mapping and unmapping during packet transmission.
It is useful in reducing device emulation cost on the tx
path. There is some additional cost on the guest driver
for the packet copy, but overall it is a win.
This patch leverages the data_ring for packets with a
length less than or equal to the data_ring entry size
(128B). For larger packets, we won't use the data_ring
as that requires one extra tx descriptor and it's not
clear if doing this will be beneficial.
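The size check described above can be sketched as follows. The entry
struct and function name are hypothetical; only the 128B entry size is
taken from the text.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DATA_RING_ENTRY_SIZE 128 /* entry size stated in the text */

/* Hypothetical data ring entry: a fixed-size, pre-mapped buffer. */
struct data_ring_entry {
	uint8_t data[DATA_RING_ENTRY_SIZE];
};

/* Returns 1 if the packet was copied into the data ring entry, or 0 if
 * it is too large and the caller should use the regular tx path. */
static int tx_try_data_ring(struct data_ring_entry *e, const void *pkt,
			    uint32_t len)
{
	assert(e != NULL && pkt != NULL);
	if (len > DATA_RING_ENTRY_SIZE)
		return 0;
	memcpy(e->data, pkt, len);
	return 1;
}
```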
Performance results show that this patch significantly
boosts vmxnet3 64B tx performance (pkt rate) for l2fwd
application on an Ivy Bridge server by >20% at which
point we start to hit some bottleneck on the rx side.
Signed-off-by: Yong Wang <yongwang@vmware.com>
This patch includes two small performance optimizations
on the rx path:
(1) It adds unlikely hints on various infrequent error
paths to the compiler to make branch prediction more
efficient.
(2) It also moves a constant assignment out of the pkt
polling loop. This saves one branch per packet.
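Optimization (1) can be illustrated with a small sketch. The macro below
relies on the GCC/Clang builtin __builtin_expect; the packet struct and
the polling function are hypothetical and do not reproduce the driver
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifndef unlikely
#define unlikely(x) __builtin_expect(!!(x), 0)
#endif

/* Hypothetical received-packet record. */
struct pkt {
	uint16_t len;
	uint8_t err;
	uint8_t port;
};

/* Compacts the good packets to the front of the array and returns
 * how many there are. */
static size_t rx_poll(struct pkt *pkts, size_t n, uint8_t port_id)
{
	size_t good = 0;

	assert(pkts != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		/* Error path is rare: hint the compiler accordingly. */
		if (unlikely(pkts[i].err != 0))
			continue;
		pkts[good] = pkts[i];
		pkts[good].port = port_id;
		good++;
	}
	return good;
}
```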
Performance evaluation configs:
- On the DPDK-side, it's running some l3 forwarding app
inside a VM on ESXi with one core assigned for polling.
- On the client side, pktgen/dpdk is used to generate
64B tcp packets at line rate (14.8M PPS).
Performance results on a Nehalem box (4cores@2.8GHzx2)
shown below. CPU usage is collected factoring out the
idle loop cost.
- Before the patch, ~900K PPS with 65% CPU of a core
used for DPDK.
- After the patch, only 45% of a core used, while
maintaining the same packet rate.
Signed-off-by: Yong Wang <yongwang@vmware.com>
This change makes vmxnet3 consistent with other pmds in
terms of dev_stop behavior: rather than releasing tx/rx
rings, it only resets the ring structure and releases the
pending mbufs.
Verified with various tests (test-pmd and pktgen) over
vmxnet3 that dev stop/restart works fine.
Signed-off-by: Yong Wang <yongwang@vmware.com>
With introduction of in_flight_bitmask, the whole 32 bits of tag can be
used. Furthermore, this patch fixes the integer overflow when finding
the matched tags.
The maximum number of workers is now defined as 64, which is the length
of a double-word. The link between the number of workers and
RTE_MAX_LCORE is now removed. A compile-time check is added to ensure
that RTE_DISTRIB_MAX_WORKERS is less than or equal to the size of a
double-word.
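A compile-time check of this kind, together with a 64-bit per-worker
bitmask, can be sketched as below. MAX_WORKERS and the helper names are
hypothetical stand-ins, and the sketch uses C11 _Static_assert rather
than whatever mechanism the library uses. Shifting a 64-bit constant is
shown as one common way to avoid overflow in such masks; it is not
necessarily the exact fix made by this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MAX_WORKERS 64 /* stand-in for RTE_DISTRIB_MAX_WORKERS */

/* One bit per worker must fit in the 64-bit mask. */
_Static_assert(MAX_WORKERS <= sizeof(uint64_t) * CHAR_BIT,
	       "too many workers for a 64-bit bitmask");

/* Set the bit of 'worker' in the in-flight mask. */
static uint64_t mark_in_flight(uint64_t mask, unsigned int worker)
{
	assert(worker < MAX_WORKERS);
	/* A 64-bit constant keeps the shift from overflowing 32 bits. */
	return mask | (UINT64_C(1) << worker);
}

/* Returns 1 if the bit of 'worker' is set in the mask. */
static int is_in_flight(uint64_t mask, unsigned int worker)
{
	assert(worker < MAX_WORKERS);
	return (mask >> worker) & 1;
}
```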
Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This field is added for librte_distributor. Users of librte_distributor
are advised to set the value of mbuf->hash.usr before calling
rte_distributor_process. The value of usr is the tag that serves as the
identifier of the flow.
Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
According to the changes of the i40e base driver, two device
IDs (0x1573, 0x1582) are not supported anymore, and one new
device ID (0x1586) is supported. The list of i40e device IDs
supported by DPDK is modified accordingly.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The 1.7 DPDK_SampleApp_UG document in MSWord has been converted to rst format for
use with Sphinx. There is an rst file for each chapter and an index.rst file
which contains the table of contents.
The top level index file has been modified to include this guide.
This document contains some png image files. If any of these png files are modified
they should be replaced with an svg file.
This is the fifth document from a set of 6 documents.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
The unit test can be used to measure cycles per packet in different rx/tx routines.
The NIC works in loopback mode, so no test equipment is required to measure throughput.
As a result, the unit test reports the average number of cycles consumed per packet.
When running the test, make sure the link is up.
Usage Example:
1. Run unit test app in interactive mode
app/test -c f -n 4 -- -i
2. Run and wait for the result
pmd_perf_autotest
There is an option to choose the rx/tx pair; the default is vector.
set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast, please choose 'vector' or 'hybrid' without INC_VEC=y in config.
It supports measuring standalone rx or tx.
Usage Example:
Choose rx or tx standalone, default is both
set_rxtx_anchor [rxtx|rxonly|txonly]
It also supports measuring standalone RX burst cycles.
In this mode, received packets are not repeatedly re-sent.
Currently it measures two situations: polling before and after xmit (with or without desc. cache conflict).
Usage Example:
Set stream control mode, default is continuous
set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
Test report: http://dpdk.org/ml/archives/dev/2014-October/007145.html
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Add support to allow packet burst generator to create packets
in different sizes.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
The scattered_rx configuration is updated in dev_start().
For the execution sequence "stop, re-configure and then re-start",
the new configuration is expected to be used.
But during re-configure, the stored data may still be the old one.
The patch always clears the configuration in dev_stop(),
which makes sure the best Rx routine is always selected.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
By definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
return the number of packets.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RTE was
set to -ENOTSUP, which is confusing.
The patch always returns 0, whether there is no packet or there is an error.
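Why a negative errno is confusing for these functions can be seen in a
small sketch. It assumes the burst functions return an unsigned 16-bit
packet count; the two functions below are hypothetical and only model
the return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Old behaviour in this sketch: the error code is converted to the
 * unsigned return type and wraps to a large positive "packet count". */
static uint16_t rx_burst_buggy(int supported, uint16_t available)
{
	if (!supported)
		return (uint16_t)-ENOTSUP;
	return available;
}

/* New behaviour in this sketch: report 0 packets on error. */
static uint16_t rx_burst_fixed(int supported, uint16_t available)
{
	if (!supported)
		return 0;
	assert(available <= UINT16_MAX);
	return available;
}
```

A caller that loops over the returned count would treat the wrapped
error value as tens of thousands of received packets.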
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This enables user space vhost to receive and forward broadcast
and multicast packets:
Use a new command line option to enable promisc mode;
Enable 2 bits in VMDQ RX mode: ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Configure the PFVML2FLT register in the ixgbe PMD to enable it to receive broadcast and multicast packets;
also factorize the common logic with ixgbe_set_pool_rx_mode.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Configure the VM offload register in the igb PMD to enable it to receive broadcast and multicast packets.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add a vmdq rx mode field to the rx config struct; it holds flags from ETH_VMDQ_ACCEPT_*.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch supports the new VMDQ API in the vmdq example.
Besides, it allows users to specify a num_pools different from
max_nb_pools, so the polling thread does not need to poll the queues
of all pools.
Due to an i40e implementation issue, there is no default mac for
a VMDQ pool, so the app needs to specify the mac address for each pool
explicitly.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
This check is added to avoid breakage from future data structure changes.
Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>