Commit Graph

226 Commits

Author SHA1 Message Date
Konstantin Ananyev
da826b7135 eal: introduce ymm type for AVX 256-bit
New data type to manipulate 256 bit AVX values.
Rename field in the rte_xmm to keep common naming across SSE/AVX fields.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Ouyang Changchun
f0adccd4dc examples/vhost: fix vlan offload
The following commit break vm2vm hard mode test cases:
commit db4014f2b6 ("use factorized default Rx/Tx configuration")

Investigation show that it needs enabling vlan offload since it is turn off
by default in some drivers, and Tx need it, especially when vm2vm is in hard mode.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Tested-by: Jingguo Fu <jingguox.fu@intel.com>
2014-12-18 00:26:09 +01:00
Olivier Matz
b68bc0b83a examples/vm_power: fix initialization of cmdline token
Fix a typo: cmdline_parse_token_string_t was used in place of
cmdline_parse_num_string_t.

Seen with clang-3.5.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-18 00:26:08 +01:00
Olivier Matz
2e099bc5d1 examples/vm_power: fix split of compiler and linker options
The argument -lvirt is a linker parameter, not a CFLAG.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-18 00:26:08 +01:00
Olivier Matz
d50f463fbe examples/netmap_compat: fix overflow in ioctl operation
Compiling the netmap example with clang-3.5 triggered the following
warning:

  compat_netmap.c:783:11: error: overflow converting case value to
    switch condition type (3225184658 to 18446744072639768978)
    [-Werror,-Wswitch]
              case NIOCREGIF:
                 ^
Indeed, an ioctl value should be an unsigned 32 bits, not an int.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-18 00:26:08 +01:00
Olivier Matz
74cc294485 examples/l3fwd: fix compilation with clang 3.5
Fix the following error:
  error: unused function 'l3fwd_simple_forward'

The l3fwd_simple_forward() is maybe unused, due to compilation options
(APP_LOOKUP_METHOD, ENABLE_MULTI_BUFFER_OPTIMIZE). As the combinatorial
is quite big, it looks simpler to add the __attribute__((unused)) on
this function, so that the compiler does not complain.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-18 00:26:08 +01:00
Bruce Richardson
5b628fe19a examples/vm_power: fix check for null
The check for NULL is in the wrong position in the "if" error leg. The
pointer should be checked for NULL before checking what the value of
what the pointer points to is.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 01:04:06 +01:00
Bruce Richardson
0d74597c1b examples/vm_power: fix max length of unix socket path
The length of the path to a unix socket is not PATH_MAX but instead is
UNIX_PATH_MAX which is generally just over 100 bytes in size. It's not
actually defined in sys/un.h on linux - despite the man page referencing
it, so calculate the size in the case where it's not defined.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 01:04:06 +01:00
Bruce Richardson
95df1d78c6 examples/ip_pipeline: fix memory allocation check
Static analysis shows that once instance of rte_zmalloc is missing
a return value check in the code. This is fixed by adding a return
value check. The malloc call itself is moved to earlier in the function
so that no work is done unless all memory allocation requests have
succeeded - thereby removing the need for rollback on error.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2014-12-17 01:04:06 +01:00
Konstantin Ananyev
225979558a examples/l3fwd-acl: fix possible memory leak
At error app_acl_init() can return without freeing dynamically allocated memory.
Not really a big problem, as if app_acl_init() fails,
then application would terminate immediately anyway.
Though it is a good coding practise to make a function to cleanup after itself.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-12-17 01:04:06 +01:00
Daniel Mrzyglod
c062d3eb71 examples/l3fwd-vf: fix race condition
When the routing is through the same queue, the app crashed.

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-12-11 01:42:03 +01:00
Ouyang Changchun
c2ab5162db examples/vhost: fix hard forward of jumbo frames
Search the right segment to increase its data length, rather than
wrongly early return and exit the tx function, which leads to drop all jumbo frame packets
when vm2vm is in hard forward mode.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-12-11 01:42:03 +01:00
Huawei Xie
8bd6c395a5 examples/vhost: increase maximum queue number
Increase MAX_QUEUES from 256 to 512.

In vhost example, MAX_QUEUES macro should be the maximum possible queue number of the port.
Theoretically we should only set up the queues that are used, i.e., first rx queue of each pool, or
at most queues from 0 to MAX_QUEUES. Before we revise the implementation and are certain all NICs support
this well, add a remind message to user.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
2014-12-11 01:42:03 +01:00
Huawei Xie
db4014f2b6 examples/vhost: use factorized default Rx/Tx configuration
Refer to Pablo's commit (81f7ecd934):
    "use factorized default Rx/Tx configuration

    For apps that were using default rte_eth_rxconf and rte_eth_txconf
    structures, these have been removed and now they are obtained by
    calling rte_eth_dev_info_get, just before setting up RX/TX queues."

Move zero copy's deferred start set up ahead.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
2014-12-06 11:08:35 +01:00
Huawei Xie
84b02d16c5 examples/vhost: support new VMDQ API for i40e
In Niantic, if VMDQ mode is set, all queues are allocated to VMDQ in DPDK.
In I40E, only configured part of continous queues are allocated to VMDQ.
The rte_eth_dev_info structure is extended to provide VMDQ queue base,
queue number, and VMDQ pool base information.
This patch support the new VMDQ API in vhost example.

FIXME in PMD:
 * added mac address will be flushed at rte_eth_dev_start.
 * we don't support selectively setting up queues well.

Test report: http://dpdk.org/ml/archives/dev/2014-December/009427.html

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Tested-by: Jingguo Fu <jingguox.fu@intel.com>
2014-12-06 11:08:35 +01:00
Bruce Richardson
94aa16b45c examples/multi_process: fix resilience by enabling Rx drop
The symmetric_mp example app is set up to allow two processes to
share a NIC port, with each pulling packets from one queue. In order
to have the app continue working when one of the process dies, the
drop_en bit should be set in the NIC configuration. Without this bit
set, the NIC will stall once any queue fills. With the bit set, once
a queue fills, all subsequent packets for that queue are discarded
allowing other queues to continue operating as normal.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-05 16:55:00 +01:00
Alan Carew
aaa662e75c cmdline: fix overflow on bsd
When using test-pmd with flow director in FreeBSD, the application will
segfault/Bus error while parsing the command-line. This is due to how
each commands result structure is represented during parsing, where the offsets
for each tokens value is stored in a character array(char result_buf[BUFSIZ])
in cmdline_parse()(./lib/librte_cmdline/cmdline_parse.c).

The overflow occurs where BUFSIZ is less than the size of a commands result
structure, in this case "struct cmd_pkt_filter_result"
(app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes as
opposed to 8192 bytes on Linux.

The problem can be reproduced by running test-pmd on FreeBSD:
./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
And adding a filter:
add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
0x800 vlan 0 queue 0 soft 0x17

This patch removes the OS dependency on BUFSIZ and defines and uses a
library #define CMDLINE_PARSE_RESULT_BUFSIZE 8192

Added boundary checking to ensure this buffer size cannot overflow, with
an error message being produced.

Suggested-by: Olivier Matz <olivier.matz@6wind.com>
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-05 16:54:53 +01:00
Sergio Gonzalez Monroy
fdf20fa7be add prefix to cache line macros
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]
2014-11-27 16:21:11 +01:00
David Marchand
98a1648109 examples: no more bare metal environment
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-27 13:09:59 +01:00
Olivier Matz
4199fdea60 mbuf: generic support for TCP segmentation offload
Some of the NICs supported by DPDK have a possibility to accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with valid TCP header with size up to 64K and deleguates the
segmentation to the NIC.

Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).

To delegate the TCP segmentation to the hardware, the user has to:

- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
  the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo header checksum without taking ip_len in account,
  and set it in the TCP header, for instance by using
  rte_ipv4_phdr_cksum(ip_hdr, ol_flags)

The API is inspired from ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hw/drivers in the future.

This commit also reworks the way l2_len and l3_len are used in igb
and ixgbe drivers as the l2_l3_len is not available anymore in mbuf.

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:56 +01:00
Alan Carew
f5e5c3347a examples/vm_power: cli in guest
Provides a small sample application(guest_vm_power_mgr) to run on a VM.
The application is run by providing a core mask(-c) and number of memory
channels(-n). The core mask corresponds to the number of lcore channels to
attempt to open. A maximum of 64 channels per VM is allowed. The channels must
be monitored by the host.
After successful initialisation a CPU frequency command can be sent to the host
using:
set_cpu_freq <lcore_num> <up|down|min|max>.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Alan Carew
8db653ff78 examples/vm_power: vm power management application
For launching CLI thread and Monitor thread and initialising
resources.
Requires a minimum of two lcores to run, additional cores specified by eal core
mask are not used.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Alan Carew
d26c18c932 examples/vm_power: cpu frequency in host
A wrapper around librte_power(using ACPI cpufreq), providing locking around the
non-threadsafe library, allowing for frequency changes based on core masks and
core numbers from both the CLI thread and epoll monitor thread.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Alan Carew
3842bf2424 examples/vm_power: cli in host
The CLI is used for administrating the channel monitor and manager and
manually setting the CPU frequency on the host.

Supports the following commands:
 add_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 rm_vm [Mul-choice STRING]: add_vm|rm_vm <name>, add a VM for subsequent
  operations with the CLI or remove a previously added VM from the VM Power
  Manager

 add_channels [Fixed STRING]: add_channels <vm_name> <list>|all, add
  communication channels for the specified VM, the virtio channels must be
  enabled in the VM configuration(qemu/libvirt) and the associated VM must be
  active. <list> is a comma-separated list of channel numbers to add, using the
  keyword 'all' will attempt to add all channels for the VM

 set_channel_status [Fixed STRING]:
  set_channel_status <vm_name> <list>|all enabled|disabled,  enable or disable
  the communication channels in list(comma-separated) for the specified VM,
  alternatively list can be replaced with keyword 'all'. Disabled channels will
  still receive packets on the host, however the commands they specify will be
  ignored. Set status to 'enabled' to begin processing requests again.

 show_vm [Fixed STRING]: show_vm <vm_name>, prints the information on the
  specified VM(s), the information lists the number of vCPUS, the pinning to
  pCPU(s) as a bit mask, along with any communication channels associated with
  each VM

 show_cpu_freq_mask [Fixed STRING]: show_cpu_freq_mask <mask>, Get the current
  frequency for each core specified in the mask

 set_cpu_freq_mask [Fixed STRING]: set_cpu_freq <core_mask> <up|down|min|max>,
  Set the current frequency for the cores specified in <core_mask> by scaling
  each up/down/min/max.

 show_cpu_freq [Fixed STRING]: Get the current frequency for the specified core

 set_cpu_freq [Fixed STRING]: set_cpu_freq <core_num> <up|down|min|max>,
  Set the current frequency for the specified core by scaling up/down/min/max

 quit [Fixed STRING]: close the application

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Alan Carew
e8ae9b6625 examples/vm_power: channel manager and monitor in host
The manager is responsible for adding communications channels to the Monitor
thread, tracking and reporting VM state and employs the libvirt API for
synchronization with the KVM Hypervisor. The manager interacts with the
Hypervisor to discover the mapping of virtual CPUS(vCPUs) to the host
physical CPUS(pCPUs) and to inspect the VM running state.

The manager provides the following functionality to the CLI:
1) Connect to a libvirtd instance, default: qemu:///system
2) Add a VM to an internal list, each VM is identified by a "name" which must
   correspond a valid libvirt Domain Name.
3) Add communication channels associated with a VM to the epoll based Monitor
   thread.
   The channels must exist and be in the form of:
   /tmp/powermonitor/<vm_name>.<channel_number>. Each channel is a
   Virtio-Serial endpoint configured as an AF_UNIX file socket and opened in
   non-blocking mode.
   Each VM can have a maximum of 64 channels associated with it.
4) Disable or re-enable VM communication channels, channels once added to the
   Monitor thread remain in that threads control, however acting on channel
   requests can be disabled and renabled via CLI.

The monitor is an epoll based infinite loop running in a separate thread that
waits on channel events from VMs and calls the corresponding functions. Channel
definitions from the manager are registered via the epoll event opaque pointer
when calling epoll_ctl(EPOLL_CTL_ADD), this allows for obtaining the channels
file descriptor for reading EPOLLIN events and mapping the vCPU to pCPU(s)
associated with a request from a particular VM.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Bruce Richardson
7107e471a6 examples/skeleton: very simple code for packet forwarding
This is a very simple example app for doing packet forwarding with the
Intel DPDK. It's designed to serve as a start point for people new to
the Intel DPDK and who want to develop a new app.

Therefore it's meant to:
* have as good a performance out-of-the-box as possible, using the
  best-known settings for configuring the PMDs, so that any new apps can
  be based off it.
* be kept as short as possible to make it easy to understand it and get
  started with it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 17:27:03 +01:00
Pablo de Lara
f9d2068d9f examples/dpdk_qat: fix reference to old mbuf field
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-24 13:17:49 +01:00
Reshma Pattan
07db4a9750 examples/distributor: new sample app
A new sample app that shows the usage of the distributor library. This
app works as follows:

* An RX thread runs which pulls packets from each ethernet port in turn
  and passes those packets to worker using a distributor component.
* The workers take the packets in turn, and determine the output port
  for those packets using basic l2forwarding doing an xor on the source
  port id.
* The RX thread takes the returned packets from the workers and enqueue
  those packets into an rte_ring structure.
* A TX thread pulls the packets off the rte_ring structure and then
  sends each packet out the output port specified previously by the worker
* Command-line option support provided only for portmask.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-16 22:54:56 +01:00
Cunming Liang
ec3d82db2d ether: new function to format mac address
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
2014-11-13 00:48:16 +01:00
Ouyang Changchun
90924caf08 vhost: enable promiscuous and multicast
This is to enable user space vhost receiving and forwarding broadcast
and multicast packets:
Use new option in command line to enable promisc mode;
Enable 2 bits in VMDQ RX mode: ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-11-12 00:10:23 +01:00
Huawei Xie
b30eb1d26e examples/vmdq: fix code style
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
2014-11-11 23:48:05 +01:00
Huawei Xie
2a13a5a08d examples/vmdq: use new VMDQ API
This patch supports new VMDQ API in vmdq example.
Besides, it allows users to specify num_pools different with
max_nb_pools, thus the polling thread needn't to poll queues
of all pools.

Due to i40e implementation issue, there is no default mac for
VMDQ pool, so app needs to specify mac address for each pool
explicitly.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
2014-11-11 23:48:05 +01:00
Thomas Monjalon
b4bb86cc6a app,examples: remove references to drivers config
These references to drivers break the layering isolation between
application and drivers.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2014-11-10 10:07:56 +01:00
Ouyang Changchun
6630bc4244 examples/vhost: check offset with vlan
This patch checks the packet length offset value, and checks if the
extra bytes inside buffer cross page boundary.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-05 22:20:32 +01:00
Ouyang Changchun
72ec8d77ac examples/vhost: rework duplicated code
Extract a function to replace duplicated codes in one copy and zero copy TX function.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-05 22:20:32 +01:00
Ouyang Changchun
e44fb8a430 examples/vhost: fix packet length
As HW vlan strip will reduce the packet length by minus length of vlan tag,
so it need restore the packet length by plus it.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2014-11-05 22:20:32 +01:00
Ouyang Changchun
94cae38575 examples/vhost: allow mergeable packets with vector ixgbe
Since the commit 33e79bed3e has fixed the issue in vector PMD,
and then it can receive jumbo frame by scatter-gather mode, so remove this check.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2014-10-30 09:38:52 +01:00
Huawei Xie
b82da75977 examples/vhost: add new example based on lib
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: clean makefile and add in examples/Makefile]
2014-10-23 13:07:36 +02:00
Huawei Xie
a981294b29 examples/vhost: minor fixes
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 13:07:36 +02:00
Huawei Xie
364dddcd1b examples/vhost: add branch hints
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 13:07:36 +02:00
Huawei Xie
b5967c1fe5 examples/vhost: disable guest notifications
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 13:07:36 +02:00
Huawei Xie
28deb0204b examples/vhost: mergeable buffer option
Mergeable feature doesn't work with latest mbuf change.
Disabling IXGBE_INC_VECTOR is a temporary workaround.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 13:07:36 +02:00
Huawei Xie
4d50b6acbd examples/vhost: adapt Tx routing to lib
The packet passed to virtio_tx_route has been allocated
mbuf, so there is no need to allocate mbuf for it.
Use vlan offload to transmit vlan tagged packet.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove useless mbuf pool]
2014-10-23 13:07:03 +02:00
Huawei Xie
be800696c2 examples/vhost: use burst enqueue and dequeue from lib
In switch_worker and virtio_tx_local, rte_vhost_enqueue_burst is called to
push host packets to guest VM.
Before enqueue packets to guest VM, vhost example uses configure-able retry logic
to wait for enough vring entries.
In switch_worker, rte_vhost_dequeue_burst is called to get packets from guest VM,
then virtio device will be bound to a queue in VMDQ for the first transmitted
packet.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 12:00:51 +02:00
Huawei Xie
5cf2714469 examples/vhost: register with lib
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 12:00:50 +02:00
Huawei Xie
9915bb1f21 examples/vhost: hpa regions for zero copy
check_hpa_regions, fill_hpa_memory_regions and hpa memory region
data structure are added back from old virtio-net.c.

Add hpa (host physical address) region generation/destroy logic.
gpa<->hpa memory translation regions are generated at new_device,
when a virtio device is ready for packet processing.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 11:59:57 +02:00
Huawei Xie
e571e6b472 examples/vhost: add vhost dev struct
Define vhost_dev data structure.
Change reference to virtio_dev to vhost_dev.
The vhost example use vdev data structure for switching related logic
and container for virtio_dev.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 11:56:06 +02:00
Huawei Xie
d476ed5d9b examples/vhost: remove functions implemented in lib
Those functions are integrated into the user space vhost library:
virtio_dev_rx, virtio_dev_merge_rx, virtio_dev_tx, virtio_dev_merge_tx,
copy_from_mbuf_to_ring, gpa_to_vva.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-23 11:18:54 +02:00
Huawei Xie
d19533e86f examples/vhost: copy old vhost example
This patch copies two files main.c/main.h from most recent vhost example
(before transforming into a library) as the base for new vhost example.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-10-22 19:00:45 +02:00
Marc Sune
0c6bc8ef70 kni: memzone pool for alloc and release
The previous implementation of rte_kni_alloc() was allocating memzones with a
name composed of a fixed string and the interface name. When an application was
allocating and deallocating multiple interfaces with different names, memzones
were quickly exhausted, even though memzones from deallocated interfaces were
never used anymore (unless an interface with the same name was re-allocated).
As a result, the application was unable to allocate more KNI interfaces with
different names.

This patch implements the KNI memzone pool in order to prevent memzone
exhaustion when allocating/deallocating KNI interfaces. It adds a new API call,
rte_kni_init(max_kni_ifaces) that shall be called before any call to
rte_kni_alloc() if KNI is used. The memzones are pre-allocated with interface-
independent names so that they can be reused.

Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2014-10-21 17:24:53 +02:00
Huawei Xie
5c7a80aec3 vhost: move from examples to dedicated library
Those files will be refactored in subsequent patches to form user space
vhost library.
Makefile and main.h are removed.
main.c is renamed to vhost_rxtx.c and will provide vring enqueue/dequeue API.
virtio-net.h is renamed to rte_virtio_net.h which is the API header file.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove from examples Makefile and merge file renaming]
2014-10-13 19:10:09 +02:00
Pablo de Lara
81f7ecd934 examples: use factorized default Rx/Tx configuration
For apps that were using default rte_eth_rxconf and rte_eth_txconf
structures, these have been removed and now they are obtained by
calling rte_eth_dev_info_get, just before setting up RX/TX queues.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-10-10 13:01:49 +02:00
Pablo de Lara
dd6303b230 examples/netmap_compat: add default build target
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-09 20:02:35 +02:00
Ouyang Changchun
3111eae26e ethdev: rename flag for queue start and stop
Rename start_?x_per_q to ?x_deferred_start
and add comments.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-09-29 19:30:12 +02:00
Thomas Monjalon
68fa37e021 examples: do not probe pci twice
Since commit a155d43011 ("support link bonding device initialization"),
rte_eal_pci_probe() is called in rte_eal_init().
So it doesn't have to be called by application anymore.
It has been fixed for testpmd in commit 2950a76931,
and this patch remove it from other applications.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-09-29 13:08:53 +02:00
Olivier Matz
08b563ffb1 mbuf: replace data pointer by an offset
The mbuf structure already contains a pointer to the beginning of the
buffer (m->buf_addr). It is not needed to use 8 bytes again to store
another pointer to the beginning of the data.

Using a 16 bits unsigned integer is enough as we know that a mbuf is
never longer than 64KB. We gain 6 bytes in the structure thanks to
this modification.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

* Updated to apply to latest on mainline.
* Disabled vector PMD in config as it relies heavily on the mbuf layout
  This will be re-enabled in a subsequent commit once vPMD has been
  reworked to take account of mbuf changes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-09-17 18:53:40 +02:00
Bruce Richardson
7869536f3f mbuf: flatten struct vlan_macip
The vlan_macip structure combined a vlan tag id with l2 and l3 headers
lengths for tracking offloads. However, this structure was only used as
a unit by the e1000 and ixgbe drivers, not generally.

This patch removes the structure from the mbuf header and places the
fields into the mbuf structure directly at the required point, without
any net effect on the structure layout. This allows us to treat the vlan
tags and header length fields as separate for future mbuf changes. The
drivers which were written to use the combined structure still do so,
using a driver-local definition of it.

Reduce perf regression caused by splitting vlan_macip field. This is
done by providing a single uint16_t value to allow writing/clearing
the l2 and l3 lengths together. There is still a small perf hit to the
slow path TX due to the reads from vlan_tci and l2/l3 lengths being
separated. (<5% in my tests with testpmd with no extra params).
Unfortunately, this cannot be eliminated, without restoring the vlan
tags and l2/l3 lengths as a combined 32-bit field. This would prevent
us from ever looking to move those fields about and is an artificial tie
that applies only for performance in igb and ixgbe drivers. Therefore,
this patch keeps the vlan_tci field separate from the lengths as the
best solution going forward.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-09-17 11:29:17 +02:00
Bruce Richardson
ca04aaea80 mbuf: rename in_port to just port
In some cases we may want to tag a packet for a particular destination
or output port, so rename the "in_port" field in the mbuf to just "port"
so that it can be re-used for this purpose if an application needs it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-09-17 11:27:51 +02:00
Olivier Matz
ea672a8b16 mbuf: remove the rte_pktmbuf structure
The rte_pktmbuf structure was initially included in the rte_mbuf
structure. This was needed when there was 2 types of mbuf (ctrl and
packet). As the control mbuf has been removed, we can merge the
rte_pktmbuf into the rte_mbuf structure.

Advantages of doing this:
  - the access to mbuf fields is easier (ex: m->data instead of m->pkt.data)
  - make the structure more consistent: for instance, there was no reason
    to have the ol_flags field in rte_mbuf
  - it will allow a deeper reorganization of the rte_mbuf structure in the
    next commits, allowing to gain several bytes in it

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
[Bruce: updated for latest code and new example apps]
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-09-17 11:27:51 +02:00
Olivier Matz
9aaccf1abd mbuf: remove rte_ctrlmbuf
The initial role of rte_ctrlmbuf is to carry generic messages (data
pointer + data length) but it's not used by the DPDK or it applications.
Keeping it implies:
  - loosing 1 byte in the rte_mbuf structure
  - having some dead code rte_mbuf.[ch]

This patch removes this feature. Thanks to it, it is now possible to
simplify the rte_mbuf structure by merging the rte_pktmbuf structure
in it. This is done in next commit.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

* Updated patch to HEAD.
* Modified patch to retain the old function names for ctrl mbufs as
  macros. This helps with app compatibility, and allows the concept
  of a control mbuf to be reintroduced via a single-bit flag in
  a future change.
* Updated the packet framework ip_pipeline example application to
  work following this change.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-09-17 11:27:50 +02:00
Olivier Matz
62814bc2e9 mbuf: rename RTE_MBUF_SCATTER_GATHER into RTE_MBUF_REFCNT
It seems that RTE_MBUF_SCATTER_GATHER is not the proper name for the
feature it provides. "Scatter gather" means that data is stored using
several buffers. RTE_MBUF_REFCNT seems to be a better name for that
feature as it provides a reference counter for mbufs.

The macro RTE_MBUF_SCATTER_GATHER is poisoned to ensure this
modification is seen by drivers or applications using it.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-09-17 11:27:50 +02:00
Konstantin Ananyev
074f54ad03 acl: fix build and runtime for default target
Make ACL library to build/work on 'default' architecture:
- make rte_acl_classify_scalar really scalar
 (make sure it wouldn't use sse4 instrincts through resolve_priority()).
- Provide two versions of rte_acl_classify code path:
  rte_acl_classify_sse() - could be build and used only on systems with sse4.2
  and upper, return -ENOTSUP on lower arch.
  rte_acl_classify_scalar() - a slower version, but could be build and used
  on all systems.
- Addition of a new function rte_acl_classify_alg.  This function lets you
  specify an enum value to override the acl contexts default algorithm when doing
  a classification.  This allows an application to specify a classification
  algorithm without needing to publicize each method. I know there was concern
  over keeping those methods public, but we don't have a static ABI at the moment,
  so this seems to me a reasonable thing to do, as it gives us less of an ABI
  surface to worry about.
- keep common code shared between these two codepaths.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-09-03 03:26:50 +02:00
David Marchand
add720fce9 fix unix permissions for source files
No need for that 'x bit' on source files.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-08-28 17:04:01 +02:00
Ouyang Changchun
2bbb811998 examples/vhost: support jumbo frame
This patch support mergeable RX feature and thus support jumbo frame RX and TX
in user space vhost(as virtio backend).

On RX, it secures enough room from vring to accommodate one complete scattered
packet which is received by PMD from physical port, and then copy data from
mbuf to vring buffer, possibly across a few vring entries and descriptors.

On TX, it gets a jumbo frame, possibly described by a few vring descriptors which
are chained together with the flags of 'NEXT', and then copy them into one scattered
packet and TX it to physical port through PMD.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2014-08-25 17:14:36 +02:00
Konstantin Ananyev
7e7ab81422 examples/l3fwd: improve grouping by destination port
Latest changes introduced a small degradation for the corner case
when each input packet is destined to the different port.
For the test-case when 1 core manages 4 ports and packet stream looks like:
IPV4_DSTPORT0, IPV4_DSTPORT1, IPV4_DSTPORT3, IPV4_DSTPORT4, IPV4_DSTPORT0, ...
non-optimised code path outperforms optimised one by 2-3%.
These changes supposed to close that gap.
From my testing: now for the case described above optimised code path
produces same numbers as non-optimised one.
For other test-cases numbers remain about the same.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-08-01 18:25:05 +02:00
Matthew Hall
a98f77c209 virtio: fix incorrect parens
Signed-off-by: Matthew Hall <mhall@mhcomputing.net>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-07-22 15:00:00 +02:00
Thomas Monjalon
a6664a09a7 examples/pipeline: build with all examples
When adding this packet framework sample (commit 77a3346),
it has been forgotten to add it into the global makefile for
"make examples".

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-07-19 01:47:51 +02:00
Pablo de Lara
efcef73d50 examples: fix default build target
L3fwd-acl and ip pipeline apps were using old
x86_64-default-linuxapp-gcc as their default target,
instead of x86_64-native-linuxapp-gcc

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-07-19 01:47:51 +02:00
Yong Liu
d827c2695e examples/qos_sched: fix flow pause after 2M packets
After enable vector pmd, qos_sched only send 32 packets every burst.
That will cause some packets not transmitted and therefore mempool
will be drain after a while.
App qos_sched now will re-send the packets which failed to send out in
previous tx function.

Signed-off-by: Yong Liu <yong.liu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
2014-07-03 16:28:12 +02:00
Thomas Monjalon
3417cd687e examples: ignore linux apps on bsd
Do not try to build Linux examples in a BSD environment.

Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-07-03 00:10:30 +02:00
Stephen Hemminger
6f41fe75e2 eal: deprecate rte_snprintf
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.

Leave the tests for the deprecated function in place.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-27 02:31:24 +02:00
Bruce Richardson
e8ed6c7817 eal: fix usage of printf-like functions
Mark the rte_log, cmdline_printf and rte_snprintf functions as
being printf-style functions. This causes compilation errors
due to mis-matched parameter types, so the parameter types are
fixed where appropriate.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-27 02:31:08 +02:00
Anatoly Burakov
d47bd10c8d examples/ip_reassembly: reduce memory usage
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-26 22:52:08 +02:00
Anatoly Burakov
324bcf45e5 examples/ip_frag: fix socket detection
Make everything NUMA-related depend on lcore sockets, not device
sockets. This is because the init_mem() function allocates all data
structures based on NUMA nodes of the lcores in the coremask. Therefore,
when no cores are on socket 0, but there are devices on socket 0, it may
lead to segmentation faults.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-26 22:51:58 +02:00
Anatoly Burakov
eaa8d3bf6d examples/ip_frag: check for non-existent ports
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-26 22:51:51 +02:00
Anatoly Burakov
240952a9e5 ip_frag: add config option to enable statistics
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-26 22:51:44 +02:00
Helin Zhang
8a387fa85f ethdev: more RSS flags
- i40e RSS flags have been added (and enlarged to 64-bit)
- A new configuration of 'uint8_t rss_key_len' has been added in
  'struct rte_eth_rss_conf' to support different length of RSS keys.
- In each PMD, only the supported flags are masked.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
2014-06-17 18:21:41 +02:00
Ouyang Changchun
c2bebe5f5a examples/vmdq: fix Tx queue id
This patch fixes a core id issue in sample vmdq, in case core mask
doesn't start with lcore_id 0 but 20, for instance,
queue id should use core_id instead of lcore_id.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-17 11:51:42 +02:00
Ivan Boule
70bdb18657 ethdev: add Rx error counters for missed, badcrc and badlen packets
Split input error stats to have a better understanding of why packets
have been dropped.
Keep ierrors field untouched for backward compatibility.

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-06-17 11:28:14 +02:00
Cristian Dumitrescu
77a334675f examples/pipeline: packet framework sample
This Packet Framework sample application illustrates the capabilities
of the Intel DPDK Packet Framework toolbox.

It creates different functional blocks used by a typical IPv4 framework like:
flow classification, firewall, routing, etc.

CPU cores are connected together through standard interfaces built on SW rings,
which each CPU core running a separate pipeline instance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
2014-06-17 03:34:11 +02:00
Thomas Monjalon
7b79b2718f examples/vhost: restrict log type namespace
RTE_LOGTYPE_CONFIG, RTE_LOGTYPE_DATA and RTE_LOGTYPE_PORT are renamed
by adding VHOST prefix.
It prevents from conflict with new RTE_LOGTYPE_PORT of packet framework.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 23:16:06 +02:00
Anatoly Burakov
b84fb4cb88 examples/ip_reassembly: overhaul
New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
74de12b7b6 examples/ip_fragmentation: overhaul
New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
e107e82eac examples: rename ipv4_frag example to ip_fragmentation
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
5ab22ca3ba ip_frag: rename ipv4_fragmentation function
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
416707812c ip_frag: refactor reassembly code into a proper library
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
63ec0b5851 ip_frag: rename structures in fragmentation table
Technically, fragmentation table can work for both IPv4 and IPv6
packets, so we're renaming everything to be generic enough to make sense
in IPv6 context.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:05 +02:00
Anatoly Burakov
4c38e5532a ip_frag: refactor IPv4 fragmentation into a proper library
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
[Thomas: add in doxygen]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:04 +02:00
Anatoly Burakov
601e279df0 ip_frag: move fragmentation/reassembly headers into a library
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 18:55:04 +02:00
Konstantin Ananyev
361b2e9559 acl: new sample l3fwd-acl
Demonstrates the use of the ACL library in the DPDK application to
implement packet classification and L3 forwarding.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]
2014-06-14 01:29:45 +02:00
Konstantin Ananyev
96ff445371 examples/l3fwd: reorganise and optimize LPM code path
With latest HW and optimised RX/TX path there is a huge gap between
tespmd iofwd and l3fwd performance results.
So there is an attempt to optimise l3fwd LPM code path and reduce the gap:
 - Instead of processing each input packet up to completion -
 divide packet processing into several stages and perform
 stage by stage for the whole burst.
 - Unroll things by the factor of 4 whenever possible.
 - Use SSE instincts for some operations (bswap, replace MAC addresses, etc).
 - Avoid TX packet buffering whenever possible.
 - Move some checks from RX/TX into setup phase.

Note that new(optimized) code path can be switched on/off by setting
ENABLE_MULTI_BUFFER_OPTIMIZE macro to 1/0.

Some performance data:
SUT: dual-socket board IVB 2.8GHz, 2x1GB pages.
4 ports on 4 NICs (all at socket 0) connected to the traffic generator.
kernel: 3.11.3-201.fc19.x86_64, gcc: 4.8.2.
64B packets, using the packet flooding method.
All 4 ports are managed by one logical core:
Optimised scalar PMD RX/TX was used.

                          DIFF % (NEW-OLD)
IPV4-CONT-BURST:               +23%
IPV6-CONT-BURST :              +13%
IPV4/IPV6-CONT-BURST:          +8%
IPV4-4STREAMSX8:               +7%
IPV4-4STREAMSX1:               -2%

Test cases description:
IPV4-CONT-BURST - IPV4 packets all packets from the one input port
are destined for the same output port.
IPV6-CONT-BURST - IPV6 packets all packets from the one input port
are destined for the same output port.
IPV4/IPV6-CONT-BURST - mix of the first 2 with interleave=1
(e.g: IPV4,IPV6,IPV4,IPV6, ...)
IPV4-4STREAMSX1 - 4 streams of IPV4 packets, where all packets
from same stream are destined for the same output port
(e.g: IPV4_DST_P0, IPV4_DST_P1,  IPV4_DST_P2, IPV4_DST_P3, IPV4_DST_P0, ...)
IPV4-4STREAMSX8 - same as above but packets for each stream
are coming in groups of 8
(e.g: IPV4_DST_P0 X 8, IPV4_DST_P1 X 8, IPV4_DST_P2 X 8, IPV4_DST_P3 X 8,
IPV4_DST_P0 X 8, ...)

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
2014-06-12 12:11:54 +02:00
Bruce Richardson
3031749c2d remove trailing whitespaces
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-11 00:29:34 +02:00
Ouyang Changchun
c3dfe188ba examples/vhost: zero copy mode
This patch supports user space vhost zero copy. It removes packets copying between host and guest in RX/TX.
It introduces an extra ring to store the detached mbufs. At initialization stage all mbufs will put into
this ring; when one guest starts, vhost gets the available buffer address allocated by guest for RX and
translates them into host space addresses, then attaches them to mbufs and puts the attached mbufs into
mempool.
Queue starting and DMA refilling will get mbufs from mempool and use them to set the DMA addresses.

For TX, it gets the buffer addresses of available packets to be transmitted from guest and translates
them to host space addresses, then attaches them to mbufs and puts them to TX queues.
After TX finishes, it pulls mbufs out from mempool, detaches them and puts them back into the extra ring.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-28 16:00:55 +02:00
David Marchand
519f32279e config: rename "default" configurations as "native"
The "default" part in configuration filenames is misleading.
Rename this as "native", as this is the RTE_MACHINE that is set in these files.
This should make it clearer for people who build DPDK on a system then run it on
another one.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-21 16:25:06 +02:00
Neil Horman
e5ffdd1457 ethdev: remove rte_pmd_init_all function
Now that we've converted all the pmds in dpdk to use the driver registration
macro, rte_pmd_init_all has become empty.  As theres no reason to keep it around
anymore, just remove it and fix up all the eample callers.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-20 14:28:26 +02:00
Neil Horman
7b9d82208b ixgbe: convert to use of PMD_REGISTER_DRIVER and fix linking
Convert the ixgbe pmd driver to use the PMD_REGISTER_DRIVER macro.
This means that the test applications now have no reference to the ixgbe library
when building DSO's and must specify its use on the command line with the -d
option.  Static linking will still initalize the driver automatically.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-20 14:28:16 +02:00
Neil Horman
d93c252e88 igb: convert to use of PMD_REGISTER_DRIVER and fix linking
Convert the igb pmd driver to use the PMD_REGISTER_DRIVER macro.
This means that the test applications now have no reference to the igb library
when building DSO's and must specify its use on the command line with the -d
option.  Static linking will still initalize the driver automatically.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-20 14:28:16 +02:00
Stephen Hemminger
591a9d7985 add FILE argument to debug functions
The DPDK dump functions are useful for remote debugging of an
applications. But when application runs as a daemon, stdout
is typically routed to /dev/null.

Instead change all these functions to take a stdio FILE * handle
instead. An application can then use open_memstream() to capture
the output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 16:02:55 +02:00
Olivier Matz
d88219f588 examples/netmap_compat: fix makefile
It is not allowed to reference a an absolute file name in SRCS-y.
A VPATH has to be used, else the dependencies won't be checked
properly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 16:02:55 +02:00
Olivier Matz
ea9a59b26c examples/qos_sched: fix makefile
The example does not compile as the linker complains about duplicated
symbols.

Remove -lsched from LDLIBS, it is already present in rte.app.mk and
added by the DPDK framework automatically.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 16:02:55 +02:00