Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset is to support GRO in DPDK. To support GRO, this patch
implements a GRO API framework.
To enable more flexibility to applications, DPDK GRO is implemented as
a user library. Applications explicitly use the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes.
One is called lightweight mode, the other is called heavyweight mode.
If applications want to merge packets in a simple way and the number
of packets is relatively small, they can use the lightweight mode.
If applications need more fine-grained controls, they can choose the
heavyweight mode.
rte_gro_reassemble_burst is the main reassembly API which is used in
lightweight mode and processes N packets at a time. For applications,
performing GRO in lightweight mode is simple. They just need to invoke
rte_gro_reassemble_burst. Applications can get GROed packets as soon as
rte_gro_reassemble_burst returns.
rte_gro_reassemble is the main reassembly API which is used in
heavyweight mode and tries to merge N inputted packets with the packets
in GRO reassembly tables. For applications, performing GRO in heavyweight
mode is relatively complicated. Before performing GRO, applications need
to create a GRO context object, which keeps reassembly tables of
desired GRO types, by rte_gro_ctx_create. Then applications can use
rte_gro_reassemble to merge packets. The GROed packets are in the
reassembly tables of the GRO context object. If applications want to get
them, applications need to manually flush them by flush API.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Add in a new rte_event_ring structure type and functions to allow events to
be passed core to core. This is needed because the standard rte_ring type
only works on pointers, while for events, we want to copy the entire, 16B
events themselves - not just pointers to them. The code makes extensive use
of the functions already defined in rte_ring.h
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
APIs for selecting the architecure specific implementation and computing
the crc (16-bit and 32-bit CRCs) are added. For CRCs calculation, scalar
as well as x86 intrinsic(sse4.2) versions are implemented.
The scalar version is based on generic Look-Up Table(LUT) algorithm,
while x86 intrinsic version uses carry-less multiplication for
fast CRC computation.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Add a library designed to calculate latency statistics and report them
to the application when queried. The library measures minimum, average and
maximum latencies, and jitter in nano seconds. The current implementation
supports global latency stats, i.e. per application stats.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Remy Horton <remy.horton@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds a library that calculates peak and average data-rate
statistics. For ethernet devices. These statistics are reported using
the metrics library.
Signed-off-by: Remy Horton <remy.horton@intel.com>
This patch adds a new information metrics library. This Metrics
library implements a mechanism by which producers can publish
numeric information for later querying by consumers. Metrics
themselves are statistics that are not generated by PMDs, and
hence are not reported via ethdev extended statistics.
Metric information is populated using a push model, where
producers update the values contained within the metric
library by calling an update function on the relevant metrics.
Consumers receive metric information by querying the central
metric data, which is held in shared memory.
Signed-off-by: Remy Horton <remy.horton@intel.com>
Since eventdev uses event structures rather than working directly on
mbufs, there is no actual dependencies on the mbuf library. The
inclusion of an mbuf pointer element inside the event itself does not
require the inclusion of the mbuf header file. Similarly the pci
header is not needed, but following their removal, rte_memory.h is
needed for the definition of the __rte_cache_aligned macro.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This patch implements northbound eventdev API interface using
southbond driver interface
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
In rte.lib.mk, the list of libraries passed to the link
command (LDLIBS) is generated from the DEPDIRS-xxx variables.
If a library is not compiled because it is disabled in
configuration, it should not appear in DEPDIRS-xxx.
- librte_port depends on librte_kni only if it is enabled.
- librte_table depends on librte_acl only if it is enabled.
Fixes: feb9f680cd ("mk: optimize directory dependencies")
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Before this patch, the management of dependencies between directories
had several issues:
- the generation of .depdirs, done at configuration is slow: it can take
more than one minute on some slow targets (usually ~10s on a standard
PC without -j).
- for instance, it is possible to express a dependency like:
- app/foo depends on lib/librte_foo
- and lib/librte_foo depends on app/bar
But this won't work because the directories are traversed with a
depth-first algorithm, so we have to choose between doing 'app' before
or after 'lib'.
- the script depdirs-rule.sh is too complex.
- we cannot use "make -d" for debug, because the output of make is used for
the generation of .depdirs.
This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.
After this commit, "make config" is almost immediate.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Elastic Flow Distributor (EFD) is a distributor library that uses
perfect hashing to determine a target/value for a given incoming flow key.
It has the following advantages:
- First, because it uses perfect hashing, it does not store
the key itself and hence lookup performance is not dependent
on the key size.
- Second, the target/value can be any arbitrary value hence
the system designer and/or operator can better optimize service rates
and inter-cluster network traffic locating.
- Third, since the storage requirement is much smaller than a hash-based
flow table (i.e. better fit for CPU cache), EFD can scale to
millions of flow keys.
Finally, with current optimized library implementation performance
is fully scalable with number of CPU cores.
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Christian Maciocco <christian.maciocco@intel.com>
Following discussions on the mailing list [1] and since nobody stood up to
implement the necessary cleanups, here is the ivshmem integration removal.
There is not much to say about this patch, a lot of code is being removed.
The default configuration file for packet_ordering example is replaced with
the "native" x86 file.
The only tricky part is in eal_memory with the memseg index stuff.
More cleanups can be done after this but will come in subsequent patchsets.
[1]: http://dpdk.org/ml/archives/dev/2016-June/040844.html
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The librte_pdump library provides a framework for
packet capturing in dpdk. The library provides set of
APIs to initialize the packet capture framework, to
enable or disable the packet capture, and to uninitialize
it.
The librte_pdump library works on a client/server model.
The server is responsible for enabling or disabling the
packet capture and the clients are responsible
for requesting the enabling or disabling of the packet
capture.
Enabling APIs are supported with port, queue, ring and
mempool parameters. Applications should pass on this information
to get the packets from the dpdk ports.
For enabling requests from applications, library creates the client
request containing the mempool, ring, port and queue information and
sends the request to the server. After receiving the request, server
registers the Rx and Tx callbacks for all the port and queues.
After the callbacks registration, registered callbacks will get the
Rx and Tx packets. Packets then will be copied to the new mbufs that
are allocated from the user passed mempool. These new mbufs then will
be enqueued to the application passed ring. Applications need to dequeue
the mbufs from the rings and direct them to the devices like
pcap vdev for viewing the packets outside of the dpdk
using the packet capture tools.
For disabling requests, library creates the client request containing
the port and queue information and sends the request to the server.
After receiving the request, server removes the Rx and Tx callback
for all the port and queues.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As cryptodev library does not depend on mbuf_offload library
any longer, this patch removes it.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Deepak Kumar Jain <deepak.k.jain@intel.com>
The physically linked-together combined library has been an increasing
source of problems, as was predicted when library and symbol versioning
was introduced. Replace the complex and fragile construction with a
simple linker script which achieves the same without all the problems,
remove the related kludges from eg mlx drivers.
Since creating the linker script is practically zero cost, remove the
config option and just create it always.
Based on a patch by Sergio Gonzales Monroy, linker script approach
initially suggested by Neil Horman.
Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This library add support for adding a chain of offload operations to a
mbuf. It contains the definition of the rte_mbuf_offload structure as
well as helper functions for attaching offloads to mbufs and a mempool
management functions.
This initial implementation supports attaching multiple offload
operations to a single mbuf, but only a single offload operation of a
specific type can be attach to that mbuf.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch contains the initial proposed APIs and device framework for
integrating crypto packet processing into DPDK.
features include:
- Crypto device configuration / management APIs
- Definitions of supported cipher algorithms and operations.
- Definitions of supported hash/authentication algorithms and
operations.
- Crypto session management APIs
- Crypto operation data structures and APIs allocation of crypto
operation structure used to specify the crypto operations to
be performed on a particular mbuf.
- Extension of mbuf to contain crypto operation data pointer and
extra flags.
- Burst enqueue / dequeue APIs for processing of crypto operations.
Signed-off-by: Des O Dea <des.j.o.dea@intel.com>
Signed-off-by: John Griffin <john.griffin@intel.com>
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
The malloc library is now part of the EAL.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.
Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.
This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
allowing to free memzones.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Move xenvirt PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move vmxnet3 PMD to drivers/net directory.
As part of the move, rename the "vmxnet3" subdirectory, containing the
original FreeBSD drivers, from "vmxnet3" to the more standard name
"base", to indicate it contains the base drivers used for the
implementation.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move virtio PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move ring PMD to drivers directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move pcap pmd to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move null PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move mlx4 PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move ixgbe PMD to drivers/net directory.
As part of the move, we rename the ixgbe directory, containing the
ixgbe "base driver" code, from "ixgbe" to "base".
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move i40e PMD to drivers/net directory.
As part of the move, rename the "i40e" directory, containing the "base
driver" code, from "i40e" to "base".
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move fm10k PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move enic PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
[Thomas: move vnic/ to base/]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move e1000 pmd to drivers/net directory
As part of move, rename "e1000" subdirectory, which contains the code
from the "base driver", to "base".
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move bonded ethdev pmd to drivers/net
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move af_packet pmd to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Null PMD is a driver of the virtual device particularly designed to measure
performance of DPDK PMDs. When an application call rx, Null PMD just allocates
mbufs and returns those. Also tx, the PMD just frees mbufs.
The PMD has following options.
- size: specify packe size allocated by RX. Default packet size is 64.
- copy: specify 1 or 0 to enable or disable copy while RX and TX.
Default value is 0(disabled).
This option is used for emulating more realistic data transfer.
Copy size is equal to packet size.
To use the PMD, enable CONFIG_RTE_BUILD_SHARED_LIB in config file. Then
compile the PMD as shared library. The library can be linked using '-d'
option when an application invokes.
Here is an example.
$ sudo ./testpmd -c f -n 4 -d librte_pmd_null.so \
--vdev 'eth_null0' --vdev 'eth_null1' -- -i --no-flush-rx
If testpmd is compiled with CONFIG_RTE_BUILD_SHARED_LIB, it may need to
specify more libraries using '-d' option.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
This PMD manages all variants of Mellanox ConnectX-3 (EN 40, EN 10, Pro EN
40) as well as their virtual functions in SR-IOV context through IB Verbs
(libibverbs) and the dedicated user-space driver (libmlx4).
It is disabled by default due to dependencies on these libraries and only
supports Linux userland at the moment partly because /sys (sysfs) support is
required.
Also claim responsibility in the MAINTAINERS file.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
This library provide API to measure time spend in particular parts of
code and to calculate optimal polling time.
To calculate a those statistics application code need to be divided into
parts (called jobs) that do something. It is up to application to decide
what is considered a job.
Series of jobs must be surrounded with the rte_jobstats_context_start()
and rte_jobstats_context_finish() calls. After that, jobs might be
started. Each job must be surrounded with rte_jobstats_start() and
rte_jobstats_finish() calls.
After job finishes its execution, period in which it should be called
again is adjusted. It might be used to minimize time wasted on
unnecessary polls/calls. Adjustment is based on data provided by job
itself (ex: number of packets it processed).
After all jobs in serie are executed fallowing statistics are updated
and might be used by application. Statistics can be reset. Some of
provided statistic data:
- total/min/max execution - time spent in executing jobs.
- total/min/max management - time spent outside execution area. This
value might be used to measure overhead of scheduling jobs. This time
also contains overhead of rte_jobstats library itself.
- number of loops that executed at least one job
- executed jobs
- time when statistics were reset.
Each job provide total/min/max execution time and execution count
statistics.
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This library provides reordering capability for out of order mbufs based
on a sequence number in the mbuf structure.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Richardson Bruce <bruce.richardson@intel.com>
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
1. Add init function to scan and initialize fm10k PF device.
2. Add implementation to register fm10k pmd PF driver.
3. Add 3 functions fm10k_dev_configure, fm10k_stats_get and
fm10k_stats_get.
4. Add fm10k.h to define macros and basic data structure.
5. Add fm10k_logs.h to control log message output.
6. Change config/common_bsdapp and config/common_linuxapp, add
macros to control fm10k pmd driver compile for linux and bsd.
7. Add Makefile.
8. Change lib/Makefile to add fm10k driver into compile list.
9. Change mk/rte.app.mk to add fm10k lib into link.
10. Add ABI version of librte_pmd_fm10k
Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Add initial pass header files to support symbol versioning.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: enable for BSD - not tested]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This is a Linux-specific virtual PMD driver backed by an AF_PACKET
socket. This implementation uses mmap'ed ring buffers to limit copying
and user/kernel transitions. The PACKET_FANOUT_HASH behavior of
AF_PACKET is used for frame reception. In the current implementation,
Tx and Rx queues are always paired, and therefore are always equal
in number -- changing this would be a Simple Matter Of Programming.
Interfaces of this type are created with a command line option like
"--vdev=eth_af_packet0,iface=...". There are a number of options available
as arguments:
- Interface is chosen by "iface" (required)
- Number of queue pairs set by "qpairs" (optional, default: 1)
- AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
- AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
- AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[Thomas: disable because of incompatibility with some kernels]
vhost lib is turned off by default.
vhost lib is based on cuse, which requires fuse development package
to be installed.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: fix build dependencies]
This library provides a tool to interpret config files that have
standard structure.
It is used by the Packet Framework examples/ip_pipeline sample application.
It originates from examples/qos_sched sample application and now it makes
this code available as a library for other sample applications to use.
The code duplication with qos_sched sample app to be addressed later.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
The Packet Framework pipeline library provides a standard methodology
(logically similar to OpenFlow) for rapid development of complex packet
processing pipelines out of ports, tables and actions.
A pipeline is constructed by connecting its input ports to its output ports
through a chain of lookup tables. As result of lookup operation into the
current table, one of the table entries (or the default table entry, in case
of lookup miss) is identified to provide the actions to be executed on the
current packet and the associated action meta-data.
The behavior of user actions is defined through the configurable table action
handler, while the reserved actions define the next hop for the current packet
(either another table, an output port or packet drop) and are handled
transparently by the framework.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the operations to be implemented by
any Packet Framework table.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the port operations that have to be implemented
by Packet Framework ports.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>