This patch enables the TCP/IPv4 GRO library in the csum forwarding engine.
By default, GRO is turned off. Users can use the command "gro (on|off)
(port_id)" to enable or disable GRO for a given port. If GRO is enabled
on a port, it is performed on all TCP/IPv4 packets received from that
port. In addition, users can set the maximum number of flows and the
maximum number of packets per flow with the command
"gro set (max_flow_num) (max_item_num_per_flow) (port_id)".
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
In this patch, we introduce five APIs to support TCP/IPv4 GRO.
- gro_tcp4_reassemble: reassemble an input TCP/IPv4 packet.
- gro_tcp4_tbl_create: create a TCP/IPv4 reassembly table, which is used
to merge packets.
- gro_tcp4_tbl_destroy: free memory space of a TCP/IPv4 reassembly table.
- gro_tcp4_tbl_pkt_count: return the number of packets in a TCP/IPv4
reassembly table.
- gro_tcp4_tbl_timeout_flush: flush timed-out packets from a TCP/IPv4
reassembly table.
The TCP/IPv4 GRO API assumes that all input packets have correct IPv4
and TCP checksums, and it does not update the IPv4 and TCP checksums of
merged packets. If input packets are IP fragmented, the TCP/IPv4 GRO API
assumes they are complete packets (i.e. with L4 headers).
In TCP/IPv4 GRO, we use a table structure, called the TCP/IPv4 reassembly
table, to reassemble packets. A TCP/IPv4 reassembly table includes a key
array and an item array, where the key array keeps the criteria to merge
packets and the item array keeps packet information.
One key in the key array points to an item group, which consists of
packets that have the same criteria value. If two packets can be
merged, they must be in the same item group. Each key in the key array
includes two parts:
- criteria: the criteria of merging packets. If two packets can be
merged, they must have the same criteria value.
- start_index: the index of the first incoming packet of the item group.
Each element in the item array keeps the information of one packet. It
mainly includes three parts:
- firstseg: the address of the first segment of the packet
- lastseg: the address of the last segment of the packet
- next_pkt_index: the index of the next packet in the same item group.
All packets in the same item group are chained by next_pkt_index.
With next_pkt_index, we can locate all packets in the same item
group one by one.
Processing an incoming packet takes three steps:
a. Check whether the packet should be processed. Packets with one of the
following properties are not processed:
- the FIN, SYN, RST, URG, PSH, ECE or CWR bit is set;
- the packet payload length is 0.
b. Traverse the key array to find a key which has the same criteria
value as the incoming packet. If one is found, go to step c. Otherwise,
insert a new key and insert the packet into the item array.
c. Locate the first packet in the item group via the start_index in the
key. Then traverse all packets in the item group via next_pkt_index.
If a packet that can be merged with the incoming one is found, merge
them. Otherwise, insert the incoming packet into this item group.
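For illustration only, the table layout and the three steps above can be
sketched as a small self-contained model. This is not the library code:
all toy_* names are hypothetical, the merge criteria are reduced to the
TCP/IPv4 4-tuple, packets are represented by a sequence number and a
payload length instead of mbuf segments, and only appending a directly
following packet is handled.

```c
#include <stdint.h>
#include <string.h>

#define TOY_MAX_KEYS  4
#define TOY_MAX_ITEMS 16
#define TOY_INVALID   UINT32_MAX
#define TOY_TCP_ACK   0x10

enum { TOY_NOT_PROCESSED = -1, TOY_INSERTED = 0, TOY_MERGED = 1 };

/* Simplified merge criteria (assumption: the 4-tuple only). */
struct toy_criteria {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};
struct toy_key {
	struct toy_criteria criteria;
	uint32_t start_index;    /* first item of the item group */
};
struct toy_item {
	uint32_t seq;            /* TCP sequence number of the payload */
	uint32_t len;            /* payload length in bytes */
	uint32_t next_pkt_index; /* next item in the same item group */
};
struct toy_tbl {
	struct toy_key keys[TOY_MAX_KEYS];
	struct toy_item items[TOY_MAX_ITEMS];
	uint32_t nb_keys, nb_items;
};

void toy_tbl_init(struct toy_tbl *t) { memset(t, 0, sizeof(*t)); }

static int toy_same_flow(const struct toy_criteria *a,
			 const struct toy_criteria *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Store a packet in the next free item slot and return its index. */
static uint32_t toy_alloc_item(struct toy_tbl *t, uint32_t seq, uint32_t len)
{
	uint32_t i = t->nb_items;

	if (i == TOY_MAX_ITEMS)
		return TOY_INVALID;
	t->items[i].seq = seq;
	t->items[i].len = len;
	t->items[i].next_pkt_index = TOY_INVALID;
	t->nb_items++;
	return i;
}

int toy_reassemble(struct toy_tbl *t, const struct toy_criteria *c,
		   uint32_t seq, uint32_t len, uint8_t tcp_flags)
{
	uint32_t k, i, idx, last = TOY_INVALID;

	/* Step a: skip packets with any flag other than ACK, or no payload. */
	if ((tcp_flags & ~TOY_TCP_ACK) != 0 || len == 0)
		return TOY_NOT_PROCESSED;
	/* Step b: look for a key with the same criteria value. */
	for (k = 0; k < t->nb_keys; k++)
		if (toy_same_flow(&t->keys[k].criteria, c))
			break;
	if (k == t->nb_keys) {
		/* No key found: insert a new key and a new item. */
		if (k == TOY_MAX_KEYS)
			return TOY_NOT_PROCESSED;
		idx = toy_alloc_item(t, seq, len);
		if (idx == TOY_INVALID)
			return TOY_NOT_PROCESSED;
		t->keys[k].criteria = *c;
		t->keys[k].start_index = idx;
		t->nb_keys++;
		return TOY_INSERTED;
	}
	/* Step c: walk the item group and merge with a neighbour if any. */
	for (i = t->keys[k].start_index; i != TOY_INVALID;
	     i = t->items[i].next_pkt_index) {
		if (t->items[i].seq + t->items[i].len == seq) {
			t->items[i].len += len;
			return TOY_MERGED;
		}
		last = i;
	}
	/* No neighbour: chain a new item at the end of the group. */
	idx = toy_alloc_item(t, seq, len);
	if (idx == TOY_INVALID)
		return TOY_NOT_PROCESSED;
	t->items[last].next_pkt_index = idx;
	return TOY_INSERTED;
}
```

In this model, two packets of the same flow with sequence numbers 100
and 150 and 50 bytes of payload each end up as a single item of length
100, while a packet with the SYN bit set is left untouched.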
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Generic Receive Offload (GRO) is a widely used software-based offloading
technique that reduces per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset adds GRO support to DPDK. To that end, this patch
implements a GRO API framework.
To give applications more flexibility, DPDK GRO is implemented as
a user library. Applications explicitly use the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes:
lightweight mode and heavyweight mode.
If applications want to merge packets in a simple way and the number
of packets is relatively small, they can use the lightweight mode.
If applications need more fine-grained control, they can choose the
heavyweight mode.
rte_gro_reassemble_burst is the main reassembly API in lightweight
mode and processes N packets at a time. For applications, performing
GRO in lightweight mode is simple: they just need to invoke
rte_gro_reassemble_burst. Applications can get GROed packets as soon as
rte_gro_reassemble_burst returns.
rte_gro_reassemble is the main reassembly API in heavyweight mode and
tries to merge N input packets with the packets in the GRO reassembly
tables. For applications, performing GRO in heavyweight mode is
relatively complicated. Before performing GRO, applications need
to create a GRO context object, which keeps the reassembly tables of
the desired GRO types, with rte_gro_ctx_create. Then applications can use
rte_gro_reassemble to merge packets. The GROed packets stay in the
reassembly tables of the GRO context object. To retrieve them,
applications need to flush them manually with the flush API.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Seen with gcc 4.9.2:
drivers/crypto/scheduler/scheduler_multicore.c:286:2: error:
'for' loop initial declarations are only allowed in C99 or C11 mode
for (uint16_t i = 0; i < sched_ctx->nb_wc; i++)
^
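The error appears because declaring the loop variable inside the for
statement is a C99 feature, and gcc 4.9 defaults to a C89-based dialect.
A minimal sketch of the portable form, with the index declared at the
top of the block, is shown here. The function and its parameters are
hypothetical and only stand in for the loop in the scheduler.

```c
#include <stdint.h>

/* Hypothetical helper: sum one counter per worker core. The loop index
 * is declared before the loop, so this also compiles in C89/gnu89 mode. */
uint32_t toy_sum_counters(const uint32_t *counters, uint16_t nb_wc)
{
	uint16_t i;
	uint32_t total = 0;

	for (i = 0; i < nb_wc; i++)
		total += counters[i];

	return total;
}
```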
Fixes: 4c07e0552f ("crypto/scheduler: add multicore scheduling mode")
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Introduce a more versatile helper to parse device strings. This
helper expects a generic rte_devargs structure as storage in order not
to require API changes in the future, should this structure be
updated.
The old equivalent function is thus deprecated, as its API cannot
accommodate future changes to rte_devargs.
A deprecation notice is issued.
This new helper will parse bus information as well as device name and
device parameters. It does not allocate an rte_devargs structure and
expects one to be given as input.
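The calling convention described above (the caller provides the storage,
the helper only fills it) can be sketched as follows. This is an
illustration under stated assumptions and not the EAL implementation:
the toy_devargs structure, the toy_devargs_parse name and the mandatory
"bus:name[,args]" string layout are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a generic devargs structure. */
struct toy_devargs {
	char bus[16];
	char name[32];
	char args[64];
};

/*
 * Parse "bus:name[,args]" into caller-provided storage.
 * Returns 0 on success, -1 if the string is malformed or a field
 * does not fit. Nothing is allocated.
 */
int toy_devargs_parse(const char *dev, struct toy_devargs *da)
{
	const char *name, *args;
	size_t bus_len, name_len;

	if (dev == NULL || da == NULL)
		return -1;

	/* The bus part ends at the first ':'; a ',' first means no bus. */
	bus_len = strcspn(dev, ":,");
	if (bus_len == 0 || dev[bus_len] != ':')
		return -1;

	/* The device name runs up to the first ',' (it may contain ':'). */
	name = dev + bus_len + 1;
	name_len = strcspn(name, ",");
	args = (name[name_len] == ',') ? name + name_len + 1 : name + name_len;

	if (bus_len >= sizeof(da->bus) || name_len == 0 ||
	    name_len >= sizeof(da->name) || strlen(args) >= sizeof(da->args))
		return -1;

	memset(da, 0, sizeof(*da));
	memcpy(da->bus, dev, bus_len);
	memcpy(da->name, name, name_len);
	memcpy(da->args, args, strlen(args));
	return 0;
}
```

Because the structure is passed in by the caller, fields can be added to
it later without changing the function signature.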
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
rte_devargs now represents any device from any bus.
The related devtypes do not identify a bus anymore, only which scan
policy the device subscribes to.
The bus itself is identified by a bus handle previously introduced.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Remove the dependency of this subsystem upon bus specific device
representation.
Devargs only validates that a device declaration is correct and handled
by a bus. The device interpretation is done afterward within the bus.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Scan policies describe the way a bus should scan the system to search
for possible devices.
Three flags are introduced:
RTE_BUS_SCAN_UNDEFINED: Configuration is irrelevant for this bus
RTE_BUS_SCAN_WHITELIST: Scanning should be limited to declared devices
RTE_BUS_SCAN_BLACKLIST: Scanning should exclude only declared devices
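As a rough illustration of how a bus could apply these policies, the
sketch below decides whether a device should be probed. The toy_* names
are hypothetical, and probing every device when the mode is undefined is
an assumption made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the three scan policy flags. */
enum toy_scan_mode {
	TOY_SCAN_UNDEFINED,
	TOY_SCAN_WHITELIST,
	TOY_SCAN_BLACKLIST
};

/* Return 1 if the device `name` should be probed, 0 otherwise. */
int toy_should_probe(enum toy_scan_mode mode, const char *name,
		     const char *const *declared, size_t nb_declared)
{
	size_t i;
	int is_declared = 0;

	for (i = 0; i < nb_declared; i++) {
		if (strcmp(declared[i], name) == 0) {
			is_declared = 1;
			break;
		}
	}

	switch (mode) {
	case TOY_SCAN_WHITELIST:
		return is_declared;   /* only declared devices */
	case TOY_SCAN_BLACKLIST:
		return !is_declared;  /* everything but declared devices */
	case TOY_SCAN_UNDEFINED:
	default:
		return 1;             /* assumption: no restriction */
	}
}
```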
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Device kernel module is a device attribute.
It is used in generic device structures and must not be tied to a bus.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
With the devargs rework, rte_pci.h is no longer included by rte_ethdev.h
(via rte_devargs.h).
rte_ethtool_get_drvinfo() could use rte_devargs.name instead of
building an equivalent bus-specific name.
For now, this is worked around by simply including rte_pci.h.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Find which bus should be able to parse this device name into an internal
device representation.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This operation can be used both to validate that a device
representation can be understood by a bus and to store the resulting
specialized device representation in any format determined by the bus.
Implementing this function allows EAL initialization routines to infer
which bus should handle a device. This is used as a way to respect
backward compatibility.
This API will disappear once this compatibility is not enforced anymore.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
The bus name was stored with embedded double quotes.
This is because the bus name was given as a string in a macro,
which is not used elsewhere.
These macros are useless because the buses are drivers,
so they must not have any API for the application writer.
The registration can be done with a hardcoded value without quotes.
Not using macros for driver names has another (small) benefit:
the constructor function gets a meaningful name.
For instance, it was businitfn_PCI_BUS_NAME instead of businitfn_pci.
The bus registration macro is also changed to use
the new RTE_INIT_PRIO macro, similar to RTE_INIT used for other drivers.
The priority is the highest (101) in order to be sure that the bus driver
is registered before its device drivers.
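Both effects can be reproduced in a few lines. The sketch below is not
the EAL code: the TOY_* macros, the "pci" value of the name macro and
the function names are hypothetical, and the constructor attribute is a
GCC/Clang extension. It shows that stringifying a macro which already
expands to a string literal keeps the quotes, and that a constructor
with priority 101 runs before a constructor without priority.

```c
#define TOY_STR(x)  #x
#define TOY_XSTR(x) TOY_STR(x)

/* Old style: the name macro already expands to a string literal. */
#define TOY_BUS_NAME "pci"

/* Stringifying it again yields the 5 characters "pci", quotes included. */
const char *toy_quoted_name = TOY_XSTR(TOY_BUS_NAME);

/* New style: a bare token gives the expected 3-character name. */
const char *toy_plain_name = TOY_STR(pci);

/* Records the order in which the constructors below are run. */
int toy_init_order[2];
int toy_nb_init;

/* Device driver constructor: no priority, defined first on purpose. */
__attribute__((constructor))
static void toy_drvinitfn_net(void)
{
	toy_init_order[toy_nb_init++] = 2;
}

/* Bus constructor: 101 is the highest priority available to users,
 * so it runs before constructors that have no priority. */
__attribute__((constructor(101)))
static void toy_businitfn_pci(void)
{
	toy_init_order[toy_nb_init++] = 1;
}
```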
Fixes: 0fd1a0eaae ("pci: add bus driver")
Fixes: fea892e35f ("bus/vdev: use standard bus registration")
Fixes: 7e7df6d0a4 ("bus/fslmc: introduce fsl-mc bus driver")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
A separate boolean variable is not necessary when searching for
the starting point in find_device. Just use the passed argument
as its own flag value.
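The idea can be sketched on a simple linked list. The toy_* types and
functions below are hypothetical and do not reproduce the bus code. The
start pointer doubles as the "still skipping" flag: it is cleared once
the starting point has been passed.

```c
#include <stddef.h>
#include <string.h>

struct toy_device {
	const char *name;
	struct toy_device *next;
};

typedef int (*toy_dev_cmp_t)(const struct toy_device *dev, const void *data);

/* Example comparison callback: returns 0 when the name matches. */
int toy_cmp_name(const struct toy_device *dev, const void *data)
{
	return strcmp(dev->name, (const char *)data);
}

/*
 * Return the first device after `start` for which cmp() returns 0,
 * or the first match in the whole list when `start` is NULL.
 */
struct toy_device *toy_find_device(struct toy_device *head,
				   const struct toy_device *start,
				   toy_dev_cmp_t cmp, const void *data)
{
	struct toy_device *dev;

	for (dev = head; dev != NULL; dev = dev->next) {
		if (start != NULL) {
			/* Still before the starting point: skip, and
			 * clear the flag once `start` itself is seen. */
			if (dev == start)
				start = NULL;
			continue;
		}
		if (cmp(dev, data) == 0)
			return dev;
	}
	return NULL;
}
```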
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When adding items to a hash table with multiple threads,
a spinlock is used to prevent data corruption
(unless Transactional Memory is supported).
If there is a failure, the spinlock should be released,
but there were cases where that was not happening.
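A common way to avoid this class of bug is to route every exit path
through a single unlock label. The sketch below illustrates the pattern
with a C11 atomic_flag used as a spinlock. It is not the rte_hash code:
the toy_table structure and its functions are hypothetical.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_SLOTS 4

struct toy_table {
	atomic_flag lock;       /* spinlock protecting the fields below */
	int keys[TOY_SLOTS];
	size_t nb_used;
};

void toy_table_init(struct toy_table *t)
{
	atomic_flag_clear(&t->lock);
	t->nb_used = 0;
}

/* Add a key; returns 0, -EEXIST or -ENOSPC. The lock is released on
 * every path, including the failure paths. */
int toy_add_key(struct toy_table *t, int key)
{
	size_t i;
	int ret;

	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		; /* spin until the lock is free */

	for (i = 0; i < t->nb_used; i++) {
		if (t->keys[i] == key) {
			ret = -EEXIST;
			goto unlock;    /* failure: still unlock */
		}
	}
	if (t->nb_used == TOY_SLOTS) {
		ret = -ENOSPC;
		goto unlock;            /* failure: still unlock */
	}
	t->keys[t->nb_used++] = key;
	ret = 0;

unlock:
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
	return ret;
}
```

If one of the failure paths returned directly instead of jumping to the
unlock label, the next call on the same table would spin forever.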
Fixes: be856325cb ("hash: add scalable multi-writer insertion with Intel TSX")
Cc: stable@dpdk.org
Signed-off-by: Mike Stolarchuk <mike.stolarchuk@bigswitch.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Check that the numbers of Rx and Tx descriptors satisfy the descriptor
limits from the Ethernet device information; otherwise, adjust them to
the boundaries.
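The adjustment amounts to clamping the requested value to the reported
range. A minimal sketch, assuming a hypothetical toy_desc_lim structure
with only a minimum and a maximum, and treating a maximum of 0 as "no
upper limit" (an assumption of this example):

```c
#include <stdint.h>

/* Hypothetical descriptor limits, as reported by a device. */
struct toy_desc_lim {
	uint16_t nb_max;   /* largest supported ring size, 0 if unknown */
	uint16_t nb_min;   /* smallest supported ring size */
};

/* Return nb_desc if it is within the limits, else the nearest bound. */
uint16_t toy_adjust_nb_desc(uint16_t nb_desc, const struct toy_desc_lim *lim)
{
	if (lim->nb_max != 0 && nb_desc > lim->nb_max)
		return lim->nb_max;
	if (nb_desc < lim->nb_min)
		return lim->nb_min;
	return nb_desc;
}
```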
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Fix the documentation for fuzzy match and GRE.
Fixes: a3a2e2c8f7 ("ethdev: add fuzzy match in flow API")
Fixes: 7cd048321d ("ethdev: add MPLS and GRE flow API items")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This allows PMDs and applications to save flow rules in their generic
format for later processing. This is useful when rules cannot be applied
immediately, such as when the device is not properly initialized.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Replace the incorrect references to the "Cavium Networks" and "Cavium Ltd"
company names with the correct "Cavium, Inc" company name in
copyright headers.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Add documentation to describe usage of eventdev test application and
supported command line arguments.
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This is a performance test case that aims at testing the following:
1. Measure the number of events that can be processed in a second.
2. Measure the latency to forward an event.
The atq queue test functions the same as the "perf_queue" test.
The difference is that it uses an "all types queue" scheme instead of
separate queues for each stage. This reduces the number of queues
required to realize the use case and enables flow pinning, as the event
does not move to the next queue.
Example command to run perf "all types queue" test:
sudo build/app/dpdk-test-eventdev --vdev=event_octeontx -- \
--test=perf_atq --plcores=2 --wlcores=3 --stlist=p --nb_pkts=1000000000
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This is a performance test case that aims at testing the following:
1. Measure the number of events that can be processed in a second.
2. Measure the latency to forward an event.
The perf queue test configures the eventdev with Q queues and P ports,
where Q is nb_producers * nb_stages and P is nb_workers + nb_producers.
The user can choose the number of workers, the number of producers and
the number of stages through the --wlcores, --plcores and --stlist
application command line arguments respectively.
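The queue and port counts follow directly from these three values. A
small sketch of the arithmetic, using a hypothetical toy_perf_cfg
structure to hold the parsed options:

```c
/* Hypothetical holder for the parsed command line options. */
struct toy_perf_cfg {
	unsigned int nb_producers;  /* from --plcores */
	unsigned int nb_workers;    /* from --wlcores */
	unsigned int nb_stages;     /* from --stlist */
};

/* Q = nb_producers * nb_stages */
unsigned int toy_nb_queues(const struct toy_perf_cfg *cfg)
{
	return cfg->nb_producers * cfg->nb_stages;
}

/* P = nb_workers + nb_producers */
unsigned int toy_nb_ports(const struct toy_perf_cfg *cfg)
{
	return cfg->nb_workers + cfg->nb_producers;
}
```

For example, 2 producers, 3 workers and 4 stages give Q = 8 queues and
P = 5 ports.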
The producer(s) inject events into the eventdev based on the first stage
schedule type requested by the user through the --stlist command line
argument.
Based on the number of stages to process (selected through --stlist),
the application forwards the event to the next upstream queue and
terminates it when it reaches the last stage in the pipeline.
On event termination, the application increments the number of events
processed and prints it once per second to report the number of events
processed in one second.
When the --fwd_latency command line option is selected, the application
inserts a timestamp in the event at the first stage and then, on
termination, updates the number of cycles taken to forward a packet.
The application uses this value to compute the average latency to
forward a packet.
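The latency computation can be sketched as a plain average. The function
below is hypothetical and not taken from the test application. The
conversion from cycles to microseconds through the timer frequency is
an assumption of this example.

```c
#include <stdint.h>

/*
 * Average forward latency in microseconds.
 * total_cycles: sum over all terminated events of
 *               (cycles at termination - timestamp set at first stage)
 * nb_events:    number of terminated events
 * timer_hz:     frequency of the cycle counter, in Hz
 */
double toy_avg_latency_us(uint64_t total_cycles, uint64_t nb_events,
			  uint64_t timer_hz)
{
	double cycles_per_event;

	if (nb_events == 0 || timer_hz == 0)
		return 0.0;

	cycles_per_event = (double)total_cycles / (double)nb_events;
	return cycles_per_event * 1e6 / (double)timer_hz;
}
```

With 2,000,000 accumulated cycles over 1000 events and a 2 GHz counter,
this gives 2000 cycles per event, i.e. about 1 microsecond.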
Example command to run perf queue test:
sudo build/app/dpdk-test-eventdev --vdev=event_sw0 -- --test=perf_queue \
--slcore=1 --plcores=2 --wlcores=3 --stlist=p --nb_pkts=1000000000
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
The event producer, the master lcore's test termination and
the logic to print the mpps and latency are common to the
queue and all types queue tests.
Move them into common functions.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Set up one port per worker, link it to all queues, and set up
N producer ports to inject the events.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Add functions to create the mempool, destroy the mempool and print the
test result.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
The perf test has queue and all types queue variants.
Introduce test_perf_common* to share the common code between those tests.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This test verifies the same aspects as the order_queue test.
The difference is the number of queues used: this test
operates on a single "all types queue" (atq) instead of two
different queues for ordered and atomic.
Example command to run order all types queue test:
sudo build/app/dpdk-test-eventdev --vdev=event_octeontx -- \
--test=order_atq --plcores 1 --wlcores 2,3
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>