gro: cleanup

This patch cleans up the code as follows:
- use more appropriate names for internal structures, variables and functions
- update comments and the content of the gro programmer guide for better
  understanding
- remove needless checks and redundant comments

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Junjie Chen <junjie.j.chen@intel.com>
Jiayu Hu 2018-01-10 22:03:10 +08:00 committed by Thomas Monjalon
parent 50bdac5916
commit 1e4cf4d6d4
6 changed files with 642 additions and 422 deletions

@ -32,128 +32,154 @@ Generic Receive Offload Library
===============================
Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overhead. It gains performance
by reassembling small packets into large ones. To enable more flexibility
to applications, DPDK implements GRO as a standalone library. Applications
explicitly use the GRO library to merge small packets into large ones.
technique to reduce per-packet processing overheads. By reassembling
small packets into larger ones, GRO enables applications to process
fewer large packets directly, thus reducing the number of packets to
be processed. To benefit DPDK-based applications, like Open vSwitch,
DPDK also provides its own GRO implementation. In DPDK, GRO is implemented
as a standalone library. Applications explicitly use the GRO library to
reassemble packets.
The GRO library assumes all input packets have correct checksums. In
addition, the GRO library doesn't re-calculate checksums for merged
packets. If input packets are IP fragmented, the GRO library assumes
they are complete packets (i.e. with L4 headers).
Overview
--------
Currently, the GRO library implements TCP/IPv4 packet reassembly.
In the GRO library, there are many GRO types which are defined by packet
types. Each GRO type is in charge of processing one kind of packet. For
example, TCP/IPv4 GRO processes TCP/IPv4 packets.
Reassembly Modes
----------------
Each GRO type has a reassembly function, which defines its own algorithm and
table structure to reassemble packets. Input packets are assigned to the
corresponding GRO functions by MBUF->packet_type.
The GRO library provides two reassembly modes: lightweight and
heavyweight mode. If applications want to merge packets in a simple way,
they can use the lightweight mode API. If applications want more
fine-grained controls, they can choose the heavyweight mode API.
The GRO library doesn't check if input packets have correct checksums and
doesn't re-calculate checksums for merged packets. The GRO library
assumes the packets are complete (i.e., MF==0 && frag_off==0), when IP
fragmentation is possible (i.e., DF==0). Additionally, it requires IPv4
ID to be increased by one.
Lightweight Mode
~~~~~~~~~~~~~~~~
Currently, the GRO library provides GRO support for TCP/IPv4 packets.
The ``rte_gro_reassemble_burst()`` function is used for reassembly in
lightweight mode. It tries to merge N input packets at a time, where
N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
Two Sets of API
---------------
In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
reassembly tables for the desired GRO types. Note that the reassembly
table is a table structure used to reassemble packets and different GRO
types (e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly
tables to merge the N input packets.
For different usage scenarios, the GRO library provides two sets of API.
One is the lightweight mode API, which enables applications to
merge a small number of packets rapidly; the other is the
heavyweight mode API, which provides fine-grained control to
applications and supports merging a large number of packets.
For applications, performing GRO in lightweight mode is simple. They
just need to invoke ``rte_gro_reassemble_burst()``. Applications can get
GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.
Lightweight Mode API
~~~~~~~~~~~~~~~~~~~~
Heavyweight Mode
~~~~~~~~~~~~~~~~
The lightweight mode has only one function, ``rte_gro_reassemble_burst()``,
which processes N packets at a time. Using the lightweight mode API to
merge packets is very simple: calling ``rte_gro_reassemble_burst()`` is
enough. The GROed packets are returned to applications as soon as it
finishes.
The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
mode. Compared with the lightweight mode, performing GRO in heavyweight mode
is relatively complicated.
In ``rte_gro_reassemble_burst()``, table structures of different GRO
types are allocated on the stack. This design simplifies applications'
operations. However, limited by the stack size, the maximum number of
packets that ``rte_gro_reassemble_burst()`` can process in an invocation
should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
Before performing GRO, applications need to create a GRO context object
by calling ``rte_gro_ctx_create()``. A GRO context object holds the
reassembly tables of desired GRO types. Note that all update/lookup
operations on the context object are not thread safe. So if different
processes or threads want to access the same context object simultaneously,
some external syncing mechanisms must be used.
Heavyweight Mode API
~~~~~~~~~~~~~~~~~~~~
Once the GRO context is created, applications can then use the
``rte_gro_reassemble()`` function to merge packets. In each invocation,
``rte_gro_reassemble()`` tries to merge input packets with the packets
in the reassembly tables. If an input packet is an unsupported GRO type,
or other errors happen (e.g. SYN bit is set), ``rte_gro_reassemble()``
returns the packet to applications. Otherwise, the input packet is either
merged or inserted into a reassembly table.
Compared with the lightweight mode, using the heavyweight mode API is
relatively complex. First, applications need to create a GRO context
with ``rte_gro_ctx_create()``, which allocates table structures on the
heap and stores their pointers in the GRO context.
Second, applications use ``rte_gro_reassemble()`` to merge packets.
If input packets have invalid parameters, ``rte_gro_reassemble()``
returns them to applications. For example, packets of unsupported GRO
types or TCP SYN packets are returned. Otherwise, the input packets are
either merged with the existing packets in the tables or inserted into the
tables. Finally, applications use ``rte_gro_timeout_flush()`` to flush
packets from the tables when they want to get the GROed packets.
When applications want to get GRO processed packets, they need to use
``rte_gro_timeout_flush()`` to flush them from the tables manually.
Note that all update/lookup operations on the GRO context are not thread
safe. So if different processes or threads want to access the same
context object simultaneously, some external syncing mechanisms must be
used.
Reassembly Algorithm
--------------------
The reassembly algorithm is used for reassembling packets. In the GRO
library, different GRO types can use different algorithms. In this
section, we will introduce an algorithm, which is used by TCP/IPv4 GRO.
Challenges
~~~~~~~~~~
The reassembly algorithm determines the efficiency of GRO. There are two
challenges in the algorithm design:
- a high cost algorithm/implementation would cause packet dropping in a
high speed network.
- packet reordering makes it hard to merge packets. For example, Linux
GRO fails to merge packets when it encounters packet reordering.
The above two challenges require that our algorithm be:
- lightweight enough to scale to fast networking speeds
- capable of handling packet reordering
In DPDK GRO, we use a key-based algorithm to address the two challenges.
Key-based Reassembly Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:numref:`figure_gro-key-algorithm` illustrates the procedure of the
key-based algorithm. Packets are classified into "flows" by some header
fields (which we call the "key"). To process an input packet, the algorithm
first searches for a matched "flow" (i.e., the same value of key) for the
packet, then checks all packets in the "flow" and tries to find a
"neighbor" for it. If a "neighbor" is found, the two packets are merged
together. If no "neighbor" is found, the packet is stored into its "flow".
If no matched "flow" is found, a new "flow" is inserted and the packet is
stored into it.
.. note::
Packets in the same "flow" that can't merge are always caused
by packet reordering.
The key-based algorithm has two characteristics:
- classifying packets into "flows" to accelerate packet aggregation is
simple (addresses challenge 1).
- storing out-of-order packets makes it possible to merge them later
(addresses challenge 2).
.. _figure_gro-key-algorithm:
.. figure:: img/gro-key-algorithm.*
:align: center
Key-based Reassembly Algorithm
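The key-based procedure described above can be sketched in plain C. This is a minimal illustration under simplifying assumptions, not the library's implementation: the ``sketch_*`` names, the fixed table sizes, and the reduction of a packet to a single integer key plus a byte range are all hypothetical, and real packets are mbuf chains keyed on several header fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_FLOWS 4
#define SKETCH_MAX_ITEMS 8
#define SKETCH_INVALID   UINT32_MAX

/* Hypothetical, simplified packet: one key value plus a byte range. */
struct sketch_pkt {
	uint32_t key;  /* stands in for the header fields that form the key */
	uint32_t seq;  /* sequence number of the first payload byte */
	uint32_t len;  /* payload length in bytes */
};

struct sketch_item {
	uint32_t seq;
	uint32_t len;
	uint32_t next; /* next item of the same flow, or SKETCH_INVALID */
	int used;
};

struct sketch_flow {
	uint32_t key;
	uint32_t start; /* first item of the flow; SKETCH_INVALID if empty */
};

struct sketch_tbl {
	struct sketch_flow flows[SKETCH_MAX_FLOWS];
	struct sketch_item items[SKETCH_MAX_ITEMS];
};

enum sketch_result { SKETCH_MERGED, SKETCH_STORED, SKETCH_FULL };

void sketch_tbl_init(struct sketch_tbl *tbl)
{
	assert(tbl != NULL);
	memset(tbl, 0, sizeof(*tbl));
	for (uint32_t i = 0; i < SKETCH_MAX_FLOWS; i++)
		tbl->flows[i].start = SKETCH_INVALID;
}

/* Store a packet in the first free item slot and link it to 'next'. */
uint32_t sketch_store(struct sketch_tbl *tbl, const struct sketch_pkt *p,
		uint32_t next)
{
	for (uint32_t i = 0; i < SKETCH_MAX_ITEMS; i++) {
		if (!tbl->items[i].used) {
			tbl->items[i] = (struct sketch_item){ p->seq, p->len, next, 1 };
			return i;
		}
	}
	return SKETCH_INVALID;
}

enum sketch_result sketch_reassemble(struct sketch_tbl *tbl,
		const struct sketch_pkt *p)
{
	uint32_t f, idx, empty = SKETCH_INVALID;

	/* Step 1: search for a flow with the same key. */
	for (f = 0; f < SKETCH_MAX_FLOWS; f++) {
		if (tbl->flows[f].start == SKETCH_INVALID) {
			if (empty == SKETCH_INVALID)
				empty = f;
		} else if (tbl->flows[f].key == p->key) {
			break;
		}
	}

	if (f == SKETCH_MAX_FLOWS) {
		/* No matched flow: insert a new flow and store the packet. */
		if (empty == SKETCH_INVALID)
			return SKETCH_FULL;
		idx = sketch_store(tbl, p, SKETCH_INVALID);
		if (idx == SKETCH_INVALID)
			return SKETCH_FULL;
		tbl->flows[empty].key = p->key;
		tbl->flows[empty].start = idx;
		return SKETCH_STORED;
	}

	/* Step 2: walk the flow and look for a neighbor. */
	for (idx = tbl->flows[f].start; idx != SKETCH_INVALID;
			idx = tbl->items[idx].next) {
		struct sketch_item *it = &tbl->items[idx];

		if (p->seq == it->seq + it->len) {        /* append */
			it->len += p->len;
			return SKETCH_MERGED;
		}
		if (p->seq + p->len == it->seq) {         /* prepend */
			it->seq = p->seq;
			it->len += p->len;
			return SKETCH_MERGED;
		}
	}

	/* Step 3: no neighbor (reordering): keep the packet in its flow. */
	idx = sketch_store(tbl, p, tbl->flows[f].start);
	if (idx == SKETCH_INVALID)
		return SKETCH_FULL;
	tbl->flows[f].start = idx;
	return SKETCH_STORED;
}
```

In this sketch an out-of-order packet is kept in its flow, so a packet that arrives later and fills the gap can still be merged with it.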
TCP/IPv4 GRO
------------
TCP/IPv4 GRO supports merging small TCP/IPv4 packets into large ones,
using a table structure called the TCP/IPv4 reassembly table.
The table structure used by TCP/IPv4 GRO contains two arrays: flow array
and item array. The flow array keeps flow information, and the item array
keeps packet information.
TCP/IPv4 Reassembly Table
~~~~~~~~~~~~~~~~~~~~~~~~~
Header fields used to define a TCP/IPv4 flow include:
A TCP/IPv4 reassembly table includes a "key" array and an "item" array.
The key array keeps the criteria to merge packets and the item array
keeps the packet information.
- source and destination: Ethernet and IP address, TCP port
Each key in the key array points to an item group, which consists of
packets which have the same criteria values but can't be merged. A key
in the key array includes two parts:
- TCP acknowledgment number
* ``criteria``: the criteria to merge packets. If two packets can be
merged, they must have the same criteria values.
TCP/IPv4 packets whose FIN, SYN, RST, URG, PSH, ECE or CWR bit is set
won't be processed.
* ``start_index``: the item array index of the first packet in the item
group.
Header fields deciding if two packets are neighbors include:
Each element in the item array keeps the information of a packet. An item
in the item array mainly includes three parts:
- TCP sequence number
* ``firstseg``: the mbuf address of the first segment of the packet.
* ``lastseg``: the mbuf address of the last segment of the packet.
* ``next_pkt_index``: the item array index of the next packet in the same
item group. TCP/IPv4 GRO uses ``next_pkt_index`` to chain the packets
that have the same criteria value but can't be merged together.
Procedure to Reassemble a Packet
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reassembling an incoming packet takes three steps:
#. Check if the packet should be processed. Packets with one of the
following properties aren't processed and are returned immediately:
* FIN, SYN, RST, URG, PSH, ECE or CWR bit is set.
* L4 payload length is 0.
#. Traverse the key array to find a key which has the same criteria
value as the incoming packet. If found, go to the next step.
Otherwise, insert a new key and a new item for the packet.
#. Locate the first packet in the item group via ``start_index``. Then
traverse all packets in the item group via ``next_pkt_index``. If a
packet is found which can be merged with the incoming one, merge them
together. If one isn't found, insert the packet into this item group.
Note that to merge two packets is to link them together via mbuf's
``next`` field.
When packets are flushed from the reassembly table, TCP/IPv4 GRO updates
packet header fields for the merged packets. Note that before reassembling
the packet, TCP/IPv4 GRO doesn't check if the checksums of packets are
correct. Also, TCP/IPv4 GRO doesn't re-calculate checksums for merged
packets.
- IPv4 ID. The IPv4 ID fields of the packets should be increased by 1.
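The neighbor rule on the TCP sequence number and IPv4 ID can be illustrated with a small standalone check. This is only a sketch under stated assumptions: ``struct nb_item`` and ``nb_check()`` are hypothetical names, the stored item is assumed to record the sequence number of its first byte, the IPv4 ID of its last merged packet, its total payload length and the number of packets merged so far, and the explicit 16-bit casts (so that IDs wrap at 65535) are a choice of this sketch.

```c
#include <stdint.h>

/* Hypothetical summary of a packet already stored in a flow. */
struct nb_item {
	uint32_t sent_seq;    /* sequence number of the first payload byte */
	uint16_t ip_id;       /* IPv4 ID of the last packet merged in */
	uint16_t nb_merged;   /* number of packets merged into this item */
	uint16_t payload_len; /* total TCP payload length of the item */
};

/*
 * Return 1 if the new packet directly follows the item (append),
 * -1 if it directly precedes the item (prepend), 0 otherwise.
 */
int nb_check(const struct nb_item *item, uint32_t sent_seq,
		uint16_t ip_id, uint16_t tcp_dl)
{
	/* The new packet starts where the item ends, and its ID is one larger. */
	if (sent_seq == item->sent_seq + item->payload_len &&
			ip_id == (uint16_t)(item->ip_id + 1))
		return 1;

	/* The new packet ends where the item starts, and its ID is one smaller
	 * than the ID of the item's first packet. */
	if (sent_seq + tcp_dl == item->sent_seq &&
			(uint16_t)(ip_id + item->nb_merged) == item->ip_id)
		return -1;

	return 0;
}
```

For example, with an item covering bytes 1000 to 1099 and IPv4 ID 7, a packet starting at 1100 with ID 8 is an append neighbor, while a packet starting at 1100 with ID 9 is not a neighbor.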

@ -0,0 +1,223 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd">
<!-- Generated by Microsoft Visio 11.0, SVG Export, v1.0 gro-key-algorithm.svg Page-1 -->
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:ev="http://www.w3.org/2001/xml-events"
xmlns:v="http://schemas.microsoft.com/visio/2003/SVGExtensions/" width="6.06163in" height="2.66319in"
viewBox="0 0 436.438 191.75" xml:space="preserve" color-interpolation-filters="sRGB" class="st10">
<v:documentProperties v:langID="1033" v:viewMarkup="false"/>
<style type="text/css">
<![CDATA[
.st1 {fill:url(#grad30-4);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}
.st2 {fill:#000000;font-family:Calibri;font-size:1.00001em}
.st3 {font-size:1em;font-weight:bold}
.st4 {fill:#000000;font-family:Calibri;font-size:1.00001em;font-weight:bold}
.st5 {font-size:1em;font-weight:normal}
.st6 {marker-end:url(#mrkr5-38);stroke:#404040;stroke-linecap:round;stroke-linejoin:round;stroke-width:1}
.st7 {fill:#404040;fill-opacity:1;stroke:#404040;stroke-opacity:1;stroke-width:0.28409090909091}
.st8 {fill:none;stroke:none;stroke-linecap:round;stroke-linejoin:round;stroke-width:0.25}
.st9 {fill:#000000;font-family:Calibri;font-size:0.833336em}
.st10 {fill:none;fill-rule:evenodd;font-size:12px;overflow:visible;stroke-linecap:square;stroke-miterlimit:3}
]]>
</style>
<defs id="Patterns_And_Gradients">
<linearGradient id="grad30-4" v:fillPattern="30" v:foreground="#c6d09f" v:background="#d1dab4" x1="0" y1="1" x2="0" y2="0">
<stop offset="0" style="stop-color:#c6d09f;stop-opacity:1"/>
<stop offset="1" style="stop-color:#d1dab4;stop-opacity:1"/>
</linearGradient>
<linearGradient id="grad30-35" v:fillPattern="30" v:foreground="#f0f0f0" v:background="#ffffff" x1="0" y1="1" x2="0" y2="0">
<stop offset="0" style="stop-color:#f0f0f0;stop-opacity:1"/>
<stop offset="1" style="stop-color:#ffffff;stop-opacity:1"/>
</linearGradient>
</defs>
<defs id="Markers">
<g id="lend5">
<path d="M 2 1 L 0 0 L 1.98117 -0.993387 C 1.67173 -0.364515 1.67301 0.372641 1.98465 1.00043 " style="stroke:none"/>
</g>
<marker id="mrkr5-38" class="st7" v:arrowType="5" v:arrowSize="2" v:setback="6.16" refX="-6.16" orient="auto"
markerUnits="strokeWidth" overflow="visible">
<use xlink:href="#lend5" transform="scale(-3.52,-3.52) "/>
</marker>
</defs>
<g v:mID="0" v:index="1" v:groupContext="foregroundPage">
<title>Page-1</title>
<v:pageProperties v:drawingScale="1" v:pageScale="1" v:drawingUnits="0" v:shadowOffsetX="9" v:shadowOffsetY="-9"/>
<v:layer v:name="Connector" v:index="0"/>
<g id="shape1-1" v:mID="1" v:groupContext="shape" transform="translate(0.25,-117.25)">
<title>Rounded rectangle</title>
<desc>Categorize into an existed “flow”</desc>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="90" cy="173.75" width="180" height="36"/>
<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
class="st1"/>
<text x="8.91" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Categorize into an <tspan
class="st3">existed</tspan><tspan class="st3" v:langID="2052"> </tspan><tspan class="st3">flow</tspan></text> </g>
<g id="shape2-9" v:mID="2" v:groupContext="shape" transform="translate(0.25,-58.75)">
<title>Rounded rectangle.2</title>
<desc>Search for a “neighbor”</desc>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="90" cy="173.75" width="180" height="36"/>
<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
class="st1"/>
<text x="32.19" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Search for a “<tspan
class="st3">neighbor</tspan></text> </g>
<g id="shape3-14" v:mID="3" v:groupContext="shape" transform="translate(225.813,-117.25)">
<title>Rounded rectangle.3</title>
<desc>Insert a new “flow” and store the packet</desc>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="105.188" cy="173.75" width="210.38" height="36"/>
<path d="M201.37 191.75 A9.00007 9.00007 -180 0 0 210.37 182.75 L210.37 164.75 A9.00007 9.00007 -180 0 0 201.37 155.75
L9 155.75 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L201.37 191.75
Z" class="st1"/>
<text x="5.45" y="177.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Insert a <tspan
class="st3">new </tspan><tspan class="st3">flow</tspan>” and <tspan class="st3">store </tspan>the packet</text> </g>
<g id="shape4-21" v:mID="4" v:groupContext="shape" transform="translate(225.25,-58.75)">
<title>Rounded rectangle.4</title>
<desc>Store the packet</desc>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="83.25" cy="173.75" width="166.5" height="36"/>
<path d="M157.5 191.75 A9.00007 9.00007 -180 0 0 166.5 182.75 L166.5 164.75 A9.00007 9.00007 -180 0 0 157.5 155.75 L9
155.75 A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L157.5 191.75 Z"
class="st1"/>
<text x="42.81" y="177.35" class="st4" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Store <tspan
class="st5">the packet</tspan></text> </g>
<g id="shape5-26" v:mID="5" v:groupContext="shape" transform="translate(0.25,-0.25)">
<title>Rounded rectangle.5</title>
<desc>Merge the packet</desc>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="90" cy="173.75" width="180" height="36"/>
<path d="M171 191.75 A9.00007 9.00007 -180 0 0 180 182.75 L180 164.75 A9.00007 9.00007 -180 0 0 171 155.75 L9 155.75
A9.00007 9.00007 -180 0 0 -0 164.75 L0 182.75 A9.00007 9.00007 -180 0 0 9 191.75 L171 191.75 Z"
class="st1"/>
<text x="46.59" y="177.35" class="st4" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>Merge <tspan
class="st5">the packet</tspan></text> </g>
<g id="shape6-31" v:mID="6" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-175.75)">
<title>Dynamic connector</title>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<path d="M9 191.75 L9 208.09" class="st6"/>
</g>
<g id="shape7-39" v:mID="7" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-117.25)">
<title>Dynamic connector.7</title>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<path d="M9 191.75 L9 208.09" class="st6"/>
</g>
<g id="shape8-45" v:mID="8" v:groupContext="shape" v:layerMember="0" transform="translate(81.25,-58.75)">
<title>Dynamic connector.8</title>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<path d="M9 191.75 L9 208.09" class="st6"/>
</g>
<g id="shape9-51" v:mID="9" v:groupContext="shape" v:layerMember="0" transform="translate(180.25,-126.25)">
<title>Dynamic connector.9</title>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<path d="M0 182.75 L39.4 182.75" class="st6"/>
</g>
<g id="shape10-57" v:mID="10" v:groupContext="shape" v:layerMember="0" transform="translate(180.25,-67.75)">
<title>Dynamic connector.10</title>
<v:userDefs>
<v:ud v:nameU="visVersion" v:val="VT0(14):26"/>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<path d="M0 182.75 L38.84 182.75" class="st6"/>
</g>
<g id="shape11-63" v:mID="11" v:groupContext="shape" transform="translate(65.5,-173.5)">
<title>Sheet.11</title>
<desc>packet</desc>
<v:userDefs>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="24.75" cy="182.75" width="49.5" height="18"/>
<rect x="0" y="173.75" width="49.5" height="18" class="st8"/>
<text x="8.46" y="186.35" class="st2" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>packet</text> </g>
<g id="shape14-66" v:mID="14" v:groupContext="shape" transform="translate(98.125,-98.125)">
<title>Sheet.14</title>
<desc>find a “flow”</desc>
<v:userDefs>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="32.0625" cy="183.875" width="64.13" height="15.75"/>
<rect x="0" y="176" width="64.125" height="15.75" class="st8"/>
<text x="6.41" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a “flow”</text> </g>
<g id="shape15-69" v:mID="15" v:groupContext="shape" transform="translate(99.25,-39.625)">
<title>Sheet.15</title>
<desc>find a “neighbor”</desc>
<v:userDefs>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="40.5" cy="183.875" width="81" height="15.75"/>
<rect x="0" y="176" width="81" height="15.75" class="st8"/>
<text x="5.48" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>find a “neighbor”</text> </g>
<g id="shape13-72" v:mID="13" v:groupContext="shape" transform="translate(181.375,-79)">
<title>Sheet.13</title>
<desc>not find</desc>
<v:userDefs>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="21.375" cy="183.875" width="42.75" height="15.75"/>
<rect x="0" y="176" width="42.75" height="15.75" class="st8"/>
<text x="5.38" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text> </g>
<g id="shape12-75" v:mID="12" v:groupContext="shape" transform="translate(181.375,-137.5)">
<title>Sheet.12</title>
<desc>not find</desc>
<v:userDefs>
<v:ud v:nameU="msvThemeColors" v:val="VT0(36):26"/>
<v:ud v:nameU="msvThemeEffects" v:val="VT0(16):26"/>
</v:userDefs>
<v:textBlock v:margins="rect(4,4,4,4)"/>
<v:textRect cx="21.375" cy="183.875" width="42.75" height="15.75"/>
<rect x="0" y="176" width="42.75" height="15.75" class="st8"/>
<text x="5.38" y="186.88" class="st9" v:langID="1033"><v:paragraph v:horizAlign="1"/><v:tabList/>not find</text> </g>
</g>
</svg>

@ -44,20 +44,20 @@ gro_tcp4_tbl_create(uint16_t socket_id,
}
tbl->max_item_num = entries_num;
size = sizeof(struct gro_tcp4_key) * entries_num;
tbl->keys = rte_zmalloc_socket(__func__,
size = sizeof(struct gro_tcp4_flow) * entries_num;
tbl->flows = rte_zmalloc_socket(__func__,
size,
RTE_CACHE_LINE_SIZE,
socket_id);
if (tbl->keys == NULL) {
if (tbl->flows == NULL) {
rte_free(tbl->items);
rte_free(tbl);
return NULL;
}
/* INVALID_ARRAY_INDEX indicates empty key */
/* INVALID_ARRAY_INDEX indicates an empty flow */
for (i = 0; i < entries_num; i++)
tbl->keys[i].start_index = INVALID_ARRAY_INDEX;
tbl->max_key_num = entries_num;
tbl->flows[i].start_index = INVALID_ARRAY_INDEX;
tbl->max_flow_num = entries_num;
return tbl;
}
@ -69,7 +69,7 @@ gro_tcp4_tbl_destroy(void *tbl)
if (tcp_tbl) {
rte_free(tcp_tbl->items);
rte_free(tcp_tbl->keys);
rte_free(tcp_tbl->flows);
}
rte_free(tcp_tbl);
}
@ -81,50 +81,46 @@ gro_tcp4_tbl_destroy(void *tbl)
* the original packet.
*/
static inline int
merge_two_tcp4_packets(struct gro_tcp4_item *item_src,
merge_two_tcp4_packets(struct gro_tcp4_item *item,
struct rte_mbuf *pkt,
uint16_t ip_id,
int cmp,
uint32_t sent_seq,
int cmp)
uint16_t ip_id)
{
struct rte_mbuf *pkt_head, *pkt_tail, *lastseg;
uint16_t tcp_datalen;
uint16_t hdr_len;
if (cmp > 0) {
pkt_head = item_src->firstseg;
pkt_head = item->firstseg;
pkt_tail = pkt;
} else {
pkt_head = pkt;
pkt_tail = item_src->firstseg;
pkt_tail = item->firstseg;
}
/* check if the packet length will be beyond the max value */
tcp_datalen = pkt_tail->pkt_len - pkt_tail->l2_len -
pkt_tail->l3_len - pkt_tail->l4_len;
if (pkt_head->pkt_len - pkt_head->l2_len + tcp_datalen >
TCP4_MAX_L3_LENGTH)
/* check if the IPv4 packet length is greater than the max value */
hdr_len = pkt_head->l2_len + pkt_head->l3_len + pkt_head->l4_len;
if (unlikely(pkt_head->pkt_len - pkt_head->l2_len + pkt_tail->pkt_len -
hdr_len > MAX_IPV4_PKT_LENGTH))
return 0;
/* remove packet header for the tail packet */
rte_pktmbuf_adj(pkt_tail,
pkt_tail->l2_len +
pkt_tail->l3_len +
pkt_tail->l4_len);
/* remove the packet header for the tail packet */
rte_pktmbuf_adj(pkt_tail, hdr_len);
/* chain two packets together */
if (cmp > 0) {
item_src->lastseg->next = pkt;
item_src->lastseg = rte_pktmbuf_lastseg(pkt);
item->lastseg->next = pkt;
item->lastseg = rte_pktmbuf_lastseg(pkt);
/* update IP ID to the larger value */
item_src->ip_id = ip_id;
item->ip_id = ip_id;
} else {
lastseg = rte_pktmbuf_lastseg(pkt);
lastseg->next = item_src->firstseg;
item_src->firstseg = pkt;
lastseg->next = item->firstseg;
item->firstseg = pkt;
/* update sent_seq to the smaller value */
item_src->sent_seq = sent_seq;
item->sent_seq = sent_seq;
}
item_src->nb_merged++;
item->nb_merged++;
/* update mbuf metadata for the merged packet */
pkt_head->nb_segs += pkt_tail->nb_segs;
@ -133,45 +129,46 @@ merge_two_tcp4_packets(struct gro_tcp4_item *item_src,
return 1;
}
/*
* Check if two TCP/IPv4 packets are neighbors.
*/
static inline int
check_seq_option(struct gro_tcp4_item *item,
struct tcp_hdr *tcp_hdr,
uint16_t tcp_hl,
uint16_t tcp_dl,
struct tcp_hdr *tcph,
uint32_t sent_seq,
uint16_t ip_id,
uint32_t sent_seq)
uint16_t tcp_hl,
uint16_t tcp_dl)
{
struct rte_mbuf *pkt0 = item->firstseg;
struct ipv4_hdr *ipv4_hdr0;
struct tcp_hdr *tcp_hdr0;
uint16_t tcp_hl0, tcp_dl0;
uint16_t len;
struct rte_mbuf *pkt_orig = item->firstseg;
struct ipv4_hdr *iph_orig;
struct tcp_hdr *tcph_orig;
uint16_t len, tcp_hl_orig;
ipv4_hdr0 = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt0, char *) +
pkt0->l2_len);
tcp_hdr0 = (struct tcp_hdr *)((char *)ipv4_hdr0 + pkt0->l3_len);
tcp_hl0 = pkt0->l4_len;
iph_orig = (struct ipv4_hdr *)(rte_pktmbuf_mtod(pkt_orig, char *) +
pkt_orig->l2_len);
tcph_orig = (struct tcp_hdr *)((char *)iph_orig + pkt_orig->l3_len);
tcp_hl_orig = pkt_orig->l4_len;
/* check if TCP option fields equal. If not, return 0. */
len = RTE_MAX(tcp_hl, tcp_hl0) - sizeof(struct tcp_hdr);
if ((tcp_hl != tcp_hl0) ||
((len > 0) && (memcmp(tcp_hdr + 1,
tcp_hdr0 + 1,
/* Check if TCP option fields equal */
len = RTE_MAX(tcp_hl, tcp_hl_orig) - sizeof(struct tcp_hdr);
if ((tcp_hl != tcp_hl_orig) ||
((len > 0) && (memcmp(tcph + 1, tcph_orig + 1,
len) != 0)))
return 0;
/* check if the two packets are neighbors */
tcp_dl0 = pkt0->pkt_len - pkt0->l2_len - pkt0->l3_len - tcp_hl0;
if ((sent_seq == (item->sent_seq + tcp_dl0)) &&
(ip_id == (item->ip_id + 1)))
len = pkt_orig->pkt_len - pkt_orig->l2_len - pkt_orig->l3_len -
tcp_hl_orig;
if ((sent_seq == item->sent_seq + len) && (ip_id == item->ip_id + 1))
/* append the new packet */
return 1;
else if (((sent_seq + tcp_dl) == item->sent_seq) &&
((ip_id + item->nb_merged) == item->ip_id))
else if ((sent_seq + tcp_dl == item->sent_seq) &&
(ip_id + item->nb_merged == item->ip_id))
/* pre-pend the new packet */
return -1;
else
return 0;
return 0;
}
static inline uint32_t
@ -187,13 +184,13 @@ find_an_empty_item(struct gro_tcp4_tbl *tbl)
}
static inline uint32_t
find_an_empty_key(struct gro_tcp4_tbl *tbl)
find_an_empty_flow(struct gro_tcp4_tbl *tbl)
{
uint32_t i;
uint32_t max_key_num = tbl->max_key_num;
uint32_t max_flow_num = tbl->max_flow_num;
for (i = 0; i < max_key_num; i++)
if (tbl->keys[i].start_index == INVALID_ARRAY_INDEX)
for (i = 0; i < max_flow_num; i++)
if (tbl->flows[i].start_index == INVALID_ARRAY_INDEX)
return i;
return INVALID_ARRAY_INDEX;
}
@ -201,10 +198,10 @@ find_an_empty_key(struct gro_tcp4_tbl *tbl)
static inline uint32_t
insert_new_item(struct gro_tcp4_tbl *tbl,
struct rte_mbuf *pkt,
uint16_t ip_id,
uint32_t sent_seq,
uint64_t start_time,
uint32_t prev_idx,
uint64_t start_time)
uint32_t sent_seq,
uint16_t ip_id)
{
uint32_t item_idx;
@ -221,7 +218,7 @@ insert_new_item(struct gro_tcp4_tbl *tbl,
tbl->items[item_idx].nb_merged = 1;
tbl->item_num++;
/* if the previous packet exists, chain the new one with it */
/* if the previous packet exists, chain them together. */
if (prev_idx != INVALID_ARRAY_INDEX) {
tbl->items[item_idx].next_pkt_idx =
tbl->items[prev_idx].next_pkt_idx;
@ -237,7 +234,7 @@ delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,
{
uint32_t next_idx = tbl->items[item_idx].next_pkt_idx;
/* set NULL to firstseg to indicate it's an empty item */
/* NULL indicates an empty item */
tbl->items[item_idx].firstseg = NULL;
tbl->item_num--;
if (prev_item_idx != INVALID_ARRAY_INDEX)
@ -247,44 +244,42 @@ delete_item(struct gro_tcp4_tbl *tbl, uint32_t item_idx,
}
static inline uint32_t
insert_new_key(struct gro_tcp4_tbl *tbl,
struct tcp4_key *key_src,
insert_new_flow(struct gro_tcp4_tbl *tbl,
struct tcp4_flow_key *src,
uint32_t item_idx)
{
struct tcp4_key *key_dst;
uint32_t key_idx;
struct tcp4_flow_key *dst;
uint32_t flow_idx;
key_idx = find_an_empty_key(tbl);
if (key_idx == INVALID_ARRAY_INDEX)
flow_idx = find_an_empty_flow(tbl);
if (unlikely(flow_idx == INVALID_ARRAY_INDEX))
return INVALID_ARRAY_INDEX;
key_dst = &(tbl->keys[key_idx].key);
dst = &(tbl->flows[flow_idx].key);
ether_addr_copy(&(key_src->eth_saddr), &(key_dst->eth_saddr));
ether_addr_copy(&(key_src->eth_daddr), &(key_dst->eth_daddr));
key_dst->ip_src_addr = key_src->ip_src_addr;
key_dst->ip_dst_addr = key_src->ip_dst_addr;
key_dst->recv_ack = key_src->recv_ack;
key_dst->src_port = key_src->src_port;
key_dst->dst_port = key_src->dst_port;
ether_addr_copy(&(src->eth_saddr), &(dst->eth_saddr));
ether_addr_copy(&(src->eth_daddr), &(dst->eth_daddr));
dst->ip_src_addr = src->ip_src_addr;
dst->ip_dst_addr = src->ip_dst_addr;
dst->recv_ack = src->recv_ack;
dst->src_port = src->src_port;
dst->dst_port = src->dst_port;
/* non-INVALID_ARRAY_INDEX value indicates this key is valid */
tbl->keys[key_idx].start_index = item_idx;
tbl->key_num++;
tbl->flows[flow_idx].start_index = item_idx;
tbl->flow_num++;
return key_idx;
return flow_idx;
}
/*
* Check if two TCP/IPv4 packets belong to the same flow.
*/
static inline int
is_same_key(struct tcp4_key k1, struct tcp4_key k2)
is_same_tcp4_flow(struct tcp4_flow_key k1, struct tcp4_flow_key k2)
{
if (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) == 0)
return 0;
if (is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) == 0)
return 0;
return ((k1.ip_src_addr == k2.ip_src_addr) &&
return (is_same_ether_addr(&k1.eth_saddr, &k2.eth_saddr) &&
is_same_ether_addr(&k1.eth_daddr, &k2.eth_daddr) &&
(k1.ip_src_addr == k2.ip_src_addr) &&
(k1.ip_dst_addr == k2.ip_dst_addr) &&
(k1.recv_ack == k2.recv_ack) &&
(k1.src_port == k2.src_port) &&
@ -292,7 +287,7 @@ is_same_key(struct tcp4_key k1, struct tcp4_key k2)
}
/*
* update packet length for the flushed packet.
* update the packet length for the flushed packet.
*/
static inline void
update_header(struct gro_tcp4_item *item)
@ -315,27 +310,31 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct ipv4_hdr *ipv4_hdr;
struct tcp_hdr *tcp_hdr;
uint32_t sent_seq;
uint16_t tcp_dl, ip_id;
uint16_t tcp_dl, ip_id, hdr_len;
struct tcp4_key key;
struct tcp4_flow_key key;
uint32_t cur_idx, prev_idx, item_idx;
uint32_t i, max_key_num;
uint32_t i, max_flow_num, remaining_flow_num;
int cmp;
uint8_t find;
eth_hdr = rte_pktmbuf_mtod(pkt, struct ether_hdr *);
ipv4_hdr = (struct ipv4_hdr *)((char *)eth_hdr + pkt->l2_len);
tcp_hdr = (struct tcp_hdr *)((char *)ipv4_hdr + pkt->l3_len);
hdr_len = pkt->l2_len + pkt->l3_len + pkt->l4_len;
/*
* if FIN, SYN, RST, PSH, URG, ECE or
* CWR is set, return immediately.
* Don't process the packet which has FIN, SYN, RST, PSH, URG, ECE
* or CWR set.
*/
if (tcp_hdr->tcp_flags != TCP_ACK_FLAG)
return -1;
/* if payload length is 0, return immediately */
tcp_dl = rte_be_to_cpu_16(ipv4_hdr->total_length) - pkt->l3_len -
pkt->l4_len;
if (tcp_dl == 0)
/*
* Don't process the packet whose payload length is less than or
* equal to 0.
*/
tcp_dl = pkt->pkt_len - hdr_len;
if (tcp_dl <= 0)
return -1;
ip_id = rte_be_to_cpu_16(ipv4_hdr->packet_id);
@ -349,25 +348,34 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
key.dst_port = tcp_hdr->dst_port;
key.recv_ack = tcp_hdr->recv_ack;
/* search for a key */
max_key_num = tbl->max_key_num;
for (i = 0; i < max_key_num; i++) {
if ((tbl->keys[i].start_index != INVALID_ARRAY_INDEX) &&
is_same_key(tbl->keys[i].key, key))
break;
/* Search for a matched flow. */
max_flow_num = tbl->max_flow_num;
remaining_flow_num = tbl->flow_num;
find = 0;
for (i = 0; i < max_flow_num && remaining_flow_num; i++) {
if (tbl->flows[i].start_index != INVALID_ARRAY_INDEX) {
if (is_same_tcp4_flow(tbl->flows[i].key, key)) {
find = 1;
break;
}
remaining_flow_num--;
}
}
/* can't find a key, so insert a new key and a new item. */
if (i == tbl->max_key_num) {
item_idx = insert_new_item(tbl, pkt, ip_id, sent_seq,
INVALID_ARRAY_INDEX, start_time);
/*
* Fail to find a matched flow. Insert a new flow and store the
* packet into the flow.
*/
if (find == 0) {
item_idx = insert_new_item(tbl, pkt, start_time,
INVALID_ARRAY_INDEX, sent_seq, ip_id);
if (item_idx == INVALID_ARRAY_INDEX)
return -1;
if (insert_new_key(tbl, &key, item_idx) ==
if (insert_new_flow(tbl, &key, item_idx) ==
INVALID_ARRAY_INDEX) {
/*
* fail to insert a new key, so
* delete the inserted item
* Fail to insert a new flow, so delete the
* stored packet.
*/
delete_item(tbl, item_idx, INVALID_ARRAY_INDEX);
return -1;
@ -375,24 +383,26 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
return 0;
}
/* traverse all packets in the item group to find one to merge */
cur_idx = tbl->keys[i].start_index;
/*
* Check all packets in the flow and try to find a neighbor for
* the input packet.
*/
cur_idx = tbl->flows[i].start_index;
prev_idx = cur_idx;
do {
cmp = check_seq_option(&(tbl->items[cur_idx]), tcp_hdr,
pkt->l4_len, tcp_dl, ip_id, sent_seq);
sent_seq, ip_id, pkt->l4_len, tcp_dl);
if (cmp) {
if (merge_two_tcp4_packets(&(tbl->items[cur_idx]),
pkt, ip_id,
sent_seq, cmp))
pkt, cmp, sent_seq, ip_id))
return 1;
/*
* fail to merge two packets since the packet
* length will be greater than the max value.
* So insert the packet into the item group.
* Fail to merge the two packets, as the packet
* length is greater than the max value. Store
* the packet into the flow.
*/
if (insert_new_item(tbl, pkt, ip_id, sent_seq,
prev_idx, start_time) ==
if (insert_new_item(tbl, pkt, start_time, prev_idx,
sent_seq, ip_id) ==
INVALID_ARRAY_INDEX)
return -1;
return 0;
@ -401,12 +411,9 @@ gro_tcp4_reassemble(struct rte_mbuf *pkt,
cur_idx = tbl->items[cur_idx].next_pkt_idx;
} while (cur_idx != INVALID_ARRAY_INDEX);
/*
* can't find a packet in the item group to merge,
* so insert the packet into the item group.
*/
if (insert_new_item(tbl, pkt, ip_id, sent_seq, prev_idx,
start_time) == INVALID_ARRAY_INDEX)
/* Fail to find a neighbor, so store the packet into the flow. */
if (insert_new_item(tbl, pkt, start_time, prev_idx, sent_seq,
ip_id) == INVALID_ARRAY_INDEX)
return -1;
return 0;
@ -420,44 +427,33 @@ gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
{
uint16_t k = 0;
uint32_t i, j;
uint32_t max_key_num = tbl->max_key_num;
uint32_t max_flow_num = tbl->max_flow_num;
for (i = 0; i < max_key_num; i++) {
/* all keys have been checked, return immediately */
if (tbl->key_num == 0)
for (i = 0; i < max_flow_num; i++) {
if (unlikely(tbl->flow_num == 0))
return k;
j = tbl->keys[i].start_index;
j = tbl->flows[i].start_index;
while (j != INVALID_ARRAY_INDEX) {
if (tbl->items[j].start_time <= flush_timestamp) {
out[k++] = tbl->items[j].firstseg;
if (tbl->items[j].nb_merged > 1)
update_header(&(tbl->items[j]));
/*
* delete the item and get
* the next packet index
* Delete the packet and get the next
* packet in the flow.
*/
j = delete_item(tbl, j,
INVALID_ARRAY_INDEX);
j = delete_item(tbl, j, INVALID_ARRAY_INDEX);
tbl->flows[i].start_index = j;
if (j == INVALID_ARRAY_INDEX)
tbl->flow_num--;
/*
* delete the key as all of
* packets are flushed
*/
if (j == INVALID_ARRAY_INDEX) {
tbl->keys[i].start_index =
INVALID_ARRAY_INDEX;
tbl->key_num--;
} else
/* update start_index of the key */
tbl->keys[i].start_index = j;
if (k == nb_out)
if (unlikely(k == nb_out))
return k;
} else
/*
* left packets of this key won't be
* timeout, so go to check other keys.
* The remaining packets in this flow haven't
* timed out. Go on to check other flows.
*/
break;
}
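Note on the payload-length check in gro_tcp4_reassemble() above: tcp_dl is declared as uint16_t in this hunk, so `tcp_dl <= 0` can only be true when tcp_dl is exactly 0, and a pkt_len smaller than hdr_len would wrap around rather than become negative. The following is a minimal standalone sketch of a guarded computation. demo_tcp_payload_len() is a hypothetical helper written for illustration; it is not part of the GRO library or of this patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not DPDK API): return the TCP payload length,
 * or -1 if the headers cover the whole packet or exceed pkt_len.
 * Comparing before subtracting avoids unsigned wrap-around.
 */
static int64_t
demo_tcp_payload_len(uint32_t pkt_len, uint16_t l2_len, uint16_t l3_len,
		uint16_t l4_len)
{
	uint32_t hdr_len = (uint32_t)l2_len + l3_len + l4_len;

	if (pkt_len <= hdr_len)
		return -1;
	return (int64_t)(pkt_len - hdr_len);
}
```

A caller would treat a negative return value the same way the patch treats an empty payload, that is, leave the packet unprocessed.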


@ -9,13 +9,13 @@
#define GRO_TCP4_TBL_MAX_ITEM_NUM (1024UL * 1024UL)
/*
* the max L3 length of a TCP/IPv4 packet. The L3 length
* is the sum of ipv4 header, tcp header and L4 payload.
* The max length of an IPv4 packet, which includes the length of the L3
* header, the L4 header and the data payload.
*/
#define TCP4_MAX_L3_LENGTH UINT16_MAX
#define MAX_IPV4_PKT_LENGTH UINT16_MAX
/* criteria of mergeing packets */
struct tcp4_key {
/* Header fields representing a TCP/IPv4 flow */
struct tcp4_flow_key {
struct ether_addr eth_saddr;
struct ether_addr eth_daddr;
uint32_t ip_src_addr;
@ -26,41 +26,38 @@ struct tcp4_key {
uint16_t dst_port;
};
struct gro_tcp4_key {
struct tcp4_key key;
struct gro_tcp4_flow {
struct tcp4_flow_key key;
/*
* the index of the first packet in the item group.
* If the value is INVALID_ARRAY_INDEX, it means
* the key is empty.
* The index of the first packet in the flow.
* INVALID_ARRAY_INDEX indicates an empty flow.
*/
uint32_t start_index;
};
struct gro_tcp4_item {
/*
* first segment of the packet. If the value
* The first MBUF segment of the packet. If the value
* is NULL, it means the item is empty.
*/
struct rte_mbuf *firstseg;
/* last segment of the packet */
/* The last MBUF segment of the packet */
struct rte_mbuf *lastseg;
/*
* the time when the first packet is inserted
* into the table. If a packet in the table is
* merged with an incoming packet, this value
* won't be updated. We set this value only
* when the first packet is inserted into the
* table.
* The time when the first packet is inserted into the table.
* This value won't be updated, even if the packet is merged
* with other packets.
*/
uint64_t start_time;
/*
* we use next_pkt_idx to chain the packets that
* have same key value but can't be merged together.
* next_pkt_idx is used to chain the packets that
* are in the same flow but can't be merged together
* (e.g. caused by packet reordering).
*/
uint32_t next_pkt_idx;
/* the sequence number of the packet */
/* TCP sequence number of the packet */
uint32_t sent_seq;
/* the IP ID of the packet */
/* IPv4 ID of the packet */
uint16_t ip_id;
/* the number of merged packets */
uint16_t nb_merged;
@ -72,31 +69,31 @@ struct gro_tcp4_item {
struct gro_tcp4_tbl {
/* item array */
struct gro_tcp4_item *items;
/* key array */
struct gro_tcp4_key *keys;
/* flow array */
struct gro_tcp4_flow *flows;
/* current item number */
uint32_t item_num;
/* current key num */
uint32_t key_num;
/* current flow num */
uint32_t flow_num;
/* item array size */
uint32_t max_item_num;
/* key array size */
uint32_t max_key_num;
/* flow array size */
uint32_t max_flow_num;
};
/**
* This function creates a TCP/IPv4 reassembly table.
*
* @param socket_id
* socket index for allocating TCP/IPv4 reassemble table
* Socket index for allocating the TCP/IPv4 reassembly table
* @param max_flow_num
* the maximum number of flows in the TCP/IPv4 GRO table
* The maximum number of flows in the TCP/IPv4 GRO table
* @param max_item_per_flow
* the maximum packet number per flow.
* The maximum number of packets per flow
*
* @return
* if create successfully, return a pointer which points to the
* created TCP/IPv4 GRO table. Otherwise, return NULL.
* - Return the table pointer on success.
* - Return NULL on failure.
*/
void *gro_tcp4_tbl_create(uint16_t socket_id,
uint16_t max_flow_num,
@ -106,62 +103,56 @@ void *gro_tcp4_tbl_create(uint16_t socket_id,
* This function destroys a TCP/IPv4 reassembly table.
*
* @param tbl
* a pointer points to the TCP/IPv4 reassembly table.
* Pointer pointing to the TCP/IPv4 reassembly table.
*/
void gro_tcp4_tbl_destroy(void *tbl);
/**
* This function searches for a packet in the TCP/IPv4 reassembly table
* to merge with the inputted one. To merge two packets is to chain them
* together and update packet headers. Packets, whose SYN, FIN, RST, PSH
* CWR, ECE or URG bit is set, are returned immediately. Packets which
* only have packet headers (i.e. without data) are also returned
* immediately. Otherwise, the packet is either merged, or inserted into
* the table. Besides, if there is no available space to insert the
* packet, this function returns immediately too.
* This function merges a TCP/IPv4 packet. It doesn't process a packet
* that has SYN, FIN, RST, PSH, CWR, ECE or URG set, or that has no
* payload.
*
* This function assumes the inputted packet is with correct IPv4 and
* TCP checksums. And if two packets are merged, it won't re-calculate
* IPv4 and TCP checksums. Besides, if the inputted packet is IP
* fragmented, it assumes the packet is complete (with TCP header).
* This function doesn't check if the packet has correct checksums and
* doesn't re-calculate checksums for the merged packet. Additionally,
* it assumes the packets are complete (i.e., MF==0 && frag_off==0),
* when IP fragmentation is possible (i.e., DF==0). It returns the
* packet, if the packet has invalid parameters (e.g. SYN bit is set)
* or there is no available space in the table.
*
* @param pkt
* packet to reassemble.
* Packet to reassemble
* @param tbl
* a pointer that points to a TCP/IPv4 reassembly table.
* Pointer pointing to the TCP/IPv4 reassembly table
* @param start_time
* the start time that the packet is inserted into the table
* The time when the packet is inserted into the table
*
* @return
* if the packet doesn't have data, or SYN, FIN, RST, PSH, CWR, ECE
* or URG bit is set, or there is no available space in the table to
* insert a new item or a new key, return a negative value. If the
* packet is merged successfully, return an positive value. If the
* packet is inserted into the table, return 0.
* - Return a positive value if the packet is merged.
* - Return zero if the packet isn't merged but stored in the table.
* - Return a negative value for invalid parameters or no available
* space in the table.
*/
int32_t gro_tcp4_reassemble(struct rte_mbuf *pkt,
struct gro_tcp4_tbl *tbl,
uint64_t start_time);
/**
* This function flushes timeout packets in a TCP/IPv4 reassembly table
* to applications, and without updating checksums for merged packets.
* The max number of flushed timeout packets is the element number of
* the array which is used to keep flushed packets.
* This function flushes timed-out packets from a TCP/IPv4 reassembly
* table, without updating checksums.
*
* @param tbl
* a pointer that points to a TCP GRO table.
* TCP/IPv4 reassembly table pointer
* @param flush_timestamp
* this function flushes packets which are inserted into the table
* before or at the flush_timestamp.
* Flush packets which are inserted into the table before or at the
* flush_timestamp.
* @param out
* pointer array which is used to keep flushed packets.
* Pointer array used to keep flushed packets
* @param nb_out
* the element number of out. It's also the max number of timeout
* The element number in 'out'. It also determines the maximum number of
* packets that can be flushed finally.
*
* @return
* the number of packets that are returned.
* The number of flushed packets
*/
uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
uint64_t flush_timestamp,
@ -173,10 +164,10 @@ uint16_t gro_tcp4_tbl_timeout_flush(struct gro_tcp4_tbl *tbl,
* reassembly table.
*
* @param tbl
* pointer points to a TCP/IPv4 reassembly table.
* TCP/IPv4 reassembly table pointer
*
* @return
* the number of packets in the table
* The number of packets in the table
*/
uint32_t gro_tcp4_tbl_pkt_count(void *tbl);
#endif
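The flow lookup that this patch introduces in gro_tcp4_reassemble() stops scanning once every occupied flow slot has been checked (the remaining_flow_num counter). A minimal standalone sketch of that idea follows. The demo_* types and DEMO_INVALID_INDEX are simplified stand-ins invented for illustration (MAC addresses are plain byte arrays); they are assumptions, not the library's definitions.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_INVALID_INDEX UINT32_MAX /* stand-in for INVALID_ARRAY_INDEX */

/* Simplified stand-in for struct tcp4_flow_key. */
struct demo_flow_key {
	uint8_t eth_saddr[6];
	uint8_t eth_daddr[6];
	uint32_t ip_src_addr;
	uint32_t ip_dst_addr;
	uint32_t recv_ack;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Simplified stand-in for struct gro_tcp4_flow. */
struct demo_flow {
	struct demo_flow_key key;
	uint32_t start_index; /* DEMO_INVALID_INDEX marks an empty slot */
};

/* Compare field by field so that struct padding is never inspected. */
static int
demo_same_flow(const struct demo_flow_key *k1, const struct demo_flow_key *k2)
{
	return memcmp(k1->eth_saddr, k2->eth_saddr, 6) == 0 &&
		memcmp(k1->eth_daddr, k2->eth_daddr, 6) == 0 &&
		k1->ip_src_addr == k2->ip_src_addr &&
		k1->ip_dst_addr == k2->ip_dst_addr &&
		k1->recv_ack == k2->recv_ack &&
		k1->src_port == k2->src_port &&
		k1->dst_port == k2->dst_port;
}

/*
 * Return the slot holding a matching flow, or DEMO_INVALID_INDEX.
 * 'remaining' counts occupied slots not yet visited, so the loop can
 * end before reaching max_flow_num when the table is sparse.
 */
static uint32_t
demo_find_flow(const struct demo_flow *flows, uint32_t max_flow_num,
		uint32_t flow_num, const struct demo_flow_key *key)
{
	uint32_t i, remaining = flow_num;

	for (i = 0; i < max_flow_num && remaining; i++) {
		if (flows[i].start_index == DEMO_INVALID_INDEX)
			continue;
		if (demo_same_flow(&flows[i].key, key))
			return i;
		remaining--;
	}
	return DEMO_INVALID_INDEX;
}
```

Returning the index directly replaces the separate `find` flag used in the patch; either form works, and the sketch picks the shorter one only for brevity.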


@ -23,11 +23,14 @@ static gro_tbl_destroy_fn tbl_destroy_fn[RTE_GRO_TYPE_MAX_NUM] = {
static gro_tbl_pkt_count_fn tbl_pkt_count_fn[RTE_GRO_TYPE_MAX_NUM] = {
gro_tcp4_tbl_pkt_count, NULL};
#define IS_IPV4_TCP_PKT(ptype) (RTE_ETH_IS_IPV4_HDR(ptype) && \
((ptype & RTE_PTYPE_L4_TCP) == RTE_PTYPE_L4_TCP))
/*
* GRO context structure, which is used to merge packets. It keeps
* many reassembly tables of desired GRO types. Applications need to
* create GRO context objects before using rte_gro_reassemble to
* perform GRO.
* GRO context structure. It keeps the table structures, which are
* used to merge packets, for different GRO types. Before using
* rte_gro_reassemble(), applications need to create the GRO context
* first.
*/
struct gro_ctx {
/* GRO types to perform */
@ -85,8 +88,6 @@ rte_gro_ctx_destroy(void *ctx)
uint64_t gro_type_flag;
uint8_t i;
if (gro_ctx == NULL)
return;
for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
gro_type_flag = 1ULL << i;
if ((gro_ctx->gro_types & gro_type_flag) == 0)
@ -103,62 +104,54 @@ rte_gro_reassemble_burst(struct rte_mbuf **pkts,
uint16_t nb_pkts,
const struct rte_gro_param *param)
{
uint16_t i;
uint16_t nb_after_gro = nb_pkts;
uint32_t item_num;
/* allocate a reassembly table for TCP/IPv4 GRO */
struct gro_tcp4_tbl tcp_tbl;
struct gro_tcp4_key tcp_keys[RTE_GRO_MAX_BURST_ITEM_NUM];
struct gro_tcp4_flow tcp_flows[RTE_GRO_MAX_BURST_ITEM_NUM];
struct gro_tcp4_item tcp_items[RTE_GRO_MAX_BURST_ITEM_NUM] = {{0} };
struct rte_mbuf *unprocess_pkts[nb_pkts];
uint16_t unprocess_num = 0;
uint32_t item_num;
int32_t ret;
uint64_t current_time;
uint16_t i, unprocess_num = 0, nb_after_gro = nb_pkts;
if ((param->gro_types & RTE_GRO_TCP_IPV4) == 0)
if (unlikely((param->gro_types & RTE_GRO_TCP_IPV4) == 0))
return nb_pkts;
/* get the actual number of packets */
/* Get the maximum number of packets */
item_num = RTE_MIN(nb_pkts, (param->max_flow_num *
param->max_item_per_flow));
param->max_item_per_flow));
item_num = RTE_MIN(item_num, RTE_GRO_MAX_BURST_ITEM_NUM);
for (i = 0; i < item_num; i++)
tcp_keys[i].start_index = INVALID_ARRAY_INDEX;
tcp_flows[i].start_index = INVALID_ARRAY_INDEX;
tcp_tbl.keys = tcp_keys;
tcp_tbl.flows = tcp_flows;
tcp_tbl.items = tcp_items;
tcp_tbl.key_num = 0;
tcp_tbl.flow_num = 0;
tcp_tbl.item_num = 0;
tcp_tbl.max_key_num = item_num;
tcp_tbl.max_flow_num = item_num;
tcp_tbl.max_item_num = item_num;
current_time = rte_rdtsc();
for (i = 0; i < nb_pkts; i++) {
if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |
RTE_PTYPE_L4_TCP)) ==
(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) {
ret = gro_tcp4_reassemble(pkts[i],
&tcp_tbl,
current_time);
if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {
/*
* The timestamp is ignored, since all packets
* will be flushed from the tables.
*/
ret = gro_tcp4_reassemble(pkts[i], &tcp_tbl, 0);
if (ret > 0)
/* merge successfully */
nb_after_gro--;
else if (ret < 0) {
unprocess_pkts[unprocess_num++] =
pkts[i];
}
else if (ret < 0)
unprocess_pkts[unprocess_num++] = pkts[i];
} else
unprocess_pkts[unprocess_num++] = pkts[i];
}
/* re-arrange GROed packets */
if (nb_after_gro < nb_pkts) {
i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, current_time,
pkts, nb_pkts);
/* Flush all packets from the tables */
i = gro_tcp4_tbl_timeout_flush(&tcp_tbl, 0, pkts, nb_pkts);
/* Copy unprocessed packets */
if (unprocess_num > 0) {
memcpy(&pkts[i], unprocess_pkts,
sizeof(struct rte_mbuf *) *
@ -174,31 +167,28 @@ rte_gro_reassemble(struct rte_mbuf **pkts,
uint16_t nb_pkts,
void *ctx)
{
uint16_t i, unprocess_num = 0;
struct rte_mbuf *unprocess_pkts[nb_pkts];
struct gro_ctx *gro_ctx = ctx;
void *tcp_tbl;
uint64_t current_time;
uint16_t i, unprocess_num = 0;
if ((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0)
if (unlikely((gro_ctx->gro_types & RTE_GRO_TCP_IPV4) == 0))
return nb_pkts;
tcp_tbl = gro_ctx->tbls[RTE_GRO_TCP_IPV4_INDEX];
current_time = rte_rdtsc();
for (i = 0; i < nb_pkts; i++) {
if ((pkts[i]->packet_type & (RTE_PTYPE_L3_IPV4 |
RTE_PTYPE_L4_TCP)) ==
(RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_TCP)) {
if (gro_tcp4_reassemble(pkts[i],
gro_ctx->tbls
[RTE_GRO_TCP_IPV4_INDEX],
if (IS_IPV4_TCP_PKT(pkts[i]->packet_type)) {
if (gro_tcp4_reassemble(pkts[i], tcp_tbl,
current_time) < 0)
unprocess_pkts[unprocess_num++] = pkts[i];
} else
unprocess_pkts[unprocess_num++] = pkts[i];
}
if (unprocess_num > 0) {
memcpy(pkts, unprocess_pkts,
sizeof(struct rte_mbuf *) *
memcpy(pkts, unprocess_pkts, sizeof(struct rte_mbuf *) *
unprocess_num);
}
@ -224,6 +214,7 @@ rte_gro_timeout_flush(void *ctx,
flush_timestamp,
out, max_nb_out);
}
return 0;
}
@ -232,19 +223,20 @@ rte_gro_get_pkt_count(void *ctx)
{
struct gro_ctx *gro_ctx = ctx;
gro_tbl_pkt_count_fn pkt_count_fn;
uint64_t gro_types = gro_ctx->gro_types, flag;
uint64_t item_num = 0;
uint64_t gro_type_flag;
uint8_t i;
for (i = 0; i < RTE_GRO_TYPE_MAX_NUM; i++) {
gro_type_flag = 1ULL << i;
if ((gro_ctx->gro_types & gro_type_flag) == 0)
for (i = 0; i < RTE_GRO_TYPE_MAX_NUM && gro_types; i++) {
flag = 1ULL << i;
if ((gro_types & flag) == 0)
continue;
gro_types ^= flag;
pkt_count_fn = tbl_pkt_count_fn[i];
if (pkt_count_fn == NULL)
continue;
item_num += pkt_count_fn(gro_ctx->tbls[i]);
if (pkt_count_fn)
item_num += pkt_count_fn(gro_ctx->tbls[i]);
}
return item_num;
}
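The reworked loop in rte_gro_get_pkt_count() clears each enabled type bit as it is handled, so the loop ends as soon as no enabled types remain. A minimal standalone sketch of that pattern follows. All demo_* names and DEMO_TYPE_MAX_NUM are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define DEMO_TYPE_MAX_NUM 64 /* stand-in for RTE_GRO_TYPE_MAX_NUM */

typedef uint32_t (*demo_count_fn)(void *tbl);

/* Toy counter: the "table" is just a pointer to a stored count. */
static uint32_t
demo_stored_count(void *tbl)
{
	return *(uint32_t *)tbl;
}

/*
 * Sum the counts of all enabled types. Each handled bit is cleared
 * from 'types', so the loop condition becomes false once the last
 * enabled type has been visited.
 */
static uint64_t
demo_total_count(uint64_t types, demo_count_fn *fns, void **tbls)
{
	uint64_t flag, total = 0;
	uint8_t i;

	for (i = 0; i < DEMO_TYPE_MAX_NUM && types; i++) {
		flag = 1ULL << i;
		if ((types & flag) == 0)
			continue;
		types ^= flag;
		if (fns[i])
			total += fns[i](tbls[i]);
	}
	return total;
}
```

With only one or two low bits set, as is the case when TCP/IPv4 is the only GRO type, the loop exits after the first few iterations instead of walking all type slots.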


@ -31,8 +31,8 @@ extern "C" {
/**< TCP/IPv4 GRO flag */
/**
* A structure which is used to create GRO context objects or tell
* rte_gro_reassemble_burst() what reassembly rules are demanded.
* Structure used to create GRO context objects or used to pass
* application-determined parameters to rte_gro_reassemble_burst().
*/
struct rte_gro_param {
uint64_t gro_types;
@ -78,26 +78,23 @@ void rte_gro_ctx_destroy(void *ctx);
/**
* This is one of the main reassembly APIs, which merges numbers of
* packets at a time. It assumes that all inputted packets are with
* correct checksums. That is, applications should guarantee all
* inputted packets are correct. Besides, it doesn't re-calculate
* checksums for merged packets. If inputted packets are IP fragmented,
* this function assumes them are complete (i.e. with L4 header). After
* finishing processing, it returns all GROed packets to applications
* immediately.
* packets at a time. It doesn't check if input packets have correct
* checksums and doesn't re-calculate checksums for merged packets.
* It assumes the packets are complete (i.e., MF==0 && frag_off==0),
* when IP fragmentation is possible (i.e., DF==0). The GROed packets
* are returned as soon as the function finishes.
*
* @param pkts
* a pointer array which points to the packets to reassemble. Besides,
* it keeps mbuf addresses for the GROed packets.
* Pointer array pointing to the packets to reassemble. Besides, it
* keeps MBUF addresses for the GROed packets.
* @param nb_pkts
* the number of packets to reassemble.
* The number of packets to reassemble
* @param param
* applications use it to tell rte_gro_reassemble_burst() what rules
* are demanded.
* Application-determined parameters for reassembling packets.
*
* @return
* the number of packets after been GROed. If no packets are merged,
* the returned value is nb_pkts.
* The number of packets after being GROed. If no packets are merged,
* the return value is equal to nb_pkts.
*/
uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
uint16_t nb_pkts,
@ -107,32 +104,28 @@ uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts,
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
*
* Reassembly function, which tries to merge inputted packets with
* the packets in the reassembly tables of a given GRO context. This
* function assumes all inputted packets are with correct checksums.
* And it won't update checksums if two packets are merged. Besides,
* if inputted packets are IP fragmented, this function assumes they
* are complete packets (i.e. with L4 header).
* Reassembly function, which tries to merge input packets with the
* existing packets in the reassembly tables of a given GRO context.
* It doesn't check if input packets have correct checksums and doesn't
* re-calculate checksums for merged packets. Additionally, it assumes
* the packets are complete (i.e., MF==0 && frag_off==0), when IP
* fragmentation is possible (i.e., DF==0).
*
* If the inputted packets don't have data or are with unsupported GRO
* types etc., they won't be processed and are returned to applications.
* Otherwise, the inputted packets are either merged or inserted into
* the table. If applications want get packets in the table, they need
* to call flush API.
* If the input packets have invalid parameters (e.g. no data payload,
* unsupported GRO types), they are returned to applications. Otherwise,
* they are either merged or inserted into the table. To get the GROed
* packets, applications need to flush them from the tables with the
* flush API.
*
* @param pkts
* packet to reassemble. Besides, after this function finishes, it
* keeps the unprocessed packets (e.g. without data or unsupported
* GRO types).
* Packets to reassemble. It's also used to store the unprocessed packets.
* @param nb_pkts
* the number of packets to reassemble.
* The number of packets to reassemble
* @param ctx
* a pointer points to a GRO context object.
* GRO context object pointer
*
* @return
* return the number of unprocessed packets (e.g. without data or
* unsupported GRO types). If all packets are processed (merged or
* inserted into the table), return 0.
* The number of unprocessed packets.
*/
uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
uint16_t nb_pkts,
@ -142,29 +135,28 @@ uint16_t rte_gro_reassemble(struct rte_mbuf **pkts,
* @warning
* @b EXPERIMENTAL: this API may change without prior notice
*
* This function flushes the timeout packets from reassembly tables of
* desired GRO types. The max number of flushed timeout packets is the
* element number of the array which is used to keep the flushed packets.
* This function flushes the timeout packets from the reassembly tables
* of desired GRO types. The max number of flushed packets is the
* element number of 'out'.
*
* Besides, this function won't re-calculate checksums for merged
* packets in the tables. That is, the returned packets may be with
* wrong checksums.
* Additionally, the flushed packets may have incorrect checksums, since
* this function doesn't re-calculate checksums for merged packets.
*
* @param ctx
* a pointer points to a GRO context object.
* GRO context object pointer.
* @param timeout_cycles
* max TTL for packets in reassembly tables, measured in nanosecond.
* The max TTL for packets in reassembly tables, measured in nanosecond.
* @param gro_types
* this function only flushes packets which belong to the GRO types
* specified by gro_types.
* This function flushes packets whose GRO types are specified by
* gro_types.
* @param out
* a pointer array that is used to keep flushed timeout packets.
* Pointer array used to keep flushed packets.
* @param max_nb_out
* the element number of out. It's also the max number of timeout
* The element number of 'out'. It's also the max number of timeout
* packets that can be flushed finally.
*
* @return
* the number of flushed packets. If no packets are flushed, return 0.
* The number of flushed packets.
*/
uint16_t rte_gro_timeout_flush(void *ctx,
uint64_t timeout_cycles,
@ -180,10 +172,10 @@ uint16_t rte_gro_timeout_flush(void *ctx,
* of a given GRO context.
*
* @param ctx
* pointer points to a GRO context object.
* GRO context object pointer.
*
* @return
* the number of packets in all reassembly tables.
* The number of packets in the tables.
*/
uint64_t rte_gro_get_pkt_count(void *ctx);
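The rte_gro_reassemble() contract documented above (unprocessed packets are stored back at the front of pkts and their count is returned) can be illustrated with a small standalone sketch. It is not the library's code: packets are modelled as plain ints, demo_process_even() is a made-up stand-in for gro_tcp4_reassemble(), and the sketch compacts in place, whereas the patch gathers unprocessed packets in a temporary array and copies them back with memcpy().

```c
#include <stdint.h>

typedef int (*demo_process_fn)(int pkt);

/*
 * Made-up processing rule: even "packets" are taken into the table
 * (return 0), odd ones are rejected (return a negative value).
 */
static int
demo_process_even(int pkt)
{
	return (pkt % 2 == 0) ? 0 : -1;
}

/*
 * Try to process every packet. Rejected packets are moved to the
 * front of 'pkts' in their original order. The write position never
 * passes the read position, so no packet is overwritten before it
 * has been examined.
 */
static uint16_t
demo_reassemble(int *pkts, uint16_t nb_pkts, demo_process_fn process)
{
	uint16_t i, unprocess_num = 0;

	for (i = 0; i < nb_pkts; i++) {
		if (process(pkts[i]) < 0)
			pkts[unprocess_num++] = pkts[i];
	}
	return unprocess_num;
}
```

After the call, only the first unprocess_num entries of pkts are meaningful to the caller; the processed packets stay in the table until a flush call returns them.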