Added API for `rte_eth_tx_prepare`
uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
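For illustration, a minimal sketch of the intended use in the application Tx path (error handling simplified; handle_bad_pkt() is a hypothetical application helper, not part of the API):
    /* Sketch: prepare the burst, then send what was successfully prepared. */
    uint16_t nb_prep = rte_eth_tx_prepare(port_id, queue_id, pkts, nb_pkts);
    if (nb_prep < nb_pkts) {
        /* rte_errno describes why pkts[nb_prep] could not be prepared */
        handle_bad_pkt(pkts[nb_prep]);
    }
    uint16_t nb_sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_prep);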
Added fields to the `struct rte_eth_desc_lim`:
uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */
uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU. */
These fields can be used to create valid packets according to the
following rules:
* For a non-TSO packet, a single transmit packet may span up to
"nb_mtu_seg_max" buffers.
* For a TSO packet, the total number of data descriptors is "nb_seg_max",
and each segment within the TSO may span up to "nb_mtu_seg_max" buffers.
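As an illustration, a hedged sketch of how an application could check these limits, assuming it reads them from the tx_desc_lim field of rte_eth_dev_info:
    /* Sketch: reject a non-TSO packet that spans more buffers than the
     * device allows (limits taken from rte_eth_dev_info.tx_desc_lim). */
    struct rte_eth_dev_info dev_info;
    rte_eth_dev_info_get(port_id, &dev_info);
    if ((m->ol_flags & PKT_TX_TCP_SEG) == 0 &&
        m->nb_segs > dev_info.tx_desc_lim.nb_mtu_seg_max) {
        /* packet would exceed "nb_mtu_seg_max": drop or linearize it */
    }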
Added functions:
int
rte_validate_tx_offload(struct rte_mbuf *m)
to validate general requirements for the Tx offload fields set in the mbuf
of a packet, such as flag completeness. In the current implementation this
function is called optionally, when RTE_LIBRTE_ETHDEV_DEBUG is enabled.
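A hedged sketch of how a driver's tx_prepare callback might use it (the surrounding loop variables `i`, `m` and `ret` are illustrative):
    /* Sketch: optional validation inside a tx_prepare loop. */
#ifdef RTE_LIBRTE_ETHDEV_DEBUG
    ret = rte_validate_tx_offload(m);
    if (ret != 0) {
        rte_errno = -ret;
        return i;   /* stop at the first invalid packet */
    }
#endif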
int rte_net_intel_cksum_prepare(struct rte_mbuf *m)
to prepare the pseudo-header checksum for TSO and non-TSO TCP/UDP packets
before hardware Tx checksum offload:
- for non-TSO TCP/UDP packets the full pseudo-header checksum is
computed and set in the packet,
- for TSO the IP payload length is not included.
int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)
This function uses the same logic as rte_net_intel_cksum_prepare, but
allows the application to choose which offloads should be taken into
account when full preparation is not required.
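A hedged sketch of how a driver might call these helpers from its tx_prepare callback; the flag mask shown is illustrative, not taken from a real driver:
    /* Sketch: prepare the pseudo-header checksum, limited to the offload
     * flags this (hypothetical) driver actually supports. */
    ret = rte_net_intel_cksum_flags_prepare(m,
            m->ol_flags & (PKT_TX_IP_CKSUM | PKT_TX_L4_MASK | PKT_TX_TCP_SEG));
    if (ret != 0) {
        rte_errno = -ret;
        return i;
    }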
PERFORMANCE TESTS
-----------------
This feature was tested with a modified csum engine from testpmd.
The packet checksum preparation was moved from the application to a Tx
preparation step placed before the burst.
We may expect some overhead caused by:
1) using an additional callback before the burst,
2) rescanning the burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)
We tested it using the ixgbe Tx preparation implementation with some parts
disabled, to obtain comparable information about the impact of the
different parts of the implementation.
IMPACT:
1) When the Tx preparation callback is not implemented, the performance
impact is negligible.
2) With packet condition checks but without checksum modification (nb_segs,
available offloads, etc.) the result is 14626628/14252168 (~2.62% drop).
3) With full support in the ixgbe driver (point 2 plus packet checksum
initialization) the result is 14060924/13588094 (~3.48% drop).
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This function can be used to calculate the checksum of data embedded in
an mbuf, which can be composed of several segments.
This function will be used by the virtio pmd in the next commits to calculate
the checksum in software when the protocol is not recognized.
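A minimal sketch, assuming the new helper is rte_raw_cksum_mbuf() with an (mbuf, offset, length, result) signature:
    /* Sketch: checksum the L4 part of a possibly multi-segment mbuf.
     * Function name and signature are assumptions for illustration. */
    uint16_t cksum;
    uint32_t off = m->l2_len + m->l3_len;
    if (rte_raw_cksum_mbuf(m, off, rte_pktmbuf_pkt_len(m) - off, &cksum) < 0)
        return -1;  /* offset/length do not fit in the packet data */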
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Add a parameter to rte_net_get_ptype() to select which
layers should be parsed. This avoids parsing all layers when
only the first ones are required.
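For example, a hedged sketch of the intended call, using the RTE_PTYPE_*_MASK constants as the layer selector:
    /* Sketch: parse only the L2 and L3 layers, skipping L4 and tunnels. */
    struct rte_net_hdr_lens hdr_lens;
    uint32_t ptype = rte_net_get_ptype(m, &hdr_lens,
            RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);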
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Add support for NVGRE tunnels in rte_net_get_ptype(). At the same
time, as NVGRE transports Ethernet, we need to add support for inner
VLAN, QinQ, and MPLS.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Add the GRE header structure in librte_net. It will be used by the next
patches, which add support for GRE tunnels in the software packet type
parser.
The extended headers (checksum, key or sequence number) are not defined.
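For reference, a hedged sketch of the base GRE header layout (RFC 2784/2890) that the new structure models; the struct name and exact field split are illustrative, not the librte_net definition:
    /* Illustrative base GRE header; the optional checksum/key/sequence
     * fields are not modeled, matching the description above. */
    struct gre_base_hdr {           /* hypothetical name */
        uint16_t flags_version;     /* C, K, S bits plus version number */
        uint16_t proto;             /* EtherType of the payload (big endian) */
    } __attribute__((__packed__));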
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Add support for IP and IPv6 tunnels in rte_net_get_ptype().
We need to duplicate some code because, for a given protocol, the inner
and outer packet types do not have the same value.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Add a new RTE_PTYPE_L2_ETHER_QINQ packet type, and its support in
rte_net_get_ptype().
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Add a new RTE_PTYPE_L2_ETHER_VLAN packet type, and its support in
rte_net_get_ptype().
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Introduce the function rte_net_get_ptype() that parses an mbuf and
returns its packet type. For now, the following packet types are parsed:
L2: Ether
L3: IPv4, IPv6
L4: TCP, UDP, SCTP
The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd in the next commits, allowing
its result to be compared with the value given by the hardware.
This function will also be useful when implementing Rx offload support
in the virtio pmd. Indeed, the virtio protocol gives the csum start and
offset, but it does not give the L4 protocol, nor does it tell whether the
checksum is relevant for the inner or the outer headers. This information
has to be known to properly set the ol_flags in the mbuf.
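As an illustration, a hedged sketch of the intended use (shown with the layers argument introduced earlier in this series):
    /* Sketch: classify a received mbuf in software and branch on its
     * L4 type; header lengths come back in hdr_lens. */
    struct rte_net_hdr_lens hdr_lens;
    uint32_t ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
    if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP) {
        /* TCP header starts at hdr_lens.l2_len + hdr_lens.l3_len */
    }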
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Previously, librte_net only contained header files. Add a C file
(empty for now) and generate a library. It will contain network helpers
like checksum calculation, software packet type parser, ...
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The proper place for rte_ether.h is in librte_net because it defines
network headers.
Moving it will also prevent circular references in the following
patches, which will require the Ethernet header definition in rte_mbuf.c.
By the way, fix minor checkpatch issues.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This fix is for an IPv6 checksum offload error on RHEL 6.5.
Any optimization level above -O0 produces an error in the IPv6 checksum,
because the "-fstrict-aliasing" flag is enabled by default above -O0.
Step 1: testpmd -c 0x6 -n 4 -- -i --portmask=0x3 --disable-hw-vlan
--enable-rx-cksum --crc-strip --txqflags=0
Step 2: settings and start
set verbose 1
set fwd csum
start
Step 3: send an IPv6/TCP packet with a bad checksum using scapy
Ether(src="52:00:00:00:00:00",
dst="90:e2:ba:4a:33:5d")/IPv6(src="::1")/TCP(chksum=0xf)/("X"*46)
Step 4: Received packets:
RESULTS: IPv6/TCP': ['0xd41'] or another unexpected value.
EXPECTED RESULTS: IPv6/TCP': ['0x9f5e']
Fixes: 2b039d5f20 ("net: fix build with gcc 4.4.7 and strict aliasing")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
There are no memcpy functions in rte_ip.h so there is no need to include
rte_memcpy.h in that file.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
- Only x540 and 82599 devices support LRO.
- Add the appropriate HW configuration.
- Add RSC-aware rx_pkt_burst() handlers:
- Implemented bulk allocation and non-bulk allocation versions.
- Add LRO-specific fields to rte_eth_rxmode, to rte_eth_dev_data
and to ixgbe_rx_queue.
- Use the appropriate handler when LRO is requested.
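A hedged sketch of how an application could request LRO, assuming the new rte_eth_rxmode field is named enable_lro:
    /* Sketch: request LRO at port configuration time; the enable_lro
     * field name is an assumption for illustration. */
    struct rte_eth_conf port_conf = {
        .rxmode = {
            .enable_lro = 1,
        },
    };
    rte_eth_dev_configure(port_id, 1, 1, &port_conf);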
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Changed the MAC address type from uint8_t[6] to struct ether_addr and the IP
address type from uint8_t[4] to uint32_t to make it consistent with
other DPDK code using MAC and IP addresses. It allows us to use the
is_same_ether_addr and ether_addr_copy functions on MAC addresses in the ARP
header. Also removed the union from the arp_hdr struct to make accesses to
arp_data items shorter. Updated testpmd to match the new arp_hdr version.
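A hedged sketch of the resulting usage (field names follow the new arp_data layout described above):
    /* Sketch: fill the sender fields of an ARP reply using the generic
     * Ethernet address helpers enabled by the new types. */
    static void
    arp_set_sender(struct arp_hdr *arp, const struct ether_addr *mac, uint32_t ip)
    {
        ether_addr_copy(mac, &arp->arp_data.arp_sha);
        arp->arp_data.arp_sip = ip;
    }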
Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
[Thomas: doxygenize comments]
This patch contains a fix for link bonding handling of VLAN-tagged packets in modes 3 and 5.
Currently the xmit_slave_hash function misinterprets the PKT_RX_VLAN_PKT flag to mean that
there is a VLAN tag within the packet, when it actually means that there is a valid entry
in the vlan_tci field of the mbuf.
- Fixes VLAN tag support in the hashing functions.
- Adds support for TCP in layer 4 header hashing.
- Splits the transmit hashing function into separate functions for each policy to
reduce branching and to make the code clearer.
- Fixes an incorrect flag set in the test application packet generator.
Test report: http://dpdk.org/ml/archives/dev/2015-January/010792.html
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
For rte_ipv6_phdr_cksum(), gcc 4.8.* with "-O3" does not always generate
correct code.
Sometimes it 'forgets' to put the len and proto fields of psd_header on the stack.
To overcome that problem and speed things up a bit, refactored rte_raw_cksum()
by splitting the IPv6 pseudo-header csum calculation into 3 phases:
1. calculate the sum for the src & dst addresses
2. add the sum for proto & len
3. finalise the sum
That makes gcc generate valid code and helps avoid any copying.
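A hedged sketch of the three-phase structure; __rte_raw_cksum() and __rte_raw_cksum_reduce() are assumed internal helpers (accumulate a raw sum, then fold it to 16 bits) and may not match the actual refactoring exactly:
    /* Sketch of the three phases inside rte_ipv6_phdr_cksum();
     * helper names are assumptions. */
    uint32_t sum;
    struct {
        uint32_t len;    /* L4 length */
        uint32_t proto;  /* L4 protocol in the last byte */
    } psd;

    psd.len = ipv6_hdr->payload_len;  /* placement within the 32-bit field
                                       * does not affect the one's-complement sum */
    psd.proto = rte_cpu_to_be_32((uint32_t)ipv6_hdr->proto);

    /* 1. sum of the source & destination addresses */
    sum = __rte_raw_cksum(ipv6_hdr->src_addr,
            sizeof(ipv6_hdr->src_addr) + sizeof(ipv6_hdr->dst_addr), 0);
    /* 2. add the sum of proto & len */
    sum = __rte_raw_cksum(&psd, sizeof(psd), sum);
    /* 3. finalise */
    return __rte_raw_cksum_reduce(sum);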
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
include/rte_ip.h:161: error: dereferencing pointer ‘u16’
does break strict-aliasing rules
include/rte_ip.h:157: note: initialized from here
...
The root cause is that the compiler enables strict aliasing by default,
while the function rte_raw_cksum() tries to convert 'const char *'
to 'const uint16_t *'.
This workaround solves the GCC strict-aliasing compile issue (two
pointers of different types should not point to the same memory address).
For GCC 4.4.7 it will definitely occur if the flags "-fstrict-aliasing"
and "-Wall" are used.
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
[Thomas: add workaround comment]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It was impossible to include netinet/in.h and rte_ip.h
because the IP protocols were redefined.
The redefinition is removed because it is useless.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
Some of the NICs supported by DPDK can accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with a valid TCP header and a size of up to 64K, and delegates the
segmentation to the NIC.
Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).
To delegate the TCP segmentation to the hardware, the user has to (see the sketch after this list):
- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo-header checksum without taking ip_len into account,
and set it in the TCP header, for instance by using
rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
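A hedged sketch of these steps for a contiguous Ether/IPv4/TCP packet (offsets and the MSS value are illustrative):
    /* Sketch: mark an mbuf for hardware TSO following the steps above. */
    struct ipv4_hdr *ip = (struct ipv4_hdr *)
        (rte_pktmbuf_mtod(m, char *) + sizeof(struct ether_hdr));
    struct tcp_hdr *tcp = (struct tcp_hdr *)((char *)ip + sizeof(struct ipv4_hdr));

    m->ol_flags |= PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_IP_CKSUM;
    m->l2_len = sizeof(struct ether_hdr);
    m->l3_len = sizeof(struct ipv4_hdr);
    m->l4_len = sizeof(struct tcp_hdr);
    m->tso_segsz = 1460;                 /* illustrative MSS */
    ip->hdr_checksum = 0;                /* IP checksum computed by hardware */
    tcp->cksum = rte_ipv4_phdr_cksum(ip, m->ol_flags);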
The API is inspired by the ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hardware/drivers in the future.
This commit also reworks the way l2_len and l3_len are used in the igb
and ixgbe drivers, as l2_l3_len is not available anymore in the mbuf.
Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce new functions to calculate checksums. These new functions
are derived from the ones provided in csumonly.c but slightly reworked.
There is still some room for future optimization of these functions
(maybe SSE/AVX, ...).
This API will be modified in the next commits by the introduction of
TSO, which requires a different pseudo-header checksum to be set in the
packet.
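As an illustration of the resulting API, a hedged sketch computing the IPv4 header and TCP checksums of a contiguous packet (the Ether/IPv4/TCP layout is assumed):
    /* Sketch: fill the checksums of a contiguous Ether/IPv4/TCP packet
     * in software using the new helpers. */
    struct ipv4_hdr *ip = (struct ipv4_hdr *)
        (rte_pktmbuf_mtod(m, char *) + sizeof(struct ether_hdr));
    struct tcp_hdr *tcp = (struct tcp_hdr *)
        ((char *)ip + ((ip->version_ihl & 0x0f) * 4));

    ip->hdr_checksum = 0;
    ip->hdr_checksum = rte_ipv4_cksum(ip);
    tcp->cksum = 0;
    tcp->cksum = rte_ipv4_udptcp_cksum(ip, tcp);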
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add a new specific packet processing engine in the "testpmd" application that
only replies to ARP requests and to ICMP echo requests.
For this purpose, a new "icmpecho" forwarding mode is provided that can be
dynamically selected with the following testpmd command:
set fwd icmpecho
before starting the receipt of packets on the selected ports.
Then, the "icmpecho" engine performs the following actions on all received
packets:
- replies to a received ARP request by sending back on the RX port an ARP
reply with a "sender hardware address" field containing the MAC address
of the RX port,
- replies to an ICMP echo request by sending back on the RX port an ICMP echo
reply, swapping the IP source and IP destination addresses in the IP
header,
- otherwise, simply drops the received packet.
When replying to a received packet that was encapsulated into a VLAN tunnel,
the reply is sent back with the same VLAN identifier.
By default, testpmd configures the VLAN header stripping RX option on each
port.
This option is not managed by the icmpecho engine, which won't detect
packets that were encapsulated into a VLAN.
To address this issue, the VLAN header stripping option must be previously
switched off with the following testpmd command:
vlan set strip off
When the "verbose" mode has been set with the testpmd command
"set verbose 1", the "icmpecho" engine displays informations about each
received packet.
The "icmpecho" forwarding engine can also be used to simply check port
connectivity at the hardware level (check that cables are well-plugged)
and at the software level (receipt of VLAN packets, for instance).
Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>