75 Commits

Author SHA1 Message Date
Helin Zhang
dc6c911c99 mbuf: use reserved space for double vlan
Use the reserved 16 bits in rte_mbuf structure for the outer vlan,
also add QinQ offloading flags for both RX and TX sides.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2015-07-07 16:39:32 +02:00
Cyril Chemparathy
b2906abb0d mbuf: add macro for offset arithmetic
There are a number of instances in the code where rte_pktmbuf_mtod()
is used to get the mbuf data pointer, only to add an offset before
casting the result to some other header type.  This patch adds a new
rte_pktmbuf_mtod_offset() macro to eliminate these awful double cast
situations.
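
A minimal sketch of the idea, assuming a cut-down buffer struct: `struct sketch_mbuf`, `SKETCH_MTOD_OFFSET`, `SKETCH_MTOD` and `struct sketch_hdr` are hypothetical names used for illustration, not the definitions from rte_mbuf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down packet buffer; not the real struct rte_mbuf. */
struct sketch_mbuf {
	void *buf_addr;     /* start of the data buffer */
	uint16_t data_off;  /* offset of the packet data inside the buffer */
};

/* Data pointer plus a byte offset, cast once to the requested type. */
#define SKETCH_MTOD_OFFSET(m, t, o) \
	((t)((char *)(m)->buf_addr + (m)->data_off + (o)))

/* The zero-offset form is then just a special case. */
#define SKETCH_MTOD(m, t) SKETCH_MTOD_OFFSET(m, t, 0)

/* Hypothetical header that follows a 14-byte Ethernet header. */
struct sketch_hdr {
	uint8_t version;
	uint8_t ttl;
};

static inline uint8_t sketch_read_ttl(const struct sketch_mbuf *m)
{
	const struct sketch_hdr *h;

	assert(m != NULL);
	/* One macro call replaces "get data pointer, add offset, cast again". */
	h = SKETCH_MTOD_OFFSET(m, const struct sketch_hdr *, 14);
	return h->ttl;
}
```

The offset is applied on a `char *` before the single cast, so callers never have to cast the data pointer twice.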

Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-06-24 12:00:42 +02:00
Cyril Chemparathy
7755baae83 mbuf: silence warning on pointer arithmetic
Translating from an mbuf element to the mbuf pointer does not break alignment
constraints.  However, the compiler is unaware of this fact and complains on
-Wcast-align.  This patch modifies the code to use RTE_PTR_SUB(), thereby
silencing the compiler by casting through (void *).
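
The technique can be sketched as follows. `SKETCH_PTR_SUB`, `struct sketch_mbuf` and `sketch_from_member()` are hypothetical stand-ins, and the macro body is an assumption about how such a helper is typically written rather than a copy of the DPDK one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container; stands in for an mbuf with an embedded member. */
struct sketch_mbuf {
	uint64_t flags;
	uint32_t member;  /* callers hold a pointer to this field */
};

/* Subtract a byte count from a pointer; the result has type void *. */
#define SKETCH_PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))

static inline struct sketch_mbuf *sketch_from_member(uint32_t *p)
{
	assert(p != NULL);
	/*
	 * A direct cast from (char *)p - offset to struct sketch_mbuf * can
	 * trigger -Wcast-align. Going through void * avoids the warning,
	 * because void * converts implicitly to any object pointer type.
	 */
	return SKETCH_PTR_SUB(p, offsetof(struct sketch_mbuf, member));
}
```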

Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-06-24 12:00:41 +02:00
Ferruh Yigit
6307b909b8 lib: remove extra parenthesis after return
Remove extra parenthesis from return statements.

Signed-off-by: Ferruh Yigit <ferruhy@gmail.com>
2015-06-23 23:31:15 +02:00
Olivier Matz
f20b50b946 mbuf: optimize refcnt update
In __rte_pktmbuf_prefree_seg(), there was an optimization to avoid using
a costly atomic operation when updating the mbuf reference counter if
its value is 1. Indeed, it means that we are the only owner of the mbuf,
and therefore nobody can change it at the same time.

We can generalize this optimization directly in rte_mbuf_refcnt_update()
so the other callers of this function, like rte_pktmbuf_attach(), can
also take advantage of this optimization.
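
A rough sketch of the generalized fast path, written with C11 atomics instead of the DPDK atomic helpers; `struct sketch_mbuf` and `sketch_refcnt_update()` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical mbuf reduced to its reference counter. */
struct sketch_mbuf {
	_Atomic uint16_t refcnt;
};

/* Add delta to the reference counter and return the new value. */
static inline uint16_t
sketch_refcnt_update(struct sketch_mbuf *m, int16_t delta)
{
	uint16_t cur = atomic_load_explicit(&m->refcnt, memory_order_relaxed);

	if (cur == 1) {
		/*
		 * We are the only owner, so nobody else can change the
		 * counter concurrently: a plain store is enough.
		 */
		uint16_t next = (uint16_t)(cur + delta);

		atomic_store_explicit(&m->refcnt, next, memory_order_relaxed);
		return next;
	}
	/* Shared mbuf: fall back to the atomic read-modify-write. */
	return (uint16_t)(atomic_fetch_add(&m->refcnt, (uint16_t)delta) + delta);
}
```

Because the check lives in the update function itself, every caller gets the cheap path, not only the free path.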

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-06-12 16:16:50 +02:00
Konstantin Ananyev
824cb29c0e apps: fix default mbuf size
Latest mbuf changes (priv_size addition and related fixes)
exposed a small problem with testpmd and a few other sample apps:
when the mbuf size is exactly 2KB or less, the ixgbe PMD
selects scattered RX even for configs with a 'normal'
max packet length (max_rx_pkt_len == ETHER_MAX_LEN).
To overcome that problem and unify the code, a new macro was created
to represent the recommended minimal buffer length for an mbuf.
Where appropriate, samples are updated to use that macro.

Fixes: dfb03bbe2b ("app/testpmd: use standard functions to initialize
mbufs and mbuf pool")

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-05-11 15:51:14 +02:00
Olivier Matz
4a96e8b652 mbuf: allow to clone an indirect mbuf
Remove one limitation of rte_pktmbuf_attach(): "mbuf we're attaching to
must be direct".

Now, when we attach to an indirect mbuf:
- copy all the relevant fields (addr, len, offload, ...) as before
- get the pointer to the mbuf that embeds the data buffer (direct mbuf),
  and increase the reference counter of this one.

When detaching the mbuf, we can retrieve this direct mbuf as the pointer
is determined from the buffer address.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-04-28 11:35:53 +02:00
Olivier Matz
355e6735b3 mbuf: fix cloning with private mbuf data
Add a new priv_size field in mbuf structure that should
be initialized at mbuf pool creation. This field contains the
size of the application private data in mbufs.

Introduce new static inline functions rte_mbuf_from_indirect()
and rte_mbuf_to_baddr() to replace the existing macros, which
take the private size into account when attaching and detaching
mbufs.
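
The address arithmetic can be sketched like this, assuming each pool element is laid out as mbuf header, then private area, then data buffer, and that both mbufs use the same private size. The struct and the `sketch_*` helpers are hypothetical and only mirror the role of the functions named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mbuf header: [header][private area][data buffer]. */
struct sketch_mbuf {
	void *buf_addr;      /* data buffer this mbuf currently points at */
	uint16_t priv_size;  /* size of the application private area */
};

/* Address of the data buffer embedded after a direct mbuf. */
static inline char *sketch_to_baddr(struct sketch_mbuf *md)
{
	assert(md != NULL);
	return (char *)md + sizeof(*md) + md->priv_size;
}

/* Recover the direct mbuf that owns the buffer an indirect mbuf uses. */
static inline struct sketch_mbuf *
sketch_from_indirect(const struct sketch_mbuf *mi)
{
	assert(mi != NULL);
	return (void *)((uintptr_t)mi->buf_addr -
			sizeof(struct sketch_mbuf) - mi->priv_size);
}
```

Ignoring the private size in either direction would leave the recovered pointer off by `priv_size` bytes, which is the kind of error the fix addresses.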

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-04-28 11:34:10 +02:00
Olivier Matz
bbd7782489 mbuf: introduce a new helper to create a pool
Add a new wrapper to rte_mempool_create() to simplify the creation
of a packet mbuf pool.

This wrapper can be used if there are no specific mempool flags and
no specific mbuf or pool constructor function, which covers most
use cases.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-04-28 11:34:04 +02:00
Olivier Matz
c511a96a34 mbuf: add accessors for data room and private size
The code retrieving the pool private area is duplicated in many
places; we can use a function for it.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-04-28 11:07:59 +02:00
Olivier Matz
1d493a4949 mbuf: fix data room size calculation in pool init
Deduce the mbuf data room size from mempool->elt_size and priv_size,
instead of using a hardcoded value that is not related to the real
buffer size.

To use rte_pktmbuf_pool_init(), the user can either:
- give a NULL parameter to rte_pktmbuf_pool_init(): in this case, the
  private size is assumed to be 0, and the room size is
  mp->elt_size - sizeof(struct rte_mbuf).
- give the rte_pktmbuf_pool_private filled with appropriate
  data_room_size and priv_size values.
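
The two cases above can be sketched as below. `struct sketch_pool_private`, `struct sketch_mbuf` and `sketch_pool_init()` are hypothetical and only model the size arithmetic, not the real mempool callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool private data and mbuf header. */
struct sketch_pool_private {
	uint16_t data_room_size;
	uint16_t priv_size;
};

struct sketch_mbuf {
	void *buf_addr;
	uint16_t buf_len;
	uint16_t priv_size;
};

/* Work out the sizes a pool with elements of elt_size bytes should use. */
static inline struct sketch_pool_private
sketch_pool_init(uint32_t elt_size, const struct sketch_pool_private *arg)
{
	struct sketch_pool_private out;

	if (arg == NULL) {
		/* No argument: no private area, the rest is data room. */
		out.priv_size = 0;
		out.data_room_size =
			(uint16_t)(elt_size - sizeof(struct sketch_mbuf));
	} else {
		/* Caller supplied both values explicitly. */
		out = *arg;
	}
	/* Header, private area and data room must all fit in one element. */
	assert(elt_size >= sizeof(struct sketch_mbuf) +
			   out.priv_size + out.data_room_size);
	return out;
}
```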

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-04-27 22:39:20 +02:00
Thomas Monjalon
2ee98e69e1 mbuf: clean old refcnt option
CONFIG_RTE_MBUF_SCATTER_GATHER was renamed into CONFIG_RTE_MBUF_REFCNT
by commit 62814bc2e923 and removed by commit 4769bc5a27cc.
Some traces remain because of delayed patches.

It can also be removed from doxygen config.
It is now poisoned in rte_mbuf.h to warn any misuse.

Fixes: d0dff9ba445e ("doc: sample application user guide")
Fixes: fc1f2750a3ec ("doc: programmers guide")
Fixes: 4769bc5a27cc ("mbuf: remove build option to disable refcnt")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-04-03 15:55:40 +02:00
Bruce Richardson
2911571232 mbuf: add explanation for confusing checks when freeing
The logic used in the condition check before freeing an mbuf is
sometimes confusing, so explain it in a proper comment.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-03-30 19:10:09 +02:00
Sergio Gonzalez Monroy
e1545b393a mbuf: fix a couple of doxygen comments
Fix a couple of doxygen comments in mbuf structure:
 - seqn had no doxygen syntax.
 - usr was not generating proper link to function.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 03:00:31 +01:00
Sergio Gonzalez Monroy
4769bc5a27 mbuf: remove build option to disable refcnt
This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
field in the mbuf struct permanently.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-23 19:31:24 +01:00
Sergio Gonzalez Monroy
e8b9ef877e mbuf: introduce indirect attached flag
Currently for mbufs with refcnt, we cannot free mbufs with external memory
buffers (i.e. vhost zero copy), as they are recognized as indirect
attached mbufs and therefore we free the direct mbuf it points to,
resulting in an error in the case of external memory buffers.

We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
that the mbuf is an indirect attached mbuf pointing to another mbuf.
When we free an mbuf, we only free the direct mbuf if the flag is set.
Freeing an mbuf with external buffer is the same as freeing a non attached mbuf.
The flag is set during attach and cleared on detach.

So in the case of vhost zero copy where we have mbufs with external
buffers, by default we just free the mbuf and it is up to the user to deal with
the external buffer.

This patch would allow the removal of the RTE_MBUF_REFCNT config option,
setting refcnt for all mbufs permanently.

The patch also modifies the vhost example as it was using the
RTE_MBUF_INDIRECT macro to detect if it was an mbuf with external buffer.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-23 19:27:06 +01:00
Sergio Gonzalez Monroy
b70b56032b reorder: new library
This library provides reordering capability for out of order mbufs based
on a sequence number in the mbuf structure.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Richardson Bruce <bruce.richardson@intel.com>
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 16:52:05 +01:00
Olivier Matz
c9433176b3 mbuf: remove UDP tunnel flag
Since the previous commit, the flag PKT_TX_UDP_TUNNEL_PKT is not used by any PMD,
remove it from mbuf API and from csumonly (testpmd). In csumonly, the
PKT_TX_OUTER_IP_CKSUM flag is already set for vxlan checksum, providing
enough information to the underlying driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
609dd68ef1 mbuf: enhance the API documentation of offload flags
Based on http://dpdk.org/ml/archives/dev/2015-January/011127.html

Also adapt the csum forward engine code to the API.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
fac4c750b2 mbuf: remove flag alias for IP checksum
The alias PKT_TX_IPV4_CSUM is only used in one place of i40e driver.
Remove it and only keep the legacy flag PKT_TX_IP_CSUM.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Jia Yu
3a52e64742 lib: fix cache alignment of structures
Include rte_memory.h for lib files that use __rte_cache_aligned
attribute.

Consider the following code:

	struct per_core_foo {
		...
	} __rte_cache_aligned;

	struct global_foo {
		struct per_core_foo foo[RTE_MAX_CORE];
	};

If __rte_cache_aligned is not defined (rte_memory.h is not included),
the code compiles but the structure is not aligned... it defines the
structure and creates a global variable called __rte_cache_aligned.
And this can lead to really bad things if this code is in a .h that
is included by files that may or may not include rte_memory.h
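
One way to guard against this pitfall, sketched under the assumption of a GCC or Clang toolchain: `SKETCH_CACHE_ALIGNED` is a hypothetical stand-in for the real macro, and the compile-time check is an illustration, not something this commit adds.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CACHE_LINE 64
/* GCC/Clang attribute syntax; stand-in for the macro from rte_memory.h. */
#define SKETCH_CACHE_ALIGNED __attribute__((aligned(SKETCH_CACHE_LINE)))

struct per_core_foo {
	unsigned long counter;
} SKETCH_CACHE_ALIGNED;

/*
 * If the macro were undefined, the line above would silently declare a
 * global variable instead of aligning the type. This check turns that
 * mistake into a build failure.
 */
_Static_assert(_Alignof(struct per_core_foo) == SKETCH_CACHE_LINE,
	       "per_core_foo must be cache-line aligned");

struct global_foo {
	struct per_core_foo foo[4];  /* one cache line per entry */
};
```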

Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Konstantin Ananyev
51e16682cf ixgbe: do not override buffer length
The template mbuf_initializer is hard coded with a buflen which
might have been set differently by the application at the time of
mbuf pool creation.

- move buf_len fields out of rearm_data marker.
- make ixgbe_recv_pkts_vec() not touch buf_len field at all
(as all other RX functions behave).

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
2014-12-05 22:57:18 +01:00
Jijiang Liu
c14236f210 mbuf: replace inner fields by outer fields semantic
Replace the inner_l2_len and the inner_l3_len field with the
outer_l2_len and outer_l3_len field, and rework csum forward engine
and i40e PMD due to these changes.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Jijiang Liu
1c3b7c33e9 mbuf: add Tx offloading flags for tunnels
Replace PKT_TX_VXLAN_CKSUM with PKT_TX_UDP_TUNNEL_PKT in order to indicate
that a packet is a UDP tunneling packet, and introduce 3 TX offload flags for
outer IP TX checksum: PKT_TX_OUTER_IP_CKSUM, PKT_TX_OUTER_IPV4
and PKT_TX_OUTER_IPV6.
Rework csum forward engine and i40e PMD due to these changes.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Jijiang Liu
711ba9e23e mbuf: remove aliasing of Tx offloading flags with Rx ones
PKT_TX_IPV4 and PKT_TX_IPV6 are redefined so that they no longer alias the
Rx flags. This avoids sending a packet with stale information, as in:
  - we receive a Ether/IP6/IP4/L4/data packet
  - the driver sets PKT_RX_IPV6_HDR
  - the stack decapsulates IP6
  - the stack sends the packet, it has the PKT_TX_IPV6 flag but it's an IPv4 packet.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Thomas Monjalon
df2e9cec22 mbuf: sort TCP segmentation offload flag
Due to reordering conflicts, the TSO flag was not sorted.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-27 10:39:21 +01:00
Olivier Matz
1224decaa4 ixgbe: support TCP segmentation offload
Implement TSO (TCP segmentation offload) in ixgbe driver. The driver is
now able to use the PKT_TX_TCP_SEG mbuf flag and mbuf hardware offload information
(l2_len, l3_len, l4_len, tso_segsz) to configure the hardware support of
TCP segmentation.

In ixgbe, when doing TSO, the IP length must not be included in the TCP
pseudo header checksum. A new function ixgbe_fix_tcp_phdr_cksum() is
used to fix the pseudo header checksum of the packet before giving it to
the hardware.

In the patch, the tx_desc_cksum_flags_to_olinfo() and
tx_desc_ol_flags_to_cmdtype() functions have been reworked to make them
clearer. This should not impact performance as gcc (version 4.8 in my
case) is smart enough to convert the tests into code that does not
contain any branch instruction.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:56 +01:00
Olivier Matz
4199fdea60 mbuf: generic support for TCP segmentation offload
Some of the NICs supported by DPDK can accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with a valid TCP header and a size of up to 64K, and delegates the
segmentation to the NIC.

Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).

To delegate the TCP segmentation to the hardware, the user has to:

- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
  the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo header checksum without taking ip_len into account,
  and set it in the TCP header, for instance by using
  rte_ipv4_phdr_cksum(ip_hdr, ol_flags)
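
The steps above can be sketched as follows. The flag bits, `struct sk_mbuf`, `sk_phdr_sum()` and `sk_request_tso()` are hypothetical, the header lengths assume IPv4 and TCP without options, and the checksum helper works on host-order values for readability, whereas a real implementation operates on the network-order header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits and struct; the real ones are in rte_mbuf.h. */
#define SK_TX_TCP_SEG  (1ULL << 0)  /* implies TCP checksum offload */
#define SK_TX_IP_CKSUM (1ULL << 1)
#define SK_TX_IPV4     (1ULL << 2)

struct sk_mbuf {
	uint64_t ol_flags;
	uint8_t l2_len;      /* Ethernet header bytes */
	uint8_t l3_len;      /* IP header bytes */
	uint8_t l4_len;      /* TCP header bytes */
	uint16_t tso_segsz;  /* MSS of the segments the NIC will emit */
};

/*
 * Folded, non-complemented pseudo-header sum over host-order values.
 * For TSO the caller passes len == 0, since the length differs per segment.
 */
static uint16_t
sk_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto, uint16_t len)
{
	uint32_t sum = (src >> 16) + (src & 0xffff) +
		       (dst >> 16) + (dst & 0xffff) + proto + len;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Fill the offload metadata for an IPv4/TCP packet without options. */
static void sk_request_tso(struct sk_mbuf *m, uint16_t mss)
{
	assert(mss != 0);
	m->ol_flags |= SK_TX_TCP_SEG | SK_TX_IPV4 | SK_TX_IP_CKSUM;
	m->l2_len = 14;
	m->l3_len = 20;
	m->l4_len = 20;
	m->tso_segsz = mss;
}
```

The caller would still have to zero the IP checksum in the packet and store the pseudo-header sum in the TCP header; both steps touch packet bytes and are left out of this sketch.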

The API is inspired by ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hw/drivers in the future.

This commit also reworks the way l2_len and l3_len are used in igb
and ixgbe drivers as the l2_l3_len is not available anymore in mbuf.

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:56 +01:00
Olivier Matz
6006818cfb net: new checksum functions
Introduce new functions to calculate checksums. These new functions
are derived from the ones provided in csumonly.c but slightly reworked.
There is still some room for future optimization of these functions
(maybe SSE/AVX, ...).

This API will be modified in the next commits by the introduction of
TSO that requires a different pseudo header checksum to be set in the
packet.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
e4a1c50e69 mbuf: get the name of offload flags
In test-pmd (rxonly.c), the code is able to dump the list of ol_flags.
The issue is that the list of flags in the application has to be
synchronized with the flags defined in rte_mbuf.h.

This patch introduces 2 new functions, rte_get_rx_ol_flag_name()
and rte_get_tx_ol_flag_name(), that return the name of a flag from
its mask. It also fixes rxonly.c to use these new functions and to
display the proper flags.
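
A minimal sketch of a mask-to-name lookup of this kind. The flag names and the `sk_*` functions are hypothetical; the point is that the names live next to the flag definitions, so an application dumping flags cannot fall out of sync.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical RX flag bits, for illustration only. */
#define SK_RX_VLAN     (1ULL << 0)
#define SK_RX_RSS_HASH (1ULL << 1)
#define SK_RX_FDIR     (1ULL << 2)

/* Return the name of a single-bit flag mask, or NULL if it is unknown. */
static const char *sk_rx_flag_name(uint64_t mask)
{
	switch (mask) {
	case SK_RX_VLAN:     return "SK_RX_VLAN";
	case SK_RX_RSS_HASH: return "SK_RX_RSS_HASH";
	case SK_RX_FDIR:     return "SK_RX_FDIR";
	default:             return NULL;
	}
}

/* Count the set bits in ol_flags that have a known name. */
static unsigned int sk_count_named(uint64_t ol_flags)
{
	unsigned int bit, n = 0;

	for (bit = 0; bit < 64; bit++) {
		uint64_t mask = 1ULL << bit;

		if ((ol_flags & mask) && sk_rx_flag_name(mask) != NULL)
			n++;
	}
	return n;
}
```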

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
b161f72107 mbuf: remove too specific flags mask
This definition is specific to Intel PMD drivers and its definition
"indicate what bits required for building TX context" shows that it
should not be in the generic rte_mbuf.h but in the PMD driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
b029fd236d mbuf: add help about Tx checksum flags
Describe how to use hardware checksum API.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
340e52d9bd mbuf: reorder Tx flags
The tx mbuf flags are now ordered from the lowest value to the
highest. Add comments to explain where to add new flags.

By the way, move the PKT_TX_VXLAN_CKSUM at the right place.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-26 18:54:41 +01:00
Jingjing Wu
829a1c2c41 mbuf: extend flow director field
The fdir field in rte_mbuf is extended to support flex bytes reported on an fdir match.
At most 8 flex bytes can be reported.
The reported flex bytes are part of the flexible payload.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-25 00:06:03 +01:00
Qinglai Xiao
9f2e99d171 mbuf: add usr alias for hash
This field is added for librte_distributor. Users of librte_distributor
are advised to set the value of mbuf->hash.usr before calling
rte_distributor_process. The value of usr is the tag that identifies
the flow.

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-13 12:26:10 +01:00
Thomas Monjalon
7e33267d41 fix VXLAN acronym
Each letter must be uppercased:
Virtual eXtensible Local Area Network (VXLAN)

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-27 16:11:04 +01:00
Jijiang Liu
77b8301733 i40e: VXLAN Tx checksum offload
Support VXLAN Tx checksum offload, which includes
  - outer L3(IP) checksum offload
  - inner L3(IP) checksum offload
  - inner L4(UDP, TCP and SCTP) checksum offload

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
2014-10-27 14:37:34 +01:00
Jijiang Liu
15dbb63ef9 i40e: VXLAN packet identification
Implement the configuration API of VXLAN destination UDP port,
and add new Rx offload flags for supporting VXLAN packet offload.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
2014-10-27 14:37:34 +01:00
Jijiang Liu
73b7d59cf4 mbuf: add fields for tunnels
Replace the "reserved2" field with the "packet_type" field
and add the "inner_l2_l3_len" field in the rte_mbuf structure.
The "packet_type" field is used to indicate ordinary packet format and also
tunneling packet format such as IP in IP, IP in GRE, MAC in GRE and MAC in UDP.
The "inner_l2_len" and the "inner_l3_len" fields are added
in the second cache line, they use 2 bytes for TX offloading of tunnels.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yong Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-27 11:43:51 +01:00
Bruce Richardson
17c696b4ff mbuf: comment for ctrl mbuf flag
Add in a doxygen comment for the ctrl mbuf flag definition.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-08 14:45:36 +02:00
Bruce Richardson
578cca42da mbuf: update Rx flag format
Update the format of the RX flags to match that of the TX flags. In
general the flags are now specified as "1ULL << X", with a few
exceptions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-08 14:45:14 +02:00
Bruce Richardson
dff38e8e0d mbuf: group Tx flags near end of field
This patch takes the existing TX flags defined for the mbuf and shifts
each uniquely defined one left so that additional RX flags can be
defined without having RX and TX flags mixed together. Under the new
scheme, RX flags start at bit 0 and work left, TX flags start at bit 55
and work right, and bits 56-63 are reserved for generic mbuf use, not
for offloads.
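
The layout described above can be pictured with a few hypothetical macros. None of the `SK_*` names exist in rte_mbuf.h; they only encode the rule "RX from bit 0 upward, TX from bit 55 downward, bits 56-63 reserved".

```c
#include <assert.h>
#include <stdint.h>

#define SK_RX_FLAG(n)   (1ULL << (n))         /* RX: bit 0 upward */
#define SK_TX_FLAG(n)   (1ULL << (55 - (n)))  /* TX: bit 55 downward */
#define SK_GENERIC_MASK (0xffULL << 56)       /* bits 56-63, not offloads */

#define SK_RX_FIRST  SK_RX_FLAG(0)
#define SK_RX_SECOND SK_RX_FLAG(1)
#define SK_TX_FIRST  SK_TX_FLAG(0)
#define SK_TX_SECOND SK_TX_FLAG(1)

/* The highest TX flag must stay below the reserved generic bits. */
_Static_assert((SK_TX_FIRST & SK_GENERIC_MASK) == 0,
	       "TX flags must stay below bit 56");

/* Strip the generic bits and keep only RX/TX offload flags. */
static inline uint64_t sk_offload_bits(uint64_t ol_flags)
{
	return ol_flags & ~SK_GENERIC_MASK;
}
```

With the two groups growing toward each other, new RX and TX flags can be added without interleaving until the free bits in the middle run out.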

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-10-08 14:43:47 +02:00
Bruce Richardson
a62bfb72b9 mbuf: switch vlan_tci and reserved2 fields
Move the vlan_tci field up by two bytes in the mbuf data structure. This
has two effects:
* Ensures the ixgbe vector driver places the vlan tag in the correct
  place in the mbuf.
* Allows a second vlan tag field, if one is added in the future, to be
  placed after the existing vlan field, rather than before.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-10-08 14:24:59 +02:00
Bruce Richardson
4cd917b308 mbuf: add userdata pointer field
While some applications may store metadata about packets in the packet
mbuf headroom, this is not a workable solution for packet metadata which
is either:
* larger than the headroom (or headroom is needed for adding pkt headers)
* needs to be shared or copied among packets

To support these use cases in applications, we reserve a general
"userdata" pointer field inside the second cache-line of the mbuf. This
is better than having the application store the pointer to the external
metadata in the packet headroom, as it saves an additional cache-line
from being used.

Apart from storing metadata, this field also provides a general 8-byte
scratch space inside the mbuf for any other application uses that are
applicable.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-10-08 14:24:59 +02:00
Bruce Richardson
7f78e67701 mbuf: ensure next pointer is set to null on free
The receive functions for packets do not modify the next pointer so
the next pointer should always be cleared on mbuf free, just in case.
The slow-path TX needs to clear it, and the standard mbuf free function
also needs to clear it. Fast path TX does not handle chained mbufs so
is unaffected.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-10-08 14:12:31 +02:00
Bruce Richardson
bd815dc0b7 mbuf: move l2_len and l3_len to second cache line
The l2_len and l3_len fields are used for TX offloads and so should be
put on the second cache line, along with the other fields only used on
TX.

The l2 and l3 lengths can be accessed as a single uint16_t for
performance, as well as individually.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-09-17 18:57:29 +02:00
Bruce Richardson
f867492346 mbuf: split mbuf across two cache lines.
This change splits the mbuf in two to move the pool and next pointers to
the second cache line. This frees up 16 bytes in first cache line.

The reason for this change is that we believe that there is no possible
way that we can ever fit all the fields we need to fit into a 64-byte
mbuf, and so we need to start looking at a 128-byte mbuf instead. Examples
of new fields that need to fit in include:
* 32-bits more for filter information for support for the new filters in
  the i40e driver (and possibly other future drivers)
* an additional 2-4 bytes for storing info on a second vlan tag to allow
  drivers to support double Vlan/QinQ
* 4-bytes for storing a sequence number to enable out of order packet
  processing and subsequent packet reordering
as well as potentially a number of other fields or splitting out fields
that are superimposed over each other right now, e.g. for the qos scheduler.
We also want to allow space for use by other non-Intel NIC drivers that may
be open-sourced to dpdk.org in the future too, where they support fields
and offloads that currently supported hardware doesn't.

If we accept the fact of a 2-cache-line mbuf, then the issue becomes
how to rework things so that we spread our fields over the two
cache lines while causing the lowest slow-down possible. The general
approach that we are looking to take is to focus the first cache
line on fields that are updated on RX, so that receive only deals
with one cache line. The second cache line can be used for application
data and information that will only be used on the TX leg. This would
allow us to work on the first cache line in RX as now, and have the
second cache line being prefetched in the background so that it is
available when necessary. Hardware prefetches should help us out
here. We also may move rarely used, or slow-path RX fields e.g. such
as those for chained mbufs with jumbo frames, to the second
cache line, depending upon the performance impact and bytes savings
achieved.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-09-17 18:57:19 +02:00
Bruce Richardson
587c63cea9 mbuf: add named points inside the structure
Add markers or "labels" at given points inside the mbuf which can be
used instead of individual fields to identify the start of logical
sections inside the mbuf.

The use of typedefs and dummy fields was chosen over using unions
for a couple of reasons:
* unions cause an extra level of indentation (more likely two levels as
  a union containing a struct for multiple fields would be needed). This
  makes the lines longer than they need to be and increases the need for
  wrapping. [This was the main reason]
* with markers, you can apply multiple markers at the same point if
  wanted.
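
The marker approach can be sketched with zero-length arrays, which are a GNU C extension accepted by GCC and Clang. `SK_MARKER`, the field names and `sk_rearm_start()` are hypothetical. A pointer-typed marker also forces pointer alignment at its position, so this layout places the markers where the offset is already suitably aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zero-length array type: names a position, occupies no space (GNU C). */
typedef void *SK_MARKER[0];

struct sk_mbuf {
	SK_MARKER cacheline0;     /* start of the structure */
	void *buf_addr;

	SK_MARKER rearm_data;     /* fields rewritten when re-arming RX */
	uint16_t data_off;
	uint16_t refcnt;
	uint32_t reserved;

	SK_MARKER rx_fields;      /* fields filled from the RX descriptor */
	uint32_t pkt_len;
	uint16_t data_len;
};

/* Address of the re-arm section, whatever its first field happens to be. */
static inline void *sk_rearm_start(struct sk_mbuf *m)
{
	assert(m != NULL);
	return (void *)m->rearm_data;
}
```

Code that uses the marker keeps working if a field is later inserted at the start of the section, and no extra union or struct nesting is needed.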

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-09-17 18:53:40 +02:00
Bruce Richardson
fa363849d0 mbuf: move metadata macros to rte_port library
The metadata macros are only used by libs and apps using the rte_port
packet framework library, so move them to a header file there.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-09-17 18:53:40 +02:00
Bruce Richardson
d93677d6e2 mbuf: use macros only to access metadata
Removed the explicit zero-sized metadata definition at the end of the
mbuf data structure. Updated the metadata macros to take account of this
change so that all existing code which uses those macros still works.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-09-17 18:53:40 +02:00