Commit Graph

5678 Commits

Author SHA1 Message Date
Flavio Leitner
84c39beb2f vhost: fix IPv4 checksum
Currently the IPv4 header checksum is calculated including its
current value, which can be a valid checksum or just garbage.
In any case, if the original value is not zero, then the result
is always wrong.

The IPv4 checksum is defined in RFC791, page 14 says:
  Header Checksum:  16 bits

  The checksum algorithm is:
  The checksum field is the 16 bit one's complement of the one's
  complement sum of all 16 bit words in the header.  For purposes of
  computing the checksum, the value of the checksum field is zero.

Thus force the csum field to always be zero.

Fixes: b08b8cfeb2 ("vhost: fix IP checksum")
Cc: stable@dpdk.org

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:22 +02:00
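
The rule quoted from RFC 791 in commit 84c39beb2f can be sketched in a few lines of C. This is a minimal illustration, not the vhost code: the function name `ipv4_hdr_cksum` and the constant `IPV4_CSUM_OFFSET` are hypothetical, and the header is assumed to be a plain byte array in network order.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the 16-bit Header Checksum field in an IPv4 header. */
#define IPV4_CSUM_OFFSET 10

/*
 * One's complement of the one's complement sum of all 16-bit words in
 * the header. The stored checksum field is skipped, which is the same
 * as treating it as zero, so any stale value in it cannot affect the
 * result. hdr_len is the header length in bytes (a multiple of 4).
 */
static uint16_t
ipv4_hdr_cksum(const uint8_t *hdr, size_t hdr_len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < hdr_len; i += 2) {
		if (i == IPV4_CSUM_OFFSET)
			continue; /* checksum field counts as zero */
		sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
	}

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)~sum;
}
```

With this approach the result is the same whether the field initially holds a valid checksum, zero, or garbage.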
Ilya Maximets
70c7747689 vhost: disable host TSO for linear buffers without extbuf
If linear buffers are requested and external buffers are not, vhost
will not be able to receive any buffer that doesn't fit in a
single mbuf.  Moreover, if such a buffer appears in a vring,
it will never be dequeued and the whole vring will become dead,
breaking the network connection.

Disable segmentation offloading from the host side to avoid
having such big buffers.

Fixes: c3ff0ac70a ("vhost: improve performance by supporting large buffer")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:23:06 +02:00
Ilya Maximets
19896c7393 vhost: return error message for mbuf allocation failure
mbuf allocation failure is a hard failure that highlights some
significant issues with memory pool size or an mbuf leak.

We still have the message for subsequent chained mbufs, but not
for the first one.  It was removed while introducing extbuf
support for large buffers.  But it was useful for catching
mempool issues and needs to be restored.

Fixes: c3ff0ac70a ("vhost: improve performance by supporting large buffer")

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2019-10-25 19:22:46 +02:00
Marvin Liu
f974ca7a29 vhost: optimize packed ring dequeue when in-order
When the VIRTIO_F_IN_ORDER feature is negotiated, vhost can optimize the
dequeue function by updating only the first used descriptor.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
31d6c6a5b8 vhost: optimize packed ring dequeue
Optimize vhost device packed ring dequeue function by splitting batch
and single functions. Non-chained and direct descriptors will be handled
by batch and others will be handled by single as before.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
d1eafb5322 vhost: add packed ring zcopy batch and single dequeue
Add vhost packed ring zero-copy batch and single dequeue functions, like
the normal dequeue path.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
0294211bb6 vhost: optimize packed ring enqueue
Optimize vhost device packed ring enqueue function by splitting batch
and single functions. Packets that can be filled into one desc will be
handled by batch and others will be handled by single as before.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
c119edbc2d vhost: update packed ring dequeue
Buffer as many used ring updates as possible in the vhost dequeue
function to coordinate with the virtio driver. To support buffering, the
shadow used ring element should contain the descriptor's flags. The first
shadowed ring index is recorded for calculating the buffered number.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
f41516c309 vhost: flush batched enqueue descs directly
Flush used elements when batched enqueue function is finished.
Descriptor's flags are pre-calculated as they will be reset by vhost.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
33d4a554f9 vhost: flush enqueue updates by cacheline
Buffer vhost packed ring enqueue updates and flush the ring descs once
the buffered content fills one cacheline. Thus virtio can receive
packets at a faster frequency.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
75ed516978 vhost: add packed ring batch dequeue
Add a batch dequeue function for packed ring, like the enqueue function.
The batch dequeue function will not support chained descriptors; the
single packet dequeue function will handle them.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
47ac243ac4 vhost: add packed ring single dequeue
Add vhost single packet dequeue function for packed ring, leaving space
for the shadow used ring update function.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
ef861692c3 vhost: add packed ring batch enqueue
Batch enqueue function will first check whether descriptors are cache
aligned. It will also check prerequisites in the beginning. Batch
enqueue function does not support chained mbufs; the single packet
enqueue function will handle them.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
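
The prerequisites described in commit ef861692c3 can be pictured as a guard that decides between the batch and single paths. The sketch below is illustrative only: `can_enqueue_batch`, `struct pkt` and the constants are hypothetical names, a 64-byte cache line is assumed, and the descriptor ring base is assumed to be cache-line aligned so that index alignment implies address alignment.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64u  /* assumed cache line size */
#define DESC_SIZE       16u  /* packed descriptor: addr(8) + len(4) + id(2) + flags(2) */
#define BATCH_SIZE      (CACHE_LINE_SIZE / DESC_SIZE)

/* Hypothetical stand-in for an mbuf: only the segment count matters here. */
struct pkt {
	uint16_t nb_segs;
};

/*
 * Return true when BATCH_SIZE packets starting at pkts[0] may take the
 * batch path: the batch must start on a cache line boundary, must not
 * run past the end of the ring, and no packet may be chained.
 */
static bool
can_enqueue_batch(uint16_t avail_idx, uint16_t ring_size,
		  const struct pkt *pkts, uint16_t count)
{
	uint16_t i;

	if (count < BATCH_SIZE)
		return false;
	if (avail_idx & (BATCH_SIZE - 1))
		return false; /* descriptors not cache aligned */
	if ((uint32_t)avail_idx + BATCH_SIZE > ring_size)
		return false; /* batch would wrap around the ring */
	for (i = 0; i < BATCH_SIZE; i++) {
		if (pkts[i].nb_segs != 1)
			return false; /* chained mbuf: use the single path */
	}
	return true;
}
```

When the guard fails, a caller would fall back to the single packet enqueue function for the next packet and try the batch path again afterwards.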
Marvin Liu
934274065a vhost: try to unroll for each loop
Create a macro for adding an unroll pragma before a for-each loop. Batch
functions will consist of several small loops, which can be
optimized by the compiler's loop unrolling pragma.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
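
A macro of the kind described in commit 934274065a might look like the following sketch. The macro name `for_each_try_unroll`, the unroll factor of 4 and the helper `batch_total_len` are assumptions made for illustration; the pragma spellings are the ones GCC (version 8 and later) and Clang accept, and other compilers simply get a plain loop.

```c
#include <stdint.h>

#define BATCH_SIZE 4 /* assumed batch size for this sketch */

/* Emit a compiler-specific unroll hint in front of an ordinary for loop. */
#if defined(__clang__)
#define for_each_try_unroll(iter, val, size) _Pragma("unroll 4") \
	for ((iter) = (val); (iter) < (size); (iter)++)
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#define for_each_try_unroll(iter, val, size) _Pragma("GCC unroll 4") \
	for ((iter) = (val); (iter) < (size); (iter)++)
#else
#define for_each_try_unroll(iter, val, size) \
	for ((iter) = (val); (iter) < (size); (iter)++)
#endif

/* Example of a small fixed-trip-count loop that benefits from the hint. */
static uint32_t
batch_total_len(const uint32_t lens[BATCH_SIZE])
{
	uint32_t total = 0;
	uint16_t i;

	for_each_try_unroll(i, 0, BATCH_SIZE)
		total += lens[i];

	return total;
}
```

The hint is only a request: the loop behaves identically whether or not the compiler unrolls it.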
Marvin Liu
93520085ef vhost: add packed ring single enqueue
Add vhost enqueue function for single packet, leaving space
for the flush used ring function.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
Marvin Liu
86202aae94 vhost: add packed ring indexes increasing function
When enqueuing or dequeuing, the virtqueue's local available and used
indexes are increased.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-25 19:20:47 +02:00
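
For a packed ring, increasing a local index also means tracking the wrap counter each time the index passes the end of the ring. A minimal sketch of such a helper follows; `struct ring_state` and `ring_inc_last_avail` are hypothetical names, and `num` is assumed not to exceed the ring size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the per-virtqueue state needed for the sketch. */
struct ring_state {
	uint16_t size;            /* number of descriptors in the ring */
	uint16_t last_avail_idx;  /* local available index, always < size */
	bool avail_wrap_counter;  /* toggles each time the index wraps */
};

/* Advance the local available index by num descriptors (num <= size). */
static void
ring_inc_last_avail(struct ring_state *r, uint16_t num)
{
	r->last_avail_idx += num;
	if (r->last_avail_idx >= r->size) {
		r->avail_wrap_counter = !r->avail_wrap_counter;
		r->last_avail_idx -= r->size;
	}
}
```

The used index would be handled the same way with its own wrap counter.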
Flavia Musatescu
512d873ff1 net: add new header file for VXLAN
The VXLAN related definitions and structures are moved from
rte_ether.h to a new header file: rte_vxlan.h.

Also introduce a new macro for the VXLAN default port ID:
RTE_VXLAN_DEFAULT_PORT

Signed-off-by: Flavia Musatescu <flavia.musatescu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Raslan Darawsheh <rasland@mellanox.com>
2019-10-25 19:00:22 +02:00
David Marchand
40549b086c net: hide internal CRC defines
No need to keep those (non RTE_ prefixed) defines public.
Hide them where we use them.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-25 19:00:22 +02:00
David Marchand
d613fe10b3 net: add rte prefix to MPLS structure
Add 'rte_' prefix to structures:
- rename struct mpls_hdr as struct rte_mpls_hdr.

Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-25 19:00:22 +02:00
David Marchand
2379572969 net: add missing rte prefix on PPPoE defines
Those two defines have been missed.

Fixes: 35b2d13fd6 ("net: add rte prefix to ether defines")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-25 19:00:22 +02:00
Ciara Power
22a0763673 ethdev: fix include of ethernet header file
The include for rte_ether.h in each of these files should not use
quotes, as the header file is not in the librte_ethdev directory.

These are now updated to use <> symbols, to search directories
pre-designated by the compiler.

Fixes: 57668ed7bc ("net: move ethernet definitions to the net library")
Cc: stable@dpdk.org

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-10-25 19:00:22 +02:00
Ting Xu
d8e5e69f3a app/testpmd: add GTP parsing and Tx checksum offload
Enable testpmd to forward GTP packet in csum fwd mode.

A GTP header structure (without optional fields and extension header)
is defined in new rte_gtp.h.
A parser function in testpmd is added.  GTPU and GTPC packets are both
supported, with respective UDP destination port and GTP message type.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-25 19:00:22 +02:00
Olivier Matz
4958ca3a44 mbuf: support dynamic fields and flags
Many features require storing data inside the mbuf. As the room in the
mbuf structure is limited, it is not possible to have a field for each
feature. Also, changing fields in the mbuf structure can break the API
or ABI.

This commit addresses these issues, by enabling the dynamic registration
of fields or flags:

- a dynamic field is a named area in the rte_mbuf structure, with a
  given size (>= 1 byte) and alignment constraint.
- a dynamic flag is a named bit in the rte_mbuf structure.

The typical use case is a PMD that registers space for an offload
feature, when the application requests to enable this feature.  As
the space in mbuf is limited, the space should only be reserved if it
is going to be used (i.e. when the application explicitly asks for it).

The registration can be done at any moment, but it is not possible
to unregister fields or flags.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-26 19:08:50 +02:00
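
The registration model described in commit 4958ca3a44 (named fields with a size and an alignment, registered at any time and never unregistered) can be illustrated with a toy allocator. Everything below is a hypothetical sketch and not the DPDK API: `struct dyn_registry`, `dyn_field_register` and the 16-byte area size are invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define DYN_AREA_SIZE  16 /* assumed number of free bytes in the structure */
#define DYN_MAX_FIELDS 8
#define DYN_NAME_LEN   32

struct dyn_field {
	char name[DYN_NAME_LEN];
	size_t size;
	size_t align;
	size_t offset;
};

struct dyn_registry {
	unsigned char used[DYN_AREA_SIZE]; /* 1 when the byte is reserved */
	struct dyn_field fields[DYN_MAX_FIELDS];
	size_t nb_fields;
};

/*
 * Reserve size bytes at an offset that is a multiple of align (a power
 * of two). Registering an existing name with the same parameters returns
 * the existing offset. Returns the offset, or -1 on failure. There is
 * deliberately no way to unregister a field.
 */
static int
dyn_field_register(struct dyn_registry *reg, const char *name,
		   size_t size, size_t align)
{
	size_t i, j, off;

	if (size == 0 || size > DYN_AREA_SIZE || align == 0 ||
	    (align & (align - 1)) != 0 || strlen(name) >= DYN_NAME_LEN)
		return -1;

	for (i = 0; i < reg->nb_fields; i++) {
		if (strcmp(reg->fields[i].name, name) != 0)
			continue;
		if (reg->fields[i].size == size && reg->fields[i].align == align)
			return (int)reg->fields[i].offset;
		return -1; /* same name, conflicting parameters */
	}
	if (reg->nb_fields == DYN_MAX_FIELDS)
		return -1;

	/* First fit over aligned offsets. */
	for (off = 0; off + size <= DYN_AREA_SIZE; off += align) {
		for (j = 0; j < size && !reg->used[off + j]; j++)
			;
		if (j < size)
			continue; /* some byte in the window is taken */
		memset(&reg->used[off], 1, size);
		strcpy(reg->fields[reg->nb_fields].name, name);
		reg->fields[reg->nb_fields].size = size;
		reg->fields[reg->nb_fields].align = align;
		reg->fields[reg->nb_fields].offset = off;
		reg->nb_fields++;
		return (int)off;
	}
	return -1; /* no room left */
}
```

A driver would typically register its field once, when the application enables the corresponding offload, and keep the returned offset for later accesses.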
Anatoly Burakov
6d3f9917ff eal: fix memory config allocation for multi-process
Currently, mem config will be mapped without using the virtual
area reservation infrastructure, which means it will be mapped
at an arbitrary location. This may cause failures to map the
shared config in secondary process due to things like PCI
whitelist arguments allocating memory in a space where the
primary has allocated the shared mem config.

Fix this by using virtual area reservation to reserve space for
the mem config, thereby avoiding the problem and reserving the
shared config (hopefully) far away from any normal memory
allocations.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-26 18:03:26 +02:00
Anatoly Burakov
6080796f65 mem: make base address hint OS specific
Not all OSes follow Linux's memory layout, which may lead to
problems when following the suggested common address hint in the
absence of a base-virtaddr flag. Make this address hint OS-specific.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-26 18:03:24 +02:00
Pallavi Kadam
7e708cd8c6 eal: move CPU operations to OS specific headers
Move RTE_CPU* definitions from the common code to the Linux and
FreeBSD rte_os.h files to avoid #ifdef clutter.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Signed-off-by: Antara Ganesh Kolar <antara.ganesh.kolar@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2019-10-26 17:06:41 +02:00
Pavan Nikhilesh
9b0a1dadc3 reciprocal: fix off-by-one with 32-bit divisor
Fix off-by-one error in 64bit reciprocal division when divisor is 32bit.

Caught with the unit test:

RTE>>reciprocal_division
Validating unsigned 32bit division.
Validating unsigned 64bit division.
Validating unsigned 64bit division with 32bit divisor.
Division failed, 16983222950483802557/819 = expected 20736535959076681
result 20736535959076682
Validating division by power of 2.
Test Failed

Fixes: 6d45659eac ("eal: add u64-bit variant for reciprocal divide")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2019-10-26 16:09:51 +02:00
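
For readers unfamiliar with the technique behind commit 9b0a1dadc3, the sketch below shows the simpler 32-bit form of reciprocal division: the divisor is turned once into a multiplier and two shifts, after which each division costs a multiply and a few shifts. This is not the 64-bit code that the commit fixes; the names `struct reciprocal`, `reciprocal_value` and `reciprocal_divide` are chosen for the example, and the divisor is assumed to be non-zero.

```c
#include <stdint.h>

/* Precomputed data for dividing by one fixed 32-bit divisor. */
struct reciprocal {
	uint32_t m;   /* magic multiplier */
	uint8_t sh1;  /* first shift: 0 or 1 */
	uint8_t sh2;  /* second shift */
};

/* Position of the highest set bit, counting from 1; 0 when x is 0. */
static int
fls_u32(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Compute the multiplier and shifts for divisor d (d must not be 0). */
static struct reciprocal
reciprocal_value(uint32_t d)
{
	struct reciprocal r;
	int l = fls_u32(d - 1); /* ceil(log2(d)) */
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l > 1 ? 1 : l);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Return a / d using only a multiply, an add, a subtract and shifts. */
static uint32_t
reciprocal_divide(uint32_t a, struct reciprocal r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

Results of such a routine are best validated against the plain `/` operator over many inputs, which is how the failing case quoted in the commit message was caught.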
Konstantin Ananyev
3eb860b08e mbuf: move definitions into a separate file
Right now, inclusion of the rte_mbuf.h header can cause inclusion of
some arch/os specific headers.
That prevents it from being included directly by some
non-DPDK (but related) entities: KNI, BPF programs, etc.
To overcome that problem, a separate definition of the rte_mbuf
structure is usually created within these entities.
That approach has a lot of drawbacks: code duplication, error proneness, etc.
This patch moves the rte_mbuf structure definition (and some related macros)
into a separate file that can be included by both rte_mbuf.h and
other non-DPDK entities.

Note that it doesn't introduce any change for current DPDK code.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Michel Machado <michel@digirati.com.br>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-25 19:30:38 +02:00
Konstantin Ananyev
2dcb5f7987 eal: move cache line and IOVA related definitions
Right now RTE_CACHE_ and IOVA definitions are located inside rte_memory.h.
That might cause unwanted inclusion of arch/os specific header files.
See [1] for a particular example of the problem.
Probably the simplest way to deal with such problems is to
move these definitions into rte_common.h.

Note that this move doesn't introduce any change in functionality.

[1] https://bugs.dpdk.org/show_bug.cgi?id=321

Suggested-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Michel Machado <michel@digirati.com.br>
2019-10-25 19:30:36 +02:00
Rahul Shah
5d92c4e592 port: add eventdev port type
Add a new port type called eventdev to the
rte_port library.

Signed-off-by: Rahul Shah <rahul.r.shah@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2019-10-25 18:29:48 +02:00
Jasvinder Singh
68c1f26d42 sched: support 64-bit values
Modify internal structure and functions to support 64-bit
values for rates and stats parameters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2019-10-25 18:07:37 +02:00
Jasvinder Singh
0edf18eee2 sched: add 64-bit values
To support high bandwidth network interfaces, all rates (port,
subport level token bucket and traffic class rates, pipe level
token bucket and traffic class rates) and stats counters defined
in public data structures (rte_sched.h) are modified to support
64 bit counters.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2019-10-25 18:07:26 +02:00
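
The need for 64-bit rate fields in commit 0edf18eee2 is easy to see with a little arithmetic. The sketch below assumes rates are expressed in bytes per second (an assumption, not stated in the commit message); the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a link speed in Gbit/s to a rate in bytes per second. */
static uint64_t
link_rate_bytes_per_sec(uint64_t gbps)
{
	return gbps * 1000000000ULL / 8;
}

/* True when the rate can be stored in a 32-bit field without truncation. */
static bool
rate_fits_u32(uint64_t rate)
{
	return rate <= UINT32_MAX;
}
```

Under that assumption a 10 Gbit/s port (1.25e9 bytes/s) still fits in 32 bits, while a 40 Gbit/s port (5e9 bytes/s) already exceeds the 32-bit maximum of about 4.29e9.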
Jasvinder Singh
def9c49267 sched: remove redundant code
Remove redundant data structure fields from port level data
structures and update the release notes.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:53:36 +02:00
Jasvinder Singh
831104f0e8 sched: update queue stats read for config flexibility
Modify pipe queue stats read function to allow different subports
of the same port to have different configuration in terms of number
of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:26 +02:00
Jasvinder Singh
2a718309fd sched: update pkt dequeue for flexible config
Modify scheduler packet dequeue operation to allow different
subports of the same port to have different configuration in terms
of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:22 +02:00
Jasvinder Singh
4d2ad6e34b sched: update grinder functions for config flexibility
Modify packet grinder functions of the scheduler to allow different
subports of the same port to have different configuration in terms
of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:19 +02:00
Jasvinder Singh
21dca4e3f6 sched: update memory compute to support flexibility
Update memory footprint compute function to allow subports of
the same port to have different configuration in terms of number of
pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:17 +02:00
Jasvinder Singh
6fbbb0ef48 sched: modify pkt enqueue for config flexibility
Modify packet enqueue operation of the scheduler to allow
different subports of the same port to have different configuration
in terms of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:14 +02:00
Jasvinder Singh
34a90f8665 sched: modify pipe functions for config flexibility
Modify pipe level functions to allow different subports of the same
port to have different configuration in terms of number of pipes,
pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:12 +02:00
Jasvinder Singh
ce7c4fd7c2 sched: add pipe config to subport level
Add pipe configuration to the subport level to allow different
subports of the same port to have different configuration in terms
of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:10 +02:00
Jasvinder Singh
d9213b829a sched: remove pipe params config from port level
Remove pipes configuration from the port level to allow different
subports of the same port to have different configuration in terms
of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:07 +02:00
Jasvinder Singh
b757097e37 sched: modify internal structs for config flexibility
Update internal structures related to port and subport to allow
different subports of the same port to have different configuration
in terms of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:51:04 +02:00
Jasvinder Singh
85f52aa422 sched: add pipe config params to subport struct
Add pipe configuration parameters to subport level structure to
allow different subports of the same port to have different
configuration in terms of number of pipes, pipe queue sizes, etc.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Signed-off-by: Lukasz Krakowiak <lukaszx.krakowiak@intel.com>
2019-10-25 17:49:45 +02:00
Ting Xu
d892768c6d mbuf: add GTP tunnel type
Add GTP tunnel type flag in mbuf for future use in GTP
Tx checksum offload.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-23 16:43:10 +02:00
Kiran Kumar K
01b3156d33 ethdev: add HIGIG2 key field to flow API
Add new rte_flow_item_higig2_hdr in order to match the higig2 header.
It is a layer 2.5 protocol used in Broadcom switches.
The header format is based on the following document:
http://read.pudn.com/downloads558/doc/comm/2301468/HiGig_protocol.pdf

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-23 16:43:10 +02:00
Ciara Power
400d758182 ethdev: check device promiscuous state
The promiscuous enable and disable functions now check the
promiscuous state of the device before checking if the dev_ops
function exists for the device.

This change is necessary to allow sample applications to run on
virtual PMDs, as previously -ENOTSUP was returned when the promiscuous
enable function was called. This caused the sample application to
fail unnecessarily.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-10-23 16:43:10 +02:00
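
The ordering change described in commit 400d758182 can be sketched as follows. The types and names (`struct dev`, `dev_promiscuous_enable`) are hypothetical stand-ins, not the ethdev structures; the point is only that the current state is checked before the lookup of the driver callback.

```c
#include <errno.h>
#include <stddef.h>

struct dev;
typedef int (*promisc_op_t)(struct dev *);

/* Hypothetical device: current state plus an optional driver callback. */
struct dev {
	int promiscuous;             /* 1 when promiscuous mode is already on */
	promisc_op_t promisc_enable; /* NULL when the driver lacks the op */
};

static int
dev_promiscuous_enable(struct dev *d)
{
	int ret;

	/* State check first: nothing to do if the mode is already enabled. */
	if (d->promiscuous == 1)
		return 0;

	/* Only now does a missing driver callback become an error. */
	if (d->promisc_enable == NULL)
		return -ENOTSUP;

	ret = d->promisc_enable(d);
	if (ret == 0)
		d->promiscuous = 1;
	return ret;
}
```

With this order, a device that always runs in promiscuous mode and provides no callback returns success instead of -ENOTSUP.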
David Marchand
7eca7f7fd0 net: add missing endianness annotations
OVS currently maintains a copy of those headers with the right endianness
annotations so that sparse checks can pass.

We introduced rte_beXX_t for better readability in v17.08.
Let's make use of them; OVS then only needs to override those rte_beXX_t
types by exposing a tweaked rte_byteorder.h header.

Other existing dpdk users won't be affected since rte_beXX_t types are
mapped to uintXX_t types.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-23 16:43:10 +02:00
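
The idea of an endianness annotation that maps to a plain integer type can be sketched as below. The names `be16_t`, `cpu_to_be16`, `be16_to_cpu` and `struct udp_ports` are hypothetical; they only mirror the pattern of a typedef that documents byte order for readers and static checkers while remaining an ordinary integer to the compiler.

```c
#include <stdint.h>
#include <string.h>

/* Annotated type: holds a 16-bit value stored in big-endian byte order. */
typedef uint16_t be16_t;

/* Header fields declared with the annotated type document their byte order. */
struct udp_ports {
	be16_t src_port;
	be16_t dst_port;
};

/* Convert from CPU order to big-endian, independent of host endianness. */
static be16_t
cpu_to_be16(uint16_t v)
{
	const uint8_t bytes[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xff) };
	be16_t out;

	memcpy(&out, bytes, sizeof(out));
	return out;
}

/* Convert from big-endian back to CPU order. */
static uint16_t
be16_to_cpu(be16_t v)
{
	uint8_t bytes[2];

	memcpy(bytes, &v, sizeof(bytes));
	return (uint16_t)((bytes[0] << 8) | bytes[1]);
}
```

Because the typedef is just `uint16_t`, existing code keeps compiling unchanged, while a project that wants stricter checking can substitute its own definition of the type.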
Simei Su
d3ae8c44b8 ethdev: extend RSS offload types
This patch reserves several bits as input set selection from the
high end of the 64 bits. They are combined with the existing ETH_RSS_*
flags to represent RSS types. This patch also checks the simultaneous
use of SRC_ONLY and DST_ONLY of the same level.

Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-10-23 16:43:09 +02:00
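
One plausible way to handle the simultaneous use of SRC_ONLY and DST_ONLY at the same level is to treat the pair as "use both" and clear the two selector bits. The sketch below illustrates that reading of commit d3ae8c44b8; the macro names, the exact bit positions and the function `rss_hf_refine` are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical input set selectors taken from the high end of 64 bits. */
#define RSS_L3_SRC_ONLY (1ULL << 63)
#define RSS_L3_DST_ONLY (1ULL << 62)
#define RSS_L4_SRC_ONLY (1ULL << 61)
#define RSS_L4_DST_ONLY (1ULL << 60)

/*
 * If both SRC_ONLY and DST_ONLY are set for the same level, the request
 * is equivalent to hashing on both fields, so drop both selector bits.
 * All other bits are passed through unchanged.
 */
static uint64_t
rss_hf_refine(uint64_t rss_hf)
{
	const uint64_t l3 = RSS_L3_SRC_ONLY | RSS_L3_DST_ONLY;
	const uint64_t l4 = RSS_L4_SRC_ONLY | RSS_L4_DST_ONLY;

	if ((rss_hf & l3) == l3)
		rss_hf &= ~l3;
	if ((rss_hf & l4) == l4)
		rss_hf &= ~l4;
	return rss_hf;
}
```

Selectors from different levels, such as L3 source only together with L4 destination only, do not conflict and are left untouched.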
Simei Su
fce6b66893 ethdev: decouple flow types and RSS offload types
This patch decouples RTE_ETH_FLOW_* and ETH_RSS_*. The former defines
flow types and the latter defines RSS offload types.

Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-10-23 16:43:09 +02:00
Flavio Leitner
c3ff0ac70a vhost: improve performance by supporting large buffer
The rte_vhost_dequeue_burst function supports two ways of dequeuing data.
If the data fits into a buffer, then all data is copied and a
single linear buffer is returned. Otherwise it allocates
additional mbufs and chains them together to return a
multi-segment mbuf.

While that covers most use cases, it forces applications that
need to work with larger data sizes to support multi-segment
mbufs. The non-linear characteristic brings complexity and
performance implications to the application.

To resolve the issue, add support for attaching an external buffer
to a pktmbuf, and let the host indicate during registration whether
attaching an external buffer to a pktmbuf is supported and whether
only linear buffers are supported.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-23 16:43:09 +02:00
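
The buffer selection described in commit c3ff0ac70a can be summarized as a small decision function. This is a sketch under stated assumptions: `choose_rx_buf`, the enum and the two boolean options are hypothetical names standing in for the settings the host provides during registration, and the real code may differ, for example in how it handles allocation failures.

```c
#include <stdbool.h>
#include <stdint.h>

enum rx_buf_kind {
	RX_BUF_LINEAR,   /* data fits in a single mbuf */
	RX_BUF_EXTERNAL, /* large external buffer attached to one mbuf */
	RX_BUF_CHAINED,  /* additional mbufs chained together */
	RX_BUF_DROP      /* no acceptable way to receive the data */
};

/*
 * Pick how to receive pkt_len bytes given the room in one mbuf and the
 * two options chosen by the host at registration time.
 */
static enum rx_buf_kind
choose_rx_buf(uint32_t pkt_len, uint32_t mbuf_room,
	      bool extbuf_supported, bool linear_only)
{
	if (pkt_len <= mbuf_room)
		return RX_BUF_LINEAR;
	if (extbuf_supported)
		return RX_BUF_EXTERNAL;
	if (linear_only)
		return RX_BUF_DROP;
	return RX_BUF_CHAINED;
}
```

The last-but-one branch is the problematic combination: when only linear buffers are accepted and external buffers are not, any data larger than one mbuf cannot be received at all.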