This patch checks the packet length offset value, and checks if the
extra bytes inside buffer cross page boundary.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Extract a function to replace duplicated codes in one copy and zero copy TX function.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As HW vlan strip will reduce the packet length by minus length of vlan tag,
so it need restore the packet length by plus it.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Since the commit 33e79bed3edc2bcf59 has fixed the issue in vector PMD,
and then it can receive jumbo frame by scatter-gather mode, so remove this check.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The packet passed to virtio_tx_route has been allocated
mbuf, so there is no need to allocate mbuf for it.
Use vlan offload to transmit vlan tagged packet.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove useless mbuf pool]
In switch_worker and virtio_tx_local, rte_vhost_enqueue_burst is called to
push host packets to guest VM.
Before enqueue packets to guest VM, vhost example uses configure-able retry logic
to wait for enough vring entries.
In switch_worker, rte_vhost_dequeue_burst is called to get packets from guest VM,
then virtio device will be bound to a queue in VMDQ for the first transmitted
packet.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
check_hpa_regions, fill_hpa_memory_regions and hpa memory region
data structure are added back from old virtio-net.c.
Add hpa (host physical address) region generation/destroy logic.
gpa<->hpa memory translation regions are generated at new_device,
when a virtio device is ready for packet processing.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Define vhost_dev data structure.
Change reference to virtio_dev to vhost_dev.
The vhost example use vdev data structure for switching related logic
and container for virtio_dev.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Those functions are integrated into the user space vhost library:
virtio_dev_rx, virtio_dev_merge_rx, virtio_dev_tx, virtio_dev_merge_tx,
copy_from_mbuf_to_ring, gpa_to_vva.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
This patch copies two files main.c/main.h from most recent vhost example
(before transforming into a library) as the base for new vhost example.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
The previous implementation of rte_kni_alloc() was allocating memzones with a
name composed of a fixed string and the interface name. When an application was
allocating and deallocating multiple interfaces with different names, memzones
were quickly exhausted, even though memzones from deallocated interfaces were
never used anymore (unless an interface with the same name was re-allocated).
As a result, the application was unable to allocate more KNI interfaces with
different names.
This patch implements the KNI memzone pool in order to prevent memzone
exhaustion when allocating/deallocating KNI interfaces. It adds a new API call,
rte_kni_init(max_kni_ifaces) that shall be called before any call to
rte_kni_alloc() if KNI is used. The memzones are pre-allocated with interface-
independent names so that they can be reused.
Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Those files will be refactored in subsequent patches to form user space
vhost library.
Makefile and main.h are removed.
main.c is renamed to vhost_rxtx.c and will provide vring enqueue/dequeue API.
virtio-net.h is renamed to rte_virtio_net.h which is the API header file.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
[Thomas: remove from examples Makefile and merge file renaming]
For apps that were using default rte_eth_rxconf and rte_eth_txconf
structures, these have been removed and now they are obtained by
calling rte_eth_dev_info_get, just before setting up RX/TX queues.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Rename start_?x_per_q to ?x_deferred_start
and add comments.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commit a155d430119 ("support link bonding device initialization"),
rte_eal_pci_probe() is called in rte_eal_init().
So it doesn't have to be called by application anymore.
It has been fixed for testpmd in commit 2950a769315,
and this patch remove it from other applications.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The mbuf structure already contains a pointer to the beginning of the
buffer (m->buf_addr). It is not needed to use 8 bytes again to store
another pointer to the beginning of the data.
Using a 16 bits unsigned integer is enough as we know that a mbuf is
never longer than 64KB. We gain 6 bytes in the structure thanks to
this modification.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
* Updated to apply to latest on mainline.
* Disabled vector PMD in config as it relies heavily on the mbuf layout
This will be re-enabled in a subsequent commit once vPMD has been
reworked to take account of mbuf changes.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The vlan_macip structure combined a vlan tag id with l2 and l3 headers
lengths for tracking offloads. However, this structure was only used as
a unit by the e1000 and ixgbe drivers, not generally.
This patch removes the structure from the mbuf header and places the
fields into the mbuf structure directly at the required point, without
any net effect on the structure layout. This allows us to treat the vlan
tags and header length fields as separate for future mbuf changes. The
drivers which were written to use the combined structure still do so,
using a driver-local definition of it.
Reduce perf regression caused by splitting vlan_macip field. This is
done by providing a single uint16_t value to allow writing/clearing
the l2 and l3 lengths together. There is still a small perf hit to the
slow path TX due to the reads from vlan_tci and l2/l3 lengths being
separated. (<5% in my tests with testpmd with no extra params).
Unfortunately, this cannot be eliminated, without restoring the vlan
tags and l2/l3 lengths as a combined 32-bit field. This would prevent
us from ever looking to move those fields about and is an artificial tie
that applies only for performance in igb and ixgbe drivers. Therefore,
this patch keeps the vlan_tci field separate from the lengths as the
best solution going forward.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
In some cases we may want to tag a packet for a particular destination
or output port, so rename the "in_port" field in the mbuf to just "port"
so that it can be re-used for this purpose if an application needs it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The rte_pktmbuf structure was initially included in the rte_mbuf
structure. This was needed when there was 2 types of mbuf (ctrl and
packet). As the control mbuf has been removed, we can merge the
rte_pktmbuf into the rte_mbuf structure.
Advantages of doing this:
- the access to mbuf fields is easier (ex: m->data instead of m->pkt.data)
- make the structure more consistent: for instance, there was no reason
to have the ol_flags field in rte_mbuf
- it will allow a deeper reorganization of the rte_mbuf structure in the
next commits, allowing to gain several bytes in it
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
[Bruce: updated for latest code and new example apps]
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The initial role of rte_ctrlmbuf is to carry generic messages (data
pointer + data length) but it's not used by the DPDK or it applications.
Keeping it implies:
- loosing 1 byte in the rte_mbuf structure
- having some dead code rte_mbuf.[ch]
This patch removes this feature. Thanks to it, it is now possible to
simplify the rte_mbuf structure by merging the rte_pktmbuf structure
in it. This is done in next commit.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
* Updated patch to HEAD.
* Modified patch to retain the old function names for ctrl mbufs as
macros. This helps with app compatibility, and allows the concept
of a control mbuf to be reintroduced via a single-bit flag in
a future change.
* Updated the packet framework ip_pipeline example application to
work following this change.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
It seems that RTE_MBUF_SCATTER_GATHER is not the proper name for the
feature it provides. "Scatter gather" means that data is stored using
several buffers. RTE_MBUF_REFCNT seems to be a better name for that
feature as it provides a reference counter for mbufs.
The macro RTE_MBUF_SCATTER_GATHER is poisoned to ensure this
modification is seen by drivers or applications using it.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Make ACL library to build/work on 'default' architecture:
- make rte_acl_classify_scalar really scalar
(make sure it wouldn't use sse4 instrincts through resolve_priority()).
- Provide two versions of rte_acl_classify code path:
rte_acl_classify_sse() - could be build and used only on systems with sse4.2
and upper, return -ENOTSUP on lower arch.
rte_acl_classify_scalar() - a slower version, but could be build and used
on all systems.
- Addition of a new function rte_acl_classify_alg. This function lets you
specify an enum value to override the acl contexts default algorithm when doing
a classification. This allows an application to specify a classification
algorithm without needing to publicize each method. I know there was concern
over keeping those methods public, but we don't have a static ABI at the moment,
so this seems to me a reasonable thing to do, as it gives us less of an ABI
surface to worry about.
- keep common code shared between these two codepaths.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
No need for that 'x bit' on source files.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch support mergeable RX feature and thus support jumbo frame RX and TX
in user space vhost(as virtio backend).
On RX, it secures enough room from vring to accommodate one complete scattered
packet which is received by PMD from physical port, and then copy data from
mbuf to vring buffer, possibly across a few vring entries and descriptors.
On TX, it gets a jumbo frame, possibly described by a few vring descriptors which
are chained together with the flags of 'NEXT', and then copy them into one scattered
packet and TX it to physical port through PMD.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Latest changes introduced a small degradation for the corner case
when each input packet is destined to the different port.
For the test-case when 1 core manages 4 ports and packet stream looks like:
IPV4_DSTPORT0, IPV4_DSTPORT1, IPV4_DSTPORT3, IPV4_DSTPORT4, IPV4_DSTPORT0, ...
non-optimised code path outperforms optimised one by 2-3%.
These changes supposed to close that gap.
From my testing: now for the case described above optimised code path
produces same numbers as non-optimised one.
For other test-cases numbers remain about the same.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When adding this packet framework sample (commit 77a3346),
it has been forgotten to add it into the global makefile for
"make examples".
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
L3fwd-acl and ip pipeline apps were using old
x86_64-default-linuxapp-gcc as their default target,
instead of x86_64-native-linuxapp-gcc
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
After enable vector pmd, qos_sched only send 32 packets every burst.
That will cause some packets not transmitted and therefore mempool
will be drain after a while.
App qos_sched now will re-send the packets which failed to send out in
previous tx function.
Signed-off-by: Yong Liu <yong.liu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Do not try to build Linux examples in a BSD environment.
Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.
Leave the tests for the deprecated function in place.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Mark the rte_log, cmdline_printf and rte_snprintf functions as
being printf-style functions. This causes compilation errors
due to mis-matched parameter types, so the parameter types are
fixed where appropriate.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Make everything NUMA-related depend on lcore sockets, not device
sockets. This is because the init_mem() function allocates all data
structures based on NUMA nodes of the lcores in the coremask. Therefore,
when no cores are on socket 0, but there are devices on socket 0, it may
lead to segmentation faults.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
- i40e RSS flags have been added (and enlarged to 64-bit)
- A new configuration of 'uint8_t rss_key_len' has been added in
'struct rte_eth_rss_conf' to support different length of RSS keys.
- In each PMD, only the supported flags are masked.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
This patch fixes a core id issue in sample vmdq, in case core mask
doesn't start with lcore_id 0 but 20, for instance,
queue id should use core_id instead of lcore_id.
Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Split input error stats to have a better understanding of why packets
have been dropped.
Keep ierrors field untouched for backward compatibility.
Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This Packet Framework sample application illustrates the capabilities
of the Intel DPDK Packet Framework toolbox.
It creates different functional blocks used by a typical IPv4 framework like:
flow classification, firewall, routing, etc.
CPU cores are connected together through standard interfaces built on SW rings,
which each CPU core running a separate pipeline instance.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
RTE_LOGTYPE_CONFIG, RTE_LOGTYPE_DATA and RTE_LOGTYPE_PORT are renamed
by adding VHOST prefix.
It prevents from conflict with new RTE_LOGTYPE_PORT of packet framework.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
New stuff:
* Support for regular traffic as well as IPv4 and IPv6
* Simplified config
* Routing table printed out on start
* Uses LPM/LPM6 for lookup
* Unmatched traffic is sent to the originating port
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>