On thunderx and octeontx, the ring_perf_autotest and
ring_pmd_perf_autotest tests show better performance
when CONFIG_RTE_RING_USE_C11_MEM_MODEL is disabled.
On the other hand, enabling CONFIG_RTE_RING_USE_C11_MEM_MODEL
shows better performance on thunderx2.
Since thunderx2 is using the default armv8 config,
no particular change is required.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This patch adds support for C11 memory model barriers in librte_ring.
There are 2 barrier implementation options in librte_ring (suggested
by Jerin).
1. use rte_smp_rmb
2. use load_acquire/store_release (refer to [1]).
The reason for providing 2 options is that the performance benchmark
results differ between arm machines, refer to [2].
CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided. By default it is "n"
on all architectures except arm64, where it is "y" so far.
[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
[2] http://dpdk.org/ml/archives/dev/2017-October/080861.html
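Option 2 can be illustrated with C11 atomics. The following is only a
minimal single-producer/single-consumer sketch, not the librte_ring
code; the demo_ring type, its size and the function names are
hypothetical and exist only for this illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* hypothetical size, must be a power of two */

struct demo_ring {
	_Atomic uint32_t head; /* next slot the producer writes */
	_Atomic uint32_t tail; /* next slot the consumer reads */
	void *slots[DEMO_RING_SIZE];
};

/* Single-producer enqueue. Returns 0 on success, -1 if the ring is full. */
static int demo_ring_enqueue(struct demo_ring *r, void *obj)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store of tail, so the
	 * slot is known to be consumed before it is overwritten. */
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == DEMO_RING_SIZE)
		return -1;
	r->slots[head & (DEMO_RING_SIZE - 1)] = obj;
	/* Release makes the slot write visible before the new head. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Single-consumer dequeue. Returns 0 on success, -1 if the ring is empty. */
static int demo_ring_dequeue(struct demo_ring *r, void **obj)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store of head. */
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return -1;
	*obj = r->slots[tail & (DEMO_RING_SIZE - 1)];
	/* Release hands the slot back to the producer. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 0;
}
```

With option 1, the acquire loads would instead be plain loads followed
by an explicit read barrier.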
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
No config option changed, added or removed.
This only reshuffles PMD config options, mostly to show new PMDs where
to put their new config options.
Ordered as physical, paravirtual and virtual groups. Alphabetical order
within a group.
Also tried to group vendor devices together which breaks alphabetical
order in some places.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch lays the groundwork for this driver (draft documentation,
copyright notices, code base skeleton and build system hooks). While it can
be successfully compiled and invoked, it's an empty shell at this stage.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
This patch adds support for handling run-time driver arguments.
We have removed the config option for per VF Tx switching and added
a run-time argument vf_txswitch. By default, VF Tx switching is
enabled; however, it can be disabled using the run-time argument.
Sample usage to disable per port VF Tx switching:
-w 05:00.0,vf_txswitch=0 -w 05:00.1,vf_txswitch=0
Fixes: 1282943aa0 ("net/qede: fix default config option")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Two macros were defined in cryptodev, to serve the same
purpose: RTE_CRYPTODEV_NAME_LEN (in the config file) and
RTE_CRYPTODEV_NAME_MAX_LEN (in the rte_cryptodev.h file).
Since the second one is part of the external API,
the first one has been removed, avoiding duplications.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
This patch provides an option to do rte_memcpy() using the 'restrict'
qualifier, which can induce GCC to do optimizations by using more
efficient instructions, providing some performance gain over memcpy()
on some ARM64 platforms/environments.
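As a rough illustration of the qualifier only (this is not the
rte_memcpy() implementation, and demo_copy is a hypothetical name), a
copy routine can promise the compiler that its buffers never overlap:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: byte copy whose pointers are declared
 * non-overlapping, which gives the compiler more freedom to reorder
 * and widen the loads and stores. Behavior is undefined if the two
 * regions overlap. */
static void *demo_copy(void *restrict dst, const void *restrict src, size_t n)
{
	uint8_t *restrict d = dst;
	const uint8_t *restrict s = src;

	for (size_t i = 0; i < n; i++)
		d[i] = s[i];
	return dst;
}
```

Whether this is faster than memcpy() depends on the platform, glibc and
compiler in use, so it has to be measured.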
The memory copy performance differs between different ARM64
platforms. And a more recent glibc (e.g. 2.23 or later)
can provide a better memcpy() performance compared to old glibc
versions. It's always suggested to use a more recent glibc if
possible, from which the entire system can get benefit. If for some
reason an old glibc has to be used, this patch is provided for an
alternative.
This implementation can improve memory copy on some ARM64
platforms, when an old glibc (e.g. 2.19, 2.17...) is being used.
It is disabled by default and needs "RTE_ARCH_ARM64_MEMCPY"
defined to activate. It does not always provide better performance
than memcpy(), so users need to run the DPDK unit test
"memcpy_perf_autotest" and customize parameters in the "customization
section" in rte_memcpy_64.h for best performance.
Compiler version will also impact the rte_memcpy() performance.
It has been observed that on some platforms, with the same code, a
binary compiled with GCC 7.2.0 can provide better performance than one
compiled with GCC 4.8.5. It's suggested to use GCC 5.4.0 or later.
Signed-off-by: Herbert Guan <herbert.guan@arm.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
OPDL ring is the core infrastructure of the OPDL PMD. The OPDL ring library
provides the core data structure and core helper function set. The ring
implements a single-ring, multi-port/stage pipelined packet distribution
mechanism. This mechanism has the following characteristics:
• No multiple queue cost; therefore, latency is significantly reduced.
• Fixed dependencies between queues/ports are more suitable for complex,
fixed pipelines of stateless packet processing (static pipeline).
• Has decentralized distribution (no scheduling core).
• Packets remain in order (no reorder core(s)).
* Update build system to enable compilation.
Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Peter Mccarthy <peter.mccarthy@intel.com>
Reviewed-by: Seán Harte <seanbh@gmail.com>
- full test suite for bbdev
- test App works seamlessly on all PMDs registered with bbdev
framework
- a python script is provided to make our life easier
- supports execution of tests by parsing Test Vector files
- test Vectors can be added/deleted/modified with no need for
re-compilation
- various tests can be executed:
(a) Throughput test
(b) Offload latency test
(c) Operation latency test
(d) Validation test
(e) Sanity checks
Signed-off-by: Amr Mokhtar <amr.mokhtar@intel.com>
- bbdev 'turbo_sw' is the software accelerated version of 3GPP L1
Turbo coding operation using the optimized Intel FlexRAN SDK libraries.
- 'turbo_sw' pmd is disabled by default
Signed-off-by: Amr Mokhtar <amr.mokhtar@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
- 'bbdev_null' is a basic pmd that performs a minimalistic
bbdev operation
- useful for bbdev smoke testing and in measuring the overhead
introduced by the bbdev library
- 'bbdev_null' pmd is enabled by default
Signed-off-by: Amr Mokhtar <amr.mokhtar@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
- wireless baseband device (bbdev) library files
- bbdev is tagged as EXPERIMENTAL
- Makefiles and configuration macros definition
- bbdev library is enabled by default
- release notes of the initial version
Signed-off-by: Amr Mokhtar <amr.mokhtar@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Unlike every other DPDK application's compilation, proc_info's
compilation cannot be turned off on Linux. Fix it by adding a
config option to base linuxapp config.
Fixes: 22561383ea ("app: replace dump_cfg by proc_info")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Remove RTE_LOG_LEVEL config option, use existing RTE_LOG_DP_LEVEL config
option for controlling datapath log level.
RTE_LOG_LEVEL is no longer needed as dynamic logging can be used to
control global and module specific log levels.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Make max vfio groups compile-time configurable so that platforms can
choose vfio group limit.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Without this patch, the number of queues per i40e VF is set to 4
by CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 in config/common_base.
It is a fixed value determined at compile time and can't be changed
at run time.
With this patch, the number of queues per i40e VF can be determined
at run time. For example, if the PCI address of an i40e PF is
aaaa:bb.cc, with the EAL parameter -w aaaa:bb.cc,queue-num-per-vf=8,
the number of queues per VF created from this PF is set to 8.
If there is no "queue-num-per-vf" setting in EAL parameters, it uses
the default value of 4. And if the value after the "queue-num-per-vf"
is invalid, it will also use the default value. The valid values can
be 1, 2, 4, 8, or 16.
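The fallback rule described above can be sketched as follows. This is
an illustration only, not the i40e driver code; the function and macro
names are hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_DEFAULT_QUEUES_PER_VF 4

/* Hypothetical parser for the value part of "queue-num-per-vf=<n>".
 * Returns n if it is one of 1, 2, 4, 8 or 16, otherwise the default
 * of 4. A NULL value models the argument being absent. */
static int demo_parse_queue_num_per_vf(const char *value)
{
	char *end = NULL;
	long n;

	if (value == NULL || *value == '\0')
		return DEMO_DEFAULT_QUEUES_PER_VF;
	errno = 0;
	n = strtol(value, &end, 10);
	if (errno != 0 || *end != '\0')
		return DEMO_DEFAULT_QUEUES_PER_VF;
	/* Valid values are the powers of two from 1 to 16. */
	if (n < 1 || n > 16 || (n & (n - 1)) != 0)
		return DEMO_DEFAULT_QUEUES_PER_VF;
	return (int)n;
}
```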
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This driver is mostly like others with slightly different logging
macros. The semantics were retained, with some minor reformatting.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This workaround was needed to properly handle device removal with old
Mellanox OFED releases that are not supported by this PMD anymore.
Starting from rdma-core v16 this removal issue shouldn't happen when
setting MLX4_DEVICE_FATAL_CLEANUP environment variable to 1.
Set the aforementioned variable to 1.
Reverts: 5f4677c6ad ("net/mlx4: workaround verbs error after plug-out")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Provide a knob to control per-VF Tx switching feature by adding a config
option, CONFIG_RTE_LIBRTE_QEDE_VF_TX_SWITCH. By default, it will be kept
in disabled state for better performance with small sized frames.
Fixes: 2ea6f76aff ("qede: add core driver")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Move the vdev bus from lib/librte_eal to drivers/bus.
As the crypto vdev helper functions refer to a data structure
in rte_vdev.h, we move those helper functions into drivers/bus
too.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
The PCI lib defines the types and methods allowing to use PCI elements.
The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.
Move the relevant code out of the EAL to its expected place.
Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
RTE_MRVL_MUSDK_DMA_MEMSIZE can be removed from DPDK configuration
as it's no longer used as a synchronization point for net and crypto
mrvl pmds.
Fixes: 0ddc9b815b ("net/mrvl: add net PMD skeleton")
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
The following APIs are implemented in the
librte_flow_classify library:
rte_flow_classifier_create
rte_flow_classifier_free
rte_flow_classifier_query
rte_flow_classify_table_create
rte_flow_classify_table_entry_add
rte_flow_classify_table_entry_delete
The following librte_table APIs are used:
f_create to create a table.
f_add to add a rule to the table.
f_del to delete a rule from the table.
f_free to free a table
f_lookup to match packets with the rules.
The library supports counting of IPv4 five tuple packets only,
i.e. IPv4 UDP, TCP and SCTP packets.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Add the mrvl net pmd driver skeleton, providing a base for further
development. Besides the basic functionality, QoS configuration is
introduced as well.
Signed-off-by: Jacek Siuda <jck@semihalf.com>
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
Generic Segmentation Offload (GSO) is a SW technique to split large
packets into small ones. Akin to TSO, GSO enables applications to
operate on large packets, thus reducing per-packet processing overhead.
To enable more flexibility to applications, DPDK GSO is implemented
as a standalone library. Applications explicitly use the GSO library
to segment packets. To segment a packet requires two steps. The first
is to set proper flags to mbuf->ol_flags, where the flags are the same
as that of TSO. The second is to call the segmentation API,
rte_gso_segment(). This patch introduces the GSO API framework to DPDK.
rte_gso_segment() splits an input packet into small ones in each
invocation. The GSO library refers to these small packets generated
by rte_gso_segment() as GSO segments. Each of the newly-created GSO
segments is organized as a two-segment MBUF, where the first segment is a
standard MBUF, which stores a copy of packet header, and the second is an
indirect MBUF which points to a section of data in the input packet.
rte_gso_segment() reduces the refcnt of the input packet by 1. Therefore,
when all GSO segments are freed, the input packet is freed automatically.
Additionally, since each GSO segment has multiple MBUFs (i.e. 2 MBUFs),
the driver of the interface which the GSO segments are sent to should
support transmitting multi-segment packets.
The GSO framework clears the PKT_TX_TCP_SEG flag for both the input
packet, and all produced GSO segments in the event of success, since
segmentation in hardware is no longer required at that point.
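To make the segment layout concrete, the sketch below computes which
section of the input payload each GSO segment would reference. It does
not use the GSO library; demo_gso_layout and demo_gso_seg are
hypothetical, and it assumes a fixed number of payload bytes per
segment with headers left out of the arithmetic.

```c
#include <stddef.h>

struct demo_gso_seg {
	size_t payload_off; /* offset into the input packet payload */
	size_t payload_len; /* bytes the indirect MBUF would point to */
};

/* Split payload_len bytes into chunks of at most seg_payload bytes.
 * Fills segs[] (capacity max_segs) and returns the number of GSO
 * segments, or 0 if the arguments are invalid or segs[] is too small. */
static size_t demo_gso_layout(size_t payload_len, size_t seg_payload,
			      struct demo_gso_seg *segs, size_t max_segs)
{
	size_t n, i;

	if (payload_len == 0 || seg_payload == 0)
		return 0;
	n = (payload_len + seg_payload - 1) / seg_payload;
	if (n > max_segs)
		return 0;
	for (i = 0; i < n; i++) {
		segs[i].payload_off = i * seg_payload;
		/* Only the last segment may be shorter than seg_payload. */
		segs[i].payload_len = (i == n - 1) ?
			payload_len - i * seg_payload : seg_payload;
	}
	return n;
}
```

For a 3000 byte payload and 1400 bytes per segment this yields three
GSO segments, which corresponds to six MBUFs given the two-MBUF
organization described above.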
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Currently, enabling assertions requires setting CONFIG_RTE_LOG_LEVEL to
RTE_LOG_DEBUG. CONFIG_RTE_LOG_LEVEL is the default log level of the control
path, RTE_LOG_DP_LEVEL is the log level of the data path. It is hard to
understand that assertions are controlled by the control path
LOG_LEVEL, especially assertions used on the data path.
On the other hand, DPDK needs a switch to enable assertions without
impacting the log output level, assuming "--log-level" is not specified.
Assertion is an important API to balance DPDK high performance and
robustness. To promote assertion usage, it's valuable to decouple
assertions from CONFIG_RTE_LOG_LEVEL.
In one word, log is log, assertion is assertion, debug is hot pot :)
The rationale of this patch is to introduce a dedicated switch for
assertions: RTE_ENABLE_ASSERT
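The idea of a dedicated switch can be sketched as below. The
DEMO_ENABLE_ASSERT and DEMO_ASSERT names are hypothetical stand-ins
used only for illustration; this is not the DPDK macro definition.

```c
#include <stdio.h>
#include <stdlib.h>

/* DEMO_ENABLE_ASSERT is a hypothetical build switch. Define it at
 * compile time to turn the checks on, independently of any log level. */
#ifdef DEMO_ENABLE_ASSERT
#define DEMO_ASSERT(cond) do { \
	if (!(cond)) { \
		fprintf(stderr, "assert \"%s\" failed at %s:%d\n", \
			#cond, __FILE__, __LINE__); \
		abort(); \
	} \
} while (0)
#else
#define DEMO_ASSERT(cond) do { } while (0)
#endif

/* Data-path style helper that checks its argument only when the
 * switch is enabled, so it costs nothing in a normal build. */
static int demo_ring_index(int pos, int size)
{
	DEMO_ASSERT(size > 0);
	return size > 0 ? pos % size : 0;
}
```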
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
We remove xen-specific code in EAL, including the option --xen-dom0,
memory initialization code, compiling dependency, etc.
Related documents are removed or updated, and the eal library
version is bumped.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Membership library is an extension and generalization of a traditional
filter (for example Bloom Filter and cuckoo filter) structure.
In general, the Membership library is a data structure that provides a
"set-summary" and responds to set-membership queries of whether a
certain element belongs to a set(s). A membership test for an element
will return the set this element belongs to or not-found if the
element is never inserted into the set-summary.
The results of the membership test are not 100% accurate. A certain
false positive or false negative probability could exist. However,
compared to a "full-blown" complete list of elements, a "set-summary"
is memory efficient and fast on lookup.
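As background, the simplest such set-summary is a plain Bloom filter,
sketched below. It is not the Membership library API: all names, the
filter size and the FNV-1a style hash are assumptions made for the
illustration, and a plain Bloom filter can only produce false
positives, never false negatives.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BITS 1024u /* hypothetical filter size in bits */

struct demo_bloom {
	uint8_t bits[DEMO_BITS / 8];
};

/* FNV-1a style hash; the seed is mixed in to derive several hashes. */
static uint32_t demo_hash(const void *key, size_t len, uint32_t seed)
{
	const uint8_t *p = key;
	uint32_t h = 2166136261u ^ seed;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h % DEMO_BITS;
}

/* Insert a key by setting one bit per hash function. */
static void demo_bloom_add(struct demo_bloom *b, const void *key, size_t len)
{
	for (uint32_t seed = 0; seed < 3; seed++) {
		uint32_t bit = demo_hash(key, len, seed);

		b->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
	}
}

/* false: the key was never added. true: the key was probably added. */
static bool demo_bloom_test(const struct demo_bloom *b, const void *key,
			    size_t len)
{
	for (uint32_t seed = 0; seed < 3; seed++) {
		uint32_t bit = demo_hash(key, len, seed);

		if (!(b->bits[bit / 8] & (1u << (bit % 8))))
			return false;
	}
	return true;
}
```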
This patch adds the main API definition.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The kernel patch to support pci resource mapping was merged:
https://patchwork.kernel.org/patch/9677441/
So enable igb_uio in the default arm64 configuration.
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This patch also adds configuration necessary for compilation of DPAA
Mempool driver into the DPAA specific config file.
CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_OPS=dpaa is also configured to allow
applications to use DPAA mempool as default.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
This option both sets the maximum number of segments for Rx/Tx packets and
whether scattered mode is supported at all. This commit removes the latter
as well as configuration file exposure since the most appropriate value
should be decided at run-time.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The patch simplifies analysis of DPDK applications for developers who use
Intel® VTune Amplifier.
Empty cycles are iterations that yielded no RX packets. Since DPDK runs
in poll mode, wasted cycles are equal to wasted CPU time.
Tracing such iterations can show that a device is underutilized. Tracing
empty cycles becomes even more critical if a system uses a lot of Ethernet
ports.
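The notion of an empty cycle can be sketched with a simple counter.
This is only an illustration of the definition, not the profiling code
added by the patch; the structure and function names are hypothetical.

```c
#include <stdint.h>

struct demo_poll_stats {
	uint64_t iterations; /* total polls of the RX queue */
	uint64_t empty;      /* polls that returned no packets */
};

/* Record the result of one RX poll, nb_rx being the packets received. */
static void demo_poll_record(struct demo_poll_stats *s, uint16_t nb_rx)
{
	s->iterations++;
	if (nb_rx == 0)
		s->empty++;
}

/* Percentage of poll iterations that returned no packets. */
static unsigned int demo_empty_percent(const struct demo_poll_stats *s)
{
	if (s->iterations == 0)
		return 0;
	return (unsigned int)(s->empty * 100 / s->iterations);
}
```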
The patch makes it possible to analyze empty cycles without changing
application code. All that needs to be done is to reconfigure and rebuild
DPDK itself with CONFIG_RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS
enabled. The important thing here is that this does not affect DPDK code:
the profiling code is not compiled if the user does not specify the config
flag.
The patch provides common way to inject RX queues profiling and VTune
specific implementation.
Signed-off-by: Ilia Kurakin <ilia.kurakin@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The current mlx4 OFED version has a bug which returns an error from
ibv destroy functions when the device was plugged out, even
though the resources were destroyed correctly.
Hence, the failsafe PMD was aborted, only in debug mode, when
it tried to remove the device in the plug-out process.
The workaround adds an option to replace all claim_zero
assertions with debugging messages. Note that this option
also affects non ibv destroy assertions.
DPDK 18.02 release should work with Mellanox OFED-4.2, which will
include the verbs fix for this bug; then this patch can
be removed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Earlier, the bonding pmd was disabled in the default config for ppc64le.
Remove that setting, as the pmd has now been verified.
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Introduce the fail-safe poll mode driver initialization and enable its
build infrastructure.
This PMD allows for applications to benefit from true hot-plugging
support without having to implement it.
It intercepts and manages Ethernet device removal events issued by
slave PMDs and re-initializes them transparently when brought back.
It also allows defining a contingency to the removal of a device, by
designating a fail-over device that will take on transmitting operations
if the preferred device is removed.
Applications only see a fail-safe instance, without caring for
underlying activity ensuring their continued operations.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
NXP Copyright has been wrongly worded with '(c)' at various places.
This patch removes these extra characters. It also removes
"All rights reserved".
Only NXP copyright syntax is changed. Freescale copyright is not
modified.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overhead. It gains
performance by reassembling small packets into large ones. This
patchset is to support GRO in DPDK. To support GRO, this patch
implements a GRO API framework.
To enable more flexibility to applications, DPDK GRO is implemented as
a user library. Applications explicitly use the GRO library to merge
small packets into large ones. DPDK GRO provides two reassembly modes.
One is called lightweight mode, the other is called heavyweight mode.
If applications want to merge packets in a simple way and the number
of packets is relatively small, they can use the lightweight mode.
If applications need more fine-grained controls, they can choose the
heavyweight mode.
rte_gro_reassemble_burst is the main reassembly API which is used in
lightweight mode and processes N packets at a time. For applications,
performing GRO in lightweight mode is simple. They just need to invoke
rte_gro_reassemble_burst. Applications can get GROed packets as soon as
rte_gro_reassemble_burst returns.
rte_gro_reassemble is the main reassembly API which is used in
heavyweight mode and tries to merge N inputted packets with the packets
in GRO reassembly tables. For applications, performing GRO in heavyweight
mode is relatively complicated. Before performing GRO, applications need
to create a GRO context object, which keeps reassembly tables of
desired GRO types, by rte_gro_ctx_create. Then applications can use
rte_gro_reassemble to merge packets. The GROed packets are in the
reassembly tables of the GRO context object. If applications want to get
them, applications need to manually flush them by flush API.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Replace the incorrect references to the "Cavium Networks" and "Cavium Ltd"
company names with the correct "Cavium, Inc" company name in
copyright headers.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The dpdk-test-eventdev tool is a Data Plane Development Kit (DPDK)
application that allows exercising various eventdev use cases. This
application has a generic framework to add new eventdev based test cases
to verify functionality and measure the performance parameters of DPDK
eventdev devices.
This patch adds the skeleton of the dpdk-test-eventdev application.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Moved all common defines from defconfig_arm64-armv8a-linuxapp-gcc
to common_armv8a_linuxapp.
Created new config arm64-armv8a-linuxapp-clang which adds the
clang support to armv8a.
Now defconfigs arm64-armv8a-linuxapp-gcc/clang contain only the
CONFIG_RTE_TOOLCHAIN* defines and all other common defines are
inherited from common_armv8a_linuxapp.
Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
* Removed setting CONFIG_RTE_SCHED_VECTOR=n from armv8a config
so that the setting from common_base is taken as the default
setting for armv8a
* Verified the changes with sched_autotest unit test case
Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
It is safe to enable LIBRTE_VHOST_NUMA by default for all
configurations where libnuma is already a default dependency.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Currently EAL allocates hugepages one by one, not paying attention
to which NUMA node the allocation was done from.
Such behaviour leads to allocation failure if the number of available
hugepages for the application is limited by cgroups or hugetlbfs and
memory is requested not only from the first socket.
Example:
# 90 x 1GB hugepages available in a system
cgcreate -g hugetlb:/test
# Limit to 32GB of hugepages
cgset -r hugetlb.1GB.limit_in_bytes=34359738368 test
# Request 4GB from each of 2 sockets
cgexec -g hugetlb:test testpmd --socket-mem=4096,4096 ...
EAL: SIGBUS: Cannot mmap more hugepages of size 1024 MB
EAL: 32 not 90 hugepages of size 1024 MB allocated
EAL: Not enough memory available on socket 1!
Requested: 4096MB, available: 0MB
PANIC in rte_eal_init():
Cannot init memory
This happens because all allocated pages are
on socket 0.
Fix this issue by setting mempolicy MPOL_PREFERRED for each hugepage
to one of the requested nodes using the following scheme:
1) Allocate essential hugepages:
1.1) Allocate only as many hugepages from numa N as
needed to fit the requested memory for this numa.
1.2) repeat 1.1 for all numa nodes.
2) Try to map all remaining free hugepages in a round-robin
fashion.
3) Sort pages and choose the most suitable.
In this case all essential memory will be allocated and all remaining
pages will be fairly distributed between all requested nodes.
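Steps 1 and 2 can be sketched as a planning function that decides
which node to prefer for each mapping attempt. This is a simplified
model, not the EAL code: it omits step 3, does not call any mempolicy
API, and all names are hypothetical.

```c
#include <stddef.h>

#define DEMO_MAX_NODES 8

/* Fill node_for_page[0..total-1] with the NUMA node to prefer for each
 * hugepage mapping attempt. essential[n] is the number of pages needed
 * to satisfy the memory requested on node n (0 if nothing requested).
 * Returns the number of entries written. */
static size_t demo_plan_hugepages(const unsigned int essential[DEMO_MAX_NODES],
				  size_t total, int *node_for_page)
{
	int requested[DEMO_MAX_NODES];
	int nreq = 0;
	size_t i = 0;

	for (int n = 0; n < DEMO_MAX_NODES; n++)
		if (essential[n] > 0)
			requested[nreq++] = n;
	if (nreq == 0)
		return 0;

	/* Step 1: essential pages, one requested node after another. */
	for (int r = 0; r < nreq; r++)
		for (unsigned int k = 0;
		     k < essential[requested[r]] && i < total; k++)
			node_for_page[i++] = requested[r];

	/* Step 2: remaining pages round-robin over the requested nodes. */
	for (size_t r = 0; i < total; r++)
		node_for_page[i++] = requested[r % (size_t)nreq];
	return i;
}
```

With 4 essential pages on each of nodes 0 and 1 and 10 attempts in
total, the plan is four pages on node 0, four on node 1, then one more
on each node in turn.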
New config option RTE_EAL_NUMA_AWARE_HUGEPAGES introduced and
enabled by default for linuxapp except armv7 and dpaa2.
Enabling of this option adds libnuma as a dependency for EAL.
Fixes: 77988fc08d ("mem: fix allocating all free hugepages")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Move all bypass functions to ixgbe pmd and remove function
pointers from the eth_dev_ops struct.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
TX coalescing waits for ETH_COALESCE_PKT_NUM packets to be coalesced
across bursts before transmitting them. For slow traffic, such as
100 PPS, this approach increases latency since packets are received
one at a time and tx coalescing has to wait for ETH_COALESCE_PKT
number of packets to arrive before transmitting.
To fix this:
- Update rx path to use status page instead and only receive packets
when either the ingress interrupt timer threshold (5 us) or
the ingress interrupt packet count threshold (32 packets) fires.
(i.e. whichever happens first).
- If number of packets coalesced is <= number of packets sent
by tx burst function, stop coalescing and transmit these packets
immediately.
Also added compile time option to favor throughput over latency by
default.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
DPAA2 devices now support cortex-a72. They no longer support a57.
Also, fp and simd no longer need to be stated explicitly for the
standard a72 core.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Stub callbacks for the generic flow API and a new FLOW debug define.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
When building DPDK with musl, there is a need to disable
backtrace to remove some references to execinfo.h, which is
not supported by musl now.
This also applies to other libc implementations which
don't support backtrace() and backtrace_symbols().
musl is an implementation of the userspace portion
of the standard library functionality described in
the ISO C and POSIX standards, plus common extensions.
More details about musl are available at http://www.musl-libc.org .
Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Making AVX and AVX512 configurable is useful for performance and power
testing.
A similar kernel patch is at https://patchwork.kernel.org/patch/9618883/.
AVX512 support, like in rte_memcpy, has been in DPDK since 16.04, but it's
still unproven in rich use cases in hardware. Therefore it's marked as
experimental for now and will be enabled after enough field testing and
possible optimization.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
armv8 implementations may have a 64B or 128B cache line.
Set the generic config to the maximum available cache line size to
address minimum DMA alignment across all arm64 implementations.
Increasing the cacheline size has no negative impact on cache invalidation
on systems with a smaller cache line.
The need for the minimum DMA alignment has an impact on functional aspects
of the platform, so the default config should cater to the functional aspects.
There is an impact on memory usage with this scheme, but that's not too
important for the single image arm64 distribution use case.
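The memory cost comes from rounding object sizes up to a whole number
of cache lines. A minimal sketch of that rounding, with a hypothetical
helper name:

```c
#include <stddef.h>

/* Round size up to a multiple of line, where line is a power of two. */
static size_t demo_cache_align(size_t size, size_t line)
{
	return (size + line - 1) & ~(line - 1);
}
```

For example, a 130 byte object occupies 192 bytes with 64B lines and
256 bytes with 128B lines, while a 72 byte object occupies 128 bytes
in both cases.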
The arm64 linux kernel follows a similar approach for the single
arm64 image use case.
http://lxr.free-electrons.com/source/arch/arm64/include/asm/cache.h
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
DPAA2 Hardware Mempool handlers allow enqueue/dequeue from NXP's
QBMAN hardware block.
CONFIG_RTE_MBUF_DEFAULT_MEMPOOL_OPS is set to 'dpaa2', if the pool
is enabled.
This memory pool currently supports packet mbuf type blocks only.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicate that little
benefit is gained with the current code by turning off the flags. This makes
the build-time option pointless, so we can remove it.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Having packets received without any offload flags given in the mbuf is not
very useful, and performance tests with testpmd indicate that little to no
benefit is gained with the current code by turning off the flags. This makes
the build-time option pointless, so we can remove it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
The AVP devices are only supported on Intel 64-bit architectures, so
adjust the defconfig attributes accordingly.
Fixes: 908072e9d0 ("net/avp: support driver registration")
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Enable Arkville on supported configurations
Add overview documentation
Minimum driver support for valid compile
Arkville PMD is not supported on ARM or PowerPC at this time
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Signed-off-by: John Miller <john.miller@atomicrules.com>
The crypto scheduler PMD has no external dependencies, so enable it by
default.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Add a library designed to calculate latency statistics and report them
to the application when queried. The library measures minimum, average and
maximum latencies, and jitter in nanoseconds. The current implementation
supports global latency stats, i.e. per application stats.
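A running computation of these four values can be sketched as below.
This is not the library code; the names are hypothetical, and jitter is
assumed here to mean the absolute difference between the two most
recent latency samples, which may differ from the library's definition.

```c
#include <stdint.h>

struct demo_lat_stats {
	uint64_t count;
	uint64_t sum_ns;
	uint64_t min_ns;
	uint64_t max_ns;
	uint64_t prev_ns;   /* previous sample, used for jitter */
	uint64_t jitter_ns; /* assumption: |current - previous| */
};

/* Fold one latency sample, in nanoseconds, into the running stats. */
static void demo_lat_add(struct demo_lat_stats *s, uint64_t lat_ns)
{
	if (s->count == 0 || lat_ns < s->min_ns)
		s->min_ns = lat_ns;
	if (lat_ns > s->max_ns)
		s->max_ns = lat_ns;
	if (s->count > 0)
		s->jitter_ns = lat_ns > s->prev_ns ?
			lat_ns - s->prev_ns : s->prev_ns - lat_ns;
	s->prev_ns = lat_ns;
	s->sum_ns += lat_ns;
	s->count++;
}

/* Average latency in nanoseconds, or 0 if no sample was recorded. */
static uint64_t demo_lat_avg(const struct demo_lat_stats *s)
{
	return s->count ? s->sum_ns / s->count : 0;
}
```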
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Remy Horton <remy.horton@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds a library that calculates peak and average data-rate
statistics for ethernet devices. These statistics are reported using
the metrics library.
Signed-off-by: Remy Horton <remy.horton@intel.com>
This patch adds a new information metrics library. This Metrics
library implements a mechanism by which producers can publish
numeric information for later querying by consumers. Metrics
themselves are statistics that are not generated by PMDs, and
hence are not reported via ethdev extended statistics.
Metric information is populated using a push model, where
producers update the values contained within the metric
library by calling an update function on the relevant metrics.
Consumers receive metric information by querying the central
metric data, which is held in shared memory.
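The push model can be sketched as follows. This is not the metrics
library API: the function names are hypothetical, and the sketch keeps
the values in a process-local array rather than in shared memory.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_METRICS 16

struct demo_metric {
	const char *name;
	uint64_t value;
};

static struct demo_metric demo_metrics[DEMO_MAX_METRICS];
static int demo_metric_count;

/* Producer side: register a metric name, returns its key or -1. */
static int demo_metrics_register(const char *name)
{
	if (demo_metric_count == DEMO_MAX_METRICS)
		return -1;
	demo_metrics[demo_metric_count].name = name;
	demo_metrics[demo_metric_count].value = 0;
	return demo_metric_count++;
}

/* Producer side: push a new value for a registered key. */
static int demo_metrics_update(int key, uint64_t value)
{
	if (key < 0 || key >= demo_metric_count)
		return -1;
	demo_metrics[key].value = value;
	return 0;
}

/* Consumer side: query the central data by name. */
static int demo_metrics_get(const char *name, uint64_t *value)
{
	for (int i = 0; i < demo_metric_count; i++) {
		if (strcmp(demo_metrics[i].name, name) == 0) {
			*value = demo_metrics[i].value;
			return 0;
		}
	}
	return -1;
}
```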
Signed-off-by: Remy Horton <remy.horton@intel.com>
This adds the minimal changes to allow a SW eventdev implementation to
be compiled, linked and created at run time. The eventdev does nothing,
but can be created via vdev on commandline, e.g.
sudo ./x86_64-native-linuxapp-gcc/app/test --vdev=event_sw0
...
PMD: Creating eventdev sw device event_sw0, numa_node=0, sched_quanta=128
RTE>>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The skeleton driver facilitates bootstrapping a new
eventdev driver and creates a platform to verify
the northbound eventdev common code.
The driver supports both VDEV and PCI based eventdev
devices.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch implements the northbound eventdev API interface using
the southbound driver interface.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Adds the initial framework for registering the driver against the supported
PCI device identifiers.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds a header file with log macros for the AVP PMD
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
This commit introduces the AVP PMD file structure without adding any actual
driver functionality. Functional blocks will be added in later patches.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Add debug options to config file. Define macros used for log and make
use of config file options to enable them.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Enable the ThunderX nicvf PMD driver in the common
config, as it has no build dependency
on any external library and/or architecture.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This patch enables i40e driver in PowerPC along with its altivec
intrinsic support.
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Add KNI PMD which wraps librte_kni for ease of use.
KNI PMD can be used as any regular PMD to send / receive packets to the
Linux networking stack.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Yong Wang <yongwang@vmware.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Moved from lib/librte_mempool; the stack mempool handler is now an
independent driver.
Shared builds now need to link in librte_mempool_stack for the
"stack" mempool handler.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Moved from lib/librte_mempool; the ring mempool is now an independent
driver.
Shared builds now need to link in librte_mempool_ring for:
* ring_mp_mc
* ring_sp_sc
* ring_sp_mc
* ring_mp_sc
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
There was a compile time setting to enable a ring to yield when
it entered a loop in mp or mc rings waiting for the tail pointer update.
Build time settings are not recommended for enabling/disabling features,
and since this was off by default, remove it completely. If needed, a
runtime enabled equivalent can be used.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove this as a
compile time option, and feature, simplifying the code.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Users compiling DPDK should not need to know or care about the arrangement
of cachelines in the rte_ring structure. Therefore just remove the build
option and set the structures to be always split. On platforms with 64B
cachelines, for improved performance use 128B rather than 64B alignment
since it stops the producer and consumer data being on adjacent cachelines.
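As a rough illustration of the layout described above, the following
sketch aligns the producer and consumer data to 128B on a platform
assumed to have 64B cachelines. The struct and macro names are
hypothetical and the fields are simplified; this is not the actual
rte_ring definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_LINE 64				/* assumed cacheline size */
#define SKETCH_PROD_CONS_ALIGN (2 * SKETCH_CACHE_LINE)	/* 128B on 64B platforms */

struct sketch_headtail {
	volatile uint32_t head;
	volatile uint32_t tail;
};

struct sketch_ring {
	uint32_t size;
	uint32_t mask;
	/* Each block starts on its own 128B boundary, so producer and
	 * consumer data never sit on adjacent 64B cachelines. */
	_Alignas(SKETCH_PROD_CONS_ALIGN) struct sketch_headtail prod;
	_Alignas(SKETCH_PROD_CONS_ALIGN) struct sketch_headtail cons;
};

_Static_assert(offsetof(struct sketch_ring, prod) % SKETCH_PROD_CONS_ALIGN == 0,
	       "producer data must be 128B aligned");
_Static_assert(offsetof(struct sketch_ring, cons) % SKETCH_PROD_CONS_ALIGN == 0,
	       "consumer data must be 128B aligned");

/* Distance in bytes between the producer and consumer blocks. */
static size_t sketch_prod_cons_gap(void)
{
	return offsetof(struct sketch_ring, cons) -
	       offsetof(struct sketch_ring, prod);
}
```

With 64B alignment the two blocks would still be on separate cachelines,
but on neighbouring ones; the 128B alignment leaves one full cacheline
between them.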
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Downstreams might want to provide different DPDK releases at the same
time to support multiple consumers of DPDK linked against older and newer
sonames.
Also, due to the interdependencies that DPDK libraries can have,
applications might end up with an executable space in which multiple
versions of a library are mapped by ld.so.
Think of LibA that got an ABI bump and LibB that did not get an ABI bump
but is depending on LibA.
Application
\-> LibA.old
\-> LibB.new -> LibA.new
That is a conflict which can be avoided by setting CONFIG_RTE_MAJOR_ABI.
If set, CONFIG_RTE_MAJOR_ABI overrides any LIBABIVER value.
An example might be ``CONFIG_RTE_MAJOR_ABI=16.11`` which will make all
libraries librte<?>.so.16.11 instead of librte<?>.so.<LIBABIVER>.
We need to cut arbitrarily long strings after the .so now, and this
would work for any ABI version in LIBABIVER:
$(Q)ln -s -f $< $(patsubst %.$(LIBABIVER),%,$@)
But using the following instead additionally allows simplifying the
Makefile for the CONFIG_RTE_NEXT_ABI case:
$(Q)ln -s -f $< $(shell echo $@ | sed 's/\.so.*/.so/')
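To make the effect of the sed expression concrete, here is a small C
sketch that applies the same truncation to a library file name: keep
everything up to and including the first ".so" and drop the rest. The
function name is hypothetical and exists only for this illustration; the
build system itself uses the sed call shown above.

```c
#include <assert.h>
#include <string.h>

/*
 * Truncate name in place right after the first ".so", mirroring
 * sed 's/\.so.*' with replacement ".so".
 * Returns 1 if ".so" was found, 0 if the name was left untouched.
 */
static int sketch_strip_abi_suffix(char *name)
{
	char *so;

	assert(name != NULL);
	so = strstr(name, ".so");
	if (so == NULL)
		return 0;
	so[3] = '\0';	/* keep the ".so" itself, cut whatever follows */
	return 1;
}
```

Because the cut does not depend on the length of the version string, it
behaves the same for a LIBABIVER suffix and for a CONFIG_RTE_MAJOR_ABI
suffix such as 16.11.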
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Jan Blunck <jblunck@infradead.org>
Tested-by: Jan Blunck <jblunck@infradead.org>
Re-enable CONFIG_RTE_LIBRTE_SCHED, since it is needed to build
correctly.
Fix a few warnings when compiling mpipe_tilegx.c.
Remove an empty rte_cpu_feature_table[] array using a bogus type.
Properly set RTE_OBJCOPY_{TARGET,ARCH} in mk/arch/tile/rte.vars.mk.
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Remove RTE_LIBRTE_SFC_EFX_TSO config option since it is not
required any more:
- unreasonable limit on number of Tx queues when TSO is not
actually required should be solved using per-device parameter
- performance difference with and without TSO compiled in is small
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patchset introduces a new application which allows measuring
performance parameters of PMDs available in the crypto tree. The goal of
this application is to replace existing performance tests in app/test.
Parameters available are: throughput (--ptest throughput) and latency
(--ptest latency). Users can run tests on multiple cores, but only
one type of crypto PMD can be measured during a single application
execution. Cipher parameters, type of device, type of operation and
chain mode have to be specified in the command line as application
parameters. These parameters are checked using device capabilities
structure.
A couple of new library functions in librte_cryptodev are introduced for
application use.
To build the application a CONFIG_RTE_APP_CRYPTO_PERF flag has to be set
(it is set by default).
Example of usage: -c 0xc0 --vdev crypto_aesni_mb_pmd -w 0000:00:00.0 --
--ptest throughput --devtype crypto_aesni_mb --optype cipher-then-auth
--cipher-algo aes-cbc --cipher-op encrypt --cipher-key-sz 16 --auth-algo
sha1-hmac --auth-op generate --auth-key-sz 64 --auth-digest-sz 12
--total-ops 10000000 --burst-sz 32 --buffer-sz 64
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Signed-off-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
Adds a Makefile for the scheduler cryptodev PMD, and updates existing
Makefiles. Unlike other cryptodev PMDs, the scheduler PMD
is required to be built as a shared library.
Adds scheduler PMD enable and debug flags to config/common_base.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
KNI ethtool support (KNI control path) is not commonly used,
and it tends to break the build with new versions of the Linux kernel.
The KNI ethtool feature is now disabled by default. The KNI datapath is
not affected by this update.
It is possible to enable the feature explicitly with the config option:
"CONFIG_RTE_KNI_KMOD_ETHTOOL=y"
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch introduces crypto poll mode driver
using ARMv8 cryptographic extensions.
CPU compatibility with this driver is detected at
run time, and the virtual crypto device will not be
created if the CPU doesn't provide:
AES, SHA1, SHA2 and NEON.
This PMD is optimized to provide performance boost
for chained crypto operations processing,
such as encryption + HMAC generation,
decryption + HMAC validation. In particular,
cipher only or hash only operations are
not provided.
The driver currently supports AES-128-CBC
in combination with: SHA256 HMAC and SHA1 HMAC
and relies on the external armv8_crypto library:
https://github.com/caviumnetworks/armv8_crypto
Build the ARMv8 crypto PMD if compiling for ARM64
and the CONFIG_RTE_LIBRTE_PMD_ARMV8_CRYPTO option
is enabled in the configuration file.
ARMV8_CRYPTO_LIB_PATH environment variable will
point to the appropriate library directory.
Signed-off-by: Zbigniew Bodek <zbigniew.bodek@caviumnetworks.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Elastic Flow Distributor (EFD) is a distributor library that uses
perfect hashing to determine a target/value for a given incoming flow key.
It has the following advantages:
- First, because it uses perfect hashing, it does not store
the key itself and hence lookup performance is not dependent
on the key size.
- Second, the target/value can be any arbitrary value hence
the system designer and/or operator can better optimize service rates
and inter-cluster network traffic locating.
- Third, since the storage requirement is much smaller than a hash-based
flow table (i.e. better fit for CPU cache), EFD can scale to
millions of flow keys.
Finally, with current optimized library implementation performance
is fully scalable with number of CPU cores.
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Christian Maciocco <christian.maciocco@intel.com>
Because using the NFP PMD requires a specific BSP to be installed, PMD
support was not enabled by default before. This was only to make
people aware of that dependency, since the BSP is not needed
just to compile DPDK with NFP PMD support.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Add PCI device ID for ConnectX-5 and enable multi-packet send for PF and VF
along with changing documentation and release note.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Mark Spender <mspender@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
The PMD allows DPDK and the host to communicate using a raw
device interface on the host and in the DPDK application. The device
created is a Tap device with an L2 packet header.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Aws Ismail <aismail@ciena.com>
Tested-by: Vasily Philipov <vasilyf@mellanox.com>
Enable the PMD by default on supported configurations.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Added API for `rte_eth_tx_prepare`
uint16_t rte_eth_tx_prepare(uint8_t port_id, uint16_t queue_id,
struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
Added fields to the `struct rte_eth_desc_lim`:
uint16_t nb_seg_max;
/**< Max number of segments per whole packet. */
uint16_t nb_mtu_seg_max;
/**< Max number of segments per one MTU */
These fields can be used to create valid packets according to the
following rules:
* For non-TSO packet, a single transmit packet may span up to
"nb_mtu_seg_max" buffers.
* For TSO packet the total number of data descriptors is "nb_seg_max",
and each segment within the TSO may span up to "nb_mtu_seg_max".
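A minimal sketch of how an application might apply these two limits
before handing a packet to the driver. The struct below only mirrors the
two new fields, and the function name is hypothetical. It checks the
total segment count only; verifying that each MTU-sized segment within a
TSO packet stays within "nb_mtu_seg_max" would require walking the
segment chain and is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors only the two new fields of struct rte_eth_desc_lim. */
struct sketch_desc_lim {
	uint16_t nb_seg_max;		/* max segments per whole packet */
	uint16_t nb_mtu_seg_max;	/* max segments per one MTU */
};

/*
 * Return true if the total segment count of a packet respects the limits:
 * non-TSO packets are bounded by nb_mtu_seg_max, TSO packets by nb_seg_max.
 */
static bool sketch_seg_count_ok(const struct sketch_desc_lim *lim,
				uint16_t nb_segs, bool is_tso)
{
	uint16_t limit;

	assert(lim != NULL);
	limit = is_tso ? lim->nb_seg_max : lim->nb_mtu_seg_max;
	return nb_segs >= 1 && nb_segs <= limit;
}
```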
Added functions:
int
rte_validate_tx_offload(struct rte_mbuf *m)
to validate general requirements for tx offload set in mbuf of packet
such as flag completeness. In the current implementation this function is
called optionally when RTE_LIBRTE_ETHDEV_DEBUG is enabled.
int rte_net_intel_cksum_prepare(struct rte_mbuf *m)
to prepare pseudo header checksum for TSO and non-TSO tcp/udp packets
before hardware tx checksum offload.
- for non-TSO tcp/udp packets full pseudo-header checksum is
counted and set.
- for TSO the IP payload length is not included.
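The difference between the two cases can be illustrated with a
simplified IPv4 pseudo-header sum. This is a sketch under stated
assumptions, not the library code: the function name is hypothetical,
addresses and lengths are taken as plain host-order integers, and the
byte-order conversion needed before writing the result into a real
packet is omitted. The value returned is the folded, non-complemented
sum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * One's-complement sum over the IPv4 pseudo-header fields
 * (source address, destination address, protocol, L4 length).
 * For TSO the length is left out of the sum.
 */
static uint16_t sketch_ipv4_phdr_cksum(uint32_t src, uint32_t dst,
				       uint8_t proto, uint16_t l4_len,
				       bool is_tso)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xffff;
	sum += dst >> 16;
	sum += dst & 0xffff;
	sum += proto;
	if (!is_tso)
		sum += l4_len;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	assert(sum <= 0xffff);
	return (uint16_t)sum;
}
```

For example, with source 192.168.0.1, destination 192.168.0.2, protocol
6 (TCP) and an L4 length of 20, the sum is 0x816e for the non-TSO case
and 0x815a when the length is excluded for TSO.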
int
rte_net_intel_cksum_flags_prepare(struct rte_mbuf *m, uint64_t ol_flags)
This function uses the same logic as rte_net_intel_cksum_prepare, but
allows application to choose which offloads should be taken into
account, if full preparation is not required.
PERFORMANCE TESTS
-----------------
This feature was tested with modified csum engine from test-pmd.
The packet checksum preparation was moved from application to Tx
preparation step placed before burst.
We may expect some overhead costs caused by:
1) using additional callback before burst,
2) rescanning burst,
3) additional condition checking (packet validation),
4) worse optimization (e.g. packet data access, etc.)
We tested it using ixgbe Tx preparation implementation with some parts
disabled to have comparable information about the impact of different
parts of implementation.
IMPACT:
1) For unimplemented Tx preparation callback the performance impact is
negligible,
2) For packet condition check without checksum modifications (nb_segs,
available offloads, etc.), the result is 14626628/14252168 (~2.62% drop),
3) For full support in the ixgbe driver (point 2 + packet checksum
initialization), the result is 14060924/13588094 (~3.48% drop)
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
There was an option CONFIG_RTE_INSECURE_FUNCTION_WARNING (disabled by
default), which prevented the use of some libc functions:
sprintf, snprintf, vsnprintf, strcpy, strncpy, strcat, strncat, sscanf,
strtok, strsep and strlen.
It's all about using them at the right place with the right precautions.
However, it is neither really possible nor good advice to disable them.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Today, all logs whose level is lower than INFO are dropped at
compile-time. This prevents enabling debug logs at run-time using
--log-level=8.
The rationale was to remove debug logs from the data path at
compile-time, avoiding a test at run-time.
This patch changes the behavior of RTE_LOG() to avoid the compile-time
optimization, and introduces the RTE_LOG_DP() macro that has the same
behavior as the previous RTE_LOG(), for the rare cases where debug
logs are in the data path.
So it is now possible to enable debug logs at run-time by just
specifying --log-level=8. Some drivers still have special compile-time
options to enable more debug logs. Maintainers may consider
removing/reducing them.
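The split between the two macros can be sketched as follows. This is a
simplified stand-in rather than the actual rte_log.h code: all
SKETCH_/sketch_ names are hypothetical, and the level values 7 (INFO)
and 8 (DEBUG) are taken from the --log-level=8 example above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define SKETCH_LOG_INFO  7
#define SKETCH_LOG_DEBUG 8

/* Compile-time ceiling applied only to data path logs. */
#define SKETCH_LOG_DP_LEVEL SKETCH_LOG_INFO

/* Run-time level, as would be set from --log-level. */
static int sketch_runtime_level = SKETCH_LOG_INFO;

/* Print the message to stderr; returns 1 to signal that it was emitted. */
static int sketch_emit(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	return 1;
}

/* General logging: the level is tested at run-time only. */
#define SKETCH_LOG(level, ...) \
	((level) <= sketch_runtime_level ? sketch_emit(__VA_ARGS__) : 0)

/* Data path logging: a constant comparison lets the compiler drop
 * the whole call when the level is above the compile-time ceiling. */
#define SKETCH_LOG_DP(level, ...) \
	((level) <= SKETCH_LOG_DP_LEVEL ? SKETCH_LOG(level, __VA_ARGS__) : 0)
```

With sketch_runtime_level raised to 8, a DEBUG message sent through
SKETCH_LOG is printed, while the same message sent through SKETCH_LOG_DP
is still compiled out.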
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
- Fix to use bitmapped values in NVM configuration for speed capability
advertisement. This issue is specific to 25G NIC since it is capable
of 25G and 10G speeds.
- Update feature list.
Fixes: 64c239b7f8 ("net/qede: fix advertising link speed capability")
Signed-off-by: Harish Patil <harish.patil@qlogic.com>