The mempool library assigns handler ops indexes based on the dynamic load
order of mempool handlers. Indexes are used so a mempool can be used by
multiple processes, but this only works if all processes agree on the
mapping from index to mempool handler.
When using the '-d' argument, it's possible for different processes to load
mempool handlers in different orders, and thus have different
index->handler mappings. Using a mempool in multiple of such processes will
result in undefined behavior.
This commit adds a note to the mempool library programmer's guide warning
users against this.
Fixes: 449c49b93a ("mempool: support handler operations")
Cc: stable@dpdk.org
Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add local APIs to trigger data copies, and retrieve handle values once
those copies are completed. Included are unit tests to validate the data
is copies correctly.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Add stats functions to track what is happening in the driver, and put
unit tests to check those.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Allow initializing a driver instance. Include selftest to validate these
functions.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Add in the "info_get" function to the driver, to allow us to query the
device. This allows us to have the unit test pick up the presence of
supported hardware or not.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Add the create/destroy driver functions so that we can actually allocate
a rawdev and destroy it when done. No rawdev API functions are actually
implemented at this point.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
In order to allow binding/unbinding of devices for use by the
ioat_rawdev, we need to update the devbind script to add a new class
of device, and add device ids for the specific HW instances.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Add stubs for ioat rawdev driver support in DPDK, specifically:
* makefile and meson build hooks
* initial public header file
* rawdev main C file, with probe and release functions
* release note update announcing the driver
* initial documentation for the new section in the rawdev doc
* unit test stubs for device unit tests
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds ice_flow_create, ice_flow_destroy,
ice_flow_flush and ice_flow_validate support,
these are used to handle all the generic filters.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add devargs to control each event timer adapter i.e. TIM rings internal
parameters uniquely. The following dict format is expected
[ring-chnk_slots-disable_npa-stats_ena]. 0 represents default values.
Example:
--dev "0002:0e:00.0,tim_ring_ctl=[2-1023-1-0]"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add devargs to limit the max number of TIM rings reserved on probe.
Since, TIM rings are HW resources we can avoid starving other
applications by not grabbing all the rings.
Example:
--dev "0002:0e:00.0,tim_rings_lmt=2"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add event timer adapter statistics get and reset functions.
Stats are disabled by default and can be enabled through devargs.
Example:
--dev "0002:0e:00.0,tim_stats_ena=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add devargs support to modify number of chunk slots. Chunks are used to
store event timers, a chunk can be visualised as an array where the last
element points to the next chunk and rest of them are used to store
events. TIM traverses the list of chunks and enqueues the event timers
to SSO.
If no argument is passed then a default value of 255 is taken.
Example:
--dev "0002:0e:00.0,tim_chnk_slots=511"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
If the chunks are allocated from NPA then TIM can automatically free
them when traversing the list of chunks.
Add devargs to disable NPA and use software mempool to manage chunks.
Example:
--dev "0002:0e:00.0,tim_disable_npa=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Add selftest to verify sanity of SSO.
Can be run by passing devargs to SSO PF as follows:
Example:
--dev "0002:0e:00.0,selftest=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
SSO GGRPs i.e. queue uses DRAM & SRAM buffers to hold in-flight
events. By default the buffers are assigned to the SSO GGRPs to
satisfy minimum HW requirements. SSO is free to assign the remaining
buffers to GGRPs based on a preconfigured threshold.
We can control the QoS of SSO GGRP by modifying the above mentioned
thresholds. GGRPs that have higher importance can be assigned higher
thresholds than the rest.
Example:
--dev "0002:0e:00.0,qos=[1-50-50-50]" // [Qx-XAQ-TAQ-IAQ]
Qx -> Event queue Aka SSO GGRP.
XAQ -> DRAM In-flights.
TAQ & IAQ -> SRAM In-flights.
The values need to be expressed in terms of percentages, 0 represents
default.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Octeontx2 SSO by default is set to use dual workslot mode.
Add devargs option to force legacy mode i.e. single workslot mode.
Example:
--dev "0002:0e:00.0,single_ws=1"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
The number of events for a *open system* event device is specified
as -1 as per the eventdev specification.
Since, Octeontx2 SSO inflight events are only limited by DRAM size, the
xae_cnt devargs parameter is introduced to provide upper limit for
in-flight events.
Example:
--dev "0002:0e:00.0,xae_cnt=8192"
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Add the make and meson based build infrastructure along with the
eventdev(SSO) device probe.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Replace the mbuf pointer array in the event eth Rx adapter
callback with an event array. Using an event array allows
the application to change attributes of the events enqueued
by the SW adapter.
The callback can drop packets and populate a callback
argument with the number of dropped packets. Add a Rx adapter
stats field to keep track of the total number of dropped packets.
This commit removes the experimental tags from
the callback and stats APIs, the experimental tag from eventdev
is also removed and eventdev functions become part of the
main DPDK API/ABI.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
When configuring with meson we print out a list of enabled components, but
it is also useful to list out the disabled components and the reasons why.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Update doc the match with code.
Fixes: 81f7ecd9 ("examples: use factorized default Rx/Tx configuration")
Cc: stable@dpdk.org
Signed-off-by: Bao-Long Tran <longtb5@viettel.com.vn>
The function rte_malloc_set_limit was defined but never implemented.
Mark it as deprecated for now, and remove in next release.
There is no point in keeping dead code.
"You Aren't Going to Need It"
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch enables need_wakeup flag for Tx and fill rings, when this
flag is set by the driver, it means that the userspace application has
to explicitly wake up the kernel Rx or kernel Tx processing by issuing
a syscall. Poll() can wake up both and sendto() or its alternatives
will wake up Tx processing only.
This feature is to provide efficient support for case that application
and driver executing on the same core.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
It can be useful to use pcap files for some rudimental performance
testing. This patch enables this functionality in the pcap driver.
At a high level, this works by creating a ring of sufficient size to
store the packets in the pcap file passed to the application. When the
rx function for this mode is called, packets are dequeued from the ring
for use by the application and also enqueued back on to the ring to be
"received" again.
A tx_drop mode is also added since transmitting to a tx_pcap file isn't
desirable at a high traffic rate.
Jumbo frames are not supported in this mode. When filling the ring at rx
queue setup time, the presence of multi segment mbufs is checked for.
The PMD will exit on detection of these multi segment mbufs.
Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add build and doc files along with hinic_pmd_ethdev.c
which just includes PMD register and log initialization
for compilation.
Signed-off-by: Ziyang Xuan <xuanziyang2@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
We had some inconsistencies between functions prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.
git grep -l __rte_experimental |grep \.c$ |while read file; do
sed -i -e '/^__rte_experimental$/d' $file;
sed -i -e 's/ *__rte_experimental//' $file;
sed -i -e 's/__rte_experimental *//' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Add a function rte_rand_max() which generates an uniformly distributed
pseudo-random number less than a user-specified upper bound.
The commonly used pattern rte_rand() % SOME_VALUE creates biased
results (as in some values in the range are more frequently occurring
than others) if SOME_VALUE is not a power of 2.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This commit replaces rte_rand()'s use of lrand48() with a DPDK-native
combined Linear Feedback Shift Register (LFSR) (also known as
Tausworthe) pseudo-random number generator.
This generator is faster and produces better-quality random numbers
than the linear congruential generator (LCG) of lib's lrand48(). The
implementation, as opposed to lrand48(), is multi-thread safe in
regards to concurrent rte_rand() calls from different lcore threads.
A LCG is still used, but only to seed the five per-lcore LFSR
sequences.
In addition, this patch also addresses the issue of the legacy
implementation only producing 62 bits of pseudo randomness, while the
API requires all 64 bits to be random.
This pseudo-random number generator is not cryptographically secure -
just like lrand48().
Bugzilla ID: 114
Bugzilla ID: 276
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Add new telemetry mode support for l3fwd-power.
This is a standalone mode, in this mode l3fwd-power
does simple l3fwding along with calculating
empty polls, full polls, and busy percentage for
each forwarding core. The aggregation of these
values of all cores is reported as application
level telemetry to metric library for every 500ms from the
master core.
The busy percentage is calculated by recording the poll_count
and when the count reaches a defined value the total
cycles it took is measured and compared with minimum and maximum
reference cycles and busy rate is set according to either 0% or
50% or 100%.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
telemetry has support for fetching port based stats
from metrics library.
Metrics library also has global stats which are
not fetched by telemetry, so extend telemetry to
fetch the global metrics.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
The EAL init diagram had a typo for "lauch"
instead of "launch".
Fixes: fc1f2750a3 ("doc: programmers guide")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Add a global function in the PMD which dumps debug information to
specific file.
The data can be printed in hexadecimal format or as regular string.
The number of debug files per PMD entity should be limited by a new PMD
probe parameter called max_dump_files_num.
The files will be created in the /var/log directory or in the current
directory.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
To get the VF's link status by calling 'rte_eth_link_get_nowait()', the
VF not only check PF's physical link status, but also check the mailbox
running status. And mailbox checking will generate mailbox interrupt in
PF, it will be worse if many VFs are running in the system, the PF will
have to handle many interrrupts.
Normally, checking the PF's physical link status is enough for nowait.
For different scenarios, adding an 'pflink_fullchk' option to control
whether to check the link fully or not.
Fixes: 91546fb62e ("net/ixgbevf: fix link state")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Scott Daniels <daniels@research.att.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The firmware in 1400 series VIC adapters which would support COUNT
flow action was postponed and reworked. The capability will be
re-added in a future release when the firmware is available.
This reverts the following commits.
commit 86df6c4e2f ("net/enic: support flow counter action")
commit 1b4ce87dc5 ("net/enic: fix counter action")
Cc: stable@dpdk.org
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Available link speeds are based on VIC adapter model, which is encoded
in PCI subsystem device ID.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
When Rx interrupts are disabled, we simply disable rearm when
the interrupt fires the next time. So, the next packet will
trigger interrupt (if it is not happened yet after previous Rx
burst processing).
Signed-off-by: Georgiy Levashov <georgiy.levashov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch adds two parameters `start_queue` and `queue_count` to
specify the range of netdev queues used by AF_XDP pmd.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Implement zero copy of af_xdp pmd through mbuf's external memory
mechanism to achieve high performance.
This patch also provides a new parameter "pmd_zero_copy" for user, so they
can choose to enable zero copy of af_xdp pmd or not.
To be clear, "zero copy" here is different from the "zero copy mode" of
AF_XDP, it is about zero copy between af_xdp umem and mbuf used in dpdk
application.
Suggested-by: Vipin Varghese <vipin.varghese@intel.com>
Suggested-by: Tummala Sivaprasad <sivaprasad.tummala@intel.com>
Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
This commit adds support to the bnxt PMD for devices
based on the BCM57508 "thor" Ethernet controller.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Shared memory packet interface (memif) PMD allows for DPDK and any other
client using memif (DPDK, VPP, libmemif) to communicate using shared
memory. The created device transmits packets in a raw format. It can be
used with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
Ethernet mode is supported in DPDK memif implementation. Memif is Linux
only.
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Update HWRM API to version 1.10.0.74
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>