Replace the mbuf pointer array in the event eth Rx adapter
callback with an event array. Using an event array allows
the application to change attributes of the events enqueued
by the SW adapter.
The callback can drop packets and populate a callback
argument with the number of dropped packets. Add a Rx adapter
stats field to keep track of the total number of dropped packets.
This commit removes the experimental tags from
the callback and stats APIs, the experimental tag from eventdev
is also removed and eventdev functions become part of the
main DPDK API/ABI.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
This patch introduces a new version of the event timer adapter software
PMD. In the original design, timer event producer lcores in the primary
and secondary processes enqueued event timers into a ring, and a
service core in the primary process dequeued them and processed them
further. To improve performance, this version does away with the ring
and lets lcores insert timers directly into timer skiplist data
structures; the service core directly accesses the lists as well, when
looking for timers that have expired.
To compare the burst and non-burst performance of the original and new
versions of the software event timer adapter, I ran the following
commands:
$ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \
-- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \
--prod_type_timerdev --worker_deq_depth=32
$ sudo ./build/app/dpdk-test-eventdev -c 0xFFE -s 0xC --vdev=event_sw0 \
-- --test=perf_queue --plcores=4,5,6 --wlcore=7,8,9 --stlist=p \
--prod_type_timerdev_burst --worker_deq_depth=32
With the new version, I see a 151% improvement in throughput for the
non-burst case, and a 270% improvement in throughput for the burst case.
I also see a 53% improvement in arm latency in the non-burst case and a
65% improvement in arm latency in the burst case.
Note: To perform the test, I commented out a check in the original
version that checks the adapter tick interval against a minimum value.
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Setup event when the Rx queue is added to the
adapter in place of generating the event when it is
being enqueued to the event device.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Remove copy from temporary event array on the stack to the
enqueue buffer event array entry, instead initialize event in the
enqueue buffer event array entry.
Suggested-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
For each library where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
When configuring with meson we print out a list of enabled components, but
it is also useful to list out the disabled components and the reasons why.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
When running self-tests, the driver needs to know the device on which to
run the tests, so we need to take the device ID as parameter. Only the
skeleton driver is providing this selftest capability right now, so we can
easily update it for this change.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The function rte_malloc_set_limit was defined but never implemented.
Mark it as deprecated for now, and remove in next release.
There is no point in keeping dead code.
"You Aren't Going to Need It"
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Linux EAL will attach the shared config at an arbitrary address,
find out where the shared config is mapped in the primary, and
then will reattach it at that exact address.
FreeBSD version doesn't seem to go for that extra reattach step,
which makes one wonder how did it ever work in the first place.
Fix the FreeBSD init to also reattach shared config to the exact
same place the primary process has it.
Fixes: 764bf26873 ("add FreeBSD support")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When init is complete, EAL is supposed to update internal config
to indicate that initialization is complete. Add missing write.
Fixes: a99c96e96a ("eal: add internal flag of init completed")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The default policy for offload-specific fields is that
they are undefined unless the corresponding offloads are
requested in mbuf ol_flags. This is also the case for outer
L2 and L3 length fields which must not be assumed to contain
zeros for non-tunnel packets. The patch clarifies this behaviour
in the comments and also adds appropriate checks to the PMDs which
do not check any tunnel-related offloads before using the said fields.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The API to prepare checksum offloads mistreats L4
checksum type enum values as self-contained flags.
Turning these flag checks into enum checks causes
warnings by GCC about possibly uninitialised IPv4
header pointer. The issue was found to show up in
the case of GCC versions 4.8.5 and 5.4.0, however,
it might be the case for a wider variety of other
versions. Initialise the pointer upon declaration.
and explain the reason behind this in the comment.
Fixes: 4fb7e803eb ("ethdev: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The API to prepare checksum offloads employs outer
IP checksum flag to tell regular IPv4 packets from
tunnel packets with outer IPv4 encapsulation. This
flag cannot serve as a marker for the said purpose
because a packet can have outer IPv4 encapsulation
and may not have outer IP checksum offload request.
Fix the API by changing the said criterion to test
outer IPv4 flag rather than outer IP checksum flag.
Use simpler spelling of the conditional expression.
Fixes: 4fb7e803eb ("ethdev: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
return 0 or negative error code. Bus code returns positive values
if device wasn't recognized by any driver, so the result of
'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
'local_dev_remove()' also has their internal API, so the conversion
should be done there.
Positive on remove means that device not found by driver.
Positive on probe means that there are no suitable buses/drivers,
i.e. device is not supported.
Users of these API fixed to provide a good example by respecting
DPDK API. This also will allow to catch such issues in the future.
Fixes: a3ee360f44 ("eal: add hotplug add/remove device")
Fixes: 244d513071 ("eal: enable hotplug on multi-process")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Putting a '__attribute__((deprecated))' in the middle of a function
prototype does not result in the expected result with gcc (while clang
is fine with this syntax).
$ cat deprecated.c
void * __attribute__((deprecated)) incorrect() { return 0; }
__attribute__((deprecated)) void *correct(void) { return 0; }
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
$ gcc -o deprecated.o -c deprecated.c
deprecated.c: In function ‘main’:
deprecated.c:3:1: warning: ‘correct’ is deprecated (declared at
deprecated.c:2) [-Wdeprecated-declarations]
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
^
Move the tag on a separate line and make it the first thing of function
prototypes.
This is not perfect but we will trust reviewers to catch the other not
so easy to detect patterns.
sed -i \
-e '/^\([^#].*\)\?__rte_experimental */{' \
-e 's//\1/; s/ *$//; i\' \
-e __rte_experimental \
-e '/^$/d}' \
$(git grep -l __rte_experimental -- '*.h')
Special mention for rte_mbuf_data_addr_default():
There is either a bug or a (not yet understood) issue with gcc.
gcc won't drop this inline when unused and rte_mbuf_data_addr_default()
calls rte_mbuf_buf_addr() which itself is experimental.
This results in a build warning when not accepting experimental apis
from sources just including rte_mbuf.h.
For this specific case, we hide the call to rte_mbuf_buf_addr() under
the ALLOW_EXPERIMENTAL_API flag.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
We had some inconsistencies between functions prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.
git grep -l __rte_experimental |grep \.c$ |while read file; do
sed -i -e '/^__rte_experimental$/d' $file;
sed -i -e 's/ *__rte_experimental//' $file;
sed -i -e 's/__rte_experimental *//' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This function is not visible from outside this code unit.
Fixes: 84e7477e10 ("mem: add thread unsafe version for DMA mask check")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The incriminated commit promoted this symbol as stable but the
definition still has the tag.
Fixes: 787ae736a3 ("vfio: remove experimental tag")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The incriminated commit promoted those symbols as stable but the
prototypes still have the tag.
Fixes: 73eca2f77f ("devargs: promote experimental API as stable")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This API was experimental and not properly marked in the map file.
But looking more closely, this is just an internal wrapper for EAL init.
Hide it in the hotplug code.
Fixes: 244d513071 ("eal: enable hotplug on multi-process")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
When seeding the pseudo-random number generator, replace the 64-bit
RDSEED with two 32-bit RDSEED instructions to allow building and
running on 32-bit x86.
Fixes: faf8fd2527 ("eal: improve entropy for initial PRNG seed")
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Add a function rte_rand_max() which generates an uniformly distributed
pseudo-random number less than a user-specified upper bound.
The commonly used pattern rte_rand() % SOME_VALUE creates biased
results (as in some values in the range are more frequently occurring
than others) if SOME_VALUE is not a power of 2.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Replace the use of rte_get_timer_cycles() with getentropy() for
seeding the pseudo-random number generator. getentropy() provides a
more truly random value.
getentropy() requires glibc 2.25 and Linux kernel 3.17. In case
getentropy() is not found at compile time, or the relevant syscall
fails in runtime, the rdseed machine instruction will be used as a
fallback.
rdseed is only available on x86 (Broadwell or later). In case it is
not present, rte_get_timer_cycles() will be used as a second fallback.
On non-Meson builds, getentropy() will not be used.
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This commit replaces rte_rand()'s use of lrand48() with a DPDK-native
combined Linear Feedback Shift Register (LFSR) (also known as
Tausworthe) pseudo-random number generator.
This generator is faster and produces better-quality random numbers
than the linear congruential generator (LCG) of lib's lrand48(). The
implementation, as opposed to lrand48(), is multi-thread safe in
regards to concurrent rte_rand() calls from different lcore threads.
A LCG is still used, but only to seed the five per-lcore LFSR
sequences.
In addition, this patch also addresses the issue of the legacy
implementation only producing 62 bits of pseudo randomness, while the
API requires all 64 bits to be random.
This pseudo-random number generator is not cryptographically secure -
just like lrand48().
Bugzilla ID: 114
Bugzilla ID: 276
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Build error:
../lib/librte_telemetry/rte_telemetry.c:558:28:
error: comparison of unsigned expression < 0 is always false
[-Werror,-Wtautological-compare]
Build error not observed in default make build because telemetry library
disabled by default but easier to reproduce via meson.
Fixing by converting unsigned variables to signed.
Fixes: 0fe3a37924 ("telemetry: format json response when sending stats")
Fixes: 4080e46c80 ("telemetry: support global metrics")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
Array ins_chk in lib/librte_bpf/bpf_validate.c has 255 entries.
So the instruction with opcode == 255 will reading beyond array
boundaries.
For more details please refer to:
https://bugs.dpdk.org/show_bug.cgi?id=283
Fixes: 6e12ec4c4d ("bpf: add more checks")
Cc: stable@dpdk.org
Reported-by: Michel Machado <michel@digirati.com.br>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Take into account IPv6 fragment extension header when
calculating data size for each fragment.
Fixes: 7a838c8798 ("ip_frag: fix IPv6 when MTU sizes not aligned to 8 bytes")
Fixes: 0aa31d7a59 ("ip_frag: add IPv6 fragmentation support")
Cc: stable@dpdk.org
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
clang raise 'pointer-sign' warnings in __atomic_compare_exchange
when passing 'uint64_t *' to parameter of type 'int64_t *' converts
between pointers to integer types with different sign.
Fixes: 7e6e609939 ("stack: add C11 atomic implementation")
Cc: stable@dpdk.org
Suggested-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
When adding an alarm, if an error happen when registering
the common alarm callback, it is not considered as a major failure.
The alarm is then inserted in the list.
However it was returning an error code after inserting the alarm.
The error code is not set anymore to be consistent with the behaviour.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This patch changes some void functions to return a value,
so that the init sequence may tear down orderly
instead of calling panic.
Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When a component uses either XOPEN_SOURCE or POSIX_C_SOURCE macro
explicitly in its build recipe, it restricts visibility of a non POSIX
features subset, such as IANA protocol numbers (IPPROTO_* macros).
Non standard features are enabled by default for DPDK both for Linux
thanks to _GNU_SOURCE and for FreeBSD thanks to __BSD_VISIBLE. However
using XOPEN_SOURCE or POSIX_(C_)SOURCE in a component causes
__BSD_VISIBLE to be defined to 0 for FreeBSD, causing different feature
sets visibility for Linux and FreeBSD. It restricts from using IPPROTO
macros in public headers, such as rte_ip.h, despite the fact they are
already widely used in sources.
Add __BSD_VISIBLE macro specified unconditionally for FreeBSD targets
which enforces feature sets visibility unification between Linux and
FreeBSD.
Add single -D_GNU_SOURCE to config/meson.build as a project argument
instead of adding separate directive for each project subtree.
This patch solves the problem of build breaks for [1] on FreeBSD [2]
following the discussion [3].
[1] https://mails.dpdk.org/archives/dev/2019-May/131885.html
[2] http://mails.dpdk.org/archives/test-report/2019-May/082263.html
[3] https://mails.dpdk.org/archives/dev/2019-May/132110.html
Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
telemetry has support for fetching port based stats
from metrics library.
Metrics library also has global stats which are
not fetched by telemetry, so extend telemetry to
fetch the global metrics.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
This patch fixes the inferred misuse of enum of crypto algorithms.
Coverity issue: 325879
Fixes: e80a987081 ("vhost/crypto: add session message handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes a few same class bugs that causes the
logically dead code in vhost_crypto.
Coverity issue: 277236, 277233, 277220, 277214
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add a missing include with the defines for vhost-user driver features.
Fixes: 5fbb3941da ("vhost: introduce driver features related APIs")
Cc: stable@dpdk.org
Signed-off-by: Noa Ezra <noae@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
Free the `values` pointer before returning
from rte_telemetry_command_ports_all_stat_values()
to avoid memory leak.
Fixes: c12cefa379 ("telemetry: fix mapping of statistics")
Cc: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
As there is no ethtool support in KNI anymore,
PCI related information is no longer needed.
Fixes: ea6b39b5b8 ("kni: remove ethtool support")
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_eth_dev_info structure exposes, nb_seg_max & nb_mtu_seg_max
to provide maximum number of supported segments for a given platform.
Defining UINT16_MAX as default value of above mentioned variables to
expose support of infinite/maximum segments.
Based on above values, application can decide best size for buffers
while creating mbuf pool.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
We currently have no check on the mempool pointer passed to
rte_eth_rx_queue_setup.
Let's avoid a plain crash when dereferencing it.
Suggested-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Some helpers in the header file are forced inlined other are
only inlined, this patch forces inline for all.
It will avoid it to be embedded as functions when called multiple
times in the same object file. For example, when we added packed
ring support in vhost-user library, rte_memcpy_generic got no
more inlined.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Now that we have a single function to map the descriptors
buffers, let's prefetch them there as it is the earliest
place we can do it.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Handling of fragmented virtio-net header and indirect descriptors
tables was implemented to fix CVE-2018-1059. It should never
happen with healthy guests and so is already considered as
unlikely code path.
This patch moves these bits into non-inline dedicated functions
to reduce the I-cache pressure.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
At runtime either packed Tx/Rx functions will always be called,
or split Tx/Rx functions will always be called.
This patch removes the forced inlining in order to reduce
the I-cache pressure.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
In order to reduce the I-cache pressure, this patch removes
the inlining of the dirty pages logging functions, that we
can consider as cold path.
Indeed, these functions are only called while doing live
migration, so not called most of the time.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Add rte_eth_read_clock to read the raw clock of a device.
The main use is to get the device clock conversion co-efficients to be
able to translate the raw clock of the timestamp field of the pkt mbuf
to a local synced time value.
This function was missing to allow users to convert the Rx timestamp
field to real time without the complexity of the rte_timesync* facility.
One can derivate the clock frequency by calling twice read_clock and
then keep a common time base.
Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Some compilers reporting the following error, though the existing
code doesn't have any uninitialized variable case.
Just to make compiler happy, initialize the int32x4_t variable
one shot using vdupq_n_s32.
lib/librte_acl/acl_run_neon.h: In function 'search_neon_4'
lib/librte_acl/acl_run_neon.h:230:12: error:
'input' may be used uninitialized in this function
int32x4_t input;
Fixes: 34fa6c27c1 ("acl: add NEON optimization for ARMv8")
Cc: stable@dpdk.org
Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Aaron Conole <aconole@redhat.com>