Currently, the function to wait until config completion is
static inline for no reason. Move its implementation to
an EAL common file.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
There is no reason to pack the memconfig structure, and doing so
triggers warnings in some static analyzers. Fix it by removing
the packed attribute.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Now that everything that has ever accessed the shared memory
config is doing so through the public APIs, we can make it
internal. Since we're removing quite a few headers from
rte_eal_memconfig.h, we need to add them back in places
where this header is used.
This bumps the ABI, so also change all build files and
update documentation.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Currently, in order to lock access to the mempool list, a direct
access to the shared memory structure is needed. Add an API to do
the same, and search-and-replace all usages.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Currently, locking/unlocking the TAILQ list requires direct
access to the shared memory config. Add an API to do the same,
and search-and-replace all usages.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Currently, the memory hotplug is locked automatically by all
memory-related _walk() functions, but sometimes locking the
memory subsystem outside of them is needed. There is no
public API to do that, so callers depend on the shared
memory config being public. Fix this by introducing a new
API to lock/unlock the memory hotplug subsystem.
Create a new common file for all things mem config, and a
new API namespace rte_mcfg_*, and search-and-replace all
usages of the locks with the new API.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Currently, if the bus selects IOVA as PA, the memory init can fail when
lacking access to physical addresses.
It can be quite hard for normal users to understand what is wrong,
since this is the default behavior.
Catch this situation earlier in eal init by validating physical addresses
availability, or select IOVA when no clear preference has been expressed.
The bus code is changed so that it reports when it does not care about
the IOVA mode and lets the eal init decide.
In Linux implementation, rework rte_eal_using_phys_addrs() so that it can
be called earlier but still avoid a circular dependency with
rte_mem_virt2phys().
In FreeBSD implementation, rte_eal_using_phys_addrs() always returns
false, so the detection part is left as is.
If librte_kni is compiled in and the KNI kmod is loaded,
- if the buses requested VA, force to PA if physical addresses are
available as it was done before,
- else, keep iova as VA, KNI init will fail later.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The function rte_malloc_set_limit was defined but never implemented.
Mark it as deprecated for now, and remove in next release.
There is no point in keeping dead code.
"You Aren't Going to Need It"
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
According to API, 'rte_dev_probe()' and 'rte_dev_remove()' must
return 0 or negative error code. Bus code returns positive values
if device wasn't recognized by any driver, so the result of
'bus->plug/unplug()' must be converted. 'local_dev_probe()' and
'local_dev_remove()' also have their own internal API, so the conversion
should be done there.
Positive on remove means that the device was not found by the driver.
Positive on probe means that there are no suitable buses/drivers,
i.e. the device is not supported.
Users of these APIs are fixed to set a good example by respecting the
DPDK API. This will also help catch such issues in the future.
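The conversion described above can be sketched as follows. This is only
an illustration: the helper names are hypothetical, and the errno values
(-ENOTSUP for probe, -ENOENT for remove) are assumptions picked to match
the two meanings listed above, not necessarily the values used by the
patch.

```c
#include <errno.h>

/*
 * Hypothetical helpers. A bus callback is assumed to return:
 *   < 0  error, 0  success, > 0  device not handled.
 */

/* Probe: positive means no suitable bus/driver, i.e. not supported. */
static int demo_normalize_probe(int bus_ret)
{
	return bus_ret > 0 ? -ENOTSUP : bus_ret;
}

/* Remove: positive means the device was not found by the driver. */
static int demo_normalize_remove(int bus_ret)
{
	return bus_ret > 0 ? -ENOENT : bus_ret;
}
```

Zero and negative values pass through unchanged, so callers only ever
see 0 or a negative error code.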
Fixes: a3ee360f4440 ("eal: add hotplug add/remove device")
Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Putting a '__attribute__((deprecated))' in the middle of a function
prototype does not produce the expected result with gcc (while clang
is fine with this syntax).
$ cat deprecated.c
void * __attribute__((deprecated)) incorrect() { return 0; }
__attribute__((deprecated)) void *correct(void) { return 0; }
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
$ gcc -o deprecated.o -c deprecated.c
deprecated.c: In function ‘main’:
deprecated.c:3:1: warning: ‘correct’ is deprecated (declared at
deprecated.c:2) [-Wdeprecated-declarations]
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
^
Move the tag to a separate line and make it the first thing in function
prototypes.
This is not perfect but we will trust reviewers to catch the other,
harder to detect patterns.
sed -i \
-e '/^\([^#].*\)\?__rte_experimental */{' \
-e 's//\1/; s/ *$//; i\' \
-e __rte_experimental \
-e '/^$/d}' \
$(git grep -l __rte_experimental -- '*.h')
Special mention for rte_mbuf_data_addr_default():
There is either a bug or a (not yet understood) issue with gcc.
gcc won't drop this inline when unused and rte_mbuf_data_addr_default()
calls rte_mbuf_buf_addr() which itself is experimental.
This results in a build warning when not accepting experimental apis
from sources just including rte_mbuf.h.
For this specific case, we hide the call to rte_mbuf_buf_addr() under
the ALLOW_EXPERIMENTAL_API flag.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
We had some inconsistencies between function prototypes and actual
definitions.
Let's avoid this by only adding the experimental tag to the prototypes.
Tests with gcc and clang show it is enough.
git grep -l __rte_experimental |grep \.c$ |while read file; do
sed -i -e '/^__rte_experimental$/d' $file;
sed -i -e 's/ *__rte_experimental//' $file;
sed -i -e 's/__rte_experimental *//' $file;
done
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This function is not visible from outside this code unit.
Fixes: 84e7477e10b1 ("mem: add thread unsafe version for DMA mask check")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The incriminated commit promoted those symbols as stable but the
prototypes still have the tag.
Fixes: 73eca2f77f4c ("devargs: promote experimental API as stable")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This API was experimental and not properly marked in the map file.
But looking more closely, this is just an internal wrapper for EAL init.
Hide it in the hotplug code.
Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
When seeding the pseudo-random number generator, replace the 64-bit
RDSEED with two 32-bit RDSEED instructions to allow building and
running on 32-bit x86.
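A rough sketch of combining two 32-bit RDSEED reads into one 64-bit
value is shown below. The function name is hypothetical, and the code
assumes the compiler defines __RDSEED__ when the instruction is enabled
(e.g. with -mrdseed); otherwise it simply reports failure so that a
caller can fall back to another seed source.

```c
#include <stdint.h>
#ifdef __RDSEED__
#include <immintrin.h>
#endif

/* Returns 0 and fills *seed on success, -1 if RDSEED is unavailable or fails. */
static int demo_rdseed64(uint64_t *seed)
{
#ifdef __RDSEED__
	unsigned int lo, hi;

	/* _rdseed32_step() returns 1 when a value was delivered */
	if (_rdseed32_step(&lo) != 1)
		return -1;
	if (_rdseed32_step(&hi) != 1)
		return -1;
	*seed = ((uint64_t)hi << 32) | (uint64_t)lo;
	return 0;
#else
	(void)seed;
	return -1;
#endif
}
```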
Fixes: faf8fd252785 ("eal: improve entropy for initial PRNG seed")
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Add a function rte_rand_max() which generates a uniformly distributed
pseudo-random number less than a user-specified upper bound.
The commonly used pattern rte_rand() % SOME_VALUE creates biased
results (some values in the range occur more frequently than others)
if SOME_VALUE is not a power of 2.
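To illustrate the bias problem, here is one common way to get an
unbiased bounded value: rejection sampling. This is a sketch only and
not necessarily the algorithm used by rte_rand_max(); demo_rand() is a
hypothetical stand-in for rte_rand() (a small xorshift64 generator).

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_rand(): xorshift64, state must be non-zero. */
static uint64_t demo_state = 0x9E3779B97F4A7C15ULL;

static uint64_t demo_rand(void)
{
	demo_state ^= demo_state << 13;
	demo_state ^= demo_state >> 7;
	demo_state ^= demo_state << 17;
	return demo_state;
}

/* Return a value in [0, upper_bound) without modulo bias. */
static uint64_t demo_rand_max(uint64_t upper_bound)
{
	uint64_t limit, r;

	if (upper_bound == 0)
		return 0;
	/* Largest multiple of upper_bound that does not exceed UINT64_MAX */
	limit = UINT64_MAX - (UINT64_MAX % upper_bound);
	/* Reject draws at or above limit so every residue is equally likely */
	do {
		r = demo_rand();
	} while (r >= limit);
	return r % upper_bound;
}
```

The accepted range [0, limit) contains a whole number of blocks of size
upper_bound, which is what removes the bias of a plain modulo.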
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Replace the use of rte_get_timer_cycles() with getentropy() for
seeding the pseudo-random number generator. getentropy() provides a
more truly random value.
getentropy() requires glibc 2.25 and Linux kernel 3.17. In case
getentropy() is not found at compile time, or the relevant syscall
fails at runtime, the rdseed machine instruction will be used as a
fallback.
rdseed is only available on x86 (Broadwell or later). In case it is
not present, rte_get_timer_cycles() will be used as a second fallback.
On non-Meson builds, getentropy() will not be used.
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This commit replaces rte_rand()'s use of lrand48() with a DPDK-native
combined Linear Feedback Shift Register (LFSR) (also known as
Tausworthe) pseudo-random number generator.
This generator is faster and produces better-quality random numbers
than the linear congruential generator (LCG) of libc's lrand48(). The
implementation, as opposed to lrand48(), is multi-thread safe with
regard to concurrent rte_rand() calls from different lcore threads.
An LCG is still used, but only to seed the five per-lcore LFSR
sequences.
In addition, this patch also addresses the issue of the legacy
implementation only producing 62 bits of pseudo randomness, while the
API requires all 64 bits to be random.
This pseudo-random number generator is not cryptographically secure -
just like lrand48().
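For readers unfamiliar with combined Tausworthe generators, the sketch
below shows the general shape: five LFSR components are stepped
independently and XORed together. The names are hypothetical and the
shift/mask parameters are those of L'Ecuyer's published LFSR258
generator as recalled here; that the patch uses exactly these values is
an assumption.

```c
#include <stdint.h>

/* Illustrative state: five independent LFSR components. */
struct demo_rand_state {
	uint64_t z1, z2, z3, z4, z5;
};

/* One Tausworthe component step. */
static uint64_t demo_lfsr_step(uint64_t z, unsigned int a, unsigned int b,
			       uint64_t c, unsigned int d)
{
	return (((z << a) ^ z) >> b) ^ ((z & c) << d);
}

/*
 * Advance all five components and XOR them into one 64-bit output.
 * Each seed needs a bit set above the masked-off low bits
 * (z1 > 1, z2 > 511, z3 > 4095, z4 > 131071, z5 > 8388607).
 */
static uint64_t demo_rand_next(struct demo_rand_state *s)
{
	s->z1 = demo_lfsr_step(s->z1, 1, 53, UINT64_MAX << 1, 10);
	s->z2 = demo_lfsr_step(s->z2, 24, 50, UINT64_MAX << 9, 5);
	s->z3 = demo_lfsr_step(s->z3, 3, 23, UINT64_MAX << 12, 29);
	s->z4 = demo_lfsr_step(s->z4, 5, 24, UINT64_MAX << 17, 23);
	s->z5 = demo_lfsr_step(s->z5, 3, 33, UINT64_MAX << 23, 8);

	return s->z1 ^ s->z2 ^ s->z3 ^ s->z4 ^ s->z5;
}
```

Keeping one such state per lcore is what makes concurrent calls from
different lcore threads safe without locking.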
Bugzilla ID: 114
Bugzilla ID: 276
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some helpers in the header file are force-inlined while others are
only marked inline; this patch forces inlining for all of them.
This avoids them being emitted as standalone functions when called
multiple times in the same object file. For example, when we added
packed ring support in the vhost-user library, rte_memcpy_generic was
no longer inlined.
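As a small illustration of the difference, plain "inline" is only a
hint, while the gcc/clang always_inline attribute forces inlining at
every call site. The macro and helper below are hypothetical names used
only for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: force inlining instead of merely hinting at it. */
#define demo_always_inline inline __attribute__((always_inline))

/* Small copy helper that should never become a standalone function. */
static demo_always_inline void
demo_copy16(uint8_t *dst, const uint8_t *src)
{
	size_t i;

	for (i = 0; i < 16; i++)
		dst[i] = src[i];
}
```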
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call to explicitly handle unsupported IPC cases.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, unregister will be attempted even if IPC wasn't
supported in the first place. It is harmless, but for
consistency reasons, update the unregister API call to
exit early when IPC is not supported.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.
For secondary processes, IPC will be crucial, so leave all
of the error handling as is.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When checking RTE_PCI_DRV_IOVA_AS_VA flag to determine IOVA mode,
pci_one_device_has_iova_va() returns true only if kernel driver of the
device is vfio. However, Mellanox mlx4/5 PMD doesn't need to be detached
from kernel driver and attached to VFIO/UIO. Control path still goes
through the existing kernel driver, which is mlx4_core/mlx5_core. In order
to make RTE_PCI_DRV_IOVA_AS_VA effective for mlx4/mlx5 PMD, a new kernel
driver type has to be introduced.
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
The meson build never checked for the presence of rdrand and rdseed
instructions, while make build never checked for rdseed. Ensure builds
always have the appropriate checks - and therefore defines - for these
instructions. For runtime, we also add in rdseed to the list of known
bits returned from cpuid() instruction, so we can confirm its presence at
application init time.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
The fields of the internal EAL core configuration are currently
laid bare as part of the API. This is not good practice and limits
fixing issues with layout and sizes.
Make new accessor functions for the fields used by current drivers
and examples.
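The accessor pattern can be sketched like this. All names and the table
contents are hypothetical; the point is only that callers go through a
function and never see the structure layout, so the layout can change
later without breaking them.

```c
#define DEMO_MAX_LCORE 4

/* Private layout: would live in a .c file, not in a public header. */
struct demo_lcore_config {
	unsigned int socket_id;
	unsigned int core_index;
};

static const struct demo_lcore_config demo_lcores[DEMO_MAX_LCORE] = {
	{ 0, 0 }, { 0, 1 }, { 1, 2 }, { 1, 3 },
};

/* Public accessor: socket of an lcore, or -1 if lcore_id is out of range. */
int demo_lcore_to_socket_id(unsigned int lcore_id)
{
	if (lcore_id >= DEMO_MAX_LCORE)
		return -1;
	return (int)demo_lcores[lcore_id].socket_id;
}
```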
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Purely cosmetic change, use unsigned int instead of unsigned alone.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
snprintf guarantees to always correctly place a null terminator
in the buffer string. So manually placing a null terminator
in a buffer right after a call to snprintf is redundant code.
Additionally, there is no need to use 'sizeof(buffer) - 1' in snprintf as this
means we are not using the last character in the buffer. 'sizeof(buffer)' is
enough.
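A short sketch of the point being made (the helper name is
hypothetical): passing the full buffer size is safe because snprintf
writes at most size - 1 characters and then the terminator.

```c
#include <stdio.h>
#include <string.h>

/* Copy src into dst, truncating if needed; returns the stored length. */
static size_t demo_copy_name(char *dst, size_t dst_len, const char *src)
{
	/* No "dst_len - 1" and no manual dst[dst_len - 1] = '\0' needed */
	snprintf(dst, dst_len, "%s", src);
	return strlen(dst);
}
```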
Cc: stable@dpdk.org
Signed-off-by: Michael Santana <msantana@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When handling synchronous or asynchronous requests, the reply
must be sent explicitly even if the result of the operation is
an error, to avoid the other side timing out. Make note of this
in documentation explicitly.
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
IPC and memory-related APIs should not be mixed because memory
relies on IPC internally. Add explicit warnings to IPC API and
to the documentation about this.
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The function check_input() was returning a bool as error code.
It is changed to return an int, which is semantically more correct.
While at it, make the checks of the validate_action_name() return
value explicit, as described in the coding guidelines.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Length of buffer and number of fd's to send are signed values, so
they can be negative, but the API doesn't check for that. Fix it
by checking for negative values as well.
Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, IPC does not check received messages for invalid data
and passes them to user code unchanged. This may result in buffer
overruns on reading message data. Fix this by checking the message
length and fd number on receive, and discard any messages that
are not valid.
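A minimal sketch of such a check follows. The structure, limits and
function name are hypothetical stand-ins for illustration, not the
actual IPC message definition.

```c
#include <errno.h>

/* Hypothetical limits for the sketch. */
#define DEMO_MAX_PARAM_LEN 256
#define DEMO_MAX_FD_NUM 8

/* Hypothetical message header: both fields are signed. */
struct demo_msg {
	int len_param;
	int num_fds;
};

/* Returns 0 if the message is sane, -EINVAL if it must be discarded. */
static int demo_validate_msg(const struct demo_msg *msg)
{
	if (msg->len_param < 0 || msg->len_param > DEMO_MAX_PARAM_LEN)
		return -EINVAL;
	if (msg->num_fds < 0 || msg->num_fds > DEMO_MAX_FD_NUM)
		return -EINVAL;
	return 0;
}
```

Checking both the lower and the upper bound matters because the fields
are signed: a negative length would otherwise pass a "too large" test.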
Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
According to manpage, ENOBUFS error indicates that either the
input or the output queue is full. This should be considered
an error, but it is treated as an "ignore" condition. Fix the
code to report an error instead.
Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
When sending multiple requests, rte_mp_request_sync
can succeed sending a few of those requests, but then
fail on a later one and in the end return with rc=-1.
The upper layers - e.g. device hotplug - currently
handle this case as if no messages were sent and no
memory for response buffers was allocated, which is
not true. Fix this by always freeing memory buffers on
failure.
Bugzilla ID: 228
Fixes: 783b6e54971d ("eal: add synchronous multi-process communication")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
This message was missing a newline, and should capitalize
"Cannot" like all the others in this area.
Fixes: ac9e4a17370f ("eal: support attach/detach shared device from secondary")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Rami Rosen <ramirose@gmail.com>
The function rte_eal_cleanup() was introduced more than one year ago,
in DPDK 18.02. It is no longer experimental, allowing
pdump, proc-info and hotplug_mp apps to not need any experimental API.
The function rte_ctrl_thread_create() was introduced one year ago
in DPDK 18.05. It is no longer experimental, allowing
KNI PMD and TEP example to not need any experimental API.
The functions rte_socket_count() and rte_socket_id_by_idx() were
introduced one year ago in DPDK 18.05. They are no longer experimental.
The function rte_dev_is_probed() was introduced half a year ago
in DPDK 18.11. It is no longer experimental.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
These APIs have been available in DPDK for the last 4 releases
and are used by multiple drivers.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed. The function changes in this patch were
auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place
and then the files edited using awk to add in the missing header:
gawk -i inplace '/include <rte_/ && ! seen { \
print "#include <rte_string_fns.h>"; seen=1} {print}'
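For context, the two forms are equivalent in effect. The helper below
is a hypothetical strlcpy-style function written on top of snprintf,
shown only to illustrate the semantics of the replacement (bounded
copy, always terminated, returns the length of the source).

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical strlcpy-like helper: copies at most size - 1 characters,
 * always terminates dst, and returns strlen(src) so that callers can
 * detect truncation (return value >= size).
 */
static size_t demo_strlcpy(char *dst, const char *src, size_t size)
{
	return (size_t)snprintf(dst, size, "%s", src);
}
```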
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
For files that already have rte_string_fns.h included in them, we can
do a straight replacement of snprintf(..."%s",...) with strlcpy. The
changes in this patch were auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy-with-header.cocci --dir . --in-place
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
When enabling pedantic compilation with CONFIG_RTE_LIBRTE_MLX5_DEBUG,
the compiler complains about non-standard 128-bit integer type:
include/rte_atomic_64.h:223:3: error:
ISO C does not support ‘__int128’ types [-Werror=pedantic]
It must be marked as an extension of the standard C language
to be accepted in pedantic compilation.
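A sketch of the marking, assuming gcc or clang on a 64-bit target (the
type name is hypothetical, and the anonymous union requires C11):

```c
#include <stdint.h>

/* Hypothetical 128-bit container. */
typedef struct {
	union {
		uint64_t val[2];
		/* __extension__ silences the -pedantic diagnostic for __int128 */
		__extension__ __int128 int128;
	};
} demo_int128_t;

/* Build a value from two 64-bit halves stored in val[0] and val[1]. */
static demo_int128_t demo_int128_make(uint64_t v0, uint64_t v1)
{
	demo_int128_t v;

	v.val[0] = v0;
	v.val[1] = v1;
	return v;
}
```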
Fixes: 640c5f09ef2c ("eal/x86: add 128-bit atomic compare exchange")
Cc: gage.eads@intel.com
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gage Eads <gage.eads@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Baremetal execution environments may have a different
method to enable RTE_INIT instead of using a compiler
constructor and/or an OS-specific linker scheme.
Allow an option to override RTE_INIT* macros using
rte_os.h or appropriate header file.
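The override mechanism can be sketched with a guarded macro
definition. DEMO_INIT and the function below are hypothetical names;
the default shown assumes a gcc/clang constructor attribute, and an
environment-specific header could define DEMO_INIT first to replace it.

```c
/* Default definition, used only if no earlier header provided one. */
#ifndef DEMO_INIT
#define DEMO_INIT(func) \
static void __attribute__((constructor, used)) func(void)
#endif

static int demo_initialized;

/* Runs before main() with the default constructor-based definition. */
DEMO_INIT(demo_setup)
{
	demo_initialized = 1;
}
```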
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This operation can be used for non-blocking algorithms, such as a
non-blocking stack or ring.
It is available only for x86_64.
Signed-off-by: Gage Eads <gage.eads@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
The commit below added an address hint as starting address for 64-bit
systems in case an explicit base virtual address was not set by the user.
The justification for such a hint was to help devices that work in VA
mode and have an address range limitation to work smoothly with the eal
memory subsystem.
While the base address value selected may work fine for the eal
initialization, it easily breaks when trying to register external memory
using rte_extmem_register API.
Trying to register anonymous memory on an RH x86_64 machine took several
minutes, during which the function eal_get_virtual_area repeatedly
scanned for a good VA candidate.
The attempt to guess which VA address will be free for mapping will
always result in non-portable, error-prone code:
* different applications may use different libraries along with DPDK. One
can never guess which library was called first and how much virtual
memory it consumed.
* external memory can be registered at any time in the application run
time.
In order not to break the existing secondary process design, this patch
only limits the max number of tries that will be done with the
address hint.
When the number of tries exceeds the threshold the code
will use the suggested address from kernel.
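The bounded-retry idea can be sketched as below. This is not the EAL
code: the function name, the retry limit of 4 and the "advance by len"
stepping are assumptions made for illustration, using plain mmap() on a
system that provides MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical limit on how many hinted addresses are tried. */
#define DEMO_MAX_HINT_TRIES 4

/* Reserve len bytes, preferring hint but giving up on it after a few tries. */
static void *demo_reserve_area(void *hint, size_t len)
{
	uintptr_t next = (uintptr_t)hint;
	void *addr;
	int tries;

	for (tries = 0; hint != NULL && tries < DEMO_MAX_HINT_TRIES; tries++) {
		addr = mmap((void *)next, len, PROT_NONE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (addr == MAP_FAILED)
			return NULL;
		if (addr == (void *)next)
			return addr;	/* got the hinted address */
		/* Kernel placed it elsewhere: undo and try the next hint */
		munmap(addr, len);
		next += len;
	}

	/* Threshold exceeded (or no hint): accept whatever the kernel suggests */
	addr = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return addr == MAP_FAILED ? NULL : addr;
}
```

Capping the loop bounds the worst-case time spent scanning, at the cost
of sometimes ending up at an address other than the hinted one.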
Fixes: 1df21702873d ("mem: use address hint for mapping hugepages")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>