Normally, each library has its own version number based on the ABI.
Add an option to have all libs just use the DPDK version number as the
.so version.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
Support building igb_uio using meson and ninja. For this, we still use the
kernel's kbuild system, by calling out to make, since it's safer and easier
than trying to reproduce that in meson. A list of suitable file
dependencies is given so that we have a reasonable chance of a rebuild when
necessary.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
Support building the EAL with meson and ninja. This involves a number of
different meson.build files for iterating through all the different
subdirectories in the EAL. The library itself will be compiled on build but
the header files are only copied from their initial location once "ninja
install" is run. Instead, we use meson dependency tracking to ensure that
other libraries which use the EAL headers can find them in their original
locations.
Note: this does not include building kernel modules on either BSD or Linux.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
We need a synchronous mode for multi-process communication,
i.e., blocking while waiting for a reply message after sending a
request to the peer process.
We add two APIs, rte_eal_mp_request() and rte_eal_mp_reply(), for
this use case. Invoking rte_eal_mp_request() sends a request message
and then waits for a reply message. The caller can specify the
timeout. The response messages are collected and returned so that
the caller can decide how to interpret them.
The API rte_eal_mp_reply() is always called by an mp action handler.
Here we add another parameter for rte_eal_mp_t so that the action
handler knows which peer address to reply to.
    sender-process                      receiver-process
    ----------------------              ----------------
    thread-n
     |_rte_eal_mp_request() ----------> mp-thread
        |_timedwait()                    |_process_msg()
                                            |_action()
                                               |_rte_eal_mp_reply()
                mp_thread  <---------------------|
                  |_process_msg()
                     |_signal(send_thread)
    thread-m <----------|
     |_collect-reply
* A secondary process is only allowed to talk to the primary process.
* If there are multiple secondary processes for the primary process,
the primary will send a request to peer1 and collect the response
from peer1, then send a request to peer2 and collect the response
from peer2, and so on.
* While thread-n is sending a request, thread-m of that process can
send a request at the same time.
* For each pair <action_name, peer>, we guarantee that only one such
request is in flight.
Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Previously, there were three channels for multi-process
(i.e., primary/secondary) communication.
1. Config-file based channel, in which, the primary process writes
info into a pre-defined config file, and the secondary process
reads the info out.
2. vfio submodule has its own channel based on unix socket for the
secondary process to get container fd and group fd from the
primary process.
3. pdump submodule also has its own channel based on unix socket for
packet dump.
It'd be good to have a generic communication channel for multi-process
communication to accommodate the requirements including:
a. Secondary wants to send info to primary, for example, secondary
would like to send a request (about some specific vdev) to primary.
b. Sending info at any time, instead of just initialization time.
c. Share FDs with the other side, for vdev like vhost, related FDs
(memory region, kick) should be shared.
d. A sent message request needs the other side to respond immediately.
This patch proposes to create a communication channel, based on a datagram
unix socket, for the above requirements. Each process will block on a unix
socket waiting for messages from its peers.
Three new APIs are added:
1. rte_eal_mp_action_register() is used to register an action,
indexed by a string, when a component at the receiver side would
like to respond to messages from the peer process.
2. rte_eal_mp_action_unregister() is used to unregister the action
if the calling component no longer wants to respond to the messages.
3. rte_eal_mp_sendmsg() is used to send a message, and returns
immediately. If there are n secondary processes, the primary
process will send n messages.
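The register/unregister/dispatch idea can be illustrated with a small string-indexed action table. This is only a sketch under stated assumptions: the names mp_action_register(), mp_action_unregister(), mp_dispatch(), the fixed table size and the handler signature are hypothetical and are not the EAL API, and the unix socket transport and any locking are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ACTIONS 8		/* assumed table size */
#define MAX_NAME_LEN 64		/* assumed name limit */

/* Hypothetical handler type: receives the message payload. */
typedef int (*mp_action_fn)(const void *msg, size_t len);

struct action_entry {
	char name[MAX_NAME_LEN];
	mp_action_fn fn;	/* NULL marks a free slot */
};

static struct action_entry action_table[MAX_ACTIONS];

static struct action_entry *find_action(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < MAX_ACTIONS; i++)
		if (action_table[i].fn != NULL &&
		    strcmp(action_table[i].name, name) == 0)
			return &action_table[i];
	return NULL;
}

/* Returns 0 on success, -1 if the name is too long or taken, or the table is full. */
int mp_action_register(const char *name, mp_action_fn fn)
{
	if (fn == NULL || strlen(name) >= MAX_NAME_LEN)
		return -1;
	if (find_action(name) != NULL)
		return -1;
	for (size_t i = 0; i < MAX_ACTIONS; i++) {
		if (action_table[i].fn == NULL) {
			strcpy(action_table[i].name, name);
			action_table[i].fn = fn;
			return 0;
		}
	}
	return -1;
}

void mp_action_unregister(const char *name)
{
	struct action_entry *e = find_action(name);

	if (e != NULL)
		e->fn = NULL;
}

/* Receiver side: look up the action named in the message and run it. */
int mp_dispatch(const char *name, const void *msg, size_t len)
{
	struct action_entry *e = find_action(name);

	return e != NULL ? e->fn(msg, len) : -1;
}

/* Example handler used for illustration: reports the payload length. */
static int echo_len(const void *msg, size_t len)
{
	(void)msg;
	return (int)len;
}
```

Registering echo_len under the name "vdev" and then dispatching a two-byte message to "vdev" would return 2; after unregistering, the same dispatch returns -1.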
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Calling the rte_smp_{w/r}mb macros expands into a compound block, which
breaks compilation of an else clause following the call if the call
has already been terminated with ";".
This patch adds { } around this macro so that a following else compiles too.
Fixes: d23a6bd04d ("eal/ppc: fix memory barrier for IBM POWER")
Fixes: 05c3fd7110 ("eal/ppc: atomic operations for IBM Power")
Cc: stable@dpdk.org
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Add checks during build to ensure that all symbols in the EXPERIMENTAL
version map section have __experimental tags on their definitions, and
enable the warnings needed to announce their use. Also add an
ALLOW_EXPERIMENTAL_APIS define to allow individual libraries and files
to declare the acceptability of experimental API usage.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Append the __rte_experimental tag to API calls appearing in the
EXPERIMENTAL section of their libraries' version maps.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This commit adds a new function rte_eal_cleanup().
The function serves as a hook to allow DPDK to release
internal resources (e.g.: hugepage allocations).
This function allows DPDK to become more like an ordinary
library, where the library context itself can be initialized
and cleaned up by the application.
The rte_exit() and rte_panic() functions must be considered,
particularly whether they should call rte_eal_cleanup() to release
any resources. This patch adds the cleanup to rte_exit(),
but does not clean up on rte_panic(). The reason not to clean
up when panicking is that the developer may wish to inspect the
exact internal state of EAL and hugepages.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
This commit moves the rte_service_finalize() function
to be in the component header, and marks it as @internal.
The function is only called internally by rte_eal_cleanup().
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
At present the user-defined mempool ops name overwrites
the default mempool ops name variable in internal_config.
This patch changes the logic so that internal_config holds only
the user-defined value.
The pktmbuf_create_pool is updated to reflect the same, i.e.
use the user-defined name if present, otherwise use the default.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
This patch prefixes the mbuf pool ops name with "user" to indicate
that it is user defined.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
On x86 it is possible to use lock-prefixed instructions to get
an effect similar to mfence.
As pointed out by the Java guys, on most modern HW this gives
better performance than using mfence:
https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
This patch adopts that technique for the rte_smp_mb() implementation.
On BDW 2.2 mb_autotest on single lcore reports 2X cycle reduction,
i.e. from ~110 to ~55 cycles per operation.
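As an illustration of the technique, a full barrier built from a lock-prefixed instruction might look like the sketch below. The function name smp_mb_sketch(), the choice of a dummy "add 0" to a stack location and the -128 offset are assumptions made for this example rather than a statement of what the patch contains; non-x86-64 builds fall back to the compiler's __sync_synchronize() builtin.

```c
#include <assert.h>

/*
 * Full memory barrier sketch. On x86-64 a lock-prefixed read-modify-write
 * orders earlier loads and stores before later ones for ordinary
 * write-back memory, which is the effect wanted from mfence here.
 * Adding zero leaves the target memory unchanged; the -128(%rsp) location
 * is an assumption of this sketch, picked to stay away from hot stack data.
 */
static inline void smp_mb_sketch(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	__asm__ volatile("lock addl $0, -128(%%rsp)" ::: "memory", "cc");
#else
	__sync_synchronize();	/* fallback: compiler-provided full barrier */
#endif
}
```

A caller would place smp_mb_sketch() between a store and a later load that must not be reordered before it, for example in a flag-based handshake between two lcores.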
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
For code such as the following:
    if (condition)
            rte_smp_rmb();
    else
            rte_smp_wmb();
Without this patch, the compiler reports this error:
    error: 'else' without a previous 'if'
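The failure mode can be reproduced with any statement-like macro. The sketch below uses hypothetical macros BAD_MB() and GOOD_MB() with a counter standing in for the barrier instruction; it shows the widely used do { } while (0) idiom, which is one common way to make such a macro safe and is not necessarily the exact form chosen by this patch.

```c
#include <assert.h>

static int barrier_calls;	/* counter standing in for a real barrier */

/*
 * Problematic form: a bare compound block. "BAD_MB();" expands to
 * "{ ... };" and the stray ';' ends the if statement, so a following
 * 'else' has no 'if' to attach to. Shown for comparison only.
 */
#define BAD_MB() { barrier_calls++; }

/*
 * do { } while (0) turns the body into a single statement that
 * absorbs the trailing ';', so an unbraced if/else compiles.
 */
#define GOOD_MB() do { barrier_calls++; } while (0)

static int pick_barrier(int condition)
{
	if (condition)
		GOOD_MB();
	else
		GOOD_MB();
	return barrier_calls;
}
```

Replacing GOOD_MB() with BAD_MB() in pick_barrier() brings back the "'else' without a previous 'if'" error.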
Fixes: 84733fd0d75e ("eal/arm64: fix memory barrier definition")
Cc: stable@dpdk.org
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This commit introduces rte_cio_wmb() and rte_cio_rmb(), in order to
guarantee the ordering of coherent shared memory between the CPU and a DMA
capable device.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Currently, rte_reciprocal only supports unsigned 32-bit divisors. This
commit adds support for unsigned 64-bit divisors.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
In some use cases of integer division, denominator remains constant and
numerator varies. It is possible to optimize division for such specific
scenarios.
The librte_sched library uses rte_reciprocal to optimize division, so
moving it to eal/common allows other libraries and applications to use it.
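For readers unfamiliar with the optimization, the sketch below shows one well-known way to divide by a fixed 32-bit divisor using a precomputed multiplier and two shifts (the Granlund-Montgomery scheme). The struct and function names are hypothetical, and the code is an illustration of the general technique rather than a copy of rte_reciprocal.

```c
#include <assert.h>
#include <stdint.h>

struct reciprocal {
	uint32_t m;	/* precomputed multiplier */
	uint8_t sh1;	/* first shift */
	uint8_t sh2;	/* second shift */
};

/* Number of bits needed to represent x (0 for x == 0). */
static unsigned int bit_length(uint32_t x)
{
	unsigned int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Precompute once per divisor; d must be non-zero. */
static struct reciprocal reciprocal_value(uint32_t d)
{
	struct reciprocal r;
	unsigned int l;
	uint64_t m;

	assert(d != 0);
	l = bit_length(d - 1);	/* ceil(log2(d)) */
	m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;
	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l < 1 ? l : 1);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* a / d using one multiply, one subtract, one add and two shifts. */
static uint32_t reciprocal_divide(uint32_t a, struct reciprocal r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The precomputation costs one real division, so the approach pays off when the same denominator is reused for many numerators, for example reciprocal_divide(100, reciprocal_value(7)) yields 14.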
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
The rte_service_finalize routine checks whether the service library
is initialized. If so, it frees the internal memory for services and
lcore states. This routine is to be invoked at application
termination.
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
The __rte_cache_aligned attribute was applied to the whole array,
not the array elements. This leads to false sharing between
the monitored cores.
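The difference between aligning an array and aligning its elements can be seen in a small sketch. The type names, the 64-byte line size and the four-core array are assumptions made for illustration; the GCC/Clang aligned attribute stands in for __rte_cache_aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache-line size */
#define MAX_CORES 4	/* assumed number of monitored cores */

/* Attribute on the array: only its start is aligned, elements stay packed
 * and all four 4-byte states share one cache line. */
struct packed_states {
	uint32_t state[MAX_CORES] __attribute__((aligned(CACHE_LINE)));
};

/* Attribute on the element type: every core's state gets its own line. */
struct core_state {
	uint32_t state;
} __attribute__((aligned(CACHE_LINE)));

struct padded_states {
	struct core_state core[MAX_CORES];
};
```

With these definitions sizeof(struct packed_states) is a single cache line, while consecutive elements of padded_states are CACHE_LINE bytes apart, so a write by one core no longer invalidates the line holding another core's state.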
Fixes: e70a61ad50ab ("keepalive: export states")
Cc: stable@dpdk.org
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Remy Horton <remy.horton@intel.com>
This commit ensures that if we run out of memory
during the initialization of the service library, the
first allocated memory is correctly freed instead of leaked.
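The pattern being fixed is the usual two-step allocation with cleanup on partial failure. A minimal sketch follows, using hypothetical names (struct service_ctx, ctx_init(), ctx_fini()) and plain calloc()/free() in place of the DPDK allocator.

```c
#include <assert.h>
#include <stdlib.h>

struct service_ctx {
	int *services;
	int *lcore_states;
};

/* Returns 0 on success, -1 on failure; on failure nothing stays allocated. */
static int ctx_init(struct service_ctx *ctx, size_t n_services, size_t n_lcores)
{
	ctx->lcore_states = NULL;
	ctx->services = calloc(n_services, sizeof(*ctx->services));
	if (ctx->services == NULL)
		return -1;

	ctx->lcore_states = calloc(n_lcores, sizeof(*ctx->lcore_states));
	if (ctx->lcore_states == NULL) {
		free(ctx->services);	/* release the first allocation */
		ctx->services = NULL;
		return -1;
	}
	return 0;
}

/* Counterpart to ctx_init(): safe to call after a failed or successful init. */
static void ctx_fini(struct service_ctx *ctx)
{
	free(ctx->services);
	free(ctx->lcore_states);
	ctx->services = NULL;
	ctx->lcore_states = NULL;
}
```

Resetting the pointers to NULL after freeing keeps a later ctx_fini() call harmless, since free(NULL) is a no-op.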
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
A warning is issued when the argument to the likely() or unlikely()
builtins evaluates to a pointer value, as __builtin_expect()
expects a 'long int' type for its first argument. With this fix
a pointer value is converted to an integer with the value of 0 or 1.
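A common way to get this conversion is a double negation in the macro, sketched below for GCC and Clang. The helper value_or() is a hypothetical function added only to show a pointer being passed to likely().

```c
#include <assert.h>
#include <stddef.h>

/*
 * "!!(x)" collapses any scalar, including a pointer, to the int value
 * 0 or 1 before it reaches __builtin_expect(), which takes long arguments.
 */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Returns *p when p is non-NULL, otherwise the fallback value. */
static int value_or(const int *p, int fallback)
{
	if (likely(p))		/* pointer argument, no conversion warning */
		return *p;
	return fallback;
}
```

Because the macro result is always 0 or 1, it also compares correctly against the expected value 1 even when x is a non-zero integer other than 1.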
Signed-off-by: Aleksey Baulin <aleksey.baulin@gmail.com>
This patch provides an option to implement rte_memcpy() using the
'restrict' qualifier, which can induce GCC to optimize by using more
efficient instructions, providing some performance gain over memcpy()
on some ARM64 platforms/environments.
The memory copy performance differs between different ARM64
platforms. A more recent glibc (e.g. 2.23 or later)
can provide better memcpy() performance than older glibc
versions. It is always recommended to use a more recent glibc if
possible, since the entire system benefits from it. If for some
reason an old glibc has to be used, this patch is provided as an
alternative.
This implementation can improve memory copy on some ARM64
platforms when an old glibc (e.g. 2.19, 2.17...) is being used.
It is disabled by default and needs "RTE_ARCH_ARM64_MEMCPY"
defined to activate. It does not always provide better performance
than memcpy(), so users need to run the DPDK unit test
"memcpy_perf_autotest" and customize parameters in the "customization
section" in rte_memcpy_64.h for best performance.
Compiler version will also impact the rte_memcpy() performance.
It's observed on some platforms and with the same code, GCC 7.2.0
compiled binary can provide better performance than GCC 4.8.5. It's
suggested to use GCC 5.4.0 or later.
Signed-off-by: Herbert Guan <herbert.guan@arm.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kernels v4.4 and earlier do have vfio, but not
the noiommu mode, so the file does not exist.
Check and report errors on open/read in the noiommu check.
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add common test assertion macros for unit testing.
Replace common macros in test/test.h with the new RTE_TEST_ASSERT_* macros.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This patch fixes the following compilation errors in bsdapp:
lib/librte_eal/bsdapp/eal/eal.c:782:5:
error: no previous prototype for function 'rte_vfio_clear_group'
int rte_vfio_clear_group(int vfio_group_fd)
^
lib/librte_eal/bsdapp/eal/eal.c:782:30:
error: unused parameter 'vfio_group_fd'
int rte_vfio_clear_group(int vfio_group_fd)
^
Fixes: c564a2a20093 ("vfio: expose clear group function for internal usages")
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Set the global default loglevel to DEBUG(8) and the dynamic default
loglevel to INFO(7).
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Make the maximum number of vfio groups compile-time configurable so that
platforms can choose their own vfio group limit.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Other vfio-based modules, e.g. fslmc, will also need to use
the clear_group call.
So, expose it and rename it to *rte_vfio_clear_group*.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In some cases, one device is accessed by different processes via
different BARs, so one uio device may be opened by more than one
process. In this case we only need to enable the interrupt once, and
call pci_clear_master only when the last process closes the device.
Fixes: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in VM")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Many exported headers rely on definitions found in rte_config.h without
including it, as shown by the following command:
grep -L '^#include <rte_config.h>' -- \
$(grep -Rl \
$(sed -n '/^#define \([^ ]\+\).*$/{s//\1/;H;};${x;s/\n//;s/\n/\\|/g;p;}' \
build/include/rte_config.h) \
-- build/include/)
We cannot assume external applications will include rte_config.h on their
own, neither directly nor through a -include parameter like DPDK does
internally.
This not only causes obvious compilation failures that can be reproduced
with check-includes.sh such as:
[...]/rte_memory.h:88:43: error: ‘RTE_CACHE_LINE_SIZE’ was not declared in
this scope
#define __rte_cache_aligned __rte_aligned(RTE_CACHE_LINE_SIZE)
^
It also results in less visible issues, for instance rte_hash_crc.h relying
on RTE_ARCH_X86_64's presence to provide dedicated inline functions.
This patch partially reverts the commit below and adds missing include
lines to the remaining files.
Fixes: f1a7a5c5f404 ("remove include of generated config header")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
The 'register' keyword does nothing, and has been removed in C++17.
Remove it for compatibility, like following commit:
Fixes: 0d5f2ed12f9e ("eal: remove use of register keyword")
Signed-off-by: Avi Kivity <avi@scylladb.com>
rte_eal_check_module() might return -1, which would have been a
"not false" condition for mod_available. Fix that to only report
vfio being enabled if rte_eal_check_module() returns 1.
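The bug class is a tri-state return value being used as a boolean. The sketch below uses a hypothetical stub, check_module_stub(), in place of rte_eal_check_module(), assuming the convention implied above: 1 means loaded, 0 means not loaded and -1 means the check itself failed.

```c
#include <assert.h>

/* Hypothetical stand-in: simply returns the simulated check result. */
static int check_module_stub(int simulated_result)
{
	return simulated_result;
}

/* Before: any non-zero result, including the -1 error, counts as enabled. */
static int vfio_enabled_buggy(int simulated_result)
{
	return check_module_stub(simulated_result) != 0;
}

/* After: only an explicit 1 counts as enabled. */
static int vfio_enabled_fixed(int simulated_result)
{
	return check_module_stub(simulated_result) == 1;
}
```

With a simulated result of -1 the first variant reports the module as available while the second does not.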
Fixes: 221f7c220d6b ("vfio: move global config out of PCI files")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
In cases when alignment is bigger than boundary, we may incorrectly
calculate end of a bounded malloc element.
Consider this: suppose we are allocating a bounded malloc element
that should be of 128 bytes in size, bounded to 128 bytes and
aligned on a 256-byte boundary. Suppose our malloc element ends
at 0x140 - that is, 256 plus one cacheline.
So, right at the start, we are aligning our new_data_start to
include the required element size, and to be aligned on a specified
boundary - so new_data_start becomes 0. This fails the following
bounds check, because our element cannot go above 128 bytes from
the start, and we are at 320. So, we enter the bounds handling
branch.
While we're in there, we are aligning end_pt to our boundedness
requirement of 128 bytes, and end up with 0x100 (since 256 is
128-byte aligned). We recalculate new_data_start and it stays at
0, however our end is at 0x100, which is beyond the 128 byte
boundary, and we report inability to reserve a bounded element
when we could have.
This patch adds an end_pt recalculation after new_data_start
adjustment - we already know that size <= bound, so we can do it
safely - and we then correctly report that we can, in fact, try
using this element for bounded malloc allocation.
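The arithmetic in this example can be checked with a simplified model of the calculation. The function bounded_start() and its recalc_end switch are hypothetical and exist only to compare the behaviour with and without the extra end_pt recalculation; the real code also accounts for element headers and the element start address, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round v down to a multiple of a, where a is a power of two. */
#define ALIGN_FLOOR(v, a) ((v) & ~((uintptr_t)(a) - 1))

/*
 * Find where a size-byte object could start inside an element ending at
 * end_pt, aligned to align and not crossing a bound-byte boundary.
 * recalc_end = true applies the end_pt recalculation described above.
 */
static bool bounded_start(uintptr_t end_pt, uintptr_t size, uintptr_t align,
			  uintptr_t bound, bool recalc_end, uintptr_t *start)
{
	const uintptr_t bmask = ~(bound - 1);
	uintptr_t new_data_start = ALIGN_FLOOR(end_pt - size, align);

	if ((new_data_start & bmask) != ((end_pt - 1) & bmask)) {
		/* Object would cross a boundary: retry below the boundary. */
		end_pt = ALIGN_FLOOR(end_pt, bound);
		new_data_start = ALIGN_FLOOR(end_pt - size, align);
		if (recalc_end)
			end_pt = new_data_start + size;
		if (((end_pt - 1) & bmask) != (new_data_start & bmask))
			return false;
	}
	*start = new_data_start;
	return true;
}
```

With end_pt = 0x140, size = 128, align = 256 and bound = 128, the call fails when recalc_end is false and succeeds with a start of 0 when it is true, matching the walk-through above.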
Fixes: fafcc11985a2 ("mem: rework memzone to be allocated by malloc")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we're gathering statistics, we are traversing the freelist,
which may change under our feet in a multithreaded scenario. This
is verified by occasional segfaults when running malloc autotest
on a machine with a large number of cores.
This patch protects malloc heap stats call with a lock. It changes
its definition in the process due to locking invalidating the
const-ness, but this isn't a public API, so that's OK.
Fixes: 2a5c356e177d ("memory: stats for malloc")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We check if there's space in config after we allocated the memzone,
but if there isn't, we never free it back. This patch adds memzone
free if there's no room in memzone config.
Fixes: ff909fe21f0a ("mem: introduce memzone freeing")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
This commit adds a new attribute to the service cores attributes
API, which allows the application to retrieve the number of times
that a service-core called the service to perform its action.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit introduces a new API, allowing the application to
reset attributes of a service like the cycle count. Given this
functionality is now exposed to the user, remove the resetting
of stats during a dump() call.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit adds a new function to the service API to allow
the application to retrieve items about each individual service
in the system. A unit test checks the return values of a variety
of invalid and valid calls.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
The CPUID instruction is intercepted by the hypervisor, which can return
a flag indicating that one is running, together with its name.
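As a rough illustration of the mechanism on x86, the sketch below reads the "hypervisor present" bit (CPUID leaf 1, ECX bit 31) and then the vendor signature from leaf 0x40000000 using the GCC/Clang <cpuid.h> helpers. The function name hypervisor_name() is hypothetical and this is not the EAL implementation; on non-x86 builds the function simply reports no hypervisor.

```c
#include <assert.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/*
 * Returns 1 and fills name (at least 13 bytes) when CPUID reports a
 * hypervisor, 0 otherwise (including on non-x86 builds).
 */
static int hypervisor_name(char name[13])
{
	name[0] = '\0';
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Leaf 1, ECX bit 31: "hypervisor present" flag. */
	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) || !(ecx & (1u << 31)))
		return 0;

	/* Leaf 0x40000000: 12-byte vendor signature in EBX, ECX, EDX. */
	__cpuid(0x40000000, eax, ebx, ecx, edx);
	memcpy(name, &ebx, 4);
	memcpy(name + 4, &ecx, 4);
	memcpy(name + 8, &edx, 4);
	name[12] = '\0';
	return 1;
#else
	return 0;
#endif
}
```

The signature is a short vendor-defined string, so a caller would typically compare it against the known values of the hypervisors it cares about.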
Suggested-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Print a warning if the --base-virtaddr hint is not respected
since this might lead to problems when mapping memory in
the secondary process.
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch fixes a potential bug, which was not consistently
showing up in the unit tests. The issue was that the service-
lcore being started was not in a "WAIT" state, and hence EAL
would return -EBUSY instead of launching the lcore.
In order to ensure a core is in a launch-ready state, the application
must call rte_eal_wait_lcore, to ensure that the core has completed
its previous task, and that EAL is ready to re-launch it.
The call to rte_eal_wait_lcore() is explicitly not in the
service core function, to make it visible to the application.
Requiring an explicit function call ensures the developer sees
that a lcore could block in the rte_eal_wait_lcore() function
if the core hasn't returned from its previous function.
From a usability perspective, hiding the wait_lcore() inside
service cores would cause confusion.
This patch adds rte_eal_wait_lcore() calls to the unit tests,
to ensure that the lcores for testing functionality are ready
to run the test.
Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>