Enqueue operation must not fail. Move the corresponding debug check
from one particular case to the enqueue operation helper so that it
is done for all invocations.
Log critical message with useful information instead of rte_panic().
Make the rte_mempool_do_generic_put() implementation more readable and
fix the inconsistency where the return value is checked in one place
but not in another.
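A minimal sketch of the idea, assuming DPDK's rte_mempool_ops_enqueue_bulk()
wrapper and the MEMPOOL log type; the helper name is illustrative, not the
actual patch:

  #include <rte_branch_prediction.h>
  #include <rte_log.h>
  #include <rte_mempool.h>

  /* Check the driver enqueue result in one place and log a critical
   * message instead of panicking; enqueue is not expected to fail. */
  static inline void
  mempool_enqueue_checked(struct rte_mempool *mp, void * const *obj_table,
                          unsigned int n)
  {
          int ret = rte_mempool_ops_enqueue_bulk(mp, obj_table, n);

          if (unlikely(ret < 0))
                  RTE_LOG(CRIT, MEMPOOL,
                          "cannot enqueue %u objects to mempool %s\n",
                          n, mp->name);
  }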
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Callbacks for mempool events were registered in a process-shared tailq.
This was inherently incorrect because the same function
may be loaded to a different address in each process.
Make the tailq process-private.
Use the EAL tailq lock to reduce the number of different locks
this module operates with.
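A minimal sketch of the approach, assuming the EAL tailq lock helpers
rte_mcfg_tailq_write_lock()/unlock(); the entry layout and function name are
illustrative, not the actual implementation:

  #include <errno.h>
  #include <stdlib.h>
  #include <sys/queue.h>
  #include <rte_eal_memconfig.h>

  struct mempool_callback_entry {
          TAILQ_ENTRY(mempool_callback_entry) next;
          void *func;            /* address only valid in this process */
          void *user_data;
  };

  /* Plain static list: each process keeps its own, nothing is shared. */
  static TAILQ_HEAD(, mempool_callback_entry) callback_list =
          TAILQ_HEAD_INITIALIZER(callback_list);

  static int
  callback_register(void *func, void *user_data)
  {
          struct mempool_callback_entry *entry = malloc(sizeof(*entry));

          if (entry == NULL)
                  return -ENOMEM;
          entry->func = func;
          entry->user_data = user_data;

          rte_mcfg_tailq_write_lock();    /* reuse the EAL tailq lock */
          TAILQ_INSERT_TAIL(&callback_list, entry, next);
          rte_mcfg_tailq_write_unlock();
          return 0;
  }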
Fixes: da2b9cb25e ("mempool: add event callbacks")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.
Fix the following in rte_mempool_do_generic_get(); a simplified sketch of
the corrected flow follows the list:
1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.
This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were never fetched from the mempool cache, even when they
could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.
2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the backend.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the backend were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the backend.
The function was not behaving as described by the function using it,
nor as expected by applications calling it. This in itself is also a bug.
3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the backend (instead of only the
number of requested objects minus the objects available in the cache),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the backend fails,
only the remaining requested objects are retrieved from the backend.
Previously, the function could fail even though there were enough
objects in the cache plus the common pool.
4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.
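A simplified sketch of the corrected flow; field and helper names follow
rte_mempool.h, but this is an illustration under those assumptions, not the
actual DPDK code:

  #include <rte_branch_prediction.h>
  #include <rte_common.h>
  #include <rte_mempool.h>

  static int
  generic_get_sketch(struct rte_mempool *mp, void **obj_table,
                     unsigned int n, struct rte_mempool_cache *cache)
  {
          unsigned int i, from_cache, remaining;
          int ret;

          /* Initial screening: only requests that fit the cache array are
           * candidates, independent of the configured cache size. */
          if (cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE)
                  return rte_mempool_ops_dequeue_bulk(mp, obj_table, n);

          /* 1. Hot objects from the cache come first (in reverse order). */
          from_cache = RTE_MIN(n, cache->len);
          for (i = 0; i < from_cache; i++)
                  obj_table[i] = cache->objs[--cache->len];

          remaining = n - from_cache;
          if (likely(remaining == 0))
                  return 0;       /* likely path: served entirely from cache */

          /* 2. Only the remaining objects are fetched from the backend
           * (the real code first tries to backfill the cache and falls
           * back to this direct dequeue if the backfill fails). */
          ret = rte_mempool_ops_dequeue_bulk(mp, obj_table + from_cache,
                                             remaining);
          if (ret < 0) {
                  /* Undo: the cached objects were only read, not cleared. */
                  cache->len += from_cache;
                  return ret;
          }
          return 0;
  }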
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Memzone allocation has been allowed in secondary processes for 10 years.
Now it's time to update the documentation accordingly.
At the same time, fix the mempool, mbuf and ring documentation, since
these libraries rely on memzones internally.
Bugzilla ID: 1074
Fixes: 916e4f4f4e ("memory: fix for multi process support")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX defines are unused
since xmem API removal.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The replacement RTE_MEMPOOL_REGISTER_OPS() should be used instead.
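A hedged usage sketch of the renamed registration macro; the ops callbacks
below are empty placeholders for illustration only:

  #include <errno.h>
  #include <rte_mempool.h>

  static int example_alloc(struct rte_mempool *mp) { (void)mp; return 0; }
  static void example_free(struct rte_mempool *mp) { (void)mp; }
  static int example_enqueue(struct rte_mempool *mp, void * const *obj_table,
                             unsigned int n)
  { (void)mp; (void)obj_table; (void)n; return 0; }
  static int example_dequeue(struct rte_mempool *mp, void **obj_table,
                             unsigned int n)
  { (void)mp; (void)obj_table; (void)n; return -ENOBUFS; }
  static unsigned int example_get_count(const struct rte_mempool *mp)
  { (void)mp; return 0; }

  static const struct rte_mempool_ops example_ops = {
          .name = "example",
          .alloc = example_alloc,
          .free = example_free,
          .enqueue = example_enqueue,
          .dequeue = example_dequeue,
          .get_count = example_get_count,
  };

  /* Previously: MEMPOOL_REGISTER_OPS(example_ops); now deprecated. */
  RTE_MEMPOOL_REGISTER_OPS(example_ops);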
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
MEMPOOL_HEADER_SIZE() is removed. The replacement with the RTE_ prefix
is internal only, since it is an implementation detail that is not
required in applications.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Make rte_driver opaque for non-internal users.
This will make extending this object possible without breaking the ABI.
Introduce a new driver header and move rte_driver definition.
Update drivers and library to use the internal header.
Some applications may have been dereferencing rte_driver objects, so
mark this object's accessors as stable.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Those macros have no real value and are easily replaced with a simple
if() block.
Existing users have been converted using a new cocci script.
Deprecate them.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Start a new release cycle with empty release notes.
The ABI version becomes 23.0.
The map files are updated to the new ABI major number (23).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Special handling of removed drivers is also dropped in check-abi.sh and
a note has been added in libabigail.abignore as a reminder.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
These functions all behave like libc free() and do
nothing if handed a NULL pointer. The code already does
this; this patch just documents the behavior.
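For example (rte_mempool_free() shown; the same applies to the other
documented free-like functions):

  #include <rte_mempool.h>

  static void
  cleanup(struct rte_mempool *mp)
  {
          /* Safe even if mp was never allocated: NULL is a no-op,
           * mirroring libc free(). */
          rte_mempool_free(mp);
  }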
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When a mempool had been created with the RTE_MEMPOOL_F_NO_IOVA_CONTIG
flag but was later populated with valid IOVA, RTE_MEMPOOL_F_NON_IO was
unset, while it should have been kept. The unit test did not catch this
because the rte_mempool_populate_default() it used was populating with
RTE_BAD_IOVA.
Keep setting RTE_MEMPOOL_F_NON_IO at empty mempool creation and add an
assert for it in the unit test (remove the separate case).
Do not reset the flag if RTE_MEMPOOL_F_NO_IOVA_CONTIG is set.
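A simplified sketch of the check described above (hedged; not the literal
unit-test code):

  #include <rte_memory.h>
  #include <rte_mempool.h>

  static int
  check_non_io_kept(void)
  {
          struct rte_mempool *mp;
          int ok;

          mp = rte_mempool_create_empty("test_non_io", 64, 64, 0, 0,
                                        SOCKET_ID_ANY,
                                        RTE_MEMPOOL_F_NO_IOVA_CONTIG);
          if (mp == NULL)
                  return -1;
          if (rte_mempool_populate_default(mp) < 0) {
                  rte_mempool_free(mp);
                  return -1;
          }
          /* The flag must survive population, even with valid IOVA. */
          ok = (mp->flags & RTE_MEMPOOL_F_NON_IO) != 0;
          rte_mempool_free(mp);
          return ok ? 0 : -1;
  }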
Fixes: 11541c5c81 ("mempool: add non-IO flag")
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
As reported by Dmitry, RTE_MEMPOOL_F_POOL_CREATED is a flag only
manipulated internally.
This flag is not supposed to be requested by an application and would
probably result in incorrect behavior if an application did pass it.
At least one other internal flag has been added recently and more may be
introduced later.
Rework the check and export a mask of valid user flags for use in the
unit test.
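A sketch of the reworked check, assuming the exported mask is named
RTE_MEMPOOL_VALID_USER_FLAGS (hedged; the exact identifier may differ):

  #include <errno.h>
  #include <rte_mempool.h>

  static int
  validate_user_flags(unsigned int flags)
  {
          /* Reject anything outside the user-settable mask, so internal
           * flags such as RTE_MEMPOOL_F_POOL_CREATED cannot be requested. */
          if (flags & ~RTE_MEMPOOL_VALID_USER_FLAGS)
                  return -EINVAL;
          return 0;
  }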
Fixes: b240af8b10 ("mempool: enforce valid flags at creation")
Reported-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX are not used.
Fixes: fd943c764a ("mempool: deprecate xmem functions")
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to macro used to register mempool driver.
The old one is still available but deprecated.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to helper macro to calculate mempool header size and
make it internal. Old macro is still available, but deprecated.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add RTE_ prefix to internal API defined in public header.
Use the prefix instead of double underscore.
Use uppercase for macros in the case of name conflict.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Fix the mempool flags namespace by adding an RTE_ prefix to the names.
The old flags remain usable, to be deprecated in the future.
The MEMPOOL_F_NON_IO flag added in this release is simply renamed to
have the RTE_ prefix.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Move the documentation onto a separate line just before each define,
in preparation for slightly longer flag names due to the namespace prefix.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
A mempool is a generic allocator that is not necessarily used
for device IO operations, nor is its memory necessarily used for DMA.
Add MEMPOOL_F_NON_IO flag to mark such mempools automatically
a) if their objects are not contiguous;
b) if IOVA is not available for any object.
Other components can inspect this flag
in order to optimize their memory management.
Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html
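A sketch of how a consumer might inspect the flag (flag name as introduced
here, before the later RTE_ prefix rename; the helper is illustrative):

  #include <rte_mempool.h>

  static int
  mempool_needs_dma_mapping(const struct rte_mempool *mp)
  {
          /* Non-IO mempools never feed device IO, so DMA mapping of
           * their memory can be skipped. */
          return (mp->flags & MEMPOOL_F_NON_IO) == 0;
  }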
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often, mbuf memory comes from mempools that can come and
go. A PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after forwarding
starts but before an mbuf from a new mempool is sent to the device.
Add an API to register callback for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
The new API is internal, because it is primarily demanded by PMDs that
may need to deal with any mempools and do not control their creation,
while an application, on the other hand, knows which mempools it creates
and doesn't care about internal mempools PMDs might create.
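A hedged usage sketch; the callback signature is assumed from the
register/unregister functions and event names listed above:

  #include <stdio.h>
  #include <rte_mempool.h>

  static void
  mempool_event_cb(enum rte_mempool_event event, struct rte_mempool *mp,
                   void *user_data)
  {
          (void)user_data;
          if (event == RTE_MEMPOOL_EVENT_READY)
                  printf("mempool %s populated\n", mp->name);
          else if (event == RTE_MEMPOOL_EVENT_DESTROY)
                  printf("mempool %s about to be freed\n", mp->name);
  }

  /* Called e.g. at device start, before the first mbuf reaches the PMD. */
  static int
  track_mempools(void)
  {
          return rte_mempool_event_callback_register(mempool_event_cb, NULL);
  }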
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
If we do not enforce that valid flags are passed by an application, the
application might face issues in the future when more flags are added.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Use the correct define as the name array size.
The change breaks ABI and therefore cannot be backported to
stable branches.
Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Currently there are some public headers that include 'sys/queue.h', which
is not POSIX, but usually provided by the Linux/BSD system library.
(Not in POSIX.1, POSIX.1-2001, or POSIX.1-2008. Present on the BSDs.)
The file is missing on Windows. During the Windows build, DPDK uses a
bundled copy, so building a DPDK library works fine. But when OVS or
other applications use DPDK as a library on Windows, the inclusion of
'sys/queue.h' by some DPDK public headers triggers an error because no
such file exists.
One solution is to install 'lib/eal/windows/include/sys/queue.h' into
the Windows environment, such as [1]. However, this means DPDK exports
the functionality of 'sys/queue.h' into the environment, which might
cause symbol, macro, or header clashes with other applications.
The patch fixes it by removing "#include <sys/queue.h>" from DPDK
public headers, so programs including DPDK headers do not depend on the
system to provide 'sys/queue.h'. Where these public headers use macros
such as TAILQ_xxx, we replace them with RTE_-prefixed ones.
For Windows, we copy the definitions from <sys/queue.h> to rte_os.h
in Windows EAL. Note that these RTE_ macros are compatible with
<sys/queue.h>, both at the level of API (to use with <sys/queue.h>
macros in C files) and ABI (to avoid breaking it).
Additionally, TAILQ_FOREACH_SAFE is not part of <sys/queue.h>; the
patch replaces it with RTE_TAILQ_FOREACH_SAFE.
[1] http://mails.dpdk.org/archives/dev/2021-August/216304.html
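Example of the RTE_-prefixed replacement for the non-portable
TAILQ_FOREACH_SAFE (safe removal while iterating; types are illustrative):

  #include <stdlib.h>
  #include <sys/queue.h>
  #include <rte_tailq.h>

  struct entry {
          TAILQ_ENTRY(entry) next;
  };
  TAILQ_HEAD(entry_list, entry);

  static void
  drain(struct entry_list *list)
  {
          struct entry *e, *tmp;

          /* Previously: TAILQ_FOREACH_SAFE(e, list, next, tmp) */
          RTE_TAILQ_FOREACH_SAFE(e, list, next, tmp) {
                  TAILQ_REMOVE(list, e, next);
                  free(e);
          }
  }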
Suggested-by: Nick Connolly <nick.connolly@mayadata.io>
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Start a new release cycle with empty release notes.
The ABI version becomes 22.0.
The map files are updated to the new ABI major number (22).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
If a cache is enabled, objects are retrieved from / put into the cache
first, and subsequently from/to the common pool. Currently the debug
stats count the objects retrieved from / put into the cache and the
pool together; it is better to distinguish them.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
There is no reason for the DPDK libraries to all have a 'librte_' prefix
on the directory names. This prefix makes the directory names longer and
also makes it awkward to add features referring to individual libraries
in the build (should the lib names be specified with or without the
prefix?). Therefore, we can just remove the library prefix and use the
library's unique name as the directory name, i.e. 'eal' rather than
'librte_eal'.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>