Following the same model as rte_mempool_obj_iter(), introduce
rte_mempool_mem_iter() to iterate over the memory chunks attached
to the mempool.
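As an illustration only, a chunk iterator of this kind could look like the
sketch below. The structure and function names (struct mem_chunk, struct pool,
pool_mem_iter) are hypothetical stand-ins, not the actual mempool definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real mempool structures. */
struct mem_chunk {
	struct mem_chunk *next; /* next chunk attached to the pool */
	void *addr;             /* virtual address of the chunk */
	size_t len;             /* length of the chunk in bytes */
};

struct pool {
	struct mem_chunk *chunks; /* head of the chunk list */
};

/* Callback invoked once per memory chunk. */
typedef void (*mem_cb_t)(struct pool *p, void *opaque,
			 struct mem_chunk *chunk, unsigned int idx);

/* Call cb on every chunk and return the number of chunks visited. */
static unsigned int
pool_mem_iter(struct pool *p, mem_cb_t cb, void *opaque)
{
	struct mem_chunk *c;
	unsigned int n = 0;

	for (c = p->chunks; c != NULL; c = c->next)
		cb(p, opaque, c, n++);
	return n;
}

/* Example callback: accumulate the total length of all chunks. */
static void
sum_len_cb(struct pool *p, void *opaque, struct mem_chunk *chunk,
	   unsigned int idx)
{
	(void)p;
	(void)idx;
	*(size_t *)opaque += chunk->len;
}
```

A caller would pass sum_len_cb and a pointer to a size_t accumulator to
pool_mem_iter() to obtain the total memory attached to the pool.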
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Do not use paddr table to store the mempool memory chunks.
This will allow several chunks with different virtual addresses.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit removes MEMPOOL_IS_CONTIG().
The next commits will change the behavior of the mempool library so that
the objects will never be allocated in the same memzone as the mempool
header. Therefore, there is no reason to keep this macro that would
always return 0.
This macro was only used in app/test.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Store the physical address of the object in its header. It simplifies
rte_mempool_virt2phy() and prepares the removal of the paddr[] table
in the mempool header.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This makes the code of rte_mempool_create() clearer, and it will make
the introduction of external mempool handler easier (in another patch
series). Indeed, this function contains the specific part when a ring is
used, but it could be replaced by something else in the future.
This commit also adds a socket_id field in the mempool structure that
is used by this new function.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Before this patch, the mempool elements were initialized at the time
they were added to the mempool. This patch changes this to do the
initialization of all objects once the mempool is populated, using
rte_mempool_obj_iter() introduced in previous commits.
Thanks to this modification, we are getting closer to a new API
that would allow us to do:
mempool_init()
mempool_populate(mem1)
mempool_populate(mem2)
mempool_populate(mem3)
mempool_init_obj()
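The sequence above can be sketched as follows. This is a minimal model under
stated assumptions, not the mempool implementation: struct pool, struct
obj_hdr, pool_init(), pool_populate() and pool_obj_iter() are hypothetical
names, the memory passed to pool_populate() is assumed to be pointer-aligned,
and elt_size is assumed to be a multiple of the pointer size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-object header; the object data follows it directly. */
struct obj_hdr {
	struct obj_hdr *next; /* next object in the pool's list */
};

struct pool {
	struct obj_hdr *head;  /* first object */
	struct obj_hdr **tail; /* where to link the next object */
	size_t elt_size;       /* object size, excluding the header */
	unsigned int count;    /* number of objects in the pool */
};

static void
pool_init(struct pool *p, size_t elt_size)
{
	p->head = NULL;
	p->tail = &p->head;
	p->elt_size = elt_size;
	p->count = 0;
}

/* Carve as many (header + object) slots as fit in mem; return how many. */
static unsigned int
pool_populate(struct pool *p, void *mem, size_t len)
{
	size_t slot = sizeof(struct obj_hdr) + p->elt_size;
	unsigned int n = 0;
	char *cur = mem;

	for (; len >= slot; len -= slot, cur += slot, n++) {
		struct obj_hdr *h = (struct obj_hdr *)(void *)cur;

		h->next = NULL;
		*p->tail = h;
		p->tail = &h->next;
	}
	p->count += n;
	return n;
}

typedef void (*obj_cb_t)(struct pool *p, void *opaque, void *obj,
			 unsigned int idx);

/* Call cb on every object once the pool is populated. */
static unsigned int
pool_obj_iter(struct pool *p, obj_cb_t cb, void *opaque)
{
	struct obj_hdr *h;
	unsigned int n = 0;

	for (h = p->head; h != NULL; h = h->next)
		cb(p, opaque, h + 1, n++);
	return n;
}

/* Example object initializer: zero the object and count it. */
static void
zero_obj_cb(struct pool *p, void *opaque, void *obj, unsigned int idx)
{
	(void)idx;
	memset(obj, 0, p->elt_size);
	(*(unsigned int *)opaque)++;
}
```

Because initialization is a separate pass, the same pool can be populated from
several unrelated memory areas before any object callback runs.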
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Use the new rte_mempool_obj_iter() instead of the old implementation
to iterate over the objects and audit them (check for cookies).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Now that the mempool objects are chained into a list, we can use it to
browse them. This implies a rework of the rte_mempool_obj_iter() API, which
no longer needs to take as many arguments as before. The previous function
is kept as a private function, and renamed in this commit. It will be
removed in a later commit of the patch series.
The only internal users of this function are the Mellanox drivers. The
code is updated accordingly.
Introducing an API compatibility for this function has been considered,
but it is not easy to do without keeping the old code, as the previous
function could also be used to browse elements that were not added in a
mempool. Moreover, the API has already been broken by other patches in this
version.
The library version was already updated in
commit 213af31e09 ("mempool: reduce structure size if no cache needed")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
In the next commits, we will use an iterator to walk through the objects in
mempool in rte_mempool_audit(). This iterator takes a "struct
rte_mempool *" as a parameter because it is assumed that the callback
function can modify the mempool.
The previous approach was to introduce a RTE_DECONST() macro, but
after discussion it seems that removing the const qualifier is better
to avoid fooling the compiler, and also because these functions are
not used in datapath (possible compiler optimizations due to const
are not critical).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit removes the const qualifier for the mempool in
rte_mempool_walk() callback prototype.
Indeed, most operations that can be done on a mempool require a non-const
mempool pointer, except the dump and the audit. Therefore,
rte_mempool_walk() is more useful if the mempool pointer is not const.
This is required by the next commit, where the Mellanox drivers use
rte_mempool_walk() to iterate the mempools, then rte_mempool_obj_iter()
to iterate the objects in each mempool.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Introduce a list entry in the object header so that objects can be listed
and browsed. The objective is to provide a simpler way to browse the
elements of a mempool.
The next commits will update rte_mempool_obj_iter() to use this list,
and remove the previous complex implementation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit renames rte_mempool_obj_ctor_t to rte_mempool_obj_cb_t.
In the next commits, we will add the ability to populate the
mempool and iterate through objects using the same function.
We will use the same callback type for that. As the callback is
not a constructor anymore, rename it to rte_mempool_obj_cb_t.
The rte_mempool_obj_iter_t that was used to iterate over objects
will be removed in the next commits.
No functional change.
In this commit, the API is preserved through a compat typedef.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Since commits d2e0ca22f and 97e7e685b the headers and trailers
of the mempool are defined as a structure. We can get their
size using a sizeof instead of doing a calculation that will
become wrong at the first structure update.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
There's no reason to keep this function inlined. Move it to
rte_mempool.c. The function must be exported for builds that combine
shared libraries and debug. We also need to keep the macro,
because we don't want to call an empty function when debug is
disabled.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit replaces elt_size by total_elt_size when appropriate.
In some mempool functions, we use the size of the elements as arguments
or variables. It is not always clear whether this size includes the
header and trailer.
To avoid this confusion:
- update the API documentation
- rename the variables and argument names as "elt_size" when the size
does not include the header and trailer, or else as "total_elt_size".
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
No functional change, just fix some comments and styling issues.
Also avoid duplicating comments between rte_mempool_create()
and rte_mempool_xmem_create().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The rte_mempool structure is changed, which will cause an ABI change
for this structure. Providing backward compat is not reasonable
here as this structure is used in multiple defines/inlines.
Allow mempool cache support to be dynamic, depending on whether the
mempool being created needs cache support. This saves about 1.5M of
memory used by the rte_mempool structure.
Allocating small mempools which do not require a cache can consume
large amounts of memory if you have a number of these mempools.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The macro RTE_VERIFY always checks a condition.
It is optimized with "unlikely" hint.
While this macro is well suited for test applications, libraries and
examples should preferably enable such checks only in debug mode.
That's why the macro RTE_ASSERT is introduced to call RTE_VERIFY only
if built with debug logs enabled.
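The relationship between the two macros can be sketched as below. This is an
assumption-laden illustration, not the DPDK definitions: MY_VERIFY, MY_ASSERT,
MY_UNLIKELY and the MY_DEBUG flag are hypothetical names, the failure path
simply prints and aborts, and __builtin_expect is a GCC/clang builtin.

```c
#include <stdio.h>
#include <stdlib.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Always checks the condition; aborts with a message when it fails. */
#define MY_VERIFY(exp) do {                                          \
	if (MY_UNLIKELY(!(exp))) {                                   \
		fprintf(stderr, "line %d\tassert \"%s\" failed\n",   \
			__LINE__, #exp);                             \
		abort();                                             \
	}                                                            \
} while (0)

/* Checks only in debug builds; otherwise expands to nothing. */
#ifdef MY_DEBUG
#define MY_ASSERT(exp) MY_VERIFY(exp)
#else
#define MY_ASSERT(exp) do { } while (0)
#endif
```

Note that in a non-debug build the expression given to MY_ASSERT() is not
evaluated at all, so it must not carry side effects the code depends on.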
A lot of assert macros were duplicated and enabled with a specific flag.
Removing these #ifdefs makes it easier to test these code branches
and avoids dead code pitfalls.
The ENA_ASSERT is kept (in debug mode only) because it has more
parameters to log.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commits ff909fe21f and 4e32101f9b, it is now possible to free
memzones and rings.
The rte_mempool_create() should be modified to take advantage of this
and not leak memory when an allocation fails.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
Remove the parentheses around returned logical expressions, as in:
"return (logical expressions)"
Remove the parentheses around returned function calls, as in:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
The function rte_mempool_obj_iter used in mlx drivers
was not exported. So the driver loading was failing:
EAL: open shared lib librte_pmd_mlx4.so
EAL: x86_64-native-linuxapp-gcc/lib/librte_pmd_mlx4.so:
undefined symbol: rte_mempool_obj_iter
Fixes: 9d41beed24 ("lib: provide initial versioning")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
There is a new function in the EAL API for internal use.
It has neither a proper prefix nor a .map export:
libethdev.so: undefined reference to `is_xen_dom0_supported'
Fixes: 719dbebceb ("xen: allow determining DOM0 at runtime")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Optimize for quad-channel by default. This should work well in all
cases, and in any case better than the previous value of one.
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: David Marchand <david.marchand@6wind.com>
When DPDK is compiled as part of a C++ project using g++, an
'invalid conversion from' error appears. Add an explicit
typecast on the function return value to get rid of the error.
Fixes: 6cf14ce4ce ("mempool: silence warning on pointer arithmetic")
Signed-off-by: Sergey Balabanov <balabanovsv@ecotelecom.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
__mempool_get_trailer() calculated the header's address.
The trailer is located right after the element area.
This patch fixes the calculation.
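The intended layout is header, then element, then trailer. A minimal sketch of
the two address calculations, using hypothetical structure and function names
rather than the actual mempool ones:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header and trailer surrounding each object. */
struct obj_hdr {
	void *pool;      /* owning pool */
};

struct obj_tlr {
	uint64_t cookie; /* debug cookie */
};

/* The header sits immediately before the object. */
static inline void *
obj_get_header(void *obj)
{
	return (char *)obj - sizeof(struct obj_hdr);
}

/* The trailer sits immediately after the element area of elt_size bytes. */
static inline void *
obj_get_trailer(void *obj, size_t elt_size)
{
	return (char *)obj + elt_size;
}
```

The bug described above amounts to the trailer accessor using the header
formula, so that the trailer cookie was read from the wrong place.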
Fixes: 97e7e685bf ("mempool: add structure for object trailers")
Signed-off-by: Yuichi Nakai <xoxyuxu@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.
Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.
This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
which makes it possible to free memzones.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
On TILE-Gx and TILE-Mx platforms, the buffers fed into the hardware
buffer manager require a 128-byte alignment. With this change, we
allow a configuration-based override of the element alignment, and
default to RTE_CACHE_LINE_SIZE if left unspecified.
Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Signed-off-by: Zhigang Lu <zlu@ezchip.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Translating from a mempool object to the mempool pointer does not break
alignment constraints. However, the compiler is unaware of this fact and
complains on -Wcast-align. This patch modifies the code to use RTE_PTR_SUB(),
thereby silencing the compiler by casting through (void *).
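A sketch of the idea, assuming a PTR_SUB() helper modelled on RTE_PTR_SUB()
and hypothetical struct pool / struct obj_hdr types. Going through an integer
and then (void *) means the compiler never sees a direct cast from a
less-aligned pointer type to a more-aligned one.

```c
#include <stdint.h>

/* Subtract a byte offset from a pointer, yielding a void pointer. */
#define PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))

/* Hypothetical pool and object header types. */
struct pool {
	unsigned int size;
};

struct obj_hdr {
	struct pool *pool; /* pool this object belongs to */
};

/* Retrieve the owning pool from an object pointer. */
static inline struct pool *
pool_from_obj(void *obj)
{
	/* void * converts implicitly, so -Wcast-align has nothing to flag. */
	struct obj_hdr *hdr = PTR_SUB(obj, sizeof(struct obj_hdr));

	return hdr->pool;
}
```

This only silences the warning; it remains the caller's responsibility that
the object really is preceded by a properly aligned header.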
Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Added Doxygen @param for missing API parameter in
rte_mempool_obj_iter(), to fix Doxygen warning. Also added
minor grammar fixes to that function documentation.
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Each object stored in a mempool is suffixed by a trailer, which stores
a cookie in debug mode that helps to detect memory corruption.
Like for headers, introduce a structure that materializes the content of
this trailer.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Each object stored in a mempool is prefixed by a header, allowing for
instance to retrieve the mempool pointer from the object. When debug is
enabled, a cookie is also added in this header that helps to detect
corruptions and double-frees.
Introduce a structure that materializes the content of this header,
and will simplify future patches adding things in this header.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
In rte_mempool_obj_iter(), when element boundary coincides with page boundary,
even if a single page is required per object, a loop checks that the next page
is contiguous and drops the first one otherwise.
This commit checks subsequent pages only when several are required per object.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
rte_mempool_xmem_usage()'s return type is ssize_t which has the same
architecture-dependent width as size_t but is signed.
On 64-bit architectures, returning a negated uint32_t value without casting
it to ssize_t first does not work as intended: the sign bit is lost and the
returned value is garbage.
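The effect can be reproduced in isolation. In the sketch below, the function
names are hypothetical; only the conversion behaviour is the point. Negating a
uint32_t yields another unsigned 32-bit value, which is then zero-extended
when widened to a 64-bit ssize_t.

```c
#include <stdint.h>
#include <sys/types.h> /* ssize_t (POSIX) */

/*
 * Buggy variant: -n is computed as an unsigned 32-bit value (2^32 - n),
 * so with a 64-bit ssize_t the result is a large positive number.
 */
static ssize_t
neg_count_buggy(uint32_t n)
{
	return -n;
}

/* Fixed variant: widen to the signed type first, then negate. */
static ssize_t
neg_count_fixed(uint32_t n)
{
	return -(ssize_t)n;
}
```

With a 32-bit ssize_t both variants typically return the same value, which is
why the problem only showed up on 64-bit architectures.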
This commit fixes an assertion failure in testpmd on 64 bit architectures
when combining --no-huge and --mp-anon outside of Xen Dom0:
PANIC in mempool_anon_create():
line 170 assert "elt_num == mp->size" failed
Fixes: 148f963fb5 ("xen: core library changes")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Otherwise cache_flushthresh can be bigger than n, and
a consumer can starve others by keeping every element
either in use or in the cache.
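A check of this kind could be sketched as follows. The names, the maximum
cache size, and the 1.5 multiplier used to derive the flush threshold are
assumptions made for the example, not values taken from this commit.

```c
#include <errno.h>

#define CACHE_MAX_SIZE 512u /* assumed upper bound on the cache size */

/* Assumed flush threshold: 1.5 times the cache size. */
static unsigned int
cache_flushthresh(unsigned int cache_size)
{
	return cache_size + cache_size / 2;
}

/*
 * Reject a cache size whose flush threshold exceeds the number of
 * elements n: such a cache could hold every element of the pool.
 */
static int
check_cache_size(unsigned int n, unsigned int cache_size)
{
	if (cache_size > CACHE_MAX_SIZE || cache_flushthresh(cache_size) > n)
		return -EINVAL;
	return 0;
}
```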
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Many places just protect against concurrent access, and I cannot see what
is gained by having those macros.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
error: format ‘%p’ expects argument of type ‘void *’,
but argument 5 has type ‘const struct rte_mempool *’ [-Werror=format=]
The type of mp is (const struct rte_mempool *) and must be cast to a simpler
type to be printed.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
For non-EAL threads, bypass the per-lcore cache and use the ring pool directly.
This allows rte_mempool to be used from either an EAL thread or any user pthread.
Non-EAL threads rely directly on rte_ring, which is non-preemptive.
Running multiple pthreads per CPU that compete for the same rte_mempool is
therefore not recommended: performance will be poor, and there is a critical
risk if the scheduling policy is RT.
No significant performance decrease was found with mempool_perf_test.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
In C++11 concatenated string literals need to have a space in between.
Found with clang++-3.4, IIRC g++-4.8 also complains about this.
Sample error message:
error: invalid suffix on literal; C++11 requires a space between literal
and identifier [-Wreserved-user-defined-literal]
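As an example of the pattern, consider a format string built with a macro
such as PRIu64 (chosen here as an assumption; the commit does not name the
macros involved). Written as "size=%"PRIu64, C++11 parses PRIu64 as a literal
suffix; adding spaces keeps the header usable from both C and C++.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a size into buf. The spaces around PRIu64 are what C++11
 * requires; plain C accepts the string with or without them.
 */
static int
format_size(char *buf, size_t len, uint64_t size)
{
	return snprintf(buf, len, "size=%" PRIu64, size);
}
```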
Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
To differentiate libraries that break ABI, we add a library version number
suffix to the library, which must be incremented when a given library's ABI is
broken. This patch enforces that addition, sets the initial ABI soname
extension to 1 for each library and creates a symlink to the base SONAME so that
the test applications will link properly.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Add linker version script files to each DPDK library to put a stake in the
ground from which we can start cleaning up APIs.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Check the FILE *f and rte_mempool *mp pointers for NULL.
Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]
Remove n_orig variable as it is not required.
Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
When RTE_LIBRTE_MEMPOOL_DEBUG is enabled, compiling with clang fails
because the ifdefed code includes push/pop pragmas.
Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since the data structures such as rings are shared in their entirety,
those TAILQ pointers are shared as well. This means that, after a
successful rte_ring creation, the tailq_next pointer of the last
ring in the TAILQ will be updated with a pointer to a ring which may
not be present in the address space of another process (i.e. a ring
that may be host-local or guest-local, and not shared over IVSHMEM).
Any successive ring create/lookup on the other side of IVSHMEM will
result in trying to dereference an invalid pointer.
This patchset fixes this problem by creating a default tailq entry
that may be used by any data structure that chooses to use TAILQs.
This default TAILQ entry will consist of tailq_next/tailq_prev
pointers, and an opaque pointer to arbitrary data. All TAILQ
pointers from data structures themselves will be removed and
replaced by those generic TAILQ entries, thus fixing the problem
of potentially exposing local address space to shared structures.
Technically, only the rte_ring structure requires modification, because
IVSHMEM is only using memzones (which aren't in TAILQs) and rings,
but for consistency's sake other TAILQ-based data structures were
adapted as well.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.
Leave the tests for the deprecated function in place.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Add function to iterate over mempool.
Useful for diagnostic code that wants to look at mempool usage patterns.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The DPDK dump functions are useful for remote debugging of an
application. But when the application runs as a daemon, stdout
is typically routed to /dev/null.
Change all these functions to take a stdio FILE * handle
instead. An application can then use open_memstream() to capture
the output.
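A minimal sketch of that usage, assuming a POSIX.1-2008 system. pool_dump()
is a hypothetical stand-in for a dump function following the new FILE *
convention; open_memstream(), fprintf(), fclose() and free() are the standard
calls.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for open_memstream() */
#endif
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical dump function taking the output stream as a parameter. */
static void
pool_dump(FILE *f, const char *name, unsigned int count)
{
	fprintf(f, "mempool <%s>\n  count=%u\n", name, count);
}

/*
 * Capture the dump into a malloc'd, NUL-terminated string.
 * Returns NULL on failure; the caller frees the result.
 */
static char *
capture_dump(const char *name, unsigned int count)
{
	char *buf = NULL;
	size_t len = 0;
	FILE *f = open_memstream(&buf, &len);

	if (f == NULL)
		return NULL;
	pool_dump(f, name, count);
	/* The buffer is only guaranteed to be up to date after fclose(). */
	if (fclose(f) != 0) {
		free(buf);
		return NULL;
	}
	return buf;
}
```

A daemon could then send the captured string over its own control channel
instead of relying on stdout.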
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The second condition of this logical OR:
(get_gcd(new_obj_size, nrank * nchan) != 1 ||
get_gcd(nchan, new_obj_size) != 1)
is redundant with the first condition.
We can show that the first condition is equivalent to its disjunction
with the second condition using these two results:
- R1: For all conditions A and B, if B implies A, then (A || B) is
equivalent to A.
- R2: (get_gcd(nchan, new_obj_size) != 1) implies
(get_gcd(new_obj_size, nrank * nchan) != 1)
We can show R1 with the following truth table (0 is false, 1 is true):
+-----+-----++----------+-----+-------------+
|  A  |  B  || (A || B) |  A  | B implies A |
+-----+-----++----------+-----+-------------+
|  0  |  0  ||    0     |  0  |      1      |
|  0  |  1  ||    1     |  0  |      0      |
|  1  |  0  ||    1     |  1  |      1      |
|  1  |  1  ||    1     |  1  |      1      |
+-----+-----++----------+-----+-------------+
Truth table of (A || B) and A
We can show R2 by looking at the code of optimize_object_size and
get_gcd.
We see that:
- S1: (nchan >= 1) and (nrank >= 1).
- S2: get_gcd returns 0 only when both arguments are 0.
Let:
- X be get_gcd(new_obj_size, nrank * nchan).
- Y be get_gcd(nchan, new_obj_size).
Suppose:
- H1: get_gcd returns the greatest common divisor of its arguments.
- H2: (nrank * nchan) does not exceed UINT_MAX.
We prove (Y != 1) implies (X != 1) with the following steps:
- Suppose L0: (Y != 1). We have to show (X != 1).
- By H1, Y is the greatest common divisor of nchan and new_obj_size.
In particular, we have L1: Y divides nchan and new_obj_size.
- By H2, we have L2: nchan divides (nrank * nchan)
- By L1 and L2, we have L3: Y divides (nrank * nchan) and
new_obj_size.
- By H1 and L3, we have L4: (Y <= X).
- By S1 and S2, we have L5: (Y != 0).
- By L0 and L5, we have L6: (Y > 1).
- By L4 and L6, we have (X > 1) and thus (X != 1), which concludes.
R2 was also tested for all values of new_obj_size, nrank, and nchan
between 0 and 2000.
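An exhaustive check of this kind can be sketched as below. gcd_u() is a plain
Euclid implementation written for this example, not the library's get_gcd();
like it, it returns 0 only when both arguments are 0. The limit is assumed
small enough that (nrank * nchan) does not overflow (H2).

```c
/* Euclid's algorithm; returns 0 only when both arguments are 0. */
static unsigned int
gcd_u(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Count the (size, nrank, nchan) triples in [0, limit] that contradict
 * R2, i.e. gcd(nchan, size) != 1 while gcd(size, nrank * nchan) == 1.
 */
static unsigned long
count_r2_violations(unsigned int limit)
{
	unsigned int size, nrank, nchan;
	unsigned long bad = 0;

	for (size = 0; size <= limit; size++)
		for (nrank = 0; nrank <= limit; nrank++)
			for (nchan = 0; nchan <= limit; nchan++)
				if (gcd_u(nchan, size) != 1 &&
				    gcd_u(size, nrank * nchan) == 1)
					bad++;
	return bad;
}
```

The function is expected to return 0 for any limit satisfying H2; the running
time grows with the cube of the limit.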
This redundant condition was found using TrustInSoft Analyzer.
Signed-off-by: Julien Cretin <julien.cretin@trust-in-soft.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The include file should not change the GCC compile options for
the whole file being compiled, but only for the one inline function
that needs it. Using the push_options/pop_options fixes this.
Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In --no-huge mode, mempool provides objects with their associated
header/trailer fitting in a standard page (usually 4KB).
This means that all non-UIO drivers should work correctly in this mode;
UIO drivers do not, since they allocate ring sizes that cannot fit in a page.
Extend rte_mempool_virt2phy to obtain the correct physical address when
elements of the pool are not on the same physically contiguous memory region.
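The lookup can be sketched as a per-page table of physical addresses indexed
by the offset of the element within the pool's virtual area. The structure and
field names below are hypothetical, and the page size is assumed to be a power
of two.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the memory backing a pool. */
struct pool_mem {
	uintptr_t va_start;    /* virtual address of the first page */
	unsigned int pg_shift; /* log2 of the page size */
	unsigned int pg_num;   /* number of pages */
	const uint64_t *paddr; /* physical address of each page */
};

/*
 * Translate a virtual address inside the pool into a physical address,
 * without assuming that consecutive pages are physically contiguous.
 */
static uint64_t
pool_virt2phy(const struct pool_mem *m, const void *va)
{
	uintptr_t off = (uintptr_t)va - m->va_start;
	size_t page = off >> m->pg_shift;
	uintptr_t in_page = off & (((uintptr_t)1 << m->pg_shift) - 1);

	return m->paddr[page] + in_page;
}
```

Since each object and its header/trailer fit in one page, a single table
lookup per object is sufficient.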
The reason for this patch is to be able to run on a kernel < 2.6.37 without
the need to patch it, since all earlier kernels are either buggy or don't
have huge page support at all (< 2.6.28).
Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Core support for using the Intel DPDK with Xen Dom0 - including EAL
changes and mempool changes. These changes encompass how memory mapping
is done, including support for initializing a memory pool inside an
already-allocated block of memory.
KNI sample app updated to use KNI close function when used with Xen.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Cleanup mempool and memzone object names so that we can more easily rename them
from headers.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>