Use rte_ring_xxx_elem_xxx APIs to replace legacy API implementation.
This reduces code duplication and improves code maintenance.
Tests done on Arm, x86 [1] and PPC [2] do not indicate performance
degradation.
[1] https://mails.dpdk.org/archives/dev/2020-July/173780.html
[2] https://mails.dpdk.org/archives/dev/2020-July/173863.html
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Remove the experimental tag for rte_ring_xxx_elem APIs that have been
around for 2 releases.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Remove the experimental tag for rte_ring_reset API that have been around
for 4 releases.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Testing if the ring is empty is as simple as comparing the producer and
consumer pointers.
In theory, this optimization reduces the number of potential cache misses
from 3 to 2 by not having to read r->mask in rte_ring_count().
The modification of this function were also discussed in the RFC here:
https://mails.dpdk.org/archives/dev/2020-April/165752.html
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Fix coding style violations that checkpatch will complain about.
Add missing "int" after "unsigned".
Add missing spaces around "+=" and "+".
Remove superfluous type cast of numerical constant.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
For rings with producer/consumer in RTE_RING_SYNC_ST, RTE_RING_SYNC_MT_HTS
mode, provide an ability to split enqueue/dequeue operation
into two phases:
- enqueue/dequeue start
- enqueue/dequeue finish
That allows user to inspect objects in the ring without removing
them from it (aka MT safe peek).
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce head/tail sync mode for MT ring synchronization.
In that mode enqueue/dequeue operation is fully serialized:
only one thread at a time is allowed to perform given op.
Suppose to reduce stall times in case when ring is used on
overcommitted cpus (multiple active threads on the same cpu).
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Introduce relaxed tail sync (RTS) mode for MT ring synchronization.
Aim to reduce stall times in case when ring is used on
overcommited cpus (multiple active threads on the same cpu).
The main difference from original MP/MC algorithm is that
tail value is increased not by every thread that finished enqueue/dequeue,
but only by the last one.
That allows threads to avoid spinning on ring tail value,
leaving actual tail value change to the last thread in the update queue.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
To make these preparations two main things are done:
- Change from *single* to *sync_type* to allow different
synchronisation schemes to be applied.
Mark *single* as deprecated in comments.
Add new functions to allow user to query ring sync types.
Replace direct access to *single* with appropriate function call.
- Move actual rte_ring and related structures definitions into a
separate file: <rte_ring_core.h>. It allows to refer contents
of <rte_ring_elem.h> from <rte_ring.h> without introducing a
circular dependency.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Current APIs assume ring elements to be pointers. However, in many
use cases, the size can be different. Add new APIs to support
configurable ring element sizes.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Currently, the flush is done by dequeuing the ring in a while loop. It is
much simpler to flush the queue by resetting the head and tail indices.
Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
As memzone.h is introduced by
commit 38c9817ee1 ("mempool: adjust name size in related data types"),
forward declaration for rte_memzone is no longer needed.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
'/**<' style comments apply to the previous member, which caused doxygen to
emit the RTE_RING_NAMESIZE documentation for RTE_RING_MZ_PREFIX.
Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")
Cc: stable@dpdk.org
Signed-off-by: Gage Eads <gage.eads@intel.com>
Keep only single config option RTE_USE_C11_MEM_MODEL for C11 memory
model, so all modules can leverage C11 atomic extension by enable this
option.
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
rte_ring implementation is not preemptible only under certain
circumstances. This clarification is helpful for data plane and
control plane communication using rte_ring.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
On gcc 5.4.0 / native aarch64 from Ubuntu 16.04:
In function '__rte_ring_do_dequeue':
rte_ring.h: 385:35: warning:
conversion to 'int' from 'unsigned int' may change
the sign of the result [-Wsign-conversion]
n = __rte_ring_move_cons_head(r, is_sc, n, behavior,
^
Fixes: e8ed5056c8 ("ring: remove signed type flip-flopping")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
GCC 8.1 warns:
rte_ring.h:350:46:
warning: conversion to 'uint32_t' {aka 'unsigned int'}
from 'int' may change the sign of the result
[-Wsign-conversion]
update_tail(&r->prod, prod_head, prod_next, is_sp, 1);
The visible apis take unsigned int, then call a private
api taking an int, which finally calls an api taking an
unsigned int.
Convert the private api to take unsigned int removing
5 x warning similar to that shown above.
Fixes: 0dfc98c507 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The initial objective of
commit d9f0d3a1ff ("ring: remove split cacheline build setting")
was to add an empty cache line between the producer and consumer
data (on platform with cache line size = 64B), preventing from
having them on adjacent cache lines.
Following discussion on the mailing list, it appears that this
also imposes an alignment constraint that is not required.
This patch removes the extra alignment constraint and adds the
empty cache lines using padding fields in the structure. The
size of rte_ring structure and the offset of the fields remain
the same on platforms with cache line size = 64B:
rte_ring = 384
rte_ring.name = 0
rte_ring.flags = 32
rte_ring.memzone = 40
rte_ring.size = 48
rte_ring.mask = 52
rte_ring.prod = 128
rte_ring.cons = 256
But it has an impact on platform where cache line size is 128B:
rte_ring = 384 -> 768
rte_ring.name = 0
rte_ring.flags = 32
rte_ring.memzone = 40
rte_ring.size = 48
rte_ring.mask = 52
rte_ring.prod = 128 -> 256
rte_ring.cons = 256 -> 512
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This patch is to support C11 memory model barrier in librte_ring.
There are 2 barrier implementation options in librte_ring (suggested
by Jerin).
1. use rte_smp_rmb
2. use load_acquire/store_release(refer to [1]).
The reason why providing 2 options is the performance benchmark
difference in different arm machines, refer to [2].
CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "n"
on any architectures and only "y" on arm64 so far.
[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
[2] http://dpdk.org/ml/archives/dev/2017-October/080861.html
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Move the common part of rte_ring.h into rte_ring_generic.h.
Move the memory barrier part into update_tail().
No functional changes here.
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Many exported headers rely on definitions found in rte_config.h without
including it, as shown by the following command:
grep -L '^#include <rte_config.h>' -- \
$(grep -Rl \
$(sed -n '/^#define \([^ ]\+\).*$/{s//\1/;H;};${x;s/\n//;s/\n/\\|/g;p;}' \
build/include/rte_config.h) \
-- build/include/)
We cannot assume external applications will include rte_config.h on their
own, neither directly nor through a -include parameter like DPDK does
internally.
This not only causes obvious compilation failures that can be reproduced
with check-includes.sh such as:
[...]/rte_memory.h:88:43: error: ‘RTE_CACHE_LINE_SIZE’ was not declared in
this scope
#define __rte_cache_aligned __rte_aligned(RTE_CACHE_LINE_SIZE)
^
It also results in less visible issues, for instance rte_hash_crc.h relying
on RTE_ARCH_X86_64's presence to provide dedicated inline functions.
This patch partially reverts the commit below and adds missing include
lines to the remaining files.
Fixes: f1a7a5c5f4 ("remove include of generated config header")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
We watched a rte panic of mbuf_autotest in our qualcomm arm64 server
(Amberwing).
Root cause:
In __rte_ring_move_cons_head()
...
do {
/* Restore n as it may change every loop */
n = max;
*old_head = r->cons.head; //1st load
const uint32_t prod_tail = r->prod.tail; //2nd load
In weak memory order architectures (powerpc,arm), the 2nd load might be
reodered before the 1st load, that makes *entries is bigger than we wanted.
This nasty reording messed enque/deque up.
cpu1(producer) cpu2(consumer) cpu3(consumer)
load r->prod.tail
in enqueue:
load r->cons.tail
load r->prod.head
store r->prod.tail
load r->cons.head
load r->prod.tail
...
store r->cons.{head,tail}
load r->cons.head
Then, r->cons.head will be bigger than prod_tail, then make *entries very
big and the consumer will go forward incorrectly.
After this patch, the old cons.head will be recaculated after failure of
rte_atomic32_cmpset
There is no such issue on X86, because X86 is strong memory order model.
But rte_smp_rmb() doesn't have impact on runtime performance on X86, so
keep the same code without architectures specific concerns.
Fixes: 50d7690548 ("ring: add burst API")
Cc: stable@dpdk.org
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Signed-off-by: Jie Liu <jie2.liu@hxt-semitech.com>
Signed-off-by: Bing Zhao <bing.zhao@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
There is no reason to prevent ring from being larger than 0x0FFFFFFF.
Increase the maximum size to 0x7FFFFFFF, which is the maximum possible
without changing the code and the structure definition (size is stored
on a uint32_t).
Link: http://dpdk.org/ml/archives/dev/2017-September/074701.html
Suggested-by: Venkatesh Nuthula <venki497@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
The rte_rings traditionally have only supported having ring sizes as powers
of 2, with the actual usable space being the size - 1. In some cases, for
example, with an eventdev where we want to precisely control queue depths
for latency, we need to allow ring sizes which are not powers of two so we
add in an additional ring capacity value to allow that. For existing rings,
this value will be size-1, i.e. the same as the mask, but if the new
EXACT_SZ flag is passed on ring creation, the ring will have exactly the
usable space requested, although the underlying memory size may be bigger.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Remove rte_pause() definition from rte_common.h and
switchover to architecture specific rte_pause.h
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The error return code for rte_ring_sc_dequeue_bulk() and
rte_ring_mc_dequeue_bulk() function should be -ENOENT rather
than -ENOBUFS as described in the function description.
Fixes: cfa7c9e6fc ("ring: make bulk and burst return values consistent")
Cc: stable@dpdk.org
Signed-off-by: Anand B Jyoti <anand.b.jyoti@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Different drivers use internal macros like force_inline for compiler
always inline feature.
Standardizing it through __rte_always_inline macro.
Verified the change by comparing the output binary file.
No difference found in the output binary file with this change.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The error return code for rte_ring_dequeue() function should be -ENOENT
rather than -ENOBUFS (which is the error value from the enqueue() fn).
Fixes: cfa7c9e6fc ("ring: make bulk and burst return values consistent")
Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
build error:
include/rte_ring.h:459:22: error: invalid conversion from ‘void*’
to ‘void**’ [-fpermissive]
ENQUEUE_PTRS(r, &r[1], prod_head, obj_table, n, void *);
Implicit casts of void* to void** are considered warnings in some
compilers. E.g. g++ version 5.8. Cast directly to object types
Fixes: a6619414 ("ring: make struct and macros type agnostic")
Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
build error:
In file included from .../lib/librte_ring/rte_ring.c(90):
.../lib/librte_ring/rte_ring.h(162):
error #1366: a reduction in alignment without the "packed" attribute
is ignored
} __rte_cache_aligned;
^
Alignment attribute moved to first element of the struct
Fixes: a6619414e0 ("ring: make struct and macros type agnostic")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Modify the enqueue and dequeue macros to support copying any type of
object by passing in the exact object type. Rather than using the "ring"
structure member of rte_ring, which is of type "array of void *", instead
have the macros take the start of the ring a a pointer value, thereby
leaving the rte_ring structure as purely a header value. This allows it
to be reused by other future ring types which can add on extra fields if
they want, or even to have the actual ring elements, of whatever type
stored separate from the ring header.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Both producer and consumer use the same logic for updating the tail
index so merge into a single function.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
We can write a single common function for head manipulation for enq
and a common one for deq, allowing us to have a single worker function
for enq and deq, rather than two of each. Update all other inline
functions to use the new functions.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The local variable i is only used for loop control so define it in
the enqueue and dequeue blocks directly, rather than at the function
level.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add an extra parameter to the ring dequeue burst/bulk functions so that
those functions can optionally return the amount of remaining objs in the
ring. This information can be used by applications in a number of ways,
for instance, with single-consumer queues, it provides a max
dequeue size which is guaranteed to work.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Add an extra parameter to the ring enqueue burst/bulk functions so that
those functions can optionally return the amount of free space in the
ring. This information can be used by applications in a number of ways,
for instance, with single-producer queues, it provides a max
enqueue size which is guaranteed to work. It can also be used to
implement watermark functionality in apps, replacing the older
functionality with a more flexible version, which enables apps to
implement multiple watermark thresholds, rather than just one.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The bulk fns for rings returns 0 for all elements enqueued and negative
for no space. Change that to make them consistent with the burst functions
in returning the number of elements enqueued/dequeued, i.e. 0 or N.
This change also allows the return value from enq/deq to be used directly
without a branch for error checking.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Remove the watermark support. A future commit will add support for having
enqueue functions return the amount of free space in the ring, which will
allow applications to implement their own watermark checks, while also
being more useful to the app.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
There was a compile time setting to enable a ring to yield when
it entered a loop in mp or mc rings waiting for the tail pointer update.
Build time settings are not recommended for enabling/disabling features,
and since this was off by default, remove it completely. If needed, a
runtime enabled equivalent can be used.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The debug option only provided statistics to the user, most of
which could be tracked by the application itself. Remove this as a
compile time option, and feature, simplifying the code.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The size and mask fields are duplicated in both the producer and
consumer data structures. Move them out of that into the top level
structure so they are not duplicated.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
create a common structure to hold the metadata for the producer and
the consumer, since both need essentially the same information - the
head and tail values, the ring size and mask.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Users compiling DPDK should not need to know or care about the arrangement
of cachelines in the rte_ring structure. Therefore just remove the build
option and set the structures to be always split. On platforms with 64B
cachelines, for improved performance use 128B rather than 64B alignment
since it stops the producer and consumer data being on adjacent cachelines.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Applications and other libraries should not be reading inside the
rte_ring structure directly to get the ring size. Instead add a fn
to allow it to be queried.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Previous patch updated the functions without updating all the comments.
Fixes: 591a9d7985 ("add FILE argument to debug functions")
Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: John McNamara <john.mcnamara@intel.com>
Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.
The extension keyword is used whenever the C99 syntax cannot do it.
This commit prevents the following errors:
error: ISO C forbids zero-size array `[...]'
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Use of rte_smb_wmb() instead of rte_smb_rmb() in sc dequeue function
creates the additional overhead of waiting for all the STOREs
to be completed to local buffer from ring buffer memory.
The sc dequeue function demands only LOAD-STORE barrier where LOADs
from ring buffer memory needs to be completed before tail pointer update.
Changing to rte_smb_rmb() to enable the required LOAD-STORE barrier.
Fixes: ecc7d10e44 ("ring: guarantee dequeue ordering before tail update")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>