numam-dpdk/lib/stack/rte_stack.c

198 lines
3.9 KiB
C
Raw Normal View History

/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(c) 2019 Intel Corporation
*/
#include <string.h>
#include <rte_string_fns.h>
#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_eal_memconfig.h>
#include <rte_errno.h>
#include <rte_malloc.h>
#include <rte_memzone.h>
#include <rte_rwlock.h>
#include <rte_tailq.h>
#include "rte_stack.h"
#include "stack_pvt.h"
TAILQ_HEAD(rte_stack_list, rte_tailq_entry);
static struct rte_tailq_elem rte_stack_tailq = {
.name = RTE_TAILQ_STACK_NAME,
};
EAL_REGISTER_TAILQ(rte_stack_tailq)
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
static void
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
rte_stack_init(struct rte_stack *s, unsigned int count, uint32_t flags)
{
memset(s, 0, sizeof(*s));
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
if (flags & RTE_STACK_F_LF)
rte_stack_lf_init(s, count);
else
rte_stack_std_init(s);
}
static ssize_t
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
rte_stack_get_memsize(unsigned int count, uint32_t flags)
{
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
if (flags & RTE_STACK_F_LF)
return rte_stack_lf_get_memsize(count);
else
return rte_stack_std_get_memsize(count);
}
struct rte_stack *
rte_stack_create(const char *name, unsigned int count, int socket_id,
uint32_t flags)
{
char mz_name[RTE_MEMZONE_NAMESIZE];
struct rte_stack_list *stack_list;
const struct rte_memzone *mz;
struct rte_tailq_entry *te;
struct rte_stack *s;
unsigned int sz;
int ret;
if (flags & ~(RTE_STACK_F_LF)) {
STACK_LOG_ERR("Unsupported stack flags %#x\n", flags);
return NULL;
}
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
#ifdef RTE_ARCH_64
RTE_BUILD_BUG_ON(sizeof(struct rte_stack_lf_head) != 16);
stack: allow lock-free only on relevant architectures Since commit 7911ba0473e0 ("stack: enable lock-free implementation for aarch64"), lock-free stack is supported on arm64 but this description was missing from the doxygen for the flag. Currently it is impossible to detect programmatically whether lock-free implementation of rte_stack is supported. One could check whether the header guard for lock-free stubs is defined (_RTE_STACK_LF_STUBS_H_) but that's an unstable implementation detail. Because of that currently all lock-free ring creations silently succeed (as long as the stack header is 16B long) which later leads to push and pop operations being NOPs. The observable effect is that stack_lf_autotest fails on platforms not supporting the lock-free. Instead it should just skip the lock-free test altogether. This commit adds a new errno value (ENOTSUP) that may be returned by rte_stack_create() to indicate that a given combination of flags is not supported on a current platform. This is detected by checking a compile-time flag in the include logic in rte_stack_lf.h which may be used by applications to check the lock-free support at compile time. Use the added RTE_STACK_LF_SUPPORTED flag to disable the lock-free stack tests at the compile time. Perf test doesn't fail because rte_ring_create() succeeds, however marking this test as skipped gives a better indication of what actually was tested. Fixes: 7911ba0473e0 ("stack: enable lock-free implementation for aarch64") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-04-12 08:28:59 +00:00
#endif
#if !defined(RTE_STACK_LF_SUPPORTED)
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
if (flags & RTE_STACK_F_LF) {
STACK_LOG_ERR("Lock-free stack is not supported on your platform\n");
stack: allow lock-free only on relevant architectures Since commit 7911ba0473e0 ("stack: enable lock-free implementation for aarch64"), lock-free stack is supported on arm64 but this description was missing from the doxygen for the flag. Currently it is impossible to detect programmatically whether lock-free implementation of rte_stack is supported. One could check whether the header guard for lock-free stubs is defined (_RTE_STACK_LF_STUBS_H_) but that's an unstable implementation detail. Because of that currently all lock-free ring creations silently succeed (as long as the stack header is 16B long) which later leads to push and pop operations being NOPs. The observable effect is that stack_lf_autotest fails on platforms not supporting the lock-free. Instead it should just skip the lock-free test altogether. This commit adds a new errno value (ENOTSUP) that may be returned by rte_stack_create() to indicate that a given combination of flags is not supported on a current platform. This is detected by checking a compile-time flag in the include logic in rte_stack_lf.h which may be used by applications to check the lock-free support at compile time. Use the added RTE_STACK_LF_SUPPORTED flag to disable the lock-free stack tests at the compile time. Perf test doesn't fail because rte_ring_create() succeeds, however marking this test as skipped gives a better indication of what actually was tested. Fixes: 7911ba0473e0 ("stack: enable lock-free implementation for aarch64") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-04-12 08:28:59 +00:00
rte_errno = ENOTSUP;
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
return NULL;
}
#endif
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
sz = rte_stack_get_memsize(count, flags);
ret = snprintf(mz_name, sizeof(mz_name), "%s%s",
RTE_STACK_MZ_PREFIX, name);
if (ret < 0 || ret >= (int)sizeof(mz_name)) {
rte_errno = ENAMETOOLONG;
return NULL;
}
te = rte_zmalloc("STACK_TAILQ_ENTRY", sizeof(*te), 0);
if (te == NULL) {
STACK_LOG_ERR("Cannot reserve memory for tailq\n");
rte_errno = ENOMEM;
return NULL;
}
rte_mcfg_tailq_write_lock();
mz = rte_memzone_reserve_aligned(mz_name, sz, socket_id,
0, __alignof__(*s));
if (mz == NULL) {
STACK_LOG_ERR("Cannot reserve stack memzone!\n");
rte_mcfg_tailq_write_unlock();
rte_free(te);
return NULL;
}
s = mz->addr;
stack: add lock-free implementation This commit adds support for a lock-free (linked list based) stack to the stack API. This behavior is selected through a new rte_stack_create() flag, RTE_STACK_F_LF. The stack consists of a linked list of elements, each containing a data pointer and a next pointer, and an atomic stack depth counter. The lock-free push operation enqueues a linked list of pointers by pointing the tail of the list to the current stack head, and using a CAS to swing the stack head pointer to the head of the list. The operation retries if it is unsuccessful (i.e. the list changed between reading the head and modifying it), else it adjusts the stack length and returns. The lock-free pop operation first reserves num elements by adjusting the stack length, to ensure the dequeue operation will succeed without blocking. It then dequeues pointers by walking the list -- starting from the head -- then swinging the head pointer (using a CAS as well). While walking the list, the data pointers are recorded in an object table. This algorithm stack uses a 128-bit compare-and-swap instruction, which atomically updates the stack top pointer and a modification counter, to protect against the ABA problem. The linked list elements themselves are maintained in a lock-free LIFO list, and are allocated before stack pushes and freed after stack pops. Since the stack has a fixed maximum depth, these elements do not need to be dynamically created. Signed-off-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-04-03 23:20:17 +00:00
rte_stack_init(s, count, flags);
/* Store the name for later lookups */
ret = strlcpy(s->name, name, sizeof(s->name));
if (ret < 0 || ret >= (int)sizeof(s->name)) {
rte_mcfg_tailq_write_unlock();
rte_errno = ENAMETOOLONG;
rte_free(te);
rte_memzone_free(mz);
return NULL;
}
s->memzone = mz;
s->capacity = count;
s->flags = flags;
te->data = s;
stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list);
TAILQ_INSERT_TAIL(stack_list, te, next);
rte_mcfg_tailq_write_unlock();
return s;
}
void
rte_stack_free(struct rte_stack *s)
{
struct rte_stack_list *stack_list;
struct rte_tailq_entry *te;
if (s == NULL)
return;
stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list);
rte_mcfg_tailq_write_lock();
/* find out tailq entry */
TAILQ_FOREACH(te, stack_list, next) {
if (te->data == s)
break;
}
if (te == NULL) {
rte_mcfg_tailq_write_unlock();
return;
}
TAILQ_REMOVE(stack_list, te, next);
rte_mcfg_tailq_write_unlock();
rte_free(te);
rte_memzone_free(s->memzone);
}
struct rte_stack *
rte_stack_lookup(const char *name)
{
struct rte_stack_list *stack_list;
struct rte_tailq_entry *te;
struct rte_stack *r = NULL;
if (name == NULL) {
rte_errno = EINVAL;
return NULL;
}
stack_list = RTE_TAILQ_CAST(rte_stack_tailq.head, rte_stack_list);
rte_mcfg_tailq_read_lock();
TAILQ_FOREACH(te, stack_list, next) {
r = (struct rte_stack *) te->data;
if (strncmp(name, r->name, RTE_STACK_NAMESIZE) == 0)
break;
}
rte_mcfg_tailq_read_unlock();
if (te == NULL) {
rte_errno = ENOENT;
return NULL;
}
return r;
}
log: register with standardized names Let's try to enforce the convention where most drivers use a pmd. logtype with their class reflected in it, and libraries use a lib. logtype. Introduce two new macros: - RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is used in a component. It is associated to the default name provided by the build system, - RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used, and then the passed name is appended to the default name, RTE_LOG_REGISTER is left untouched for existing external users and for components that do not comply with the convention. There is a new Meson variable log_prefix to adapt the default name for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes. Note: achieved with below commands + reverted change on net/bonding + edits on crypto/virtio, compress/mlx5, regex/mlx5 $ git grep -l RTE_LOG_REGISTER drivers/ | while read file; do pattern=${file##drivers/}; class=${pattern%%/*}; pattern=${pattern#$class/}; drv=${pattern%%/*}; case "$class" in baseband) pattern=pmd.bb.$drv;; bus) pattern=bus.$drv;; mempool) pattern=mempool.$drv;; *) pattern=pmd.$class.$drv;; esac sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file; done $ git grep -l RTE_LOG_REGISTER lib/ | while read file; do pattern=${file##lib/}; pattern=lib.${pattern%%/*}; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file; sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file; done Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-26 12:51:08 +00:00
RTE_LOG_REGISTER_DEFAULT(stack_logtype, NOTICE);