numam-dpdk

Author	SHA1	Message	Date
Ciara Power	1e6a661302	acl: check max SIMD bitwidth When choosing a vector path to take, an extra condition must be satisfied to ensure the max SIMD bitwidth allows for the CPU enabled path. These checks are added in the check alg helper functions. Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-19 16:45:02 +02:00
Konstantin Ananyev	6fba1c8ba0	acl: optimize AVX512 classify with 4 bytes loads With the current ACL implementation, the first field in the rule definition always has to be one byte long. For optimising the classify implementation, though, it might be useful to do 4B reads (as we do for the rest of the fields). So at build phase, check user-provided field definitions to determine whether it is safe to do 4B loads for the first ACL field. Then at run-time this information can be used to choose classify behavior. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:01 +02:00
Konstantin Ananyev	867d0d3649	acl: select 256-bit AVX512 classify method by default On supported platforms, set RTE_ACL_CLASSIFY_AVX512X16 as default ACL classify algorithm. Note that AVX512X16 implementation uses 256-bit registers/intrinsics only to avoid possibility of frequency drop. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:01 +02:00
Konstantin Ananyev	7c6cca6b60	acl: add infrastructure for AVX512 classify methods Add necessary changes to support new AVX512 specific ACL classify algorithm: - changes in meson.build to check that build tools (compiler, assembler, etc.) do properly support AVX512. - run-time checks to make sure target platform does support AVX512. - dummy rte_acl_classify_avx512() for targets where AVX512 implementation couldn't be properly supported. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	0cea36d689	acl: rework classify method selection Right now the ACL library determines the best possible (default) classify method on a given platform with a special constructor function, rte_acl_init(). This patch makes the following changes: - Move selection of default classify method into a separate private function and call it for each ACL context creation (rte_acl_create()). - Remove library constructor function - Make rte_acl_set_ctx_classify() check that the requested algorithm is supported on the given platform. The purpose of these changes is to improve and simplify the algorithm selection process and prepare the ACL library to be integrated with the max SIMD bitwidth series in discussion. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2020-10-14 14:23:00 +02:00
Konstantin Ananyev	85348c3e7d	acl: fix x86 build for compiler without AVX2 Right now we define dummy version of rte_acl_classify_avx2() when both X86 and AVX2 are not detected, though it should be for non-AVX2 case only. Fixes: e53ce4e41379 ("acl: remove use of weak functions") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2020-10-14 14:23:00 +02:00
Ruifeng Wang	e9b9739264	config: remap flags used for Arm platforms RTE_ARCH_xx flags are used to distinguish platform architectures. These flags can be used to pick different code paths for different architectures at compile time. For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM, RTE_ARCH_ARMv7 and RTE_ARCH_ARM64. RTE_ARCH_ARM64 is for 64-bit aarch64 platforms, and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms. RTE_ARCH_ARMv7 is for ARMv7 platforms, as its name suggests. The issue is that the meaning of RTE_ARCH_ARM is not clear enough, because no info about platform word length is included in the name. To make the flag names clearer, a naming scheme is proposed. RTE_ARCH_ARM (all Arm platforms) | +----RTE_ARCH_32 (New. 32-bit platforms of all architectures) | | | +----RTE_ARCH_ARMv7 (ARMv7 platforms) | | | +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine) | +----RTE_ARCH_64 (64-bit platforms of all architectures) | +----RTE_ARCH_ARM64 (64-bit Arm platforms) RTE_ARCH_32 will be explicitly defined for 32-bit platforms. To fit into the new naming scheme, current usage of RTE_ARCH_ARM in project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32). Matching flags for other architectures are: RTE_ARCH_X86 | +----RTE_ARCH_32 | | | +----RTE_ARCH_I686 | | | +----RTE_ARCH_X86_X32 | +----RTE_ARCH_64 | +----RTE_ARCH_X86_64 RTE_ARCH_PPC_64 ---- RTE_ARCH_64 Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com>	2020-10-13 16:35:48 +02:00
David Marchand	8ac3591694	remove useless include of EAL memory config header Restrict this header inclusion to its real users. Fixes: 028669bc9f0d ("eal: hide shared memory config") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-10-09 10:22:24 +02:00
Anatoly Burakov	028669bc9f	eal: hide shared memory config Now that everything that has ever accessed the shared memory config is doing so through the public API's, we can make it internal. Since we're removing quite a few headers from rte_eal_memconfig.h, we need to add them back in places where this header is used. This bumps the ABI, so also change all build files and make update documentation. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-06 10:32:34 +02:00
Anatoly Burakov	a36f5ce06e	eal: add API to lock/unlock tailq list Currently, locking/unlocking the TAILQ list requires direct access to the shared memory config. Add an API to do the same, and search-and-replace all usages. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>	2019-07-05 22:13:23 +02:00
Bruce Richardson	e53ce4e413	acl: remove use of weak functions Weak functions don't work well with static libraries and require the use of "whole-archive" flag to ensure that the correct function is used when linking. Since the weak functions are only used as placeholders within this library alone, we can replace them with non-weak functions using preprocessor ifdefs. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-06-05 16:28:11 +02:00
Bruce Richardson	6723c0fc72	replace snprintf with strlcpy Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the rte_string_fns.h header if needed. The function changes in this patch were auto-generated via command: spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place and then the files edited using awk to add in the missing header: gawk -i inplace '/include <rte_/ && ! seen { \ print "#include <rte_string_fns.h>"; seen=1} {print}' Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-04-04 22:46:05 +02:00
Keith Wiles	81bede55e3	eal: add macro for attribute weak eal: add shorthand __rte_weak macro qat: update code to use __rte_weak macro avf: update code to use __rte_weak macro fm10k: update code to use __rte_weak macro i40e: update code to use __rte_weak macro ixgbe: update code to use __rte_weak macro mlx5: update code to use __rte_weak macro virtio: update code to use __rte_weak macro acl: update code to use __rte_weak macro bpf: update code to use __rte_weak macro Signed-off-by: Keith Wiles <keith.wiles@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2018-10-25 02:11:23 +02:00
Bruce Richardson	369991d997	lib: use SPDX tag for Intel copyright files Replace the BSD license header with the SPDX tag for files with only an Intel copyright on them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2018-01-04 22:41:39 +01:00
Thomas Monjalon	17715a5339	use macro to declare constructor functions It is easier to find all constructor functions when they use the same macros RTE_INIT or RTE_INIT_PRIO. The macro definitions are moved from rte_eal.h to rte_common.h. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2017-11-06 21:56:07 +01:00
Michał Mirosław	c6c7a8d7e6	acl: allow zero verdict This enables ACL matches to return 0 where the distinction from no-match case is not needed. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2017-01-30 11:08:47 +01:00
Gowrishankar Muthukrishnan	1d73135f9f	acl: add AltiVec for ppc64 This patch adds port for ACL library in ppc64le. Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com> Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2016-09-09 17:56:14 +02:00
Jianbo Liu	68b67f9724	acl/arm: enable acl for ARMv7 Implement vqtbl1q_u8 intrinsic function, which is not supported in armv7-a. Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2015-12-08 03:00:42 +01:00
Jerin Jacob	34fa6c27c1	acl: add NEON optimization for ARMv8 The implementation uses NEON gcc intrinsic. Verified with testacl and acl_autotest applications on arm64 architecture. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-11-18 22:44:01 +01:00
Thomas Monjalon	0b6fbe8749	acl: remove old API The functions and structures are moved to app/test in order to keep existing unit tests. Some minor changes were done in these functions because of library scope restrictions. An enum is also copied in two other applications to keep existing code. The library version is incremented. Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-09-03 19:22:48 +02:00
Konstantin Ananyev	12c4e86969	acl: remove redundant macro Use global RTE_LEN2MASK macro, instead of local LEN2MASK. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 17:59:18 +02:00
David Marchand	a2348166ea	tailq: move to dynamic tailq Use dynamic tailq rather than static entries. Signed-off-by: David Marchand <david.marchand@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-03-10 12:06:08 +01:00
Konstantin Ananyev	62945e029e	acl: introduce config parameter for performance/space trade-off If at build phase we don't make any trie splitting, then temporary build structures and resulting RT structure might be much bigger than current. On the other hand, having just one trie instead of multiple can speed up search quite significantly. From my measurements on rule-sets with ~10K rules: RT table up to 8 times bigger, classify() up to 80% faster than current implementation. To make it possible for the user to decide about performance/space trade-off - new parameter for build config structure (max_size) is introduced. Setting it to a value greater than zero instructs rte_acl_build() to: - make sure that size of RT table wouldn't exceed given value. - attempt to minimise number of tries in the table. Setting it to zero maintains current behaviour. That introduces a minor change in the public API, but I think the possible performance gain is too big to ignore it. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:26 +01:00
Konstantin Ananyev	5dd71363bf	acl: add AVX2 classify method Introduce new classify() method that uses AVX2 instructions. From my measurements: On HSW boards when processing >= 16 packets per call, AVX2 method outperforms its SSE counterpart by 10-25% (depending on the ruleset). When built with compilers that don't support AVX2 instructions, make rte_acl_classify_avx2() do nothing and return an error. At runtime, if librte_acl was built with a compiler that supports AVX2, this method is selected as the default one on HW that supports AVX2. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	3858b90d82	acl: deduplicate a bit of RT code Move common check for input parameters up into rte_acl_classify_alg(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Sergio Gonzalez Monroy	fdf20fa7be	add prefix to cache line macros CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and conflicts with DPDK macro version. Adding RTE_ prefix to avoid conflicts. CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed. Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com> [Thomas: updated on HEAD, including PPC]	2014-11-27 16:21:11 +01:00
Thomas Monjalon	7eef9194ab	acl: fix comments typos Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>	2014-11-14 17:23:50 +01:00
Konstantin Ananyev	074f54ad03	acl: fix build and runtime for default target Make ACL library build/work on 'default' architecture: - make rte_acl_classify_scalar really scalar (make sure it wouldn't use sse4 intrinsics through resolve_priority()). - Provide two versions of rte_acl_classify code path: rte_acl_classify_sse() - could be built and used only on systems with sse4.2 and upper, return -ENOTSUP on lower arch. rte_acl_classify_scalar() - a slower version, but could be built and used on all systems. - Addition of a new function rte_acl_classify_alg. This function lets you specify an enum value to override the ACL context's default algorithm when doing a classification. This allows an application to specify a classification algorithm without needing to publicize each method. I know there was concern over keeping those methods public, but we don't have a static ABI at the moment, so this seems to me a reasonable thing to do, as it gives us less of an ABI surface to worry about. - keep common code shared between these two codepaths. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2014-09-03 03:26:50 +02:00
Anatoly Burakov	8d8d88cbd9	acl: make tailq fully local Since the data structures such as rings are shared in their entirety, those TAILQ pointers are shared as well. Meaning that, after a successful rte_ring creation, the tailq_next pointer of the last ring in the TAILQ will be updated with a pointer to a ring which may not be present in the address space of another process (i.e. a ring that may be host-local or guest-local, and not shared over IVSHMEM). Any successive ring create/lookup on the other side of IVSHMEM will result in trying to dereference an invalid pointer. This patchset fixes this problem by creating a default tailq entry that may be used by any data structure that chooses to use TAILQs. This default TAILQ entry will consist of a tailq_next/tailq_prev pointers, and an opaque pointer to arbitrary data. All TAILQ pointers from data structures themselves will be removed and replaced by those generic TAILQ entries, thus fixing the problem of potentially exposing local address space to shared structures. Technically, only rte_ring structure require modification, because IVSHMEM is only using memzones (which aren't in TAILQs) and rings, but for consistency's sake other TAILQ-based data structures were adapted as well. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2014-07-22 19:42:23 +02:00
Stephen Hemminger	6f41fe75e2	eal: deprecate rte_snprintf The function rte_snprintf serves no useful purpose. It is the same as snprintf() for all valid inputs. Deprecate it and replace all uses in current code. Leave the tests for the deprecated function in place. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>	2014-06-27 02:31:24 +02:00
Konstantin Ananyev	dc276b5780	acl: new library The ACL library is used to perform an N-tuple search over a set of rules with multiple categories and find the best match for each category. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Waterman Cao <waterman.cao@intel.com> Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com> [Thomas: some code-style changes]	2014-06-14 01:29:45 +02:00

31 Commits