31 Commits

Author SHA1 Message Date
Ciara Power
1e6a661302 acl: check max SIMD bitwidth
When choosing a vector path to take, an extra condition must be
satisfied to ensure the max SIMD bitwidth allows for the CPU enabled
path. These checks are added in the check alg helper functions.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-19 16:45:02 +02:00
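The condition described in "acl: check max SIMD bitwidth" can be sketched roughly as follows: a vector path is taken only when the CPU supports it and the configured maximum SIMD bitwidth is at least as wide as that path. The names `vector_path_allowed` and `pick_classify_bits` are hypothetical and are not the helper functions used in the ACL library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the DPDK one: a vector path is usable only if
 * the CPU supports it AND the configured SIMD limit is wide enough. */
bool vector_path_allowed(bool cpu_supports, uint16_t path_bits,
                         uint16_t max_simd_bits)
{
    assert(path_bits > 0);
    return cpu_supports && max_simd_bits >= path_bits;
}

/* Pick the widest allowed path; 0 means "fall back to scalar code". */
uint16_t pick_classify_bits(bool has_128, bool has_256, bool has_512,
                            uint16_t max_simd_bits)
{
    if (vector_path_allowed(has_512, 512, max_simd_bits))
        return 512;
    if (vector_path_allowed(has_256, 256, max_simd_bits))
        return 256;
    if (vector_path_allowed(has_128, 128, max_simd_bits))
        return 128;
    return 0;
}
```

With a limit of 256 bits, a CPU that supports all three widths would still end up on the 256-bit path in this sketch.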
Konstantin Ananyev
6fba1c8ba0 acl: optimize AVX512 classify with 4 bytes loads
With the current ACL implementation, the first field in the rule definition
always has to be one byte long. For optimising the classify
implementation, though, it might be useful to do 4B reads
(as we do for the rest of the fields).
So at build phase, check the user-provided field definitions to determine
whether it is safe to do 4B loads for the first ACL field.
Then at run-time this information can be used to choose the classify
behavior.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:01 +02:00
Konstantin Ananyev
867d0d3649 acl: select 256-bit AVX512 classify method by default
On supported platforms, set RTE_ACL_CLASSIFY_AVX512X16 as the
default ACL classify algorithm.
Note that the AVX512X16 implementation uses 256-bit registers/intrinsics only,
to avoid the possibility of a frequency drop.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:01 +02:00
Konstantin Ananyev
7c6cca6b60 acl: add infrastructure for AVX512 classify methods
Add necessary changes to support new AVX512 specific ACL classify
algorithm:
 - changes in meson.build to check that build tools
   (compiler, assembler, etc.) do properly support AVX512.
 - run-time checks to make sure target platform does support AVX512.
 - dummy rte_acl_classify_avx512() for targets where AVX512
   implementation couldn't be properly supported.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
0cea36d689 acl: rework classify method selection
Right now ACL library determines best possible (default) classify method
on a given platform with special constructor function rte_acl_init().
This patch makes the following changes:
 - Move selection of default classify method into a separate private
   function and call it for each ACL context creation (rte_acl_create()).
 - Remove library constructor function
 - Make rte_acl_set_ctx_classify() check that the requested algorithm
   is supported on the given platform.

The purpose of these changes is to improve and simplify the algorithm selection
process and to prepare the ACL library for integration with the
max SIMD bitwidth series under discussion.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
85348c3e7d acl: fix x86 build for compiler without AVX2
Right now we define dummy version of rte_acl_classify_avx2()
when both X86 and AVX2 are not detected, though it should be
for non-AVX2 case only.

Fixes: e53ce4e41379 ("acl: remove use of weak functions")
Cc: stable@dpdk.org

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-10-14 14:23:00 +02:00
Ruifeng Wang
e9b9739264 config: remap flags used for Arm platforms
RTE_ARCH_xx flags are used to distinguish platform architectures.
These flags can be used to pick different code paths for different
architectures at compile time.
For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM,
RTE_ARCH_ARMv7 and RTE_ARCH_ARM64.
RTE_ARCH_ARM64 is for 64-bit aarch64 platforms,
and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms.
RTE_ARCH_ARMv7 is for ARMv7 platforms, as its name suggests.

The issue is that the meaning of RTE_ARCH_ARM is not clear enough,
because the name carries no information about the platform word length.
To make the flag names clearer, a naming scheme is proposed.

RTE_ARCH_ARM (all Arm platforms)
    |
    +----RTE_ARCH_32 (New. 32-bit platforms of all architectures)
    |        |
    |        +----RTE_ARCH_ARMv7 (ARMv7 platforms)
    |        |
    |        +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine)
    |
    +----RTE_ARCH_64 (64-bit platforms of all architectures)
             |
             +----RTE_ARCH_ARM64 (64-bit Arm platforms)

RTE_ARCH_32 will be explicitly defined for 32-bit platforms.

To fit into the new naming scheme, current usage of RTE_ARCH_ARM in
project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32).

Matching flags for other architectures are:
RTE_ARCH_X86
    |
    +----RTE_ARCH_32
    |        |
    |        +----RTE_ARCH_I686
    |        |
    |        +----RTE_ARCH_X86_X32
    |
    +----RTE_ARCH_64
             |
             +----RTE_ARCH_X86_64

RTE_ARCH_PPC_64 ---- RTE_ARCH_64

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
2020-10-13 16:35:48 +02:00
David Marchand
8ac3591694 remove useless include of EAL memory config header
Restrict this header inclusion to its real users.

Fixes: 028669bc9f0d ("eal: hide shared memory config")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-09 10:22:24 +02:00
Anatoly Burakov
028669bc9f eal: hide shared memory config
Now that everything that has ever accessed the shared memory
config is doing so through the public APIs, we can make it
internal. Since we're removing quite a few headers from
rte_eal_memconfig.h, we need to add them back in places
where this header is used.

This bumps the ABI, so also change all build files and make
update documentation.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
2019-07-06 10:32:34 +02:00
Anatoly Burakov
a36f5ce06e eal: add API to lock/unlock tailq list
Currently, locking/unlocking the TAILQ list requires direct
access to the shared memory config. Add an API to do the same,
and search-and-replace all usages.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
2019-07-05 22:13:23 +02:00
Bruce Richardson
e53ce4e413 acl: remove use of weak functions
Weak functions don't work well with static libraries and require the use of
"whole-archive" flag to ensure that the correct function is used when
linking. Since the weak functions are only used as placeholders within
this library alone, we can replace them with non-weak functions using
preprocessor ifdefs.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-06-05 16:28:11 +02:00
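The approach described in "acl: remove use of weak functions" can be illustrated with a minimal sketch: instead of a weak placeholder that the linker may or may not override, exactly one definition is selected at compile time. The flag `SKETCH_HAVE_FAST_PATH` and the function names are hypothetical and are not taken from the library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#ifdef SKETCH_HAVE_FAST_PATH
/* Real implementation, compiled only when the build supports it. */
int classify_fast(const int *data, size_t n, int *matches)
{
    assert(data != NULL && matches != NULL);
    *matches = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > 0)
            (*matches)++;
    return 0;
}
#else
/* Placeholder chosen by the preprocessor, so no weak symbol and no
 * "whole-archive" linking is needed to get the right definition. */
int classify_fast(const int *data, size_t n, int *matches)
{
    (void)data;
    (void)n;
    (void)matches;
    return -ENOTSUP;
}
#endif

/* Portable version that is always available. */
int classify_scalar(const int *data, size_t n, int *matches)
{
    assert(data != NULL && matches != NULL);
    *matches = 0;
    for (size_t i = 0; i < n; i++)
        if (data[i] > 0)
            (*matches)++;
    return 0;
}

/* Try the fast path first and fall back when it is only a placeholder. */
int classify_best(const int *data, size_t n, int *matches)
{
    int rc = classify_fast(data, n, matches);

    if (rc == -ENOTSUP)
        rc = classify_scalar(data, n, matches);
    return rc;
}
```

Because the choice is made by the preprocessor, a static library contains a single `classify_fast` symbol and link order no longer matters.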
Bruce Richardson
6723c0fc72 replace snprintf with strlcpy
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed.  The function changes in this patch were
auto-generated via command:

  spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place

and then the files edited using awk to add in the missing header:

  gawk -i inplace '/include <rte_/ && ! seen { \
  	print "#include <rte_string_fns.h>"; seen=1} {print}'

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2019-04-04 22:46:05 +02:00
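For illustration, the kind of call-site change made by "replace snprintf with strlcpy" looks roughly like the sketch below. `sketch_strlcpy` is a local stand-in written for this example with the usual strlcpy semantics (copy at most size - 1 bytes, always NUL-terminate, return the source length); it is not the function provided through rte_string_fns.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in with strlcpy semantics, for this sketch only. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    assert(dst != NULL);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Before: a plain string copy spelled as a formatted print. */
int copy_name_old(char *dst, size_t size, const char *src)
{
    return snprintf(dst, size, "%s", src);
}

/* After: the same bounded copy without format-string parsing. */
size_t copy_name_new(char *dst, size_t size, const char *src)
{
    return sketch_strlcpy(dst, src, size);
}
```

Both variants truncate to the destination size and report the full source length, so a caller can detect truncation the same way.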
Keith Wiles
81bede55e3 eal: add macro for attribute weak
eal: add shorthand __rte_weak macro
qat: update code to use __rte_weak macro
avf: update code to use __rte_weak macro
fm10k: update code to use __rte_weak macro
i40e: update code to use __rte_weak macro
ixgbe: update code to use __rte_weak macro
mlx5: update code to use __rte_weak macro
virtio: update code to use __rte_weak macro
acl: update code to use __rte_weak macro
bpf: update code to use __rte_weak macro

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-10-25 02:11:23 +02:00
Bruce Richardson
369991d997 lib: use SPDX tag for Intel copyright files
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-04 22:41:39 +01:00
Thomas Monjalon
17715a5339 use macro to declare constructor functions
It is easier to find all constructor functions when they use
the same macros RTE_INIT or RTE_INIT_PRIO.

The macro definitions are moved from rte_eal.h to rte_common.h.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2017-11-06 21:56:07 +01:00
Michał Mirosław
c6c7a8d7e6 acl: allow zero verdict
This enables ACL matches to return 0 where the distinction
from no-match case is not needed.

Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2017-01-30 11:08:47 +01:00
Gowrishankar Muthukrishnan
1d73135f9f acl: add AltiVec for ppc64
This patch adds a port of the ACL library for ppc64le.

Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2016-09-09 17:56:14 +02:00
Jianbo Liu
68b67f9724 acl/arm: enable acl for ARMv7
Implement vqtbl1q_u8 intrinsic function, which is not supported in armv7-a.

Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2015-12-08 03:00:42 +01:00
Jerin Jacob
34fa6c27c1 acl: add NEON optimization for ARMv8
The implementation uses NEON gcc intrinsics.
Verified with testacl and acl_autotest applications on arm64 architecture.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-11-18 22:44:01 +01:00
Thomas Monjalon
0b6fbe8749 acl: remove old API
The functions and structures are moved to app/test in order to keep
existing unit tests. Some minor changes were done in these functions
because of library scope restrictions.
An enum is also copied in two other applications to keep existing code.
The library version is incremented.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-09-03 19:22:48 +02:00
Konstantin Ananyev
12c4e86969 acl: remove redundant macro
Use global RTE_LEN2MASK macro, instead of local LEN2MASK.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-06-18 17:59:18 +02:00
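Going by its name, a length-to-mask helper such as the one mentioned in "acl: remove redundant macro" turns a bit count into a mask with that many low bits set. The function below is a sketch of that idea under that assumption; `len2mask` is a hypothetical name and this is not a copy of the RTE_LEN2MASK definition.

```c
#include <assert.h>
#include <stdint.h>

/* Return a mask with the low `len` bits set (1 <= len <= 64).
 * Shifting right from all-ones avoids the undefined 1 << 64 case. */
uint64_t len2mask(unsigned int len)
{
    assert(len >= 1 && len <= 64);
    return UINT64_MAX >> (64 - len);
}
```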
David Marchand
a2348166ea tailq: move to dynamic tailq
Use dynamic tailq rather than static entries.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-03-10 12:06:08 +01:00
Konstantin Ananyev
62945e029e acl: introduce config parameter for performance/space trade-off
If we don't do any trie splitting at build phase,
then the temporary build structures and the resulting RT structure might be
much bigger than they are currently.
On the other hand, having just one trie instead of multiple can speed up
search quite significantly.
From my measurements on rule-sets with ~10K rules:
RT table up to 8 times bigger, classify() up to 80% faster
than the current implementation.
To let the user decide on the performance/space trade-off,
a new parameter for the build config structure (max_size) is introduced.
Setting it to a value greater than zero instructs rte_acl_build() to:
- make sure that the size of the RT table doesn't exceed the given value.
- attempt to minimise the number of tries in the table.
Setting it to zero maintains the current behaviour.
That introduces a minor change in the public API, but I think the possible
performance gain is too big to ignore.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:26 +01:00
Konstantin Ananyev
5dd71363bf acl: add AVX2 classify method
Introduce new classify() method that uses AVX2 instructions.

From my measurements:
On HSW boards when processing >= 16 packets per call,
the AVX2 method outperforms its SSE counterpart by 10-25%
(depending on the ruleset).

When built with compilers that don't support AVX2 instructions,
rte_acl_classify_avx2() does nothing and returns an error.
At runtime, if librte_acl was built with a compiler that supports AVX2,
this method is selected as the default one on HW that supports AVX2.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
3858b90d82 acl: deduplicate a bit of RT code
Move common check for input parameters up into rte_acl_classify_alg().

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Sergio Gonzalez Monroy
fdf20fa7be add prefix to cache line macros
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]
2014-11-27 16:21:11 +01:00
Thomas Monjalon
7eef9194ab acl: fix comments typos
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-14 17:23:50 +01:00
Konstantin Ananyev
074f54ad03 acl: fix build and runtime for default target
Make the ACL library build/work on the 'default' architecture:
- make rte_acl_classify_scalar really scalar
 (make sure it doesn't use sse4 intrinsics through resolve_priority()).
- Provide two versions of the rte_acl_classify code path:
  rte_acl_classify_sse() - can be built and used only on systems with sse4.2
  and above; returns -ENOTSUP on lower archs.
  rte_acl_classify_scalar() - a slower version, but can be built and used
  on all systems.
- Add a new function, rte_acl_classify_alg.  This function lets you
  specify an enum value to override the ACL context's default algorithm when doing
  a classification.  This allows an application to specify a classification
  algorithm without needing to publicize each method. I know there was concern
  over keeping those methods public, but we don't have a static ABI at the moment,
  so this seems to me a reasonable thing to do, as it gives us less of an ABI
  surface to worry about.
- keep common code shared between these two codepaths.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-09-03 03:26:50 +02:00
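The enum-based override described in "acl: fix build and runtime for default target" can be sketched as a small dispatch table: the caller passes an algorithm identifier instead of calling a specific method directly. All `sketch_*` names and the simplified signatures are hypothetical and do not reproduce the rte_acl_classify_alg() prototype.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum sketch_alg {
    SKETCH_ALG_DEFAULT, /* use whatever the context was set up with */
    SKETCH_ALG_SCALAR,
    SKETCH_ALG_SSE,
    SKETCH_ALG_NUM
};

struct sketch_ctx {
    enum sketch_alg default_alg;
};

typedef int (*sketch_classify_fn)(const int *in, int *out, size_t n);

/* Slow but portable path: available on every target. */
int sketch_scalar(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] > 0;
    return 0;
}

/* Stand-in for a vector path on a target that cannot run it. */
int sketch_sse_unsupported(const int *in, int *out, size_t n)
{
    (void)in;
    (void)out;
    (void)n;
    return -ENOTSUP;
}

/* Single entry point: the enum selects the method, so the individual
 * methods do not have to be part of the public surface. */
int sketch_classify_alg(const struct sketch_ctx *ctx, const int *in,
                        int *out, size_t n, enum sketch_alg alg)
{
    static const sketch_classify_fn fns[SKETCH_ALG_NUM] = {
        [SKETCH_ALG_SCALAR] = sketch_scalar,
        [SKETCH_ALG_SSE] = sketch_sse_unsupported,
    };

    assert(ctx != NULL);
    if (alg == SKETCH_ALG_DEFAULT)
        alg = ctx->default_alg;
    if (alg <= SKETCH_ALG_DEFAULT || alg >= SKETCH_ALG_NUM)
        return -EINVAL;
    return fns[alg](in, out, n);
}
```

Adding another method in this sketch means adding one enum value and one table entry, with no new exported function.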
Anatoly Burakov
8d8d88cbd9 acl: make tailq fully local
Since the data structures such as rings are shared in their entirety,
those TAILQ pointers are shared as well. Meaning that, after a
successful rte_ring creation, the tailq_next pointer of the last
ring in the TAILQ will be updated with a pointer to a ring which may
not be present in the address space of another process (i.e. a ring
that may be host-local or guest-local, and not shared over IVSHMEM).
Any successive ring create/lookup on the other side of IVSHMEM will
result in trying to dereference an invalid pointer.

This patchset fixes this problem by creating a default tailq entry
that may be used by any data structure that chooses to use TAILQs.
This default TAILQ entry will consist of tailq_next/tailq_prev
pointers and an opaque pointer to arbitrary data. All TAILQ
pointers from data structures themselves will be removed and
replaced by those generic TAILQ entries, thus fixing the problem
of potentially exposing local address space to shared structures.

Technically, only the rte_ring structure requires modification, because
IVSHMEM is only using memzones (which aren't in TAILQs) and rings,
but for consistency's sake other TAILQ-based data structures were
adapted as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-07-22 19:42:23 +02:00
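The generic entry described in "acl: make tailq fully local" (list pointers plus an opaque data pointer, kept outside the shared object) can be sketched with the standard <sys/queue.h> macros. The `sketch_*` types and functions are hypothetical and are not the EAL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Process-local list node: next/prev links plus an opaque pointer. */
struct sketch_tailq_entry {
    TAILQ_ENTRY(sketch_tailq_entry) next;
    void *data;
};

TAILQ_HEAD(sketch_tailq_head, sketch_tailq_entry);

/* The shared object itself no longer embeds any list pointers. */
struct sketch_ring {
    int id;
};

/* Attach an object to the local list through a separate entry. */
void sketch_register(struct sketch_tailq_head *head,
                     struct sketch_tailq_entry *e, void *obj)
{
    assert(head != NULL && e != NULL);
    e->data = obj;
    TAILQ_INSERT_TAIL(head, e, next);
}

/* Walk the local entries and return the first matching object. */
void *sketch_lookup(struct sketch_tailq_head *head,
                    int (*match)(const void *obj, const void *key),
                    const void *key)
{
    struct sketch_tailq_entry *e;

    TAILQ_FOREACH(e, head, next)
        if (match(e->data, key))
            return e->data;
    return NULL;
}

/* Example predicate: compare a ring's id with an int key. */
int sketch_ring_has_id(const void *obj, const void *key)
{
    return ((const struct sketch_ring *)obj)->id == *(const int *)key;
}
```

Since only the entries hold list links, a process that maps the shared object never follows a pointer created in another address space.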
Stephen Hemminger
6f41fe75e2 eal: deprecate rte_snprintf
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.

Leave the tests for the deprecated function in place.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-27 02:31:24 +02:00
Konstantin Ananyev
dc276b5780 acl: new library
The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]
2014-06-14 01:29:45 +02:00