Commit Graph

14 Commits

Author SHA1 Message Date
Jianbo Liu
68b67f9724 acl/arm: enable acl for ARMv7
Implement vqtbl1q_u8 intrinsic function, which is not supported in armv7-a.

Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2015-12-08 03:00:42 +01:00
Jerin Jacob
34fa6c27c1 acl: add NEON optimization for ARMv8
The implementation uses NEON gcc intrinsic.
Verified with testacl and acl_autotest applications on arm64 architecture.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-11-18 22:44:01 +01:00
Thomas Monjalon
0b6fbe8749 acl: remove old API
The functions and structures are moved to app/test in order to keep
existing unit tests. Some minor changes were done in these functions
because of library scope restrictions.
An enum is also copied in two other applications to keep existing code.
The library version is incremented.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-09-03 19:22:48 +02:00
Konstantin Ananyev
12c4e86969 acl: remove redundant macro
Use global RTE_LEN2MASK macro, instead of local LEN2MASK.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-06-18 17:59:18 +02:00
David Marchand
a2348166ea tailq: move to dynamic tailq
Use dynamic tailq rather than static entries.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-03-10 12:06:08 +01:00
Konstantin Ananyev
62945e029e acl: introduce config parameter for performance/space trade-off
If at build phase we don't make any trie splitting,
then temporary build structures and resulting RT structure might be
much bigger than current.
>From other side - having just one trie instead of multiple can speedup
search quite significantly.
>From my measurements on rule-sets with ~10K rules:
RT table up to 8 times bigger, classify() up to 80% faster
than current implementation.
To make it possible for the user to decide about performance/space trade-off -
new parameter for build config structure (max_size) is introduced.
Setting it to the value greater than zero, instructs  rte_acl_build() to:
- make sure that size of RT table wouldn't exceed given value.
- attempt to minimise number of tries in the table.
Setting it to zero maintains current behaviour.
That introduces a minor change in the public API, but I think the possible
performance gain is too big to ignore it.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:26 +01:00
Konstantin Ananyev
5dd71363bf acl: add AVX2 classify method
Introduce new classify() method that uses AVX2 instructions.

>From my measurements:
On HSW boards when processing >= 16 packets per call,
AVX2 method outperforms it's SSE counterpart by 10-25%,
(depending on the ruleset).

When build with the compilers that don't support AVX2 instructions,
make rte_acl_classify_avx2() do nothing and return an error.
At runtime, if librte_acl was build with the compiler that supports AVX2,
this method is selected as default one on HW that supports AVX2.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
3858b90d82 acl: deduplicate a bit of RT code
Move common check for input parameters up into rte_acl_classify_alg().

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Sergio Gonzalez Monroy
fdf20fa7be add prefix to cache line macros
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]
2014-11-27 16:21:11 +01:00
Thomas Monjalon
7eef9194ab acl: fix comments typos
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-14 17:23:50 +01:00
Konstantin Ananyev
074f54ad03 acl: fix build and runtime for default target
Make ACL library to build/work on 'default' architecture:
- make rte_acl_classify_scalar really scalar
 (make sure it wouldn't use sse4 instrincts through resolve_priority()).
- Provide two versions of rte_acl_classify code path:
  rte_acl_classify_sse() - could be build and used only on systems with sse4.2
  and upper, return -ENOTSUP on lower arch.
  rte_acl_classify_scalar() - a slower version, but could be build and used
  on all systems.
- Addition of a new function rte_acl_classify_alg.  This function lets you
  specify an enum value to override the acl contexts default algorithm when doing
  a classification.  This allows an application to specify a classification
  algorithm without needing to publicize each method. I know there was concern
  over keeping those methods public, but we don't have a static ABI at the moment,
  so this seems to me a reasonable thing to do, as it gives us less of an ABI
  surface to worry about.
- keep common code shared between these two codepaths.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-09-03 03:26:50 +02:00
Anatoly Burakov
8d8d88cbd9 acl: make tailq fully local
Since the data structures such as rings are shared in their entirety,
those TAILQ pointers are shared as well. Meaning that, after a
successful rte_ring creation, the tailq_next pointer of the last
ring in the TAILQ will be updated with a pointer to a ring which may
not be present in the address space of another process (i.e. a ring
that may be host-local or guest-local, and not shared over IVSHMEM).
Any successive ring create/lookup on the other side of IVSHMEM will
result in trying to dereference an invalid pointer.

This patchset fixes this problem by creating a default tailq entry
that may be used by any data structure that chooses to use TAILQs.
This default TAILQ entry will consist of a tailq_next/tailq_prev
pointers, and an opaque pointer to arbitrary data. All TAILQ
pointers from data structures themselves will be removed and
replaced by those generic TAILQ entries, thus fixing the problem
of potentially exposing local address space to shared structures.

Technically, only rte_ring structure require modification, because
IVSHMEM is only using memzones (which aren't in TAILQs) and rings,
but for consistency's sake other TAILQ-based data structures were
adapted as well.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-07-22 19:42:23 +02:00
Stephen Hemminger
6f41fe75e2 eal: deprecate rte_snprintf
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.

Leave the tests for the deprecated function in place.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-27 02:31:24 +02:00
Konstantin Ananyev
dc276b5780 acl: new library
The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]
2014-06-14 01:29:45 +02:00