fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses in return a function like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Replace O(n^2) list sort with an O(n log n) merge sort.
The merge sort is based on the solution suggested in:
http://cslibrary.stanford.edu/105/LinkedListProblems.pdf
Tested sort_rules() improvement:
100K rules: O(n^2): 31382 milliseconds; O(n log n): 10 milliseconds
259K rules: O(n^2): 133753 milliseconds; O(n log n): 22 milliseconds
Signed-off-by: Mark Smith <marsmith@akamai.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce new RTE_ACL_MASKLEN_TO_BITMASK macro, that will be used
in several places inside librte_acl and it's UT.
Simplify and cleanup build_trie() code a bit.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When rebuilding a trie for limited rule-set,
don't try to split the rule-set even further.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Move check for build confg parameter into a separate function.
Simplify acl_calc_wildness() function.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As now subtree_id is not used acl_merge_trie() any more,
there is no point to calculate and maintain that information.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reported by Zi Hu:
"
cat test_data/rule1
@192.168.0.0/24 192.168.0.0/24 400 : 500 0 : 52 6/0xff
@192.168.0.0/24 192.168.0.0/24 400 : 500 54 : 65280 6/0xff
@192.168.0.0/24 192.168.0.0/24 400 : 500 0 : 65535 6/0xff
cat test_data/trace1
0xc0a80005 0xc0a80009 450 53 0x06
I run the test by:
sudo ./testacl -n 2 -c 4 -- --rulesf=./test_data/rule1
--tracef=./test_data/trace1
The result shows that the packet matches the second rule, which is wrong.
The dest port of the pkt is 53, so it should match the third rule.
"
Indeed there is problem at ACL build stage.
Sometimes acl_merge_trie() is too aggressive in trying to conserve
space at build time.
So it takes a wrong assumptions and didn't duplicate a node,
even when it should.
The easiest and safest fix seems to always duplicate a left non-root/non-leaf
node first, and let the further code to destroy the node, if it is not needed.
Reported-by: Zi Hu <huzilucky@gmail.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
During build phase ACL doing quite a lot of memory allocations
for relatively small temporary structures.
In theory each of such allocation can fail, so we need to handle
all these possible failures.
That adds a lot of extra checks and makes the code harder to read and follow.
To simplify the process, made changes to handle all such failures
in one place.
Note, that all that memory for temporary structures
is freed at one go at the end of build phase.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
If at build phase we don't make any trie splitting,
then temporary build structures and resulting RT structure might be
much bigger than current.
>From other side - having just one trie instead of multiple can speedup
search quite significantly.
>From my measurements on rule-sets with ~10K rules:
RT table up to 8 times bigger, classify() up to 80% faster
than current implementation.
To make it possible for the user to decide about performance/space trade-off -
new parameter for build config structure (max_size) is introduced.
Setting it to the value greater than zero, instructs rte_acl_build() to:
- make sure that size of RT table wouldn't exceed given value.
- attempt to minimise number of tries in the table.
Setting it to zero maintains current behaviour.
That introduces a minor change in the public API, but I think the possible
performance gain is too big to ignore it.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Right now we allocate indexes for all types of nodes, except MATCH,
at 'gen final RT table' stage.
For MATCH type nodes we are doing it at building temporary tree stage.
This is totally unnecessary and makes code more complex and error prone.
Rework the code and make MATCH indexes being allocated at the same stage
as all others.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
There was a bug at build phase that can cause matches beeing overwritten.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Current rule-wildness based heuristics can cause unnecessary splits of
the ruleset.
That might have negative performance effect:
more tries to traverse, bigger RT tables.
After removing it, on some test-cases with big rulesets (~10K)
observed ~50% speedup.
No difference for smaller rulesets.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Make data_indexes long enough to survive idle transitions.
That allows to simplify match processing code.
Also fix incorrect size calculations for data indexes.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Make ACL library to build/work on 'default' architecture:
- make rte_acl_classify_scalar really scalar
(make sure it wouldn't use sse4 instrincts through resolve_priority()).
- Provide two versions of rte_acl_classify code path:
rte_acl_classify_sse() - could be build and used only on systems with sse4.2
and upper, return -ENOTSUP on lower arch.
rte_acl_classify_scalar() - a slower version, but could be build and used
on all systems.
- Addition of a new function rte_acl_classify_alg. This function lets you
specify an enum value to override the acl contexts default algorithm when doing
a classification. This allows an application to specify a classification
algorithm without needing to publicize each method. I know there was concern
over keeping those methods public, but we don't have a static ABI at the moment,
so this seems to me a reasonable thing to do, as it gives us less of an ABI
surface to worry about.
- keep common code shared between these two codepaths.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Clang compile fails without nmmintrin.h being explicitly included.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Zhaochen Zhan <zhaochen.zhan@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The ACL library is used to perform an N-tuple search over a set of rules with
multiple categories and find the best match for each category.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
[Thomas: some code-style changes]