numam-dpdk

Author	SHA1	Message	Date
Stephen Hemminger	d24b29d167	lib: remove duplicate includes Include files only need to be referenced once per file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-07-16 17:30:06 +02:00
Jerin Jacob	3abcd29f2d	update Cavium Inc copyright headers Replace the incorrect references to the "Cavium Networks" and "Cavium Ltd" company names with the correct "Cavium, Inc" company name in copyright headers. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2017-07-08 17:43:49 +02:00
Bruce Richardson	35320649fa	acl: remove checks for SSE4 Since SSE4 is now part of the minimum requirements for DPDK, we no longer need this check. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2017-07-04 14:35:41 +02:00
Ashwin Sekhar T K	30b156d5ef	acl: fix build with ARMv8 clang Fixed warning -Wunknown-warning-option seen with armv8a clang compilation. Signed-off-by: Ashwin Sekhar T K <ashwin.sekhar@caviumnetworks.com> Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2017-07-03 22:28:10 +02:00
Jerin Jacob	c0583d98a9	eal: introduce macro for always inline Different drivers use internal macros like force_inline for compiler always inline feature. Standardizing it through __rte_always_inline macro. Verified the change by comparing the output binary file. No difference found in the output binary file with this change. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-06-06 17:21:55 +02:00
Stephen Hemminger	c5ba278876	lib: remove unnecessary void cast Remove unnecessary casts of void * pointers to a specific type. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-04-11 18:05:10 +02:00
Olivier Matz	feb9f680cd	mk: optimize directory dependencies Before this patch, the management of dependencies between directories had several issues: - the generation of .depdirs, done at configuration is slow: it can take more than one minute on some slow targets (usually ~10s on a standard PC without -j). - for instance, it is possible to express a dependency like: - app/foo depends on lib/librte_foo - and lib/librte_foo depends on app/bar But this won't work because the directories are traversed with a depth-first algorithm, so we have to choose between doing 'app' before or after 'lib'. - the script depdirs-rule.sh is too complex. - we cannot use "make -d" for debug, because the output of make is used for the generation of .depdirs. This patch moves the DEPDIRS-* variables in the upper Makefile, making the dependencies much easier to calculate. A DEPDIRS variable is still used to process library dependencies in LDLIBS. After this commit, "make config" is almost immediate. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Robin Jarry <robin.jarry@6wind.com> Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2017-03-27 23:28:43 +02:00
Michał Mirosław	aad0c999b3	acl: fix flow data comments Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2017-01-30 11:15:11 +01:00
Michał Mirosław	c6c7a8d7e6	acl: allow zero verdict This enables ACL matches to return 0 where the distinction from no-match case is not needed. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2017-01-30 11:08:47 +01:00
Adrien Mazarguil	347a1e037f	lib: use C99 syntax for zero-size arrays Exported header files used by applications should allow the strictest compiler flags. Language extensions used in many places must be explicitly marked or removed to avoid warnings and compilation failures. The extension keyword is used whenever the C99 syntax cannot do it. This commit prevents the following errors: error: ISO C forbids zero-size array `[...]' Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2016-09-13 15:35:28 +02:00
Gowrishankar Muthukrishnan	1d73135f9f	acl: add AltiVec for ppc64 This patch adds port for ACL library in ppc64le. Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com> Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2016-09-09 17:56:14 +02:00
Jerin Jacob	52b50e8a6b	mk: fix cross-compilation Removed comparison against $CC in Makefiles as in cross-compiling mode CC can be a different string instead of string "gcc" Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2016-06-07 10:02:39 +02:00
Huawei Xie	693f715da4	remove extra parentheses in return statement fix the error reported by checkpatch: "ERROR: return is not a function, parentheses are not required" remove parentheses in return like: "return (logical expressions)" remove parentheses in return a function like: "return (rte_mempool_lookup(...))" Fixes: `6307b909b8` ("lib: remove extra parenthesis after return") Signed-off-by: Huawei Xie <huawei.xie@intel.com>	2016-02-10 15:47:50 +01:00
Jianbo Liu	68b67f9724	acl/arm: enable acl for ARMv7 Implement vqtbl1q_u8 intrinsic function, which is not supported in armv7-a. Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2015-12-08 03:00:42 +01:00
Konstantin Ananyev	a49886ddac	acl: fix native build on haswell with icc On HSW box with icc 16.0.0 build for x86_64-native-linuxapp-icc fails with: icc: command line warning #10120: overriding '-march=native' with '-msse4.1' ... dpdk.org/x86_64-native-linuxapp-icc/include/rte_memcpy.h(96): error: identifier "__m256i" is undefined The reason is that icc treats "-march=native ... -msse4.1" differently than gcc and clang. For icc it means: override all flags enabled with '-march=native' with '-msse4.1', even when '-march=native' is a superset of '-msse4.1'. To overcome the problem, add a check whether the SSE4.1 compilation flag is already enabled. If yes, then there is no need to add '-msse4.1'. Similar change for the avx2 compilation option. Fixes: `074f54ad03` ("acl: fix build and runtime for default target") Reported-by: Declan Doherty <declan.doherty@intel.com> Reported-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Declan Doherty <declan.doherty@intel.com>	2015-11-20 17:16:35 +01:00
Jerin Jacob	34fa6c27c1	acl: add NEON optimization for ARMv8 The implementation uses NEON gcc intrinsic. Verified with testacl and acl_autotest applications on arm64 architecture. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-11-18 22:44:01 +01:00
Mark Smith	fd4b6f78ad	acl: improve rules sorting Replace O(n^2) list sort with an O(n log n) merge sort. The merge sort is based on the solution suggested in: http://cslibrary.stanford.edu/105/LinkedListProblems.pdf Tested sort_rules() improvement: 100K rules: O(n^2): 31382 milliseconds; O(n log n): 10 milliseconds 259K rules: O(n^2): 133753 milliseconds; O(n log n): 22 milliseconds Signed-off-by: Mark Smith <marsmith@akamai.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-10-24 22:52:53 +02:00
Thomas Monjalon	0b6fbe8749	acl: remove old API The functions and structures are moved to app/test in order to keep existing unit tests. Some minor changes were done in these functions because of library scope restrictions. An enum is also copied in two other applications to keep existing code. The library version is incremented. Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-09-03 19:22:48 +02:00
Sergio Gonzalez Monroy	2f9d47013e	mem: move librte_malloc to eal/common Move malloc inside eal and create a new section in MAINTAINERS file for Memory Allocation in EAL. Create a dummy malloc library to avoid breaking applications that have librte_malloc in their DT_NEEDED entries. This is the first step towards using malloc to allocate memory directly from memsegs. Thus, memzones would allocate memory through malloc, allowing to free memzones. Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2015-07-16 13:44:48 +02:00
Konstantin Ananyev	cd8091d7d8	acl: remove unused code Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 18:09:46 +02:00
Konstantin Ananyev	cd40cd9195	acl: introduce a macro for bitmask conversion Introduce new RTE_ACL_MASKLEN_TO_BITMASK macro, that will be used in several places inside librte_acl and its UT. Simplify and cleanup build_trie() code a bit. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 18:08:34 +02:00
Konstantin Ananyev	4a6ce751ac	acl: fix unneeded trie splitting for subset of rules When rebuilding a trie for limited rule-set, don't try to split the rule-set even further. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 18:04:58 +02:00
Konstantin Ananyev	819f3a8fb7	acl: add function to check build input parameters Move check for build config parameter into a separate function. Simplify acl_calc_wildness() function. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 18:03:33 +02:00
Konstantin Ananyev	12c4e86969	acl: remove redundant macro Use global RTE_LEN2MASK macro, instead of local LEN2MASK. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 17:59:18 +02:00
Konstantin Ananyev	faea1ce70c	acl: fix invalid rule wildness calculation for bitmask field type Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-18 17:57:28 +02:00
Konstantin Ananyev	229ea9a71c	acl: remove subtree calculations at build stage As subtree_id is no longer used by acl_merge_trie(), there is no point in calculating and maintaining that information. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-04 11:14:45 +02:00
Konstantin Ananyev	2f372ab5c9	acl: fix matching rule Reported by Zi Hu: " cat test_data/rule1 @192.168.0.0/24 192.168.0.0/24 400 : 500 0 : 52 6/0xff @192.168.0.0/24 192.168.0.0/24 400 : 500 54 : 65280 6/0xff @192.168.0.0/24 192.168.0.0/24 400 : 500 0 : 65535 6/0xff cat test_data/trace1 0xc0a80005 0xc0a80009 450 53 0x06 I run the test by: sudo ./testacl -n 2 -c 4 -- --rulesf=./test_data/rule1 --tracef=./test_data/trace1 The result shows that the packet matches the second rule, which is wrong. The dest port of the pkt is 53, so it should match the third rule. " Indeed there is a problem at the ACL build stage. Sometimes acl_merge_trie() is too aggressive in trying to conserve space at build time. So it makes a wrong assumption and doesn't duplicate a node, even when it should. The easiest and safest fix seems to be to always duplicate a left non-root/non-leaf node first, and let the later code destroy the node if it is not needed. Reported-by: Zi Hu <huzilucky@gmail.com> Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-06-04 11:14:45 +02:00
Konstantin Ananyev	afd7f2d86a	acl: use setjmp/longjmp to handle alloc failures at build phase During the build phase ACL does quite a lot of memory allocations for relatively small temporary structures. In theory each such allocation can fail, so we need to handle all these possible failures. That adds a lot of extra checks and makes the code harder to read and follow. To simplify the process, made changes to handle all such failures in one place. Note that all the memory for temporary structures is freed in one go at the end of the build phase. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-04-28 11:55:03 +02:00
Konstantin Ananyev	1e496d6fdf	eal/x86: move header file for vector instructions lib/librte_eal/common/include/rte_common_vect.h -> lib/librte_eal/common/include/arch/x86/rte_vect.h Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>	2015-03-20 19:24:38 +01:00
David Marchand	a2348166ea	tailq: move to dynamic tailq Use dynamic tailq rather than static entries. Signed-off-by: David Marchand <david.marchand@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-03-10 12:06:08 +01:00
David Marchand	ff708facfc	tailq: remove unneeded inclusions Only keep inclusion where really needed. Signed-off-by: David Marchand <david.marchand@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-03-10 11:47:46 +01:00
Neil Horman	133b75923b	mk: add library version extension To differentiate libraries that break ABI, we add a library version number suffix to the library, which must be incremented when a given library's ABI is broken. This patch enforces that addition, sets the initial abi soname extension to 1 for each library and creates a symlink to the base SONAME so that the test applications will link properly. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2015-02-03 16:56:58 +01:00
Neil Horman	9d41beed24	lib: provide initial versioning Add linker version script files to each DPDK library to put a stake in the ground from which we can start cleaning up APIs Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2015-02-03 16:56:58 +01:00
Thomas Monjalon	7e60e08397	acl: remove standalone header This is a duplication of some EAL parts for a standalone packaging which is not documented. Packaging should be done outside of DPDK. Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2015-02-02 12:30:33 +01:00
Konstantin Ananyev	17f520d2cf	acl: add comments about internal layout Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:12:16 +01:00
Konstantin Ananyev	f3d24368ef	acl: remove unused constant Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:26 +01:00
Konstantin Ananyev	62945e029e	acl: introduce config parameter for performance/space trade-off If at build phase we don't make any trie splitting, then temporary build structures and resulting RT structure might be much bigger than current. From other side - having just one trie instead of multiple can speedup search quite significantly. From my measurements on rule-sets with ~10K rules: RT table up to 8 times bigger, classify() up to 80% faster than current implementation. To make it possible for the user to decide about performance/space trade-off - new parameter for build config structure (max_size) is introduced. Setting it to the value greater than zero, instructs rte_acl_build() to: - make sure that size of RT table wouldn't exceed given value. - attempt to minimise number of tries in the table. Setting it to zero maintains current behaviour. That introduces a minor change in the public API, but I think the possible performance gain is too big to ignore it. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:26 +01:00
Konstantin Ananyev	a0e3310e7a	acl: deduplicate some SSE and AVX2 code Vector code reorganisation/deduplication: To avoid maintaining two nearly identical implementations of calc_addr() (one for SSE, another for AVX2), replace it with a new macro that suits both SSE and AVX2 code-paths. Also remove the MM_* macros that are no longer needed. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	cf59b29bb9	acl: move SSE dwords shuffle Reorganise SSE code-path a bit by moving lo/hi dwords shuffle out from calc_addr(). That allows to make calc_addr() for SSE and AVX2 practically identical and opens opportunity for further code deduplication. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	4269eae463	acl: use scalar method fastest for some cases Previous improvements made scalar method the fastest one for tiny bunch of packets (< 4). That allows us to remove specific vector code-path for small number of packets (search_sse_2) and always use scalar method for such cases. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	5dd71363bf	acl: add AVX2 classify method Introduce new classify() method that uses AVX2 instructions. From my measurements: On HSW boards when processing >= 16 packets per call, AVX2 method outperforms its SSE counterpart by 10-25%, (depending on the ruleset). When built with compilers that don't support AVX2 instructions, make rte_acl_classify_avx2() do nothing and return an error. At runtime, if librte_acl was built with a compiler that supports AVX2, this method is selected as default one on HW that supports AVX2. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	da826b7135	eal: introduce ymm type for AVX 256-bit New data type to manipulate 256 bit AVX values. Rename field in the rte_xmm to keep common naming across SSE/AVX fields. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	3858b90d82	acl: deduplicate a bit of RT code Move common check for input parameters up into rte_acl_classify_alg(). Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	5c0e6c3de5	acl: make scalar RT code more similar to vector one Make classify_scalar behave in the same way as its vector counterpart: move match check out of the inner loop, etc. That makes scalar and vector code look more alike. Plus it improves scalar code performance. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	a726650857	acl: simplify match nodes allocation Right now we allocate indexes for all types of nodes, except MATCH, at 'gen final RT table' stage. For MATCH type nodes we are doing it at building temporary tree stage. This is totally unnecessary and makes code more complex and error prone. Rework the code and make MATCH indexes being allocated at the same stage as all others. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	ec51901a0b	acl: introduce DFA nodes compression (group64) for identical entries Introduced division of whole 256 child transition entries into 4 sub-groups (64 kids per group). So 2 groups within the same node with identical children, can use one set of transition entries. That allows to compact some DFA nodes and get space savings in the RT table, without any negative performance impact. From what I've seen an average space savings: ~20%. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	d4132664d8	acl: fix overwritten matches There was a bug at build phase that can cause matches being overwritten. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	c11f17d7f9	acl: remove build phase heuristic with negative performance effect Current rule-wildness based heuristics can cause unnecessary splits of the ruleset. That might have negative performance effect: more tries to traverse, bigger RT tables. After removing it, on some test-cases with big rulesets (~10K) observed ~50% speedup. No difference for smaller rulesets. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	efb2529385	acl: make data indexes long enough to survive idle transitions Make data_indexes long enough to survive idle transitions. That allows to simplify match processing code. Also fix incorrect size calculations for data indexes. Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
Konstantin Ananyev	47b1402088	acl: fix build in standalone mode Fix compilation with RTE_LIBRTE_ACL_STANDALONE=y Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>	2015-01-28 17:11:25 +01:00
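
Commit fd4b6f78ad above replaces an O(n^2) list sort with an O(n log n) merge sort over a linked list. The following is only a minimal sketch of that general technique, not the code from the commit: `struct rule_node`, its `priority` field, the ascending sort order, and the function names are all hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical node type for illustration; not the librte_acl structure. */
struct rule_node {
	int priority;
	struct rule_node *next;
};

/* Split a list of at least two nodes into a front and a back half. */
static void
split_list(struct rule_node *head, struct rule_node **front,
	struct rule_node **back)
{
	struct rule_node *slow = head;
	struct rule_node *fast = head->next;

	/* 'fast' advances two nodes for every one node 'slow' advances. */
	while (fast != NULL) {
		fast = fast->next;
		if (fast != NULL) {
			slow = slow->next;
			fast = fast->next;
		}
	}

	*front = head;
	*back = slow->next;
	slow->next = NULL;
}

/* Merge two lists that are already sorted by ascending priority. */
static struct rule_node *
merge_sorted(struct rule_node *a, struct rule_node *b)
{
	struct rule_node dummy;
	struct rule_node *tail = &dummy;

	dummy.next = NULL;
	while (a != NULL && b != NULL) {
		/* '<=' keeps equal-priority nodes in their original order. */
		if (a->priority <= b->priority) {
			tail->next = a;
			a = a->next;
		} else {
			tail->next = b;
			b = b->next;
		}
		tail = tail->next;
	}
	tail->next = (a != NULL) ? a : b;
	return dummy.next;
}

/* Sort a singly linked list in O(n log n) time; returns the new head. */
static struct rule_node *
merge_sort_list(struct rule_node *head)
{
	struct rule_node *front, *back;

	if (head == NULL || head->next == NULL)
		return head;

	split_list(head, &front, &back);
	return merge_sorted(merge_sort_list(front), merge_sort_list(back));
}
```

A caller would pass the list head and keep the returned pointer as the new head, since sorting relinks the nodes in place without allocating memory.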

58 Commits
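
Commit afd7f2d86a above describes handling every allocation failure of the build phase in one place via setjmp/longjmp, with all temporary memory freed in one go at the end. The sketch below illustrates that pattern under stated assumptions: `struct build_ctx`, `ctx_alloc()`, `ctx_free_all()`, `build()`, and the `alloc_limit` field (used here only to simulate a failing allocation) are hypothetical names and are not taken from librte_acl.

```c
#include <errno.h>
#include <setjmp.h>
#include <stdlib.h>

#define MAX_BLOCKS 16

/* Hypothetical build context; not the librte_acl structure. */
struct build_ctx {
	jmp_buf fail;             /* jump target for allocation failures */
	void *blocks[MAX_BLOCKS]; /* every temporary block handed out */
	size_t nb_blocks;
	size_t alloc_limit;       /* simulate failure after this many blocks */
};

/* Allocate a temporary block; never returns NULL, jumps out on failure. */
static void *
ctx_alloc(struct build_ctx *ctx, size_t size)
{
	void *p = NULL;

	if (ctx->nb_blocks < MAX_BLOCKS && ctx->nb_blocks < ctx->alloc_limit)
		p = malloc(size);
	if (p == NULL)
		longjmp(ctx->fail, 1);

	ctx->blocks[ctx->nb_blocks++] = p;
	return p;
}

/* Release all temporary blocks in one go. */
static void
ctx_free_all(struct build_ctx *ctx)
{
	while (ctx->nb_blocks > 0)
		free(ctx->blocks[--ctx->nb_blocks]);
}

/* Build phase: callers of ctx_alloc() need no per-call error checks. */
static int
build(struct build_ctx *ctx, size_t nb_nodes)
{
	int rc = 0;
	size_t i;

	ctx->nb_blocks = 0;
	if (setjmp(ctx->fail) == 0) {
		for (i = 0; i < nb_nodes; i++)
			ctx_alloc(ctx, 64);
	} else {
		/* Reached via longjmp() from a failed ctx_alloc(). */
		rc = -ENOMEM;
	}

	ctx_free_all(ctx);
	return rc;
}
```

The single `setjmp()` site replaces a NULL check after each allocation. Note that `rc` is not modified between `setjmp()` and `longjmp()`, which keeps its value well defined after the jump.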