numam-dpdk

Author	SHA1	Message	Date
Pablo de Lara	98b8ec7060	hash: fix eviction counter When adding a new entry in a hash table, there is a maximum number of evictions that can be performed. When the counter of these evictions reaches this maximum, the entry cannot be added, as it is considered that the algorithm has encountered an infinite loop. The problem with the current implementation, is that this counter was declared as a static variable. If there are multiple threads adding entries in the same table or in different tables, they should access different counters, one per core and per table. Therefore, the variable has been modified to be non-static. Fixes: 243e93a5046f ("hash: fix unlimited cuckoo path") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-07 13:50:43 +02:00
Stephen Hemminger	d24b29d167	lib: remove duplicate includes Include files only need to be referenced once per file. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>	2017-07-16 17:30:06 +02:00
Mike Stolarchuk	61d04efc38	hash: fix lock release on add When adding items to a hash table with multiple threads, there is a spinlock used to prevent data corruption (unless Transactional Memory is supported). If there is a failure, the spinlock should be released, but there were cases where that was not happening. Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX") Cc: stable@dpdk.org Signed-off-by: Mike Stolarchuk <mike.stolarchuk@bigswitch.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2017-07-08 19:00:22 +02:00
Jerin Jacob	577329e66b	eal: switch to architecture specific pause function Remove rte_pause() definition from rte_common.h and switchover to architecture specific rte_pause.h Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2017-07-03 23:58:51 +02:00
Jerin Jacob	98a7ea332b	fix typos using codespell utility Fixing typos across dpdk source code using codespell utility. Skipped the ethdev driver's base code fixes to keep the base code intact. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: John McNamara <john.mcnamara@intel.com>	2017-06-14 23:54:13 +02:00
Bruce Richardson	ecaed092b6	ring: return remaining entry count when dequeuing Add an extra parameter to the ring dequeue burst/bulk functions so that those functions can optionally return the amount of remaining objs in the ring. This information can be used by applications in a number of ways, for instance, with single-consumer queues, it provides a max dequeue size which is guaranteed to work. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2017-03-29 22:32:20 +02:00
Bruce Richardson	14fbffb0aa	ring: return free space when enqueuing Add an extra parameter to the ring enqueue burst/bulk functions so that those functions can optionally return the amount of free space in the ring. This information can be used by applications in a number of ways, for instance, with single-producer queues, it provides a max enqueue size which is guaranteed to work. It can also be used to implement watermark functionality in apps, replacing the older functionality with a more flexible version, which enables apps to implement multiple watermark thresholds, rather than just one. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2017-03-29 22:32:04 +02:00
Pablo de Lara	243e93a504	hash: fix unlimited cuckoo path When trying to insert a new entry, if its target bucket is full, the alternative location (bucket) of one of the entries is checked, to try to find an empty slot, with make_space_bucket. This function is called every time a new bucket is checked, recursively. To avoid having a very long insert operation (and to avoid filling up the stack), a limit in the number of pushes is introduced. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-10-12 18:40:51 +02:00
Byron Marohn	ff15d9c0ba	hash: modify lookup bulk pipeline This patch replaces the pipelined rte_hash lookup mechanism with a loop-and-jump model, which performs significantly better, especially for smaller table sizes and smaller table occupancies. Signed-off-by: Byron Marohn <byron.marohn@intel.com> Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>	2016-10-05 12:10:49 +02:00
Byron Marohn	58017c98ed	hash: add vectorized comparison In lookup bulk function, the signatures of all entries are compared against the signature of the key that is being looked up. Now that all the signatures are together, they can be compared with vector instructions (SSE, AVX2), achieving higher lookup performance. Also, entries per bucket are increased to 8 when using processors with AVX2, as 256 bits can be compared at once, which is the size of 8x32-bit signatures. Signed-off-by: Byron Marohn <byron.marohn@intel.com> Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>	2016-10-05 12:09:50 +02:00
Byron Marohn	8a9f542f32	hash: reorganize bucket structure Move current signatures of all entries together in the bucket and same with all alternative signatures, instead of having current and alternative signatures together per entry in the bucket. This will be beneficial in the next commits, where a vectorized comparison will be performed, achieving better performance. The alternative signatures have been moved away from the current signatures, to make the key indices be consecutive to the current signatures, as these two fields are used by lookup, so they are in the same cache line. Signed-off-by: Byron Marohn <byron.marohn@intel.com> Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>	2016-10-05 12:08:56 +02:00
Pablo de Lara	5fc74c2e14	hash: check if slot is empty with key index Instead of checking if the current and alternative signatures are 0, it is faster to check if the key index associated to an entry is 0, meaning that the slot is empty. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>	2016-09-29 21:51:27 +02:00
Pablo de Lara	24c20a7221	hash: fix false zero signature key hit lookup This commit fixes a corner case scenario. When a key is deleted, its signature in the hash table gets cleared, which should prevent a lookup of that same key, unless the signature of the key is all zeroes. In that case, there will be a match, and the key would be compared against the key that is in the table (which does not get cleared, as the performance penalty would be high), resulting in a wrong hit. To prevent this from happening, the key index associated to that entry should be set to zero when deleting it, so in case that same key is looked up just after a deletion, it will point to the dummy key slot, which guarantees a miss. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>	2016-09-29 21:50:32 +02:00
Pablo de Lara	1621f69abb	hash: fix ring size Ring stores the free slots available to be used in the key table. The ring size was being increased by 1, because of the dummy slot, used for key misses, but this is not actually stored in the ring, so there is no need to increase it. Fixes: 5915699153d7 ("hash: fix scaling by reducing contention") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>	2016-09-29 21:48:25 +02:00
Masoud Hasanifard	8331c64847	hash: fix custom compare Set cmp_jump_table_idx to KEY_CUSTOM in rte_hash_cmp_eq so that the custom function we are setting in rte_hash_set_cmp_func properly works. The custom function is only called by rte_hash_cmp_eq if cmp_jump_table_idx is set to KEY_CUSTOM. Fixes: 95da2f8e9c61 ("hash: customize compare function") Signed-off-by: Masoud Hasanifard <masoudhasanifard@gmail.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-09-22 17:38:17 +02:00
Yari Adan Petralanda	6dc34e0afe	hash: retrieve a key given its position The function rte_hash_get_key_with_position is added in this patch. As the position returned when adding a key is frequently used as an offset into an array of user data, this function performs the operation of retrieving a key given this offset. A possible use case would be to delete a key from the hash table when its entry in the array of data has certain value. For instance, the key could be a flow 5-tuple, and the value stored in the array a time stamp. Signed-off-by: Juan Antonio Montesinos <juan.antonio.montesinos.delgado@ericsson.com> Signed-off-by: Yari Adan Petralanda <yari.adan.petralanda@ericsson.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-07-10 14:56:45 +02:00
Wei Shen	be856325cb	hash: add scalable multi-writer insertion with Intel TSX This patch introduced scalable multi-writer Cuckoo Hash insertion based on a split Cuckoo Search and Move operation using Intel TSX. It can do scalable hash insertion with 22 cores with little performance loss and negligible TSX abortion rate. * Added an extra rte_hash flag definition to switch default single writer Cuckoo Hash behavior to multiwriter. - If HTM is available, it would use hardware feature for concurrency. - If HTM is not available, it would fall back to spinlock. * Created a rte_cuckoo_hash_x86.h file to hold all x86-arch related cuckoo_hash functions. And rte_cuckoo_hash.c uses compile time flag to select x86 file or other platform-specific implementations. While HTM check is still done at runtime (same idea with RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT) * Moved rte_hash private struct definitions to rte_cuckoo_hash.h, to allow rte_cuckoo_hash_x86.h or future platform dependent functions to include. * Following new functions are created for consistent names when new platform TM support are added. - rte_hash_cuckoo_move_insert_mw_tm: do insertion with bucket movement. - rte_hash_cuckoo_insert_mw_tm: do insertion without bucket movement. * One extra multi-writer test case is added. Signed-off-by: Wei Shen <wei1.shen@intel.com> Signed-off-by: Sameh Gobriel <sameh.gobriel@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-06-24 16:25:07 +02:00
Olivier Matz	5d7bfb7337	hash: fix race condition at creation To avoid a race condition while creating a new hash object, the list has to be locked before the lookup, and released only once the new object is added in the list. As the lock is held by the rte_ring_create(), move its creation at the beginning of the function and only take the lock after the ring is created to avoid a deadlock. Fixes: 48a3991196 ("hash: replace with cuckoo hash implementation") Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-04-06 17:30:51 +02:00
Olivier Matz	1aadacb5b0	hash: fix allocation of an existing object Change rte_hash*_create() functions to return NULL and set rte_errno to EEXIST when the object name already exists. This is the behavior described in the API documentation in the header file. These functions were returning a pointer to the existing object in that case, but it is a problem as the caller did not know if the object had to be freed or not. Doing this change also makes the hash API more consistent with the other APIs (mempool, rings, ...). Fixes: 916e4f4f4e ("memory: fix for multi process support") Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-04-06 17:30:48 +02:00
Pablo de Lara	f9bd334211	hash: fix multi-process support Hash library used a function pointer to choose a different key compare function, depending on the key size. As a result, multiple processes could not use the same hash table, as the function addresses vary from one process to another. Instead, a jump table is used, so each process has its own function addresses, accessing this table with an index stored in the hash table (note that using a custom key compare function is not supported in multi-process mode). Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2016-04-01 18:56:27 +02:00
Pablo de Lara	dbf17d44f3	hash: use common x86 flag Instead of using RTE_ARCH_X86_64, RTE_ARCH_X86_32 and RTE_ARCH_I686, use RTE_ARCH_X86 directly. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2016-04-01 18:55:27 +02:00
Huawei Xie	693f715da4	remove extra parentheses in return statement fix the error reported by checkpatch: "ERROR: return is not a function, parentheses are not required" remove parentheses in return like: "return (logical expressions)" remove parentheses in return a function like: "return (rte_mempool_lookup(...))" Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return") Signed-off-by: Huawei Xie <huawei.xie@intel.com>	2016-02-10 15:47:50 +01:00
Yu Nemo Wenbin	95da2f8e9c	hash: customize compare function Give user a chance to customize the hash key compare function. The default rte_hash_cmp_eq function is set in the rte_hash_create function, but these builtin ones may not be good enough, so the user may call this to override the default one. Signed-off-by: Yu Nemo Wenbin <yuwb_bjy@ctbri.com.cn> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-12-07 00:55:50 +01:00
Jerin Jacob	3e3a1a4fc4	hash: select CRC hash if armv8-a CRC extension available select hash function for cuckoo, fbk as rte_hash_crc_4byte if arm64-CRC extension available Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2015-11-25 22:14:00 +01:00
Jerin Jacob	f123e3d2ca	hash: replace libc memcmp with optimized functions for arm64 The following measurements show improvement over the default libc memcmp function (length in bytes: by X% over libc memcmp): 16: 149.57%; 32: 122.7%; 48: 104.96%; 64: 98.21%; 80: 93.75%; 96: 90.55%; 112: 110.48%; 128: 137.24%. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2015-11-25 22:11:37 +01:00
Pablo de Lara	c31af3e169	hash: fix incorrect lookup if key is all zero If user has not added an all zero key in the hash table, and tries to look it up, it results in an incorrect hit, as dummy slot in the key table has all zero as well. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-11-04 01:07:25 +01:00
Pablo de Lara	5915699153	hash: fix scaling by reducing contention If using multiple cores on a system with hardware transactional memory support, thread scaling does not work, as there was a single point in the hash library which is a bottleneck for all threads, which is the "free_slots" ring, which stores all the indices of the free slots in the table. This patch fixes the problem, by creating a local cache per logical core, which stores locally indices of free slots, so most times, writer threads will not interfere with each other. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-11-04 01:04:10 +01:00
Pablo de Lara	9fd0052ee6	hash: free internal ring when freeing hash Since freeing a ring is now possible, then when freeing a hash table, its internal ring can be freed as well. Therefore when a new table, with the same name as a previously freed table, is created, there is no need to look up the already allocated ring. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-11-04 01:01:48 +01:00
Pablo de Lara	7d49e0f4a9	hash: fix memory allocation of cuckoo key table When calculating the size for the table which allocates the keys, size was calculated wrongly from multiplying two 32-bit variables, resulting in a 32-bit number, before casting to 64-bit, so maximum size was 4G. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-09-09 11:49:20 +02:00
Xavier Simonart	6133acbe82	hash: fix crash when adding already inserted keys When adding with cuckoo hash a key which was already inserted a new slot is dequeued and then enqueued back, but the enqueue operation was not done properly. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Xavier Simonart <xavier.simonart@intel.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2015-07-28 22:09:30 +02:00
Pablo de Lara	9cd270a678	hash: move struct field to keep ABI stable In order to keep the ABI consistent with the old hash library, hash_func_init_val field has been moved, so it remains at the same offset as previously, since hash_func and hash_func_init_val are fields accessed by the public function rte_hash_hash and must keep the same offset as older versions. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-07-22 11:24:40 +02:00
Pablo de Lara	fd1fa9bddd	hash: fix build for non-x86 arch Hash library uses optimized compare functions that use x86 intrinsics, therefore non-x86 systems could not build the library. In that case, the compare function is set to the generic memcmp. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Reported-by: Zhigang Lu <zlu@ezchip.com> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Zhigang Lu <zlu@ezchip.com>	2015-07-18 19:47:21 +02:00
Pablo de Lara	af083e9fcc	hash: fix build without SSE4.1 _mm_test_all_zeros is not available for CPUs with no SSE4.1, therefore, DPDK would not build. This patch adds an alternative for this, using _mm_cmpeq_epi32 and _mm_movemask_epi8. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-07-16 16:52:15 +02:00
Pablo de Lara	6f71544ce2	hash: fix build with gcc 4.4 and 4.5 gcc 4.4 and 4.5 throws following error: rte_cuckoo_hash.c:145: error: flexible array member in otherwise empty struct. This is due to empty length in flexible array, which has been changed to use size 0 in the declaration of the array. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Reported-by: Olga Shern <olgas@mellanox.com> Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-07-16 16:48:51 +02:00
Pablo de Lara	2a4103eba9	hash: fix out of bounds array access When encountering a loop while adding a new entry, an element out of bounds of the array was being unnecessarily reset. Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2015-07-16 16:47:56 +02:00
Pablo de Lara	f9edbc9bb6	hash: add iterate function Since now rte_hash structure is private, a new function has been added to let the user iterate through the hash table, returning next key and data associated on each iteration, plus the position where they were stored. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-07-13 00:18:41 +02:00
Pablo de Lara	473d1bebce	hash: allow to store data in hash table Usually hash tables not only store keys, but also data associated to them. In order to maintain the existing API, the old functions will still return the index where the key was stored. The new functions will return the data associated to that key. In the case of the lookup_bulk function, it will return also the number of entries found and a bitmask of which entries were found. Unit tests have been updated to use these new functions. As a final point, a flag has been added in rte_hash_parameters to indicate if there are new parameters for future versions, so there is no need to maintain multiple versions of the existing functions in the future. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> [Thomas: fix || operator in a precondition check]	2015-07-13 00:16:29 +02:00
Pablo de Lara	b26473ff8f	hash: add reset function Added reset function to be able to empty the table, without having to destroy and create it again. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-07-13 00:15:03 +02:00
Pablo de Lara	48a3991196	hash: replace with cuckoo hash implementation This patch replaces the existing hash library with another approach, using the Cuckoo Hash method to resolve collisions (open addressing), which pushes items from a full bucket when a new entry tries to be added in it, storing the evicted entry in an alternative location, using a secondary hash function. This gives the user the ability to store more entries when a bucket is full, in comparison with the previous implementation. Therefore, the unit test has been updated, as some scenarios have changed (such as the previous removed restriction). Also note that the API has not been changed, although new fields have been added in the rte_hash structure (structure is internal now). The main change when creating a new table is that the number of entries per bucket is fixed now, so its parameter is ignored now (still there to maintain the same parameters structure). The hash unit test has been updated to reflect these changes. As a last note, the maximum burst size in lookup_burst function has been increased to 64, to improve performance. Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2015-07-12 23:46:11 +02:00
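
Several commits above (48a3991196, 243e93a504, 98b8ec7060) describe cuckoo insertion with a bounded number of pushes and a counter that is no longer static. The following is a minimal, self-contained sketch of that idea, not the DPDK code: the table layout, the hash functions, the limit MAX_PUSHES, and every toy_* name are assumptions made for illustration. Keys are non-zero 32-bit integers, and duplicate keys are not checked.

```c
#include <stdint.h>
#include <string.h>

#define NUM_BUCKETS      8u
#define SLOTS_PER_BUCKET 2u
#define MAX_PUSHES       16u  /* hypothetical limit on evictions per insert */

struct toy_table {
    uint32_t key[NUM_BUCKETS][SLOTS_PER_BUCKET];    /* 0 marks an empty slot */
    uint8_t  pushed[NUM_BUCKETS][SLOTS_PER_BUCKET]; /* slot is on the current push path */
};

/* Two illustrative hash functions; any pair of independent hashes would do. */
static uint32_t primary_bucket(uint32_t key)
{
    return ((key * 2654435761u) >> 16) % NUM_BUCKETS;
}

static uint32_t secondary_bucket(uint32_t key)
{
    return (((key ^ 0x9e3779b9u) * 2246822519u) >> 16) % NUM_BUCKETS;
}

/* The bucket a key may live in other than bucket b. */
static uint32_t alt_bucket(uint32_t key, uint32_t b)
{
    return primary_bucket(key) == b ? secondary_bucket(key) : primary_bucket(key);
}

/* Store key in the first empty slot of bucket b; returns 1 on success. */
static int bucket_put(struct toy_table *t, uint32_t b, uint32_t key)
{
    for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++) {
        if (t->key[b][i] == 0) {
            t->key[b][i] = key;
            return 1;
        }
    }
    return 0;
}

/* Free one slot in the full bucket b. Returns the freed slot index, or -1
 * when the push limit is reached. The counter belongs to the caller. */
static int make_space(struct toy_table *t, uint32_t b, unsigned int *pushes)
{
    uint32_t i;

    /* Cheap case: an entry can move straight into its other bucket. */
    for (i = 0; i < SLOTS_PER_BUCKET; i++) {
        uint32_t k = t->key[b][i];
        uint32_t alt = alt_bucket(k, b);
        if (alt != b && bucket_put(t, alt, k)) {
            t->key[b][i] = 0;
            return (int)i;
        }
    }

    /* Otherwise pick a victim that is not already being pushed. */
    for (i = 0; i < SLOTS_PER_BUCKET; i++)
        if (!t->pushed[b][i])
            break;
    if (i == SLOTS_PER_BUCKET || ++(*pushes) > MAX_PUSHES)
        return -1;

    uint32_t victim = t->key[b][i];
    uint32_t alt = alt_bucket(victim, b);

    t->pushed[b][i] = 1;
    int freed = make_space(t, alt, pushes); /* make room for the victim */
    t->pushed[b][i] = 0;
    if (freed < 0)
        return -1;                          /* table left unchanged */

    t->key[alt][freed] = victim;
    t->key[b][i] = 0;
    return (int)i;
}

/* Returns 0 on success, -1 if no space was found within MAX_PUSHES. */
int toy_insert(struct toy_table *t, uint32_t key)
{
    unsigned int pushes = 0; /* automatic storage: one counter per call */
    uint32_t prim = primary_bucket(key);

    if (bucket_put(t, prim, key) || bucket_put(t, secondary_bucket(key), key))
        return 0;

    int slot = make_space(t, prim, &pushes);
    if (slot < 0)
        return -1;
    t->key[prim][slot] = key;
    return 0;
}

/* Returns 1 if key is stored in either of its two buckets, else 0. */
int toy_lookup(const struct toy_table *t, uint32_t key)
{
    uint32_t b[2] = { primary_bucket(key), secondary_bucket(key) };

    for (uint32_t n = 0; n < 2; n++)
        for (uint32_t i = 0; i < SLOTS_PER_BUCKET; i++)
            if (t->key[b[n]][i] == key)
                return 1;
    return 0;
}
```

To try it, zero a struct toy_table with memset and call toy_insert for each key. Because the counter is a local variable passed by pointer, two threads inserting at the same time would each count their own pushes. With a static counter they would share one, which is the situation commit 98b8ec7060 describes. The sketch itself does no locking.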

39 Commits
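
Commit 7d49e0f4a9 describes a size computed by multiplying two 32-bit variables before casting to 64-bit. A small sketch of that pitfall follows, assuming a platform where unsigned int is 32 bits wide. The function and parameter names are hypothetical and are not taken from the DPDK source.

```c
#include <stdint.h>

/* Product is computed in 32 bits and wraps before it is widened. */
uint64_t table_size_truncated(uint32_t key_entry_size, uint32_t num_entries)
{
    return (uint64_t)(key_entry_size * num_entries);
}

/* Widening one operand first makes the whole multiplication 64-bit. */
uint64_t table_size_widened(uint32_t key_entry_size, uint32_t num_entries)
{
    return (uint64_t)key_entry_size * num_entries;
}
```

With key_entry_size = 64 and num_entries = 1u << 27, the true product is 2^33 bytes. The first function returns 0 because the 32-bit product wraps, while the second returns 8589934592.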