rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, rte_xmm_t::x has to be used instead of C-style casting.
Fixes: f22e705ebf ("eal/riscv: support RISC-V architecture")
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
MCS lock, PF lock and ticket lock have no arch-specific implementation,
so there is no need for the extra redirection in headers.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
Add all necessary elements for DPDK to compile and run EAL on SiFive
Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc)
core complex.
This includes:
- EAL library implementation for rv64imafdc ISA.
- meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define
is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in the
U74 core.
Compilation was tested on Ubuntu and Arch Linux using a riscv64 toolchain.
Clang compilation is currently not supported due to issues with missing
relocation relaxation.
Two rte_rdtsc() schemes are provided: a stable low-resolution one using
rdtime (default) and an unstable high-resolution one using rdcycle. The
user can override the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 at
compile time for both DPDK and the application. The reasoning for this is
as follows.
The RISC-V ISA mandates that the clock read by rdtime has a constant
period and is synchronized between all hardware threads within 1 tick
(chapter 10.1 in version 20191213 of the RISC-V spec).
However, this clock may not be of high enough frequency for dataplane
uses; for example, on the HiFive Unmatched (FU740) it is 1 MHz.
There is a high-resolution alternative in the form of rdcycle, which is
clocked at the core clock frequency. The drawbacks are that it may be
disabled during sleep (WFI), its frequency might change due to DVFS, and
it is core-local and therefore cannot be used as a wall clock. It can,
however, be used for micro-benchmarking user applications, similarly to
AArch64's PMCCNTR PMU counter.
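A minimal sketch of the two schemes gated by the RTE_RISCV_RDTSC_USE_HPM
switch described above; illustrative only, not the actual rte_rdtsc()
implementation:

#include <stdint.h>

/* Reads the counters named above via the standard RISC-V
 * pseudo-instructions. */
static inline uint64_t
riscv_rdtsc_sketch(void)
{
	uint64_t tsc;

#ifdef RTE_RISCV_RDTSC_USE_HPM
	/* Unstable high-resolution counter: core clock frequency, may be
	 * disabled during WFI and change with DVFS; core-local. */
	__asm__ volatile("rdcycle %0" : "=r" (tsc));
#else
	/* Stable low-resolution counter: constant period, synchronized
	 * across harts (e.g. 1 MHz on the FU740). */
	__asm__ volatile("rdtime %0" : "=r" (tsc));
#endif
	return tsc;
}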
The platform is currently marked as Linux-only because the rte_cycles
implementation uses the timebase-frequency device-tree node read through
the proc filesystem. This approach was chosen because the Linux kernel
depends on the presence of this device-tree node.
The i40e PMD is disabled on RISC-V as the rv64gc ISA has no vector
operations.
Compilation of the following modules has been disabled by this commit
and will be re-enabled in later commits as fixes are introduced:
net/ixgbe, net/memif, net/tap, examples/l3fwd.
Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Add functions for setting and getting the priority of a thread.
Priorities on multiple platforms are similarly determined by a priority
value and a priority class/policy.
Currently in DPDK, most threads operate at the OS-default priority level,
but there are cases when increasing the priority is useful. For example,
high-performance applications may require elevated priority levels.
For these reasons, EAL will expose two priority levels which are named
suggestively "normal" and "realtime_critical" and are computed as
follows:
On Linux, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* policy SCHED_OTHER
* priority value: (sched_get_priority_min(SCHED_OTHER) +
sched_get_priority_max(SCHED_OTHER))/2;
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* policy SCHED_RR
* priority value: sched_get_priority_max(SCHED_RR);
On Windows, the following mapping is created:
RTE_THREAD_PRIORITY_NORMAL corresponds to
* class NORMAL_PRIORITY_CLASS
* priority THREAD_PRIORITY_NORMAL
RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
* class REALTIME_PRIORITY_CLASS (when running with privileges)
* class HIGH_PRIORITY_CLASS (when running without privileges)
* priority THREAD_PRIORITY_TIME_CRITICAL
Note that on Linux the resulting priority value will be 0, in accordance
with the documentation, which mentions that the value should be 0 for the
SCHED_OTHER policy.
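A hedged sketch of the Linux side of this mapping using plain pthreads;
the EAL wrapper added by this commit encapsulates equivalent logic, and
the function name below is illustrative:

#include <pthread.h>
#include <sched.h>

static int
set_priority_sketch(pthread_t t, int realtime_critical)
{
	struct sched_param param;
	int policy;

	if (realtime_critical) {
		policy = SCHED_RR;
		param.sched_priority = sched_get_priority_max(SCHED_RR);
	} else {
		policy = SCHED_OTHER;
		/* Both min and max are 0 for SCHED_OTHER on Linux, so the
		 * midpoint is 0, matching the note above. */
		param.sched_priority =
			(sched_get_priority_min(SCHED_OTHER) +
			 sched_get_priority_max(SCHED_OTHER)) / 2;
	}
	return pthread_setschedparam(t, policy, &param);
}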
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, suitable for
data structures shared across many cores and which are updated
relatively infrequently.
A seqlock permits multiple parallel readers. A spinlock is used to
serialize writers. In cases where there is only a single writer, or
writer-writer synchronization is done by some external means, the
"raw" sequence counter type (and accompanying rte_seqcount_*()
functions) may be used instead.
To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).
One way to think about seqlocks is that they provide means to perform
atomic operations on data objects larger than what the native atomic
machine instructions allow for.
DPDK seqlocks (and the underlying sequence counters) are not
preemption-safe on the writer side. Thread preemption affects
performance, not correctness.
A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.
A reader will
1. Load the sequence number (sn).
2. Load, in arbitrary order, the seqlock-protected data.
3. Load the sn again.
4. Check if the first and second sn are equal, and even numbered.
If they are not, discard the loaded data, and restart from 1.
The first three steps need to be ordered using suitable memory fences.
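A reader-side sketch of these steps in C11 atomics; the types and names
below are illustrative only, the rte_seqlock_*() functions added by this
patch wrap equivalent logic:

#include <stdatomic.h>
#include <stdint.h>

/* Illustrative protected data and lock layout; not the DPDK types. */
struct seq_data { uint64_t a; uint64_t b; };

struct seqlock_sketch {
	atomic_uint_fast32_t sn;
	struct seq_data data;
};

static inline struct seq_data
seqlock_read_sketch(struct seqlock_sketch *sl)
{
	struct seq_data copy;
	uint_fast32_t begin, end;

	do {
		/* 1. Load sn; acquire keeps the data loads after it. */
		begin = atomic_load_explicit(&sl->sn, memory_order_acquire);
		/* 2. Load the protected data (loads need not be atomic). */
		copy = sl->data;
		/* 3. Fence so the data loads cannot move past the reload. */
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&sl->sn, memory_order_relaxed);
		/* 4. Retry if a write completed or is in progress. */
	} while (begin != end || (begin & 1) != 0);

	return copy;
}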
A writer will
1. Take the spinlock, to serialize writer access.
2. Load the sn.
3. Store the original sn + 1 as the new sn.
4. Perform load and stores to the seqlock-protected data.
5. Store the original sn + 2 as the new sn.
6. Release the spinlock.
Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.
The sn loads and stores must be atomic, but the data loads and stores
need not be.
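The matching writer-side sketch, reusing the illustrative types from the
reader sketch above; the writer-serializing spinlock of steps 1 and 6 is
assumed to be taken by the caller, as the raw sequence counter variant
permits:

static inline void
seqlock_write_sketch(struct seqlock_sketch *sl, const struct seq_data *src)
{
	uint_fast32_t sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);

	/* 3. Odd sn signals readers that an update is in flight; the
	 * release fence keeps the data stores after this store. */
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* 4. Update the protected data (stores need not be atomic). */
	sl->data = *src;

	/* 5. Even sn publishes the data; release orders the data stores
	 * before this store. */
	atomic_store_explicit(&sl->sn, sn + 2, memory_order_release);
}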
The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.
For more information on seqlocks, see
https://en.wikipedia.org/wiki/Seqlock
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase()
Signed-off-by: Duncan Bellamy <dunk@denkimushi.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector".
The DPDK header file rte_altivec.h takes care of undefining "vector",
so the applications and dependencies are free to define the name "vector".
This is a compatibility breakage for applications which were using
the keyword "vector" for its AltiVec meaning.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector",
so we will be able to #undef vector after including AltiVec header.
Later it may become possible to #undef vector in rte_altivec.h
with a compatibility breakage.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
FreeBSD has updated its CPU macros to align more with the definitions
used on Linux [1]. Unfortunately, while this makes compatibility better
in the future, it means we need to support both the legacy and the newer
definitions. Use a meson check to determine which set of macros is used.
[1] https://cgit.freebsd.org/src/commit/?id=e2650af157bc
Bugzilla ID: 1014
Fixes: c3568ea376 ("eal: restrict control threads to startup CPU affinity")
Fixes: b6be16acfe ("eal: fix control thread affinity with --lcores")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Daxue Gao <daxuex.gao@intel.com>
Caught by ASan: if a secondary process tried to attach a device with an
incorrect driver name, the devargs was leaked.
Fixes: 64051bb1f1 ("devargs: unify scratch buffer storage")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Calls to rte_memcpy for 1 < n < 16 could result in unaligned
loads/stores, which are undefined behaviour according to the C standard,
and in strict-aliasing violations.
The code was changed to use a packed structure that allows aliasing
(using the __may_alias__ attribute) to perform the load/store
operations. This results in code that has the same performance as the
original code and that is also C standards-compliant.
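An illustration of the technique (names here are illustrative, not the
actual rte_memcpy internals): a packed type carrying the __may_alias__
attribute lets the compiler emit unaligned accesses without breaking
strict aliasing:

#include <stdint.h>

typedef struct {
	uint64_t v;
} __attribute__((__packed__, __may_alias__)) unaligned_u64;

/* Copies 8 bytes between possibly unaligned, possibly differently-typed
 * buffers without undefined behaviour. */
static inline void
copy8_unaligned(void *dst, const void *src)
{
	((unaligned_u64 *)dst)->v = ((const unaligned_u64 *)src)->v;
}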
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Implement functions for getting/setting thread affinity.
Threads can be pinned to specific cores by setting their
affinity attribute.
Windows error codes are translated to errno-style error codes.
The possible return values are chosen so that we have as
much semantic compatibility between platforms as possible.
Note: convert_cpuset_to_affinity has the limitation that all CPUs of
the set must belong to the same processor group.
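For comparison, a minimal Linux-only sketch of pinning a thread to a
single core with the underlying pthread API; the new EAL functions expose
the same capability portably, and the helper name here is illustrative:

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

static int
pin_to_core_sketch(pthread_t t, unsigned int core)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(core, &set);
	/* Returns 0 on success, an errno-style code on failure. */
	return pthread_setaffinity_np(t, sizeof(set), &set);
}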
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Provide a portable type-safe thread identifier.
Provide rte_thread_self for obtaining current thread identifier.
Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
As described in Bugzilla, ASan reports accesses to all memory segments as
invalid, since those parts have not been allocated with rte_malloc.
Move __rte_no_asan to rte_common.h and disable ASan on a part of the test.
Bugzilla ID: 880
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently the "extern C" section ends right before rte_dev_dma_unmap
and other DMA function declarations, causing some C++ compilers to
produce C++ mangled symbols to rte_dev_dma_unmap instead of C symbols.
This leads to build failures later when linking a final executable
against this object.
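The shape of the fix, for illustration only (the prototype is shown as it
appears in rte_dev.h and is included just to show guard placement): every
declaration must sit inside the guard so C++ compilers emit C references:

#include <stddef.h>
#include <stdint.h>

#ifdef __cplusplus
extern "C" {
#endif

struct rte_device;

int rte_dev_dma_unmap(struct rte_device *dev, void *addr,
		      uint64_t iova, size_t len);

#ifdef __cplusplus
} /* the guard now also covers the DMA declarations */
#endif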
Fixes: a753e53d51 ("eal: add device event monitor framework")
Cc: stable@dpdk.org
Signed-off-by: Tianhao Chai <cth451@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Currently, when we free previously allocated memory, we mark the area as
"freed" for ASan purposes (flag 0xfd). However, sometimes, freeing a
malloc element will cause pages to be unmapped from memory and re-backed
with anonymous memory again. This may cause ASan's "use-after-free"
error down the line, because the allocator will try to write into
memory areas recently marked as "freed".
To fix this, we need to mark the unmapped memory area as "available",
and fix up the surrounding malloc element headers/trailers to enable
later malloc routines to safely write into new malloc elements' headers
or trailers.
Bugzilla ID: 994
Fixes: 6cc51b1293 ("mem: instrument allocator for ASan")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, EAL init in secondary processes will attach all fbarrays
in the memconfig to have access to the primary process's page tables.
However, fbarrays corresponding to external memory segments should
not be attached at initialization, because this will happen as part
of `rte_extmem_attach` [1] or `rte_malloc_heap_memory_attach` [2] calls.
1: https://doc.dpdk.org/api/rte__memory_8h.html#a2796da68de6825f8edf53759f8e4d230
2: https://doc.dpdk.org/api/rte__malloc_8h.html#af6360dea35bdf162feeb2b62cf149fd3
Fixes: ff3619d624 ("malloc: allow attaching to external memory chunks")
Cc: stable@dpdk.org
Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Deepak Khandelwal <deepak.khandelwal@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Mark the trylock family of spinlock functions with
__rte_warn_unused_result.
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
This patch adds a wrapper macro __rte_warn_unused_result for the
warn_unused_result function attribute.
Marking a function __rte_warn_unused_result will make the compiler
emit a warning in case the caller does not use the function's return
value.
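The wrapper boils down to the GCC/clang attribute; a minimal illustration
of the macro and its effect on a trylock-style function (the function is
made up for the example):

#define __rte_warn_unused_result __attribute__((warn_unused_result))

__rte_warn_unused_result
static inline int
try_take(int *flag)
{
	if (*flag != 0)
		return 0;	/* already taken */
	*flag = 1;
	return 1;
}

/* Calling try_take(&flag); without using the result now triggers
 * -Wunused-result. */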
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
All OS implementations provide the same main loop.
Introduce helpers (shared between Linux and FreeBSD) to handle
synchronisation between the main and worker threads, and factorize the
rest as common code.
Thread ids are now logged as strings in a common format across OSes.
Note:
- this change also fixes Windows EAL: worker thread CPU affinity was
incorrectly reported in the log.
- libabigail flags this change as breaking ABI in clang builds:
1 function with some indirect sub-type change:
[C] 'function int rte_eal_remote_launch(int (void*)*, void*, unsigned
int)' at eal_common_launch.c:35:1 has some indirect sub-type
changes:
parameter 1 of type 'int (void*)*' changed:
in pointed to type 'function type int (void*)' at rte_launch.h:31:1:
entity changed from 'function type int (void*)' to 'typedef
lcore_function_t' at rte_launch.h:31:1
type size hasn't changed
This is being investigated on libabigail side.
For now, we don't have much choice but to waive reports on this symbol.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
So far, a worker thread has been using its thread_id to discover which
lcore has been assigned to it.
However, as noted by Tyler, the pthread API does not strictly
guarantee that a new thread won't start running eal_thread_loop before
pthread_create writes to &lcore_config[xx].thread_id.
Though all OS implementations supported in DPDK (recently) ensure this
property, it is more robust to have the main thread directly pass its
lcore to the worker thread.
Signed-off-by: David Marchand <david.marchand@redhat.com>
eal_thread_loop() uses lcore_config[i].thread_id,
which is stored upon the return from CreateThread().
Per documentation, eal_thread_loop() can start
before CreateThread() returns and the ID is stored.
Create lcore worker threads suspended and then resume them, so that
lcore_config[i].thread_id is stored before eal_thread_loop starts
executing.
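A sketch of the pattern with the Win32 API; the thread function and
helper names are illustrative:

#include <windows.h>

static DWORD WINAPI
worker_sketch(LPVOID arg)
{
	(void)arg;
	return 0;
}

static HANDLE
launch_worker_sketch(DWORD *thread_id)
{
	/* Created suspended: worker_sketch cannot run yet, so *thread_id
	 * (in EAL, lcore_config[i].thread_id) is safely stored first. */
	HANDLE th = CreateThread(NULL, 0, worker_sketch, NULL,
				 CREATE_SUSPENDED, thread_id);

	if (th != NULL)
		ResumeThread(th);
	return th;
}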
Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Remove the x86 top-level atomic header include from the
architecture-related header files, since that x86 top-level atomic header
already includes them.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Add missing 'extern "C"' to file.
Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The ret value in rte_dev_event_monitor_stop indicates whether the
monitor has been successfully closed and should not be tied to the return
of rte_intr_callback_unregister, so once execution reaches the successful
exit point of rte_dev_event_monitor_stop, ret should be set to 0.
Also, the monitor reference count has been carefully evaluated: its value
changes from 1 to 0, so there is no potential memory leak.
Fixes: 1fef6ced07 ("eal/linux: allow multiple starts of event monitor")
Cc: stable@dpdk.org
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
The function rte_devargs_parse() was previously safe to call with a
non-initialized devargs structure as a parameter.
When support for the global device syntax was added,
this assumption was broken.
Restore it by forcing a memset as part of the call itself.
Bugzilla ID: 933
Fixes: b344eb5d94 ("devargs: parse global device syntax")
Cc: stable@dpdk.org
Signed-off-by: Madhuker Mythri <madhuker.mythri@oracle.com>
Signed-off-by: Gaetan Rivet <grive@u256.net>
'recv()' fills 'buf', and 'strlcpy()' is later used to copy from this
buffer. But as Coverity warns, 'recv()' doesn't guarantee that 'buf' is
null-terminated, while 'strlcpy()' requires it.
Enlarge 'buf' to 'EAL_UEV_MSG_LEN + 1' bytes and ensure the last byte can
be set to 0 when the received message size is EAL_UEV_MSG_LEN.
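The fix in a nutshell, as a sketch; the EAL_UEV_MSG_LEN value and the
helper name are assumed for illustration:

#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define EAL_UEV_MSG_LEN 4096	/* value assumed for this sketch */

static void
handle_uevent_sketch(int fd)
{
	char buf[EAL_UEV_MSG_LEN + 1];	/* one extra byte for '\0' */
	ssize_t ret;

	memset(buf, 0, sizeof(buf));
	ret = recv(fd, buf, EAL_UEV_MSG_LEN, 0);
	if (ret <= 0)
		return;
	/* ret <= EAL_UEV_MSG_LEN, so this index is always in bounds and
	 * the buffer later handed to dev_uev_parse()/strlcpy() is a
	 * proper C string. */
	buf[ret] = '\0';
}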
CID 375864: Memory - illegal accesses (STRING_NULL)
Passing unterminated string "buf" to "dev_uev_parse", which expects
a null-terminated string.
Coverity issue: 375864
Fixes: 0d0f478d04 ("eal/linux: add uevent parse and process")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Some public header files were missing 'extern "C"' C++ guards,
and couldn't be used by C++ applications. Add the missing guards.
Fixes: af75078fec ("first public release")
Fixes: 7f3aa08639 ("eal: introduce bit operations API")
Fixes: 166a743c53 ("compat: add infrastructure to support symbol versioning")
Fixes: 8f40ee0734 ("eal/x86: get hypervisor name")
Fixes: 75583b0d1e ("eal: add keep alive monitoring")
Fixes: 88701645c9 ("eal: move interrupt type out of igb_uio")
Fixes: f04519d809 ("lib: add missing include dependencies")
Fixes: f58880682c ("trace: implement register API")
Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org
Signed-off-by: Brian Dooley <brian.dooley@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
When compiling with clang using -Wpedantic (or -Wgcc-compat) the use of
diagnose_if kicks up a warning:
.../include/rte_interrupts.h:623:1: error: 'diagnose_if' is a clang
extension [-Werror,-Wgcc-compat]
__rte_internal
^
.../include/rte_compat.h:36:16: note: expanded from macro '__rte_internal'
__attribute__((diagnose_if(1, "Symbol is not public ABI", "error"), \
This change ignores the '-Wgcc-compat' warning in the specific location
where the warning occurs. It is safe to do in this circumstance as the
specific macro is only defined when using the clang compiler.
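The general shape of the workaround, shown here as a standalone sketch
(the actual change adjusts the __rte_internal macro in rte_compat.h; the
declaration below is made up for the example):

#if defined(__clang__)
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wgcc-compat"
/* diagnose_if is a clang extension, which -Wpedantic/-Wgcc-compat would
 * otherwise flag at this declaration. */
void internal_only(void)
	__attribute__((diagnose_if(1, "Symbol is not public ABI", "error")));
#pragma clang diagnostic pop
#endif

int main(void)
{
	return 0;	/* builds cleanly with clang -Wpedantic */
}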
Signed-off-by: Michael Barker <mikeb01@gmail.com>
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointers, so the checks here are not necessary.
Remove redundant NULL pointer checks before free functions,
found by nullfree.cocci.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The mp action resources in malloc should be cleaned up via
rte_eal_cleanup.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When rte_eal_cleanup is called, hotplug should unregister the
resources associated with the multi-process server.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When rte_eal_cleanup is called the rte_mp_action for VFIO
should be freed.
Fixes: edf73dd330 ("ipc: handle unsupported IPC in action register")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When rte_eal_cleanup is called, all control threads should exit.
For the mp thread, this is best handled by closing the mp_socket
and letting the thread see that.
This also fixes potential problems where the mp_socket gets
another hard error and the thread spins, repeatedly reading the
same error.
Fixes: 85d6815fa6 ("eal: close multi-process socket during cleanup")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When application calls rte_eal_cleanup on shutdown,
the DPDK log should be closed and cleaned up.
This helps reduce false reports from tools like ASAN
and valgrind that track memory leaks.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The function malloc() could return NULL, so the return value
needs to be checked.
Fixes: 6f63858e55 ("mem: prevent preallocated pages from being freed")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
GCC [1] now assigns even register pairs for CASP; the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.
1: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
C++ files could not include some headers because:
* "new" is a keyword in C++, so can't be a variable name
* there is no automatic casting to/from void *
Fixes: 184104fc61 ("ticketlock: introduce fair ticket based locking")
Fixes: 032a7e5499 ("trace: implement provider payload")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Both Linux and FreeBSD have the same code for creating the runtime
directory and reading sysfs files. Put it in the new lib/eal/unix
subdirectory.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
systemd.exec supports configuring the runtime directory of a service
via RuntimeDirectory=. This creates the directory with the necessary
permissions, which the actual service may not have if running in a
container.
The change to DPDK is to look for the RUNTIME_DIRECTORY environment
variable first and use it in preference to the fallback alternatives.
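A sketch of the lookup order; the fallback shown here is only
illustrative, the pre-existing EAL logic is kept as-is:

#include <stdlib.h>

static const char *
runtime_directory_sketch(void)
{
	const char *dir = getenv("RUNTIME_DIRECTORY");

	/* Prefer the directory systemd prepared for the service. */
	if (dir != NULL && dir[0] != '\0')
		return dir;
	/* Otherwise fall back to the existing defaults (illustrated here
	 * as XDG_RUNTIME_DIR, else /var/run). */
	dir = getenv("XDG_RUNTIME_DIR");
	return dir != NULL ? dir : "/var/run";
}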
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
The size argument to eal_set_runtime_dir is useless and was
being used incorrectly in strlcpy. It worked only because
all callers passed PATH_MAX, which is the same as the size of the
destination runtime_dir.
Note: this is an internal API so no user exposed change.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Add an internal helper to get the OS-specific EAL mapping base address.
This helper can be used by drivers to program offload/accelerator
devices, where the base address can be used as a reference address by
the accelerator to access host memory.
An address can also be represented as an offset relative to the base
address using smaller data types.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Expose the Linux EAL ability to reuse existing hugepage files
via the --huge-unlink=never switch.
The default behavior is unchanged; it can also be selected explicitly
using --huge-unlink=existing for consistency.
The old --huge-unlink switch is kept
as an alias for --huge-unlink=always.
Add a test case for the --huge-unlink=never mode.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Linux EAL ensured that mapped hugepages are clean
by always mapping from newly created files:
existing hugepage backing files were always removed.
In this case, the kernel clears the page to prevent data leaks,
because the mapped memory may contain leftover data
from the previous process that was using this memory.
Clearing takes the bulk of the time spent in mmap(2),
increasing EAL initialization time.
Introduce a mode to keep existing files and reuse them
in order to speed up initial memory allocation in EAL.
Hugepages mapped from such files may contain data
left by the previous process that used this memory,
so RTE_MEMSEG_FLAG_DIRTY is set for their segments.
If multiple hugepages are mapped from the same file:
1. When fallocate(2) is used, all memory mapped from this file
is considered dirty, because it is unknown
which parts of the file are holes.
2. When ftruncate(3) is used, memory mapped from this file
is considered dirty unless the file is extended
to create a new mapping, which implies clean memory.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
In preparation for extending the --huge-unlink option semantics,
refactor how it is stored in the internal configuration.
This makes future changes more isolated.
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The EAL malloc layer assumed that the content of all free elements
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:
1. The EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before returning into the heap.
Clearing the memory can be as slow as around 14 GiB/s.
To avoid doing so, the memalloc layer is allowed to return dirty memory.
Such segments are marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using the new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:
a) If the joint element contains dirty parts, it is dirty:

       dirty + freed + dirty = dirty  =>  no need to clean
               freed + dirty = dirty      the freed memory

   Dirty parts may be large (e.g. initial allocation),
   so clearing them could create unpredictable slowdown.

b) If the only dirty part of the joint element
   is the freed memory, the joint element can be made clean:

       clean + freed + clean = clean  =>  freed memory
       clean + freed         = clean      must be cleared
               freed + clean = clean
               freed         = clean
This logic naturally reproduces the old behavior
and always applies in modes where the EAL memalloc layer
returns only clean segments.
As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.
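The rule in cases a) and b) reduces to a small decision at free time; a
sketch with illustrative types, not the actual malloc_elem code:

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct elem_sketch {
	void *data;
	size_t len;
	bool dirty;
};

/* prev/next are the adjacent free elements being joined with the freed
 * one; either may be NULL if absent. */
static void
mark_joined_element_sketch(struct elem_sketch *freed,
			   const struct elem_sketch *prev,
			   const struct elem_sketch *next)
{
	bool neighbours_dirty = (prev != NULL && prev->dirty) ||
				(next != NULL && next->dirty);

	if (neighbours_dirty) {
		/* Case a): the joint element stays dirty, so the freed
		 * memory does not need to be cleared. */
		freed->dirty = true;
	} else {
		/* Case b): clearing the freed memory lets the whole joint
		 * element be marked clean. */
		memset(freed->data, 0, freed->len);
		freed->dirty = false;
	}
}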
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>