numam-dpdk

Author	SHA1	Message	Date
David Hunt	b89168ef15	examples/vm_power: add branch ratio policy type Add the capability for the vm_power_manager to receive a policy of type BRANCH_RATIO. This will add any vcpus in the policy to the oob monitoring thread. Signed-off-by: David Hunt <david.hunt@intel.com> Acked-by: Radu Nicolau <radu.nicolau@intel.com>	2018-07-20 23:59:42 +02:00
Qi Zhang	6a015363b3	vfio: remove unnecessary IPC for group fd clear Clearing vfio_group_fd does not need to involve any IPC. Also, the current IPC implementation for SOCKET_CLR_GROUP is not correct: rte_vfio_clear_group on a secondary process will always fail, which prevents the device from being detached correctly on a secondary process. The patch simply removes all IPC-related code from rte_vfio_clear_group. Fixes: 83a73c5fef66 ("vfio: use generic multi-process channel") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-20 14:26:16 +02:00
Qi Zhang	196e9a486c	eal: fix hotplug add and remove Hotplug-adding an already plugged PCI device corrupts rte_pci_device->device.name due to an unexpected rte_devargs_remove. Likewise, hotplug-removing an already unplugged device causes a segmentation fault due to an unexpected bus->unplug on an rte_device whose driver is NULL. The patch fixes these issues. Fixes: 7e8b26650146 ("eal: fix hotplug add / remove") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-20 14:26:16 +02:00
Anatoly Burakov	dd536a8bc5	mem: add logic check for static analyzer Technically, single file segments codepath will never get triggered when using in-memory mode, because EAL prohibits mixing these two options at initialization time. However, code analyzers do not know that, and some will complain about either using uninitialized variables, or trying to do operations on an already closed descriptor. Fix this by assuring the compiler or code analyzer that in-memory mode code never gets triggered when using single-file segments mode. Coverity issue: 302847 Fixes: 72b49ff623c4 ("mem: support --in-memory mode") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-20 11:32:03 +02:00
Anatoly Burakov	9554dbb50a	malloc: do not skip pad on free Previously, we were skipping erasing pad because we were expecting it to be freed when we were merging adjacent segments. However, if there were no adjacent segments to merge, we would've skipped erasing the pad, leaving non-zero memory in our free space. Fix this by including pad in the erasing unconditionally. Fixes: e43a9f52b7ff ("malloc: fix pad erasing") Cc: stable@dpdk.org Reported-by: Andrew Rybchenko <arybchenko@solarflare.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>	2018-07-20 11:21:31 +02:00
Andrew Rybchenko	7513bd68ae	devargs: fix parsing truncation when using format Space for string terminating NUL character should be provided to snprintf() to avoid the last symbol truncation. Fixes: a23bc2c4e01b ("devargs: add non-variadic parsing function") Reported-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-20 11:17:03 +02:00
Anatoly Burakov	e4ea1bbd6e	eal: fix dependency in multi-process detection Currently, we need runtime dir to put all of our runtime info in, including the DPDK shared config. However, we use the shared config to determine our proc type, and this happens earlier than we actually create the config dir and thus can know where to place the config file. Fix this by moving runtime dir creation right after the EAL arguments parsing, but before proc type autodetection. Also, previously we were creating the config file unconditionally, even if we specified no_shconf - fix it by only creating the config file if no_shconf is not set. Fixes: adf1d867361c ("eal: move runtime config file to new location") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com>	2018-07-19 12:05:14 +02:00
Anatoly Burakov	d5dd22c9f6	mem: fix alignment of requested virtual areas The original code did not align any addresses that were requested as page-aligned, but were different because addr_is_hint was set. Below fix by Dariusz has introduced an issue where all unaligned addresses were left as unaligned. This patch is a partial revert of commit 7fa7216ed48d ("mem: fix alignment of requested virtual areas") and implements a proper fix for this issue, by asking for alignment in all but the following two cases: 1) page size is equal to system page size, or 2) we got an aligned requested address, and will not accept a different one This ensures that alignment is performed in all cases, except for those we can guarantee that the address will not need alignment. Fixes: b7cc54187ea4 ("mem: move virtual area function in common directory") Fixes: 7fa7216ed48d ("mem: fix alignment of requested virtual areas") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>	2018-07-18 23:22:33 +02:00
Pablo de Lara	b7167593e0	devargs: fix build with gcc 4.7 Fixed possible out-of-bounds issue: lib/librte_eal/common/eal_common_devargs.c: In function ‘rte_devargs_layers_parse’: lib/librte_eal/common/eal_common_devargs.c:121:7: error: array subscript is above array bounds Bugzilla ID: 71 Fixes: 338327d731e6 ("devargs: add function to parse device layers") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-18 10:36:30 +02:00
Thomas Monjalon	c27dbc300e	version: 18.08-rc1 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-07-16 01:17:18 +02:00
Gaetan Rivet	a3b85476c5	kvargs: add generic string matching callback This function can be used as a callback to rte_kvargs_process. This should reduce code duplication. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:44:22 +02:00
Gaetan Rivet	ac1a511eff	eal: implement device iteration Use the iteration hooks in the abstraction layers to perform the requested filtering on the internal device lists. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:44:17 +02:00
Gaetan Rivet	c99a2d4c6b	eal: implement device iteration initialization Parse a device description. Split this description into the relevant part for each layer. No dynamic allocation is performed. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:43:53 +02:00
Gaetan Rivet	670658b7a9	eal: add device iterator interface A device iterator allows iterating over a set of devices. This set is defined by the two descriptions offered: rte_bus and rte_class. Either one description or both can be provided; providing no description at all is not allowed. Each layer of abstraction then performs a filter based on the description provided. This filtering allows iterating on their internal set of devices, stopping when a match is valid and returning the current iteration context. This context allows starting the next iteration from the same point and going forward. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:43:40 +02:00
Gaetan Rivet	338327d731	devargs: add function to parse device layers This function is private to the EAL. It is used to parse each layer in a device description string and store the result in an rte_devargs structure. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>	2018-07-15 23:43:34 +02:00
Gaetan Rivet	d70f8448d0	eal: introduce device class abstraction This abstraction has existed since the infancy of DPDK. It needs to be fleshed out, however, to allow a generic description of device properties and capabilities. A device class is the northbound interface of the device, intended for applications to know what it can be used for. It is conceptually just above buses. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:42:53 +02:00
Gaetan Rivet	a671f01fcc	eal: introduce destructor macros This macro adds symbols to the .fini section using the global RTE priorities, to ensure consistency. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>	2018-07-15 23:42:27 +02:00
Gaetan Rivet	5d6af85ab0	kvargs: introduce a more flexible parsing function This function permits defining additional terminating characters, ending the parsing to arbitrary delimiters. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>	2018-07-15 23:42:22 +02:00
Gaetan Rivet	092ee51649	kvargs: build before EAL This library will be used by the EAL to parse parameters. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:42:16 +02:00
Gaetan Rivet	12a020ea19	kvargs: remove error logs Error logs in kvargs parsing should be better handled in components calling the library. This library must be as lean as possible. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-07-15 23:42:13 +02:00
Gaetan Rivet	a23bc2c4e0	devargs: add non-variadic parsing function rte_devargs_parse becomes non-variadic, rte_devargs_parsef becomes the variadic version, to be used to compose device strings. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2018-07-15 23:42:10 +02:00
Gaetan Rivet	0436120e33	devargs: use log functions Use the standard EAL logging functions in rte_devargs. Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2018-07-15 23:41:59 +02:00
Kiran Kumar	de9c75a548	ethdev: check queue stats mapping input arguments With current implementation, we are not checking for queue_id range and stat_idx range in stats mapping function. This patch will add check for queue_id and stat_idx range. Fixes: 5de201df892 ("ethdev: add stats per queue") Signed-off-by: Kiran Kumar <kkokkilagadda@caviumnetworks.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2018-07-14 00:09:55 +02:00
Stephen Hemminger	6bc67c497a	eal: add uuid API Since uuid functions may not be available everywhere, implement uuid functions in DPDK. These are based off the BSD licensed libuuid in util-linux. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>	2018-07-13 23:42:08 +02:00
Dan Gora	9f976204f5	vhost/crypto: use function to access mbuf private area Use rte_mbuf_to_priv() to access the private data area in the mbuf. Signed-off-by: Dan Gora <dg@adax.com>	2018-07-13 23:14:41 +02:00
Dan Gora	f5f45caeb0	mbuf: add accessor function for private data area Add an inline accessor function to return the starting address of the private data area in the supplied mbuf. This allows applications to easily access the private data area between the struct rte_mbuf and the data buffer in the specified mbuf without creating private macros or accessor functions. No checks are made to ensure that a private data area actually exists in the buffer. Signed-off-by: Dan Gora <dg@adax.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2018-07-13 23:08:15 +02:00
Nelio Laranjeiro	449660994e	ethdev: fix missing function in map file Add rte_flow_expand_rss in map file and tag it as experimental. Fixes: 4ed05fcd441b ("ethdev: add flow API to expand RSS flows") Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-07-13 15:53:29 +02:00
Anatoly Burakov	72b49ff623	mem: support --in-memory mode Implement the final piece of the in-memory mode puzzle - enable running DPDK entirely in memory, without creating any files. To do it, use mmap with MAP_HUGETLB and size flags to enable DPDK to work without hugetlbfs mountpoints. In order to enable this, a few things needed to be changed. First of all, we need to allow empty hugetlbfs mountpoints in hugepage_info, and handle them correctly (by not trying to create any files and lock any directories). Next, we need to reorder the mapping sequence, because the page is not really allocated until the page fault, and we cannot get its IOVA address before we trigger the page fault. Finally, decide at compile time whether we are going to be supporting anonymous hugepages or not, because we cannot check for it at runtime. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:35:43 +02:00
Anatoly Burakov	14de8734c4	eal: add --in-memory option This command-line option will cause DPDK to operate entirely in memory and not create any shared files at runtime, including any shared configuration or hugetlbfs files. This is useful for debug purposes, as well as for certain use cases like containers or automatic memory cleanup. Currently, this option acts as a strict superset of --no-shconf and --huge-unlink commands. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:35:26 +02:00
Anatoly Burakov	d435aad37d	mem: support --huge-unlink mode Unlink hugepages after creating them, to honor the hugepage-unlink mode. We cannot resize non-existing files, so make single file segments explicitly unsupported. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:34:17 +02:00
Anatoly Burakov	5cb42707bc	eal: do not create runtime dir in --no-shconf mode Now that the rest of the EAL is adjusted to not create any shared files, prevent runtime directory from ever being created. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:33:51 +02:00
Anatoly Burakov	cb14962a00	eal: support --no-shconf in hugepage data file Do not create a shared hugepage data file if we were asked to not create any shared files. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:33:27 +02:00
Anatoly Burakov	7296447acb	eal: support --no-shconf for hugepage info Do not create any shared hugepage size info files if we were asked to not create any shared files. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:33:07 +02:00
Anatoly Burakov	5848e3d281	ipc: support --no-shconf mode IPC is an inter-process communication mechanism. Since no secondaries can ever be expected to run in no-shconf mode, IPC will be useless, so do not enable it in the first place. In the interests of API usage convenience, we will still allow registering callbacks, but obviously they won't ever be triggered. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:32:43 +02:00
Anatoly Burakov	3ee2cde248	fbarray: support --no-shconf mode When using --no-shconf option, the expectation is that no multiprocess will be supported as no shared files are created. However, fbarray still creates some shared files that prevent multiple processes with the same prefix from starting. Fix this by avoiding creating shared files whenever noshconf option is specified. Since virtual areas we get from eal_get_virtual_area() are read-only, remap them as writable. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 15:32:05 +02:00
Anatoly Burakov	adf1d86736	eal: move runtime config file to new location As per deprecation notice [1], move DPDK runtime config to default DPDK runtime data location. Also, remove the deprecation notice and update release notes to indicate the changes. [1] http://dpdk.org/patch/40418 Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 13:29:01 +02:00
Anatoly Burakov	daf9bfca71	ipc: remove thread for async requests Previously, we were using two IPC threads - one to handle messages and synchronous requests, and another to handle asynchronous requests. To handle replies for an async request, rte_mp_handle woke up the rte_mp_handle_async thread to process through pthread_cond variable. Change it to handle asynchronous messages within the main IPC thread. To handle timeout events, for each async request which is sent, we set an alarm for it. If its reply is received before timeout, we will cancel the alarm when we handle the reply; otherwise, alarm will invoke the async_reply_handle() as the alarm callback. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Suggested-by: Thomas Monjalon <thomas@monjalon.net>	2018-07-13 12:41:34 +02:00
Jianfeng Tan	d74b7748d6	eal: bring forward init of interrupt handling Next commit will make asynchronous IPC requests rely on alarm API, which in turn relies on interrupts to work. Therefore, move the EAL interrupt initialization before IPC initialization to avoid breaking IPC in the next commit. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 12:41:15 +02:00
Anatoly Burakov	26021a7150	eal/bsd: support alarm API Implement EAL alarm API support for FreeBSD. The implementation is largely identical to that of Linux version, with one key difference. The alarm API is a little Linux-centric in that it is expecting the alarm API to manage alarm timeouts without involvement of the interrupt thread. This works on Linux because in Linux, there's timerfd API which allows waiting for timer events on an fd. On FreeBSD, however, there are no timerfd's, and timer events are set up directly in kevent. There is no way to pass information from the alarm API to the interrupt thread, so we also add a little back-channel magic to get soonest alarm timeout from the alarm API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 12:40:45 +02:00
Anatoly Burakov	23150bd8d8	eal/bsd: add interrupt thread Add interrupt thread to FreeBSD. It is largely a copy-paste from Linuxapp interrupt thread, except for a few key differences: * Use kevent instead of epoll * Do not recreate the event queue on adding/removing interrupt sources, add/remove them to/from the queue on the fly instead * No support for UIO/VFIO handles Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 12:40:36 +02:00
Jianfeng Tan	4bb69970af	eal/linux: use libc malloc in interrupt handling IPC uses interrupts API internally, and memory subsystem uses IPC. Therefore, IPC should not use rte_malloc to avoid circular dependency. Switch to using regular glibc malloc in interrupts API. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 12:40:25 +02:00
Jianfeng Tan	204df26c1b	eal/linux: use libc malloc in alarm Alarm API is going to be used by IPC internally. However, because memory subsystem depends on IPC, alarm API cannot use rte_malloc as it creates a circular dependency. To avoid such chicken and egg problem, we change to use glibc malloc in the alarm API. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 12:39:51 +02:00
Anatoly Burakov	c63a42535a	vfio: fix uninitialized variable Some static analyzers complain about it, even though the value is never used if not initialized. To avoid additional false positives about potential null-pointer dereferences, also add a null-check. Bugzilla ID: 58 Fixes: ea2dc1066870 ("vfio: add multi container support") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:44:56 +02:00
Anatoly Burakov	96712b33af	eal/linux: fix uninitialized value The value is not used, but some static analyzers may give out a warning. Fix it by assigning default value of zero. Bugzilla ID: 58 Fixes: cdc242f260e7 ("eal/linux: support running as unprivileged user") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:44:43 +02:00
Anatoly Burakov	462dd3722e	eal/linux: fix invalid syntax in interrupts Parentheses were missing. It worked because macro is enclosed in parentheses, so syntax was valid after macro expansion. Bugzilla ID: 58 Fixes: 0a45657a6794 ("pci: rework interrupt handling") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:44:17 +02:00
Anatoly Burakov	e4348122a4	eal: add option to limit memory allocation on sockets Previously, it was possible to limit maximum amount of memory allowed for allocation by creating validator callbacks. Although a powerful tool, it's a bit of a hassle and requires modifying the application for it to work with DPDK example applications. Fix this by adding a new parameter "--socket-limit", with syntax similar to "--socket-mem", which would set per-socket memory allocation limits, and set up a default validator callback to deny all allocations above the limit. This option is incompatible with legacy mode, as validator callbacks are not supported there. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:44:15 +02:00
Anatoly Burakov	0b82bd7b24	memzone: improve zero-length reserve Currently, reserving zero-length memzones is done by looking at malloc statistics, and reserving biggest sized element found in those statistics. This has two issues. First, there is a race condition. The heap is unlocked between the time we check stats, and the time we reserve malloc element for memzone. This may lead to inability to reserve the memzone we wanted to reserve, because another allocation might have taken place and biggest sized element may no longer be available. Second, the size returned by malloc statistics does not include any alignment information, which is worked around by being conservative and subtracting alignment length from the final result. This leads to fragmentation and reserving memzones that could have been bigger but aren't. Fix all of this by using earlier-introduced operation to reserve biggest possible malloc element. This, however, comes with a trade-off, because we can only lock one heap at a time. So, if we check the first available heap and find any element at all, that element will be considered "the biggest", even though other heaps might have bigger elements. We cannot know what other heaps have before we try and allocate it, and it is not a good idea to lock all of the heaps at the same time, so, we will just document this limitation and encourage users to reserve memzones with socket id properly set. Also, fixup unit tests to account for the new behavior. Fixes: fafcc11985a2 ("mem: rework memzone to be allocated by malloc") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:27:30 +02:00
Anatoly Burakov	68b6092bd3	malloc: allow reserving biggest element Add an internal-only function to allocate biggest element from the heap. Nominally, it supports SOCKET_ID_ANY as its socket argument, but it's essentially useless because other sockets will only be allocated from if the entire heap on current or specified socket is busy. Still, asking to reserve a biggest element will allow fixing race condition in memzone reserve that has been there for a long time. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Remy Horton <remy.horton@intel.com>	2018-07-13 11:27:27 +02:00
Anatoly Burakov	9fe6bceafd	malloc: add finding biggest free IOVA-contiguous element Adding internal-only function to find biggest free IOVA-contiguous malloc element. This is not exposed to external API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Remy Horton <remy.horton@intel.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>	2018-07-13 11:23:07 +02:00
Anatoly Burakov	e43a9f52b7	malloc: fix pad erasing Previously, when joining adjacent free elements, we were erasing trailer and header, but did not erase the padding. Fix this by accounting for padding on erase, and do not erase padding twice by adjusting data pointer and data len to not include padding. Fixes: bb372060dad4 ("malloc: make heap a doubly-linked list") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-07-13 11:21:30 +02:00
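
The truncation described in commit 7513bd68ae (devargs: fix parsing truncation when using format) can be illustrated with a minimal sketch. This is not the DPDK code: format_dev() and format_dev_truncated() are hypothetical helpers, and the "<bus>:<name>" format is an assumption chosen only for illustration. The point is that the size passed to snprintf() has to include room for the terminating NUL, otherwise the last character is dropped.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of DPDK): build "<bus>:<name>" in a fresh
 * heap buffer. The size given to snprintf() counts the terminating NUL. */
static char *format_dev(const char *bus, const char *name)
{
	/* With a NULL buffer and size 0, snprintf() only reports the length
	 * the output would have, excluding the NUL. */
	int len = snprintf(NULL, 0, "%s:%s", bus, name);
	if (len < 0)
		return NULL;

	char *buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	snprintf(buf, (size_t)len + 1, "%s:%s", bus, name);
	return buf;
}

/* Same helper with the off-by-one mistake: the size omits the NUL, so
 * snprintf() writes only len - 1 characters and terminates early. */
static char *format_dev_truncated(const char *bus, const char *name)
{
	int len = snprintf(NULL, 0, "%s:%s", bus, name);
	if (len < 0)
		return NULL;

	char *buf = malloc((size_t)len + 1);
	if (buf == NULL)
		return NULL;

	snprintf(buf, (size_t)len, "%s:%s", bus, name); /* one byte short */
	return buf;
}
```

Calling format_dev("pci", "0000:00:01.0") yields the full string, while format_dev_truncated() with the same arguments loses the trailing "0". The caller owns the returned buffer and should free() it.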

4609 Commits