numam-dpdk

Author	SHA1	Message	Date
Anand Rawat	82ba4416dd	build: add module definition files for Windows Updated lib/meson.build to create shared libraries on Windows. Added DEF files to list the exports for the eal and kvargs libraries. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Anand Rawat <anand.rawat@intel.com> Reviewed-by: Pallavi Kadam <pallavi.kadam@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>	2019-04-03 01:21:31 +02:00
Anand Rawat	58836e93f5	eal/windows: add wrappers for string functions Updated rte_common.h to include rte_os.h to contain OS specific macros and functions. Updated rte_string_fns.h to include rte_common.h for rte_os.h Signed-off-by: Anand Rawat <anand.rawat@intel.com> Reviewed-by: Pallavi Kadam <pallavi.kadam@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>	2019-04-03 01:21:15 +02:00
Anand Rawat	428eb983f5	eal: add OS specific header file Added rte_os.h files to support OS specific functionality. Updated build system to contain OS headers in the include path. Signed-off-by: Anand Rawat <anand.rawat@intel.com> Reviewed-by: Pallavi Kadam <pallavi.kadam@intel.com> Acked-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-04-03 01:11:56 +02:00
Anand Rawat	98edcbb5ab	eal/windows: introduce Windows support Added initial stub source files and required meson changes for Windows support. kernel/windows/meson is a stub file added to support Windows specific source in future releases. Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com> Signed-off-by: Anand Rawat <anand.rawat@intel.com> Reviewed-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Reviewed-by: Ranjit Menon <ranjit.menon@intel.com> Acked-by: Harini Ramakrishnan <harini.ramakrishnan@microsoft.com>	2019-04-03 01:06:01 +02:00
Thomas Monjalon	3c45889189	eal: remove exec-env directory Only one header file (rte_kni_common.h) was in the sub-directory include/exec-env/ This file was installed in a sub-directory of the same name in the makefile-based build. Source and install directories are moved as below: lib/librte_eal/linux/eal/include/exec-env/ -> lib/librte_eal/linux/eal/include/ build/include/exec-env/ -> build/include/ The consequence is to have a file hierarchy a bit more flat. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: David Marchand <david.marchand@redhat.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-04-02 21:49:35 +02:00
Andrew Rybchenko	dfc6b2fd8d	mbuf: remove Intel offload checks from generic API rte_validate_tx_offload() is used in Tx prepare callbacks (RTE_LIBRTE_ETHDEV_DEBUG only) to check Tx offloads consistency. Requirement that packet headers should not be fragmented is not documented and unclear where it comes from except rte_net_intel_cksum_prepare() functions which relies on it. It could be NIC vendor specific driver or hardware limitation, but, if so, it should be documented and checked in corresponding Tx prepare callbacks. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2019-04-02 16:42:52 +02:00
Anatoly Burakov	1e3380a2f4	mem: do not use lockfiles for single file segments mode Due to internal glibc limitations [1], DPDK may exhaust internal file descriptor limits when using smaller page sizes, which results in inability to use system calls such as select() by user applications. Single file segments option stores lock files per page to ensure that pages are deleted when there are no more users, however this is not necessary because the processes will be holding onto the pages anyway because of mmap(). Thus, removing pages from the filesystem is safe even though they may be used by some other secondary process. As a result, single file segments mode no longer stores inordinate amounts of segment fd's, and the above issue with fd limits is solved. However, this will not work for legacy mem mode. For that, simply document that using bigger page sizes is the only option. [1] https://mails.dpdk.org/archives/dev/2019-February/124386.html Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-04-02 16:07:25 +02:00
Anatoly Burakov	848cbff836	mem: refactor segment resizing function Currently, segment resizing code sits in one giant function which handles both in-memory and regular modes. Split them up into individual functions. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-04-02 16:07:13 +02:00
Darek Stojaczyk	ea4e3ab7bd	eal: initialize alarms early On Linux, we currently initialize rte_alarms after starting to listen for IPC hotplug requests, which gives us a data race window. Upon receiving such hotplug request we always try to set an alarm and this obviously doesn't work if the alarms weren't initialized yet. To fix it, we initialize alarms before starting to listen for IPC hotplug messages. Specifically, we move rte_eal_alarm_init() right after rte_eal_intr_init() as it makes some sense to keep those two close to each other. We update the BSD code as well to keep the initialization order the same in both EAL implementations. Fixes: `244d513071` ("eal: enable hotplug on multi-process") Cc: stable@dpdk.org Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>	2019-04-02 15:00:26 +02:00
Pavan Nikhilesh	e840cb3c2a	eal: increase max number of interrupt vectors MSI-X permits a device to allocate up to 2048 interrupts as per PCIe spec. Increase the max number of vectors to a reasonable value of 512. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2019-04-02 02:59:04 +02:00
Bruce Richardson	c12cefa379	telemetry: fix mapping of statistics If we have two NIC ports which have a different set of NIC stats we can end up having two different stats registered with xstats with the same name. [Since the stats are updated in bulk as a contiguous set, the second driver re-using the registration of the first is not possible.] This causes issues with the invalid stat for one driver being found due to a lookup by name which is unnecessary. Instead of getting stat names involved do the lookup by ID instead. Fixes: `1b756087db` ("telemetry: add parser for client socket messages") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2019-04-02 02:30:53 +02:00
David Hunt	b60fd5f8b1	power: add bit for high frequency cores This patch adds a new bit in the capabilities mask that's returned by rte_power_get_capabilities(), allowing application to query which cores have the higher frequencies, and can then pin the workloads accordingly. Returned Bits: 0 - Turbo Boost enabled 1 - Higher core base_frequency Signed-off-by: Liang Ma <liang.j.ma@intel.com> Signed-off-by: David Hunt <david.hunt@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-04-02 02:22:08 +02:00
David Hunt	08a710642d	power: fix governor storage to trim newlines Currently the Power Library stores the governor name with an embedded newline read from the scaling_governor sysfs file. This patch strips it out. Fixes: `445c6528b5` ("power: common interface for guest and host") Cc: stable@dpdk.org Signed-off-by: David Hunt <david.hunt@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-04-01 22:23:44 +02:00
Nemanja Marjanovic	a2fe4c1b38	net: add PPPoE ethertypes Add PPPoE ethertypes in to rte_ether.h. Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-03-29 20:55:25 +01:00
Nemanja Marjanovic	8692f6591e	pipeline: support QinQ PPPoE encapsulation Add support of QinQ PPPoE packet encapsulation action. Signed-off-by: Nemanja Marjanovic <nemanja.marjanovic@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-03-29 20:52:09 +01:00
Eelco Chaudron	ebe3a76991	meter: fix divide by zero for RFC4115 RFC 4115 allows a meter with either cir and/or eir configured. When only one is configured a divide by zero would occur. Fixes: `655796d2b5` ("meter: support RFC4115 trTCM") Signed-off-by: Eelco Chaudron <echaudro@redhat.com>	2019-03-29 20:51:25 +01:00
Stephen Hemminger	ad97ceece1	ethdev: add min/max MTU to device info This addresses the usability issue raised by OVS at DPDK Userspace summit. It adds general min/max MTU into device info. For compatibility, and to save space, it fits in a hole in existing structure. The initial version sets max MTU to normal Ethernet, it is up to PMD to set larger value if it supports Jumbo frames. Also remove the deprecation notice introduced in 18.11 regarding this change and bump ethdev ABI version. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-29 18:57:42 +01:00
Fan Zhang	bc5560c15e	vhost/crypto: fix parens Coverity issue: 277214, 277220, 277233, 277236 Fixes: `cd1e8f03ab` ("vhost/crypto: fix packet copy in chaining mode") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-29 17:25:32 +01:00
Rami Rosen	a15b7a0e53	ethdev: fix a typo This patch fixes a trivial typo in rte_ethdev.h. retieve=>retrieve Fixes: `80a1deb4c7` ("ethdev: add API to retrieve queue information") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-29 17:25:31 +01:00
Natanael Copa	c2d82896ac	eal/linux: remove thread ID from debug message There is no guarantee that pthread_self() returns the thread ID or that pthread_t is an integer. The thread ID is not that useful so simply remove it. This fixes the following warning when building with musl libc: lib/librte_eal/linuxapp/eal/eal_dev.c: In function 'sigbus_handler': lib/librte_eal/linuxapp/eal/eal_dev.c:70:3: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] (int)pthread_self(), info->si_addr); ^ Fixes: `0fc54536b1` ("eal: add failure handling for hot-unplug") Cc: stable@dpdk.org Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>	2019-03-31 01:01:28 +01:00
Shahaf Shuler	c33a675b62	bus: introduce device level DMA memory mapping The DPDK APIs expose 3 different modes to work with memory used for DMA: 1. Use the DPDK owned memory (backed by the DPDK provided hugepages). This memory is allocated by the DPDK libraries, included in the DPDK memory system (memseg lists) and automatically DMA mapped by the DPDK layers. 2. Use memory allocated by the user and registered to the DPDK memory system. Upon registration of memory, the DPDK layers will DMA map it to all needed devices. After registration, allocation of this memory will be done with rte_malloc APIs. 3. Use memory allocated by the user and not registered to the DPDK memory system. This is for users who want to have tight control over this memory (e.g. avoid the rte_malloc header). The user should create the memory, register it through the rte_extmem_register API, and call the DMA map function in order to register such memory to the different devices. The scope of this patch focuses on #3 above. Currently the only way to map external memory is through VFIO (rte_vfio_dma_map). While VFIO is common, there are other vendors which use different ways to map memory (e.g. Mellanox and NXP). The work in this patch moves the DMA mapping to vendor agnostic APIs. Device level DMA map and unmap APIs were added. Those APIs are currently implemented only for PCI devices. For PCI bus devices, the pci driver can expose its own map and unmap functions to be used for the mapping. In case the driver doesn't provide any, the memory will be mapped, if possible, to IOMMU through VFIO APIs. Application usage with those APIs is quite simple: allocate memory; call rte_extmem_register on the memory chunk; take a device, and query its rte_device; call the device specific mapping function for this device. Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap APIs, leaving the rte device APIs as the preferred option for the user. 
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-03-30 16:48:56 +01:00
Shahaf Shuler	0cbce3a167	vfio: skip DMA map failure if already mapped Currently the vfio DMA map function will fail in case the same memory segment is mapped twice. This is too strict, as it is not an error to map the same memory twice. Instead, use the kernel return value to detect such a state and have the DMA function return success. For type1 mapping the kernel driver returns EEXIST. For spapr mapping EBUSY is returned since kernel 4.10. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-03-30 16:48:55 +01:00
Shahaf Shuler	4106d89a18	vfio: allow DMA map to the default container Enable users the option to call rte_vfio_dma_map with request to map to the default vfio fd. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2019-03-30 16:47:54 +01:00
Anatoly Burakov	23d5455517	mem: warn user when running without NUMA support Running in non-legacy mode on a NUMA-enabled system without libnuma is unsupported, so explicitly print out a warning when trying to do so. Running in legacy mode without libnuma is still supported whether or not we are running with libnuma support enabled, so also fix init to allow that scenario. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-30 00:13:04 +01:00
Kevin Traynor	e1e4dafbc7	power: fix frequency list buffer validation The frequency list buffer was already validated in power_acpi_cpufreq_freqs(), so the newly added check was redundant. To keep consistency with power_pstate_cpufreq_freqs(), remove the original check and update the log message. Fixes: `2e6ccdb4e0` ("power: fix frequency list to handle null buffer") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com>	2019-03-29 14:58:27 +01:00
Anatoly Burakov	3660216ef1	malloc: fix IPC message initialization The memset size for an IPC message is set incorrectly. Fix it to cover the entire IPC message. Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-29 12:55:07 +01:00
Anatoly Burakov	b8a86c83e0	fbarray: fix init unlock without lock Certain failure paths of rte_fbarray_init() will unlock the mem area lock without locking it first. Fix this by properly handling the failures. Fixes: `5b61c62cfd` ("fbarray: add internal tailq for mapped areas") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-29 12:49:35 +01:00
Darek Stojaczyk	5a98bc5e83	fbarray: fix attach deadlock rte_fbarray_attach() currently locks its internal spinlock, but never releases it. Secondary processes won't even start if there is more than one fbarray to be attached to - the second rte_fbarray_attach() would be just stuck. Fix it by releasing the lock at the end of rte_fbarray_attach(). I believe this was the original intention. Fixes: `5b61c62cfd` ("fbarray: add internal tailq for mapped areas") Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-29 12:49:35 +01:00
Anatoly Burakov	1fd3bcf3f9	vfio: document multiprocess limitation for container API Currently, there is no support for sharing custom VFIO containers between multiple processes, but it is not documented. Document this limitation. Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-29 00:07:16 +01:00
Thomas Monjalon	3a1a885e03	eal: remove redundant atomic API description Atomic functions are described in doxygen of the file lib/librte_eal/common/include/generic/rte_atomic.h The copies in arch-specific files are redundant and confuse readers about the genericity of the API. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-03-28 23:52:53 +01:00
Dekel Peled	8015c5593a	eal/ppc: fix global memory barrier From previous patch description: "to improve performance on PPC64, use light weight sync instruction instead of sync instruction." Excerpt from IBM doc [1], section "Memory barrier instructions": "The second form of the sync instruction is light-weight sync, or lwsync. This form is used to control ordering for storage accesses to system memory only. It does not create a memory barrier for accesses to device memory." This patch removes the use of lwsync, so calls to rte_wmb() and rte_rmb() will provide correct memory barrier to ensure order of accesses to system memory and device memory. [1] https://www.ibm.com/developerworks/systems/articles/powerpc.html Fixes: `d23a6bd04d` ("eal/ppc: fix memory barrier for IBM POWER") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com>	2019-03-28 23:48:28 +01:00
Michał Mirosław	a1c6b70786	mem: count overcommit hugepages as available With nr_overcommit_hugepages > 0 application may be able to allocate hugepages even when free_hugepages == 0. Take this into account when counting available hugepages. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:33:50 +01:00
Anatoly Burakov	034f1fb616	mem: attempt multiple hugepage allocations at init When requesting memory with ``-m`` or ``--socket-mem`` flags, currently the init will fail if the requested memory amount was bigger than any one memseg list, even if the total amount of available memory was sufficient. Fix this by making EAL attempt to allocate pages multiple times, until we either fulfill our memory requirements, or run out of hugepages to allocate. Bugzilla ID: 95 Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:28:58 +01:00
Anatoly Burakov	bec5625588	mem: improve best-effort allocation Previously, when using non-exact allocation, we were requesting N pages to be allocated, but allowed the memory subsystem to allocate less than requested. However, we were still expecting to see N contiguous free pages in the memseg list. This presents a problem because there is no way to try and allocate as many pages as possible, even if there aren't enough contiguous free entries in the list. To address this, use the new "find biggest" fbarray APIs when allocating a non-exact number of pages. This way, we will first check how many entries in the list are actually available, and then try to allocate up to that number. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:28:54 +01:00
Anatoly Burakov	7353ee7344	fbarray: add API to find biggest used or free chunks Currently, while there is a way to find total amount of used/free space in an fbarray, there is no way to find biggest contiguous chunk. Add such API, as well as unit tests to test this API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:28:52 +01:00
Anatoly Burakov	5b61c62cfd	fbarray: add internal tailq for mapped areas Currently, there are numerous reliability issues with fbarray, such as: - There is no way to prevent attaching to overlapping memory areas - There is no way to prevent double-detach - Failed destroy leaves fbarray in an invalid state (fbarray itself is valid, but its backing memory area is already detached) In addition, on FreeBSD, doing mmap() on a file descriptor does not keep the lock, so we also need to store the fd in order to keep the lock. This patch improves upon fbarray to address both of these issues by adding an internal tailq to track allocated areas and their respective file descriptors. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-28 23:28:50 +01:00
Nikhil Rao	db9f4430c2	service: fix parameter type for attribute The type of value parameter to rte_service_attr_get should be uint64_t *, since the attributes are of type uint64_t. Fixes: `4d55194d76` ("service: add attribute get function") Signed-off-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Gage Eads <gage.eads@intel.com> Reviewed-by: Rami Rosen <ramirose@gmail.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>	2019-03-28 21:07:48 +01:00
Ruifeng Wang	90fefe78bf	hash: optimize signature compare for Arm NEON Implemented signature compare function based on neon intrinsic. Hash bulk lookup had 3% - 6% performance gain after optimization. Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>	2019-03-28 19:54:21 +01:00
Joyce Kong	ca49b92079	ticketlock: enable generic ticketlock on all arch Let all architectures use generic ticketlock implementation. Signed-off-by: Joyce Kong <joyce.kong@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 15:00:11 +01:00
Joyce Kong	184104fc61	ticketlock: introduce fair ticket based locking The spinlock implementation is unfair, some threads may take locks aggressively while leaving the other threads starving for long time. This patch introduces ticketlock which gives each waiting thread a ticket and they can take the lock one by one. First come, first serviced. This avoids starvation for too long time and is more predictable. Suggested-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 14:58:49 +01:00
Joyce Kong	e8af2f1f11	rwlock: reimplement with atomic builtins The __sync builtin based implementation generates full memory barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way barriers. Here is the assembly code of __sync_compare_and_swap builtin. __sync_bool_compare_and_swap(dst, exp, src); 0x000000000090f1b0 <+16>: e0 07 40 f9 ldr x0, [sp, #8] 0x000000000090f1b4 <+20>: e1 0f 40 79 ldrh w1, [sp, #6] 0x000000000090f1b8 <+24>: e2 0b 40 79 ldrh w2, [sp, #4] 0x000000000090f1bc <+28>: 21 3c 00 12 and w1, w1, #0xffff 0x000000000090f1c0 <+32>: 03 7c 5f 48 ldxrh w3, [x0] 0x000000000090f1c4 <+36>: 7f 00 01 6b cmp w3, w1 0x000000000090f1c8 <+40>: 61 00 00 54 b.ne 0x90f1d4 <rte_atomic16_cmpset+52> // b.any 0x000000000090f1cc <+44>: 02 fc 04 48 stlxrh w4, w2, [x0] 0x000000000090f1d0 <+48>: 84 ff ff 35 cbnz w4, 0x90f1c0 <rte_atomic16_cmpset+32> 0x000000000090f1d4 <+52>: bf 3b 03 d5 dmb ish 0x000000000090f1d8 <+56>: e0 17 9f 1a cset w0, eq // eq = none Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Joyce Kong <joyce.kong@arm.com> Tested-by: Joyce Kong <joyce.kong@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 11:47:05 +01:00
Gavin Hu	453d8f7366	spinlock: reimplement with atomic one-way barrier The __sync builtin based implementation generates full memory barriers ('dmb ish') on Arm platforms. Using C11 atomic builtins to generate one way barriers. Here is the assembly code of __sync_compare_and_swap builtin. __sync_bool_compare_and_swap(dst, exp, src); 0x000000000090f1b0 <+16>: e0 07 40 f9 ldr x0, [sp, #8] 0x000000000090f1b4 <+20>: e1 0f 40 79 ldrh w1, [sp, #6] 0x000000000090f1b8 <+24>: e2 0b 40 79 ldrh w2, [sp, #4] 0x000000000090f1bc <+28>: 21 3c 00 12 and w1, w1, #0xffff 0x000000000090f1c0 <+32>: 03 7c 5f 48 ldxrh w3, [x0] 0x000000000090f1c4 <+36>: 7f 00 01 6b cmp w3, w1 0x000000000090f1c8 <+40>: 61 00 00 54 b.ne 0x90f1d4 <rte_atomic16_cmpset+52> // b.any 0x000000000090f1cc <+44>: 02 fc 04 48 stlxrh w4, w2, [x0] 0x000000000090f1d0 <+48>: 84 ff ff 35 cbnz w4, 0x90f1c0 <rte_atomic16_cmpset+32> 0x000000000090f1d4 <+52>: bf 3b 03 d5 dmb ish 0x000000000090f1d8 <+56>: e0 17 9f 1a cset w0, eq // eq = none The benchmarking results showed constant improvements on all available platforms: 1. Cavium ThunderX2: 126% performance; 2. Hisilicon 1616: 30%; 3. Qualcomm Falkor: 13%; 4. Marvell ARMADA 8040 with A72 cores on macchiatobin: 3.7% Here is the example test result on TX2: $sudo ./build/app/test -l 16-27 -- i RTE>>spinlock_autotest * spinlock_autotest without this patch * Test with lock on 12 cores... Core [16] Cost Time = 53886 us Core [17] Cost Time = 53605 us Core [18] Cost Time = 53163 us Core [19] Cost Time = 49419 us Core [20] Cost Time = 34317 us Core [21] Cost Time = 53408 us Core [22] Cost Time = 53970 us Core [23] Cost Time = 53930 us Core [24] Cost Time = 53283 us Core [25] Cost Time = 51504 us Core [26] Cost Time = 50718 us Core [27] Cost Time = 51730 us Total Cost Time = 612933 us * spinlock_autotest with this patch * Test with lock on 12 cores... 
Core [16] Cost Time = 18808 us Core [17] Cost Time = 29497 us Core [18] Cost Time = 29132 us Core [19] Cost Time = 26150 us Core [20] Cost Time = 21892 us Core [21] Cost Time = 24377 us Core [22] Cost Time = 27211 us Core [23] Cost Time = 11070 us Core [24] Cost Time = 29802 us Core [25] Cost Time = 15793 us Core [26] Cost Time = 7474 us Core [27] Cost Time = 29550 us Total Cost Time = 270756 us In the tests on ThunderX2, with more cores contending, the performance gain was even higher, indicating the __atomic implementation scales up better than __sync. Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 09:19:39 +01:00
Gavin Hu	85cffb2ecc	ring: enforce reading tail before slots In weak memory models, like arm64, reading the prod.tail may get reordered after reading the ring slots, which corrupts the ring and stale data is observed. This issue was reported by NXP on 8-A72 DPAA2 board. The problem is most likely caused by missing the acquire semantics when reading prod.tail (in SC dequeue) which makes it possible to read a stale value from the ring slots. For MP (and MC) case, rte_atomic32_cmpset() already provides the required ordering. For SP case, the control dependency between the if-statement (which depends on the read of r->cons.tail) and the later stores to the ring slots makes RMB unnecessary. About the control dependency, read more at: https://www.cl.cam.ac.uk/~pes20/ppc-supplemental/test7.pdf This patch adds the required read barrier to prevent reads of the ring slots from being reordered before the read of prod.tail for the SC case. Fixes: `c9fb3c6289` ("ring: move code in a new header file") Cc: stable@dpdk.org Signed-off-by: Gavin Hu <gavin.hu@arm.com> Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com> Tested-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Nipun Gupta <nipun.gupta@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-28 01:22:04 +01:00
Pavan Nikhilesh	5cbd14b3e5	eal: roundup TSC frequency when estimating When estimating the TSC frequency using sleep/gettime, round it up to the nearest multiple of 10 MHz for more accuracy. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Keith Wiles <keith.wiles@intel.com>	2019-03-28 00:45:16 +01:00
Pavan Nikhilesh	f56e551485	eal: add macro to align value to the nearest multiple Add macro to align value to the nearest multiple of the given value, resultant value might be greater than or less than the first parameter whichever difference is the lowest. Update unit test to include the new macro. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>	2019-03-28 00:45:00 +01:00
Jerin Jacob	55878866eb	use appropriate EAL macro for constructors Use eal's RTE_INIT abstraction for defining constructors. Signed-off-by: Jerin Jacob <jerinj@marvell.com> Reviewed-by: David Marchand <david.marchand@redhat.com>	2019-03-27 23:10:57 +01:00
Jakub Grajciar	0c7ce182a7	eal: add pending interrupt callback unregister Use case: if a callback is used to receive a message from a socket, and the message received is disconnect/error, this callback needs to be unregistered, but it cannot be because it is still active. With this patch it is possible to mark the callback to be unregistered once the interrupt process is done with this interrupt source. Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>	2019-03-27 18:53:47 +01:00
Kevin Traynor	c0d9052afb	eal/linux: fix log levels for pagemap reading failure Commit `cdc242f260` says: For Linux kernel 4.0 and newer, the ability to obtain physical page frame numbers for unprivileged users from /proc/self/pagemap was removed. Instead, when an IOMMU is present, simply choose our own DMA addresses instead. In this case the user still sees error messages, so adjust the log levels. Later, other checks will ensure that errors are logged in the appropriate cases. Fixes: `cdc242f260` ("eal/linux: support running as unprivileged user") Cc: stable@dpdk.org Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com>	2019-03-27 14:54:40 +01:00
Anatoly Burakov	929a91e99c	malloc: fix documentation of realloc function The documentation for rte_realloc claims that the resized area will always reside on the same NUMA node. This is not actually the case - while resized area will be on the same NUMA node, if resizing the area is not possible, then the memory will be reallocated using rte_malloc(), which can allocate memory on another NUMA node, depending on which lcore rte_realloc() was called from and which NUMA nodes have memory available. Fix the API doc to match the actual code of rte_realloc(). Fixes: `af75078fec` ("first public release") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-27 12:15:04 +01:00
Stephen Hemminger	24aa4f0fba	mem: poison memory when freed DPDK malloc library allows broken programs to work because the semantics of zmalloc and malloc are the same. This patch enables a more secure model which will catch (and crash) programs that reuse memory already freed if RTE_MALLOC_DEBUG is enabled. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-27 10:53:41 +01:00
Andrius Sirvys	cd6683331d	acl: fix compiler flags with meson and AVX2 runtime When compiling the ACL library on a system without AVX2 support, the flags used to compile the AVX2-specific code for later run-time use were not based on the regular cflags for the rest of the library. This can cause errors due to symbols being missed/undefined due to incorrect flags. For example, when testing compilation on Alpine linux, we got: error: unknown type name 'cpu_set_t' due to _GNU_SOURCE not being defined in the cflags. This issue can be fixed by appending "-mavx2" to the cflags rather than replacing them with it. Fixes: `5b9656b157` ("lib: build with meson") Cc: stable@dpdk.org Signed-off-by: Andrius Sirvys <andrius.sirvys@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-27 10:38:06 +01:00
Bruce Richardson	88f591d1db	eal: remove unneeded version logic The version number in the DPDK_VERSION file will never have an offset that needs to be subtracted, so remove that logic from the version string generation. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>	2019-03-27 09:43:54 +01:00
Bruce Richardson	d320fe56bd	build: use version number from config file Since we have the version number in a separate file at the root level, we should not need to duplicate this in rte_version.h too. Best approach here is to move the macros for specifying the year/month/etc. parts from the version header file to the build config file - leaving the other utility macros for e.g. printing the version string, where they are. For "make", this is done by having a little bit of awk parse the version file and pass the results through to the preprocessor for the config generation stage. For "meson", this is done by parsing the version and adding it to the standard dpdk_conf object. In both cases, we need to append a large number - in this case "99", previously 16 in original code - to the version number when we want to do version number comparisons. Without this, the release version e.g. 19.05.0 will compare as less than its RCs e.g. 19.05.0-rc4. With it, the comparison is correct as "19.05.0.99 > 19.05.0-rc4.99". Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-27 09:43:47 +01:00
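The comparison problem described in d320fe56bd can be sketched as follows. This is an illustration only: it assumes the version is packed one byte per field and that a release candidate carries its rc number in the last field, which the message does not spell out. `VERSION_NUM`, `FINAL_RELEASE_SUFFIX`, `version_final()` and `version_rc()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packing: one byte each for year, month, minor, release. */
#define VERSION_NUM(year, month, minor, release) \
    ((uint32_t)(((year) << 24) | ((month) << 16) | ((minor) << 8) | (release)))

/* Large value used in the last field of a final release. */
#define FINAL_RELEASE_SUFFIX 99

/* Packed number for a final release such as 19.05.0. */
static uint32_t version_final(int year, int month, int minor)
{
    assert(year >= 0 && year < 128);
    assert(month >= 0 && month < 256 && minor >= 0 && minor < 256);
    return VERSION_NUM(year, month, minor, FINAL_RELEASE_SUFFIX);
}

/* Packed number for a release candidate such as 19.05.0-rc4 (assumed encoding). */
static uint32_t version_rc(int year, int month, int minor, int rc)
{
    assert(year >= 0 && year < 128);
    assert(month >= 0 && month < 256 && minor >= 0 && minor < 256);
    assert(rc > 0 && rc < FINAL_RELEASE_SUFFIX);
    return VERSION_NUM(year, month, minor, rc);
}
```

With a plain 0 in the last field, the final release would sort below every one of its release candidates. The large suffix places it above them.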
Fiona Trahe	866bc6742c	compressdev: add flag to specify where processing is done A new device feature flag, RTE_COMPDEV_FF_OP_DONE_IN_DEQUEUE is added. A PMD should set this if the bulk of the processing is done during the dequeue. It should leave it cleared if the bulk of the processing is done during the enqueue (default). Applications can use this as a hint for tuning. Signed-off-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-03-22 15:54:24 +01:00
Fan Zhang	51acc16b51	ipsec: support 3DES-CBC This patch adds triple-des CBC mode cipher algorithm to ipsec library. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 15:54:24 +01:00
Fan Zhang	3975d5cb1d	ipsec: support AES-CTR This patch adds AES-CTR cipher algorithm support to ipsec library. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>	2019-03-22 15:54:24 +01:00
Damian Nowak	a76e869f66	cryptodev: remove XTS comment duplication This patch removes duplicated text about AES-XTS mode. Signed-off-by: Damian Nowak <damianx.nowak@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 15:54:24 +01:00
Anoob Joseph	2382aa8c8f	cryptodev: fix driver name comparison The string compare to the length of driver name might give false positives when there are drivers with similar names (one being the subset of another). Following is such a naming which could result in false positive. 1. crypto_driver 2. crypto_driver1 When strncmp with len = strlen("crypto_driver") is done, it could give a false positive when compared against "crypto_driver1". For such cases, 'strlen + 1' is done, so that the NULL termination also would be considered for the comparison. Fixes: `d11b0f30df` ("cryptodev: introduce API and framework for crypto devices") Cc: stable@dpdk.org Signed-off-by: Ankur Dwivedi <adwivedi@marvell.com> Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 14:27:46 +01:00
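The false positive described in 2382aa8c8f can be reproduced with the standard `strncmp()` alone. The sketch below is not the cryptodev code; the two helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Prefix-only comparison: only strlen(wanted) bytes are compared, so
 * "crypto_driver" also matches "crypto_driver1".
 */
static bool name_matches_prefix_only(const char *candidate, const char *wanted)
{
    assert(candidate != NULL && wanted != NULL);
    return strncmp(candidate, wanted, strlen(wanted)) == 0;
}

/*
 * Comparing strlen(wanted) + 1 bytes includes the terminating NUL of
 * 'wanted', so a longer candidate with the same prefix no longer matches.
 */
static bool name_matches_exact(const char *candidate, const char *wanted)
{
    assert(candidate != NULL && wanted != NULL);
    return strncmp(candidate, wanted, strlen(wanted) + 1) == 0;
}
```

Because `strncmp()` stops at the first NUL in either string, the extra byte never reads past the end of a shorter candidate.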
Arek Kusztal	83a6cb03bc	cryptodev: add result field to mod exp and inv This commit adds result field to be used when modular exponentiation or modular multiplicative inverse operation is used Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Shally Verma <shallyv@marvell.com>	2019-03-22 14:27:46 +01:00
Konstantin Ananyev	27e71c7fdc	cryptodev: restore crypto op alignment and layout In 18.08 new cache-aligned structure rte_crypto_asym_op was introduced. As it also was included into rte_crypto_op, it caused implicit change in rte_crypto_op layout and alignment: now rte_crypto_op is cache-line aligned and has a hole of 40/104 bytes between phys_addr and sym/asym op. It looks like unintended ABI breakage, plus such change can cause negative performance effects: - now status and sym[0].m_src lie on different cache-lines, so post-process code would need extra cache-line read. - new alignment increases the space requirements and cache-line reads/updates for structures that contain rte_crypto_op inside. As there seems to be no actual need to have rte_crypto_asym_op cache-line aligned, and rte_crypto_asym_op is not intended to be used on its own - the simplest fix is just to remove cache-line alignment for it. As the immediate positive effect: on IA ipsec-secgw performance increased by 5-10% (depending on the crypto-dev and algo used). My guess is that on machines with 128B cache-line and lookaside-protocol capable crypto devices the impact will be even more noticeable. Fixes: `26008aaed1` ("cryptodev: add asymmetric xform and op definitions") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-22 14:27:46 +01:00
Stephen Hemminger	0366137722	ethdev: check for invalid device name Do not allow creating an Ethernet device with a name over the allowed maximum (or zero length). This is safer than silently truncating which is what happens now. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Ali Alnubani <alialnu@mellanox.com>	2019-03-21 19:27:51 +01:00
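The check described in 0366137722 (reject empty or over-long names instead of truncating them) might look roughly like this. It is a sketch under assumptions: `DEV_NAME_MAX_LEN`, `check_dev_name()` and the chosen error codes are hypothetical and not taken from ethdev.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size of the name buffer, including the terminating NUL. */
#define DEV_NAME_MAX_LEN 64

/*
 * Return 0 if 'name' is non-empty and fits the buffer together with its
 * terminating NUL, or a negative errno value otherwise.
 */
static int check_dev_name(const char *name)
{
    assert(name != NULL);

    /* Look for the terminator within the first DEV_NAME_MAX_LEN bytes. */
    const char *end = memchr(name, '\0', DEV_NAME_MAX_LEN);

    if (end == NULL)
        return -ENAMETOOLONG; /* no NUL found: name would be truncated */
    if (end == name)
        return -EINVAL;       /* zero-length name */
    return 0;
}
```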
Andrew Rybchenko	b6950cc79d	ethdev: highlight that all-multicast is retained on restart All-multicast is a part of receive mode configuration and it is better to mention explicitly that it is retained across restart. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Andrew Rybchenko	8010be2a12	ethdev: advertise default MAC as retained on restart The documentation says MAC addresses array is retained and it is logical to assume that default MAC address is retained as well. Also some PMDs do not allow to change the default MAC in running state (see RTE_ETH_DEV_NOLIVE_MAC_ADDR). Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Andrew Rybchenko	189f554647	ethdev: advertise MTU as retained across stop/start Changing MTU in running state may return -EBUSY saying that MTU cannot be changed when the port is running. It assumes that changes may be done in stopped and started (but some PMDs may reject it) state and it is logical to require that changes done in any of these states are retained. Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	3e0396166b	vhost: support requests only handled by external backend External backends may have specific requests to handle, and so we don't want the vhost-user lib to handle these requests as errors. This patch also changes the experimental API by introducing RTE_VHOST_MSG_RESULT_NOT_HANDLED so that vhost-user lib can report an error if a message is handled neither by the vhost-user library nor by the external backend. The logic changes a bit so that if the callback returns with ERR, OK or REPLY, it is considered the message is handled by the external backend so it won't be handled by the vhost-user library. It is still possible for an external backend to listen to requests that have to be handled by the vhost-user library like SET_MEM_TABLE, but the callback has to return NOT_HANDLED in that case. Vhost-crypto backend is also adapted to this API change. Suggested-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	9401b80327	vhost: add API to set protocol features flags rte_vhost_driver_set_protocol_features API is to be used by external backends to advertise vhost-user protocol features it supports. It has to be called after rte_vhost_driver_register() and before rte_vhost_driver_start(). Example of usage to advertise VHOST_USER_PROTOCOL_F_FOOBAR protocol feature: const char *path = "/tmp/vhost-user"; uint64_t protocol_features; rte_vhost_driver_register(path, 0); rte_vhost_driver_get_protocol_features(path, &protocol_features); protocol_features |= VHOST_USER_PROTOCOL_F_FOOBAR; rte_vhost_driver_set_protocol_features(path, protocol_features); rte_vhost_driver_start(path); Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Jiayu Hu	2f706027c8	vhost: fix interrupt suppression for the split ring The VIRTIO_RING_F_EVENT_IDX feature of split ring might be broken, as the value of signalled_used is invalid after live migration, start up and virtio driver reload. This patch fixes it by using signalled_used_valid. In addition, this patch makes the VIRTIO_RING_F_EVENT_IDX implementation of split ring match kernel backend to suppress more interrupts. Fixes: `e37ff95440` ("vhost: support virtqueue interrupt/notification suppression") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Maxime Coquelin	11d5253a3e	vhost: prevent disabled rings to be processed with zero-copy The vhost-user spec says that once the vring is disabled, the client has to stop processing it. But it can happen when dequeue zero-copy is enabled if outstanding descriptors buffers are still being processed by an external NIC or another guest. The fix consists in draining the zmbufs list to ensure no more descriptors buffers are in the wild. Note that this fix is only working in the case REPLY_ACK protocol feature is enabled, which is not the case by default for now (it is only enabled when IOMMU feature is enabled in the vhost library). Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-20 18:15:42 +01:00
Rami Rosen	7065dff68e	ethdev: fix method name in doxygen comment This patch fixes rte_ethdev header file to use the correct method name, namely to use rte_eth_dev_info_get() instead of rte_eth_dev_infos_get(). Fixes: `a4996bd89c` ("ethdev: new Rx/Tx offloads API") Fixes: `4f5701f28b` ("examples: fix RSS hash function configuration") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>	2019-03-20 18:15:42 +01:00
Stephen Hemminger	ddf5a9c6d7	ethdev: replace snprintf with strlcpy on init Don't need to use snprintf for simple name copy. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Rami Rosen <ramirose@gmail.com>	2019-03-20 18:15:42 +01:00
Stephen Hemminger	e97f83e452	ethdev: replace snprintf with strlcpy for owner The set_port_owner was copying a string between structures of the same type, therefore the name could never be truncated (unless source string was not null terminated). Use strlcpy which does it better. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-03-20 18:15:42 +01:00
Tomasz Jozwiak	a7cece2ead	malloc: add NUMA-aware realloc function Currently, rte_realloc will not respect original allocation's NUMA node when memory cannot be resized, and there is no NUMA-aware equivalent of rte_realloc. This patch adds such a function. The new API will ensure that reallocated memory stays on requested NUMA node, as well as allow moving allocated memory to a different NUMA node. Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2019-03-23 16:54:50 +01:00
Pavan Nikhilesh	1534cc6ab1	doc: add notes about eventdev producer/consumer dependency EventDev, i.e. the consumer, needs to be started before starting the event producers. Update documentation of EventDev and EventDev adapters. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com> Reviewed-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>	2019-03-15 06:46:50 +01:00
Bruce Richardson	218c4e68c1	mk: use linux and freebsd in config names Rather than using linuxapp and bsdapp everywhere, we can change things to use the, more readable, terms "linux" and "freebsd" in our build configs. Rather than renaming the configs we can just duplicate the existing ones with the new names using symlinks, and use the new names exclusively internally. ["make showconfigs" also only shows the new names to keep the list short] The result is that backward compatibility is kept fully but any new builds or development can be done using the newer names, i.e. both "make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc" work. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 23:05:06 +01:00
Bruce Richardson	5fbc1d498f	build/freebsd: rename macro BSDAPP to FREEBSD Rename the macro and all instances in DPDK code, but keep a copy of the old macro defined for legacy code linking against DPDK Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 23:01:14 +01:00
Bruce Richardson	742bde12f3	build/linux: rename macro from LINUXAPP to LINUX Rename the macro to make things shorter and more comprehensible. For both meson and make builds, keep the old macro around for backward compatibility. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:31:22 +01:00
Bruce Richardson	91d7846ce6	eal/linux: rename linuxapp to linux The term "linuxapp" is a legacy one, but just calling the subdirectory "linux" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:31:13 +01:00
Bruce Richardson	25c99fbd68	eal/bsd: rename bsdapp to freebsd The term "bsdapp" is a legacy one, but just calling the subdirectory "freebsd" is just clearer for all concerned. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2019-03-12 17:30:20 +01:00
Arek Kusztal	9f9d23d428	cryptodev: rework modexp and modinv comments This patch changes modular exponentiation and modular multiplicative inverse API comments to make it more precise. Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Shally Verma <shallyv@marvell.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-03-06 17:30:43 +01:00
David Marchand	1598c72959	eal: fix core list validation with disabled cores -l and -c options are two ways to select the cores used by DPDK. Their format differs, but the checks on the selected cores are the same. Use an intermediate array to separate the specific parsing checks from the common consistency checks. The parsing functions now concentrate on validating the passed string and do nothing more. We can report all invalid core indexes rather than only the first error. In the error log message, reporting [0, cfg->lcore_count - 1] as a valid range is then wrong when the core list is not continuous. Example on my 8 cpus laptop with core 2 and 6 disabled. echo 0 > /sys/devices/system/cpu/cpu2/online echo 0 > /sys/devices/system/cpu/cpu6/online Before: ./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048 EAL: Detected 6 lcore(s) EAL: Detected 1 NUMA nodes EAL: invalid core list, please check core numbers are in [0, 5] range ... After: ./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048 EAL: Detected 6 lcore(s) EAL: Detected 1 NUMA nodes EAL: lcore 2 unavailable EAL: lcore 6 unavailable EAL: invalid core list, please check specified cores are part of 0-1,3-5,7 ... Fixes: `d888cb8b96` ("eal: add core list input format") Fixes: `b38693b612` ("eal: fix core number validation") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-03-07 21:22:53 +01:00
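The common consistency check described in 1598c72959 (report every unavailable core, not just the first) can be sketched with plain bitmasks. This is an illustration under assumptions, not the EAL code: `MAX_LCORES` and `count_unavailable()` are hypothetical, and cores are limited to 64 so that one `uint64_t` can stand in for the intermediate array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limit so that a single 64-bit mask can hold the core set. */
#define MAX_LCORES 64

/*
 * 'requested' is what the -l or -c parser produced, 'available' is the
 * set of detected cores. Every requested but unavailable core is logged,
 * and the number of such cores is returned.
 */
static int count_unavailable(uint64_t requested, uint64_t available)
{
    int bad = 0;

    for (unsigned int core = 0; core < MAX_LCORES; core++) {
        uint64_t bit = UINT64_C(1) << core;

        if ((requested & bit) && !(available & bit)) {
            fprintf(stderr, "lcore %u unavailable\n", core);
            bad++;
        }
    }
    return bad;
}
```

For the laptop example in the message (cores 2 and 6 offline, `-l 0-7`), `count_unavailable(0xFF, 0xBB)` logs lcores 2 and 6 and returns 2.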
David Marchand	33df941d79	eal: remove dead code in core list parsing We don't need to look for trailing spaces. This is a copy/paste block from eal_parse_coremask(). Remove it and the associated comment. Fixes: `d888cb8b96` ("eal: add core list input format") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com>	2019-03-07 21:22:48 +01:00
David Marchand	c3568ea376	eal: restrict control threads to startup CPU affinity Spawning the ctrl threads on anything that is not part of the eal coremask is not that polite to the rest of the system, especially when you took good care to pin your processes on cpu resources with tools like taskset (linux) / cpuset (freebsd). Rather than introduce yet another eal options to control on which cpu those ctrl threads are created, let's take the startup cpu affinity as a reference and remove the eal coremask from it. If no cpu is left, then we default to the master core. The cpuset is computed once at init before the original cpu affinity is lost. Introduced a RTE_CPU_AND macro to abstract the differences between linux and freebsd respective macros. Examples in a 4 cores FreeBSD vm: $ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1057 PID TID COMM TDNAME CPU CSID CPU MASK 1057 100131 testpmd - 2 1 2 1057 100140 testpmd eal-intr-thread 1 1 0-1 1057 100141 testpmd rte_mp_handle 1 1 0-1 1057 100142 testpmd lcore-slave-3 3 1 3 $ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1061 PID TID COMM TDNAME CPU CSID CPU MASK 1061 100131 testpmd - 2 2 2 1061 100144 testpmd eal-intr-thread 1 2 1 1061 100145 testpmd rte_mp_handle 1 2 1 1061 100147 testpmd lcore-slave-3 3 2 3 $ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \ -- -i --total-num-mbufs=2048 $ procstat -S 1065 PID TID COMM TDNAME CPU CSID CPU MASK 1065 100131 testpmd - 2 2 2 1065 100148 testpmd eal-intr-thread 2 2 2 1065 100149 testpmd rte_mp_handle 2 2 2 1065 100150 testpmd lcore-slave-3 3 2 3 Fixes: `d651ee4919` ("eal: set affinity for control threads") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-03-07 19:21:28 +01:00
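The cpuset rule described in c3568ea376 (startup affinity minus the EAL cores, falling back to the master core) can be sketched with bitmasks. This is a simplified illustration: the real code works on `cpu_set_t`, and `ctrl_thread_cpus()` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the CPUs that control threads may run on:
 *   startup affinity, with the EAL lcores removed.
 * If nothing is left, fall back to the master core.
 * Bit N of each mask stands for CPU N.
 */
static uint64_t ctrl_thread_cpus(uint64_t startup_affinity,
                                 uint64_t eal_cores,
                                 unsigned int master_core)
{
    assert(master_core < 64);

    uint64_t cpus = startup_affinity & ~eal_cores;

    if (cpus == 0)
        cpus = UINT64_C(1) << master_core;
    return cpus;
}
```

Applied to the three FreeBSD runs in the message (`-l 2,3` on a 4-core VM), the sketch yields CPUs 0-1 with no `cpuset` restriction, CPU 1 with `cpuset -l 1,2,3`, and the master core 2 with `cpuset -l 2,3`, matching the CPU masks shown for `eal-intr-thread`.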
David Marchand	759b9be661	eal: fix control threads pinning pthread_setaffinity_np returns a >0 value on error. We could end up leaving the ctrl threads on the current process cpu affinity. Fixes: `d651ee4919` ("eal: set affinity for control threads") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>	2019-03-07 19:13:48 +01:00
David Marchand	b206376438	eal: fix check when retrieving current CPU affinity pthread_getaffinity_np returns a >0 value when failing. This is mainly for the sake of correctness. The only case where it could fail is when passing an incorrect cpuset size with respect to the kernel. Fixes: `2eba8d21f3` ("eal: restrict cores auto detection") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Rami Rosen <ramirose@gmail.com>	2019-03-07 16:37:14 +01:00
Stephen Hemminger	e7d798172f	eal: remove legacy PMD log macro The RTE_PMD_DEBUG_TRACE was only enabled for EVENTDEV_DEBUG and that configuration is now handled by RTE_EDEV_LOG macros. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Stephen Hemminger	b5af9abaa7	eventdev: use same log macro for all unsupported calls The driver already has RTE_EDEV_XXX log macros so use them in two more places. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Stephen Hemminger	e37aad5ed3	eal: drop unused macros for primary process check No usage in current DPDK code base. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-03-01 18:17:36 +01:00
Darek Stojaczyk	c19429844c	vhost: remove vhost-net requirements from generic APIs The rte_vhost API to put data into virtqueues operates on mbufs and hence it is strictly vhost-net specific. External backends need to implement virtqueue handling from scratch and that's just not possible without APIs to get/set vring base addresses. Those relevant APIs are there, but they have a check that prevents them from working with any non-vhost-net device. This patch removes those checks. rte_vhost_get_log_base() is not necessarily needed for external backends, as other, higher level vhost APIs for live migration are available and could be used instead. We remove the extra check from it anyway for consistency. Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	2a2904fa9c	vhost: fix potential use-after-free for memory region Reclaim outstanding zmbufs first before freeing memory regions, otherwise there could be use-after-free. Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	d767436ee5	vhost: fix potential use-after-free for zero copy mbuf Don't free the zero copy mbufs before they have been consumed, otherwise there could be use-after-free. Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:36 +01:00
Tiwei Bie	041d37b2ef	vhost: restore mbuf first when freeing zmbuf The mbufs should also be restored in free_zmbufs(). Fixes: `b0a985d1f3` ("vhost: add dequeue zero copy") Fixes: `3ebd930588` ("vhost: fix mbuf free") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-03-01 18:17:35 +01:00
Luca Boccassi	a9933bb1de	build: improve libbsd dependency handling Use dependency() instead of manual append to ldflags. Move libbsd inclusion to librte_eal, so that all other libraries and PMDs will inherit it. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:28:03 +01:00
Luca Boccassi	8990c8c0d0	build: reorder cmdline library after EAL Most libraries and PMDs depend on eal, and eal depends only on kvargs, so reorder the list in Meson to reflect this and take advantage of this dependency chain. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:27:58 +01:00
Luca Boccassi	e30b4e566f	build: improve dependency handling Whenever possible (if the library ships a pkg-config file) use meson's dependency() function to look for it, as it will automatically add it to the Requires.private list if needed, to allow for static builds to succeed for reverse dependencies of DPDK. Otherwise the recursive dependencies are not parsed, and users doing static builds have to resolve them manually by themselves. When using this API avoid additional checks that are superfluous and take extra time, and avoid adding the linker flag manually which causes it to be duplicated. Signed-off-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Bruce Richardson <bruce.richardson@intel.com>	2019-02-27 12:13:54 +01:00
Bruce Richardson	d23e141ffa	build: set RTE_ARCH_64 based on pointer size Rather than relying on the target machine architecture, use the size of a pointer from the compiler to determine if we are 64-bits or not. This allows correct behaviour when you pass -m32 as a compile option. It also allows us to use this value repeatedly throughout the repo rather than continually testing for the sizeof(void*). Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Luca Boccassi <bluca@debian.org> Acked-by: Luca Boccassi <bluca@debian.org>	2019-02-26 18:34:28 +01:00
Bruce Richardson	b543d1a715	compat: merge compat library into EAL Since compat library is only a single header, we can easily move it into the EAL common headers instead of tracking it separately. The downside of this is that it becomes a little more difficult to have any libs that are built before EAL depend on it. Thankfully, this is not a major problem as the only library which uses rte_compat.h and is built before EAL (kvargs) already has the path to the compat.h header file explicitly called out as an include path. However, to ensure that we don't hit problems later with this, we can add EAL common headers folder to the global include list in the meson build which means that all common headers can be safely used by all libraries, no matter what their build order. As a side-effect, this patch also fixes an issue with building on BSD using meson, due to compat lib no longer needing to be listed as a dependency. Fixes: `a8499f65a1` ("log: add missing experimental tag") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Tested-by: David Marchand <david.marchand@redhat.com> Tested-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>	2019-02-25 16:03:31 +01:00
Pallantla Poornima	7c7b756225	vhost: fix sprintf with snprintf sprintf function is not secure as it doesn't check the length of string. More secure function snprintf is used. Fixes: `d7280c9fff` ("vhost: support selective datapath") Cc: stable@dpdk.org Signed-off-by: Pallantla Poornima <pallantlax.poornima@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2019-02-22 14:39:49 +01:00
Wenjie Sun	054617fd82	vhost: fix deadlock in driver unregister In rte_vhost_driver_unregister(), the connection fd is removed from the fdset using fdset_try_del(). Call to this function may fail if the corresponding fd is in busy state, indicating that event dispatcher is executing the read or write callback on this fd. When it happens, rte_vhost_driver_unregister() keeps trying to remove the fd from the set until it is no more busy. This situation is causing a deadlock, because rte_vhost_driver_unregister() keeps trying to remove the fd from the set with vhost_user.mutex held, while the callback executed by the dispatcher, vhost_user_read_cb(), also takes this mutex at numerous places. The fix consists in releasing vhost_user.mutex between each retry in vhost_driver_unregister(). Fixes: `8b4b949144` ("vhost: fix dead lock on closing in server mode") Cc: stable@dpdk.org Signed-off-by: Wenjie Sun <findtheonlyway@gmail.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2019-02-22 14:39:49 +01:00
Thomas Monjalon	a3a4aba4b5	ethdev: remove unused variable When removing the old attach function, the racy variable for getting the last port id became unused. Fixes: `c9cce42876` ("ethdev: remove deprecated attach/detach functions") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>	2019-02-20 16:41:43 +01:00
Rami Rosen	b13baac8d5	mbuf: fix a typo This trivial patch fixes a typo in rte_mbuf.h. Fixes: `f20b50b946` ("mbuf: optimize refcnt update") Cc: stable@dpdk.org Signed-off-by: Rami Rosen <ramirose@gmail.com>	2019-02-12 14:32:01 +01:00

5276 Commits