numam-dpdk

Author	SHA1	Message	Date
Fan Zhang	b1d978fc7b	cryptodev: add opaque data field to symmetric session This patch adds an opaque data field to cryptodev symmetric session. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	5d6c73dd59	cryptodev: add reference count to session private data This patch adds a refcnt field to every session private data in the cryptodev symmetric session. The counter is used to prevent freeing symmetric session blindly before it is not cleared by every type of crypto device in use. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	9e5f5ecb5e	cryptodev: add user data size to symmetric session This patch adds a user_data_sz field to cryptodev symmetric session. The field is used to check if reading or writing the session's user data field is eligible. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	e764cd72a9	cryptodev: update symmetric session structure This patch updates the rte_cryptodev_sym_session structure for cryptodev library. The updates include a changed session private data array and an added nb_drivers field. They are used to calculate the correct session header size and ensure safe access of the session private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	0b60386ac3	cryptodev: add sym session header size function This patch adds a new API in Cryptodev Framework. The API is used to get the header size for the created symmetric Cryptodev session. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	ac5e42daca	vhost/crypto: use separate session mempools This patch uses the two session mempool approach to vhost crypto. One mempool is for session header objects, and the other is for session private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	1d6f89885e	cryptodev: add sym session mempool create This patch adds a new API "rte_cryptodev_sym_session_pool_create()" to cryptodev library. All applications are required to use this API to create sym session mempool as it adds private data and nb_drivers information to the mempool private data. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Fan Zhang	725d2a7fbf	cryptodev: change queue pair configure structure This patch changes the cryptodev queue pair configure structure to allow two mempools to be passed into the cryptodev PMD simultaneously. Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Fiona Trahe <fiona.trahe@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Akhil Goyal <akhil.goyal@nxp.com>	2019-01-10 16:57:22 +01:00
Eelco Chaudron	655796d2b5	meter: support RFC4115 trTCM This patch adds support for RFC4115 trTCM meters. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2019-01-10 00:34:09 +01:00
Thomas Monjalon	7637518249	version: 19.02-rc1 Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2018-12-23 00:21:13 +01:00
Tonghao Zhang	03b7fd7e54	sched: fix memory leak on init failure In some cases, we may create a sched port dynamically; if an error occurs during creation, memory will leak. Fixes: `de3cfa2c98` ("sched: initial import") Cc: stable@dpdk.org Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>	2018-12-22 00:22:57 +01:00
Reshma Pattan	5d3f721009	mbuf: implement generic format for sched field This patch implements the changes proposed in the deprecation notes [1][2]. librte_mbuf changes: The mbuf->hash.sched field is updated to support generic definition in line with the ethdev traffic manager and meter APIs. The new generic format contains: queue ID, traffic class, color. Added public APIs to set and get these new fields to and from mbuf. librte_sched changes: In addition, the following API functions of the sched library have been modified with an additional parameter of type struct rte_sched_port to accommodate the changes made to mbuf sched field. (i) rte_sched_port_pkt_write() (ii) rte_sched_port_pkt_read_tree_path() librte_pipeline, qos_sched UT, qos_sched app are updated to make use of new changes. Also mbuf->hash.txadapter has been added for eventdev txq, rte_event_eth_tx_adapter_txq_set and rte_event_eth_tx_adapter_txq_get() are updated to use mbuf->hash.txadapter.txq. doc: Release notes updated. Removed deprecation notice for mbuf->hash.sched and sched API. [1] http://mails.dpdk.org/archives/dev/2018-February/090651.html [2] https://mails.dpdk.org/archives/dev/2018-November/119051.html Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Tested-by: Nikhil Rao <nikhil.rao@intel.com> Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>	2018-12-22 00:22:44 +01:00
Reshma Pattan	c712b01326	meter: unify packet color definition Added new rte_color definition in librte_meter to consolidate color definition which is currently replicated in various places such as rte_meter.h, rte_tm.h and rte_mtr.h Created aliases for rte_tm_color, rte_mtr_color and rte_meter_color to use new rte_color values. The definitions of rte_tm_color, rte_mtr_color and rte_meter_color will be deprecated in future. Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com> Signed-off-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>	2018-12-20 19:00:10 +01:00
Bruce Richardson	fff6df7bf5	telemetry: fix using ports of different types Different NIC ports can have different numbers of xstats on them, which means that we can't just use the xstats list from the first port registered in the telemetry library. Instead, we need to check the type of each port - by checking its ops structure pointer - and register each port type once with the metrics lib. Fixes: `fdbdb3f9ce` ("telemetry: add initial connection socket") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Kevin Laatz <kevin.laatz@intel.com>	2018-12-22 03:23:06 +01:00
Maxime Coquelin	b473ec1131	vhost: batch used descs chains write-back with packed ring Instead of writing back descriptors chains in order, let's write the first chain flags last in order to improve batching. Also, move the write barrier in logging cache sync, so that it is done only when logging is enabled. It means there is now one more barrier for split ring when logging is enabled. With Kernel's pktgen benchmark, ~3% performance gain is measured. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	815814c4ff	vhost: remove useless prefetch for packed ring descriptor This prefetch does not show any performance improvement. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	aaf8979d6f	vhost: prefetch descriptor after the read barrier This patch moves the prefetch after the available index is read to avoid prefetching a descriptor not available yet. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	33e12d63d1	vhost: enforce desc flags and content read ordering A read barrier is required to ensure that the ordering between descriptor's flags and content reads is enforced. 1. read flags = desc->flags if (flags & AVAIL_BIT) 2. read desc->id There is a control dependency between step 1 and step 2. Step 2 could be speculatively executed before step 1, which could result in 'id' not being updated yet. Fixes: `2f3225a7d6` ("vhost: add vector filling support for packed ring") Cc: stable@dpdk.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Maxime Coquelin	d4ff2135eb	vhost: enforce avail index and desc read ordering A read barrier is required to ensure the ordering between available index and the descriptor reads is enforced. 1. read avail_head = avail->idx 2. read cur_idx = last_avail_idx if (cur_idx != avail_head) { 3. read idx = avail->ring[cur_idx] 4. read desc[idx] } There is a control dependency between step 1 and steps 3 & 4. Step 3 could be speculatively executed before step 1, which could result in 'idx' not being updated yet. Fixes: `4796ad63ba` ("examples/vhost: import userspace vhost application") Cc: stable@dpdk.org Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com>	2018-12-21 16:22:41 +01:00
Bruce Richardson	8743d499a5	net: fix underflow for checksum of invalid IPv4 packets If we receive a packet with an invalid IP header, where the total packet length is reported as less than the IP header length, we would end up getting an underflow in the length subtraction. This could cause us to checksum e.g. 4GB of data in the case where the result of the subtraction was -1. We fix this by having the function return 0 - an invalid sum - when the length is less than the header length. Fixes: `af75078fec` ("first public release") Fixes: `6006818cfb` ("net: new checksum functions") Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>	2018-12-21 16:22:41 +01:00
Xiao Wang	b13ad2decc	vhost: provide helpers for virtio ring relay This patch provides two helpers for vdpa device driver to perform a relay between the guest virtio ring and a mediated virtio ring. The available ring relay will synchronize the available entries, and help to do desc validity checking. The used ring relay will synchronize the used entries from mediated ring to guest ring, and help to do dirty page logging for live migration. The later patch will leverage these two helpers. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Xiao Wang	43f34e3566	vhost: provide helper for host notifier ctrl VDPA driver can decide if it needs to enable/disable the host notifier mapping, so exposing a API can allow flexibility. A later patch will base on this. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Xiao Wang	02e3b285d4	vhost: remove unused function vhost_detach_vdpa_device() is internally defined but not used, remove it in this patch. Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Matthias Gatto	276d63505b	vhost: fix race condition when adding fd in the fdset fdset_add can call fdset_shrink_nolock, which calls fdset_move concurrently with the poll that is called in fdset_event_dispatch. This patch adds a mutex to protect poll from being called at the same time as fdset_add calls fdset_shrink_nolock. Fixes: `1b815b8959` ("vhost: try to shrink pfdset when fdset_add fails") Cc: stable@dpdk.org Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-21 16:22:40 +01:00
Anatoly Burakov	ba731ea1dd	malloc: fix deadlock when reading stats Currently, malloc statistics and external heap creation code use memory hotplug lock as a way to synchronize accesses to heaps (as in, locking the hotplug lock to prevent list of heaps from changing under our feet). At the same time, malloc statistics code will also lock the heap because it needs to access heap data and does not want any other thread to allocate anything from that heap. In such scheme, it is possible to enter a deadlock with the following sequence of events: thread 1 calls rte_malloc() and takes the heap lock; thread 2 calls rte_malloc_dump_stats() and takes the hotplug lock; thread 1 fails to allocate and attempts to take the hotplug lock; thread 2 attempts to take the heap lock. Neither thread will be able to continue, as both of them are waiting for the other one to drop the lock. Adding an additional lock will require an ABI change, so instead of that, make malloc statistics calls thread-unsafe with respect to creating/destroying heaps. Fixes: `72cf92b318` ("malloc: index heaps using heap ID rather than NUMA node") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 15:26:43 +01:00
Honnappa Nagarahalli	d5c677db89	hash: fix out-of-bound write while freeing key slot Add a debug check for out-of-bound write while freeing the key slot. Coverity issue: 325733 Fixes: `e605a1d36c` ("hash: add lock-free r/w concurrency") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Steve Capper <steve.capper@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2018-12-21 01:53:33 +01:00
Jeff Shaw	0f48ca429b	hash: fix return of bulk lookup The __rte_hash_lookup_bulk() function returns void, and therefore should not return with an expression. This commit fixes the following compiler warning when attempting to compile with "-pedantic -std=c11". warning: ISO C forbids ‘return’ with expression, in function returning void [-Wpedantic] Fixes: `9eca8bd7a6` ("hash: separate lock-free and r/w lock lookup") Cc: stable@dpdk.org Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2018-12-21 01:41:18 +01:00
Liang Ma	e6c6dc0f96	power: add p-state driver compatibility Previously, in order to use the power library, it was necessary for the user to disable the intel_pstate driver by adding “intel_pstate=disable” to the kernel command line for the system, which causes the acpi_cpufreq driver to be loaded in its place. This patch adds the ability for the power library use the intel-pstate driver. It adds a new suite of functions behind the current power library API, and will seamlessly set up the user facing API function pointers to the relevant functions depending on whether the system is running with acpi_cpufreq kernel driver, intel_pstate kernel driver or in a guest, using kvm. The library API and ABI is unchanged. Signed-off-by: Liang Ma <liang.j.ma@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: David Hunt <david.hunt@intel.com>	2018-12-21 01:33:59 +01:00
Qi Zhang	85d6815fa6	eal: close multi-process socket during cleanup When a secondary process quits, the mp_socket* file still exists, which causes rte_mp_request_sync to fail when it tries to send a message on a floating socket. The patch fixes the issue by introducing a function rte_mp_channel_cleanup. This function will be called by rte_eal_cleanup and it will close the mp socket and delete the mp_socket* file. Fixes: `bacaa27540` ("eal: add channel for multi-process communication") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2018-12-21 01:15:41 +01:00
Anatoly Burakov	9d65053761	eal: add 64-bit log2 function Add missing implementation for 64-bit log2 function, and extend the unit test to test this new function. Also, remove duplicate reimplementation of this function from testpmd and memalloc. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:23:49 +01:00
Anatoly Burakov	43c9e6c205	eal: add 64-bit fls function Add missing implementation for 64-bit fls function, and extend unit test to test the new function as well. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:17:43 +01:00
Anatoly Burakov	4e261f5519	eal: add 64-bit bsf and 32-bit safe bsf functions Add an rte_bsf64 function that follows the convention of existing rte_bsf32 function. Also, add missing implementation for safe version of rte_bsf32, and implement unit tests for all recently added bsf varieties. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-21 00:00:58 +01:00
Anatoly Burakov	cc7ddb00da	bitmap: remove deprecated 64-bit bsf function The function rte_bsf64 was deprecated in a previous release, so remove the function, and the deprecation notice associated with it. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 23:44:56 +01:00
Anatoly Burakov	307315d457	eal: fix runtime directory cleanup in noshconf mode When using --no-shconf or --in-memory modes, there is no runtime directory to be created, so there is no point in attempting to clean it. Fixes: `0a529578f1` ("eal: clean up unused files on initialization") Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 23:27:35 +01:00
Anatoly Burakov	c75f535ac5	mem: use memfd for no-huge mode When running in no-huge mode, we anonymously allocate our memory. While this works for regular NICs and vdev's, it's not suitable for memory sharing scenarios such as virtio with vhost_user backend. To fix this, allocate no-huge memory using memfd, and register it with memalloc just like any other memseg fd. This will enable using rte_memseg_get_fd() API with --no-huge EAL flag. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:58:25 +01:00
Anatoly Burakov	df7722c75b	mem: allow setting up segment list fd Currently, only segment fd's for multi-file segments are supported, while for memfd-backed no-huge memory we need single-file segments mode. Add support for single-file segments in the internal API. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:55:56 +01:00
Anatoly Burakov	d75eea3145	mem: check for memfd support in segment fd API If memfd support was not compiled, or hugepage memfd support is not available at runtime, the API will now return proper error code, indicating that this API is unsupported. This changes the API, so document the changes. Fixes: `41dbdb6872` ("mem: add external API to retrieve page fd") Fixes: `3a44687139` ("mem: allow querying offset into segment fd") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:54:37 +01:00
Anatoly Burakov	525670756a	mem: fix segment fd API error code for external segment Segment fd API does not support getting segment fd's from externally allocated memory, so return proper error code on any attempts to do so. This changes API behavior, so document the change as well. Fixes: `5282bb1c36` ("mem: allow memseg lists to be marked as external") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-12-20 22:51:49 +01:00
Anatoly Burakov	bed7941886	mem: allow usage of non-heap external memory in multiprocess Add multiprocess support for externally allocated memory areas that are not added to DPDK heap (and add relevant doc sections). Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:14:55 +01:00
Anatoly Burakov	950e8fb4e1	mem: allow registering external memory areas The general use-case of using external memory is well covered by existing external memory API's. However, certain use cases require manual management of externally allocated memory areas, so this memory should not be added to the heap. It should, however, be added to DPDK's internal structures, so that API's like ``rte_virt2memseg`` would work on such external memory segments. This commit adds such an API to DPDK. The new functions will allow to register and unregister externally allocated memory areas, as well as documentation for them. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:14:55 +01:00
Anatoly Burakov	39ff94e71c	malloc: separate destroying memseg list and heap data Currently, destroying external heap chunk and its memseg list is part of one process. When we will gain the ability to unregister external memory from DPDK that doesn't have any heap structures associated with it, we need to be able to find and destroy memseg lists as well as heap data separately. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:10:08 +01:00
Anatoly Burakov	0f526d674f	malloc: separate creating memseg list and malloc heap Currently, creating external malloc heap involves also creating a memseg list backing that malloc heap. We need to have them as separate functions, to allow creating memseg lists without creating a malloc heap. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 18:09:55 +01:00
Anatoly Burakov	646e5260ee	malloc: make alignment requirements more stringent The external heaps API already implicitly expects start address of the external memory area to be page-aligned, but it is not enforced or documented. Fix this by implementing additional parameter checks at memory add call, and document the page alignment requirement explicitly. Fixes: `7d75c31014` ("malloc: allow adding memory to named heaps") Cc: stable@dpdk.org Suggested-by: Yongseok Koh <yskoh@mellanox.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-12-20 15:34:03 +01:00
Anatoly Burakov	b3e735e16e	malloc: fix duplicate mem event notification We already trigger a mem event notification inside the walk function, no need to do it twice. Fixes: `f32c7c9de9` ("malloc: enable event callbacks for external memory") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:28:55 +01:00
Seth Howell	fba0ca2274	malloc: notify primary process about hotplug in secondary When secondary process hotplugs memory, it sends a request to primary, which then performs the real mmap() and sends sync requests to all secondary processes. Upon receiving such sync request, each secondary process will notify the upper layers of hotplugged memory (and will call all locally registered event callbacks). In the end we'll end up with memory event callbacks fired in all the processes except the primary, which is a bug. This gets critical if memory is hotplugged while a VFIO device is attached, as the VFIO memory registration - which is done from a memory event callback present in the primary process only - is never called. After this patch, a primary process fires memory event callbacks before secondary processes start their synchronizations - both for hotplug and hotremove. Fixes: `07dcbfe010` ("malloc: support multiprocess memory hotplug") Cc: stable@dpdk.org Signed-off-by: Seth Howell <seth.howell@intel.com> Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:25:34 +01:00
Yongseok Koh	6d09256148	malloc: fix finding maximum contiguous IOVA size malloc_elem_find_max_iova_contig() could return invalid size due to a missing sanity check. The following gdb output shows how 'cur_size' can be invalid in find_biggest_element(). (gdb) p/x cur_size $4 = 0xffffffffffe42900 (gdb) p elem $1 = (struct malloc_elem *) 0x12e842000 (gdb) p *elem $2 = {heap = 0x7ffff7ff387c, prev = 0x12e831fc0, next = 0x12e842900, free_list = {le_next = 0x109538000, le_prev = 0x7ffff7ff3894}, msl = 0x7ffff7ff107c, state = ELEM_FREE, pad = 0, size = 2304} (gdb) p *elem->msl $5 = {{base_va = 0x100200000, addr_64 = 4297064448}, page_sz = 2097152, socket_id = 0, version = 790, len = 17179869184, external = 0, memseg_arr = {name = "memseg-2048k-0-0", '\000' <repeats 47 times>, count = 493, len = 8192, elt_sz = 48, data = 0x10002e000, rwlock = {cnt = 0}}} Fixes: `9fe6bceafd` ("malloc: add finding biggest free IOVA-contiguous element") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 15:17:48 +01:00
Jim Harris	476c847ab6	malloc: add option --match-allocations SPDK uses the rte_mem_event_callback_register API to create RDMA memory regions (MRs) for newly allocated regions of memory. This is used in both the SPDK NVMe-oF target and the NVMe-oF host driver. DPDK creates internal malloc_elem structures for these allocated regions. As users malloc and free memory, DPDK will sometimes merge malloc_elems that originated from different allocations that were notified through the registered mem_event callback routine. This results in subsequent allocations that can span across multiple RDMA MRs. This requires SPDK to check each DPDK buffer to see if it crosses an MR boundary, and if so, would have to add considerable logic and complexity to describe that buffer before it can be accessed by the RNIC. It is somewhat analogous to rte_malloc returning a buffer that is not IOVA-contiguous. As a malloc_elem gets split and some of these elements get freed, it can also result in DPDK sending an RTE_MEM_EVENT_FREE notification for a subset of the original RTE_MEM_EVENT_ALLOC notification. This is also problematic for RDMA memory regions, since unregistering the memory region is all-or-nothing. It is not possible to unregister part of a memory region. To support these types of applications, this patch adds a new --match-allocations EAL init flag. When this flag is specified, malloc elements from different hugepage allocations will never be merged. Memory will also only be freed back to the system (with the requisite memory event callback) exactly as it was originally allocated. Since part of this patch is extending the size of struct malloc_elem, we also fix up the malloc autotests so they do not assume its size exactly fits in one cacheline. Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 13:01:08 +01:00
Gao Feng	cc80353223	memzone: fix unlock on initialization failure The RTE_PROC_PRIMARY error handler is missing the unlock statement in the current code. Now unlock and return in one place to fix it. Fixes: `49df3db848` ("memzone: replace memzone array with fbarray") Cc: stable@dpdk.org Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 12:24:14 +01:00
Gao Feng	32fa7f8913	eal: check peer allocation in multi-process request Add the check for null peer pointer like the bundle pointer in the mp request handler. They should follow same style. And add some logs for nomem cases. Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 00:01:28 +01:00
Gao Feng	e14bc93e8f	eal: fix leak on multi-process request error When rte_eal_alarm_set failed, need to free the bundle mem in the error handler of handle_primary_request and handle_secondary_request. Fixes: `244d513071` ("eal: enable hotplug on multi-process") Fixes: `ac9e4a1737` ("eal: support attach/detach shared device from secondary") Cc: stable@dpdk.org Signed-off-by: Gao Feng <davidfgao@tencent.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2018-12-20 00:01:28 +01:00
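
Note on 8743d499a5 ("net: fix underflow for checksum of invalid IPv4 packets"): the underflow described in that entry can be illustrated with a minimal sketch. This is not the DPDK code; the names l4_len_unchecked and l4_len_checked are hypothetical, the total length is assumed to be already in host byte order, and the checked helper signals the invalid case with -1, whereas the commit message says the real function returns 0 as an invalid sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper mirroring the unguarded arithmetic.
 * total_len: IPv4 total length, host byte order (assumption).
 * version_ihl: first header byte; low nibble is the header length in 32-bit words. */
uint32_t l4_len_unchecked(uint16_t total_len, uint8_t version_ihl)
{
	uint32_t hdr_len = (uint32_t)(version_ihl & 0x0f) * 4;

	/* Unsigned subtraction wraps around when total_len < hdr_len. */
	return total_len - hdr_len;
}

/* Hypothetical guarded helper: stores the L4 length and returns 0,
 * or returns -1 when the reported total length is shorter than the header. */
int l4_len_checked(uint16_t total_len, uint8_t version_ihl, uint32_t *l4_len)
{
	uint32_t hdr_len = (uint32_t)(version_ihl & 0x0f) * 4;

	assert(l4_len != NULL);
	if (total_len < hdr_len)
		return -1; /* invalid header: refuse to compute a length */

	*l4_len = total_len - hdr_len;
	return 0;
}
```

With a 20-byte header (version_ihl 0x45) and a reported total length of 19, the unchecked helper yields 0xFFFFFFFF, which matches the "4GB of data" case in the commit message, while the checked helper rejects the packet.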

5057 Commits