numam-dpdk

Author	SHA1	Message	Date
Harry van Haaren	78ffab9611	eventdev: add port attribute function This commit reworks the port functions to retrieve information about the port, like the enq or deq depths. Note that "port count" is a device attribute, and is added in a later patch for dev attributes. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>	2017-10-10 18:30:50 +02:00
Tim McDaniel	cec04e240d	eventdev: clarify usage of forward and release ops Update doxygen to make it clear that RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_RELEASE must only be enqueued to the same port that the original event was dequeued from. Signed-off-by: Tim McDaniel <timothy.mcdaniel@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2017-10-10 18:30:37 +02:00
Gage Eads	381acec2b1	eventdev: ease single-link queue config requirements Events sent through single-link queues are naturally in-order and atomic, without reordering or atomic scheduling. Logically the nb_atomic_flows and nb_atomic_order_sequences arguments don't apply to a single link queue, but applications must set these (depending on the queue config type) to bypass the is_valid_{ordered, atomic}_queue_conf() checks in the eventdev layer. This commit updates those is_valid_* functions to ignore queues with the SINGLE_LINK flag, to simplify their configuration. Signed-off-by: Gage Eads <gage.eads@intel.com> Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>	2017-10-10 18:30:24 +02:00
Maxime Coquelin	3494ed045e	vhost: distinguish master and slave requests This patch adds an union in VhostUserMsg to distinguish between master and slave initiated requests, instead of casting slave requests as master request. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:54:31 +02:00
Dariusz Stojaczyk	efba12a78d	vhost: add user callbacks for socket open/close Added new callbacks to notify about socket connection status. As destroy_device is used for virtqueue processing pause as well as connection close, the user cannot distinguish between the two. Consider the following scenario: rte_vhost: received SET_VRING_BASE message, calling destroy_device() as usual. user: end-user asks to remove the device (together with socket file); OK, device is not in use, calling rte_vhost_driver_unregister() etc. That's NOT the behavior we want. Instead of changing new_device/destroy_device callbacks and breaking the ABI, a set of new functions new_connection/destroy_connection has been added. Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2017-10-10 15:54:31 +02:00
Kuba Kozak	66a6210124	vhost: check poll error code Add return value check for poll() call. Coverity issue: 140740 Fixes: 59317cef249c ("vhost: allow many vhost-user ports") Cc: stable@dpdk.org Signed-off-by: Kuba Kozak <kubax.kozak@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:54:31 +02:00
Maxime Coquelin	69c90e98f4	vhost: enable IOMMU support Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:53:27 +02:00
Maxime Coquelin	36031f80cc	vhost: invalidate vring in case of matching IOTLB invalidate As soon as a page used by a ring is invalidated, the access_ok flag is cleared, so that processing threads try to map them again. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	eefac9536a	vhost: postpone device creation until rings are mapped Translating the start addresses of the rings is not enough; we need to be sure the whole ring is made available by the guest. It depends on the size of the rings, which is not known on SET_VRING_ADDR reception. Furthermore, we need to be safe against vring page invalidations. This patch introduces a new access_ok flag per virtqueue, which is set when all the rings are mapped, and cleared as soon as a page used by a ring is invalidated. The invalidation part is implemented in a following patch. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	09927b5249	vhost: translate ring addresses when IOMMU enabled When IOMMU is enabled, the ring addresses set by the VHOST_USER_SET_VRING_ADDR requests are guest's IO virtual addresses, whereas Qemu virtual addresses when IOMMU is disabled. When enabled and the required translation is not in the IOTLB cache, an IOTLB miss request is sent, but being called by the vhost-user socket handling thread, the function does not wait for the requested IOTLB update. The function will be called again on the next IOTLB update message reception if matching the vring addresses. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	3ea7052f4b	vhost: postpone rings addresses translation This patch postpones rings addresses translations and checks, as addresses sent by the master should not be interpreted as long as the ring is not started and enabled[0]. When protocol features aren't negotiated, the ring is started in enabled state, so the addresses translations are postponed to vhost_user_set_vring_kick(). Otherwise, it is postponed to when ring is enabled, in vhost_user_set_vring_enable(). [0]: http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg04355.html Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	b0098b5e21	vhost: fix dereferencing invalid pointer after realloc numa_realloc() reallocates the virtio_net device structure and updates the vhost_devices[] table with the new pointer if the rings are allocated on a different NUMA node. The problem is that vhost_user_msg_handler() still dereferences the old pointer afterward. This patch prevents this by fetching the dev pointer in vhost_devices[] again after messages have been handled. Fixes: af295ad4698c ("vhost: realloc device and queues to same numa node as vring desc") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	321203a54b	vhost: enable rings at the right time When VHOST_USER_F_PROTOCOL_FEATURES is negotiated, the ring is not enabled when started, but enabled through dedicated VHOST_USER_SET_VRING_ENABLE request. When not negotiated, the ring is started in enabled state, at VHOST_USER_SET_VRING_KICK request time. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	62fdb8255a	vhost: use the guest IOVA to host VA helper Replace rte_vhost_gpa_to_vva() calls with vhost_iova_to_vva(), which requires to also pass the mapped len and the access permissions needed. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	fed67a20ac	vhost: introduce guest IOVA to backend VA helper This patch introduces vhost_iova_to_vva() function to translate guest's IO virtual addresses to backend's virtual addresses. When IOMMU is enabled, the IOTLB cache is queried to get the translation. If missing from the IOTLB cache, an IOTLB_MISS request is sent to Qemu, and IOTLB cache is queried again on IOTLB event notification. When IOMMU is disabled, the passed address is a guest's physical address, so the legacy rte_vhost_gpa_to_vva() API is used. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	e95f34d380	vhost: handle IOTLB update and invalidate requests Vhost-user device IOTLB protocol extension introduces VHOST_USER_IOTLB message type. The associated payload is the vhost_iotlb_msg struct defined in the kernel, which in this case can be either an IOTLB update or invalidate message. On IOTLB update, the virtqueues get notified of a new entry. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	76e99bfc4c	vhost: initialize vrings IOTLB caches The per-virtqueue IOTLB cache init is done at virtqueue init time. init_vring_queue() now takes vring id as parameter, so that the IOTLB cache mempool name can be generated. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	01a4bb55f9	vhost: support IOTLB miss slave requests Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	f72c2ad63a	vhost: add pending IOTLB miss request list and helpers In order to be able to handle other ports or queues while waiting for an IOTLB miss reply, a pending list is created so that waiter can return and restart later on with sending again a miss request. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	d012d1f293	vhost: add IOTLB helper functions Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	06903abc0d	vhost: add IOMMU-related macros for old kernels These defines and enums have been introduced in upstream kernel v4.8, and backported to RHEL 7.4. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	275c3f9447	vhost: support slave requests channel Currently, only QEMU sends requests, the backend sends replies. In some cases, the backend may need to send requests to QEMU, like IOTLB miss events when IOMMU is supported. This patch introduces a new channel for such requests. QEMU sends a file descriptor of a new socket using VHOST_USER_SET_SLAVE_REQ_FD. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	a0563bd2e3	vhost: prepare for slave requests send_vhost_message() is currently only used to send replies, so it modifies message flags to prepare the reply. With the upcoming channel for backend-initiated requests, this function can be used to send requests. This patch introduces a new send_vhost_reply() that does the message flags modifications, and makes send_vhost_message() generic. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	25bf7a0b09	vhost: make error handling consistent in Rx path In the non-mergeable receive case, when copy_mbuf_to_desc() call fails the packet is skipped, the corresponding used element len field is set to vnet header size, and it continues with next packet/desc. It could be a problem because it does not know why it failed, and assume the desc buffer is large enough. In mergeable receive case, when copy_mbuf_to_desc_mergeable() fails, packets burst is simply stopped. This patch makes the non-mergeable error path to behave as the mergeable one, as it seems the safest way. Also, doing this way will simplify pending IOTLB miss requests handling. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	94018cf3d5	vhost: revert workaround MQ fails to startup This reverts commit 04d81227960b ("vhost: workaround MQ fails to startup"). As agreed when this workaround was introduced, it can be reverted as Qemu v2.10 that fixes the issue is now out. The reply-ack feature is required for vhost-user IOMMU support. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Tiwei Bie	e5c494a7a2	vhost: batch small guest memory copies This patch adaptively batches the small guest memory copies. By batching the small copies, the efficiency of executing the memory LOAD instructions can be improved greatly, because the memory LOAD latency can be effectively hidden by the pipeline. We saw great performance boosts for small packets PVP test. This patch improves the performance for small packets, and has distinguished the packets by size. So although the performance for big packets doesn't change, it makes it relatively easy to do some special optimizations for the big packets too. Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Zhihong Wang <zhihong.wang@intel.com> Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:48:53 +02:00
Jonas Pfefferle	33604c3135	vfio: refactor PCI BAR mapping Split pci_vfio_map_resource for primary and secondary processes. Save all relevant mapping data in primary process to allow the secondary process to perform mappings. Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>	2017-10-10 15:37:58 +02:00
Jonas Pfefferle	ed1e7e576b	vfio: fix sPAPR IOMMU DMA window size DMA window size needs to be big enough to span all memory segment's physical addresses. We do not need multiple levels of IOMMU tables as we already span ~70TB of physical memory with 16MB hugepages. Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2017-10-10 15:36:04 +02:00
Patrick MacArthur	e3f141879e	eal: copy raw strings taken from command line Normally, command line argument strings are considered immutable, but SPDK [1] and urdma [2] construct argv arrays to pass to rte_eal_init(). These strings are allocated using malloc() and freed after DPDK initialization with free(). However, in the case of --file-prefix and --huge-dir, DPDK takes the pointer to these strings in argv directly. If a secondary process calls rte_eal_pci_probe() after rte_eal_init() returns, as is done by SPDK, this causes a use-after-free error because the strings have been freed by the calling code immediately after rte_eal_init() returns. This problem was observed when running SPDK example programs as a secondary process and causes the secondary processes to fail: Starting DPDK 16.11.1 initialization... [ DPDK EAL parameters: identify -c 4 --file-prefix=spdk3260 --base-virtaddr=0x1000000000 --proc-type=auto ] EAL: Detected 40 lcore(s) EAL: Auto-detected process type: SECONDARY EAL: Probing VFIO support... EAL: VFIO support initialized EAL: PCI device 0000:81:00.0 on NUMA socket 1 EAL: probe driver: 8086:953 spdk_nvme EAL: cannot connect to primary process! EAL: Error - exiting with code: 1 Cause: Requested device 0000:81:00.0 cannot be used Running strace shows that the file prefix has been zero'd out by the time that the secondary process attempts to probe the NVMe device. 
The use-after-free errors can be easily detected with valgrind: ==8489== Invalid read of size 1 ==8489== at 0x4C30D22: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==8489== by 0x58DB955: vfprintf (vfprintf.c:1637) ==8489== by 0x59A4685: __vsnprintf_chk (vsnprintf_chk.c:63) ==8489== by 0x59A45E7: __snprintf_chk (snprintf_chk.c:34) ==8489== by 0x1246AB: get_socket_path.constprop.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x124B09: vfio_mp_sync_connect_to_primary (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x123BE4: vfio_get_group_fd.part.1 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x124366: vfio_setup_device (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x126C8A: pci_vfio_map_resource (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x12B115: pci_probe_all_drivers.part.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x12B596: rte_eal_pci_probe (in /home/pmacarth/src/spdk/examples/nvme/identify/identify) ==8489== by 0x11D5B5: spdk_pci_enumerate (pci.c:147) ==8489== Address 0x63f362e is 14 bytes inside a block of size 32 free'd ==8489== at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==8489== by 0x11E6FB: spdk_free_args (init.c:136) ==8489== by 0x11EBF5: spdk_env_init (init.c:309) ==8489== by 0x10D2AA: main (identify.c:976) ==8489== Block was alloc'd at ==8489== at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==8489== by 0x11E7D7: _sprintf_alloc (init.c:76) ==8489== by 0x11EA78: spdk_build_eal_cmdline (init.c:251) ==8489== by 0x11EA78: spdk_env_init (init.c:282) ==8489== by 0x10D2AA: main (identify.c:976) ==8489== Fix this by using strdup() to create separate memory buffers for these strings. Note that this patch will cause valgrind to report memory leaks of these buffers as there is nowhere to free them. 
Using static buffers is an option but would make these strings have a fixed maximum length whereas there is currently no limit defined by the API. [1] http://spdk.io [2] https://github.com/zrlio/urdma Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2017-10-09 23:25:13 +02:00
Seth Howell	7485e06c2a	mem: check mmap failure If mmap fails, it will return the value MAP_FAILED. Checking for this return code allows us to properly identify mmap failures and report them as such to the calling function. Signed-off-by: Seth Howell <seth.howell@intel.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2017-10-09 23:17:04 +02:00
Xueming Li	41baec55a8	mem: fix malloc element free in debug mode malloc_elem_free() is clearing(setting to 0) the trailer cookie when RTE_MALLOC_DEBUG is enabled. In case of joining free neighbor element, part of joined memory is not getting cleared due to missing the length of trailer cookie in the middle. This patch fixes calculation of free memory length to be cleared in malloc_elem_free() by including trailer cookie. Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2017-10-09 23:15:45 +02:00
Xueming Li	3cd4e0e883	mem: fix malloc debug config This patch replaces broken macro RTE_LIBRTE_MALLOC_DEBUG with RTE_MALLOC_DEBUG. Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>	2017-10-09 23:15:45 +02:00
Xueming Li	f385306357	config: add option to enable asserts Currently, enabling assertions requires setting CONFIG_RTE_LOG_LEVEL to RTE_LOG_DEBUG. CONFIG_RTE_LOG_LEVEL is the default log level of the control path; RTE_LOG_DP_LEVEL is the log level of the data path. It is hard to see why assertions should be governed by the control-path log level, especially assertions used on the data path. DPDK also needs a switch that enables assertions without affecting the log output level, assuming "--log-level" is not specified. Assertion is an important API to balance DPDK high performance and robustness. To promote assertion usage, it is valuable to decouple assertions from CONFIG_RTE_LOG_LEVEL. In one word, log is log, assertion is assertion, debug is hot pot :) The rationale of this patch is to introduce a dedicated switch for assertions: RTE_ENABLE_ASSERT Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>	2017-10-09 23:15:45 +02:00
Jianfeng Tan	f26ab687a7	eal: remove Xen dom0 support We remove xen-specific code in EAL, including the option --xen-dom0, memory initialization code, compiling dependency, etc. Related documents are removed or updated, and bump the eal library version. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>	2017-10-09 01:54:29 +02:00
Jianfeng Tan	a7cb2e20d2	mem: remove API to get physical address in dom0 Previously, this API was a wrapper used to obtain the "physical address" (the MFN address) in dom0. As we will remove Xen dom0 support, this API is no longer necessary. Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 01:52:37 +02:00
Jianfeng Tan	1950bd7694	xen: remove dependency in libraries Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 01:52:08 +02:00
Jacek Piasecki	a6a47ac9c2	cfgfile: rework load function New functions added to cfgfile library make it possible to significantly simplify the code of rte_cfgfile_load_with_params() This patch shows the new body of this function. Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 00:50:48 +02:00
Jacek Piasecki	d4cb819758	cfgfile: support runtime modification Extend the existing cfgfile library with new API functions: rte_cfgfile_create() - create new cfgfile object; rte_cfgfile_add_section() - add new section to existing cfgfile object; rte_cfgfile_add_entry() - add new entry to existing cfgfile object in specified section; rte_cfgfile_set_entry() - update existing entry in cfgfile object; rte_cfgfile_save() - save existing cfgfile object to INI file. This modification makes it possible to create a cfgfile at runtime and opens up the possibility to have applications dynamically build up a proper DPDK configuration, rather than having to have a pre-existing one. Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 00:50:25 +02:00
Jacek Piasecki	b82a987ffc	cfgfile: rework to flat arrays Changing to flat arrays in the cfgfile struct forces slightly different data access in most cfgfile functions. This patch provides the necessary changes in the existing API. Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 00:45:11 +02:00
Jacek Piasecki	250fef469e	cfgfile: remove EAL dependency This patch removes the dependency to EAL in cfgfile library. Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-09 00:44:59 +02:00
Yipeng Wang	703be9531a	member: add AVX for HT mode For key search, the signatures of all entries are compared against the signature of the key that is being looked up. Since all signatures are contiguously put in a bucket, they can be compared with vector instructions (AVX2), achieving higher lookup performance. This patch adds AVX2 implementation in a separate header file. Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2017-10-09 00:02:45 +02:00
Yipeng Wang	54b8edc07c	member: implement vBF mode Bloom Filter (BF) [1] is a well-known space-efficient probabilistic data structure that answers set membership queries. Vector of Bloom Filters (vBF) is an extension to traditional BF that supports multi-set membership testing. Traditional BF will return found or not-found for each key. vBF will also return which set the key belongs to if it is found. Since each set requires a BF, vBF should be used when the set count is small. vBF's false positive rate could be set appropriately so that its memory requirement and lookup speed are better in certain cases compared to HT based set-summary. This patch adds the vBF implementation. [1] B H Bloom, “Space/Time Trade-offs in Hash Coding with Allowable Errors,” Communications of the ACM, 1970. Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2017-10-09 00:02:45 +02:00
Yipeng Wang	904ec78a23	member: implement HT mode One of the set-summary structures is hash-table based set-summary (HTSS). One example is cuckoo filter [1]. Compared to a traditional hash table, HTSS has a much more compact structure. For each element, only one signature and its corresponding set ID is stored. No key comparison is required during lookup. For the table structure, there are multiple entries in each bucket, and the table is composed of many buckets. Two modes are supported for HTSS, "cache" and "non-cache" modes. The non-cache mode is similar to the cuckoo filter [1]. When a bucket is full, one entry will be evicted to its alternative bucket to make space for the new key. The table could be full and then no more keys could be inserted. This mode has a false-positive rate but no false negatives. Multiple entries with the same signature could stay in the same bucket. The "cache" mode does not evict a key to its alternative bucket when a bucket is full; instead, an existing key is evicted out of the table, like a cache. Thus, the table will never reject keys when it is full. Another property is that in each bucket, there cannot be multiple entries with the same signature. The mode could have both false-positive and false-negative probability. This patch adds the implementation of HTSS. [1] B Fan, D G Andersen and M Kaminsky, “Cuckoo Filter: Practically Better Than Bloom,” in Conference on emerging Networking Experiments and Technologies, 2014. Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2017-10-09 00:02:45 +02:00
Yipeng Wang	857ed6c68c	member: implement main API Membership library is an extension and generalization of a traditional filter (for example Bloom Filter and cuckoo filter) structure. In general, the Membership library is a data structure that provides a "set-summary" and responds to set-membership queries of whether a certain element belongs to a set(s). A membership test for an element will return the set this element belongs to or not-found if the element is never inserted into the set-summary. The results of the membership test are not 100% accurate. Certain false positive or false negative probability could exist. However, comparing to a "full-blown" complete list of elements, a "set-summary" is memory efficient and fast on lookup. This patch adds the main API definition. Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com> Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>	2017-10-09 00:02:45 +02:00
CongWen Zhang	dfdc2940cb	jobstats: fix a doxygen comment Signed-off-by: CongWen Zhang <zhang.congwen@zte.com.cn> Reviewed-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>	2017-10-08 22:22:08 +02:00
Hemant Agrawal	e1a45fc494	igb_uio: fix build on arm64 kernel IGB_UIO compilation recently got enabled for ARM64 by default. The igb_uio compilation against an ARM64 based stock 4.x (e.g. 4.13) kernel gives compilation errors: igb_uio.c: In function ‘igbuio_pci_irqcontrol’: igb_uio.c:115:25: error: implicit declaration of function ‘irq_get_irq_data’ [-Werror=implicit-function-declaration] struct irq_data *irq = irq_get_irq_data(udev->info.irq); ^ igb_uio.c:115:25: error: initialization makes pointer from integer without a cast [-Werror=int-conversion] Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X") Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>	2017-10-08 17:19:08 +02:00
Yangchao Zhou	3fb1ea032b	hash: optimize Toeplitz RSS computation Use rte_bsf32 and a fast bit unset operation to optimize the softrss computation. The following measurements show the improvement over the default softrss computation function. Tuple length 3: 1225 cycles (old), 337 cycles (new). Tuple length 9: 3743 cycles (old), 992 cycles (new). Signed-off-by: Yangchao Zhou <zhouyates@gmail.com> Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>	2017-10-07 13:50:43 +02:00
Pablo de Lara	98b8ec7060	hash: fix eviction counter When adding a new entry in a hash table, there is a maximum number of evictions that can be performed. When the counter of these evictions reaches this maximum, the entry cannot be added, as it is considered that the algorithm has encountered an infinite loop. The problem with the current implementation, is that this counter was declared as a static variable. If there are multiple threads adding entries in the same table or in different tables, they should access different counters, one per core and per table. Therefore, the variable has been modified to be non-static. Fixes: 243e93a5046f ("hash: fix unlimited cuckoo path") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-07 13:50:43 +02:00
Tonghao Zhang	071925527d	igb_uio: use UIO macro instead of hardcoded value This is not a bugfix, but it makes the igb_uio code easier for developers to review and maintain. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>	2017-10-07 00:51:59 +02:00
Markus Theil	74da59da7f	igb_uio: add MSI IRQ mode This patch adds MSI IRQ mode in a way, that should also work on older kernel versions. The base for my patch was an attempt to do this in cf705bc36c which was later reverted in d8ee82745a. Compilation was tested on Linux 3.2, 4.10 and 4.12. Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>	2017-10-06 23:03:03 +01:00
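
Note on e3f141879e (eal: copy raw strings taken from command line): the fix described in that message, copying argv strings with strdup() so they outlive the caller's buffers, can be illustrated with a minimal sketch. This is not the DPDK code. The struct `demo_config` and the functions `demo_set_file_prefix()` and `demo_caller_frees_argv()` are hypothetical names introduced only for illustration; the one real call relied on is POSIX strdup().

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() under strict C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library's internal configuration.
 * This is NOT DPDK's real structure. */
struct demo_config {
    char *file_prefix;  /* privately owned copy, never an alias of argv */
};

/* Store a private copy of a command-line value instead of keeping the
 * caller's pointer. Returns 0 on success, -1 if the allocation fails. */
static int demo_set_file_prefix(struct demo_config *cfg, const char *arg)
{
    char *copy;

    assert(cfg != NULL && arg != NULL);
    copy = strdup(arg);
    if (copy == NULL)
        return -1;
    free(cfg->file_prefix);  /* free(NULL) is a no-op on first use */
    cfg->file_prefix = copy;
    return 0;
}

/* Mimics a caller that builds its argument on the heap, hands it to the
 * library, then scrubs and frees it right away (the pattern the commit
 * message attributes to SPDK and urdma). Returns 1 if the stored prefix
 * is still intact afterwards, 0 otherwise. */
static int demo_caller_frees_argv(struct demo_config *cfg)
{
    char *arg = malloc(16);

    if (arg == NULL)
        return 0;
    strcpy(arg, "spdk3260");
    if (demo_set_file_prefix(cfg, arg) != 0) {
        free(arg);
        return 0;
    }
    memset(arg, 0, 16);  /* caller's buffer is wiped ... */
    free(arg);           /* ... and released */
    return strcmp(cfg->file_prefix, "spdk3260") == 0;
}
```

Had `demo_set_file_prefix()` stored `arg` directly, the final strcmp() would read freed memory, which is the use-after-free the valgrind trace in the commit message reports. The trade-off noted in the message still applies: unless the owner frees `file_prefix` at shutdown, leak checkers will flag the copy.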

3623 Commits