9770 Commits

Author SHA1 Message Date
Maxime Coquelin
94018cf3d5 vhost: revert workaround MQ fails to startup
This reverts commit 04d81227960b ("vhost: workaround MQ fails to
startup").

As agreed when this workaround was introduced, it can be reverted
as Qemu v2.10 that fixes the issue is now out.

The reply-ack feature is required for vhost-user IOMMU support.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Daniel Mrzyglod
7b3249c56e net/virtio: fix untrusted scalar value
The unscrutinized value may be incorrectly assumed to be within a certain
range by later operations.

In vhost_user_read: an unscrutinized value from an untrusted source is used
in a trusted context. The value of sz_payload may be harmful, so we need to
limit it to the maximum payload size.
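
The bound check described above can be sketched as follows. This is a
minimal illustration, not the driver's code: struct msg_header,
MAX_PAYLOAD_SIZE and validated_payload_size() are hypothetical names, and
the cap of 1024 bytes is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cap; the real limit depends on the message layout. */
#define MAX_PAYLOAD_SIZE 1024u

/* Hypothetical header: 'size' is filled in by the (untrusted) peer. */
struct msg_header {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
};

/*
 * Return the payload length to read, or -1 if the claimed size exceeds
 * what the caller is prepared to accept.
 */
static long validated_payload_size(const struct msg_header *hdr)
{
    assert(hdr != NULL);

    if (hdr->size > MAX_PAYLOAD_SIZE)
        return -1;    /* reject instead of trusting the peer */

    return (long)hdr->size;
}
```

A caller would read the header first, pass it through the check, and only
then read the payload using the returned length.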

Coverity issue: 139601
Fixes: 6a84c37e3975 ("net/virtio-user: add vhost-user adapter layer")
Cc: stable@dpdk.org

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Olivier Matz
16e48c9ed7 net/virtio: fix Rx handler when checksum is requested
The simple Rx handler is selected even if Rx checksum offload is
requested by the application, but this handler does not support
offloads. This results in broken received packets (no checksum flag but
invalid checksum in the mbuf data).

Disable the simple Rx handler in that case.

Fixes: 96cb6711939e ("net/virtio: support Rx checksum offload")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Olivier Matz
0964936308 net/virtio: keep Rx handler whatever the Tx queue config
Split use_simple_rxtx into use_simple_rx and use_simple_tx,
and ensure that only use_simple_tx is updated when the txq flags
force the use of the standard Tx handler.

This change is also useful for the next commit (disable the simple Rx
path when Rx checksum is requested).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:52:27 +02:00
Olivier Matz
02dd0e2129 net/virtio: remove SSE check
Since commit f27769f796a0 ("mk: require SSE4.2 support on all x86
platforms"), SSE4.2 is a requirement when compiling on x86 platforms.

We can remove this check in the virtio driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
4819eae8d9 net/virtio: rationalize setting of Rx/Tx handlers
The selection of Rx/Tx handlers is done at several places,
group them in one function set_rxtx_funcs().

The update of hw->use_simple_rxtx is also rationalized:
- initialized to 1 (prefer simple path)
- in dev configure or rx/tx queue setup, if something prevents the use
  of the simple path, change it to 0.
- in dev start, set the handlers according to hw->use_simple_rxtx.
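
The three steps above can be sketched as below. Only the flag name
use_simple_rxtx and the function name set_rxtx_funcs() come from the commit
message; struct dev_state, the burst functions and the needs_standard_path
parameter are hypothetical stand-ins for the driver's own types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type and handlers. */
typedef int (*burst_fn)(void);

static int burst_simple(void)   { return 1; }
static int burst_standard(void) { return 2; }

struct dev_state {
    int use_simple_rxtx;
    burst_fn rx_burst;
    burst_fn tx_burst;
};

/* Step 1: start from the preferred (simple) path. */
static void dev_init(struct dev_state *d)
{
    assert(d != NULL);
    d->use_simple_rxtx = 1;
    d->rx_burst = NULL;
    d->tx_burst = NULL;
}

/* Step 2: configure and queue setup may only ever clear the flag. */
static void queue_setup(struct dev_state *d, int needs_standard_path)
{
    assert(d != NULL);
    if (needs_standard_path)
        d->use_simple_rxtx = 0;
}

/* Step 3: the single place where handlers are chosen, called at start. */
static void set_rxtx_funcs(struct dev_state *d)
{
    assert(d != NULL);
    d->rx_burst = d->use_simple_rxtx ? burst_simple : burst_standard;
    d->tx_burst = d->use_simple_rxtx ? burst_simple : burst_standard;
}
```

Because the flag can only move from 1 to 0 before start, the order in which
the setup functions run no longer affects the final choice.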

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
efc83a1e7f net/virtio: fix queue setup consistency
In rx/tx queue setup functions, some code is executed only if
use_simple_rxtx == 1. The value of this variable can change depending on
the offload flags or SSE support. If Rx queue setup is called before Tx
queue setup, it can result in an invalid configuration:

- dev_configure is called: use_simple_rxtx is initialized to 0
- rx queue setup is called: queues are initialized without simple path
  support
- tx queue setup is called: use_simple_rxtx switches to 1, and simple
  Rx/Tx handlers are selected

Fix this by postponing part of the Rx/Tx queue initialization to
dev_start(), as was the case in the initial implementation.

Fixes: 48cec290a3d2 ("net/virtio: move queue configure code to proper place")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
0c4f909c17 net/virtio: fix mbuf port for simple Rx function
The mbuf->port was not properly set for the first received
mbufs. Fix this by setting it in virtqueue_enqueue_recv_refill_simple(),
which is used to enqueue the first mbuf in the ring.

The function virtio_rxq_rearm_vec(), which is used to rearm the ring
with new mbufs, is correct and does not need to be updated.

Fixes: cab0461234e7 ("virtio: fill Rx avail ring with blank mbufs")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
78fd97c334 net/virtio: fix log levels in configure
On error, we should log with error level.

Fixes: 9f4f2846ef76 ("virtio: support vlan filtering")
Fixes: 86d59b21468a ("net/virtio: support LRO")
Fixes: 96cb6711939e ("net/virtio: support Rx checksum offload")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
9e33b79f31 doc: fix description of L4 Rx checksum offload
As described in API documentation, the field hw_ip_checksum
requests both L3 and L4 offload.

Fixes: dad1ec72a377 ("doc: document NIC features")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:51:04 +02:00
Olivier Matz
d67d86ce5b net/virtio: revert not claiming IP checksum offload
This reverts
commit 4dab342b7522 ("net/virtio: do not falsely claim to do IP checksum").

The description of rxmode->hw_ip_checksum is:

     hw_ip_checksum   : 1, /**< IP/UDP/TCP checksum offload enable. */

Despite its name, this field can be set by an application to enable L3
and L4 checksums. In case of virtio, only L4 checksum is supported and
L3 checksums flags will always be set to "unknown".

Fixes: 4dab342b7522 ("net/virtio: do not falsely claim to do IP checksum")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Olivier Matz
ec9f3d122a net/virtio: revert not claiming LRO support
This reverts
commit 701a64622c26 ("net/virtio: do not claim to support LRO")

Setting rxmode->enable_lro is a way to tell the host that the guest is
OK with receiving TSO packets. From the guest's point of view, it is like
enabling LRO on a physical driver.

Fixes: 701a64622c26 ("net/virtio: do not claim to support LRO")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Steven Luong
924da8f1c4 net/virtio-user: send kick notify backend on init
According to the vhost-user spec [0], the client must start the ring
upon receiving a kick (that is, detecting that the file descriptor
is readable) on the descriptor specified by VHOST_USER_SET_VRING_KICK.

The code sends a kick to the Rx queue but not to the Tx queue.
This patch adds the missing kick to comply with the spec.

[0]: https://fossies.org/linux/qemu/docs/specs/vhost-user.txt

Signed-off-by: Steven Luong <sluong@cisco.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Tiwei Bie
e5c494a7a2 vhost: batch small guest memory copies
This patch adaptively batches the small guest memory copies.
By batching the small copies, the efficiency of executing the
memory LOAD instructions can be improved greatly, because the
memory LOAD latency can be effectively hidden by the pipeline.
We saw great performance boosts in the small-packet PVP test.
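
The batching idea can be sketched as follows, assuming a fixed-size batch
and a size threshold. BATCH_MAX, SMALL_COPY_MAX, struct copy_batch and both
functions are hypothetical names for illustration; the patch's actual data
structures and thresholds may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BATCH_MAX      32    /* hypothetical batch capacity */
#define SMALL_COPY_MAX 256   /* hypothetical "small copy" threshold, bytes */

struct copy_desc {
    void *dst;
    const void *src;
    size_t len;
};

struct copy_batch {
    struct copy_desc elems[BATCH_MAX];
    unsigned int count;
};

/* Execute all deferred copies back to back, then empty the batch. */
static void batch_flush(struct copy_batch *b)
{
    assert(b != NULL);
    for (unsigned int i = 0; i < b->count; i++)
        memcpy(b->elems[i].dst, b->elems[i].src, b->elems[i].len);
    b->count = 0;
}

/* Defer small copies; perform large ones immediately. */
static void batch_copy(struct copy_batch *b, void *dst, const void *src,
                       size_t len)
{
    assert(b != NULL);
    if (len > SMALL_COPY_MAX) {
        memcpy(dst, src, len);
        return;
    }
    if (b->count == BATCH_MAX)
        batch_flush(b);    /* make room before queueing */
    b->elems[b->count].dst = dst;
    b->elems[b->count].src = src;
    b->elems[b->count].len = len;
    b->count++;
}
```

The caller has to invoke batch_flush() before the destination buffers are
made visible to the other side, and the source buffers must stay valid
until then.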

This patch improves the performance for small packets, and has
distinguished the packets by size. So although the performance
for big packets doesn't change, it makes it relatively easy to
do some special optimizations for the big packets too.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Zhiyong Yang
0373ab9bfc net/virtio: replace magic number with PCI constant
Use a macro instead of a magic number to improve code
readability.

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Zhiyong Yang
9ff41aa7a0 net/virtio: fix indent
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
2017-10-10 15:48:53 +02:00
Jonas Pfefferle
33604c3135 vfio: refactor PCI BAR mapping
Split pci_vfio_map_resource for primary and secondary processes.
Save all relevant mapping data in primary process to allow
the secondary process to perform mappings.

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-10-10 15:37:58 +02:00
Jonas Pfefferle
ed1e7e576b vfio: fix sPAPR IOMMU DMA window size
DMA window size needs to be big enough to span all memory segments'
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2017-10-10 15:36:04 +02:00
Shreyansh Jain
1459585888 bus/dpaa: fix memory allocation during scan
With the IOVA auto detection changes, bus scan is performed before
memory initialization. DPAA bus scan must not use rte_malloc in
its path.

Fixes: cf408c22476c ("eal: auto detect IOVA mode")

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2017-10-10 15:30:44 +02:00
Eelco Chaudron
7a88893246 doc: add use of mlockall to programmers guide
When I was adding mlockall() to the testpmd application, it was
suggested to add a reference to the use case of mlockall(). This patch
adds it.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2017-10-10 00:41:08 +01:00
Eelco Chaudron
1c036b16c2 app/testpmd: avoid pages being swapped out
Call the mlockall() function to attempt to lock all of the process
memory into physical RAM, preventing the kernel from paging any
of it to disk.
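
A minimal sketch of such a call is shown below. The helper name
lock_process_memory() is hypothetical, and the flag combination
MCL_CURRENT | MCL_FUTURE is an assumption about what an application would
typically want, not a quote from the patch. The call is treated as best
effort, since it can fail when RLIMIT_MEMLOCK is too low.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Try to lock current and future mappings into RAM.
 * Returns 0 on success, -1 if the kernel refused.
 */
static int lock_process_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall() failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

An application would call this once early in main() and decide for itself
whether a failure is fatal or only worth a warning.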

When using testpmd for performance testing, depending on the code path
taken, we see a couple of page faults in a row. These faults affect
the overall drop-rate of testpmd. On Linux the mlockall() call will
prefault all the pages of testpmd (and the DPDK libraries if linked
dynamically), even without LD_BIND_NOW.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-10-10 00:30:16 +01:00
Patrick MacArthur
e3f141879e eal: copy raw strings taken from command line
Normally, command line argument strings are considered immutable, but
SPDK [1] and urdma [2] construct argv arrays to pass to rte_eal_init().
These strings are allocated using malloc() and freed after DPDK
initialization with free(). However, in the case of --file-prefix and
--huge-dir, DPDK takes the pointer to these strings in argv directly. If
a secondary process calls rte_eal_pci_probe() after rte_eal_init()
returns, as is done by SPDK, this causes a use-after-free error because
the strings have been freed by the calling code immediately after
rte_eal_init() returns.

This problem was observed when running SPDK example programs as a
secondary process and causes the secondary processes to fail:

Starting DPDK 16.11.1 initialization...
[ DPDK EAL parameters: identify -c 4 --file-prefix=spdk3260 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Auto-detected process type: SECONDARY
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL:   probe driver: 8086:953 spdk_nvme
EAL:   cannot connect to primary process!
EAL: Error - exiting with code: 1
Cause: Requested device 0000:81:00.0 cannot be used

Running strace shows that the file prefix has been zeroed out by the
time that the secondary process attempts to probe the NVMe device.

The use-after-free errors can be easily detected with valgrind:

==8489== Invalid read of size 1
==8489==    at 0x4C30D22: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x58DB955: vfprintf (vfprintf.c:1637)
==8489==    by 0x59A4685: __vsnprintf_chk (vsnprintf_chk.c:63)
==8489==    by 0x59A45E7: __snprintf_chk (snprintf_chk.c:34)
==8489==    by 0x1246AB: get_socket_path.constprop.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x124B09: vfio_mp_sync_connect_to_primary (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x123BE4: vfio_get_group_fd.part.1 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x124366: vfio_setup_device (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x126C8A: pci_vfio_map_resource (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x12B115: pci_probe_all_drivers.part.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x12B596: rte_eal_pci_probe (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x11D5B5: spdk_pci_enumerate (pci.c:147)
==8489==  Address 0x63f362e is 14 bytes inside a block of size 32 free'd
==8489==    at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x11E6FB: spdk_free_args (init.c:136)
==8489==    by 0x11EBF5: spdk_env_init (init.c:309)
==8489==    by 0x10D2AA: main (identify.c:976)
==8489==  Block was alloc'd at
==8489==    at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x11E7D7: _sprintf_alloc (init.c:76)
==8489==    by 0x11EA78: spdk_build_eal_cmdline (init.c:251)
==8489==    by 0x11EA78: spdk_env_init (init.c:282)
==8489==    by 0x10D2AA: main (identify.c:976)
==8489==

Fix this by using strdup() to create separate memory buffers for these
strings. Note that this patch will cause valgrind to report memory
leaks of these buffers as there is nowhere to free them. Using static
buffers is an option but would make these strings have a fixed maximum
length whereas there is currently no limit defined by the API.
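
The copy-on-parse approach can be sketched as below. struct init_config and
config_set_string() are hypothetical names, not EAL internals. Unlike the
patch as described, this sketch frees a previously stored copy when the same
option is set twice.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* strdup() is POSIX, not ISO C before C23 */
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for option strings taken from argv. */
struct init_config {
    char *file_prefix;
    char *huge_dir;
};

/*
 * Store a private copy of arg in *field so the caller may free or
 * overwrite its argv strings afterwards. Returns 0 on success, -1 on
 * allocation failure (in which case *field is left unchanged).
 */
static int config_set_string(char **field, const char *arg)
{
    char *copy;

    assert(field != NULL && arg != NULL);

    copy = strdup(arg);
    if (copy == NULL)
        return -1;

    free(*field);    /* drop any earlier copy */
    *field = copy;
    return 0;
}
```

With this in place, the stored prefix stays valid even after the calling
code releases the buffer it passed in.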

[1] http://spdk.io
[2] https://github.com/zrlio/urdma

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:25:13 +02:00
Seth Howell
7485e06c2a mem: check mmap failure
If mmap fails, it will return the value MAP_FAILED. Checking for this
return code allows us to properly identify mmap failures and report
them as such to the calling function.

Signed-off-by: Seth Howell <seth.howell@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:17:04 +02:00
Xueming Li
41baec55a8 mem: fix malloc element free in debug mode
malloc_elem_free() clears (sets to 0) the trailer cookie when
RTE_MALLOC_DEBUG is enabled. When the element is joined with a free
neighbor, part of the joined memory is not cleared because the length
of the trailer cookie in the middle is not counted.

This patch fixes calculation of free memory length to be cleared in
malloc_elem_free() by including trailer cookie.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:15:45 +02:00
Xueming Li
3cd4e0e883 mem: fix malloc debug config
This patch replaces broken macro RTE_LIBRTE_MALLOC_DEBUG with
RTE_MALLOC_DEBUG.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:15:45 +02:00
Xueming Li
f385306357 config: add option to enable asserts
Currently, enabling assertions requires setting CONFIG_RTE_LOG_LEVEL to
RTE_LOG_DEBUG. CONFIG_RTE_LOG_LEVEL is the default log level of the control
path, and RTE_LOG_DP_LEVEL is the log level of the data path. It is hard
to infer from the names that assertions are controlled by the control path
log level, especially for assertions used on the data path.

On the other hand, DPDK needs a switch that enables assertions without
affecting the log output level, assuming "--log-level" is not specified.

Assertions are an important API for balancing DPDK's high performance and
robustness. To promote their usage, it is valuable to decouple
assertions from CONFIG_RTE_LOG_LEVEL.

In a word, log is log, assertion is assertion, debug is hot pot :)

This patch introduces a dedicated switch for
assertions: RTE_ENABLE_ASSERT
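
The idea of a dedicated switch can be sketched with a stand-alone macro.
APP_ENABLE_ASSERT, APP_ASSERT and ring_index() are hypothetical names used
only for illustration; they are not the DPDK definitions.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Define APP_ENABLE_ASSERT at build time (e.g. -DAPP_ENABLE_ASSERT) to
 * turn the checks on; the log level plays no part in the decision.
 */
#ifdef APP_ENABLE_ASSERT
#define APP_ASSERT(exp) do { \
    if (!(exp)) { \
        fprintf(stderr, "%s:%d: assertion \"%s\" failed\n", \
                __FILE__, __LINE__, #exp); \
        abort(); \
    } \
} while (0)
#else
#define APP_ASSERT(exp) do { } while (0)
#endif

/* Data-path style helper: the check costs nothing when disabled. */
static unsigned int ring_index(unsigned int pos, unsigned int size)
{
    APP_ASSERT(size != 0 && (size & (size - 1)) == 0); /* power of two */
    return pos & (size - 1);
}
```

Note that the expression is not evaluated at all when the switch is off, so
it must not carry side effects the program relies on.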

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-09 23:15:45 +02:00
Zhiyong Yang
08ef593a70 test: add check for AVX512F
The CPUs which support AVX512 have been released. Add support for
checking AVX512F instruction set.

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 16:26:35 +02:00
Pablo de Lara
ca5aeab31d mempool/octeontx: fix icc build
drivers/mempool/octeontx/octeontx_fpavf.c(789):
error #592: variable "fpa" is used before its value is set
        RTE_SET_USED(fpa);

Fixes: 1c842786fe6c ("mempool/octeontx: probe fpavf PCIe devices")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 16:17:33 +02:00
Jianfeng Tan
f26ab687a7 eal: remove Xen dom0 support
We remove Xen-specific code in EAL, including the option --xen-dom0,
memory initialization code, build dependencies, etc.

Related documents are removed or updated, and the EAL library
version is bumped.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-10-09 01:54:29 +02:00
Jianfeng Tan
a7cb2e20d2 mem: remove API to get physical address in dom0
Previously, this API was a wrapper to obtain the "physical address",
used to get the MFN address in dom0.

As we will remove Xen dom0 support, this API is no longer necessary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:52:37 +02:00
Jianfeng Tan
1950bd7694 xen: remove dependency in libraries
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:52:08 +02:00
Jianfeng Tan
a641f1f9d5 xen: remove dependency in applications
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:51:58 +02:00
Jianfeng Tan
8b3746e8f7 net/xenvirt: remove
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:11:48 +02:00
Jianfeng Tan
1125122ca4 examples/vhost_xen: remove
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:06:39 +02:00
Jacek Piasecki
154cdbb039 test/cfgfile: add realloc scenario
Load huge realloc_sections.ini file to check malloc/realloc
ability of cfgfile library.

Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 00:52:37 +02:00
Jacek Piasecki
a6a47ac9c2 cfgfile: rework load function
New functions added to cfgfile library make it possible
to significantly simplify the code of rte_cfgfile_load_with_params()

This patch shows the new body of this function.

Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 00:50:48 +02:00
Jacek Piasecki
d4cb819758 cfgfile: support runtime modification
Extend the existing cfgfile library with new API functions:

rte_cfgfile_create() - create new cfgfile object
rte_cfgfile_add_section() - add new section to existing cfgfile
object
rte_cfgfile_add_entry() - add new entry to existing cfgfile
object in specified section
rte_cfgfile_set_entry() - update existing entry in cfgfile object
rte_cfgfile_save() - save existing cfgfile object to INI file

This modification allows creating a cfgfile at
runtime and opens up the possibility for applications to
dynamically build up a proper DPDK configuration, rather than having
to have a pre-existing one.

Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 00:50:25 +02:00
Jacek Piasecki
b82a987ffc cfgfile: rework to flat arrays
Changing to flat arrays in the cfgfile struct forces slightly
different data access in most cfgfile functions.
This patch makes the necessary changes to the existing API.

Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 00:45:11 +02:00
Jacek Piasecki
250fef469e cfgfile: remove EAL dependency
This patch removes the dependency on EAL in the cfgfile library.

Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 00:44:59 +02:00
Yipeng Wang
55694b2a9f doc: add membership documentation
This patch adds the documentation for membership library.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2017-10-09 00:23:59 +02:00
Yipeng Wang
0cc67a96e4 test/member: add functional and perf tests
This patch adds functional and performance tests for membership
library.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 00:02:45 +02:00
Yipeng Wang
703be9531a member: add AVX for HT mode
For key search, the signatures of all entries are compared against
the signature of the key that is being looked up. Since all
signatures are contiguously put in a bucket, they can be compared
with vector instructions (AVX2), achieving higher lookup performance.

This patch adds AVX2 implementation in a separate header file.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 00:02:45 +02:00
Yipeng Wang
54b8edc07c member: implement vBF mode
Bloom Filter (BF) [1] is a well-known space-efficient
probabilistic data structure that answers set membership queries.
Vector of Bloom Filters (vBF) is an extension to traditional BF
that supports multi-set membership testing. Traditional BF will
return found or not-found for each key. vBF will also return
which set the key belongs to if it is found.
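
A toy vBF can be sketched as below to make the lookup semantics concrete.
It keeps one bit array per set and returns the first set whose filter
matches. All names, sizes and the hash mixer are illustrative assumptions
and do not reflect the library's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VBF_NUM_SETS   4
#define VBF_BITS       1024   /* bits per filter */
#define VBF_NUM_HASHES 3

struct vbf {
    uint8_t bits[VBF_NUM_SETS][VBF_BITS / 8];   /* one BF per set */
};

/* Simple integer mixer standing in for a real hash function. */
static uint32_t vbf_hash(uint32_t key, uint32_t seed)
{
    uint32_t h = key ^ (seed * 0x9e3779b9u);

    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h % VBF_BITS;
}

static void vbf_init(struct vbf *f)
{
    assert(f != NULL);
    memset(f, 0, sizeof(*f));
}

/* Insert key into the filter of set_id (valid IDs: 1..VBF_NUM_SETS). */
static void vbf_add(struct vbf *f, uint32_t key, unsigned int set_id)
{
    assert(f != NULL && set_id >= 1 && set_id <= VBF_NUM_SETS);
    for (uint32_t i = 0; i < VBF_NUM_HASHES; i++) {
        uint32_t bit = vbf_hash(key, i);

        f->bits[set_id - 1][bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* Return the first set whose filter matches key, or 0 for not-found. */
static unsigned int vbf_lookup(const struct vbf *f, uint32_t key)
{
    assert(f != NULL);
    for (unsigned int s = 0; s < VBF_NUM_SETS; s++) {
        int hit = 1;

        for (uint32_t i = 0; i < VBF_NUM_HASHES && hit; i++) {
            uint32_t bit = vbf_hash(key, i);

            hit = (f->bits[s][bit / 8] >> (bit % 8)) & 1;
        }
        if (hit)
            return s + 1;
    }
    return 0;
}
```

An inserted key is always found in some set (no false negatives), but a
lookup may report a set the key was never added to if all of its bits
happen to be set there.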

Since each set requires a BF, vBF should be used when the set count
is small. vBF's false positive rate can be set appropriately so
that its memory requirement and lookup speed are better in certain
cases compared to the HT based set-summary.

This patch adds the vBF implementation.

[1] B H Bloom, “Space/Time Trade-offs in Hash Coding with Allowable
Errors,” Communications of the ACM, 1970.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 00:02:45 +02:00
Yipeng Wang
904ec78a23 member: implement HT mode
One of the set-summary structures is hash-table based
set-summary (HTSS). One example is cuckoo filter [1].

Compared to a traditional hash table, HTSS has a much more
compact structure. For each element, only one signature and
its corresponding set ID is stored. No key comparison is required
during lookup. For the table structure, there are multiple entries
in each bucket, and the table is composed of many buckets.

Two modes are supported for HTSS, "cache" and "non-cache" modes.
The non-cache mode is similar to the cuckoo filter [1].
When a bucket is full, one entry will be evicted to its
alternative bucket to make space for the new key. The table could
be full and then no more keys could be inserted. This mode has
a false-positive rate but no false negatives. Multiple entries
with the same signature could stay in the same bucket.

The "cache" mode does not evict a key to its alternative bucket
when a bucket is full; instead, an existing key will be evicted out of
the table like a cache. Thus, the table will never reject keys when
it is full. Another property is that in each bucket, there cannot be
multiple entries with the same signature. This mode could have both
false-positive and false-negative probability.

This patch adds the implementation of HTSS.

[1] B Fan, D G Andersen and M Kaminsky, “Cuckoo Filter: Practically
Better Than Bloom,” in Conference on emerging Networking
Experiments and Technologies, 2014.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 00:02:45 +02:00
Yipeng Wang
857ed6c68c member: implement main API
Membership library is an extension and generalization of a traditional
filter (for example Bloom Filter and cuckoo filter) structure.
In general, the Membership library is a data structure that provides a
"set-summary" and responds to set-membership queries of whether a
certain element belongs to a set(s). A membership test for an element
will return the set this element belongs to or not-found if the
element is never inserted into the set-summary.

The results of the membership test are not 100% accurate. Certain
false positive or false negative probability could exist. However,
compared to a "full-blown" complete list of elements, a "set-summary"
is memory efficient and fast on lookup.

This patch adds the main API definition.

Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-09 00:02:45 +02:00
CongWen Zhang
dfdc2940cb jobstats: fix a doxygen comment
Signed-off-by: CongWen Zhang <zhang.congwen@zte.com.cn>
Reviewed-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
2017-10-08 22:22:08 +02:00
Santosh Shukla
2baa3f0b7d mempool/octeontx: support memory area ops
Add support for register_memory_area ops in mempool driver.

Allow more than one HW pool when using the OcteonTx mempool driver
by storing each pool's information in a list and finding the appropriate
list element by matching the rte_mempool pointers.

Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-08 19:30:50 +02:00
Santosh Shukla
06935a4f45 mempool/octeontx: support capabilities query
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-08 19:30:50 +02:00
Santosh Shukla
e48de68d89 mempool/octeontx: support count query
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-08 19:30:50 +02:00
Santosh Shukla
e4f6731454 mempool/octeontx: support enqueue and dequeue
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-08 19:30:50 +02:00