Extend existing cfgfile library with providing new API functions:
rte_cfgfile_create() - create new cfgfile object
rte_cfgfile_add_section() - add new section to existing cfgfile
object
rte_cfgfile_add_entry() - add new entry to existing cfgfile
object in specified section
rte_cfgfile_set_entry() - update existing entry in cfgfile object
rte_cfgfile_save() - save existing cfgfile object to INI file
This modification allows to create a cfgfile on
runtime and opens up the possibility to have applications
dynamically build up a proper DPDK configuration, rather than having
to have a pre-existing one.
Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Change to flat arrays in cfgfile struct force slightly
different data access for most of cfgfile functions.
This patch provides necessary changes in existing API.
Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch removes the dependency to EAL in cfgfile library.
Signed-off-by: Jacek Piasecki <jacekx.piasecki@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
For key search, the signatures of all entries are compared against
the signature of the key that is being looked up. Since all
signatures are contiguously put in a bucket, they can be compared
with vector instructions (AVX2), achieving higher lookup performance.
This patch adds AVX2 implementation in a separate header file.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Bloom Filter (BF) [1] is a well-known space-efficient
probabilistic data structure that answers set membership queries.
Vector of Bloom Filters (vBF) is an extension to traditional BF
that supports multi-set membership testing. Traditional BF will
return found or not-found for each key. vBF will also return
which set the key belongs to if it is found.
Since each set requires a BF, vBF should be used when set count
is small. vBF's false positive rate could be set appropriately so
that its memory requirement and lookup speed is better in certain
cases comparing to HT based set-summary.
This patch adds the vBF implementation.
[1]B H Bloom, “Space/Time Trade-offs in Hash Coding with Allowable
Errors,” Communications of the ACM, 1970.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
One of the set-summary structures is hash-table based
set-summary (HTSS). One example is cuckoo filter [1].
Comparing to a traditional hash table, HTSS has a much more
compact structure. For each element, only one signature and
its corresponding set ID is stored. No key comparison is required
during lookup. For the table structure, there are multiple entries
in each bucket, and the table is composed of many buckets.
Two modes are supported for HTSS, "cache" and "none-cache" modes.
The non-cache mode is similar to the cuckoo filter [1].
When a bucket is full, one entry will be evicted to its
alternative bucket to make space for the new key. The table could
be full and then no more keys could be inserted. This mode has
false-positive rate but no false-negative. Multiple entries
with same signature could stay in the same bucket.
The "cache" mode does not evict key to its alternative bucket
when a bucket is full, an existing key will be evicted out of
the table like a cache. Thus, the table will never reject keys when
it is full. Another property is in each bucket, there cannot be
multiple entries with same signature. The mode could have both
false-positive and false-negative probability.
This patch adds the implementation of HTSS.
[1] B Fan, D G Andersen and M Kaminsky, “Cuckoo Filter: Practically
Better Than Bloom,” in Conference on emerging Networking
Experiments and Technologies, 2014.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Membership library is an extension and generalization of a traditional
filter (for example Bloom Filter and cuckoo filter) structure.
In general, the Membership library is a data structure that provides a
"set-summary" and responds to set-membership queries of whether a
certain element belongs to a set(s). A membership test for an element
will return the set this element belongs to or not-found if the
element is never inserted into the set-summary.
The results of the membership test are not 100% accurate. Certain
false positive or false negative probability could exist. However,
comparing to a "full-blown" complete list of elements, a "set-summary"
is memory efficient and fast on lookup.
This patch adds the main API definition.
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
IGB_UIO compilation recently got enabled for ARM64 by default
The igb_uio compilation against ARM64 based stock 4.x (e.g. 4.13)
kernel is giving compilation warnings:
igb_uio.c: In function ‘igbuio_pci_irqcontrol’:
igb_uio.c:115:25: error: implicit declaration of function
‘irq_get_irq_dat ’ [-Werror=implicit-function-declaration]
struct irq_data *irq = irq_get_irq_data(udev->info.irq);
^
igb_uio.c:115:25: error: initialization makes pointer from integer without
a cast [-Werror=int-conversion]
Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X")
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Use rte_bsf32 and fast bit unset operation to optimize the
softrss computation.
The following measurements shows improvement over the default
softrss computation function.
tuple lens old(cycles) new(cycles)
3 1225 337
9 3743 992
Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Reviewed-by: Vladimir Medvedkin <medvedkinv@gmail.com>
When adding a new entry in a hash table, there is
a maximum number of evictions that can be
performed. When the counter of these evictions reaches
this maximum, the entry cannot be added, as it is considered
that the algorithm has encountered an infinite loop.
The problem with the current implementation, is that this
counter was declared as a static variable.
If there are multiple threads adding entries in the same table
or in different tables, they should access different counters,
one per core and per table.
Therefore, the variable has been modified to be non-static.
Fixes: 243e93a5046f ("hash: fix unlimited cuckoo path")
Cc: stable@dpdk.org
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This is not bugfix, but it's convenient to help developer
to review and maintain the igbuio codes.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
This patch adds MSI IRQ mode in a way, that should
also work on older kernel versions. The base for my patch
was an attempt to do this in cf705bc36c which was later
reverted in d8ee82745a. Compilation was tested on Linux 3.2,
4.10 and 4.12.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
For better readability throughout the module, the destruction
order is changed to the exact inverse construction order.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The patch which introduced the usage of pci_alloc_irq_vectors
came after the patch which switched to non-threaded ISR (f0d1896fa1),
but did not use non-threaded ISR, if pci_alloc_irq_vectors
is used.
Fixes: 99bb58f3adc7 ("igb_uio: switch to new irq function for MSI-X")
Cc: stable@dpdk.org
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
igb_uio already allocates irqs using pci_alloc_irq_vectors on
recent kernels >= 4.8. The interrupt disable code was not
using the corresponding pci_free_irq_vectors, but the also
deprecated pci_disable_msix, before this fix.
Fixes: 99bb58f3adc7 ("igb_uio: switch to new irq function for MSI-X")
Cc: stable@dpdk.org
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Interrupt setup code in igb_uio has to deal with multiple
types of interrupts and kernel versions. This patch moves
the setup and teardown code into own functions, to make
it more readable.
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Tested-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
detect SLE version reverse chronologically as ">=" is being used.
Fixes: 2972254ce163 ("kni: fix build on Suse 12 SP3")
Cc: stable@dpdk.org
Signed-off-by: Nirmoy Das <ndas@suse.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
HW pool manager e.g. Octeontx SoC demands s/w to program start and end
address of pool. Currently, there is no such api in external mempool.
Introducing rte_mempool_ops_register_memory_area api which will let HW(pool
manager) to know when common layer selects hugepage:
For each hugepage - Notify its start/end address to HW pool manager.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Some mempool hw like octeontx/fpa block, demands block size
(/total_elem_sz) aligned object start address.
Introducing an MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag.
If this flag is set:
- Align object start address(vaddr) to a multiple of total_elt_sz.
- Allocate one additional object. Additional object is needed to make
sure that requested 'n' object gets correctly populated.
Example:
- Let's say that we get 'x' size of memory chunk from memzone.
- And application has requested 'n' object from mempool.
- Ideally, we start using objects at start address 0 to...(x-block_sz)
for n obj.
- Not necessarily first object address i.e. 0 is aligned to block_sz.
- So we derive 'offset' value for block_sz alignment purpose i.e..'off'.
- That 'off' makes sure that start address of object is blk_sz aligned.
- Calculating 'off' may end up sacrificing first block_sz area of
memzone area x. So total number of the object which can fit in the
pool area is n-1, Which is incorrect behavior.
Therefore we request one additional object (/block_sz area) from memzone
when MEMPOOL_F_CAPA_BLK_ALIGNED_OBJECTS flag is set.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The memory area containing all the objects must be physically
contiguous.
Introducing MEMPOOL_F_CAPA_PHYS_CONTIG flag for such use-case.
The flag useful to detect whether pool area has sufficient space
to fit all objects. If not then return -ENOSPC.
This way, we make sure that all object within a pool is contiguous.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Allow the mempool driver to advertise his pool capabilities.
For that pupose, an api(rte_mempool_ops_get_capabilities)
and ->get_capabilities() handler has been introduced.
- Upon ->get_capabilities() call, mempool driver will advertise
his capabilities to mempool flags param.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
xmem_size and xmem_usage need to know the status of mempool flags,
so add 'flags' arg in _xmem_size/usage() api.
Following patch will make use of that.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
mp->flags is int and mempool API writes unsigned int
value in 'flags', so fix the 'flags' data type.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Now that dpdk supports more than one mempool drivers and
each mempool driver works best for specific PMD, example:
- sw ring based mempool for Intel PMD drivers.
- dpaa2 HW mempool manager for dpaa2 PMD driver.
- fpa HW mempool manager for Octeontx PMD driver.
Application would like to know the best mempool handle
for any port.
Introducing rte_eth_dev_pool_ops_supported() API,
which allows PMD driver to advertise
his supported pool capability to the application.
Supported pools are categorized in below priority:-
- Best mempool handle for this port (Highest priority '0')
- Port supports this mempool handle (Priority '1')
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
DPDK has support for both sw and hw mempool and
currently user is limited to use ring_mp_mc pool.
In case user want to use other pool handle,
need to update config RTE_MEMPOOL_OPS_DEFAULT, then
build and run with desired pool handle.
Introducing eal option to override default pool handle.
Now user can override the RTE_MEMPOOL_OPS_DEFAULT by passing
pool handle to eal `--mbuf-pool-ops-name=""`.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
iova autodetection depends on rte_bus_scan result. Result of bus scan will
have updated device_list and each device in that list has its '.kdev' state
updated. That kdrv state used to detect iova mapping mode for that device.
_device_parse() has dependency on rt_bus_scan so,
Below calls moved up in the eal initialization order:
- eal_option_device_parse
- rte_bus_scan
And based on the result of rte_bus_scan_iommu_class - select iova
mapping mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Introducing rte_eal_iova_mode() helper API. This API
used by non-eal library for detecting iova mode.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
API(rte_bus_get_iommu_class) helps to automatically detect and select
appropriate iova mapping scheme for iommu capable device on that bus.
Algorithm for iova scheme selection for bus:
0. Iterate through bus_list.
1. Collect each bus iova mode value and update into 'mode' var.
2. Mode selection scheme is:
if mode == 0 then iova mode is _pa,
if mode == 1 then iova mode is _pa,
if mode == 2 then iova mode is _va,
if mode == 3 then iova mode ia _pa.
So mode !=2 will be default iova mode (_pa).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Get iommu class of PCI device on the bus and returns preferred iova
mapping mode for that bus.
Patch also introduces RTE_PCI_DRV_IOVA_AS_VA drv flag.
Flag used when driver needs to operate in iova=va mode.
Algorithm for iova scheme selection for PCI bus:
0. If no device bound then return with RTE_IOVA_DC mapping mode,
else goto 1).
1. Look for device attached to vfio kdrv and has .drv_flag set
to RTE_PCI_DRV_IOVA_AS_VA.
2. Look for any device attached to UIO class of driver.
3. Check for vfio-noiommu mode enabled.
If 2) & 3) is false and 1) is true then select
mapping scheme as RTE_IOVA_VA. Otherwise use default
mapping scheme (RTE_IOVA_PA).
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Introducing rte_pci_get_iommu_class API which helps to get iommu class
of PCI device on the bus and returns preferred iova mapping mode for
PCI bus.
Patch also adds rte_pci_get_iommu_class definition for:
- bsdapp: api returns default iova mode.
- linuxapp: Has stub implementation, Followup patch has complete
implementation.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Export rte_pci_match() function as it needed in the followup patch.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
As for the testpmd flow command which uses uint16_t since the beginning by
chance, switch to portid_t for consistency.
Fixes: 14ab03825b1d ("ethdev: increase port id range")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
In order to support more than 256 virtual ports, the field "port"
in rte_mbuf has been increased to 16 bits. The initialization/reset
value of the field "port" should be changed from 0xff to 0xffff
accordingly.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Extend port_id definition from uint8_t to uint16_t in lib and drivers
data structures, specifically rte_eth_dev_data. Modify the APIs,
drivers and app using port_id at the same time.
Fix some checkpatch issues from the original code and remove some
unnecessary cast operations.
release_17_11 and deprecation docs have been updated in this patch.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC rte_metrics.o
rte_metrics.c: In function
‘rte_metrics_reg_names’: rte_metrics.c:153:22: error: ‘entry’
may be used uninitialized in this function
[-Werror=maybe-uninitialized]
entry->idx_next_set = 0;
~~~~~~~~~~~~~~~~~~~~^~~
This is a false positive, gcc is not able to see that we always go in
the loop at least once, initializing entry.
Fix the warning by initializing entry to NULL.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC cmdline_parse.o
cmdline_parse.c: In function ‘match_inst’:
cmdline_parse.c:227:5: error: ‘token_p’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]
if (token_p) {
^
This is a false positive, gcc is not able to see that we always go in
the loop at least once, initializing token_p.
Fix the warning by initializing token_p to NULL.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The compilation with gcc-6.3.0 and EXTRA_CFLAGS=-Og gives the following
error:
CC eal_pci_uio.o
eal_pci_uio.c: In function ‘pci_get_uio_de ’:
eal_pci_uio.c:221:9: error: ‘uio_num’ may be used uninitialized in
this function [-Werror=maybe-uninitialized]
return uio_num;
^~~~~~~
This is a false positive: gcc is not able to see that when e != NULL,
uio_num is always set. Fix the warning by initializing uio_num to -1,
and by the way, change the type to int.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Remove unnecessary check for new flow type for rss hash filter update.
Signed-off-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
PMDs which expose this offload cap supports optimization for fast release
of mbufs following successful Tx.
Such optimization requires that per queue, all mbufs come from the same
mempool and has refcnt = 1.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce a new API to configure Tx offloads.
In the new API, offloads are divided into per-port and per-queue
offloads. The PMD reports capability for each of them.
Offloads are enabled using the existing DEV_TX_OFFLOAD_* flags.
To enable per-port offload, the offload should be set on both device
configuration and queue configuration. To enable per-queue offload, the
offloads can be set only on queue configuration.
In addition the Tx offloads will be disabled by default and be
enabled per application needs. This will much simplify PMD management of
the different offloads.
Applications should set the ETH_TXQ_FLAGS_IGNORE flag on txq_flags
field in order to move to the new API.
The old Tx offloads API is kept for the meanwhile, in order to enable a
smooth transition for PMDs and application to the new API.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce a new API to configure Rx offloads.
In the new API, offloads are divided into per-port and per-queue
offloads. The PMD reports capability for each of them.
Offloads are enabled using the existing DEV_RX_OFFLOAD_* flags.
To enable per-port offload, the offload should be set on both device
configuration and queue configuration. To enable per-queue offload, the
offloads can be set only on queue configuration.
Applications should set the ignore_offload_bitfield bit on rxmode
structure in order to move to the new API.
The old Rx offloads API is kept for the meanwhile, in order to enable a
smooth transition for PMDs and application to the new API.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The assertion of return value from the open() function is done against
0, while it is a correct value - open() returns -1 in case of an error.
It causes problems while trying to run as a daemon, in which case, this
call to open() will return 0 as a valid descriptor.
Fixes: b94e5c9406b5 ("eal/arm: add CPU flags for ARMv7")
Fixes: 97523f822ba9 ("eal/arm: add CPU flags for ARMv8")
Fixes: 9ae155385686 ("eal/ppc: cpu flag checks for IBM Power")
Cc: stable@dpdk.org
Signed-off-by: Lukasz Majczak <lma@semihalf.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>