When offload is enabled, vhost needs to access the first mbuf
to get the packet info, e.g. TCP header. So we couldn't delay
the data copy in this case.
Fixes: e5c494a7a22b ("vhost: batch small guest memory copies")
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Commit 3ea7052f4b1b ("vhost: postpone rings addresses translation")
moves rings addresses translation at either vring kick or enable
time, depending on whether protocol features are enabled or not.
This is done not interpret ring information as long as the vring
is not fully initialized.
The problem is that with old QEMU versions, like v2.5, the ring
is enabled before addresses are sent, so addresses are never
translated.
This patch fixes the issue by doing the translation in
VHOST_USER_SET_VRING_ADDR handling if ring is already enabled.
Fixes: 3ea7052f4b1b ("vhost: postpone rings addresses translation")
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Introduce mask-based hash functions in hash_func.h.
Propagate their usage in test/test, test/test-pipeline and
examples/ip_pipeline.
Remove the non-mask-based hash function prototype from API (which
was previously used as build workaround).
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rework for the 32-byte key hash tables (both the extendible
bucket and LRU)to use the mask-based hash function and the
unified parameter structure.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rework for the 16-byte key hash tables (both the extendible
bucket and LRU)to use the mask-based hash function and the
unified parameter structure.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rework for the 8-byte key hash tables (both the extendible
bucket and LRU)to use the mask-based hash function and the
unified parameter structure.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rework for the variable size key LRU hash table to use the
mask-based hash function and the unified parameter structure.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rework for the variable size key extendible bucket (EXT) hash
table to use the mask-based hash function and the unified
parameter structure.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Add unified parameter structure for all hash tables in librte_table.
Add mask-based hash function prototype, which is input parameter for
all hash tables.
Renamed the non-mask-based hash function prototype and all the calls
to it (to be removed later).
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The non-dosig version of the variable size key Least Recently Used
(LRU) hash tables are removed. The remaining hash tables are renamed
to eliminate the dosig particle from their name.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The non-dosig version of the variable size key extendible bucket
hash tables are removed. The remaining hash tables are renamed to
eliminate the dosig particle from their name.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The non-dosig version of the 16-byte key hash tables (both extendable
bucket and LRU) are removed. The remaining hash tables are renamed to
eliminate the dosig particle from their name.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The non-dosig version of the 8-byte key hash tables (both extendable
bucket and LRU) are removed. The remaining hash tables are renamed to
eliminate the dosig particle from their name.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The list of libraries in LDLIBS was generated from the DEPDIRS-xyz
variable. This is valid when the subdirectory name match the library
name, but it's not always the case, especially for PMDs.
The patches removes this feature and explicitly adds the proper
libraries in LDLIBS.
Some DEPDIRS-xyz variables become useless, remove them.
Reported-by: Gage Eads <gage.eads@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Gage Eads <gage.eads@intel.com>
Since the functions exported by DPDK EAL on all OS's should be
identical, we should not need separate function version files for each
OS. Therefore move existing version files to the top-level EAL
directory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The linuxapp and bsdapp interrupt header files are now identical, so
merge them into a common file in common/include.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
A number of interrupt functions only existed on Linux. Adding in stubs
for these functions corrects this omission, and allows the map files for
both Linux and FreeBSD to be identical.
Fixes: 9efe9c6cdcac ("eal/linux: add epoll wrappers")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The bsdapp-specific rte_interrupts.h file does not need to be different
from the linuxapp one, as there is nothing Linux specific in the APIs or
data structures. This will then allow us to merge the files in a common
location to avoid duplication.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
If the default location for the PMD .so files does not exist, it should
not be treated as a fatal error condition like an incorrect path on the
command line. Therefore check that the path exists and is a directory
before adding it to the list of paths to check for PMDs.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Remove rte_set_log_level(), rte_get_log_level(),
rte_set_log_type(), and rte_get_log_type().
Also update librte_eal.so version in docuementation.
The LIBABIVER variable in eal has already been modified in
commit f26ab687a74f ("eal: remove Xen dom0 support").
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
When getting group fd from primary process, secondary wasn't storing
the fd anywhere, leading to a (harmless) error message in EAL logs,
and (not so harmless) potential problems when hot-unplugging devices
managed by VFIO in a secondary process.
Fix it by actually storing the group fd whenever we get a valid one
from the secondary process.
Fixes: 94c0776b1bad ("vfio: support hotplug")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
VFIO may be used by buses other than PCI. This patch enables
the VFIO on the basis of vfio root presence.
Since vfio_enable should be called only once, pci_vfio_enable
is also removed.
A debug print is added in case vfio_pci module is not present.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch introduces new ethdev generic API for Traffic Metering and
Policing (MTR), which is yet another standard RX offload for Ethernet
devices.
Similar to rte_flow and rte_tm APIs, the configuration of MTR objects is
done in their own namespace (rte_mtr) within the librte_ether library.
Main features:
1. Traffic metering: determine the color for the current packet (green,
yellow, red) based on history maintained by the MTR object. Supported
algorithms: srTCM (RFC 2697), trTCM (RFC 2698 and RFC 4115).
2. Policing (per meter output color actions): re-color the packet (keep
or change the meter output color) or drop the packet.
3. Statistics
4. Capability API
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Following similar approach as rte_flow and rte_tm for modularity reasons,
the ops for the new rte_mtr API are retrieved through a new eth_dev_ops
function.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Metering and policing action typically sits on top of flow classification,
which is why MTR objects are enabled through a newly introduced flow
action.
The configuration of MTR objects is done in their own namespace (rte_mtr)
within the librte_ether library. The MTR object is hooked into ethdev RX
processing path using the "meter" flow action.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Compile fails when kernel version is <= 3.17 with error:
"dereferencing pointer to incomplete type". This is because struct
uio_device definition is not exposed in kernel earlier than 3.17.
This patch fixes it by using pointer of rte_uio_pci_dev as
dev_id instead of uio_device for irq device handler.
Fixes: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in VM")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
build error:
build/lib/librte_eal/linuxapp/kni/kni_net.c:215:5: error:
‘struct net_device’ has no member named ‘trans_start’
dev->trans_start = jiffies;
Signed-off-by: Nirmoy Das <ndas@suse.de>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
If pass-through a VF by vfio-pci to a Qemu VM, after FLR
in VM, the interrupt setting is not recoverd correctly
to host as below:
in VM guest:
Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
in Host:
Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
That was because in pci_reset_function, it first reads the
PCI configure and set FLR reset, and then writes PCI configure
as restoration. But not all the writing are successful to Host.
Because vfio-pci driver doesn't allow directly write PCI MSI-X
Cap.
To fix this issue, we need to move the interrupt enablement from
igb_uio probe to open device file. While it is also the similar as
the behaviour in vfio_pci kernel module code.
Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
In case of NUMA reallocation, the virtqueue struct is reallocated
on another socket, meaning that its address changes.
In translate_ring_addresses(), addr pointer was not fetched again
after the reallocation, so it pointed to freed memory.
This patch just fetch again addr pointer after the reallocation.
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
In case of NUMA reallocation, virtqueue's iotlb list is broken,
has its head changes but first iotlb entry in the list still points
to the previous head pointer.
Also, in case of reallocation, we want the IOTLB cache mempool to be
on the new socket.
This patch perform a full re-init of the IOTLB cache when mempool
already exists, and calls the IOTLB cache init function in case
the virtqueue is being reallocated on a new socket.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
An optimization was done to only take the iotlb cache lock
once per packet burst instead of once per IOVA translation.
With this, IOTLB miss requests are sent to Qemu with the lock
held, which can cause a deadlock if the socket buffer is full,
and if Qemu is waiting for an IOTLB update to be done.
Holding the lock is not necessary when sending an IOTLB miss
request, as it is not manipulating the IOTLB cache list, which
the lock protects. Let's just release it while sending the
IOTLB miss.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Compiler error:
irte_efd.o: In function `rte_efd_lookup':
rte_efd.c:(.text+0x6d6e): undefined reference to `efd_lookup_internal_avx2'
rte_efd.o: In function `rte_efd_lookup_bulk':
rte_efd.c:(.text+0x87d4): undefined reference to `efd_lookup_internal_avx2'
This can be observed with a compiler that doesn't support AVX2 and
shared build.
Fixes: 86d898968826 ("efd: add AVX2 vector lookup function")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
MSI masks contain a 1 if interrupt is masked, 0 if unmasked.
I got that wrong with the !!state calculation. For better
readability, the mask is now changed like in igbuio_msi_mask_irq.
Fixes: a8ea1e5fb647 ("igb_uio: fix unknown MSI symbols")
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Tested-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch partially reverts the commit d196343a258e and adds some
functions from Markus' previous version of the patch [1].
igb_uio uses pci_msi_unmask_irq() and pci_msi_mask_irq() kernel APIs
when kernel version is >= 3.19 because these APIs are implemented in
this Linux kernel version.
But these APIs only exported beginning from Linux kernel 4.5, so before
this Linux kernel version igb_uio kernel module is not usable,
and giving following warnings:
"igb_uio: Unknown symbol pci_msi_unmask_irq"
"igb_uio: Unknown symbol pci_msi_mask_irq"
The support for these APIs increased to Linux kernel >= 4.5
For older version of Linux kernel unmask_msi_irq() and mask_msi_irq()
are used but these functions are not exported at all.
Instead of these functions switched back to previous implementation in
igb_uio for MSI-X, and for MSI used igbuio_msi_mask_irq() from [1].
[1]
http://dpdk.org/dev/patchwork/patch/28144/
Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch enables x86 EFD file be compiled only if the compiler
supports AVX2 since it is already chosen at run-time.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
This patch dynamically selects functions of memcpy at run-time based
on CPU flags that current machine supports. This patch uses function
pointers which are bind to the relative functions at constrctor time.
In addition, AVX512 instructions set would be compiled only if users
config it enabled and the compiler supports it.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
First, try to use CPUID Time Stamp Counter and Nominal Core Crystal
Clock Information Leaf to determine the tsc hz on platforms that
supports it (does not require privileged user).
If the CPUID leaf is not available, then try to determine the tsc hz by
reading the MSR 0xCE (requires privileged user).
Default to the tsc hz estimation if both methods fail.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
In ppc_64, rte_rdtsc() returns timebase register value which increments
at independent timebase frequency and hence not related to lcore cpu
frequency to derive TSC hz. Hence, we stick with master lcore frequency.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Use cntvct_el0 system register to get the system counter frequency.
If the system is configured with RTE_ARM_EAL_RDTSC_USE_PMU then
return 0(let the common code calibrate the tsc frequency).
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
When calibrating the TSC frequency, first, probe the architecture specific
function. If not available, use the existing calibrate scheme.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>