rte_eal_check_module() might return -1, which would have been a
"not false" condition for mod_available. Fix that to only report
vfio being enabled if rte_eal_check_module() returns 1.
Fixes: 221f7c220d6b ("vfio: move global config out of PCI files")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The CPUID instruction is caught by hypervisor which can return
a flag indicating one is running, and its name.
Suggested-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Print a warning if the --base-virtaddr hint is not respected
since this might lead to problems when mapping memory in
the secondary process.
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Repeated occurrences of 'the'.
The change was obtained using the following command:
sed -i "s;the the ;the ;" `git grep -l "the "`
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
build error:
.../dpdk/build/build/lib/librte_eal/linuxapp/kni/igb_main.c:2809:2:
error: implicit declaration of function ‘setup_timer’;
did you mean ‘sk_stop_timer’? [-Werror=implicit-function-declaration]
setup_timer(&adapter->watchdog_timer, &igb_watchdog,
^~~~~~~~~~~
sk_stop_timer
cc1: all warnings being treated as errors
error observed whed CONFIG_RTE_KNI_KMOD_ETHTOOL config option enabled.
Because Linux removed setup_timer macros for kernel version >= 4.15
Linux: 513ae785c63c ("timer: Remove setup_*timer() interface")
Replaced setup_timer with timer_setup for new kernel versions.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
More error reported for device reset in release() [1],
when device pass-through to the guest, host kernel crash on guest exit.
Removing the reset completely.
This is close to reverting commit b58eedfc7dd5 [2], taking into account
previous fix to remove reset in open as well [3], but not exactly same.
With latest code, interrupts are enabled in uio open() callback and
disabled in uio release() callback, so when a DPDK application exit
device interrupts are disabled. Previously interrupts were only enabled
once in igb_uio module insert and disabled in module removal.
Also with latest code device set as bus master in open() and master
cleared in release(), clearing bus master should prevent further DMA
which was one of the target of the initial patch.
The initial intention was also to reset the device to be sure it has
been left in proper state, but currently that part is missing because of
reported problem(s).
Still igb_uio should be safer comparing to the pre b58eedfc7dd5 state.
[1]
http://dpdk.org/ml/archives/dev/2017-November/081459.html
[2]
b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
[3]
f73b38e9245d ("igb_uio: remove device reset in open")
Fixes: e3a64deae2d5 ("igb_uio: prevent reset for bnx2x devices")
Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Move the vdev bus from lib/librte_eal to drivers/bus.
As the crypto vdev helper function refers to data structure
in rte_vdev.h, so we move those helper function into drivers/bus
too.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
For virtual device, the rte_intr_handle struct is
initialized by the virtual device driver, including
the event fd assignment. If the event fd need to be
read for clean, an argument is required for the proper
event fd read.
This patch adds efd_counter_size in rte_intr_handle
struct to tell the rx interrupt process the read size.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
Revert the patchset run-time Linking support including the following
3 commits:
Fixes: 84cc318424d4 ("eal/x86: select optimized memcpy at run-time")
Fixes: c7fbc80fe60f ("test: select memcpy alignment unit at run-time")
Fixes: 5f180ae32962 ("efd: move AVX2 lookup in its own compilation unit")
The patchset would cause perf drop in vhost/virtio loopback performance
test. Because the run-time dispatch must cost at least a function call
comparing to the compile-time dispatch. And the reference cpu cycles value
is small. And in the test, when using 128-256 bytes packet, it would cause
16%-20% perf drop with mergeble path. When using 256 bytes packet, it would
cause 13% perf drop with vector path.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Some devices are having problem on device reset that happens during DPDK
application exit [1].
Create a static list of devices and exclude them from device reset.
[1]
http://dpdk.org/ml/archives/dev/2017-November/080927.html
Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix kernel crash with KNI because KNI requires physical addresses.
When IOVA VA mode used, memzones and mbufs physical address fields
contain virtual addresses. But KNI relies on these fields to enable
kernel access for buffers. Those fields having virtual address cause
crash in kernel.
This is a workaround until KNI fixed properly to work with virtual
addresses.
Fixes: 72d013644bd6 ("mem: honor IOVA mode in malloc virt2phy")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The function rte_mem_virt2phy() is kept and used in functions which
works only with physical addresses.
For all other calls this function is replaced by rte_mem_virt2iova()
which does a direct mapping (no conversion) in the VA case.
Note: the new function rte_mem_virt2iova() function matches the
behaviour implemented in rte_mem_virt2phy() by the commit
680f6c12600f ("mem: honor IOVA mode in virt2phy")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Renaming rte_memseg {.phys_addr} to {.iova}
Keep the deprecated name in an anonymous union to avoid breaking
the API.
Use rte_iova_t and RTE_BAD_IOVA where appropriate in
memory segment handling.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
If the IOVA mode is not using physical addresses,
no need to log an error about physical address issue.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The memzone header is often included without good reason.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Exposed VFIO functions simply uses a "vfio" prefix.
Use the proper "rte_vfio" prefix for those symbols.
Fixes: 279b581c897d ("vfio: expose functions")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Revert back to using VFIO_PRESENT as a marker to enable compilation
of VFIO-related segments.
VFIO_PRESENT is the combination of user configuration RTE_EAL_VFIO and
kernel version support check.
eal_vfio.h VFIO_PRESENT related check ordered to be compatible with
rte_vfio.h one, no functional modification.
Fixes: 279b581c897d ("vfio: expose functions")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
The PCI lib defines the types and methods allowing to use PCI elements.
The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.
Move the relevant code out of the EAL to its expected place.
Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
These symbols are only relevant to PCI operations.
Move them to a private PCI-related header, allowing to remove the
dependency of the PCI subsystem upon private eal_vfio.h.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The following symbols are used by vfio implementations within the PCI bus.
They need to be publicly available for the PCI bus to be outside the
EAL.
+ vfio_enable;
+ vfio_is_enabled;
+ vfio_noiommu_is_enabled;
+ vfio_release_device;
+ vfio_setup_device;
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Some internal configuration elements set by the user on the command line
are necessary outside the EAL, when the PCI bus is detached.
Expose:
+ rte_eal_create_uio_dev
+ rte_eal_has_pci
+ rte_eal_vfio_intr_mode
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This function was previously private to the EAL layer.
Other subsystems requires it, such as the PCI bus.
In order not to force other components to include stdbool, which is
incompatible with several NIC drivers, the return type has
been changed from bool to int.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Remove device reset during application start, the reset for application
exit still there.
Reset in open removed because of following comments:
1- Device reset not completed when VF driver loaded, which cause VF PMD
initialization error.
Adding delay can solve the issue but will increase driver load time.
2- Reset will be issues all devices unconditionally, not very efficient
way.
Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Harish Patil <harish.patil@cavium.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Tested-by: Jingjing Wu <jingjing.wu@intel.com>
Since the functions exported by DPDK EAL on all OS's should be
identical, we should not need separate function version files for each
OS. Therefore move existing version files to the top-level EAL
directory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The linuxapp and bsdapp interrupt header files are now identical, so
merge them into a common file in common/include.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Remove rte_set_log_level(), rte_get_log_level(),
rte_set_log_type(), and rte_get_log_type().
Also update librte_eal.so version in docuementation.
The LIBABIVER variable in eal has already been modified in
commit f26ab687a74f ("eal: remove Xen dom0 support").
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
When getting group fd from primary process, secondary wasn't storing
the fd anywhere, leading to a (harmless) error message in EAL logs,
and (not so harmless) potential problems when hot-unplugging devices
managed by VFIO in a secondary process.
Fix it by actually storing the group fd whenever we get a valid one
from the secondary process.
Fixes: 94c0776b1bad ("vfio: support hotplug")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
VFIO may be used by buses other than PCI. This patch enables
the VFIO on the basis of vfio root presence.
Since vfio_enable should be called only once, pci_vfio_enable
is also removed.
A debug print is added in case vfio_pci module is not present.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Compile fails when kernel version is <= 3.17 with error:
"dereferencing pointer to incomplete type". This is because struct
uio_device definition is not exposed in kernel earlier than 3.17.
This patch fixes it by using pointer of rte_uio_pci_dev as
dev_id instead of uio_device for irq device handler.
Fixes: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in VM")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
build error:
build/lib/librte_eal/linuxapp/kni/kni_net.c:215:5: error:
‘struct net_device’ has no member named ‘trans_start’
dev->trans_start = jiffies;
Signed-off-by: Nirmoy Das <ndas@suse.de>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
If pass-through a VF by vfio-pci to a Qemu VM, after FLR
in VM, the interrupt setting is not recoverd correctly
to host as below:
in VM guest:
Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
in Host:
Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
That was because in pci_reset_function, it first reads the
PCI configure and set FLR reset, and then writes PCI configure
as restoration. But not all the writing are successful to Host.
Because vfio-pci driver doesn't allow directly write PCI MSI-X
Cap.
To fix this issue, we need to move the interrupt enablement from
igb_uio probe to open device file. While it is also the similar as
the behaviour in vfio_pci kernel module code.
Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
MSI masks contain a 1 if interrupt is masked, 0 if unmasked.
I got that wrong with the !!state calculation. For better
readability, the mask is now changed like in igbuio_msi_mask_irq.
Fixes: a8ea1e5fb647 ("igb_uio: fix unknown MSI symbols")
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Tested-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch partially reverts the commit d196343a258e and adds some
functions from Markus' previous version of the patch [1].
igb_uio uses pci_msi_unmask_irq() and pci_msi_mask_irq() kernel APIs
when kernel version is >= 3.19 because these APIs are implemented in
this Linux kernel version.
But these APIs only exported beginning from Linux kernel 4.5, so before
this Linux kernel version igb_uio kernel module is not usable,
and giving following warnings:
"igb_uio: Unknown symbol pci_msi_unmask_irq"
"igb_uio: Unknown symbol pci_msi_mask_irq"
The support for these APIs increased to Linux kernel >= 4.5
For older version of Linux kernel unmask_msi_irq() and mask_msi_irq()
are used but these functions are not exported at all.
Instead of these functions switched back to previous implementation in
igb_uio for MSI-X, and for MSI used igbuio_msi_mask_irq() from [1].
[1]
http://dpdk.org/dev/patchwork/patch/28144/
Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch dynamically selects functions of memcpy at run-time based
on CPU flags that current machine supports. This patch uses function
pointers which are bind to the relative functions at constrctor time.
In addition, AVX512 instructions set would be compiled only if users
config it enabled and the compiler supports it.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
When calibrating the TSC frequency, first, probe the architecture specific
function. If not available, use the existing calibrate scheme.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
With the introduction of IOVA mode, the only blocker to run
with 4KB pages for NICs binding to vfio-pci, is that
RTE_BAD_PHYS_ADDR is not a valid IOVA address.
We can refine this by using VA as IOVA if it's IOVA mode.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
This function can be used to check the role of a specific lcore.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Split pci_vfio_map_resource for primary and secondary processes.
Save all relevant mapping data in primary process to allow
the secondary process to perform mappings.
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
DMA window size needs to be big enough to span all memory segment's
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>