Populating the eventfd in rte_intr_enable in each request to vfio
triggers a reconfiguration of the interrupt handler on the kernel side.
The problem is that rte_intr_enable is often used to re-enable masked
interrupts from drivers interrupt handlers.
This reconfiguration leaves a window during which a device could send
an interrupt and then the kernel logs this (unsolicited from the kernel
point of view) interrupt:
[158764.159833] do_IRQ: 9.34 No irq handler for vector
VFIO api makes it possible to set the fd at setup time.
Make use of this and then we only need to ask for masking/unmasking
legacy interrupts and we have nothing to do for MSI/MSIX.
"rxtx" interrupts are left untouched but are most likely subject to the
same issue.
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1654824
Fixes: 5c782b3928b8 ("vfio: interrupts")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Shahed Shaikh <shshaikh@marvell.com>
The APIs in the rte_bus_vdev.h file were not part of the API
documentation. I added this header file to the doxygen config file with
the name vdev.
Signed-off-by: Aideen McLoughlin <aideen.mcloughlin@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
1. need to use the bpool with rte_malloc instead of rte_free
2. Option to give portal to the secondary process thread.
Signed-off-by: Radu Bulie <radu-andrei.bulie@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Akhil Goyal <akhil.goyal@nxp.com>
Parse and find_device have specific function - former is for parsing a
string passed as argument, whereas the later is for iterating over all
the devices in the bus and calling a callback/handler. They have been
corrected with their right operations to support hotplugging/devargs
plug/unplug calls.
Support for plug/unplug too has been added.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Now that everything that has ever accessed the shared memory
config is doing so through the public API's, we can make it
internal. Since we're removing quite a few headers from
rte_eal_memconfig.h, we need to add them back in places
where this header is used.
This bumps the ABI, so also change all build files and make
update documentation.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
Currently, the memory hotplug is locked automatically by all
memory-related _walk() functions, but sometimes locking the
memory subsystem outside of them is needed. There is no
public API to do that, so it creates a dependency on shared
memory config to be public. Fix this by introducing a new
API to lock/unlock the memory hotplug subsystem.
Create a new common file for all things mem config, and a
new API namespace rte_mcfg_*, and search-and-replace all
usages of the locks with the new API.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
When selecting the preferred IOVA mode of the pci bus, the current
heuristic ("are devices bound?", "are devices bound to UIO?", "are pmd
drivers supporting IOVA as VA?" etc..) should honor the device
white/blacklist so that an unwanted device does not impact the decision.
There is no reason to consider a device which has no driver available.
This applies to all OS, so implements this in common code then call a
OS specific callback.
On Linux side:
- the VFIO special considerations should be evaluated only if VFIO
support is built,
- there is no strong requirement on using VA rather than PA if a driver
supports VA, so defaulting to DC in such a case.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
For each driver where we optionally disable it, add in the reason why it's
being disabled, so the user knows how to fix it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
The vmbus scan code can just skip non-network devices.
More importantly, this fixes the bug where some vmbus devices
don't have all the attributes (like monitor_id) and a single
failure would cause the scan to break the loop.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Using access followed by open causes a static analysis warning
about Time of check versus Time of use. Also, access() and
open() have different UID permission checks.
This is not a serious problem; but easy to fix by using errno instead.
Coverity issue: 300870
Fixes: 4a928ef9f611 ("bus/pci: enable write combining during mapping")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.
For secondary processes, IPC will be crucial, so leave all
of the error handling as is.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When checking RTE_PCI_DRV_IOVA_AS_VA flag to determine IOVA mode,
pci_one_device_has_iova_va() returns true only if kernel driver of the
device is vfio. However, Mellanox mlx4/5 PMD doesn't need to be detached
from kernel driver and attached to VFIO/UIO. Control path still goes
through the existing kernel driver, which is mlx4_core/mlx5_core. In order
to make RTE_PCI_DRV_IOVA_AS_VA effective for mlx4/mlx5 PMD, a new kernel
driver type has to be introduced.
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
The lcore_config structure will be hidden in future release.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Printing a null pointer with %s is flagged as a warning by GCC 9, and
should not be done. Replace the %s with the word "null" itself.
Fixes: 828d51d8fc3e ("bus/fslmc: refactor scan and probe functions")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
If secondary process attempt to mmap the resource resulted in
the wrong address, then it would leave behind the bad mmap.
Coverity issue: 337675, 337664
Fixes: 2a28a502c607 ("bus/vmbus: map ring in secondary process")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
In many scenarios, AFU is needed searched by name, this
function add the feature.
Signed-off-by: Rosen Xu <rosen.xu@intel.com>
Signed-off-by: Andy Pei <andy.pei@intel.com>
AFU can be implemented into many different acceleration
devices, these devices need shared data to store private
information when they are handled by users.
Signed-off-by: Rosen Xu <rosen.xu@intel.com>
Signed-off-by: Andy Pei <andy.pei@intel.com>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
MC firmware is the core component of FSLMC bus and DPAA2 devices.
Prior to this patch, MC firmware supported 10.10.x version. This
patch bumps the min supported version to 10.14.x.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Removes some unused firmware code which was added in last bump
of the firmware version. No current features uses these APIs.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Do a global replace of snprintf(..."%s",...) with strlcpy, adding in the
rte_string_fns.h header if needed. The function changes in this patch were
auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy.cocci --dir . --in-place
and then the files edited using awk to add in the missing header:
gawk -i inplace '/include <rte_/ && ! seen { \
print "#include <rte_string_fns.h>"; seen=1} {print}'
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
For files that already have rte_string_fns.h included in them, we can
do a straight replacement of snprintf(..."%s",...) with strlcpy. The
changes in this patch were auto-generated via command:
spatch --sp-file devtools/cocci/strlcpy-with-header.cocci --dir . --in-place
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
- fle pool allocations should be done for each process.
- cryptodev->data is shared across muliple processes but
cryptodev itself is allocated for each process. So any
information which needs to be shared between processes,
should be kept in cryptodev->data.
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
This fixes the following compile error with musl libc:
drivers/bus/fslmc/qbman/include/compat.h:41:10: error:
'stdout' undeclared (first use in this function)
fflush(stdout); \
^~~~~~
Fixes: 531b17a780dc ("bus/fslmc: add QBMAN driver to bus")
Cc: stable@dpdk.org
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Fixes following build error with musl libc:
In file included from drivers/bus/fslmc/qbman/qbman_debug.c:6:
drivers/bus/fslmc/qbman/include/compat.h:21:10: fatal error:
error.h: No such file or directory
#include <error.h>
^~~~~~~~~
Apparently it is not used anywere in qbman so simply remove the include.
Fixes: 531b17a780dc ("bus/fslmc: add QBMAN driver to bus")
Cc: stable@dpdk.org
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
The DPDK APIs expose 3 different modes to work with memory used for DMA:
1. Use the DPDK owned memory (backed by the DPDK provided hugepages).
This memory is allocated by the DPDK libraries, included in the DPDK
memory system (memseg lists) and automatically DMA mapped by the DPDK
layers.
2. Use memory allocated by the user and register to the DPDK memory
systems. Upon registration of memory, the DPDK layers will DMA map it
to all needed devices. After registration, allocation of this memory
will be done with rte_*malloc APIs.
3. Use memory allocated by the user and not registered to the DPDK memory
system. This is for users who wants to have tight control on this
memory (e.g. avoid the rte_malloc header).
The user should create a memory, register it through rte_extmem_register
API, and call DMA map function in order to register such memory to
the different devices.
The scope of the patch focus on #3 above.
Currently the only way to map external memory is through VFIO
(rte_vfio_dma_map). While VFIO is common, there are other vendors
which use different ways to map memory (e.g. Mellanox and NXP).
The work in this patch moves the DMA mapping to vendor agnostic APIs.
Device level DMA map and unmap APIs were added. Implementation of those
APIs was done currently only for PCI devices.
For PCI bus devices, the pci driver can expose its own map and unmap
functions to be used for the mapping. In case the driver doesn't provide
any, the memory will be mapped, if possible, to IOMMU through VFIO APIs.
Application usage with those APIs is quite simple:
* allocate memory
* call rte_extmem_register on the memory chunk.
* take a device, and query its rte_device.
* call the device specific mapping function for this device.
Future work will deprecate the rte_vfio_dma_map and rte_vfio_dma_unmap
APIs, leaving the rte device APIs as the preferred option for the user.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The fman device list need to be accessed across processes.
The hw device structures should be allocated with rte_calloc
instead of calloc. The rte_calloc is not available at the
time of bus scan, so better prepare the device list at probe.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
A reference to qman_fq_lookup_table need to be saved in each
fq, so that it is retrieved while in running secondary process.
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Current value of 'fmbm_rfsdm' register (0x010CE3F0) doesn't include
the bit to drop colored (red) packets. New value (0x010EE3F0) fixes
this.
Check with 'fmbm_rffc' register of fm_port_bmi_regs.
Fixes: 6d6b4f49a155 ("bus/dpaa: add FMAN hardware operations")
Cc: stable@dpdk.org
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
The fslmc bus code was duplicating the device name and
doing extra initialization. The code can be simplified
to just use the device name directly.
Compile tested only; do not have this hardware.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
When fslmc is built as part of a general distribution, the
bus code will log errors when other devices are present.
This could confuse users it is not an error.
Fixes: 50245be05d1a ("bus/fslmc: support device blacklisting")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
The secondary mapping function was duplicating the code
used to search the uio_resource list.
Skip the unwinding since map failure already makes device
unusable.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Need to remember primary channel in secondary process.
Then use it to iterate over subchannels in secondary
process mapping setup.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
If vmbus is run on older kernel (without all the uio mappings),
then the bus driver should stop when it hits the missing mappings
rather than recording the empty values.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
The code was testing the result of mmap incorrectly.
I.e the test that a local pointer is not MAP_FAILED would
always succeed and therefore hid any potential problems.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
The secondary process doesn't correctly map the second
and later resources because it doesn't change the offset.
Fixes: 831dba47bd36 ("bus/vmbus: add Hyper-V virtual bus support")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
All drivers should have SPDX on the first line of the source
files in the format
/* SPDX-License-Identifier: ...
Several files used minor modifications which were inconsistent
with the pattern. Fix it to make scanning tools easier.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Rename the macro and all instances in DPDK code, but keep a copy of
the old macro defined for legacy code linking against DPDK
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Rename the macro to make things shorter and more comprehensible. For
both meson and make builds, keep the old macro around for backward
compatibility.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The term "linuxapp" is a legacy one, but just calling the subdirectory
"linux" is just clearer for all concerned.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
In case vdev was already probed, it shouldn't be probed again,
and it should return -EEXIST as error.
There are some checks in vdev_probe() and insert_vdev(),
but a check was missing in vdev_plug().
The check is moved in vdev_probe_all_drivers() which is called
in all code paths.
Fixes: e9d159c3d534 ("eal: allow probing a device again")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The log was printing the device name two times,
first one being supposed to be the driver name.
As we don't know yet the driver name, the log is simplified.
Fixes: 9bf4901d1a11 ("bus/vdev: remove probe with driver name option")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>