If start is set, and a device before it matches the data
passed for comparison, then this first device is returned.
This induces potentially infinite loops.
Fixes: c7fe1eea8a74 ("bus: simplify finding starting point")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
This list should not be used by drivers.
Use the public API instead.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.
In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on total amount of VA
memory that can be allocated by DPDK application.
So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.
Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.
On 32-bit targets due to limited VA space, DPDK will no longer
spread memory to different sockets like before. Instead, it will
(by default) allocate all of the memory on socket where master
lcore is. To override this behavior, --socket-mem must be used.
The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to earlier switch to _walk()
functions, most of the changes are simple fixes, however some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.
Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
[1] http://dpdk.org/dev/patchwork/patch/34002/
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Many drivers across the various device types rely on PCI infrastructure,
so the bus drivers should be the first driver class built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Intel VT-d supports different address widths for the IOVAs, from
39 bits to 56 bits.
While recent processors support at least 48 bits, VT-d emulation
currently only supports 39 bits. It makes DMA mapping to fail in this
case when using VA as IOVA mode, as user-space virtual addresses uses
up to 47 bits (see kernel's Documentation/x86/x86_64/mm.txt).
This patch parses VT-d CAP register value available in sysfs, and
forbid VA as IOVA mode if the GAW is 39 bits or unknown.
Fixes: f37dfab21c98 ("drivers/net: enable IOVA mode for Intel PMDs")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Chas Williams <chas3@att.com>
For virtio legacy device, testpmd startup fails when using uio_pci_generic.
The issue is caused by invoking the function pci_ioport_map. The correct
value of intr_handle.type is already set before calling it, we should avoid
overwriting the default value "RTE_INTR_HANDLE_UNKNOWN" in this function.
Besides, the removal has no harm to other cases because it is set to 0 by a
memset on the whole struct during allocation in the function pci_scan_one.
Such assignments are removed in the meanwhile in pci_uio_map_resource(),
pci_vfio_map_resource_primary() and pci_vfio_map_resource_secondary() in
order to keep consistencies and avoid future questions.
Fixes: 756ce64b1ecd ("eal: introduce PCI ioport API")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Thomas Monjalon <thomas@monjalon.net>
In real life, kernel version is only weakly corolated with presence
or absence of defines in header files. Instead, check directly if
the needed value is defined.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This fixes the use of an never defined PPC64 define in
ret_pci_get_iommu_class.
Fixes: b48e0e2d9cb4 ("bus/pci: fix IOMMU class for sPAPR")
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
The function pci_get_sysfs_path was moved from EAL to the PCI driver.
The namespace is now fixed by adding "rte_" prefix.
The map files are fixed by removing the symbol from EAL and adding
it to the PCI driver.
It is an API break but it is probably not used by applications.
Anyway this API is already broken by the move in a new header file.
Fixes: c752998b5e2e ("pci: introduce library and driver")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
PPC64 sPAPR iommu does not support iova as va.
Use pa mode instead.
Fixes: 815c7deaed2d ("pci: get IOMMU class on Linux")
Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
The memzone header is often included without good reason.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some symbols were introduced with the wrong prefix.
Add the usual "rte_" prefix when needed.
Fixes: c752998b5e2e ("pci: introduce library and driver")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Exposed VFIO functions simply uses a "vfio" prefix.
Use the proper "rte_vfio" prefix for those symbols.
Fixes: 279b581c897d ("vfio: expose functions")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
If the device is not capable of resetting, then Linux kernel updates
the errno as EINVAL.
http://elixir.free-electrons.com/linux/v4.9/source/drivers/vfio/pci/vfio_pci.c#L887
Honor the EINVAL errno value to avoid pci vfio setup failure.
Fixes: f25f8f367644 ("bus/pci: check VFIO reset ioctl error")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Revert back to using VFIO_PRESENT as a marker to enable compilation
of VFIO-related segments.
VFIO_PRESENT is the combination of user configuration RTE_EAL_VFIO and
kernel version support check.
eal_vfio.h VFIO_PRESENT related check ordered to be compatible with
rte_vfio.h one, no functional modification.
Fixes: 279b581c897d ("vfio: expose functions")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
When checking if any devices bound to uio, we did not exclude
those which are blacklisted (or in the case that a whitelist
is specified).
This patch fixes it by only checking whitelisted devices, or
not-blacklisted devices depending on the bus scan mode.
Fixes: 815c7deaed2d ("pci: get IOMMU class on Linux")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The PCI lib defines the types and methods allowing to use PCI elements.
The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.
Move the relevant code out of the EAL to its expected place.
Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>