23 Commits

Author SHA1 Message Date
Gaetan Rivet
64de7e4069 bus/pci: fix find device implementation
If start is set, and a device before it matches the data
passed for comparison, then this first device is returned.

This induces potentially infinite loops.
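
A minimal sketch of the intended behavior, assuming a simplified singly linked device list. The names ex_device, ex_find_device and ex_match_id are hypothetical and not the bus/pci code. The point is that the search begins strictly after start, so a match located before start can never be returned again.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a bus device list node. */
struct ex_device {
	int id;
	struct ex_device *next;
};

/* Comparison callback: returns 0 on match, nonzero otherwise. */
typedef int (*ex_cmp_t)(const struct ex_device *dev, const void *data);

/*
 * Return the first matching device located after `start`.
 * When start is NULL, the whole list beginning at head is searched.
 */
static struct ex_device *
ex_find_device(struct ex_device *head, const struct ex_device *start,
	       ex_cmp_t cmp, const void *data)
{
	struct ex_device *dev = (start != NULL) ? start->next : head;

	for (; dev != NULL; dev = dev->next)
		if (cmp(dev, data) == 0)
			return dev;
	return NULL;
}

/* Example comparison: match on the id field. */
static int
ex_match_id(const struct ex_device *dev, const void *data)
{
	return dev->id == *(const int *)data ? 0 : 1;
}
```

A caller that loops with start set to the previous result visits each match once and ends on NULL.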

Fixes: c7fe1eea8a74 ("bus: simplify finding starting point")
Cc: stable@dpdk.org

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2018-04-27 16:31:44 +02:00
Gaetan Rivet
7765f0f408 bus/pci: do not reference devargs list
This list should not be used by drivers.
Use the public API instead.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-04-25 03:58:10 +02:00
Anatoly Burakov
66cc45e293 mem: replace memseg with memseg lists
Before, we were aggregating multiple pages into one memseg, so the
number of memsegs was small. Now, each page gets its own memseg,
so the list of memsegs is huge. To accommodate the new memseg list
size and to keep the under-the-hood workings sane, the memseg list
is now not just a single list, but multiple lists. To be precise,
each hugepage size available on the system gets one or more memseg
lists, per socket.

In order to support dynamic memory allocation, we reserve all
memory in advance (unless we're in 32-bit legacy mode, in which
case we do not preallocate memory). As in, we do an anonymous
mmap() of the entire maximum size of memory per hugepage size, per
socket (which is limited to either RTE_MAX_MEMSEG_PER_TYPE pages or
RTE_MAX_MEM_MB_PER_TYPE megabytes worth of memory, whichever is the
smaller one), split over multiple lists (which are limited to
either RTE_MAX_MEMSEG_PER_LIST memsegs or RTE_MAX_MEM_MB_PER_LIST
megabytes per list, whichever is the smaller one). There is also
a global limit of CONFIG_RTE_MAX_MEM_MB megabytes, which is mainly
used for 32-bit targets to limit amounts of preallocated memory,
but can be used to place an upper limit on the total amount of VA
memory that can be allocated by a DPDK application.
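
The "whichever is the smaller one" rule can be sketched as follows. This is an illustration only: ex_limits and ex_capped_bytes are hypothetical names, and any page-count values passed in are assumptions rather than the build configuration's actual defaults.

```c
#include <stdint.h>

/* Hypothetical pair of caps, mirroring a "pages or megabytes" limit. */
struct ex_limits {
	uint64_t max_pages;  /* cap expressed as a number of pages */
	uint64_t max_mem_mb; /* cap expressed in megabytes */
};

/*
 * Return how many bytes may be reserved for one page size:
 * the smaller of the two caps, rounded down to a whole page.
 */
static uint64_t
ex_capped_bytes(const struct ex_limits *lim, uint64_t page_sz)
{
	uint64_t by_pages = lim->max_pages * page_sz;
	uint64_t by_mb = lim->max_mem_mb << 20;
	uint64_t cap = by_pages < by_mb ? by_pages : by_mb;

	return cap - (cap % page_sz);
}
```

With an assumed cap of 32768 pages and 131072 MB, 2 MB pages are bounded by the page count (64G), while 1 GB pages are bounded by the megabyte cap (128G).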

So, for each hugepage size, we get (by default) up to 128G worth
of memory, per socket, split into chunks of up to 32G in size.
The address space is claimed at the start, in eal_common_memory.c.
The actual page allocation code is in eal_memalloc.c (Linux-only),
and largely consists of copied EAL memory init code.

Pages in the list are also indexed by address. That is, in order
to figure out where the page belongs, one can simply look at base
address for a memseg list. Similarly, figuring out IOVA address
of a memzone is a matter of finding the right memseg list, getting
offset and dividing by page size to get the appropriate memseg.
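
The address-to-index lookup described above amounts to one subtraction and one division. A minimal sketch, assuming a simplified list descriptor (ex_memseg_list and ex_memseg_index are hypothetical names, not the EAL structures):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memseg list: one contiguous VA area. */
struct ex_memseg_list {
	uintptr_t base_va; /* start of the reserved VA area */
	size_t page_sz;    /* size of each page in the list */
	size_t n_pages;    /* number of page slots in the list */
};

/* Return the page index of addr within the list, or -1 if outside it. */
static long
ex_memseg_index(const struct ex_memseg_list *msl, uintptr_t addr)
{
	uintptr_t end = msl->base_va + msl->page_sz * msl->n_pages;

	if (addr < msl->base_va || addr >= end)
		return -1;
	return (long)((addr - msl->base_va) / msl->page_sz);
}
```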

This commit also removes rte_eal_dump_physmem_layout() call,
according to deprecation notice [1], and removes that deprecation
notice as well.

On 32-bit targets, due to limited VA space, DPDK will no longer
spread memory across different sockets like before. Instead, it will
(by default) allocate all of the memory on the socket where the master
lcore is. To override this behavior, --socket-mem must be used.

The rest of the changes are really ripple effects from the memseg
change - heap changes, compile fixes, and rewrites to support
fbarray-backed memseg lists. Due to the earlier switch to _walk()
functions, most of the changes are simple fixes; however, some
of the _walk() calls were switched to memseg list walk, where
it made sense to do so.

Additionally, we are also switching locks from flock() to fcntl().
Down the line, we will be introducing single-file segments option,
and we cannot use flock() locks to lock parts of the file. Therefore,
we will use fcntl() locks for legacy mem as well, in case someone is
unfortunate enough to accidentally start legacy mem primary process
alongside an already working non-legacy mem-based primary process.
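
For illustration, a byte-range lock with fcntl() looks roughly like this. flock() applies to the whole file, whereas the POSIX struct flock passed to fcntl() carries a start and a length. The helper name ex_lock_range is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Try to take a write lock on bytes [start, start + len) of fd
 * without blocking. Returns 0 on success, -1 on failure with
 * errno set by fcntl().
 */
static int
ex_lock_range(int fd, off_t start, off_t len)
{
	struct flock lck = {0};

	lck.l_type = F_WRLCK;    /* exclusive lock */
	lck.l_whence = SEEK_SET; /* l_start is relative to file start */
	lck.l_start = start;
	lck.l_len = len;

	return fcntl(fd, F_SETLK, &lck);
}
```

Two processes can hold locks on disjoint ranges of the same file, which is what a single-file segments layout would need.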

[1] http://dpdk.org/dev/patchwork/patch/34002/

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:55:39 +02:00
Anatoly Burakov
7411d03249 bus/pci: use memseg walk instead of iteration
Reduce dependency on internal details of EAL memory subsystem, and
simplify code.
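
The walk pattern can be sketched as below: the caller supplies a callback and never touches the underlying storage. The types and names here (ex_memseg, ex_memseg_walk, ex_sum_len) are hypothetical simplifications and not the EAL API.

```c
#include <stddef.h>

/* Hypothetical, simplified memory segment. */
struct ex_memseg {
	size_t len;
};

/* Callback type: return 0 to continue, nonzero to stop the walk. */
typedef int (*ex_walk_t)(const struct ex_memseg *ms, void *arg);

/* Call func on every segment; stop early and return its value if nonzero. */
static int
ex_memseg_walk(const struct ex_memseg *segs, size_t n,
	       ex_walk_t func, void *arg)
{
	for (size_t i = 0; i < n; i++) {
		int ret = func(&segs[i], arg);

		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Example callback: accumulate the total length into *arg. */
static int
ex_sum_len(const struct ex_memseg *ms, void *arg)
{
	*(size_t *)arg += ms->len;
	return 0;
}
```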

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2018-04-11 19:48:10 +02:00
Olivier Matz
fd4ab1fe9c bus/pci: use SPDX tags in 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-02-01 02:32:52 +01:00
Bruce Richardson
6c9457c279 build: replace license text with SPDX tag
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Luca Boccassi <bluca@debian.org>
2018-01-30 21:58:59 +01:00
Bruce Richardson
04c5af4272 bus/pci: build with meson
Many drivers across the various device types rely on PCI infrastructure,
so the bus drivers should be the first driver class built.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-30 17:49:16 +01:00
Maxime Coquelin
54a328f552 bus/pci: forbid IOVA mode if IOMMU address width too small
Intel VT-d supports different address widths for the IOVAs, from
39 bits to 56 bits.

While recent processors support at least 48 bits, VT-d emulation
currently only supports 39 bits. This makes DMA mapping fail in this
case when using VA as IOVA mode, as user-space virtual addresses use
up to 47 bits (see kernel's Documentation/x86/x86_64/mm.txt).

This patch parses the VT-d CAP register value available in sysfs, and
forbids VA as IOVA mode if the GAW is 39 bits or unknown.
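
A rough sketch of such a check, assuming the width is read from the MGAW field of the capability register (bits 21:16, encoded as width minus one, per the VT-d specification). The EX_ names and the 47-bit threshold are assumptions drawn from the text above, not necessarily the exact test used by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* MGAW field of the VT-d capability register: bits 21:16, width - 1. */
#define EX_VTD_CAP_MGAW_SHIFT 16
#define EX_VTD_CAP_MGAW_MASK  (0x3fULL << EX_VTD_CAP_MGAW_SHIFT)

/* User-space virtual addresses on x86-64 use up to 47 bits. */
#define EX_X86_VA_WIDTH 47

/* Return true if the IOMMU address width can hold user-space VAs. */
static bool
ex_iommu_supports_va(uint64_t vtd_cap_reg)
{
	unsigned int mgaw = (unsigned int)
		((vtd_cap_reg & EX_VTD_CAP_MGAW_MASK) >>
		 EX_VTD_CAP_MGAW_SHIFT) + 1;

	return mgaw >= EX_X86_VA_WIDTH;
}
```

A register value that could not be read would be treated as unsupported by the caller, matching the "or unknown" case.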

Fixes: f37dfab21c98 ("drivers/net: enable IOVA mode for Intel PMDs")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Chas Williams <chas3@att.com>
2018-01-20 16:25:48 +01:00
Zhiyong Yang
6c7001480a bus/pci: fix interrupt handler type
For virtio legacy device, testpmd startup fails when using uio_pci_generic.

The issue is caused by invoking the function pci_ioport_map. The correct
value of intr_handle.type is already set before calling it, so we should
avoid overwriting it with the default value "RTE_INTR_HANDLE_UNKNOWN" in
this function. The removal does no harm in other cases because the field
is set to 0 by a memset on the whole struct during allocation in the
function pci_scan_one.

The same assignments are also removed from pci_uio_map_resource(),
pci_vfio_map_resource_primary() and pci_vfio_map_resource_secondary() in
order to stay consistent and avoid future questions.

Fixes: 756ce64b1ecd ("eal: introduce PCI ioport API")
Cc: stable@dpdk.org

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-12 01:04:22 +01:00
Stephen Hemminger
a12f226789 bus/pci: do not use kernel version to determine MSIX defines
In real life, kernel version is only weakly correlated with presence
or absence of defines in header files. Instead, check directly if
the needed value is defined.
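
A sketch of the idea, using a hypothetical EX_ prefix: each constant is taken from the system header when that header defines it, and otherwise falls back to the value fixed by the PCI specification for the MSI-X capability layout. No kernel version test is involved.

```c
/*
 * Include whichever system header may provide the PCI_MSIX_* macros
 * before this point; the fallbacks below apply when it does not.
 */

/* BAR indicator: low 3 bits of the Table Offset/BIR register. */
#ifdef PCI_MSIX_TABLE_BIR
#define EX_PCI_MSIX_TABLE_BIR PCI_MSIX_TABLE_BIR
#else
#define EX_PCI_MSIX_TABLE_BIR 0x7
#endif

/* Table offset: remaining bits of the same register. */
#ifdef PCI_MSIX_TABLE_OFFSET
#define EX_PCI_MSIX_TABLE_OFFSET PCI_MSIX_TABLE_OFFSET
#else
#define EX_PCI_MSIX_TABLE_OFFSET 0xfffffff8
#endif

/* Table size: low 11 bits of the Message Control register. */
#ifdef PCI_MSIX_FLAGS_QSIZE
#define EX_PCI_MSIX_FLAGS_QSIZE PCI_MSIX_FLAGS_QSIZE
#else
#define EX_PCI_MSIX_FLAGS_QSIZE 0x07ff
#endif
```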

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2018-01-05 23:56:08 +01:00
Bruce Richardson
5566a3e358 drivers: use SPDX tag for Intel copyright files
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-01-04 22:41:39 +01:00
Jerin Jacob
82bf1caf5f bus/pci: fix a typo in doxygen file description
Fixes: 764bf26873b9 ("add FreeBSD support")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-11-12 19:50:43 +01:00
Jonas Pfefferle
f1b7c6b7f5 bus/pci: fix PPC condition for IOMMU class
This fixes the use of a never-defined PPC64 macro in
rte_pci_get_iommu_class.

Fixes: b48e0e2d9cb4 ("bus/pci: fix IOMMU class for sPAPR")

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
2017-11-07 17:04:09 +01:00
Thomas Monjalon
c52dd39411 bus/pci: fix namespace of sysfs path function
The function pci_get_sysfs_path was moved from EAL to the PCI driver.

The namespace is now fixed by adding "rte_" prefix.
The map files are fixed by removing the symbol from EAL and adding
it to the PCI driver.

It is an API break, but the function is probably not used by applications.
In any case, this API was already broken by the move to a new header file.

Fixes: c752998b5e2e ("pci: introduce library and driver")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-11-07 00:44:10 +01:00
Jonas Pfefferle
b48e0e2d9c bus/pci: fix IOMMU class for sPAPR
PPC64 sPAPR IOMMU does not support IOVA as VA.
Use PA mode instead.

Fixes: 815c7deaed2d ("pci: get IOMMU class on Linux")

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
2017-11-07 00:42:42 +01:00
Thomas Monjalon
4c00cfdc0e remove useless memzone includes
The memzone header is often included without good reason.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-11-06 22:12:08 +01:00
Gaetan Rivet
0e3ef055be pci: fix namespace prefix of new functions
Some symbols were introduced with the wrong prefix.
Add the usual "rte_" prefix when needed.

Fixes: c752998b5e2e ("pci: introduce library and driver")

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-11-06 21:42:22 +01:00
Gaetan Rivet
77dad68c20 vfio: fix namespace prefix of newly exposed functions
Exposed VFIO functions simply use a "vfio" prefix.
Use the proper "rte_vfio" prefix for those symbols.

Fixes: 279b581c897d ("vfio: expose functions")

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-11-06 21:41:41 +01:00
Jerin Jacob
6fb00f8bae bus/pci: fix VFIO device reset
If the device is not capable of resetting, the Linux kernel sets
errno to EINVAL.
http://elixir.free-electrons.com/linux/v4.9/source/drivers/vfio/pci/vfio_pci.c#L887

Honor the EINVAL errno value to avoid pci vfio setup failure.
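
The decision can be sketched as a small predicate. ex_reset_is_fatal is a hypothetical helper; in real code its inputs would be the return value and errno of the VFIO_DEVICE_RESET ioctl on the device file descriptor.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide whether a reset attempt should abort device setup.
 * ret and err are the return value and errno of the reset call.
 * EINVAL means the device cannot be reset, which is tolerated.
 */
static bool
ex_reset_is_fatal(int ret, int err)
{
	if (ret == 0)
		return false;
	return err != EINVAL;
}
```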

Fixes: f25f8f367644 ("bus/pci: check VFIO reset ioctl error")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Jonas Pfefferle <jpf@zurich.ibm.com>
2017-10-31 19:23:36 +01:00
Ferruh Yigit
bc104bb853 bus/pci: fix VFIO mode
Revert back to using VFIO_PRESENT as a marker to enable compilation
of VFIO-related segments.

VFIO_PRESENT is the combination of user configuration RTE_EAL_VFIO and
kernel version support check.

The VFIO_PRESENT related check in eal_vfio.h is reordered to match the
one in rte_vfio.h, with no functional modification.

Fixes: 279b581c897d ("vfio: expose functions")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-31 19:18:36 +01:00
Jonas Pfefferle
f25f8f3676 bus/pci: check VFIO reset ioctl error
Check the return value of the device reset ioctl.

Coverity issue: 195003
Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping")

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-10-26 23:51:39 +02:00
Jianfeng Tan
633e4c7d71 bus/pci: fix UIO bind check
When checking whether any devices are bound to UIO, we did not exclude
those which are blacklisted (or, when a whitelist is specified, those
which are not whitelisted).

This patch fixes it by only checking whitelisted devices, or
not-blacklisted devices depending on the bus scan mode.
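
A minimal sketch of that filtering, with hypothetical types and names (ex_scan_mode, ex_policy, ex_pci_dev) standing in for the real bus structures:

```c
#include <stdbool.h>
#include <stddef.h>

enum ex_scan_mode { EX_SCAN_WHITELIST, EX_SCAN_BLACKLIST };
enum ex_policy { EX_POLICY_NONE, EX_POLICY_WHITELISTED, EX_POLICY_BLACKLISTED };

/* Hypothetical, simplified PCI device record. */
struct ex_pci_dev {
	enum ex_policy policy;
	bool bound_to_uio;
};

/* A device is skipped if the scan mode excludes it. */
static bool
ex_dev_is_ignored(const struct ex_pci_dev *dev, enum ex_scan_mode mode)
{
	if (mode == EX_SCAN_WHITELIST)
		return dev->policy != EX_POLICY_WHITELISTED;
	return dev->policy == EX_POLICY_BLACKLISTED;
}

/* Return true if any device that will actually be used is bound to UIO. */
static bool
ex_any_uio_bound(const struct ex_pci_dev *devs, size_t n,
		 enum ex_scan_mode mode)
{
	for (size_t i = 0; i < n; i++) {
		if (ex_dev_is_ignored(&devs[i], mode))
			continue;
		if (devs[i].bound_to_uio)
			return true;
	}
	return false;
}
```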

Fixes: 815c7deaed2d ("pci: get IOMMU class on Linux")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2017-10-26 23:35:00 +02:00
Gaetan Rivet
c752998b5e pci: introduce library and driver
The PCI lib defines the types and methods needed to use PCI elements.

The PCI bus implements a bus driver for PCI devices by constructing
rte_bus elements using the PCI lib.

Move the relevant code out of the EAL to its expected place.

Libraries, drivers, unit tests and applications are updated to use the
new rte_bus_pci.h header when necessary.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-26 23:17:31 +02:00