Commit Graph

125 Commits

Author SHA1 Message Date
Haiyue Wang
598be72395 vfio: support VF token
The Linux kernel module vfio-pci introduces the VF token to enable
SR-IOV support since 5.7.

The VF token can be set by a vfio-pci based PF driver and must be known
by the vfio-pci based VF driver in order to gain access to the device.

Since the vfio-pci module uses the VF token as internal data to provide
the collaboration between SR-IOV PF and VFs, so DPDK can use the same
VF token for all PF devices by specifying the related EAL option.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
2020-07-07 14:06:49 +02:00
Stephen Hemminger
2cc263c3ff vfio: lower the priority of startup messages
The startup of VFIO is too noisy. Logging is expensive on some
systems, and distracting to the user.

It should not be logging at NOTICE level, reduce it to INFO level.
It really should be DEBUG here but that would hide it by default.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-07-07 13:51:28 +02:00
Yunjian Wang
f4823a3982 vfio: remove unused variable
The 'group_status' has never been used and can be removed.

Fixes: 94c0776b1b ("vfio: support hotplug")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-07-07 13:49:56 +02:00
Tal Shnaiderman
2ab745bd8f eal: move OS common options usage functions
Move common functions between Unix and Windows to eal_common_options.c.

Those functions are getter functions for rte_application_usage_hook.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
Tal Shnaiderman
57a2efb304 eal: move OS common config objects
Move common functions between Unix and Windows to eal_common_config.c.

Those functions are getter functions for IOVA,
configuration, Multi-process.

Move rte_config, internal_config, early_mem_config and runtime_dir
to be defined in the common file with getter functions.

Refactor the users of the config variables above to use
the getter functions.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:53 +02:00
David Marchand
d58314aa3c eal: remove redundant newline in alert message
rte_eal_init_alert() already appends a newline.

Fixes: 0a529578f1 ("eal: clean up unused files on initialization")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-25 01:18:29 +02:00
Tal Shnaiderman
48be180de6 eal: move OS common debug functions to single file
Move common functions between Unix and Windows to eal_common_debug.c.

Those functions are rte_exit, __rte_panic and rte_dump_registers
which has the same implementation on Unix and Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-24 11:02:29 +02:00
Harman Kalra
02b73b1e93 eal/linux: fix epoll fd list rebuild for interrupts
An issue has been observed where epoll file descriptor
list rebuilds every time an interrupt/alarm event is
received.

eal_intr_process_interrupts() should notify pipe fd only
if any source is removed from the source list i.e (rv > 0)

Fixes: 0c7ce182a7 ("eal: add pending interrupt callback unregister")
Cc: stable@dpdk.org

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2020-06-24 10:01:56 +02:00
Fady Bader
03437f2dc8 timer: move from common to Unix directory
EAL common timer doesn't compile under Windows.

Compilation log:
error LNK2019:
unresolved external symbol nanosleep referenced in function
rte_delay_us_sleep
error LNK2019:
unresolved external symbol get_tsc_freq referenced in function set_tsc_freq
error LNK2019:
unresolved external symbol sleep referenced in function set_tsc_freq

The reason was that some functions called POSIX functions.
The solution was to move POSIX dependent functions from common to Unix.

Signed-off-by: Fady Bader <fady@mellanox.com>
Reviewed-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2020-06-23 18:33:20 +02:00
Dmitry Kozlyuk
694161b7e0 mem: extract common dynamic memory allocation
Code in Linux EAL that supports dynamic memory allocation (as opposed to
static allocation used by FreeBSD) is not OS-dependent and can be reused
by Windows EAL. Move such code to a file compiled only for the OS that
require it. Keep Anatoly Burakov maintainer of extracted code.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:26:37 +02:00
Dmitry Kozlyuk
83713ef276 mem: extract common memseg list initialization
All supported OS create memory segment lists (MSL) and reserve VA space
for them in a nearly identical way. Move common code into EAL private
functions to reduce duplication.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:25:16 +02:00
Dmitry Kozlyuk
c4b89ecb64 eal: introduce memory management wrappers
Introduce OS-independent wrappers for memory management operations used
across DPDK and specifically in common code of EAL:

* rte_mem_map()
* rte_mem_unmap()
* rte_mem_page_size()
* rte_mem_lock()

Windows uses different APIs for memory mapping and reservation, while
Unices reserve memory by mapping it. Introduce EAL private functions to
support memory reservation in common code:

* eal_mem_reserve()
* eal_mem_free()
* eal_mem_set_dump()

Wrappers follow POSIX semantics limited to DPDK tasks, but their
signatures deliberately differ from POSIX ones to be more safe and
expressive. New symbols are internal. Being thin wrappers, they require
no special maintenance.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:25:05 +02:00
Dmitry Kozlyuk
176bb37ca6 eal: introduce internal wrappers for file operations
Introduce OS-independent wrappers in order to support common EAL code
on Unix and Windows:

* eal_file_open: open or create a file.
* eal_file_lock: lock or unlock an open file.
* eal_file_truncate: enforce a given size for an open file.

Implementation for Linux and FreeBSD is placed in "unix" subdirectory,
which is intended for common code between the two. These thin wrappers
require no special maintenance.

Common code supporting multi-process doesn't use the new wrappers,
because it is inherently Unix-specific and would impose excessive
requirements on the wrappers.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:24:37 +02:00
Ciara Power
61d6c7a98b telemetry: fix init log printing
Initially, printf was used to indicate and error/warning resulting from
telemetry initialisation. This is now fixed to use EAL logs for
notices, and the unnecessary printf for an error is removed.

Fixes: eeb486f3ba ("eal: add telemetry as dependency")
Fixes: dd6275a424 ("telemetry: fix error log output")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-05-24 18:01:31 +02:00
Ciara Power
febbebf7f2 telemetry: keep threads separate from data plane
The threads for listening on the telemetry sockets are control threads
and should be separated from those on the data plane. Since telemetry
cannot use the rte_ctrl_thread_create() API, as it does not depend on
EAL, we pass the ctrl thread cpu_set to telemetry init and use it
directly to ensure that telemetry cannot interfere with the data plane
threads.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Bruce Richardson
293c53d8b2 eal: add telemetry callbacks
EAL now registers commands to provide some basic info from EAL.

Example:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.05.0-rc0", "pid": 72662, "max_output_len": 16384}
--> /
{"/": ["/", "/eal/app_params", "/eal/params", "/ethdev/link_status", \
    "/ethdev/list", "/ethdev/xstats", "/help", "/info", "/rawdev/list", \
    "/rawdev/xstats"]}
--> /eal/app_params
{"/eal/app_params": ["-i"]}
--> /eal/params
{"/eal/params": ["./app/dpdk-testpmd"]}

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
e122b0bff9 eal: remove option registration infrastructure
As Telemetry no longer uses rte_option, and was the only user of this
infrastructure, it can now be removed.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
eeb486f3ba eal: add telemetry as dependency
This patch moves telemetry further down the build, and adds it as a
dependency for EAL. Telemetry V2 is now configured to build by default,
and the legacy support is built when the telemetry config flag is set.

Telemetry now has EAL flags, shown below:
"--telemetry" = Enables telemetry (this is default if no flags given)
"--no-telemetry" = Disables telemetry

When telemetry is enabled, it will attempt to open the new socket
version, and also the legacy support socket (this will depend on Jansson
external dependency and telemetry config flag, as before).

Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Luca Boccassi
611faa5f46 fix various typos found by Lintian
Cc: stable@dpdk.org

Signed-off-by: Luca Boccassi <bluca@debian.org>
2020-04-25 19:53:47 +02:00
Li Feng
d72e4042c5 mem: exclude unused memory from core dump
Currently, even though memory is mapped with PROT_NONE, this does not
cause it to be excluded from core dumps. This is counter-productive,
because in a lot of cases, this memory will go unused (e.g. when the
memory subsystem preallocates VA space but hasn't yet mapped physical
pages into it).

Use `madvise()` call with MADV_DONTDUMP/MADV_NOCORE to exclude the
unused memory from being dumped.

Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-04-24 19:36:17 +02:00
Li Feng
76e91e3f14 mem: mark pages as not accessed when freeing memory
Commit 8a4baf06c1 ("mem: mark pages as not accessed when reserving VA")
has mapped the initialized memory with PROT_NONE, and when it's unmapped,
eal_memalloc.c should remmap the anonymous memory with PROT_NONE too.

Fixes: 8a4baf06c1 ("mem: mark pages as not accessed when reserving VA")
Cc: stable@dpdk.org

Signed-off-by: Li Feng <fengli@smartx.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-04-24 19:36:17 +02:00
Jerin Jacob
05c4105738 trace: add interrupt tracepoints
Add the following interrupt related tracepoints.

- rte_eal_trace_intr_callback_register()
- rte_eal_trace_intr_callback_unregister()
- rte_eal_trace_intr_enable()
- rte_eal_trace_intr_disable()

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:55 +02:00
Jerin Jacob
0baa1e01c3 trace: add thread tracepoints
Add the following thread related tracepoints.

- rte_eal_trace_thread_remote_launch()
- rte_eal_trace_thread_lcore_ready()

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:53 +02:00
Jerin Jacob
4931010619 trace: add alarm tracepoints
Add following alarm related trace points.

- rte_eal_trace_alarm_set()
- rte_eal_trace_alarm_cancel()

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:47 +02:00
Jerin Jacob
6c232fc44c trace: add generic tracepoints
This patch creates the following generic tracepoint for
generic tracing when there is no dedicated tracepoint is
available.

- rte_eal_trace_generic_void()
- rte_eal_trace_generic_u64()
- rte_eal_trace_generic_u32()
- rte_eal_trace_generic_u16()
- rte_eal_trace_generic_u8()
- rte_eal_trace_generic_i64()
- rte_eal_trace_generic_i32()
- rte_eal_trace_generic_i16()
- rte_eal_trace_generic_i8()
- rte_eal_trace_generic_int()
- rte_eal_trace_generic_long()
- rte_eal_trace_generic_float()
- rte_eal_trace_generic_double()
- rte_eal_trace_generic_ptr()
- rte_eal_trace_generic_str()

For example, if an application wishes to emit an int datatype,
it can call rte_eal_trace_generic_int(val) to emit the trace.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:42 +02:00
Jerin Jacob
3b155d24bd trace: hook subsystem to Linux
Connect the internal trace interface API to Linux EAL.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:38 +02:00
Jerin Jacob
f1a099f5b1 trace: create CTF TDSL metadata in memory
Common trace format(CTF) defines the metadata[1][2] for trace events,
This patch creates the metadata for the DPDK events in memory and
later this will be saved to trace directory on rte_trace_save()
invocation.

[1] https://diamon.org/ctf/#specification
[2] https://diamon.org/ctf/#examples

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:22 +02:00
Jerin Jacob
321dd5f8fa trace: add internal init and fini interface
Define eal_trace_init() and eal_trace_fini() EAL interface
functions that rte_eal_init() and rte_eal_cleanup() function can
use to initialize and finalize the trace subsystem.
eal_trace_init() function will add the following functionality if
trace is enabled through EAL command line param.

- Test for trace registration failure.
- Test for duplicate trace name registration.
- Generate UUID ver 4.
- Create a trace directory.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:15 +02:00
Jerin Jacob
27db82c709 trace: introduce new subsystem
Define the public API for trace support.
This patch also adds support for the build infrastructure and
update the MAINTAINERS file for the trace subsystem.

The 8 bytes tracepoint object is a global variable, and can be used in
fast path. Created a new __rte_trace_point section to store the
tracepoint objects as,
- It is a mostly read-only data and not to mix with other "write"
  global variables.
- Chances that the same subsystem fast path variables come in the same
  fast path cache line. i.e, it will enable a more predictable
  performance number from build to build.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:06 +02:00
Jerin Jacob
d0cfb06468 eal: introduce API for getting thread name
Introduce rte_thread_getname() API to get the thread name
and implement it for Linux and FreeBSD.

FreeBSD does not support getting the thread name.

One of the consumers of this API will be the trace subsystem where
it used as an informative purpose.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:03 +02:00
Wei Hu (Xavier)
d6298844da vfio: fix use after free with multiprocess
This patch fixes the heap-use-after-free bug which was found by ASAN
(Address-Sanitizer) in the vfio_get_default_container_fd function.

Fixes: 6bcb7c95fe ("vfio: share default container in multi-process")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-04-21 18:13:44 +02:00
Michael Haeuptle
b758423bc4 vfio: fix race condition with sysfs
This fix treats a 0 return value from vfio_open_group_fd
in vfio_get_group_fd as the intended error condition instead
of putting an incorrect 0 file descriptor in the vfio_group table.

Sometimes, the creation of device files in sysfs is not
instantaneously causing vfio_open_groupfd to return 0.
This has been observed when hot removing/adding multiple
NVMe devices (>=4).

Fixes: 340b7bb8d5 ("vfio: extend data structure for multi container")
Cc: stable@dpdk.org

Signed-off-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Acked-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-04-21 18:13:44 +02:00
Thomas Monjalon
ddcd7640ca replace no-return attributes
The new macro __rte_noreturn, for compiler hinting,
is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:30:58 +02:00
Thomas Monjalon
f2fc83b40f replace unused attributes
There is a common macro __rte_unused, avoiding warnings,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:30:58 +02:00
Pavan Nikhilesh
acec04c4b2 build: disable experimental API check internally
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-04-14 16:22:34 +02:00
Thomas Monjalon
5f60d38b5a eal: clean make and meson files
Clean up indent and line ordering in Makefile and meson.build
for consistency in linux/ and freebsd/ directories.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00
Thomas Monjalon
a083f8cc77 eal: move OS-specific sub-directories
Since the kernel modules are moved to kernel/ directory,
there is no need anymore for the sub-directory eal/ in
linux/, freebsd/ and windows/.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00
Thomas Monjalon
9c1e0dc39a eal: move common header files
The EAL API (with doxygen documentation) is moved from
common/include/ to include/, which makes more clear that
it is the global API for all environments and architectures.

Note that the arch-specific and OS-specific include files are not
in this global include directory, but include/generic/ should
cover the doxygen documentation for them.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00
Thomas Monjalon
2a1991799e eal: move arch-specific C files
The arch-specific directories arm, ppc and x86 in common/arch/
are moved at the same level as the OS-specific directories.
It makes more clear that EAL is covering a matrix combining OS and arch.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00
Thomas Monjalon
4448a202b6 eal: remove useless makefiles
When moving files to the directory kernel/,
the file BSDmakefile.meson was left in eal/.

Also the intermediate makefiles in linux/ and freebsd/ became useless.

Fixes: acaa9ee991 ("move kernel modules directories")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00
Anatoly Burakov
4236694f0a mem: preallocate VA space in no-huge mode
When --no-huge mode is used, the memory is currently allocated with
mmap(NULL, ...). This is fine in most cases, but can fail in cases
where DPDK is run on a machine with an IOMMU that is of more limited
address width than that of a VA, because we're not specifying the
address hint for mmap() call.

Fix it by preallocating VA space before mapping it.

Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Jun W Zhou <junx.w.zhou@intel.com>
2020-03-27 11:04:09 +01:00
Anatoly Burakov
d1c7c0cdf7 vfio: map contiguous areas in one go
Currently, when we are creating DMA mappings for memory that's
either external or is backed by hugepages in IOVA as PA mode, we
assume that each page is necessarily discontiguous. This may not
actually be the case, especially for external memory, where the
user is able to create their own IOVA table and make it
contiguous. This is a problem because VFIO has a limited number
of DMA mappings, and it does not appear to concatenate them and
treats each mapping as separate, even when they cover adjacent
areas.

Fix this so that we always map contiguous memory in a single
chunk, as opposed to mapping each segment separately.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <ray.kinsella@intel.com>
2020-03-27 10:09:22 +01:00
Harman Kalra
9058afaa26 eal: check if running in interrupt context
Added an API to check if current execution is in interrupt
context. This will be helpful to handle nested interrupt cases.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2020-02-06 16:35:40 +01:00
Stephen Hemminger
292f02b58c mem: fix munmap in error unwind
The loop to unwind existing mmaps was only unmapping the
first segment and the error paths after mmap() were not
doing munmap of the current segment.

Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-02-06 15:39:30 +01:00
Takeshi Yoshimura
986f2134c3 vfio: fix mapping failures in ppc64le
ppc64le failed when using large physical memory. I found problems in my two
commits in the past.

In commit e072d16f89 ("vfio: fix expanding DMA area in ppc64le"), I added
a sanity check using a mapped address to resolve an issue around expanding
IOMMU window, but this was not enough, since memory allocation can return
memory anywhere dependent on memory fragmentation. DPDK may still skip DMA
mapping and attempts to unmap non-mapped DMA during expanding IOMMU window.
As a result, SPDK apps using large physical memory frequently failed to
proceed the communication with NVMe and/or went into an infinite loop.

The root cause of the bug was in a gap between memory segments managed by
DPDK and firmware-level DMA mapping. DPDK's memory segments don't contain
the state of DMA mapping, and so, the memesg_walk cannot determine if an
iterated memory segment is mapped or not. This resulted in incorrect DMA
maps and unmaps.

At this time, I added the code to avoid iterating non-mapped memory
segments during DMA mapping. The memseg_walk iterates over memory segments
marked as "used", and so, the code sets memory segments that will be
mapped or unmapped as "free" transiently.

The commit db90b4969e ("vfio: retry creating sPAPR DMA window") allows
retring different page levels and sizes to create DMA window. However, this
allows page sizes different from hugepage sizes. This inconsistency caused
failures at the time of DMA mapping after the window creation. This patch
fixes to retry only different page levels.

Fixes: e072d16f89 ("vfio: fix expanding DMA area in ppc64le")
Fixes: db90b4969e ("vfio: retry creating sPAPR DMA window")
Cc: stable@dpdk.org

Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-02-05 21:57:21 +01:00
Ali Alnubani
e8a17faa5e eal/linux: fix build when VFIO is disabled
The header linux/version.h isn't included when CONFIG_RTE_EAL_VFIO
is explicitly disabled. LINUX_VERSION_CODE and KERNEL_VERSION are
therefore undefined, causing the build failure:

  lib/librte_eal/linux/eal/eal.c: In function ‘rte_eal_init’:
  lib/librte_eal/linux/eal/eal.c:1076:32: error: "LINUX_VERSION_CODE" is
    not defined, evaluates to 0 [-Werror=undef]

Fixes: a0dede62a5 ("eal/linux: remove KNI restriction on IOVA")
Cc: stable@dpdk.org

Signed-off-by: Ali Alnubani <alialnu@mellanox.com>
2020-01-20 00:08:53 +01:00
David Marchand
aef1d07331 eal/linux: fix build error on RHEL 7.6
Previous fix gives hiccups to gcc on RHEL 7.6:

== Build lib/librte_eal/linux/eal
  CC eal_interrupts.o
...lib/librte_eal/linux/eal/eal_interrupts.c: In function
  ‘eal_intr_thread_main’:
...lib/librte_eal/linux/eal/eal_interrupts.c:1048:9: error: missing
  initializer for field ‘events’ of ‘struct epoll_event’
  [-Werror=missing-field-initializers]
  struct epoll_event ev = { };
         ^
In file included from ...lib/librte_eal/linux/eal/eal_interrupts.c:15:0:
/usr/include/sys/epoll.h:89:12: note: ‘events’ declared here
   uint32_t events; /* Epoll events */
            ^
...lib/librte_eal/linux/eal/eal_interrupts.c: At top level:
cc1: error: unrecognized command line option
  "-Wno-address-of-packed-member" [-Werror]
cc1: all warnings being treated as errors

Fixes: e0ab8020ac ("eal/linux: fix uninitialized data valgrind warning")
Cc: stable@dpdk.org

Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2019-12-04 14:22:02 +01:00
Stephen Hemminger
e0ab8020ac eal/linux: fix uninitialized data valgrind warning
Valgrind reports that eal interrupt thread is calling epoll_ctl
with uninitialized data.
This is a false positive, because the kernel is not going to care about
the unused bits in the union but trivial to fix by initializing it.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@redhat.com>
2019-12-04 10:05:05 +01:00
Ferruh Yigit
de480bbf13 kni: fix build with Linux 4.9.x
The 'get_user_pages_remote()' API is updated in kernel 4.10.0 [1],
but the check added as > 4.9.0,
this logic is broken for kernels 4.9.x, because they justify
> 4.9.0 check but have the old API.

Fixing the check as >= 4.10.0

[1]
commit 5b56d49fc31d ("mm: add locked parameter to get_user_pages_remote()")

Fixes: d965af9e8a ("kni: increase kernel version requirement for VA")

Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-28 14:48:24 +01:00
Ferruh Yigit
d965af9e8a kni: increase kernel version requirement for VA
A build error reported related to the selected 'get_user_pages_remote()'
kernel API:

.../kernel/linux/kni/kni_dev.h:113:8:
  error: too few arguments to function ‘get_user_pages_remote’
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1
        ^~~~~~~~~~~~~~~~~~~~~

Currently there are three versions of the 'get_user_pages_remote()'
supported, based on kernel version < 4.9, = 4.9, > 4.9.

These version based checks are not working fine with the distro kernels
which is the cause of reported build error. The error reported by the
kernel version 4.8, but it is using API defined in > 4.9.

To be able to take control of this, and possible more, related build
error, increasing the minimum supported kernel version for iova=va with
KNI to kernel version 4.9.

This leaves us with single version of the kernel API and more manageable.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-21 00:18:02 +01:00