23 Commits

Author SHA1 Message Date
Shijith Thotton
a986c2b797 build: add option to configure IOVA mode as PA
IOVA mode in DPDK is either PA or VA.
The new build option enable_iova_as_pa configures the mode to PA
at compile time.
By default, this option is enabled.
If the option is disabled, only drivers which support it are enabled.
A driver that supports this can set the flag pmd_supports_disable_iova_as_pa
in its build file.

The mbuf structure holds both the physical address (PA) and the virtual
address (VA). If IOVA as PA is disabled at compile time, the PA field
(buf_iova) of the mbuf is redundant, as it is the same as the VA,
and is replaced by a dummy field.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2022-10-09 13:14:52 +02:00
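A minimal sketch of the idea in a986c2b797, assuming a simplified buffer descriptor. The struct sketch_mbuf, the macro SKETCH_IOVA_AS_PA, and the helper sketch_mbuf_iova() are hypothetical names used only for illustration; they are not the DPDK definitions. When the PA field is compiled out, a placeholder keeps the layout unchanged and the IO address is derived from the virtual address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the build option; 1 keeps the PA field. */
#ifndef SKETCH_IOVA_AS_PA
#define SKETCH_IOVA_AS_PA 1
#endif

/* Simplified, hypothetical buffer descriptor (not struct rte_mbuf). */
struct sketch_mbuf {
    void *buf_addr;        /* virtual address of the buffer */
#if SKETCH_IOVA_AS_PA
    uint64_t buf_iova;     /* physical/IO address of the buffer */
#else
    uint64_t dummy;        /* placeholder that preserves the layout */
#endif
};

/* Returns the IO address, whichever way the struct was compiled. */
static inline uint64_t sketch_mbuf_iova(const struct sketch_mbuf *m)
{
    assert(m != NULL);
#if SKETCH_IOVA_AS_PA
    return m->buf_iova;
#else
    /* With IOVA as PA disabled, the IO address equals the VA. */
    return (uint64_t)(uintptr_t)m->buf_addr;
#endif
}
```

Routing every access through one accessor means callers do not need their own conditional compilation.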
Kevin Laatz
1cab1a40ea bus: cleanup devices on shutdown
During EAL init, all buses are probed and the devices found are
initialized. On eal_cleanup(), the inverse does not happen, meaning any
allocated memory and other configuration will not be cleaned up
appropriately on exit.

Currently, in order for device cleanup to take place, applications must
call the driver-relevant functions to ensure proper cleanup is done before
the application exits. Since initialization occurs for all devices on the
bus, not just the devices used by an application, it requires a)
application awareness of all bus devices that could have been probed on the
system, and b) code duplication across applications to ensure cleanup is
performed. An example of this is rte_eth_dev_close() which is commonly used
across the example applications.

This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
init/exit more symmetrical, ensuring all bus devices are cleaned up
appropriately without the application needing to be aware of all bus types
that may have been probed during initialization.

Contained in this patch are the changes required to perform cleanup for
devices on the PCI bus and VDEV bus during eal_cleanup(). Bus maintainers
are asked to add the relevant cleanup for their buses, since they have
the domain expertise.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
2022-10-04 21:20:15 +02:00
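The symmetry described in 1cab1a40ea can be sketched as a loop over registered buses that invokes an optional cleanup callback. This is only an illustration under assumptions: struct sketch_bus, sketch_bus_cleanup_all(), and the two sample callbacks are hypothetical and do not reproduce the DPDK bus API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified bus descriptor (not the DPDK bus structure). */
struct sketch_bus {
    const char *name;
    int (*cleanup)(void);   /* optional: NULL when the bus has no cleanup */
};

/* Counts how many sample cleanup callbacks have run. */
static int sketch_cleaned;

/* Sample callbacks standing in for PCI and VDEV bus cleanup. */
static int sketch_pci_cleanup(void)  { sketch_cleaned++; return 0; }
static int sketch_vdev_cleanup(void) { sketch_cleaned++; return 0; }

/*
 * Calls the cleanup callback of every bus that provides one.
 * Returns the number of buses whose cleanup reported an error.
 */
static int sketch_bus_cleanup_all(const struct sketch_bus *buses, size_t n)
{
    int failures = 0;

    assert(buses != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (buses[i].cleanup == NULL)
            continue;       /* bus has not implemented cleanup yet */
        if (buses[i].cleanup() != 0)
            failures++;
    }
    return failures;
}
```

Treating the callback as optional lets buses without a cleanup implementation be skipped rather than fail the whole shutdown.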
Abdullah Ömer Yamaç
8ae946970e eal: fix thread name for high order lcores
For high-order logical cores (IDs greater than 99), the thread name was
truncated (its length is restricted to 16 characters, including the
terminating null byte ('\0')), which makes threads hard to follow.

Before this fix, the issue could be reproduced using the following arguments:
  --lcores=0,10@1,100@2
This produced two threads with the same name:
lcore-worker-10
lcore-worker-10

Signed-off-by: Abdullah Ömer Yamaç <omer.yamac@ceng.metu.edu.tr>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2022-09-30 11:23:12 +02:00
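The truncation described in 8ae946970e can be reproduced with plain snprintf(). The helper sketch_thread_name() and the constant SKETCH_THREAD_NAME_LEN are hypothetical; only the 16-byte limit and the "lcore-worker-" prefix come from the commit message. The prefix is 13 characters long, so a three-digit lcore ID needs 17 bytes including the terminator and loses its last digit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Limit stated in the commit message, including the terminating '\0'. */
#define SKETCH_THREAD_NAME_LEN 16

/*
 * Formats "<prefix><lcore_id>" into out.
 * Returns 1 if the name had to be truncated, 0 if it fit.
 */
static int sketch_thread_name(char out[SKETCH_THREAD_NAME_LEN],
                              const char *prefix, unsigned int lcore_id)
{
    int needed;

    assert(out != NULL && prefix != NULL);
    needed = snprintf(out, SKETCH_THREAD_NAME_LEN, "%s%u", prefix, lcore_id);
    /* snprintf reports the length it wanted, excluding the '\0'. */
    return needed < 0 || needed >= SKETCH_THREAD_NAME_LEN;
}
```

With the "lcore-worker-" prefix, lcore 10 and lcore 100 both end up as "lcore-worker-10". A shorter prefix leaves room for three-digit IDs.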
Dmitry Kozlyuk
72b452c5f2 eal: remove unneeded includes from a public header
Do not include <ctype.h>, <errno.h>, and <stdlib.h> from <rte_common.h>,
because they are not used by this file.
Include the needed headers directly from the files that need them.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-09-21 15:31:03 +02:00
Don Wallwork
42fbb8e85d eal/linux: allocate worker lcore stacks in hugepages
Add support for using hugepages for worker lcore stack memory. The
intent is to improve performance by reducing stack memory related TLB
misses and also by using memory local to the NUMA node of each lcore.

EAL option '--huge-worker-stack[=stack-size-in-kbytes]' is added to allow
the feature to be enabled at runtime. If the size is not specified,
the system pthread stack size will be used.

Signed-off-by: Don Wallwork <donw@xsightlabs.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2022-06-23 22:36:33 +02:00
Chengwen Feng
a8f23b444d trace: fix crash when exiting
Bug scenario:
1. start testpmd:
  $ dpdk-testpmd -l 4-6 -a 0000:7d:00.0 --trace=.* -- -i
2. quit testpmd and observe a segmentation fault:
  Bye...
  Segmentation fault (core dumped)

The root cause is that rte_trace_save() and eal_trace_fini() access
the huge pages which were cleaned up by rte_eal_memory_detach().

This patch moves rte_trace_save() and eal_trace_fini() before
rte_eal_memory_detach() to fix the bug.

Fixes: dfbc61a2f9 ("mem: detach memsegs on cleanup")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-06-21 11:11:00 +02:00
David Marchand
a95d70547c eal: factorize lcore main loop
All OS implementations provide the same main loop.
Introduce helpers (shared for Linux and FreeBSD) to handle synchronisation
between main and threads and factorize the rest as common code.
Thread IDs are now logged as strings in a common format across OSes.

Note:
- this change also fixes Windows EAL: worker threads cpu affinity was
  incorrectly reported in log.

- libabigail flags this change as breaking ABI in clang builds:
  1 function with some indirect sub-type change:

  [C] 'function int rte_eal_remote_launch(int (void*)*, void*, unsigned
      int)' at eal_common_launch.c:35:1 has some indirect sub-type
      changes:
    parameter 1 of type 'int (void*)*' changed:
      in pointed to type 'function type int (void*)' at rte_launch.h:31:1:
        entity changed from 'function type int (void*)' to 'typedef
          lcore_function_t' at rte_launch.h:31:1
        type size hasn't changed

  This is being investigated on libabigail side.
  For now, we don't have much choice but to waive reports on this symbol.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2022-04-14 13:59:50 +02:00
David Marchand
449e7dbc7b eal: cleanup lcore ID hand-over
So far, a worker thread has been using its thread_id to discover which
lcore has been assigned to it.

On the other hand, as noted by Tyler, the pthread API does not strictly
guarantee that a new thread won't start running eal_thread_loop before
pthread_create writes to &lcore_config[xx].thread_id.

Though all OS implementations supported in DPDK (recently) ensure this
property, it is more robust to have the main thread pass the lcore ID
directly to the worker thread.

Signed-off-by: David Marchand <david.marchand@redhat.com>
2022-04-14 13:59:50 +02:00
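The hand-over described in 449e7dbc7b can be sketched with the pthread API alone: the lcore ID travels in the thread argument, so the worker never has to look up its own thread ID in a table that the creator may not have filled in yet. The names sketch_thread_loop(), sketch_launch_worker(), and sketch_seen_lcore are hypothetical and are not the EAL functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Records the lcore ID observed by the worker, for demonstration only. */
static unsigned int sketch_seen_lcore;

/* Worker entry point: the lcore ID arrives directly in the argument. */
static void *sketch_thread_loop(void *arg)
{
    unsigned int lcore_id = (unsigned int)(uintptr_t)arg;

    sketch_seen_lcore = lcore_id;
    return NULL;
}

/* Creates a worker and hands it its lcore ID; returns pthread_create()'s result. */
static int sketch_launch_worker(unsigned int lcore_id, pthread_t *tid)
{
    assert(tid != NULL);
    return pthread_create(tid, NULL, sketch_thread_loop,
                          (void *)(uintptr_t)lcore_id);
}
```

The caller joins the thread with pthread_join() before reading sketch_seen_lcore, which gives the required synchronisation in this small example.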
Sean Morrissey
30a1de105a lib: remove unneeded header includes
These header includes have been flagged by the iwyu_tool
and removed.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
2022-02-22 13:10:39 +01:00
Stephen Hemminger
06c047b680 remove unnecessary null checks
Functions like free, rte_free, and rte_mempool_free
already handle a NULL pointer, so the checks here are not necessary.

Remove redundant NULL pointer checks before free functions
found by nullfree.cocci

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-12 12:07:48 +01:00
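The pattern removed in 06c047b680 can be illustrated with the standard free(), which the C standard defines as a no-op for a NULL pointer. The struct sketch_ctx and the function sketch_ctx_release() are hypothetical examples, not DPDK code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object owning two heap allocations. */
struct sketch_ctx {
    char *name;
    int *table;
};

/* Releases the members; safe to call when they are already NULL. */
static void sketch_ctx_release(struct sketch_ctx *ctx)
{
    assert(ctx != NULL);

    /* Before: if (ctx->name != NULL) free(ctx->name); */
    free(ctx->name);        /* free(NULL) does nothing */
    free(ctx->table);

    /* Reset the pointers so a second call is harmless. */
    ctx->name = NULL;
    ctx->table = NULL;
}
```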
Stephen Hemminger
a0cc7be20d mem: cleanup multiprocess resources
The mp action resources in malloc should be cleaned up via
rte_eal_cleanup.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
e8dc971b63 eal: cleanup multiprocess hotplug resources
When rte_eal_cleanup is called, hotplug should unregister the
resources associated with the multi-process server.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
6412941ae8 vfio: cleanup the multiprocess sync handle
When rte_eal_cleanup is called, the rte_mp_action for VFIO
should be freed.

Fixes: edf73dd330 ("ipc: handle unsupported IPC in action register")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
5f4eb82f3c log: close in cleanup stage
When application calls rte_eal_cleanup on shutdown,
the DPDK log should be closed and cleaned up.

This helps reduce false reports from tools like ASAN
and valgrind that track memory leaks.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-11 19:49:22 +01:00
Stephen Hemminger
6e97b5fc1a eal: move Unix filesystem functions into one file
Both Linux and FreeBSD have the same code for creating the runtime
directory and reading sysfs files. Put them in the new lib/eal/unix
subdirectory.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2022-02-09 19:12:53 +01:00
Stephen Hemminger
1835a22f34 support systemd service convention for runtime directory
systemd.exec supports configuring the runtime directory of a service
via RuntimeDirectory=. This creates the directory with the necessary
permissions, which the actual service may not have if running in a container.

The change to DPDK is to look for the environment variable RUNTIME_DIRECTORY
first and use that in preference to the fallback alternatives.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
2022-02-09 19:12:40 +01:00
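The lookup order described in 1835a22f34 can be sketched with getenv(). Only the RUNTIME_DIRECTORY variable name comes from the commit message; the helper sketch_runtime_base(), the caller-supplied fallback, and the decision to ignore an empty value are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Returns the directory to use as the runtime base:
 * RUNTIME_DIRECTORY when set and non-empty (assumption), else the fallback.
 */
static const char *sketch_runtime_base(const char *fallback)
{
    const char *dir;

    assert(fallback != NULL);
    dir = getenv("RUNTIME_DIRECTORY");
    if (dir != NULL && dir[0] != '\0')
        return dir;         /* set by systemd via RuntimeDirectory= */
    return fallback;        /* whatever the caller would have used before */
}
```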
Stephen Hemminger
36514d8dfa eal: remove size for setting runtime directory
The size argument to eal_set_runtime_dir is useless and was
being used incorrectly in strlcpy. It worked only because
all callers passed PATH_MAX, which is the same as the size of the
destination runtime_dir.

Note: this is an internal API, so there is no user-visible change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-02-09 16:42:31 +01:00
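The point of 36514d8dfa is that a bounded copy must be limited by the size of the destination, which the callee already knows. A minimal sketch follows, assuming a fixed-size static buffer; sketch_set_runtime_dir(), sketch_runtime_dir, and SKETCH_PATH_MAX are hypothetical stand-ins, and a length check plus memcpy() replaces strlcpy() to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for PATH_MAX. */
#define SKETCH_PATH_MAX 4096

/* Destination buffer owned by this file, so its size is known here. */
static char sketch_runtime_dir[SKETCH_PATH_MAX];

/*
 * Stores run_dir without taking a size argument from the caller.
 * Returns 0 on success, -1 if the path does not fit.
 */
static int sketch_set_runtime_dir(const char *run_dir)
{
    size_t len;

    assert(run_dir != NULL);
    len = strlen(run_dir);
    if (len >= sizeof(sketch_runtime_dir))
        return -1;          /* would not fit with the terminator */
    memcpy(sketch_runtime_dir, run_dir, len + 1);
    return 0;
}
```

Using sizeof on the destination removes the possibility of a caller passing a bound that does not match the buffer.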
Dmitry Kozlyuk
32b4771cd8 eal/linux: allow hugepage file reuse
Linux EAL ensured that mapped hugepages are clean
by always mapping from newly created files:
existing hugepage backing files were always removed.
In this case, the kernel clears the page to prevent data leaks,
because the mapped memory may contain leftover data
from the previous process that was using this memory.
Clearing takes the bulk of the time spent in mmap(2),
increasing EAL initialization time.

Introduce a mode to keep existing files and reuse them
in order to speed up initial memory allocation in EAL.
Hugepages mapped from such files may contain data
left by the previous process that used this memory,
so RTE_MEMSEG_FLAG_DIRTY is set for their segments.
If multiple hugepages are mapped from the same file:
1. When fallocate(2) is used, all memory mapped from this file
   is considered dirty, because it is unknown
   which parts of the file are holes.
2. When ftruncate(2) is used, memory mapped from this file
   is considered dirty unless the file is extended
   to create a new mapping, which implies clean memory.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2022-02-08 21:32:53 +01:00
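The two rules listed in 32b4771cd8 can be condensed into a small decision function. This is a sketch of the stated rules only, not the EAL implementation; sketch_mapping_is_dirty() and enum sketch_resize_method are hypothetical, and the result corresponds to whether RTE_MEMSEG_FLAG_DIRTY would be set for the segment.

```c
#include <assert.h>
#include <stdbool.h>

/* How the backing file is resized (hypothetical enumeration). */
enum sketch_resize_method {
    SKETCH_FALLOCATE,
    SKETCH_FTRUNCATE
};

/*
 * Decides whether memory mapped from a hugepage file must be treated as
 * dirty, following the two cases in the commit message.
 */
static bool sketch_mapping_is_dirty(bool file_reused,
                                    enum sketch_resize_method method,
                                    bool file_extended_for_mapping)
{
    assert(method == SKETCH_FALLOCATE || method == SKETCH_FTRUNCATE);

    if (!file_reused)
        return false;       /* newly created file: the kernel clears it */
    if (method == SKETCH_FALLOCATE)
        return true;        /* unknown which parts of the file are holes */
    /* ftruncate: extending the file for this mapping implies clean memory */
    return !file_extended_for_mapping;
}
```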
Jim Harris
628bac7df1 eal/linux: remove unused variable for socket memory
clang-13 rightfully complains that the total_mem variable in
eal_parse_socket_arg is set but not used, since the final
accumulated total_mem result isn't used anywhere.
So just remove the total_mem variable.

Fixes: 0a703f0f36 ("eal/linux: fix parsing zero socket memory and limits")

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-11-04 13:27:18 +01:00
Harman Kalra
90b13ab8d4 alarm: remove direct access to interrupt handle
Remove direct access to interrupt handle structure fields and
use the respective get/set APIs instead.
Make this change in all libraries that access the interrupt handle fields.

Implement an alarm cleanup routine in which the memory allocated
for the interrupt instance is freed.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
2021-10-25 21:20:12 +02:00
Bruce Richardson
e89463a366 eal: limit telemetry to primary processes
The telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in a secondary process is generally available to the primary. For
example, device stats for ethdevs, cryptodevs, etc. are common
across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
2021-10-14 20:31:10 +02:00
Bruce Richardson
ce382fdddb eal: create runtime dir even when shared data is not used
When multi-process is not wanted and DPDK is run with the "no-shconf"
flag, the telemetry library still needs a runtime directory to place the
unix socket for telemetry connections. Therefore, rather than not
creating the directory when this flag is set, we can change the code to
attempt the creation anyway, but not error out if it fails. If it
succeeds, then telemetry will be available, but if it fails, the rest of
DPDK will run without telemetry. This ensures that the "in-memory" flag
will allow DPDK to run even if the whole filesystem is read-only, for
example.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-07-07 15:23:09 +02:00
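The best-effort behaviour described in ce382fdddb can be sketched as a small policy function: a failed directory creation is fatal only when shared configuration is required. The function sketch_handle_runtime_dir_result() and its parameters are hypothetical and only mirror the reasoning in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * create_rc: result of attempting to create the runtime directory (0 = ok).
 * no_shconf: true when shared configuration is not wanted.
 * telemetry_ok: set to true when the directory exists for the socket.
 * Returns 0 to continue initialization, -1 to abort it.
 */
static int sketch_handle_runtime_dir_result(int create_rc, bool no_shconf,
                                            bool *telemetry_ok)
{
    assert(telemetry_ok != NULL);

    *telemetry_ok = (create_rc == 0);
    if (create_rc != 0 && !no_shconf)
        return -1;          /* shared config needs the directory */
    return 0;               /* run on, with or without telemetry */
}
```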
Bruce Richardson
99a2dd955f lib: remove librte_ prefix from directory names
There is no reason for the DPDK libraries to all have 'librte_' prefix on
the directory names. This prefix makes the directory names longer and also
makes it awkward to add features referring to individual libraries in the
build - should the lib names be specified with or without the prefix.
Therefore, we can just remove the library prefix and use the library's
unique name as the directory name, i.e. 'eal' rather than 'librte_eal'.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-21 14:04:09 +02:00