When a device is added with a devargs (hotplug or whitelist),
the bus pointer can be retrieved via its devargs.
But there is no such devargs.bus in case of standard scan.
A pointer to the rte_bus handle is added to rte_device.
When a device is allocated (during a scan),
the pointer to its bus is assigned.
It will make possible to remove a rte_device,
using the function pointer from its bus.
The function rte_bus_find_by_device() becomes useless,
and may be removed later.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The function rte_devargs_remove(), which is intended to be internal,
can take a devargs structure as argument.
The matching is still using string comparison of bus name and
device name.
It is simpler and may allow a different devargs matching in future.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
rte_eal_parse_devargs_str() does not support parsing the bus name
at the start of devargs. So it was renamed and deprecated.
rte_eal_devargs_add(), rte_eal_devargs_type_count() and
rte_eal_devargs_dump() were declared deprecated and had their
implementation body renamed.
All these functions were deprecated in release 18.05.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The enum names are *_params (plural form).
And the items are also using the plural form: *_PARAMS_*.
It looks more natural to use the singular form *_PARAM_* for items.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
We could match devices by their PCI id (vendor id, device id, etc).
But for now, only matching by PCI address is implemented.
The devargs parameter "id" is renamed "addr" to reflect its real meaning.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The experimental symbol check script matched on the regexes
"\.text.*$SYM" and "\.text\.experimental.*$SYM" which allows for
substring matches.
If a symbol is leading substring of another one (e.g. symbol foo
is a substring of symbol foobar), it would match on symbols
when it shouldn't.
It is fixed by matching additionally on the end of line
so that symbols are an exact match.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
This fixes the problem of reference count leak if
igbuio_pci_enable_interrupts fails.
Also, replace mutex and integer with a kernel atomic counter.
This is standard pattern for kernel devices.
Fixes: 19685d5aa7 ("igb_uio: allow multi-process access")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The final_va field is set during remap_segment() but this information is
not propagated to temporal copy of huge page memory configuration so the
unlink_hugepage_files() function wrongly assume that there is nothing to
unlink. Fix this issue by checking orig_va instead of final_va.
Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Currently, mempools can only be allocated either using native
DPDK memory, or anonymous memory. This patch will add two new
methods to allocate mempool using external memory (regular or
hugepage memory), and add documentation about it to testpmd
user guide.
It adds a new flag "--mp-alloc", with four possible values:
native (use regular DPDK allocator), anon (use anonymous
mempool), xmem (use externally allocated memory area), and
xmemhuge (use externally allocated hugepage memory area). Old
flag "--mp-anon" is kept for compatibility.
All external memory is allocated using the same external heap,
but each will allocate and add a new memory area.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Add simple unit tests to test external memory support.
The tests are pretty basic and mostly consist of checking
if invalid API calls are handled correctly, plus a simple
allocation/deallocation test for malloc and memzone.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When adding or removing external memory from the memory map, there
may be actions that need to be taken on account of this memory (e.g.
DMA mapping). Add support for triggering callbacks when adding,
removing, attaching or detaching external memory.
Some memory event callback handlers will need additional logic to
handle external memory regions. For example, virtio callback has to
completely ignore externally allocated memory, because there is no
way to find file descriptors backing the memory address in a
generic fashion. All other callbacks have also been adjusted to
handle RTE_BAD_IOVA as IOVA address, as this is one of the expected
use cases for external memory support.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
In order to use external memory in multiple processes, we need to
attach to primary process's memseg lists, so add a new API to do
that. It is the responsibility of the user to ensure that memory
is accessible and that it has been previously added to the malloc
heap by another process.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add an API to remove memory from specified heaps. This will first
check if all elements within the region are free, and that the
region is the original region that was added to the heap (by
comparing its length to length of memory addressed by the
underlying memseg list).
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add an API to add externally allocated memory to malloc heap. The
memory will be stored in memseg lists like regular DPDK memory.
Multiple segments are allowed within a heap. If IOVA table is
not provided, IOVA addresses are filled in with RTE_BAD_IOVA.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add API to allow creating new malloc heaps. They will be created
with socket ID's going above RTE_MAX_NUMA_NODES, to avoid clashing
with internal heaps.
This breaks the ABI, so document the change.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
An API is needed to check whether a particular socket ID belongs
to an internal or external heap. Prime user of this would be
mempool allocator, because normal assumptions of IOVA
contiguousness in IOVA as VA mode do not hold in case of
externally allocated memory.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we will be creating external heaps, they will have their own
"fake" socket ID, so add a function that will map the heap name
to its socket ID.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will need to refer to external heaps in some way. While we use
heap ID's internally, for external API use it has to be something
more user-friendly. So, we will be using a string to uniquely
identify a heap.
This breaks the ABI, so document the change.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
This changes the semantics of what we understand by "socket ID",
so document the change in the release notes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Switch over all parts of EAL to use heap ID instead of NUMA node
ID to identify heaps. Heap ID for DPDK-internal heaps is NUMA
node's index within the detected NUMA node list. Heap ID for
external heaps will be order of their creation.
This breaks the ABI, so document the changes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.
All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools is a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Previously, to calculate length of memory area covered by a memseg
list, we would've needed to multiply page size by length of fbarray
backing that memseg list. This is not obvious and unnecessarily
low level, so store length in the memseg list itself.
This breaks ABI, so bump the EAL ABI version and document the
change. Also, while we're breaking ABI, pack the members a little
better.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
build error:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:
In function ‘txa_service_queue_del’:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:800:7:
error: ‘ret’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
compilation terminated due to -Wfatal-errors.
https://mails.dpdk.org/archives/test-report/2018-October/065919.html
'ret' may be used uninitialized when 'dev->data->nb_tx_queues' is 0,
although this is not a practical value, initialize 'ret' to cover this
case.
Fixes: a3bbf2e097 ("eventdev: add eth Tx adapter implementation")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The typedef of "__virtio16" is introduced into Linux kernel in v3.19.
To prevent build error on old kernel, this patch replaces the
"__virtio" usage with "uint16_t".
Fixes: d7fe5a2861 ("net/ifc: support live migration")
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch add support to use select call with qman portal fd
for timeout based dequeue request for eventdev.
If there is a event available qman portal fd will be set
and the function will be awakened. If no event is available,
it will only wait till the given timeout value.
In case of interrupt the timeout ticks are used as usecs.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Make the -Wno-format-nonliteral flag conditional, and only set in
clang and gcc builds, since this flag is not supported (nor needed)
when building dsw with icc.
Fixes: 46a186b1f0 ("event/dsw: add device registration and build system")
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, DPDK will skip mapping some areas (or even an entire BAR)
if MSI-X table happens to be in them but is smaller than page size.
Kernels 4.16+ will allow mapping MSI-X BARs [1], and will report this
as a capability flag. Capability flags themselves are also only
supported since kernel 4.6 [2].
This commit will introduce support for checking VFIO capabilities,
and will use it to check if we are allowed to map BARs with MSI-X
tables in them, along with backwards compatibility for older
kernels, including a workaround for a variable rename in VFIO
region info structure [3].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=c84982adb23bcf3b99b79ca33527cd2625fbe279
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=ff63eb638d63b95e489f976428f1df01391e15e4
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When NUMA-aware hugepages config option is set, we rely on
libnuma to tell the kernel to allocate hugepages on a specific
NUMA node. However, we allocate node mask before we check if
NUMA is available in the first place, which, according to
the manpage [1], causes undefined behaviour.
Fix by only using nodemask when we have NUMA available.
[1] https://linux.die.net/man/3/numa_alloc_onnode
Bugzilla ID: 20
Fixes: 1b72605d24 ("mem: balanced allocation of hugepages")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Currently, command-line switches for legacy mem mode or single-file
segments mode are only stored in internal config. This leads to a
situation where these flags have to always match between primary
and secondary, which is bad for usability.
Fix this by storing these flags in the shared config as well, so
that secondary process can know if the primary was launched in
single-file segments or legacy mem mode.
This bumps the EAL ABI, however there's an EAL deprecation notice
already in place[1] for a different feature, so that's OK.
[1] http://patches.dpdk.org/patch/43502/
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Implement the operators of an rte_class for the
ethdev abstraction layer.
Register the layer as such.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This iterator can be customized with a comparison function that will
trigger a stopping condition.
It can be leveraged to write several different iterators that have
similar but non-identical purposes.
It is private to librte_ethdev.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The PCI bus can now parse a matching field "id" as follows:
"bus=pci,id=0000:00:00.0"
or
"bus=pci,id=00:00.0"
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Modify kni_net_ioctl() to return -EOPNOTSUPP for all ioctls instead
of 0.
This is necessary because the Wicked (and possibly other) network
interface managers will perform the SIOCGIWNAME ioctl to check if
the interface is a wireless interface. If the KNI module returns
success, Wicked will incorrectly interpret the interface as a wireless
interface.
Signed-off-by: Dan Gora <dg@adax.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Long time ago preallocation of memory for KNI was introduced in commit
0c6bc8e. It was done because of lack of ability to free previously
allocated memzones, which led to memzone exhaustion. Currently memzones
can be freed and this patch uses this ability for dynamic KNI memory
allocation.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Santosh Shukla no longer associated with Cavium.
Update the octeontx driver code maintainership.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Skeleton PMD does not support RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ
so make the Rx queue_id = -1 and initialize the event port
configuration to zero.
Fixes: d65856999d ("test/event: add Rx adapter tests for interrupt driven queues")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Skeleton PMD does not support RTE_EVENT_ETH_RX_ADAPTER_CAP_MULTI_EVENTQ
and implicit_release_disable so make the Rx queue_id = -1 and
initialize the event port configuration to zero.
Fixes: ec36d881f5 ("eventdev: add implicit release disable capability")
Fixes: 2a9c83ae3b ("test/eventdev: add multi-ports test")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Convert existing Tx service based pipeline to Tx adapter based APIs and
simplify worker functions.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Remove unnecessary newline at the end of logs.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>