Fixes missing PKT_TX_UDP_SEG, PKT_TX_OUTER_IPV6,PKT_TX_OUTER_IPV4,
PKT_TX_IPV6 and PKT_TX_IPV4 values in PKT_TX_OFFLOAD_MASK.
Also sort them in bit wise order to recognize missing items later.
Fixes: 6d18505efaa6 ("vhost: support UDP Fragmentation Offload")
Fixes: 1c3b7c33e977 ("mbuf: add Tx offloading flags for tunnels")
Fixes: 711ba9e23e68 ("mbuf: remove aliasing of Tx offloading flags with Rx ones")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Some users want to use their own epoll instances to control both
DPDK rxq interrupt fds and their own other fds. So added a function
to get rxq interrupt fd based on port id and queue id.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
No users left for this function, time to deprecate it.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Several pattern items and actions were never handled by rte_flow_copy()
because their descriptions were missing. rte_flow_conv() inherited this
deficiency.
This patch adds them and reorders others to match rte_flow.h. It doesn't
pose as a fix because so far no one has complained about it and
rte_flow_conv() would have to be backported as well: this function is
the only sane approach to handle VXLAN and NVGRE encap definitions.
As a matter of fact, it's the last missing piece to finally allow
testpmd users to request the creation of VXLAN/NVGRE encap/decap flow
rules without getting rejected outright.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This provides a means for applications to retrieve the name of flow
pattern items and actions.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
rte_flow_copy() is bound to duplicate flow rule descriptions
(attributes, pattern and list of actions, all at once), however
applications sometimes need more flexibility, for instance the ability
to duplicate only one of the underlying objects (a single pattern item
or action) or retrieve other properties such as their names.
Instead of adding dedicated functions to handle each possible use case,
this patch introduces rte_flow_conv(), which supports any number of
object conversion operations in an extensible manner.
This patch re-implements rte_flow_copy() as a wrapper to
rte_flow_conv().
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
It's used to get number of available registered vDPA devices.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
All information about a device to probe can be grouped
in a common string, which is what we usually call devargs.
An application should not have to parse this string before
calling the EAL probe function.
And the syntax could evolve to be more complex and support
matching multiple devices in one string.
That's why the bus name and device name should be removed from
rte_eal_hotplug_add().
Instead of changing this function, a simpler one is added
and used in the old one, which may be deprecated later.
When removing a device, we already know its rte_device handle
which can be directly passed as parameter of rte_eal_hotplug_remove().
If the rte_device is not known, it can be retrieved with the devargs,
by iterating in the device list (future RTE_DEV_FOREACH()).
Similarly to the probing case, a new function is added
and used in the old one, which may be deprecated later.
The new function is used in failsafe, because the replacement is easy.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
These functions are quite old and are the only available replacement
for the deprecated attach/detach functions.
Note: some new functions may (again) replace these hotplug functions,
in future, with better parameters.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
When a device is added with a devargs (hotplug or whitelist),
the bus pointer can be retrieved via its devargs.
But there is no such devargs.bus in case of standard scan.
A pointer to the rte_bus handle is added to rte_device.
When a device is allocated (during a scan),
the pointer to its bus is assigned.
It will make possible to remove a rte_device,
using the function pointer from its bus.
The function rte_bus_find_by_device() becomes useless,
and may be removed later.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The function rte_devargs_remove(), which is intended to be internal,
can take a devargs structure as argument.
The matching is still using string comparison of bus name and
device name.
It is simpler and may allow a different devargs matching in future.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
rte_eal_parse_devargs_str() does not support parsing the bus name
at the start of devargs. So it was renamed and deprecated.
rte_eal_devargs_add(), rte_eal_devargs_type_count() and
rte_eal_devargs_dump() were declared deprecated and had their
implementation body renamed.
All these functions were deprecated in release 18.05.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
The enum names are *_params (plural form).
And the items are also using the plural form: *_PARAMS_*.
It looks more natural to use the singular form *_PARAM_* for items.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The final_va field is set during remap_segment() but this information is
not propagated to temporal copy of huge page memory configuration so the
unlink_hugepage_files() function wrongly assume that there is nothing to
unlink. Fix this issue by checking orig_va instead of final_va.
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
When adding or removing external memory from the memory map, there
may be actions that need to be taken on account of this memory (e.g.
DMA mapping). Add support for triggering callbacks when adding,
removing, attaching or detaching external memory.
Some memory event callback handlers will need additional logic to
handle external memory regions. For example, virtio callback has to
completely ignore externally allocated memory, because there is no
way to find file descriptors backing the memory address in a
generic fashion. All other callbacks have also been adjusted to
handle RTE_BAD_IOVA as IOVA address, as this is one of the expected
use cases for external memory support.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
In order to use external memory in multiple processes, we need to
attach to primary process's memseg lists, so add a new API to do
that. It is the responsibility of the user to ensure that memory
is accessible and that it has been previously added to the malloc
heap by another process.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add an API to remove memory from specified heaps. This will first
check if all elements within the region are free, and that the
region is the original region that was added to the heap (by
comparing its length to length of memory addressed by the
underlying memseg list).
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add an API to add externally allocated memory to malloc heap. The
memory will be stored in memseg lists like regular DPDK memory.
Multiple segments are allowed within a heap. If IOVA table is
not provided, IOVA addresses are filled in with RTE_BAD_IOVA.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Add API to allow creating new malloc heaps. They will be created
with socket ID's going above RTE_MAX_NUMA_NODES, to avoid clashing
with internal heaps.
This breaks the ABI, so document the change.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
An API is needed to check whether a particular socket ID belongs
to an internal or external heap. Prime user of this would be
mempool allocator, because normal assumptions of IOVA
contiguousness in IOVA as VA mode do not hold in case of
externally allocated memory.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we will be creating external heaps, they will have their own
"fake" socket ID, so add a function that will map the heap name
to its socket ID.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will need to refer to external heaps in some way. While we use
heap ID's internally, for external API use it has to be something
more user-friendly. So, we will be using a string to uniquely
identify a heap.
This breaks the ABI, so document the change.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
We will be assigning "invalid" socket ID's to external heap, and
malloc will now be able to verify if a supplied socket ID is in
fact a valid one, rendering parameter checks for sockets
obsolete.
This changes the semantics of what we understand by "socket ID",
so document the change in the release notes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Switch over all parts of EAL to use heap ID instead of NUMA node
ID to identify heaps. Heap ID for DPDK-internal heaps is NUMA
node's index within the detected NUMA node list. Heap ID for
external heaps will be order of their creation.
This breaks the ABI, so document the changes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguousness, so adjust malloc code in a few places.
All current calls for memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools is a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match socket ID with page size when
calculating minimum page size for a mempool.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Previously, to calculate length of memory area covered by a memseg
list, we would've needed to multiply page size by length of fbarray
backing that memseg list. This is not obvious and unnecessarily
low level, so store length in the memseg list itself.
This breaks ABI, so bump the EAL ABI version and document the
change. Also, while we're breaking ABI, pack the members a little
better.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
build error:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:
In function ‘txa_service_queue_del’:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:800:7:
error: ‘ret’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
compilation terminated due to -Wfatal-errors.
https://mails.dpdk.org/archives/test-report/2018-October/065919.html
'ret' may be used uninitialized when 'dev->data->nb_tx_queues' is 0,
although this is not a practical value, initialize 'ret' to cover this
case.
Fixes: a3bbf2e09756 ("eventdev: add eth Tx adapter implementation")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, DPDK will skip mapping some areas (or even an entire BAR)
if MSI-X table happens to be in them but is smaller than page size.
Kernels 4.16+ will allow mapping MSI-X BARs [1], and will report this
as a capability flag. Capability flags themselves are also only
supported since kernel 4.6 [2].
This commit will introduce support for checking VFIO capabilities,
and will use it to check if we are allowed to map BARs with MSI-X
tables in them, along with backwards compatibility for older
kernels, including a workaround for a variable rename in VFIO
region info structure [3].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=c84982adb23bcf3b99b79ca33527cd2625fbe279
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=ff63eb638d63b95e489f976428f1df01391e15e4
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When NUMA-aware hugepages config option is set, we rely on
libnuma to tell the kernel to allocate hugepages on a specific
NUMA node. However, we allocate node mask before we check if
NUMA is available in the first place, which, according to
the manpage [1], causes undefined behaviour.
Fix by only using nodemask when we have NUMA available.
[1] https://linux.die.net/man/3/numa_alloc_onnode
Bugzilla ID: 20
Fixes: 1b72605d2416 ("mem: balanced allocation of hugepages")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Currently, command-line switches for legacy mem mode or single-file
segments mode are only stored in internal config. This leads to a
situation where these flags have to always match between primary
and secondary, which is bad for usability.
Fix this by storing these flags in the shared config as well, so
that secondary process can know if the primary was launched in
single-file segments or legacy mem mode.
This bumps the EAL ABI, however there's an EAL deprecation notice
already in place[1] for a different feature, so that's OK.
[1] http://patches.dpdk.org/patch/43502/
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Implement the operators of an rte_class for the
ethdev abstraction layer.
Register the layer as such.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This iterator can be customized with a comparison function that will
trigger a stopping condition.
It can be leveraged to write several different iterators that have
similar but non-identical purposes.
It is private to librte_ethdev.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Long time ago preallocation of memory for KNI was introduced in commit
0c6bc8e. It was done because of lack of ability to free previously
allocated memzones, which led to memzone exhaustion. Currently memzones
can be freed and this patch uses this ability for dynamic KNI memory
allocation.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Make the ethernet port id passed into
rte_event_eth_rx_adapter_caps_get() 16 bit.
Also, update the event rx adapter test to use 16 bit
ethernet port ids.
Fixes: c2189c907dd1 ("eventdev: make ethdev port identifiers 16-bit")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This patch implements the Tx adapter APIs by invoking the
corresponding eventdev PMD callbacks and also provides
the common rte_service function based implementation when
the eventdev PMD support is absent.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
The caps API allows the application to query if the transmit
stage is implemented in the eventdev PMD or uses the common
rte_service function. The PMD callbacks support the
eventdev PMD implementation of the adapter.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The ethernet Tx adapter abstracts the transmit stage of an
event driven packet processing application. The transmit
stage may be implemented with eventdev PMD support or use a
rte_service function implemented in the adapter. These APIs
provide a common configuration and control interface and
an transmit API for the eventdev PMD implementation.
The transmit port is specified using mbuf::port. The transmit
queue is specified using the rte_event_eth_tx_adapter_txq_set()
function.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This commit introduces a new function in the eventdev API,
which allows applications to read the number of unlink requests
in progress on a particular port of an eventdev instance.
This information allows applications to verify when no more packets
from a particular queue (or any queue) will arrive at a port.
The application could decide to stop polling, or put the core into
a sleep state if it wishes, as it is ensured that no new packets
will arrive at a particular port anymore if all queues are unlinked.
Suggested-by: Matias Elo <matias.elo@nokia.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Use RTE_MAX_ETHPORTS instead of rte_eth_dev_count_total()
when allocating eth Rx adapter's per-eth device data structure
to account for hotplugged devices.
Fixes: 9c38b704d280 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
When performing enqueue operations on the split and packed rings,
if the reserved buffer length from the descriptor table exceeds
65535, the returned length by fill_vec_buf_split/_packed()
overflows. This patch is to avoid this corner case.
Fixes: f689586bc060 ("vhost: shadow used ring update")
Fixes: fd68b4739d2c ("vhost: use buffer vectors in dequeue path")
Fixes: 2f3225a7d69b ("vhost: add vector filling support for packed ring")
Fixes: 37f5e79a271d ("vhost: add shadow used ring support for packed rings")
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Introduce vhost_message_handlers, which maps the message request
type to the message handler. Then replace the switch construct
with a map and call.
Failing vhost_user_set_features is fatal and all processing should
stop immediately and propagate the error to the upper layers. Change
the code accordingly to reflect that.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Each vhost-user message handling function will return an int result
which is described in the new enum vh_result: error, OK and reply.
All functions will now have two arguments, virtio_net double pointer
and VhostUserMsg pointer.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>