An API is needed to check whether a particular socket ID belongs
to an internal or external heap. The prime user of this would be
the mempool allocator, because the normal assumptions of IOVA
contiguity in IOVA-as-VA mode do not hold for externally
allocated memory.
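For illustration only (not part of the patch), a caller such as the
mempool allocator could use the new check along these lines, assuming
the rte_malloc_heap_socket_is_external() prototype from rte_malloc.h
(1 = external, 0 = internal, negative = unknown socket ID); the helper
name is made up:

#include <rte_malloc.h>

static int
may_assume_iova_contiguous(int socket_id)
{
    int ret = rte_malloc_heap_socket_is_external(socket_id);

    if (ret < 0)
        return 0; /* unknown socket ID: be conservative */
    /* only DPDK-internal heaps keep the IOVA-as-VA guarantees */
    return ret == 0;
}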
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we create external heaps, they will have their own
"fake" socket ID, so add a function that maps a heap name
to its socket ID.
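For illustration (not part of the patch), allocation from a named
external heap could then look like this, assuming the
rte_malloc_heap_get_socket() prototype from rte_malloc.h; the heap
name and the helper are placeholders:

#include <rte_malloc.h>

/* Map a heap name to its "fake" socket ID and allocate from it. */
static void *
alloc_from_named_heap(const char *heap_name, size_t size)
{
    int socket_id = rte_malloc_heap_get_socket(heap_name);

    if (socket_id < 0)
        return NULL; /* no heap registered under this name */
    return rte_malloc_socket(NULL, size, 0, socket_id);
}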
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will need to refer to external heaps in some way. While we use
heap IDs internally, the external API needs something more
user-friendly, so we will use a string to uniquely identify a heap.
This breaks the ABI, so document the change.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
We will be assigning "invalid" socket IDs to external heaps, and
malloc will now be able to verify whether a supplied socket ID is
in fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket IDs to external heaps, and
malloc will now be able to verify whether a supplied socket ID is
in fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
We will be assigning "invalid" socket IDs to external heaps, and
malloc will now be able to verify whether a supplied socket ID is
in fact a valid one, rendering parameter checks for sockets
obsolete.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
We will be assigning "invalid" socket IDs to external heaps, and
malloc will now be able to verify whether a supplied socket ID is
in fact a valid one, rendering parameter checks for sockets
obsolete.
This changes the semantics of what we understand by "socket ID",
so document the change in the release notes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Switch over all parts of EAL to use heap ID instead of NUMA node
ID to identify heaps. The heap ID for DPDK-internal heaps is the
NUMA node's index within the detected NUMA node list; the heap ID
for external heaps will be assigned in the order of their creation.
This breaks the ABI, so document the changes.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When we allocate and use DPDK memory, we need to be able to
differentiate between DPDK hugepage segments and segments that
were made part of DPDK but are externally allocated. Add such
a property to memseg lists.
This breaks the ABI, so document the change in release notes.
This also breaks a few internal assumptions about memory
contiguity, so adjust the malloc code in a few places.
All current callers of the memseg walk functions were adjusted to
ignore external segments where it made sense.
Mempools are a special case, because we may be asked to allocate
a mempool on a specific socket, and we need to ignore all page
sizes on other heaps or other sockets. Previously, this
assumption of knowing all page sizes was not a problem, but it
will be now, so we have to match the socket ID with the page size
when calculating the minimum page size for a mempool.
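The kind of walk this implies is sketched below (illustrative only,
not the exact mempool code), assuming the 'external' flag this series
adds to struct rte_memseg_list and the existing rte_memseg_list_walk()
API:

#include <stdint.h>
#include <rte_memory.h>

struct min_pgsz_ctx {
    int socket_id;
    size_t min_pgsz;
};

/* consider only internal segments on the requested socket */
static int
min_pgsz_cb(const struct rte_memseg_list *msl, void *arg)
{
    struct min_pgsz_ctx *ctx = arg;

    if (msl->external || msl->socket_id != ctx->socket_id)
        return 0; /* skip external heaps and other sockets */
    if (msl->page_sz < ctx->min_pgsz)
        ctx->min_pgsz = msl->page_sz;
    return 0;
}

static size_t
min_page_size_on_socket(int socket_id)
{
    struct min_pgsz_ctx ctx = { socket_id, SIZE_MAX };

    rte_memseg_list_walk(min_pgsz_cb, &ctx);
    return ctx.min_pgsz == SIZE_MAX ? 0 : ctx.min_pgsz;
}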
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Previously, to calculate the length of the memory area covered by a
memseg list, we would have needed to multiply the page size by the
length of the fbarray backing that memseg list. This is not obvious
and unnecessarily low level, so store the length in the memseg list
itself.
This breaks ABI, so bump the EAL ABI version and document the
change. Also, while we're breaking ABI, pack the members a little
better.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
build error:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:
In function ‘txa_service_queue_del’:
.../lib/librte_eventdev/rte_event_eth_tx_adapter.c:800:7:
error: ‘ret’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
compilation terminated due to -Wfatal-errors.
https://mails.dpdk.org/archives/test-report/2018-October/065919.html
'ret' may be used uninitialized when 'dev->data->nb_tx_queues' is 0;
although this is not a practical value, initialize 'ret' to cover
this case.
Fixes: a3bbf2e097 ("eventdev: add eth Tx adapter implementation")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, DPDK will skip mapping some areas (or even an entire BAR)
if the MSI-X table happens to be in them but is smaller than the page
size. Kernels 4.16+ allow mapping MSI-X BARs [1] and report this
as a capability flag. Capability flags themselves are only
supported since kernel 4.6 [2].
This commit introduces support for checking VFIO capabilities,
and uses it to check whether we are allowed to map BARs with MSI-X
tables in them, along with backwards compatibility for older
kernels, including a workaround for a variable rename in the VFIO
region info structure [3]. A sketch of the capability check follows
the references below.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=a32295c612c57990d17fb0f41e7134394b2f35f6
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=c84982adb23bcf3b99b79ca33527cd2625fbe279
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/
linux.git/commit/?id=ff63eb638d63b95e489f976428f1df01391e15e4
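A rough sketch of the capability check (illustrative only; it assumes
kernel 4.16+ UAPI headers defining VFIO_REGION_INFO_CAP_MSIX_MAPPABLE,
and omits the compatibility workarounds mentioned above):

#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* return 1 if the BAR with the MSI-X table may be mmap()ed, 0 if not */
static int
bar_msix_mappable(int dev_fd, unsigned int bar)
{
    struct vfio_region_info info = { .argsz = sizeof(info), .index = bar };
    struct vfio_region_info *reg;
    struct vfio_info_cap_header *hdr;
    uint32_t off;
    int ret = 0;

    if (ioctl(dev_fd, VFIO_DEVICE_GET_REGION_INFO, &info) < 0)
        return -1;
    if (!(info.flags & VFIO_REGION_INFO_FLAG_CAPS) ||
            info.argsz <= sizeof(info))
        return 0; /* kernel reports no capabilities for this region */

    /* capabilities did not fit into the fixed struct: re-query */
    reg = calloc(1, info.argsz);
    if (reg == NULL)
        return -1;
    reg->argsz = info.argsz;
    reg->index = bar;
    if (ioctl(dev_fd, VFIO_DEVICE_GET_REGION_INFO, reg) < 0) {
        free(reg);
        return -1;
    }
    /* walk the chain of capability headers appended after the struct */
    for (off = reg->cap_offset; off != 0; off = hdr->next) {
        hdr = (struct vfio_info_cap_header *)((char *)reg + off);
        if (hdr->id == VFIO_REGION_INFO_CAP_MSIX_MAPPABLE) {
            ret = 1;
            break;
        }
    }
    free(reg);
    return ret;
}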
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
When the NUMA-aware hugepages config option is set, we rely on
libnuma to tell the kernel to allocate hugepages on a specific
NUMA node. However, we allocate the node mask before we check
whether NUMA is available in the first place, which, according to
the manpage [1], causes undefined behaviour.
Fix by only using the nodemask when NUMA is available (see the
sketch after the reference below).
[1] https://linux.die.net/man/3/numa_alloc_onnode
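The shape of the fix, as an illustrative sketch (not the exact
eal_memory.c code), using the libnuma calls numa_available(),
numa_allocate_nodemask() and numa_bitmask_free():

#include <stdbool.h>
#include <numa.h>

static void
alloc_hugepages_numa_aware(void)
{
    struct bitmask *nodemask = NULL;
    bool have_numa = numa_available() != -1;

    if (have_numa)
        nodemask = numa_allocate_nodemask();

    /* ... allocate hugepages, binding to nodes only when have_numa ... */

    if (nodemask != NULL)
        numa_bitmask_free(nodemask);
}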
Bugzilla ID: 20
Fixes: 1b72605d24 ("mem: balanced allocation of hugepages")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Currently, the command-line switches for legacy mem mode and
single-file segments mode are only stored in the internal config.
This leads to a situation where these flags always have to match
between primary and secondary processes, which is bad for usability.
Fix this by storing these flags in the shared config as well, so
that a secondary process can know if the primary was launched in
single-file segments or legacy mem mode.
This bumps the EAL ABI; however, there is already an EAL deprecation
notice in place [1] for a different feature, so that's OK.
[1] http://patches.dpdk.org/patch/43502/
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Implement the operators of an rte_class for the
ethdev abstraction layer.
Register the layer as such.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This iterator can be customized with a comparison function that will
trigger a stopping condition.
It can be leveraged to write several different iterators that have
similar but non-identical purposes.
It is private to librte_ethdev.
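The pattern looks roughly like the sketch below (names and details
are hypothetical, not the actual librte_ethdev internals):

#include <stdint.h>
#include <rte_ethdev.h>

typedef int (*eth_cmp_t)(const struct rte_eth_dev *dev, const void *data);

/* return the first port id >= start for which cmp() reports a match */
static uint16_t
eth_find_device(uint16_t start, eth_cmp_t cmp, const void *data)
{
    uint16_t p;

    for (p = start; p < RTE_MAX_ETHPORTS; p++) {
        const struct rte_eth_dev *dev = &rte_eth_devices[p];

        if (dev->state != RTE_ETH_DEV_UNUSED && cmp(dev, data) == 0)
            return p;
    }
    return RTE_MAX_ETHPORTS; /* stopping condition: nothing matched */
}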
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
A long time ago, preallocation of memory for KNI was introduced in
commit 0c6bc8e. It was done because of the lack of ability to free
previously allocated memzones, which led to memzone exhaustion.
Currently, memzones can be freed, and this patch uses this ability
for dynamic KNI memory allocation.
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Make the Ethernet port ID passed into
rte_event_eth_rx_adapter_caps_get() 16-bit.
Also, update the event Rx adapter test to use 16-bit
Ethernet port IDs.
Fixes: c2189c907d ("eventdev: make ethdev port identifiers 16-bit")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This patch implements the Tx adapter APIs by invoking the
corresponding eventdev PMD callbacks and also provides
the common rte_service function based implementation when
the eventdev PMD support is absent.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
The caps API allows the application to query if the transmit
stage is implemented in the eventdev PMD or uses the common
rte_service function. The PMD callbacks support the
eventdev PMD implementation of the adapter.
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The ethernet Tx adapter abstracts the transmit stage of an
event driven packet processing application. The transmit
stage may be implemented with eventdev PMD support or use a
rte_service function implemented in the adapter. These APIs
provide a common configuration and control interface and
a transmit API for the eventdev PMD implementation.
The transmit port is specified using mbuf::port. The transmit
queue is specified using the rte_event_eth_tx_adapter_txq_set()
function.
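Intended usage from a worker loop, sketched below (adapter and port
setup omitted; IDs and the retry policy are placeholders):

#include <rte_event_eth_tx_adapter.h>

static void
tx_stage(uint8_t evdev_id, uint8_t ev_port, struct rte_event *ev,
         uint16_t nb_events, uint16_t txq)
{
    uint16_t i, sent;

    for (i = 0; i < nb_events; i++) {
        /* mbuf::port selects the Ethernet device, txq_set() the queue */
        rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, txq);
    }
    sent = rte_event_eth_tx_adapter_enqueue(evdev_id, ev_port,
                                            ev, nb_events);
    /* retry or drop the (nb_events - sent) events not accepted */
    (void)sent;
}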
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
This commit introduces a new function in the eventdev API,
which allows applications to read the number of unlink requests
in progress on a particular port of an eventdev instance.
This information allows applications to verify when no more packets
from a particular queue (or any queue) will arrive at a port.
The application could decide to stop polling, or put the core into
a sleep state if it wishes, as it is ensured that no new packets
will arrive at a particular port anymore if all queues are unlinked.
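For example, an application that has just unlinked all queues from a
port could wait for completion like this (sketch; assumes the
rte_event_port_unlinks_in_progress() prototype added by this commit):

#include <rte_eventdev.h>
#include <rte_pause.h>

static void
wait_for_unlinks(uint8_t dev_id, uint8_t port_id)
{
    /* negative return values (errors) also terminate the loop */
    while (rte_event_port_unlinks_in_progress(dev_id, port_id) > 0)
        rte_pause();
    /* no new events from the unlinked queues will be scheduled here */
}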
Suggested-by: Matias Elo <matias.elo@nokia.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Use RTE_MAX_ETHPORTS instead of rte_eth_dev_count_total()
when allocating the eth Rx adapter's per-eth-device data structure,
to account for hotplugged devices.
Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
When performing enqueue operations on the split and packed rings,
if the reserved buffer length from the descriptor table exceeds
65535, the length returned by fill_vec_buf_split/_packed()
overflows. This patch avoids this corner case.
Fixes: f689586bc0 ("vhost: shadow used ring update")
Fixes: fd68b4739d ("vhost: use buffer vectors in dequeue path")
Fixes: 2f3225a7d6 ("vhost: add vector filling support for packed ring")
Fixes: 37f5e79a27 ("vhost: add shadow used ring support for packed rings")
Fixes: a922401f35 ("vhost: add Rx support for packed ring")
Fixes: ae999ce49d ("vhost: add Tx support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Introduce vhost_message_handlers, which maps the message request
type to the message handler. Then replace the switch construct
with a map and call.
Failing vhost_user_set_features is fatal and all processing should
stop immediately and propagate the error to the upper layers. Change
the code accordingly to reflect that.
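The dispatch pattern, reduced to a self-contained sketch (types and
names are simplified stand-ins, not the real vhost-user code):

#include <stddef.h>

struct vhost_msg { int request; /* ... payload ... */ };

typedef int (*vhost_msg_handler_t)(void *dev, struct vhost_msg *msg);

enum { REQ_SET_FEATURES, REQ_SET_MEM_TABLE, REQ_MAX };

static int handle_set_features(void *dev, struct vhost_msg *msg)
{ (void)dev; (void)msg; return 0; }
static int handle_set_mem_table(void *dev, struct vhost_msg *msg)
{ (void)dev; (void)msg; return 0; }

/* one entry per request type; NULL means "not handled here" */
static vhost_msg_handler_t handlers[REQ_MAX] = {
    [REQ_SET_FEATURES]  = handle_set_features,
    [REQ_SET_MEM_TABLE] = handle_set_mem_table,
};

static int
dispatch(void *dev, struct vhost_msg *msg)
{
    if (msg->request < 0 || msg->request >= REQ_MAX ||
            handlers[msg->request] == NULL)
        return -1; /* unknown request */
    return handlers[msg->request](dev, msg); /* replaces the switch */
}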
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Each vhost-user message handling function will return an int result
which is described in the new enum vh_result: error, OK and reply.
All functions will now have two arguments: a virtio_net double
pointer and a VhostUserMsg pointer.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As the VhostUserMsg structure is reused to generate the reply, move
the update of the relevant fields into the respective message
handling functions.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Do not use the typedef version of struct VhostUserMsg. Also unify the
related parameter name.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The doxygen comment describing the rte_eth_dev_info structure
was separated from the structure itself, so move the comment
back to be with the structure.
Fixes: 7238e63bce ("ethdev: add support for device offload capabilities")
Cc: stable@dpdk.org
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This patch fixes how function exit is handled when errors occur
inside rte_eth_dev_create().
Fixes: e489007a41 ("ethdev: add generic create/destroy ethdev APIs")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The current Intel Tx prepare function does not properly handle the
case where only the IP checksum is requested, without requesting
any L4 checksum or TSO: the IP checksum is not properly reset to 0
and the output packet may contain an invalid IP checksum.
Fixes: 4fb7e803eb ("ethdev: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
When compiling on FreeBSD, lots of warnings/errors are thrown for
unused parameters. Fix these by marking the parameters as unused
in the code.
Fixes: 1009ba1704 ("mem: add internal API to get and set segment fd")
Fixes: 3a44687139 ("mem: allow querying offset into segment fd")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
A fragmented packet is supposed to live no longer than max_cycles,
but the library deletes an expired packet only occasionally, when it
scans a bucket to find an empty slot while adding a new packet.
Therefore a fragment might sit in the table forever.
Signed-off-by: Alex Kiselev <alex@therouter.net>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Rework the delete function and add additional
internal data structures to support incremental
LPM tree update rather than full tree rebuild.
Signed-off-by: Alex Kiselev <alex@therouter.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Rework the lpm6 rule subsystem and replace the current O(n) rules
algorithm with hash tables, which allow dealing with large (50k)
rule sets.
Signed-off-by: Alex Kiselev <alex@therouter.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Enable using memfd-created segments if supported by the system.
This will allow having real fds for pages, but without hugetlbfs
mounts, which will enable in-memory mode to be used with virtio.
The implementation mostly piggy-backs on the existing real-fd
code, except that we no longer need to unlink any files or track
per-page locks in single-file segments mode, because in-memory
mode does not support secondary processes anyway.
We move some checks from the EAL command-line parsing code to
memalloc, because it is now possible to use single-file segments
mode with in-memory mode, but only if memfd is supported.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In a few cases, a user may need to query the offset into the fd for
a particular memory segment (for example, to selectively map
pages). This commit adds a new API to do that.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Now that we can retrieve page fds internally, we can expose this
as an external API. This adds two flavors of the API: thread-safe
and non-thread-safe. Fix up the internal APIs to return the values
we need without modifying rte_errno internally if called from
within EAL.
We do not want calling code to accidentally close an internal fd, so
we make a duplicate of it before we return it to the user. The
caller is therefore responsible for closing this fd.
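Intended usage, per the semantics described above (sketch; assumes
the rte_memseg_get_fd() prototype from rte_memory.h, with a
thread-unsafe variant also available):

#include <unistd.h>
#include <rte_memory.h>

static int
share_segment_fd(const struct rte_memseg *ms)
{
    int fd = rte_memseg_get_fd(ms);

    if (fd < 0)
        return -1; /* rte_errno is set by the API */

    /* ... hand the fd over, e.g. to a vhost backend ... */

    close(fd); /* the fd is a duplicate, so the caller must close it */
    return 0;
}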
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Enable setting and retrieving segment fds internally.
For now, retrieving fds will not be used anywhere until we
get an external API, but it will be useful for things like
virtio, where we wish to share segment fds.
Setting segment fds will not be available as a public API
at this time, but internally it is needed for legacy mode,
because we are not allocating our hugepages in memalloc in
the legacy mode case, and we still need to store the fd.
Another user of the get-segment-fd API is the memseg info dump,
to show which pages use which fds.
Not supported on FreeBSD.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Previously, we were only tracking lock file fds in single-file
segments mode, but did not track fds in non-single-file mode
because we didn't need to (the mmap() call still kept the lock).
Now that we are going to expose these fds to the world, we need to
have access to them, so track them even in non-single-file
segments mode.
We don't need to close fds after mmap() because we're still
tracking them in an fd list. Also, for anonymous hugepages mode,
the fd will always be -1, so exit early on error.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Previously, we were only using lock lists to store per-page lock
fds, because we cannot use modern fcntl() file description locks to
lock parts of the page in single-file segments mode.
Now, we will be using this list to store either lock fds (along with
the memseg list fd) in single-file segments mode, or per-page fds
(with the memseg list fd set to -1), so rename the list accordingly.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Previously, when we allocated hugepages, we closed the fds
corresponding to them after we had done our mappings. Since we did
mmap(), we didn't actually lose the reference, but file descriptors
used for mmap() do not count against the fd limit. Since we are
going to store all of our fds, we will hit the fd limit much more
often when using smaller page sizes.
Fix this by raising the fd limit to the maximum unconditionally.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In-memory mode was never meant to support legacy mode, because we
cannot sort anonymous pages anyway.
Fixes: 72b49ff623 ("mem: support --in-memory mode")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In noshconf mode, no shared files are created, but we're still trying
to unlink them, resulting in detach/destroy failure even though it
should have succeeded. Fix it by exiting early in noshconf mode.
Fixes: 3ee2cde248 ("fbarray: support --no-shconf mode")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The strncpy function has long been deemed unsafe to use, with
strlcpy or snprintf preferred instead.
While snprintf is standard and strlcpy is still largely available,
they both have issues regarding error checking and performance.
Both will force reading the source buffer past the requested size
if the input is not a proper C string, and both return the expected
number of bytes copied, meaning that error checking needs to verify
that the number of bytes copied is not greater than the destination
size.
This contributes to awkward code flow, unclear error checking and
potential issues with malformed input.
The function strscpy has been discussed for some time already and
has been made available in the Linux kernel [1].
Propose this new function as a safe alternative (usage sketched
below).
[1]: http://git.kernel.org/linus/30c44659f4a3
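A usage sketch of the proposed calling convention (assuming the
rte_strscpy() prototype from rte_string_fns.h): a single negative
return value signals truncation, so no length comparison is needed.

#include <sys/types.h>
#include <rte_string_fns.h>

static int
copy_name(char *dst, size_t dst_size, const char *src)
{
    ssize_t len = rte_strscpy(dst, src, dst_size);

    if (len < 0)
        return -1; /* -E2BIG: src did not fit into dst */
    return 0;      /* len bytes copied, dst is NUL-terminated */
}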
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Juhamatti Kuusisaari <juhamatti.kuusisaari@coriant.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
__rte_mbuf_raw_free and __rte_pktmbuf_prefree_seg have been
deprecated for a long time now (since early 17.05), are not part of
the ABI, and are easily replaced with the existing API.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
The pdump library now uses the generic multi-process channel
and no longer depends on pthreads, so remove
the dependency from the Makefile.
Fixes: 660098d61f ("pdump: use generic multi-process channel")
Cc: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
After running meson to configure a DPDK build, it can be useful to know
what was automatically enabled or disabled. Therefore, print out by way of
summary a categorised list of libraries and drivers to be built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
EAL is a standard dependency of all libraries, except for those built
before it. We can therefore simplify the logic by just checking if EAL
has been processed, and make it a standard dependency if so.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This has been only build-tested for now, on a native ppc64el POWER8E
machine running Debian sid.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
They are built by the legacy makefiles but not by Meson.
Fixes: 8f40ee0734 ("eal/x86: get hypervisor name")
Cc: stable@dpdk.org
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Remove the DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, the default behavior of PMDs
is to strip the CRC.
PMDs that support keeping the CRC should advertise the
DEV_RX_OFFLOAD_KEEP_CRC Rx offload capability.
Applications that require keeping the CRC should check the PMD
capability first and, if it is supported, enable this feature by
setting DEV_RX_OFFLOAD_KEEP_CRC in the Rx offload flags in
rte_eth_dev_configure().
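For example, an application that needs the CRC kept could do the
following (sketch; queue counts and configuration are placeholders):

#include <rte_ethdev.h>

static int
configure_keep_crc(uint16_t port_id, struct rte_eth_conf *conf)
{
    struct rte_eth_dev_info dev_info;

    rte_eth_dev_info_get(port_id, &dev_info);
    if ((dev_info.rx_offload_capa & DEV_RX_OFFLOAD_KEEP_CRC) == 0)
        return -1; /* PMD strips the CRC unconditionally */

    conf->rxmode.offloads |= DEV_RX_OFFLOAD_KEEP_CRC;
    return rte_eth_dev_configure(port_id, 1, 1, conf);
}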
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Patch 5355f443 added two definitions of DEV_TX_OFFLOAD_xxx.
If new Tx offload capabilities are defined, they must also be
mentioned in rte_tx_offload_names in the rte_ethdev.c file.
This patch adds the required lines to the rte_tx_offload_names
array.
Fixes: 5355f4439e ("ethdev: introduce generic IP/UDP tunnel checksum and TSO")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
There are a lot of cases where vhost-user message handling
could fail and end up in a fully unrecoverable state. For
example, allocation failures of the shadow used ring and batched
copy array are not recoverable and lead to segmentation
faults like this on the receiving/transmission path:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f913fecf0 (LWP 43625)]
in copy_desc_to_mbuf () at /lib/librte_vhost/virtio_net.c:760
760 batch_copy[vq->batch_copy_nb_elems].dst =
This can easily be reproduced in case of low memory or a big
number of vhost-user ports.
Fix that by propagating the error to the upper layer, which will
end up disconnecting in case we cannot report to
the message sender when the error happens.
Fixes: f689586bc0 ("vhost: shadow used ring update")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When VIRTIO_RING_F_EVENT_IDX is negotiated, we need to
update the avail event to enable the notification.
Fixes: 3f8ff12821 ("vhost: support interrupt mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
'numa_realloc()' allocates 'zmbufs' even if zero copy mode
is not configured. This leads to a memory leak, because the array
is freed only in the zero copy case.
Fixes: 2651726def ("vhost: do deep copy while reallocating queue")
CC: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The rte_eth_dev_owner_unset function always generates a log
message because the unset value for owner id is 0.
Also, when rte_eth_dev_owner_delete is called with a valid
owner id, the log message should be at NOTICE not ERROR
severity.
Fixes: 5b7ba31148 ("ethdev: add port ownership")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matan Azrad <matan@mellanox.com>
Current code assumes a MAC change can occur when the port has been
started. In fact, there are some NICs which require this port state
for the change to be successful, but other NICs do not always
support MAC change in that case.
This patch adds a new device flag for a device advertising this
limitation, and if the flag is set, the MAC is changed before the
port starts.
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
It's a common case that 'get_mempolicy' fails on systems
without NUMA support. No need to flag an error in the log for
this situation.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The patch changes the rx_burst profiling approach:
1. VTune's instrumentation is removed
2. an empty hook callback for profiling is added
This way, all VTune-specific logic moves to the VTune side.
The hook is enabled only when the CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE
option is turned on. VTune uses this hook to attach to the polling
cycle. It is not possible to attach to rx_burst directly, as it is
inline.
Signed-off-by: Ilia Kurakin <ilia.kurakin@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
If the user specifies priority=0 for some ACL rules,
it can cause rte_acl_classify to return wrong results.
The reason is that priority zero is used internally for no-match nodes.
See more details at: https://bugs.dpdk.org/show_bug.cgi?id=79.
The simplest way to overcome the issue is simply not to allow zero
as a valid rule priority.
Fixes: dc276b5780 ("acl: new library")
Cc: stable@dpdk.org
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
We need to do the NULL pointer check first, right after malloc().
Fixes: 07dcbfe010 ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Start version numbering for a new release cycle,
and introduce a template file for release notes.
The release notes comments have a new block to suggest
the order of items, inspired by Ferruh's proposal.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: John McNamara <john.mcnamara@intel.com>
Describe the thread-safety support more accurately in the
API documentation.
Fixes: f2e3001b53 ("hash: support read/write concurrency")
Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch fixes a doxygen comment of the rte_eth_dev_allocate()
method. There is no parameter named "type" for this
method, so this patch removes the doxygen comment about it.
Fixes: 6751f6deb7 ("ethdev: get rid of device type")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix a segmentation fault which occurs when the kni_autotest is run
in the 'test' application.
This segmentation fault occurs when rte_kni_get() is called with a
NULL value for 'name'.
Fixes: 0c6bc8ef70 ("kni: memzone pool for alloc and release")
Cc: stable@dpdk.org
Signed-off-by: Dan Gora <dg@adax.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
rte_hash_lookup_data() and rte_hash_lookup_with_hash_data()
return the index of the table where the key is stored
when the key is found, and not 0 as the doxygen currently states.
Also, these functions, and rte_hash_get_key_with_position(),
return negative values when keys are not found (-EINVAL and -ENOENT),
where the minus sign was missing.
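A sketch matching the corrected documentation: a non-negative return
value is the key's index in the table, negative values are errors.

#include <rte_hash.h>

static void *
find_entry(const struct rte_hash *h, const void *key)
{
    void *data;
    int ret = rte_hash_lookup_data(h, key, &data);

    if (ret < 0)
        return NULL; /* -EINVAL or -ENOENT: bad params / not found */
    return data;     /* ret is the table index where the key sits */
}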
Bugzilla ID: 78
Fixes: 473d1bebce ("hash: allow to store data in hash table")
Fixes: 6dc34e0afe ("hash: retrieve a key given its position")
Cc: stable@dpdk.org
Reported-by: Petr Houska <t-pehous@microsoft.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The old offload API is removed in 18.08,
so the library version must be increased
in order to show the incompatibility with the 18.05 one.
Fixes: ab3ce1e0c1 ("ethdev: remove old offload API")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
As per guideline that new APIs must be experimental
for at least one release, it is now possible to remove
the experimental tag from:
rte_meter_srtcm_profile_config()
rte_meter_trtcm_profile_config()
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Hotplug functions should be used directly to add and remove devices.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
These functions have been buggy from the very beginning and should not be used.
Generic EAL hotplug mechanisms should be used instead.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
rte_mempool_calc_mem_size_helper() was introduced to avoid
code duplication and was used in the deprecated
rte_mempool_xmem_size() and in rte_mempool_op_calc_mem_size_default().
Now the first one is removed, and it is better to fold the helper
into the second one to make it more readable.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Functions rte_mempool_populate_phys(), rte_mempool_virt2phy() and
rte_mempool_populate_phys_tab() are just wrappers for corresponding
IOVA functions and were deprecated in v17.11.
Functions rte_mempool_xmem_create(), rte_mempool_xmem_size(),
rte_mempool_xmem_usage() and rte_mempool_populate_iova_tab() were
deprecated in v18.05 and removal was announced earlier in v18.02.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
n_bits comes as the first argument, so align the doxygen comment.
n_bits does not need to be a multiple of 512, as n_bits
is rounded up to RTE_BITMAP_CL_BIT_SIZE.
Fixes: 14456f59e9 ("doc: fix doxygen warnings in QoS API")
Fixes: de3cfa2c98 ("sched: initial import")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
rte_ring implementation is not preemptible only under certain
circumstances. This clarification is helpful for data plane and
control plane communication using rte_ring.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Currently, a mempool can be created even if the number of
objects is zero. However, in this scenario,
rte_mempool_create should return NULL,
as such a mempool is useless.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Change the log level of messages from ERR to INFO where
the post-condition of the API is success, but no action
was actually needed as the condition already existed,
e.g. calling rte_eth_dev_start() for a device that is
already started.
Fixes: bea1e0c70c ("ethdev: convert static log type usage to dynamic")
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This commit fixes a bug in a 32-bit environment where the
generic ring_init() would fail, but given the interaction
with memzones, the next iteration of the event_ring_autotest
would actually *pass* because the ring in question would
already exist and be looked up.
This commit correctly error-checks the result of ring_init(),
and calls rte_free() on the memory as required.
Fixes: dc39e2f359 ("eventdev: add ring structure for events")
Cc: stable@dpdk.org
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
IOTLB entries contain the host virtual address of the guest
pages. When receiving a new VHOST_USER_SET_MEM_TABLE request,
the previous regions get unmapped, so the IOTLB entries, if any,
will be invalid. This can cause the vhost-user process to
segfault.
This patch introduces a new function to flush the IOTLB cache,
and calls it as soon as the backend handles a
VHOST_USER_SET_MEM_TABLE request.
Fixes: 69c90e98f4 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Changing ownership of a port is a normal event, and should
not be logged at ERR priority. Downgrade to a DEBUG message.
Fixes: bea1e0c70c ("ethdev: convert static log type usage to dynamic")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matan Azrad <matan@mellanox.com>
Rawdev queue count API prototype was declared, but the definition was
missing from the library. This patch implements the function.
This API is used to query the device about the count of queues it has
been configured with.
Fixes: c88b3f2558 ("rawdev: introduce raw device library")
Cc: stable@dpdk.org
Suggested-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Free up the memzone allocated during
rte_latencystats_init().
Fixes: 5cd3cac9ed ("latency: added new library for latency stats")
CC: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Currently, the nic_uio driver does not support interrupts, so any
attempt to install an interrupt handler will fail with a
"not supported" error, which causes an error message that is
confusing to the user.
Silence this error by moving it to the debug log level, and reword
the message so that it does not contain the word "Error", to avoid
triggering DTS test failures [1].
[1] https://git.dpdk.org/tools/dts/tree/tests/TestSuite_scatter.py?#n110
Fixes: 23150bd8d8 ("eal/bsd: add interrupt thread")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
The forward declaration of rte_pci_device in rte_ethdev.h
is not needed anymore.
Fixes: cd8c7c7ce2 ("ethdev: replace bus specific struct with generic dev")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
The node parent update API function may be used to update the
priority/weight of an existing node. Update the documentation to
indicate that this use case is supported.
Signed-off-by: Ben Shelton <benjamin.h.shelton@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
This patch adds a sanity check so that names passed into
rte_metrics_reg_names() and the wrapper rte_metrics_reg_name()
cannot be NULL.
Fixes: 349950ddb9 ("metrics: add information metrics library")
Cc: stable@dpdk.org
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
If rte_metrics_init() has not been called, and hence the internal
metric storage is not allocated, rte_metrics_get_values() and
rte_metrics_get_names() would silently fail by returning zero
(i.e. no metrics registered). This patch changes the result of
this scenario to an explicit failure by returning -EIO.
Fixes: 349950ddb9 ("metrics: add information metrics library")
Cc: stable@dpdk.org
Signed-off-by: Remy Horton <remy.horton@intel.com>