errno_autotest testcase were failed since
commit 5d7b673d5fd6 ("mk: build with _GNU_SOURCE defined by default")
RTE>>errno_autotest
rte_strerror: 'Unknown error 11',
strerror: 'Resource temporarily unavailable'
Test Failed
There are two different version of strerror_t() based on
_GNU_SOURCE definition.
/* XSI-compliant */
int strerror_r(int errnum, char *buf, size_t buflen);
/* GNU-specific */
char *strerror_r(int errnum, char *buf, size_t buflen);
Since the GNU-specific version returns char* the exiting "if"
condition around the strerror_r fails.
Switching back to XSI-compliant version to allow
a) Portable strerror_r() usage as musl c library uses
non GNU speficic version
https://git.musl-libc.org/cgit/musl/tree/src/string/strerror_r.c
b) Based on strerror_r(3) man page, it is possible that GNU-specific
version need not use char *buf to fill error message instead it
can use the immutable static string from the library and return it.
note from strerror_r(3) man page:
The GNU-specific strerror_r() returns a pointer to a string containing
the error message. This may be either a pointer to a string that the
function stores in buf, or a pointer to some (immutable)
static string (in which case buf is unused).
Fixes: 5d7b673d5fd6 ("mk: build with _GNU_SOURCE defined by default")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
If a device is unplugged while an interrupt is pending, the
read call to the uio device to remove it from the poll wait list
can fail resulting in it being continually polled forever. This
change checks for the read failing and if so, unregisters the device
as an interrupt source and causes the wait list to be rebuilt.
This race has been reported and observed in production.
Fixes: 0a45657a6794 ("pci: rework interrupt handling")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
rte_mp_request_sync() says that the caller is responsible
for freeing one of its parameters afterwards. EAL didn't
do that, causing a memory leak.
Fixes: 244d5130719c ("eal: enable hotplug on multi-process")
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Some global variables can be eliminated, since they are not part of
public interface, it is free to remove them.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Some global variables can indeed be static, add static keyword to them.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
So far each process in MP used to have a separate container
and relied on the primary process to register all memsegs.
Mapping external memory via rte_vfio_container_dma_map()
in secondary processes was broken, because the default
(process-local) container had no groups bound. There was
even no way to bind any groups to it, because the container
fd was deeply encapsulated within EAL.
This patch introduces a new SOCKET_REQ_DEFAULT_CONTAINER
message type for MP synchronization, makes all processes
within a MP party use a single default container, and hence
fixes rte_vfio_container_dma_map() for secondary processes.
From what I checked this behavior was always the same, but
started to be invalid/insufficient once mapping external
memory was allowed.
While here, fix up the comment on rte_vfio_get_container_fd().
This function always opens a new container, never reuses
an old one.
Fixes: 73a639085938 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
We were reading some memory just after freeing it.
Fixes: 83a73c5fef66 ("vfio: use generic multi-process channel")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Always attempt to find already opened fd for an iommu
group as subsequent attempts to open it will fail.
There's no public API to check if a group was already
bound and has a container, so rte_vfio_container_group_bind()
shouldn't fail in such case.
Fixes: ea2dc1066870 ("vfio: add multi container support")
Cc: stable@dpdk.org
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch uses EAL option "--iova-mode" to force the IOVA mode to a
particular value. There exists virtual devices that are not directly
attached to the PCI bus, and therefore the auto detection of the IOVA
mode based on probing the PCI bus and IOMMU configuration may not
report the required addressing mode. Using the EAL option permits the
mode to be explicitly configured in this scenario.
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Marko Kovacevic <marko.kovacevic@intel.com>
In the case of user don't want to use bus iova scheme and want
to override.
For that, adding EAL option --iova-mode=<string> where valid input
string is 'pa' or 'va'.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Commit 73a639085938 ("vfio: allow to map other memory regions")
introduced a bug in sPAPR IOMMU mapping. The commit removed necessary
ioctl with VFIO_IOMMU_SPAPR_REGISTER_MEMORY. Also, vfio_spapr_map_walk
should call vfio_spapr_dma_do_map instead of vfio_spapr_dma_mem_map.
Fixes: 73a639085938 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
Linux kernel uses a really high address as starting address for
serving mmaps calls. If there exist addressing limitations and
IOVA mode is VA, this starting address is likely too high for
those devices. However, it is possible to use a lower address in
the process virtual address space as with 64 bits there is a lot
of available space.
This patch adds an address hint as starting address for 64 bits
systems and increments the hint for next invocations. If the mmap
call does not use the hint address, repeat the mmap call using
the hint address incremented by page size.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
A device can suffer addressing limitations. This function checks
memsegs have iovas within the supported range based on dma mask.
PMDs should use this function during initialization if device
suffers addressing limitations, returning an error if this function
returns memsegs out of range.
Another usage is for emulated IOMMU hardware with addressing
limitations.
It is necessary to save the most restricted dma mask for checking out
memory allocated dynamically after initialization.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
RTE_MEMZONE_SIZE_HINT_ONLY wasn't checked in any way,
causing size hints to be parsed as hard requirements.
This resulted in some allocations being failed prematurely.
Fixes: 68b6092bd3c7 ("malloc: allow reserving biggest element")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch is used to fix the memory leak issue of logid.
We use the ASAN test in SPDK when integrating DPDK and
find this memory leak issue.
Fixes: d8a2bc71dfc2 ("log: remove app path from syslog id")
Cc: stable@dpdk.org
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
in struct ip_frag_key,src_dst[] type is uint64_t.
but "val" which to store the calc restult ,type is uint32_t.
we may lost high 32 bit key. and function return value is int,
but it won't return < 0.
Signed-off-by: Li Han <han.li1@zte.com.cn>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As part of the effort of consolidating the DPDK installation bits and
pieces across distros, set the default directory of lib/ where PMDs get
installed to dpdk/pmds-XX.YY. It's necessary to have a versioned
subdirectory as multiple ABI revisions might be installed at the same
time, so having a fixed name will cause trouble with the autoload
feature.
Small refactor with parsing and saving the major version to a variable,
since it's now used in 3 different places.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Timothy Redaelli <tredaelli@redhat.com>
This patch adds telemetry as a dependecy to all applications. Without these
changes, the --telemetry flag will not be recognised and applications will
fail to run if they want to enable telemetry.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to enable/disable the selftest.
This functionality will be extended in future to make the
enabling/disabling more dynamic and remove this 'hardcoded' approach. We
are temporarily using this approach due to the design changes (vdev vs eal)
made to the library.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to create a JSON message in
order to send it to a client socket.
When stats are requested by a client, they are retrieved from
the metrics library and encoded in JSON format.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to update the statistics in
the metrics library with values from the ethdev stats.
Values need to be updated before they are encoded into a JSON
message and sent to the client that requested them. The JSON encoding
will be added in a subsequent patch.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the parser file. This is used to parse any
messages that are received on any of the client sockets.
Currently, the unregister functionality works using the parser.
Functionality relating to getting statistic values for certain ports
will be added in a subsequent patch, however the parsing involved
for that command is added in this patch.
Some of the parser code included is in preparation for future
functionality, that is not implemented yet in this patchset.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch introduces clients to the telemetry API.
When a client makes a connection through the initial telemetry
socket, they can send a message through the socket to be
parsed. Register messages are expected through this socket, to
enable clients to register and have a client socket setup for
future communications.
A TAILQ is used to store all clients information. Using this, the
client sockets are polled for messages, which will later be parsed
and dealt with accordingly.
Functionality that make use of the client sockets were introduced
in this patch also, such as writing to client sockets, and sending
error responses.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the telemetry UNIX socket. It is used to
allow connections from external clients.
On the initial connection from a client, ethdev stats are
registered in the metrics library, to allow for their retrieval
at a later stage.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the infrastructure and initial code for the telemetry
library.
The telemetry init is registered with eal_init(). We can then check to see
if --telemetry was passed as an eal option. If --telemetry was parsed, then
we call telemetry init at the end of eal init.
Control threads are used to get CPU cycles for telemetry, which are
configured in this patch also.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch makes the eal_get_runtime_dir() API public so it can be used
from outside EAL.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit adds infrastructure to EAL that allows an application to
register it's init function with EAL. This allows libraries to be
initialized at the end of EAL init.
This infrastructure allows libraries that depend on EAL to be initialized
as part of EAL init, removing circular dependency issues.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
The offload name functions are useful, but since they are
marked experimental they can not be used by upstream projects.
For example, VPP duplicates the same table in its code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add a new rte_delay_us_sleep() function that uses nanosleep().
This function can be used by applications to not implement
their own nanosleep() based callback and by internal DPDK
code if CPU non-blocking delay needed.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The iterator was matching all representors if it was not specified
in the devargs string. It was a wrong default behaviour.
If there is no representor parameter in the devargs, the iterator
should not match any representor port.
The implementation of the default behaviour would be simpler
if a "no match" handler is added to rte_kvargs_process().
As it requires an API breakage, it will be reworked later.
Fixes: a7d3c6271d55 ("ethdev: support representor id as iterator filter")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
If a port is being created and rollbacked because of an error,
the event RTE_ETH_EVENT_DESTROY should not be sent.
It makes no sense to receive a destroy event for a port which
was not yet announced via RTE_ETH_EVENT_NEW.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add the Mpls header structure in librte_net. It will be used by next
patch that adds the support of Mpls L2 layer in the software packet
type parser.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Currently, postcopy_ufd is initialized to 0 implicitly, so fd 0
could be closed unexpectedly by vhost_backend_cleanup(). Fix this
issue by initializing postcopy_ufd to -1 explicitly.
Fixes: 9eefef3b5970 ("vhost: introduce postcopy advise message")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In both split and packed dequeue paths, flush_shadow_used_ring
and vhost_ring_call variants gets called even if not packets
have been dequeued, and so no descriptors updates happened.
It has an impact on CPU pipeline, as memory barriers are used
in these functions.
This patch don't call these functions if no descriptors have
been dequeued. The performance gain with split ring when
dequeue zero-copy is disabled should be null, but should be
noticeable with packed ring or dequeue zero-copy enabled.
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Fixes: 915cf9404225 ("vhost: use shadow used ring in dequeue path")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Tested-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The MAC addresses of a port can be matched with devargs.
As the conflict between rte_ether.h and netinet/ether.h is not resolved,
the MAC parsing is done with a rte_cmdline function.
As a result, cmdline library becomes a dependency of ethdev.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
The representor id is added in rte_eth_dev_data in order to be able
to match a port with its representor id in devargs.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The functions for representor devargs parsing were static
in the file rte_ethdev.c.
In order to reuse them in the file rte_class_eth.c,
they are moved to the files ethdev_private.c/.h.
A log is fixed by adding a missing line feed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
If a value contains a comma, rte_kvargs_tokenize() will split here.
In order to support list syntax [a,b] as value, an extra parsing of
the square brackets is added.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
These hotplug functions were deprecated and have some new replacements.
As announced earlier, the oldest ones are now removed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The hotplug attach/detach features are implemented in EAL layer.
There is a new ethdev iterator to retrieve ports from ethdev layer.
As announced earlier, the (buggy) ethdev functions are now removed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
If no rte_device is given in the iterator,
eth_dev_match() is looking at all ports without any restriction,
except the ethdev kvargs filter.
It allows to iterate with a devargs filter referencing only
some ethdev parameters. The format (from the new devargs syntax) is:
class=eth,paramY=Y
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
The iterator will return the ethdev port ids matching a devargs string.
It is recommended to use the macro RTE_ETH_FOREACH_MATCHING_DEV()
for usage convenience.
The class string is prefixed with '+' in order to skip the validation
of the parameter keys. It is tolerated for the compatibility with
the old (current) syntax where all parameters (bus, class and driver)
are mixed in the same string without any delimiter.
Thanks to this compatibility prefix, the driver parameters will be
skipped during the ethdev parsing, and not considered invalid.
A macro is introduced in rte_common.h to workaround a const field.
This hack is needed to free const strings in the iterator.
It is preferred to keep the const for these fields, because it gives
a hint that they are not changed at each iteration.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
We should return the length of the buffers described by
the current descriptor chain after filling the buffer
vector. So we need to zero the *len first.
Fixes: 2f3225a7d69b ("vhost: add vector filling support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currently it's not possible to build DPDK as shared library with
cryptodev disabled since vhost is trying to link with rte_crypto,
but rte_crypto and rte_hash are only needed when you build vhost_crypto
and so only when cryptodev is enabled.
This patch fix this by linking rte_vhost with rte_crypto and rte_hash
only when cryptodev is enabled.
Fixes: b4ca81298613 ("vhost/crypto: fix build without cryptodev")
Fixes: 939066d96563 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currenlty the encap/decap actions only support encapsulation
of VXLAN and NVGRE L2 packets (L2 encapsulation is where
the inner packet has a valid Ethernet header, while L3 encapsulation
is where the inner packet doesn't have the Ethernet header).
In addtion the parameter to to the encap action is a list of rte items,
this results in 2 extra translation, between the application to the
actioni and from the action to the NIC. This results in negative impact
on the insertion performance.
Looking forward there are going to be a need to support many more tunnel
encapsulations. For example MPLSoGRE, MPLSoUDP.
Adding the new encapsulation will result in duplication of code.
For example the code for handling NVGRE and VXLAN are exactly the same,
and each new tunnel will have the same exact structure.
This patch introduce a raw encapsulation that can support L2 tunnel types
and L3 tunnel types. In addtion the new
encapsulations commands are using raw buffer inorder to save the
converstion time, both for the application and the PMD.
In order to encapsulate L3 tunnel type there is a need to use both
actions in the same rule: The decap to remove the L2 of the original
packet, and then encap command to encapsulate the packet with the
tunnel.
For decap L3 there is also a need to use both commands in the same flow
first the decap command to remove the outer tunnel header and then encap
to add the L2 header.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>