This function is documented to return the number of unregistered
callbacks or negative numbers on error, but pci_vfio checks for
ret != 0 to detect failures. Not anymore.
Fixes: c115fd000c ("vfio: handle hotplug request notifier")
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
So far each process in MP used to have a separate container
and relied on the primary process to register all memsegs.
Mapping external memory via rte_vfio_container_dma_map()
in secondary processes was broken, because the default
(process-local) container had no groups bound. There was
even no way to bind any groups to it, because the container
fd was deeply encapsulated within EAL.
This patch introduces a new SOCKET_REQ_DEFAULT_CONTAINER
message type for MP synchronization, makes all processes
within a MP party use a single default container, and hence
fixes rte_vfio_container_dma_map() for secondary processes.
From what I checked this behavior was always the same, but
started to be invalid/insufficient once mapping external
memory was allowed.
While here, fix up the comment on rte_vfio_get_container_fd().
This function always opens a new container, never reuses
an old one.
Fixes: 73a6390859 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
We were reading some memory just after freeing it.
Fixes: 83a73c5fef ("vfio: use generic multi-process channel")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Always attempt to find already opened fd for an iommu
group as subsequent attempts to open it will fail.
There's no public API to check if a group was already
bound and has a container, so rte_vfio_container_group_bind()
shouldn't fail in such case.
Fixes: ea2dc10668 ("vfio: add multi container support")
Cc: stable@dpdk.org
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Invoking the right pci read/write functions is based on interrupt
handler type. However, this is not configured for secondary processes
precluding to use those functions.
This patch fixes the issue using the driver name the device is bound
to instead.
Fixes: 632b2d1dee ("eal: provide functions to access PCI config")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
In virtio_read_caps and vtpci_msix_detect, rte_pci_read_config returns
the number of bytes read from PCI config or < 0 on error.
If less than the expected number of bytes are read then log the
failure and return rather than carrying on with garbage.
Fixes: 6ba1f63b5a ("virtio: support specification 1.0")
Cc: stable@dpdk.org
Signed-off-by: Brian Russell <brussell@brocade.com>
Signed-off-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
On Linux, rte_pci_read_config on success returns the number of read
bytes, but on BSD it returns 0.
Document the return values, and have BSD behave as Linux does.
At least one case (bnx2x PMD) treats 0 as an error, so the change
makes sense also for that.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch uses EAL option "--iova-mode" to force the IOVA mode to a
particular value. There exists virtual devices that are not directly
attached to the PCI bus, and therefore the auto detection of the IOVA
mode based on probing the PCI bus and IOMMU configuration may not
report the required addressing mode. Using the EAL option permits the
mode to be explicitly configured in this scenario.
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Marko Kovacevic <marko.kovacevic@intel.com>
In the case of user don't want to use bus iova scheme and want
to override.
For that, adding EAL option --iova-mode=<string> where valid input
string is 'pa' or 'va'.
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Eric Zhang <eric.zhang@windriver.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Commit 73a6390859 ("vfio: allow to map other memory regions")
introduced a bug in sPAPR IOMMU mapping. The commit removed necessary
ioctl with VFIO_IOMMU_SPAPR_REGISTER_MEMORY. Also, vfio_spapr_map_walk
should call vfio_spapr_dma_do_map instead of vfio_spapr_dma_mem_map.
Fixes: 73a6390859 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org
Signed-off-by: Takeshi Yoshimura <tyos@jp.ibm.com>
NFP can handle IOVA as VA. It requires to check those IOVAs
being in the supported range what is done during initialization.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
NFP devices can not handle DMA addresses requiring more than
40 bits. This patch uses rte_dev_check_dma_mask with 40 bits
and avoids device initialization if memory out of NFP range.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Currently the code precludes IOVA mode if IOMMU hardware reports
less addressing bits than necessary for full virtual memory range.
Although VT-d emulation currently only supports 39 bits, it could
be iovas for allocated memlory being within that supported range.
This patch allows IOVA mode in such a case adding a call to
rte_eal_check_dma_mask using the reported addressing bits by the
IOMMU hardware.
Indeed, memory initialization code has been modified for using lower
virtual addresses than those used by the kernel for 64 bits processes
by default, and therefore memsegs iovas can use 39 bits or less for
most systems. And this is likely 100% true for VMs.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Current code checks if IOMMU hardware reports enough addressing
bits for using IOVA mode but it repeats the same check for any
PCI device present. This is not necessary because the IOMMU hardware
is the same for all of them.
This patch only checks the IOMMU using first PCI device found.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Linux kernel uses a really high address as starting address for
serving mmaps calls. If there exist addressing limitations and
IOVA mode is VA, this starting address is likely too high for
those devices. However, it is possible to use a lower address in
the process virtual address space as with 64 bits there is a lot
of available space.
This patch adds an address hint as starting address for 64 bits
systems and increments the hint for next invocations. If the mmap
call does not use the hint address, repeat the mmap call using
the hint address incremented by page size.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
A device can suffer addressing limitations. This function checks
memsegs have iovas within the supported range based on dma mask.
PMDs should use this function during initialization if device
suffers addressing limitations, returning an error if this function
returns memsegs out of range.
Another usage is for emulated IOMMU hardware with addressing
limitations.
It is necessary to save the most restricted dma mask for checking out
memory allocated dynamically after initialization.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
It's not necessary to insert device argment to devargs_list
during bus scan, but this happens when we try to attach a
device on secondary process. The patch fix the issue.
Fixes: cdb068f031 ("bus/vdev: scan by multi-process channel")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
RTE_MEMZONE_SIZE_HINT_ONLY wasn't checked in any way,
causing size hints to be parsed as hard requirements.
This resulted in some allocations being failed prematurely.
Fixes: 68b6092bd3 ("malloc: allow reserving biggest element")
Cc: stable@dpdk.org
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
This patch is used to fix the memory leak issue of logid.
We use the ASAN test in SPDK when integrating DPDK and
find this memory leak issue.
Fixes: d8a2bc71df ("log: remove app path from syslog id")
Cc: stable@dpdk.org
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
in struct ip_frag_key,src_dst[] type is uint64_t.
but "val" which to store the calc restult ,type is uint32_t.
we may lost high 32 bit key. and function return value is int,
but it won't return < 0.
Signed-off-by: Li Han <han.li1@zte.com.cn>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Allow users and packagers to override the default dpdk/drivers
subdirectory where the PMDs get installed under $lib.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Timothy Redaelli <tredaelli@redhat.com>
As part of the effort of consolidating the DPDK installation bits and
pieces across distros, set the default directory of lib/ where PMDs get
installed to dpdk/pmds-XX.YY. It's necessary to have a versioned
subdirectory as multiple ABI revisions might be installed at the same
time, so having a fixed name will cause trouble with the autoload
feature.
Small refactor with parsing and saving the major version to a variable,
since it's now used in 3 different places.
Signed-off-by: Luca Boccassi <bluca@debian.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Timothy Redaelli <tredaelli@redhat.com>
Disabled vdev_netvsc build in FreeBSD because it is not supported.
Added changes to enable vdev_netvsc build if it is Linux OS and
disable in FreeBSD.
Fixes: 9fc43dbfd6 ("net/vdev_netvsc: add in meson build")
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Stephen Hemminger <sthemmin@microsoft.com>
Disabled tap build in FreeBSD because it is not supported
Added changes to enable tap build if it is Linux OS and
disable in FreeBSD.
Fixes: 095cae3668 ("net/tap: add in meson build")
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Disabled softnic build in FreeBSD because it is not supported
Added changes to enable softnic build if it is Linux OS and
disable in FreeBSD.
Fixes: 6b2a3900e2 ("net/softnic: add to meson build")
Cc: stable@dpdk.org
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Disabled avp build in FreeBSD because it is not supported.
Added changes to enable avp build if it is Linux OS and
disable in FreeBSD.
Fixes: ed71204dd0 ("net/avp: add to meson build")
Cc: stable@dpdk.org
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Disabled nfp build in FreeBSD because it is not supported
Added changes to enable NFP build if it is Linux OS and
disable in FreeBSD.
Fixes: d9b9ca7e05 ("net/nfp: add to meson build")
Cc: stable@dpdk.org
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
FreeBSD compilation was failing through meson build.
RTE_EAL_VFIO is not supported in FreeBSD.
But RTE_EAL_VFIO was enabled for both linux and freebsd.
So RTE_EAL_VFIO is removed from config/rte_config.h and
based on the platform RTE_EAL_VFIO flag is enabled/disabled appropriately.
Fixes: 844514c735 ("eal: build with meson")
Cc: stable@dpdk.org
Signed-off-by: Agalya Babu RadhaKrishnan <agalyax.babu.radhakrishnan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
build error:
== Build drivers/net/tap
mktemp: cannot create temp file /tmp/dpdk.auto-config-h.sh.XXX.c:
Invalid argument
.../buildtools/auto-config-h.sh: line 86: : No such file or directory
.../drivers/net/tap/Makefile:55: recipe for target
'tap_autoconf.h.new' failed
Above error observed on Wind River Linux 8.0
`mktemp` command in that system has a restrictions to have X in
the template at the end and at least six of them.
Complied to mktemp requirements and add -xc flag to compiler to say
`temp` file is a C file
Fixes: ff37ca5d37 ("devtools: use a common prefix for temporary files")
Reported-by: Shuai Zhu <shuaix.zhu@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds all documentation for telemetry.
A description on how to use the Telemetry API with a DPDK
application is given in this document.
It also adds a release notes update for telemetry.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
This patch adds a python script which can be used as a demo
client. The script is interactive and will allow the user to
register, request statistics, and unregister.
To run the script, an argument for the client file path must
be passed in: "python telemetry_client.py <file_path>".
This script is useful to see how the Telemetry API for DPDK
is used, and how to make the initial connection.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds telemetry as a dependecy to all applications. Without these
changes, the --telemetry flag will not be recognised and applications will
fail to run if they want to enable telemetry.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to enable/disable the selftest.
This functionality will be extended in future to make the
enabling/disabling more dynamic and remove this 'hardcoded' approach. We
are temporarily using this approach due to the design changes (vdev vs eal)
made to the library.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to create a JSON message in
order to send it to a client socket.
When stats are requested by a client, they are retrieved from
the metrics library and encoded in JSON format.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds functionality to update the statistics in
the metrics library with values from the ethdev stats.
Values need to be updated before they are encoded into a JSON
message and sent to the client that requested them. The JSON encoding
will be added in a subsequent patch.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the parser file. This is used to parse any
messages that are received on any of the client sockets.
Currently, the unregister functionality works using the parser.
Functionality relating to getting statistic values for certain ports
will be added in a subsequent patch, however the parsing involved
for that command is added in this patch.
Some of the parser code included is in preparation for future
functionality, that is not implemented yet in this patchset.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch introduces clients to the telemetry API.
When a client makes a connection through the initial telemetry
socket, they can send a message through the socket to be
parsed. Register messages are expected through this socket, to
enable clients to register and have a client socket setup for
future communications.
A TAILQ is used to store all clients information. Using this, the
client sockets are polled for messages, which will later be parsed
and dealt with accordingly.
Functionality that make use of the client sockets were introduced
in this patch also, such as writing to client sockets, and sending
error responses.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the telemetry UNIX socket. It is used to
allow connections from external clients.
On the initial connection from a client, ethdev stats are
registered in the metrics library, to allow for their retrieval
at a later stage.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch adds the infrastructure and initial code for the telemetry
library.
The telemetry init is registered with eal_init(). We can then check to see
if --telemetry was passed as an eal option. If --telemetry was parsed, then
we call telemetry init at the end of eal init.
Control threads are used to get CPU cycles for telemetry, which are
configured in this patch also.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Brian Archbold <brian.archbold@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This patch makes the eal_get_runtime_dir() API public so it can be used
from outside EAL.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
This commit adds infrastructure to EAL that allows an application to
register it's init function with EAL. This allows libraries to be
initialized at the end of EAL init.
This infrastructure allows libraries that depend on EAL to be initialized
as part of EAL init, removing circular dependency issues.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Check that the firmware response has a bit set indicating
it's valid before dereferencing the rest of the response contents.
Fixes: 0bdd36e122 ("crypto/qat: make dequeue function generic")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Fix a typo on DEV_RX_OFFLOAD_QINQ_STRIP selection.
Fixes: 0074d02fca ("app/testpmd: convert to new Rx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The forwarding mode mac_swap should be macswap in testpmd guide.
Fixes: e76d7a768c ("doc: fix syntax in testpmd user guide")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The offload name functions are useful, but since they are
marked experimental they can not be used by upstream projects.
For example, VPP duplicates the same table in its code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Calling rte_eth_macaddr_get to get a copy of the MAC address causes
a hot spot according to profiling. We can easily get the current
MAC address by just examining the bonded device.
Signed-off-by: Chas Williams <chas3@att.com>