Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail. Still,
better to continue attempts at probes.
Since PCI isn't neccessarily required, it may be possible to simply log
the error and continue on letting the user check the logs and restart
the application when things have failed.
This will usually be an issue because of permissions. However, it could
also be caused by OOM. In either case, errno will contain the
underlying cause.
For linux, it is safe to re-init the system here, so allow the
application to take corrective action and reinit.
For BSD, this is not the case, for other reasons, including hugepage
allocation has already happened, and needs to be properly uninitialized.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Plugins are useful and important. However, it seems crazy to abort
everything just because they don't initialize properly.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_error.
When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a change to reflect
this information to the user.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
After code inspection, there is no way for eal_timer_init() to fail. It
simply returns 0 in all cases. As such, this test could either go-away
or stay here as 'future-proofing'.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When log initialization fails, it's generally because the fopencookie
failed. While this is rare in practice, it could happen, and it is
likely because of memory pressure. So, flag the error, and allow the
user to retry.
Memory init can only fail when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions). Since the
manner of failure is not reversible, we cannot allow retry.
There are some theoretical racy conditions in the system that _could_
cause early tailq init to fail; however, no need to panic the
application. While it can't continue using DPDK, it could make better
alerts to the user.
rte_eal_alarm_init() call uses the linux timerfd framework to create a
poll()-able timer using standard posix file operations. This could fail
for a few reasons given in the man-pages, but many could be
corrected by the user application. No need to panic.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When memzone initialization fails, report the error to the calling
application rather than panic(). Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
a better logging. Exiting this early prevents any useful information
gathering from the application layer.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When attempting to scan hugepages, signal to the eal that an error has
occurred, rather than performing a panic.
If we fail to acquire hugepage information, simply signal an error to
the application. This clears the run_once counter, allowing the user or
application to take a corrective action and retry.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This adds a new API to check for the eal cpu versions.
It's now possible to gracefully exit the application, or for
applications which support non-dpdk datapaths working in concert with
DPDK datapaths, there no longer is the possibility of exiting for
unsupported CPUs.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting. This allows the user to proceed with a "slow-path" type
solution.
After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic. This aligns with the code in rte_eal_init, which
expects failures to return an error code.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The FreeBSD implementation wasn't registering new devices
with the device framework on start up. However, common
code attempts to unregister them on shutdown which causes
a SEGFAULT. This fix makes the FreeBSD code do the same
thing as the Linux code for registration.
Fixes: 13a1317d3ba7 ("pci: create device list and fallback on its members")
Cc: stable@dpdk.org
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
This patch extend next_hop field from 8-bits to 21-bits in LPM library
for IPv6.
Added versioning symbols to functions and updated
library and applications that have a dependency on LPM library.
Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Allow the BAR setup to succeed if a device has at least 1 BAR region
defined. Previously, the device probe would only succeed if at least one
memory BAR existed, but there are devices that have only port I/O BARs.
For example, on Virtual Box a virtio device has only a single I/O BAR
because by default MSI-X is not enabled. While in qemu/kvm the virtio
device has MSI-X enabled and therefore has both an I/O and Memory BAR.
The following are excerpts from "lspci -nnvvvv -s 00:09.0" on both types of
systems.
Virtual Box:
Region 0: I/O ports at d260 [size=32]
Capabilities: [80] #00 [0000]
QEMU/KVM:
Region 0: I/O ports at c060 [size=32]
Region 1: Memory at febd1000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at feb80000 [disabled] [size=256K]
Capabilities: [40] MSI-X: Enable+ Count=3 Masked-
Vector table: BAR=1 offset=00000000
PBA: BAR=1 offset=00000800
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
When possible, replace the uses of rte_mempool_create() with
the helper provided in librte_mbuf: rte_pktmbuf_pool_create().
This is the preferred way to create a mbuf pool.
This also updates the documentation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The check of queue_id is done in all drivers implementing
rte_eth_rx_queue_count(). Factorize this check in the generic function.
Note that the nfp driver was doing the check differently, which could
induce crashes if the queue index was too big.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The API comments are not consistent between each other.
The function rte_eth_rx_queue_count() returns the number of used
descriptors on a receive queue.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
For Linux kernel 4.0 and newer, the ability to obtain
physical page frame numbers for unprivileged users from
/proc/self/pagemap was removed. Instead, when an IOMMU
is present, simply choose our own DMA addresses instead.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Applications and other libraries should not be reading inside the
rte_ring structure directly to get the ring size. Instead add a fn
to allow it to be queried.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This adds a check to ensure that the container_of() macro is not used to
cast away (remove) constness.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This fixes the usage of structure members that are declared const to get
a pointer to the embedding parent structure.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Re-enable CONFIG_RTE_LIBRTE_SCHED, since it is needed to build
correctly.
Fix a few warnings when compiling mpipe_tilegx.c.
Remove an empty rte_cpu_feature_table[] array using a bogus type.
Properly set RTE_OBJCOPY_{TARGET,ARCH} in mk/arch/tile/rte.vars.mk.
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
It's trivial to directly invoke a read of the special-purpose
register that holds the clock cycle counter, so just do that.
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
As announced in the deprecation notice, remove the functions for
single/multi producer/consumer enqueue/dequeue.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
When removing log history functions, the map has not been updated.
Fixes: d7e61ad3ae36 ("log: remove deprecated history dump")
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Uninitialized scalar variable.
Using uninitialized value cfg->sections[curr_section]->num_entries
when calling rte_cfgfile_close.
And memory in variables cfg->sections[curr_section],
sect->entries[curr_entry] maybe not equal NULL.
We must decrement counters curr_section, curr_entry when failed to realloc.
Fixes: eaafbad419bf ("cfgfile: library to interpret config files")
Signed-off-by: Dmitriy Yakovlev <bombermag@gmail.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
The max number of interrupt request is possible
be changed after rte_intr_callback_register, so
in get_max_intr, we need to check if necessary to
update the max_intr.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
This patch fixes a segmentation fault in function
rte_cryptodev_devices_get(), due to incorrect driver name path.
It reworks the function to use correct types and clean up
for visibility.
Coverity issue: 141067
Fixes: 38227c0e3ad2 ("cryptodev: retrieve device info")
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The "dev->intr_handle.fd" is possibly a negative value while it is
passed as an argument to function "close". Fix the check to the fd.
Fixes: 5a60a7ffc801 ("pci: introduce functions to alloc and free uio resource")
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Prevent a segmentation fault in rte_sched_port_free by only accessing
the port structure after the NULL pointer check has been made.
Fixes: 7b3c4f35 ("sched: fix releasing enqueued packets")
Cc: stable@dpdk.org
Signed-off-by: Alan Dewar <adewar@brocade.com>
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
When a secondary process wants access to the VFIO container file
descriptor, the primary process calls vfio_get_container_fd() which
always opens an entirely new file descriptor on /dev/vfio/vfio.
However, once the file descriptor has been passed to the subprocess, it
is effectively duplicated, meaning that the copy of the file descriptor
in the primary process is no longer needed. However, the primary
process does not close the duplicate fd, which results in a resource
leak.
This can be reproduced by starting a primary process with a small
RLIMIT_NOFILE limit configured to use VFIO for at least one device, and
repeatedly launching secondary processes until the file descriptor limit
is exceeded.
Fix the resource leak by closing the local vfio container file
descriptor after passing it to the secondary process.
Fixes: 2f4adfad0a69 ("vfio: add multiprocess support")
Cc: stable@dpdk.org
Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Found with clang static analysis:
lib/librte_vhost/vhost_user.c:996:3: warning:
Value stored to 'ret' is never read
ret = vhost_user_get_vring_base(dev, &msg.payload.state);
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Found with clang static analysis:
lib/librte_vhost/virtio_net.c:723:17: warning:
Access to field 'data_off' results in a dereference of a null pointer
(loaded from variable 'tcp_hdr')
m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
^~~~~~~~~~~~~~~~~
Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")
Cc: stable@dpdk.org
Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Setting up the mapping from GPA (guest physical address) to HPA (guest
physical address) could be very time consuming when the guest memory is
backened with small pages (4K). The bigger the guest memory, the longer
it takes. This could lead a very long vhost-user negotiation.
Since the mapping is only needed in zero copy mode so far, we could
avoid such time consuming settup when zero copy is turned off (which is
the default case).
It's actually a workaround, a right fix might be to start a new thread,
and hide the big latency there.
Fixes: e246896178e6 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If a malicious guest forges a dead loop desc chain (let desc->next point
to itself) and desc->len is zero, this could lead to a dead loop in
copy_mbuf_to_desc(following is a simplified code to show this issue
clearly):
while (mbuf_is_not_totally_consumed) {
if (desc_avail == 0) {
desc = &descs[desc->next];
desc_avail = desc->len;
}
COPY(desc, mbuf, desc_avail);
}
I have actually fixed a same issue before: commit a436f53ebfeb ("vhost:
avoid dead loop chain"); it fixes the dequeue path though, leaving the
enqueue path still vulnerable.
The fix is the same. Add a var nr_desc to avoid the dead loop.
Fixes: f1a519ad981c ("vhost: fix enqueue/dequeue to handle chained vring descriptors")
Cc: stable@dpdk.org
Reported-by: Xieming Katty <katty.xieming@huawei.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds helper functions for new performance application which
provide identifiers and number of crypto device and
provide and check capabilities available for defined device and algorithm.
The performance application can be used to measure throughput and latency
of cryptography operation performed by crypto device.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch adds the cryptodev scheduler PMD name and type identifier to
librte_cryptodev.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This makes struct rte_cryptodev independent of struct rte_pci_device by
replacing it with a pointer to the generic struct rte_device.
This is inline with the recent changes in ethdev
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: John Griffin <john.griffin@intel.com>
Reviewed-by: Shreyansh Jain <shreyansh.jain@nxp.com>
rte_cryptodev_pmd_get_dev, rte_cryptodev_pmd_get_named_dev,
rte_cryptodev_pmd_is_valid_dev were incorrectly marked as inline and
therefore not useable from crypto PMDs when built as shared
libraries as they accessed the global rte_cryptodev_globals device
structure.
Fixes: d11b0f30 ("cryptodev: introduce API and framework for crypto devices")
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>