On IBM POWER platform, when mapping /dev/zero file to hugepage memory
space, mmap will not respect the requested address hint. This will cause
the memory initialization for the second process fails. This patch adds
the required mmap flags to make it work. Beside this, users need to set
the nr_overcommit_hugepages to expand the VA range. When
doing the initialization, users need to set both nr_hugepages and
nr_overcommit_hugepages to the same value, like 64, 128, etc.
Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
This field is only used in the initialization phase. Remove it since the
global log level can also be retrieved using a public API:
rte_log_get_global_level().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
It's better to initialize the internal config in rte_eal_init()
instead of eal_log_level_parse(), since this structure is not only
about logs.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The initialization of the default log level (from configuration) was
removed by mistake in a previous commit. The global log level was
wrongly set to debug when no --log-level argument was passed. Restore
this initialization.
Before:
$ ./build/app/test
RTE>>dump_log_types
global log level is debug
...
After:
$ ./build/app/test
RTE>>dump_log_types
global log level is info
...
Fixes: 845afe51e4 ("eal: change specific log levels at startup")
Reported-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
adding extra vfio utility functions to map file.
They will be used by other vfio supported buses like fslmc bus
for NXP DPAA2 devices
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
This adds a name field to the generic struct rte_device. The EAL is
checking for the name being populated when registering a device but
doesn't enforce global unique names as this is left to the bus
implementations.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Based on EAL Bus APIs, PCI bus callbacks and support functions are
introduced in this patch.
EAL continues to have direct PCI init/scan calls as well. These would be
removed in subsequent patches to enable bus only PCI devices.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Reviewed-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
This function rte_cpu_is_supported is now part of the public ABI,
so should be advertised as such.
Fixes: 37e97ad2c5 ("eal: do not panic when CPU is not supported")
Signed-off-by: Aaron Conole <aconole@redhat.com>
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Remove the inappropriate modification on get_max_intr
field that keep the intr_source read only.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Deprecate the following functions:
- rte_set_log_level(), replaced by rte_log_set_global_level()
- rte_get_log_level(), replaced by rte_log_get_global_level()
- rte_set_log_type(), replaced by rte_log_set_level()
- rte_get_log_type(), replaced by rte_log_get_level()
The new functions provide a better control of the per-type log level,
and have a better name prefix (rte_log_).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Example of use:
./app/test-pmd --log-level='pmd\.i40e.*,8'
This enables debug logs for all dynamic logs whose type starts with
'pmd.i40e'.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Introduce 2 new functions to support dynamic log types:
- rte_log_register(): register a log name, and return a log type id
- rte_log_set_level(): set the log level of a given log type
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Change the size of m->port and m->nb_segs to 16 bits. It is now possible
to reference a port identifier larger than 256 and have a mbuf chain
larger than 256 segments.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
To avoid multiple stores on fast path, Ethernet drivers
aggregate the writes to data_off, refcnt, nb_segs and port
to an uint64_t data and write the data in one shot
with uint64_t* at &mbuf->rearm_data address.
Some of the non-IA platforms have store operation overhead
if the store address is not naturally aligned.This patch
fixes the performance issue on those targets.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
A new interrupt type, RTE_INTR_HANDLE_VDEV, is added to support lsc and rxq
interrupt for vdev.
For lsc interrupt, except from original EPOLLIN events, we also listen for
socket peer closed connection event (EPOLLRDHUP and EPOLLHUP).
For rxq interrupt, add a precondition to avoid invoking any vfio and uio
code.
For intr_handle initialization, let each vdev driver to do that.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Prior to this patch only UIO/VFIO interrupt handlers types were supported.
This patch adds support for the external interrupt handler type, allowing
external drivers to set their own fds with specific interrupt handlers.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Add support for SLES12SP3, which uses kernel 4.4,
but backported features from newer kernels.
Signed-off-by: Nirmoy Das <ndas@suse.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
glibc 2.25 is warning about if applications depend on
sys/types.h for makedev macro, it expects to be included
from <sys/sysmacros.h>
Found this error while testing with GCC 6.3.1 on archlinux.
lib/librte_eal/linuxapp/eal/eal_pci_uio.c: In function ‘pci_mknod_uio_dev’:
lib/librte_eal/linuxapp/eal/eal_pci_uio.c:134:13:
error: In the GNU C Library, "makedev" is defined
by <sys/sysmacros.h>. For historical compatibility, it is
currently defined by <sys/types.h> as well, but we plan to
remove this soon. To use "makedev", include <sys/sysmacros.h>
directly. If you did not intend to use a system-defined macro
"makedev", you should undefine it after including <sys/types.h>. [-Werror]
dev = makedev(major, minor);
^~~~~~~~~~~~~~~~~
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
When binding with vfio-pci, secondary process cannot be started with
an error message:
cannot find TAILQ entry for PCI device.
It's due to: struct rte_pci_addr is padded with 1 byte for alignment
by compiler. Then below comparison in commit 2f4adfad0a
("vfio: add multiprocess support") will fail if the last byte is not
initialized.
memcmp(&vfio_res->pci_addr, &dev->addr, sizeof(dev->addr)
And commit cdc242f260 ("eal/linux: support running as unprivileged user")
just triggers this bug by using a stack un-initialized variable.
The fix is to use rte_eal_compare_pci_addr() for pci addr comparison.
Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Fixes: cdc242f260 ("eal/linux: support running as unprivileged user")
Cc: stable@dpdk.org
Reported-by: Pawel Rutkowski <pawelx.rutkowski@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Some compilers require definition of vfio_iommu_spapr_tce_ddw_info
before its use in vfio_iommu_spapr_tce_info, so move tce_info
definition below tce_ddw_info.
Fixes: 468f42cc26 ("vfio: fix build on old kernel")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Recently added "dma_zalloc_coherent()" call is causing build error
for Linux kernels < 3.2.
compile error:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
In function ‘igbuio_pci_probe’:
lib/librte_eal/linuxapp/igb_uio/igb_uio.c:434:2:
error: implicit declaration of function ‘dma_zalloc_coherent’
[-Werror=implicit-function-declaration]
map_addr = dma_zalloc_coherent(&dev->dev, 1024,
^
dma_zalloc_coherent() introduced with Linux kernel 3.2, with commit
Linux: 842fa69f3e0c ("include/linux/dma-mapping.h: add dma_zalloc_coherent()")
Since it does not exist for older kernels, causing a build error.
Switched to dma_alloc_coherent() API to prevent build error.
Fixes: d287e4d41b ("igb_uio: map dummy DMA forcing IOMMU domain attachment")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This eliminates the overhead of a task switch when an interrupt arrives.
Signed-off-by: David Su <david.w.su@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
For using a DPDK app when iommu is enabled, it requires to
add iommu=pt to the kernel command line. But using igb_uio driver
makes DMAR errors because the device has not an IOMMU domain.
Since kernel 3.15, iommu=pt requires to use the internal kernel
DMA API for attaching the device to the IOMMU 1:1 mapping, aka
si_domain. Previous versions did attach the device to that
domain when intel iommu notifier was called.
This is not a problem if the driver does later some call to the
DMA API because the mapping can be done then. But DPDK apps do
not use that DMA API at all.
Doing this dma map and unmap is harmless even when iommu is not
enabled at all.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Current device hotplug is just supported by UIO managed devices.
This patch adds same functionality with VFIO.
It has been validated through tests using IOMMU and also with
VFIO and no-iommu mode.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The flags member of irq_set should be ORed with VFIO_IRQ_SET_ACTION_MASK
and not VFIO_IRQ_SET_ACTION_UNMASK. The bug was found by code inspection.
Fixes: 5c782b3928 ("vfio: interrupts")
Cc: stable@dpdk.org
Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
compile error:
.../build/build/lib/librte_eal/linuxapp/kni/kni_net.c:124:6:
error: implicit declaration of function ‘signal_pending’
[-Werror=implicit-function-declaration]
if (signal_pending(current) || ret_val <= 0) {
^~~~~~~~~~~~~~
Linux 4.11 moves signal function declarations to its own header file:
Linux: 174cd4b1e5fb ("sched/headers: Prepare to move signal wakeup &
sigpending methods from <linux/sched.h> into <linux/sched/signal.h>")
Use new header file "linux/sched/signal.h" to fix the build error.
Cc: stable@dpdk.org
Reported-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Pankaj Gupta <pagupta@redhat.com>
Before this patch, the management of dependencies between directories
had several issues:
- the generation of .depdirs, done at configuration is slow: it can take
more than one minute on some slow targets (usually ~10s on a standard
PC without -j).
- for instance, it is possible to express a dependency like:
- app/foo depends on lib/librte_foo
- and lib/librte_foo depends on app/bar
But this won't work because the directories are traversed with a
depth-first algorithm, so we have to choose between doing 'app' before
or after 'lib'.
- the script depdirs-rule.sh is too complex.
- we cannot use "make -d" for debug, because the output of make is used for
the generation of .depdirs.
This patch moves the DEPDIRS-* variables in the upper Makefile, making
the dependencies much easier to calculate. A DEPDIRS variable is still
used to process library dependencies in LDLIBS.
After this commit, "make config" is almost immediate.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Robin Jarry <robin.jarry@6wind.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
For now, exit the init. It's likely that even aborting the initialization
is premature in this case, as it may be possible to proceed even if one
bus or another is not available.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Even if one vdev should fail, there's no need to prevent further
processing. Log the error, and reflect it to the higher levels to
decide.
Seems like it's possible to continue. At least, the error is reflected
properly in the logs. A user could then go and correct or investigate
the situation.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail. Still,
better to continue attempts at probes.
Since PCI isn't neccessarily required, it may be possible to simply log
the error and continue on letting the user check the logs and restart
the application when things have failed.
This will usually be an issue because of permissions. However, it could
also be caused by OOM. In either case, errno will contain the
underlying cause.
For linux, it is safe to re-init the system here, so allow the
application to take corrective action and reinit.
For BSD, this is not the case, for other reasons, including hugepage
allocation has already happened, and needs to be properly uninitialized.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Plugins are useful and important. However, it seems crazy to abort
everything just because they don't initialize properly.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_error.
When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a change to reflect
this information to the user.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
After code inspection, there is no way for eal_timer_init() to fail. It
simply returns 0 in all cases. As such, this test could either go-away
or stay here as 'future-proofing'.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When log initialization fails, it's generally because the fopencookie
failed. While this is rare in practice, it could happen, and it is
likely because of memory pressure. So, flag the error, and allow the
user to retry.
Memory init can only fail when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions). Since the
manner of failure is not reversible, we cannot allow retry.
There are some theoretical racy conditions in the system that _could_
cause early tailq init to fail; however, no need to panic the
application. While it can't continue using DPDK, it could make better
alerts to the user.
rte_eal_alarm_init() call uses the linux timerfd framework to create a
poll()-able timer using standard posix file operations. This could fail
for a few reasons given in the man-pages, but many could be
corrected by the user application. No need to panic.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When memzone initialization fails, report the error to the calling
application rather than panic(). Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
a better logging. Exiting this early prevents any useful information
gathering from the application layer.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When attempting to scan hugepages, signal to the eal that an error has
occurred, rather than performing a panic.
If we fail to acquire hugepage information, simply signal an error to
the application. This clears the run_once counter, allowing the user or
application to take a corrective action and retry.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This adds a new API to check for the eal cpu versions.
It's now possible to gracefully exit the application, or for
applications which support non-dpdk datapaths working in concert with
DPDK datapaths, there no longer is the possibility of exiting for
unsupported CPUs.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting. This allows the user to proceed with a "slow-path" type
solution.
After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic. This aligns with the code in rte_eal_init, which
expects failures to return an error code.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Allow the BAR setup to succeed if a device has at least 1 BAR region
defined. Previously, the device probe would only succeed if at least one
memory BAR existed, but there are devices that have only port I/O BARs.
For example, on Virtual Box a virtio device has only a single I/O BAR
because by default MSI-X is not enabled. While in qemu/kvm the virtio
device has MSI-X enabled and therefore has both an I/O and Memory BAR.
The following are excerpts from "lspci -nnvvvv -s 00:09.0" on both types of
systems.
Virtual Box:
Region 0: I/O ports at d260 [size=32]
Capabilities: [80] #00 [0000]
QEMU/KVM:
Region 0: I/O ports at c060 [size=32]
Region 1: Memory at febd1000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at feb80000 [disabled] [size=256K]
Capabilities: [40] MSI-X: Enable+ Count=3 Masked-
Vector table: BAR=1 offset=00000000
PBA: BAR=1 offset=00000800
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
For Linux kernel 4.0 and newer, the ability to obtain
physical page frame numbers for unprivileged users from
/proc/self/pagemap was removed. Instead, when an IOMMU
is present, simply choose our own DMA addresses instead.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
When removing log history functions, the map has not been updated.
Fixes: d7e61ad3ae ("log: remove deprecated history dump")
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The max number of interrupt request is possible
be changed after rte_intr_callback_register, so
in get_max_intr, we need to check if necessary to
update the max_intr.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The "dev->intr_handle.fd" is possibly a negative value while it is
passed as an argument to function "close". Fix the check to the fd.
Fixes: 5a60a7ffc8 ("pci: introduce functions to alloc and free uio resource")
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
When a secondary process wants access to the VFIO container file
descriptor, the primary process calls vfio_get_container_fd() which
always opens an entirely new file descriptor on /dev/vfio/vfio.
However, once the file descriptor has been passed to the subprocess, it
is effectively duplicated, meaning that the copy of the file descriptor
in the primary process is no longer needed. However, the primary
process does not close the duplicate fd, which results in a resource
leak.
This can be reproduced by starting a primary process with a small
RLIMIT_NOFILE limit configured to use VFIO for at least one device, and
repeatedly launching secondary processes until the file descriptor limit
is exceeded.
Fix the resource leak by closing the local vfio container file
descriptor after passing it to the secondary process.
Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Cc: stable@dpdk.org
Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
If the name is too long, it triggers BUG in alloc_netdev().
Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Bus implementations can implement a probe handler to match the devices
scanned against the drivers registered.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Scan for bus discovers the devices available on the bus and adds them
to a bus specific device list. Each bus mandatorily implements this
method.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch introduces the rte_bus abstraction for EAL.
The model is:
- One or more devices are connected to a Bus
- Drivers are running instances which manage one or more devices
- Bus is responsible for identifying devices (and interrupt propogation)
- Driver is responsible for initializing the device
This patch adds a 'rte_bus' base class which would be extended for
specific implementations. It also introduces Bus registration and
deregistration functions.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Instead of passing domain, bus, devid, func, just pass
an rte_pci_addr.
Signed-off-by: Ben Walker <benjamin.walker@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Both register/unregister and enable/disable don't necessarily require the
rte_intr_handle to be modifiable. Therefore lets constify it.
Signed-off-by: Jan Blunck <jblunck@infradead.org>
No device driver sets the unbind flag in current public code base.
Therefore it is good time to remove the unused dead code.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Some platform like octeontx may use pci and
vdev based combined device to represent a logical
dpdk functional device.In such case, postponing the
vdev initialization after pci device
initialization will provide the better view of
the pci device resources in the system in
vdev's probe function, and it allows better
functional subsystem registration in vdev probe
function.
As a bonus, This patch fixes a bond device
initialization use case.
example command to reproduce the issue:
./testpmd -c 0x2 --vdev 'eth_bond0,mode=0,
slave=0000:02:00.0,slave=0000:03:00.0' --
--port-topology=chained
root cause:
In existing case(vdev initialization and then pci
initialization), creates three Ethernet ports with
following port ids
0 - Bond device
1 - PCI device 0
2 - PCI devive 1
Since testpmd, calls the configure/start on all the ports on
start up,it will translate to following illegal setup sequence
1)bond device configure/start
1.1) pci device0 stop/configure/start
1.2) pci device1 stop/configure/start
2)pci device 0 configure(illegal setup case,
as device in start state)
The fix changes the initialization sequence and
allow initialization in following valid setup order
1) pcie device 0 configure/start
2) pcie device 1 configure/start
3) bond device 2 configure/start
3.1) pcie device 0/stop/configure/start
3.2) pcie device 1/stop/configure/start
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The lsb_release script is part of an optional package which is not
always installed. On the other hand, /etc/lsb-release is always present
even on minimal Ubuntu installations.
root@ubuntu1604:~# dpkg -S /etc/lsb-release
base-files: /etc/lsb-release
Read the file if present and use the variables defined in it.
Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
In function pci_mknod_uio_dev() in lib/librte_eal/eal/eal_pci_uio.c,
The return value of mknod() is ret, not f got by fopen().
So the value of ret should be checked for mknod().
Fixes: f7f97c1604 ("pci: add option --create-uio-dev to run without hotplug")
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
compile error:
CC [M] .../lib/librte_eal/linuxapp/kni/igb_main.o
.../lib/librte_eal/linuxapp/kni/igb_main.c:2317:21:
error: initialization from incompatible pointer type
[-Werror=incompatible-pointer-types]
.ndo_set_vf_vlan = igb_ndo_set_vf_vlan,
^~~~~~~~~~~~~~~~~~~
Linux kernel 4.9 updates API for ndo_set_vf_vlan:
Linux: 79aab093a0b5 ("net: Update API for VF vlan protocol 802.1ad support")
Use new API for Linux kernels >= 4.9
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
compile error:
CC [M] .../lib/librte_eal/linuxapp/kni/kni_misc.o
cc1: warnings being treated as errors
.../lib/librte_eal/linuxapp/kni/kni_misc.c: In function ‘kni_exit_net’:
.../lib/librte_eal/linuxapp/kni/kni_misc.c:113:18:
error: unused variable ‘knet’
For kernel versions < v3.1 mutex_destroy() is a macro and does nothing,
this cause an unused variable warning for knet which used in the
mutex_destroy()
mutex_destroy() converted into static inline function with commit:
Linux: 4582c0a4866e ("mutex: Make mutex_destroy() an inline function")
To fix the warning unused attribute added to the knet variable.
Fixes: 93a298b34e ("kni: support core id parameter in single threaded mode")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since switched to kernel dynamic debugging it is possible to remove
compile time debug log configuration.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Switch to dynamic logging functions. Depending kernel configuration this
may cause previously visible logs disappear.
How to enable dynamic logging:
https://www.kernel.org/doc/Documentation/dynamic-debug-howto.txt
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Allow binding KNI thread to specific core in single threaded mode
by setting core_id and force_bind config parameters.
Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Before this patch, application-specific loggers could not be
installed before rte_eal_init completed (the initialization process
called rte_openlog_stream, overwriting any previously installed
logger). This made it impossible for an application to capture the
initial log messages generated during rte_eal_init. This patch changes
initialization so that information from a previous call to
rte_openlog_stream is not lost. Specifically:
* The default log stream is now maintained separately from an
application-specific log stream installed with rte_openlog_stream.
* rte_eal_common_log_init has been renamed to eal_log_set_default,
since this is all it does. It no longer invokes rte_openlog_stream; it
just updates the default stream. Also, this method now returns void,
rather than int, since there are no errors.
This patch also removes the "early log" mechanism and cleans up the
log initialization mechanism:
* The default log stream defaults to stderr on all platforms if
eal_log_set_default hasn't been invoked (Linux used to use stdout
during the first part of initialization).
* Removed rte_eal_log_early_init; all of the desired functionality can
be achieved by calling eal_log_set_default.
* Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one
function, rte_eal_log_init, which is not needed or invoked for BSD.
* Removed declaration for eal_default_log_stream in rte_log.h (it's now
private to eal_common_log.c).
* Moved call to rte_eal_log_init earlier in rte_eal_init for Linux, so
that it starts using the preferrred log ASAP.
Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>
When compiled with EXTRA_CFLAGS="-O1", the compiler is not
able to detect that size is always initialized when used, and
issues a wrong warning:
eal_memory.c: In function ‘rte_eal_hugepage_attach’:
eal_memory.c:1684:3: error: ‘size’ may be used uninitialized in this
function [-Werror=maybe-uninitialized]
munmap(hp, size);
^
Workaround this issue by initializing size to 0.
Seen on gcc (Debian 5.4.1-1) 5.4.1 20160803.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Running secondary is tricky due to the need to map the memory region
at the right place in VM, which is whatever primary has chosen. If the
base address for primary happens to by already mapped in the
secondary, we will hit precisely these error messages (depending if we
fail on the config region or the hugepages). This is why there is
already a comment about ASLR.
The issue is that in most cases, remapping does not happen and "errno"
is not changed and therefore stale. In our case, we got a "permission
denied", which sent us down the wrong track. It's such a common error
for secondary that I feel this error message should be unambiguous and
helpful.
The call to close was also moved because close() may override errno.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.
As of now no one is using this list, but can be used for moving over all
devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Move all PMD_VDEV-specific code into a separate module and header
file to not polute the generic code anymore. There is now a list
of virtual devices available.
The rte_vdev_driver integrates the original rte_driver inside
(C inheritance). The rte_driver will be however change in the
future to serve as a common base for all other types of drivers.
The existing PMDs (PMD_VDEV) are to be modified later (there is
no change for them at the moment).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Hotplug invocations, which deals with devices, should come from the layer
that already handles them, i.e. EAL.
For both attach and detach operations, 'name' is used to select the bus
that will handle the request.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
- Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c to
common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
method, can be used across crypto/net PCI PMDs.
- Remove crypto specific routine and fallback to common name function.
- Introduce a eal private Update function for PCI device naming.
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Merge crypto/pci helper patches]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
These lists can be initialized once and for all at build time.
With this, those lists are only manipulated in a common place
(and we could even make them private).
A nice side effect is that pci drivers can now register in constructors.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
rte_eal_dev_init is declared in both eal_private.h and rte_dev.h since its
introduction.
This function has been exported in ABI, so remove it from eal_private.h
Fixes: e57f20e051 ("eal: make vdev init path generic for both virtual and pci devices")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
An application might be linked to DPDK but not really use it,
so move the cpu flag check to the EAL initialization instead.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
In ASLR-enabled system, it is possible that selected
virtual space is occupied by program segments. Therefore,
error path should not blindly unmap all memmory segments
but only those already mapped.
Steps that lead to crash:
1. memeseg 0 in secondary process overlaps with libc.so
2. mmap of /dev/zero fails for virtual space of memseg 0
3. munmap of memseg 0 leads to unmapping libc.so itself
4. app gets SIGSEGV after returning from syscall to libc
Fixes: ea329d7f8e ("mem: fix leak after mapping failure")
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
RTE_EAL_SINGLE_FILE_SEGMENTS was introduced with ivshmem integration.
Now that ivshmem was removed (commit c711ccb309)
and a simple git grep shows no one else references it;
I think we can now remove it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
When running single-core, some drivers tend to call rte_delay_us for a
long time, and that is causing packet drops.
To avoid this, rte_delay_us can be replaced with user-defined delay
function with:
void rte_delay_us_callback_register(void(*userfunc)(unsigned));
When userfunc==rte_delay_us_block build-in blocking delay function is
restored.
Signed-off-by: Jozef Martiniak <jozmarti@cisco.com>
Compile error:
.../lib/librte_eal/linuxapp/kni/kni_net.c:
In function ‘kni_net_rx_lo_fifo’:
.../lib/librte_eal/linuxapp/kni/kni_net.c:331:1:
error: the frame size of 1056 bytes is larger than 1024 bytes
[-Werror=frame-larger-than=]
This compile error seen with some compiler / kernel combinations.
Moved some local variables to the kni_dev struct.
Fixes: 8451269e6d ("kni: remove continuous memory restriction")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Use mempool buf_addr and buf_physaddr fields for address translation.
Since each mbuf address calculated separately, the restriction of all
mbufs should come from a continuous memory restriction is no more valid.
mbuf related FIFO's content changed, rx_q and alloc_q now carries
physical address of mbufs. tx_q and free_q content not changed, they
still carries virtual address of mbufs.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Linux kernel v4.8 removes macro DEFINE_PCI_DEVICE_TABLE
Linux: 7e9321599011 ("treewide: remove references to the now unnecessary
DEFINE_PCI_DEVICE_TABLE")
Replaced macro with its value in kni ethtool drivers.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Compile error:
CC [M] .../build/lib/librte_eal/linuxapp/kni/igb_main.o
.../build/lib/librte_eal/linuxapp/kni/igb_main.c:
In function ‘igb_check_swap_media’:
.../build/lib/librte_eal/linuxapp/kni/igb_main.c:1556:7:
error: variable ‘link’ set but not used [-Werror=unused-but-set-variable]
bool link;
^
With Linux kernel >= v3.0 this warning disabled:
Linux: 8417da6f2128 ("kbuild: Fix passing -Wno-* options to gcc 4.4+")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add support for RHEL 7.3, which uses kernel 3.10,
but backported features from newer kernels.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix build error with Linux kernel >= v4.7
Fix compile error because of Linux API change, 'trans_start' field
removed from 'struct net_device'.
Linux: 9b36627acecd ("net: remove dev->trans_start")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
There is no need for the page files to be readable (and executable) by
other users. This can be exploited by non-privileged users to access the
working memory of a DPDK app.
Open the files with 0600.
Signed-off-by: Robin Jarry <robin.jarry@6wind.com>
Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked to avoid warnings and compilation failures.
Unnamed structs/unions are allowed since C11, however many compiler
versions do not use this mode by default.
This commit prevents the following errors:
error: ISO C99 doesn't support unnamed structs/unions
error: struct has no named members
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.
This commit prevents the following errors:
error: type of bit-field `[...]' is a GCC extension
Note: the standard does not require implementations to issue a diagnostic
message with these, and such errors do not occur with recent GCC or clang
versions. However, GCC 4.7 is still common and using the extension keyword
is easier than checking compiler version.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Exported header files used by applications should allow the strictest
compiler flags. Language extensions used in many places must be explicitly
marked or removed to avoid warnings and compilation failures.
The extension keyword is used whenever the C99 syntax cannot do it.
This commit prevents the following errors:
error: ISO C forbids zero-size array `[...]'
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Fix pernet calls when HAVE_SIMPLIFIED_PERNET_OPERATIONS is not set.
Fixes: e6734d21b4 ("kni: fix build with kernel 2.6.32")
Signed-off-by: Vincent Guo <guopengfei160@163.com>
Acked-by Ferruh Yigit <ferruh.yigit@intel.com>
Removing KNI interface that has no PCI driver for ethtool support cause
kernel crash.
Fixes: 109febfe58 ("net/igb: move PCI device IDs from EAL")
Fixes: 221fba3b98 ("net/ixgbe: move PCI device IDs from EAL")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
License information is already in LICENSE.GPL.
Remove two extra copies and change referred filename in the files.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
PCI device ids moved from common header into igb driver itself.
KNI starts using pci_device_id from kni/ethtool/igb driver, this is only
for KNI ethtool support, KNI data path is not affected.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
PCI device ids moved from common header into ixgbe driver itself.
KNI starts using pci_device_id from kni/ethtool/ixgbe driver, this is
only for KNI ethtool support, KNI data path is not affected.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Following discussions on the mailing list [1] and since nobody stood up to
implement the necessary cleanups, here is the ivshmem integration removal.
There is not much to say about this patch, a lot of code is being removed.
The default configuration file for packet_ordering example is replaced with
the "native" x86 file.
The only tricky part is in eal_memory with the memseg index stuff.
More cleanups can be done after this but will come in subsequent patchsets.
[1]: http://dpdk.org/ml/archives/dev/2016-June/040844.html
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The log history feature was deprecated in 16.07.
The remaining empty functions are removed in 16.11.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
In rte_mem_virt2phy: Value returned from a function and indicating the
number of bytes was ignored. This could cause a wrong pfn (page frame
number) mask read from pagemap file.
When read returns less than the number of sizeof(uint64_t) bytes,
function rte_mem_virt2phy returns error.
Coverity issue: 13212
Fixes: 40b966a211 ("ivshmem: library changes for mmaping using ivshmem")
Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
The offset of the 2nd mmap() when mapping the region after msix_bar
needs to take region address into consideration as mmap() takes
address that is resource-relative instead of bar-relative. This is
exposed when binding vmxnet3 to vfio-pci.
Fixes: 90a1633b23 ("eal/linux: allow to map BARs with MSI-X tables")
Signed-off-by: Yong Wang <yongwang@vmware.com>
Signed-off-by: Ronghua Zhang <rzhang@vmware.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Having constructor function in the header file is generally
a bad idea, as it will eventually be implanted to 3rd party
library.
In this case it causes linking issues with 3rd party libraries
when an application is not linked to dpdk, due to missing
symbol called by constructor.
Fixes: ba7468997e ("spinlock: add HTM lock elision for x86")
Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
When using Xen Dom0, it looks that /proc/self/pagemap returns 0.
This breaks the creation of mbufs pool.
We can workaround this in rte_mem_virt2phy() by browsing the dpdk memory
segments. This only works for dpdk memory, but it's enough to fix the
mempool creation.
Fixes: c042ba2067 ("mempool: rework support of Xen dom0")
Fixes: 3097de6e6b ("mem: get physical address of any pointer")
Reported-by: Huilong Xu <huilongx.xu@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
When building as shared library, the compiler complains for
undefined reference to `rte_xen_mem_phy2mch'
The symbol rte_xen_mem_phy2mch was introduced in DPDK 2.2
and has been called in mempool recently via rte_mem_phy2mch.
Fixes: c042ba2067 ("mempool: rework support of Xen dom0")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Fix the compilation with CONFIG_RTE_LIBRTE_XEN_DOM0=y, by correcting the
typo in variable names.
Fixes: 8dab483701 ("xen: return machine address without knowing memseg id")
Reported-by: Huilong Xu <huilongx.xu@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
We can now just OR the vfio_enabled sequentially and so adding new VFIO
subsystems (vfio_platform) is possible.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The module eal_pci_vfio_mp_sync is quite generic so it shouldn't contain the
"pci" string in its name. The internal functions don't need the pci_* prefix.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The vfio_cfg is a module-global variable and so together with this
variable, it is necessary to move functions:
* pci_vfio_get_group_fd
- renamed to vfio_get_group_fd
- pci_* version removed (no other call in EAL)
* pci_vfio_setup_device
- renamed as vfio_setup_device
* pci_vfio_enable
- renamed as vfio_enable
- generalized to check for a specific vfio driver presence
- pci_* specialization preserved as a wrapper
* pci_vfio_is_enabled
- renamed as vfio_is_enabled
- generalized to check for a specific vfio driver presence
to preserve the semantics of VFIO + PCI
- pci_* specialization preserved as a wrapper
* clear_current_group
- private function, just moved
To stop GCC complaining about "defined but not used", the private
function pci_vfio_get_group_no has been removed entirely.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The setup logic access the global vfio_cfg variable that will be moved in the
following commits. We need to separate all accesses to this variable to a
general code.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The pci_vfio_set_iommu_type is not PCI-specific and it is a private function
of the eal_pci_vfio.c. We just rename the function and make it available even
for non-PCI devices.
The pci_vfio_has_supported_extensions is not PCI-specific and it is a private
function of the eal_pci_vfio.c. We just rename the function and make it
available even for non-PCI devices.
The pci_vfio_get_container_fd is not PCI-specific. Move the implementation to
the eal_vfio.c as vfio_get_container_fd. No other code seems to call this
function.
Generalize the pci_vfio_get_group_no to not be PCI-specific. Move the general
implementation to the eal_vfio.c as vfio_get_group_no and leave the original
pci_vfio_get_group_no being a wrapper around this to preserve compilation
issues. The pci_vfio_get_group_no function will be removed later.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
We make the iommu_types public temporarily here until the depending stuff is
refactored. The iommu_types and dma_map functions will be changed to be private
inside the eal_vfio module later.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
mmap the iomem range of the PCI device fails for kernels that
enabled CONFIG_IO_STRICT_DEVMEM option:
EAL: pci_map_resource():
cannot mmap(39, 0x7f1c51800000, 0x100000, 0x0):
Invalid argument (0xffffffffffffffff)
CONFIG_IO_STRICT_DEVMEM is introduced in Linux v4.5 and not enabled
by default:
Linux commit: 90a545e restrict /dev/mem to idle io memory ranges
As a workaround igb_uio can stop reserving PCI memory resources, from
kernel point of view iomem region looks like idle and mmap works
again. This matches uio_pci_generic usage.
With this update device iomem range is not protected against any
other kernel drivers or userspace access. But this shouldn't
be a problem for dpdk usage module since purpose of the igb_uio
module is to provide userspace access.
Fixes: af75078fec ("first public release")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Current code does not munmap 'hugepage' mapping (hugepage info file) on
function exit, leaking resources.
Coverity issue: 97920
Fixes: b6a468ad41 ("memory: add --socket-mem option")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
This reverts commit 593a084afc.
Since recently [1], it is not possible to run the dpdk with
non-root privileges and the --no-huge option. This is because the eal
layer tries to lock the memory. Using locked memory is mandatory for
physical devices because they reference physical addresses.
But a user may want to start the dpdk without locked memory, because he
does not have the permission to do so, and/or does not have this need,
for instance because he uses virtual drivers.
So this commit reverts the use of MAP_LOCKED in mmap() flags.
[1] http://www.dpdk.org/ml/archives/dev/2016-May/039404.html
Fixes: 593a084afc ("mem: lock pages when not using hugepages")
Reported-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
EAL memory init allocates all free hugepages of the whole system,
which seen from sysfs, even when applications do not ask so many.
When there is a limitation on how many hugepages an application can
use (such as cgroup.hugetlb), or hugetlbfs is specified with an
option of size (exceeding the quota of the fs), it just fails to
start even there are enough hugepages allocated.
To fix above issue, this patch:
- Changes the logic to continue memory init to see if hugetlb
requirement of application can be addressed by already allocated
hugepages.
- To make sure each hugepage is allocated successfully, we add a
recover mechanism, which relies on a mem access to fault-in
hugepages, and if it fails with SIGBUS, recover to previously
saved stack environment with siglongjmp().
For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by
default when compiling IVSHMEM target), it's indispensable to
mapp all free hugepages in the system. Under this case, it fails
to start when allocating fails.
Test example:
a. cgcreate -g hugetlb:/test-subgroup
b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
c. cgexec -g hugetlb:test-subgroup \
./examples/helloworld/build/helloworld -c 0x2 -n 4
Fixes: af75078fec ("first public release")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Yulong Pei <yulong.pei@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Using gcc 6.1, in some cases, kni fails to compile
because of unused variables:
lib/librte_eal/linuxapp/kni/ixgbe_main.c:82:19:
error: ‘ixgbe_copyright’
defined but not used [-Werror=unused-const-variable=]
lib/librte_eal/linuxapp/kni/ixgbe_main.c:62:19:
error: ‘ixgbe_driver_string’
defined but not used [-Werror=unused-const-variable=]
Fixes: 3fc5ca2f63 ("kni: initial import")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Following compile error observed with CentOS 6.8, which uses kernel
kernel-devel-2.6.32-642.el6.x86_64:
In function 'igbuio_msix_mask_irq':
error: 'PCI_MSIX_ENTRY_CTRL_MASKBIT' undeclared
Reported-by: Thiago Martins <thiagocmartinsc@gmail.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The KeepAlive rte_keepalive_mark_sleep function was not being exported.
Fixes: 90c622f356 ("keepalive: add liveness callback")
Signed-off-by: Remy Horton <remy.horton@intel.com>
Patch fixes resource leak in rte_eal_hugepage_attach() where mapped files
were not freed back to the OS in case of failure. Patch uses the behavior
of Linux munmap: "It is not an error if the indicated range does not
contain any mapped pages".
Coverity issue: 13295, 13296, 13303
Fixes: af75078fec ("first public release")
Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
The function rte_thread_setname needs glibc 2.12,
otherwise it returns -1 without using any parameter.
The macro RTE_SET_USED avoids an "unused parameter" warning.
Fixes: 3901ed99c2 ("eal: fix thread naming on FreeBSD")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
rte_thread_setname was a macro defined only for Linux.
The function rte_thread_setname() can now be used on FreeBSD
as well on Linux.
It is required to build librte_pdump.
The macro was 0 for old glibc. The function is now returning -1.
The related logs are decreased from error to debug level because
it is not an important failure, just a debug inconvenience.
Fixes: 278f945402 ("pdump: add new library for packet capture")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The function rte_keepalive_register_alive_callback do not exist.
The function rte_keepalive_register_relay_callback was missing for BSD.
Fixes: 90c622f356 ("keepalive: add liveness callback")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Adds and documents new callbacks that allow transitions to core
states other than dead to be reported to applications.
Signed-off-by: Remy Horton <remy.horton@intel.com>
On PPC64, the ioports are mapped in memory. Implement the missing part
of ioport API for PPC64 when using uio. This may also work on other
architectures but it has not been tested.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Split pci_parse_sysfs_resource() and introduce
pci_parse_one_sysfs_resource() that parses one line of sysfs resource
file.
This new function will be exported and used in next commits when
mapping the ioports resources.
No functional change.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
In a previous commit, the file used to map the PCI resources changed
from "/dev/uio<x>" to "/sys/bus/pci/devices/<busaddr>/resource", making
the comment wrong. Remove it.
Fixes: 9e67561acd ("eal/linux: mmap uio resources using resourceX files")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
From iopl(2) man page: "This call is mostly for the x86 architecture. On
many other architectures it does not exist or will always return an
error".
This patch removes the call to iopl() in rte_eal_iopl_init() for
architectures other than x86, and always return 0 (success). This was
already done for ARM in
commit 0291476ae3 ("eal/linux: never check iopl for arm")
Next patches will introduce the support of memory mapped IO resources
for architectures != x86.
On BSD, there is nothing to do as open("/dev/io") already does the
proper thing. See man IO(4).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This patch is used to add the class_id (class_code,
subclass_code, programming_interface) support for
pci_device probe. With this patch, it will be
flexible for users to probe a class of devices
by class_id.
Signed-off-by: Ziye Yang <ziye.yang@intel.com>
The SYSFS_PCI_DEVICES is a constant that makes the PCI testing
difficult as it points to an absolute path. We remove using this
constant and introducing a function pci_get_sysfs_path that gives
the same value. However, the user can pass a SYSFS_PCI_DEVICES env
variable to override the path. It is now possible to create a fake
sysfs hierarchy for testing.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The libraries rte_mempool and rte_ring are not used in EAL,
except for the ivshmem part (CONFIG_RTE_LIBRTE_IVSHMEM).
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The log history uses rte_mempool. In order to remove the mempool
dependency in EAL (and improve the build), this feature is deprecated.
The ABI is kept but the behaviour is now voided because it seems this
function was not used. The history can be read from syslog.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Partial revert of an earlier ill-conceived "fix".
Adjacent segments can never be considered overlapping because we
are not comparing ends to starts, but rather starts to starts.
Therefore the earlier fix was wrong (plus it also had a typo).
Fixes: d6cf31419e ("ivshmem: avoid infinite loop when concatenating segments")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix compile error because of Linux API change, 'trans_start' field
removed from 'struct net_device'.
Linux: 9b36627acecd ("net: remove dev->trans_start")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
The $(comma) variable is not defined in this Makefile, nor in
any included Makefile. Seen while doing a "make clean" on ubuntu:
$ make clean
== Clean lib
== Clean lib/librte_compat
== Clean lib/librte_eal
== Clean lib/librte_eal/common
== Clean lib/librte_eal/linuxapp
== Clean lib/librte_eal/linuxapp/eal
== Clean lib/librte_eal/linuxapp/igb_uio
== Clean lib/librte_eal/linuxapp/kni
tr: missing operand after ‘.-’
Two strings must be given when translating.
Try 'tr --help' for more information.
This commit replaces $(comma) by a ',' character, it's not a problem in
that case since we are inside antiquotes.
Fixes: a09b359dac ("kni: fix build on Ubuntu 14.04")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by Ferruh Yigit <ferruh.yigit@intel.com>
The conversion from guest physical address to machine physical address
is fast when the caller knows the memseg corresponding to the gpa.
But in case the user does not know this information, just find it
by browsing the segments. This feature will be used by next commit.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Although the physical address won't be correct in memory segment,
this allows at least to retrieve the physical address using
rte_mem_virt2phy(). Indeed, if the page is not locked, the page
may not be present in physical memory.
With next commit, it allows a mempool to have properly filled physical
addresses when using --no-huge option.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
rx_q fifo may have chained mbufs, merge them into single skb before
handing to the network stack.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Currently every time a KNI interface goes up, its ethernet address
is reassigned.
After this patch ethernet address is assigned only once,
at initialization time.
Suggested-by: Sergey Balabanov <balabanovsv@ecotelecom.ru>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Fix issue reported by Coverity.
Coverity ID 13194
The function returns a value that indicates an error condition. If this
is not checked, the error condition may not be handled correctly.
Fixes: 2f4adfad0a ("vfio: add multiprocess support")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The driver i40e was using a specific PCI config before the release 16.04.
Since 16.04, it is always enabled in i40e (commit 56465cfaf).
The API has been deprecated in the commit 68f7759382.
The igb_uio implementation has been deprecated in commit b7cf8e155.
The config helper - through igb_uio sysfs entries - is now removed.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Fix vhost-kni compile errors because of Linux kernel API changes
- SOCK_ASYNC_WAITDATA renamed to SOCKWQ_ASYNC_WAITDATA
Linux commit id: 9cd3e072
Updated in Linux kernel 4.4
- sk_alloc() gets new parameter
Linux commit id: 11aa9c28b
Updated in Linux kernel 4.2
New parameter is: "@kern: is this to be a kernel socket?"
Reported-by: Chintu Hetam <rometoroam@gmail.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Coverity ID 13289: Resource leak:
The system resource will not be reclaimed and reused,
reducing the future availability of the resource.
In pci_vfio_get_group_fd: Leak of memory or pointers to system resources
Fixes: ff0b67d1c8 ("vfio: DMA mapping")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
netif_rx() should be used in interrupt context. Replace it with
netif_rx_ni() which is safe to use in process context.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch aligns the logic used to check for the presence of
adjacent segments in has_adjacent_segments() with the logic used
in cleanup_segments() when actually deciding to concatenate or
not a pair of segments. Additionally, adjacent segments are
no longer considered overlapping to avoid generating errors for
segments that can happily coexist together.
This fixes an infinite loop that happened when segments where
adjacent in their physical or virtual addresses but not in their
ioremap addresses: has_adjacent_segments() reported the presence
of adjacent segments while cleanup_segments() was not considering
them for concatenation, resulting in an infinite loop since the
result of has_adjacent_segments() is used in the decision to
continue looping in cleanup_segments().
Signed-off-by: David Verbeiren <david.verbeiren@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
For GLIBC < 2.17 it is necessery to add -lrt for linker
from glibc > 2.17 The `clock_*' suite of functions (declared in <time.h>) is now
available directly in the main C library. This affect Ubuntu 12.04 in i686
and other older Linux Distros).
Fixes: 4758404a30 ("mk: fix eal shared library dependencies")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
When compiling each file, the CPU flags are given as RTE_MACHINE_CPUFLAG_*
and in the list RTE_COMPILE_TIME_CPUFLAGS.
RTE_MACHINE_CPUFLAG_* are used to check the CPU features when compiling.
The list RTE_COMPILE_TIME_CPUFLAGS is used only to check the CPU at
runtime in the function rte_cpu_check_supported(). So it is not needed to
define this list for every files.
That's why RTE_COMPILE_TIME_CPUFLAGS is removed from the common variable
MACHINE_CFLAGS and is added only to the CFLAGS of eal_common_cpuflags.c.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
uio_pci_generic does not offer the same sysfs helpers as igb_uio.
In this case, ioport number can only be retrieved by parsing /proc/ioports.
Fixes: 756ce64b1e ("eal: introduce PCI ioport API")
Reported-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Commit b8eb345378 ("pci: ignore devices already managed in Linux when
mapping x86 ioport") did not update other parts of the ioport api.
The application is not supposed to call these read/write/unmap ioport
functions if map call failed but I prefer aligning the code for the sake
of consistency.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Add DT_NEEDED entries for librte_eal external dependencies.
Details between the platforms differ somewhat, and for static
builds they need to be handled from mk/exec-env still.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
call pci_ioport_map (on x86) only if the pci device is not bound
to a kernel driver.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Use RTE_KDRV_NONE to indicate that kernel driver (other than VFIO/UIO) isn't
managing the device.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch adds a new function to the EAL API:
int rte_eal_primary_proc_alive(const char *path);
The function indicates if a primary process is alive right now.
This functionality is implemented by testing for a write-
lock on the config file, and the function tests for a lock.
The use case for this functionality is that a secondary
process can wait until a primary process starts by polling
the function and waiting. When the primary is running, the
secondary continues to poll to detect if the primary process
has quit unexpectedly, the secondary process can detect this.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
This patch fixes a race-condition when a primary and
secondary process simultaneously probe PCI devices.
This is implemented by moving the rte_eal_mcfg_complete()
function call in rte_eal_init() until after rte_eal_pci_probe().
The memory mapping of PCI device in the secondary process *must*
happen after the primary has finished doing the mapping as it
relies on information written by the primary.
The end result is that the secondary process waits longer,
until the primary has completed its PCI probing, and then
notifies the secondary process.
This race-condition became visible during the development of
a function that allows a secondary process to be polling until
a primary process exists. The secondary would then probe PCI
devices at the same time, causing an error during rte_eal_init()
Linux EAL:
Fixes: 916e4f4f4e ("memory: fix for multi process support")
BSD EAL:
Fixes: 764bf26873 ("add FreeBSD support")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
It deprecates sys files of 'extended_tag' and
'max_read_request_size' which was not documented.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Remove pci configuration of 'extended tag' and 'max read request
size', as they are not required by all devices and it lets PMD to
configure them if necessary.
In addition, 'pci_config_space_set()' is deprecated.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This was working fine because addresses of two structs are same:
struct A {
struct B b;
} a;
As above sample "a" and "b" has same address.
Now casting private data back to the correct struct type, to the one
stored.
Fixes: af75078fec ("first public release")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
CONFIG_RTE_LIBRTE_EAL_*APP can be replaced by CONFIG_RTE_EXEC_ENV_*APP.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
with only one hugepage or already sorted hugepage addresses, the sort
function called memcpy with same src and dst pointer. Debugging with
valgrind will issue a warning about overlapping area. This patch changes
the sort method to qsort to avoid this behavior. The separate sort
function is no longer necessary.
Suggested-by: Jay Rolette <rolette@infiniteio.com>
Signed-off-by: Ralf Hoffmann <ralf.hoffmann@allegro-packets.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Fix compile error when enable CONFIG_RTE_LIBEAL_USE_HPET.
Error messages:
lib/librte_eal/linuxapp/eal/eal_timer.c: In function ‘rte_eal_hpet_init’:
lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error:
implicit declaration of function ‘rte_thread_setname’
Fixes: badb3688ff ("eal/linux: fix build with glibc < 2.12")
Signed-off-by: Yi Lu <luyi68@live.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The version 2.3 has been renamed 16.04.
Fixes: 6d7de6d2e3 ("version: switch to year.month numbers")
Reported-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
rte_get_log_type and rte_get_log_level functions has been available
for many versions. But they are missing from the shared library map
and therefore do not get exported correctly.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Include vfio map/rd/wr support for pci ioport.
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
vfio_pci_mmap() try to map all pci bars. ioport region are not mapped in
vfio/kernel so ignore mmaping for ioport.
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
iopl() syscall not supported in linux-arm/arm64 so always return 0 value.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Most of the code is inspired on virtio driver.
rte_pci_ioport structure is filled at map time with anything needed for later
read / write calls.
At the moment, base field is used to store a x86 ioport (uint16_t) and will
be reused for other arches.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The patch c344eab3ee has moved the hardware definition of CPU flags.
Now the functions checking these hardware flags are also moved.
The function rte_cpu_get_flag_enabled() is no more inline.
The benefits are:
- remove rte_cpu_feature_table from the ABI (recently added)
- hide hardware details from the API
- allow to adapt structures per arch (done in next patch)
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The new function rte_cpu_get_flag_name() is added to the EAL API.
It is implemented (duplicated) in each arch because the next patch
will remove the public exposure of the feature tables.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
No need to split mbuf structure to two cache lines for 128-byte cache
line size targets as it can fit on a single 128-byte cache line.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
by default, all the targets will be configured with the 64-byte cache line
size, targets which have different cache line size can be overridden
through target specific config file.
Selected ThunderX and power8 as CONFIG_RTE_CACHE_LINE_SIZE=128 targets
based on existing configuration.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The file rte_config.h is automatically generated and included.
No need to #include it.
The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses in return a function like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Currently rte_eal_check_module() detects Linux kernel modules via reading
/proc/modules. Built-in ones aren't listed there and therefore they are not
being found.
Add support for checking built-in modules with parsing the sysfs files
This commit obsoletes the /proc/modules parsing approach.
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Normally we could set RTE_PCI_DRV_NEED_MAPPING flag so that eal will
invoke pci_map_device internally for us. From that point view, there
is no need to export pci_map_device.
However, for virtio pmd driver, which is designed to work without
binding UIO (or something similar first), pci_map_device() will fail,
which ends up with virtio pmd driver being skipped. Therefore, we can
not set RTE_PCI_DRV_NEED_MAPPING blindly at virtio pmd driver.
Therefore, this patch exports pci_map_device, and let virtio pmd call
it when necessary.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Move cpu_feature_table array from arch specific rte_cpuflags.h files to
new arch specific rte_cpuflags.c files.
Main motivation is to escape from static variable declarations in
header files. cpu_feature_table has many copies in final binary, even
exist in some object files that does not use this variable at all.
And this can be a sample to create architecture specific source files
and move some functions which are not performance sensitive from
architecture header files to source files.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit is adding a generic mechanism to support multiple IOMMU
types. For now, it's only type 1 (x86 IOMMU) and no-IOMMU (a special
VFIO mode that doesn't use IOMMU at all), but it's easily extended
by adding necessary definitions to eal_vfio.h, and DMA mapping
functions to eal_pci_vfio.c.
Since type 1 IOMMU module is no longer necessary to have VFIO,
we fix the module check to check for vfio-pci instead. It's not
ideal and triggers VFIO checks more often (and thus produces more
error output, which was the reason behind the module check in the
first place), so we compensate for that by providing more verbose
logging, indicating whether VFIO initialization has succeeded or
failed.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
The kernel fills new allocated (huge) pages with zeros.
DPDK just has to populate page tables to trigger the allocation.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Changing from 1/2 second to 1/10 doesn't compromise the precision,
and a 4/10 second is worth saving.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
test_mp_secondary was initially added by mistake.
rte_snprintf has been removed.
Fixes: 9d41beed24 ("lib: provide initial versioning")
Fixes: 3185322809 ("eal: remove rte_snprintf")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In eal_intr_proc_rxtx_intr, negative value may be used as argument to
a function expecting a positive value. If 'read' returns EAGAIN as
example, the bytes_read updates to a negative value which continue
be passed as argument for the next 'read'.
Coverity issue: 107115
Function read(fd, &buf, bytes_read) returns a negative number.
Assigning: signed variable bytes_read = read.
CID 107115 (#1 of 1): Argument cannot be negative
(NEGATIVE_RETURNS) bytes_read is passed to a parameter
that cannot be negative.
bytes_read = read(fd, &buf, bytes_read);
Fixes: c9f3ec1a0f ("eal/linux: add Rx interrupt control function")
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The current implementation of VFIO will not with the new no-IOMMU mode
in 4.4 kernel. The original code assumed that IOMMU group zero would
never be used. Group numbers are assigned starting at zero, and up
until now the group numbers came from the hardware which is likely
to use group 0 for system devices that are not used with DPDK.
The fix is to allow 0 as a valid group and rearrange code
to split the return value from the group value.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
RHEL 7.2 contains additional backports from newer upstream kernels.
Add RHEL_RELEASE_CODE logic for RHEL_RELEASE_VERSION(7,2) to pick up
the changes to kernel functions.
Signed-off-by: Lee Roberts <lee.roberts@hpe.com>
There is a new function in the EAL API for internal use.
It has neither a proper prefix nor a .map export:
libethdev.so: undefined reference to `is_xen_dom0_supported'
Fixes: 719dbebceb ("xen: allow determining DOM0 at runtime")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
pthread_setname_np() function added in glibc 2.12, using this function
in older glibc versions cause compile error:
error: implicit declaration of function "pthread_setname_np"
This patch adds "rte_thread_setname" macro and set it according
glibc >= 2.12 check, thread naming disabled for older glibc versions,
glibc versions that support "pthread_setname_np" will keep using this
function.
Fixes: 67b6d3039e ("eal: set name to threads")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
To get pci_dev and vf number from dev, benefit from
existing macros in pci.h
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
[Thomas note: it breaks the old 2.6.33 support]
Fixes following error when Ubuntu 12.04 uses kernel 3.13.0-30-generic,
since skb_set_hash() is implemented in the kernel from 3.13.0-30,
which is declared as UBUNTU_KERNEL_VERSION(3,13,0,30,0) and not
UBUNTU_KERNEL_VERSION(3,13,0,30,54)
In file included
from /usr/src/linux-headers-3.13.0-30-generic/include/linux/if_ether.h:23:0,
from /tmp/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_osdep.h:39,
from /tmp/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_hw.h:31,
from /tmp/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_api.h:31,
from /tmp/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_mbx.h:31,
from /tmp/dpdk/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/kni/e1000_mbx.c:28:
/usr/src/linux-headers-3.13.0-30-generic/include/linux/skbuff.h:740:1:
note: previous definition of ‘skb_set_hash’ was here
skb_set_hash(struct sk_buff *skb, __u32 hash, enum pkt_hash_types type)
^
Fixes: e88c3b0a ("kni: fix build on Ubuntu 12.04.5")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
CLOCK_MONOTONIC_RAW added in glibc 2.12, using this define in older
glibc versions cause compile error:
'error: identifier "CLOCK_MONOTONIC_RAW" is undefined'
This patch replaces "CLOCK_MONOTONIC_RAW" with "CLOCK_MONOTONIC" for
older glibc versions, versions that support "CLOCK_MONOTONIC_RAW"
will keep using this clock type.
Fixes: d08d304508 ("eal/linux: make alarm not affected by system time jump")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Adds functions for detecting and reporting the live-ness of LCores,
the primary requirement of which is minimal overheads for the
core(s) being checked. Core failures are notified via an application
defined callback.
Signed-off-by: Remy Horton <remy.horton@intel.com>
It fixes the compile issue on kernel version 2.6.32 or old ones.
Error logs:
lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: unknown field id specified in initializer
lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: excess elements in struct initializer
lib/librte_eal/linuxapp/kni/kni_misc.c:121: error: (near initialization for kni_net_ops)
lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: unknown field size specified in initializer
lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: excess elements in struct initializer
lib/librte_eal/linuxapp/kni/kni_misc.c:122: error: (near initialization for kni_net_ops)
Fixes: 72a7a2b246 ("kni: allow per-net instances")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
/proc/version_signature is the version for the host machine, but in
e.g., chroots, this does not necessarily match that DPDK is built
for. DPDK will then build for the wrong kernel version - that of the
server, and not that installed in the (build) chroot.
The patch uses utsrelease.h from the kernel sources instead and fakes
the upload version.
Tested on a server with Ubuntu 12.04, building in a chroot for Ubuntu
14.04.
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Acked-by: Helin Zhang <helin.zhang@intel.com>
There's no good reason to limit plugins to Linux, make it available
on FreeBSD too. Refactor the plugin code from Linux EAL to common
helper functions, also check for and fail on errors during initialization.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Fix to take this change into account: https://lkml.org/lkml/2015/7/9/101
Has been applied to Kernel 4.3.0-rc6
Linux: 4a7cc831 ("genirq/MSI: Move msi_list from struct pci_dev to struct device")
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Someone may need to call rte_eal_init() with a fake argc/argv array
in the middle of using getopt() to parse its own unrelated argc/argv
parameters. So getopt lib shouldn't be reset by rte_eal_init().
Now eal will always save optind, optarg and optopt (and optreset on
FreeBSD) at the beginning, initialize optind (and optreset on FreeBSD)
to 1 before calling getopt_long(), then restore all values after.
Suggested-by: Don Provan <dprovan@bivio.net>
Suggested-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Tiwei Bie <btw@mail.ustc.edu.cn>
Reviewed-by: Don Provan <dprovan@bivio.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
VFIO allows multiple MSI-X vector, others doesn't, but maybe will
allow it in the future.
Device drivers need to be aware of the capability.
It's better to avoid condition check on interrupt type (VFIO) everywhere,
instead a capability api is more flexible for the condition change.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The patch adds condition check to avoid enable nothing.
In disable state, both max_intr and nb_efd are zero.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
During VFIO_DEVICE_SET_IRQS, the previous order is
{Q0_fd, ... Qn_fd, misc_fd}.
The vector number of misc is indeterminable which is
ugly to some NIC (e.g. i40e, fm10k).
The patch adjusts the order in {misc_fd, Q0_fd, ... Qn_fd},
always reserve the first vector to misc interrupt.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Some comments have a wrong space between /** and <.
Seen with
git grep '\*\* <'
Reported-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Kernel 4.2 has introduced two new parameters in ndo_bridge_getlink,
which breaks DPDK compilation.
Linux: 7d4f8d87 ("switchdev: ad VLAN support for ports bridge-getlink")
This patch adds the necessary checks to fix it.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Rename HAVE_NDO_BRIDGE_GETLINK_FILTER_MASK macro for
a more meaningful HAVE_NDO_BRIDGE_GETLINK_NLFLAGS,
as the macro is used to know if igb_ndo_bridge_getlink
function has nlflags parameter.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
There is a global variable 'device_in_use' which is used to make sure
only one instance is using /dev/kni device. If you were using LXC, you
will find there is only one instance of KNI example could be run even
different namespaces were created.
In order to have /dev/kni used simultaneously in different namespaces,
making all of global variables as per network namespace variables.
With regard to single kernel thread mode, there will be one kernel
thread for each of network namespace.
Signed-off-by: Dex Chen <dex.chen@ruckuswireless.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
When an application using huge-pages crash or exists, the hugetlbfs
backing files are not cleaned up. This is a patch to clean those files.
There are multi-process DPDK applications that may be benefited by those
backing files. Therefore, I have made that configurable so that the
application that does not need those backing files can remove them, thus
not changing the current default behavior. The application itself can
clean it up, however the rationale behind DPDK cleaning it up is, DPDK
created it and therefore, it is better it unlinks it.
Signed-off-by: Shesha Sreenivasamurthy <shesha@cisco.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
This patch adds support for pthread_setname_np on Linux and
pthread_set_name_np on FreeBSD.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: add name in tep_termination example]
Add RTE_INTR_HANDLE_EXT handler type for PMDs that do not support VFIO or
UIO. Those are expected to manage the file descriptor themselves.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>