1315 Commits

Author SHA1 Message Date
Gaetan Rivet
2a46c2a2f2 eal/x86: include common header
The macro RTE_SET_USED is defined in rte_common.h

This header is included through eal_private.h, which includes in turn
rte_pci.h

Once the PCI subsystem is out of the EAL, this will break the
compilation (seen on FreeBSD).

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-26 23:17:31 +02:00
Gaetan Rivet
b27a200d03 eal: include stdint in private header
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-26 23:17:31 +02:00
Gaetan Rivet
0af0c40d41 bus: include debug header
This header is included through rte_pci.h, which will be removed once
the PCI bus is moved out of the EAL.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-26 23:17:31 +02:00
Ferruh Yigit
65d3ba3264 eal: fix build with glibc < 2.12
build error:
  CC rte_cycles.o
  cc1: warnings being treated as errors
  ...dpdk/lib/librte_eal/common/arch/x86/rte_cycles.c: In function
  ‘rdmsr’:
  ...dpdk/lib/librte_eal/common/arch/x86/rte_cycles.c:67:2: error:
  implicit declaration of function ‘pread’
  ...dpdk/lib/librte_eal/common/arch/x86/rte_cycles.c:67:2: error:
  nested extern declaration of ‘pread’

from pread man page:
pread(), pwrite():
   _XOPEN_SOURCE >= 500
   || /* Since glibc 2.12: */ _POSIX_C_SOURCE >= 200809L

For glibc < 2.12 _XOPEN_SOURCE >= 500 is required.

Adding _GNU_SOURCE define to the file which implies _XOPEN_SOURCE=700
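
As a minimal sketch of this pattern (not the actual rte_cycles.c code), the define has to appear before the first system header so that <unistd.h> declares pread(). The helper name read_u64_at is hypothetical and only serves the illustration:

```c
/* Must precede every system header, otherwise it has no effect. */
#define _GNU_SOURCE
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read 8 bytes at a given offset of a file.
 * Returns 0 on success, -1 on failure.
 */
static int
read_u64_at(const char *path, off_t offset, uint64_t *val)
{
	int fd = open(path, O_RDONLY);
	ssize_t n;

	if (fd < 0)
		return -1;
	/* pread() does not move the file offset, unlike lseek() + read(). */
	n = pread(fd, val, sizeof(*val), offset);
	close(fd);
	return n == (ssize_t)sizeof(*val) ? 0 : -1;
}
```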

Fixes: ad3516bb4ae1 ("eal/x86: implement arch-specific TSC freq query")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-26 23:08:13 +02:00
Gaetan Rivet
51093e679b pci: propagate PMD removal error value for unplug
If removing a PCI device during detach fails, return the actual error
value of the removal operation.

Use this value within pci->unplug, as it may help applications solve an
issue with the feature or more accurately warn their users.

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-26 02:33:01 +02:00
Harry van Haaren
b17b952ec3 service: allow to disable core check
This commit adds a new function to disable the runtime mapped
service-cores check. This allows an application to take responsibility
of running unmapped services.

This feature is useful in cases like unit tests, where the application
code (or unit test in this case) requires accurate control over when
the service function is called to ensure correct behaviour, and when
an application has an advanced use-case and wishes to manage services
manually.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
2017-10-25 17:05:38 +02:00
Harry van Haaren
e9139a32f6 service: add function to run on app lcore
This commit adds a new function which allows an application lcore
(i.e. *not* a dedicated service-lcore) to run a service. This
function is required to allow applications to gradually port
functionality to using services, while still being able to run
ordinary application functions.

This requirement became clear when a patch to the existing
eventdev/pipeline sample app was modified to use a service-core
for scheduling - and that same core should be capable of running
the "worker()" function from the application too.

This patch refactors the existing running code into a smaller
"service_run" function, which can be called by the dedicated
service-core loop, and the newly added function.

[1] http://dpdk.org/ml/archives/dev/2017-October/079876.html

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
2017-10-25 17:04:52 +02:00
Gaetan Rivet
96d9dd74cd bus: skip useless iterations in find function
The starting point is known. The iterator can be directly set to it.

The function rte_bus_find can easily be used with a comparison function
always returning True. This would make it a regular bus iterator.

Users doing so would however accomplish such iteration in

   O(N * N/2) = O(N^2)

Which can be avoided.
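
The iteration pattern described above can be sketched as follows. This is a simplified stand-in, not the rte_bus_find implementation: the struct, the list layout and the convention that the comparison function returns 0 on a match are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical, simplified bus object kept in a singly linked list. */
struct bus {
	const char *name;
	struct bus *next;
};

/* Assumed convention for this sketch: return 0 when the bus matches. */
typedef int (*bus_cmp_t)(const struct bus *bus, const void *data);

/*
 * Return the first matching bus located after 'start', or search from
 * 'head' when 'start' is NULL. The iterator is set directly to the
 * starting point instead of walking the list from the head to find it.
 */
static struct bus *
bus_find(struct bus *head, const struct bus *start, bus_cmp_t cmp,
	 const void *data)
{
	struct bus *bus = (start != NULL) ? start->next : head;

	for (; bus != NULL; bus = bus->next)
		if (cmp(bus, data) == 0)
			return bus;
	return NULL;
}

/* Comparison function that matches every bus. */
static int
match_any(const struct bus *bus, const void *data)
{
	(void)bus;
	(void)data;
	return 0;
}

/* Using bus_find() as a plain iterator now costs O(N) in total. */
static unsigned int
bus_count(struct bus *head)
{
	unsigned int n = 0;
	struct bus *bus = NULL;

	while ((bus = bus_find(head, bus, match_any, NULL)) != NULL)
		n++;
	return n;
}
```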

Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-25 13:12:36 +02:00
Ferruh Yigit
f73b38e924 igb_uio: remove device reset in open
Remove the device reset during application start; the reset on
application exit is still there.

The reset in open is removed because of the following comments:
1- Device reset is not completed when the VF driver is loaded, which
   causes a VF PMD initialization error.
   Adding a delay can solve the issue but will increase driver load time.

2- Reset is issued to all devices unconditionally, which is not very
   efficient.

Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Harish Patil <harish.patil@cavium.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Tested-by: Jingjing Wu <jingjing.wu@intel.com>
2017-10-24 22:34:44 +01:00
Bruce Richardson
ce877d8718 eal: use a single version map file
Since the functions exported by DPDK EAL on all OS's should be
identical, we should not need separate function version files for each
OS. Therefore move existing version files to the top-level EAL
directory.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Bruce Richardson
1ae0a97f14 eal: mark internal interrupts file as such to doxygen
Put a file-level comment on rte_eal_interrupts.h to mark it as an
internal only header.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Bruce Richardson
6b8e8cd87f eal: merge bsdapp and linuxapp interrupt headers
The linuxapp and bsdapp interrupt header files are now identical, so
merge them into a common file in common/include.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Bruce Richardson
3a207c8ff4 eal/bsd: fix missing interrupt stub functions
A number of interrupt functions only existed on Linux. Adding in stubs
for these functions corrects this omission, and allows the map files for
both Linux and FreeBSD to be identical.

Fixes: 9efe9c6cdcac ("eal/linux: add epoll wrappers")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Bruce Richardson
d47598fdcf eal/bsd: align interrupt header file with Linux version
The bsdapp-specific rte_interrupts.h file does not need to be different
from the linuxapp one, as there is nothing Linux specific in the APIs or
data structures. This will then allow us to merge the files in a common
location to avoid duplication.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Bruce Richardson
d6a4399cdf eal: avoid error for non-existent default PMD path
If the default location for the PMD .so files does not exist, it should
not be treated as a fatal error condition like an incorrect path on the
command line. Therefore check that the path exists and is a directory
before adding it to the list of paths to check for PMDs.
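
A check of this kind can be sketched with stat() and S_ISDIR(). The function name is_directory is hypothetical and the code is an illustration of the idea, not the code added by this commit:

```c
#include <sys/stat.h>

/*
 * Return 1 if 'path' exists and is a directory, 0 otherwise.
 * A missing path is not an error here; the caller simply skips it.
 */
static int
is_directory(const char *path)
{
	struct stat sb;

	if (stat(path, &sb) != 0)
		return 0;
	return S_ISDIR(sb.st_mode) ? 1 : 0;
}
```

A caller would then add the default PMD path to its search list only when is_directory() returns 1, while a path given explicitly on the command line can still be treated as fatal if it is wrong.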

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-24 01:24:22 +02:00
Olivier Matz
091eaac258 log: remove deprecated functions
Remove rte_set_log_level(), rte_get_log_level(),
rte_set_log_type(), and rte_get_log_type().

Also update librte_eal.so version in documentation.
The LIBABIVER variable in eal has already been modified in
commit f26ab687a74f ("eal: remove Xen dom0 support").

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2017-10-24 01:24:21 +02:00
Anatoly Burakov
69f7504949 vfio: fix secondary process initialization
When getting group fd from primary process, secondary wasn't storing
the fd anywhere, leading to a (harmless) error message in EAL logs,
and (not so harmless) potential problems when hot-unplugging devices
managed by VFIO in a secondary process.

Fix it by actually storing the group fd whenever we get a valid one
from the secondary process.

Fixes: 94c0776b1bad ("vfio: support hotplug")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-10-24 00:38:24 +02:00
Hemant Agrawal
229f351a63 vfio: enable independently of PCI bus
VFIO may be used by buses other than PCI. This patch enables
the VFIO on the basis of vfio root presence.

Since vfio_enable should be called only once, pci_vfio_enable
is also removed.

A debug print is added in case vfio_pci module is not present.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-10-24 00:32:20 +02:00
Jingjing Wu
6b9ed026a8 igb_uio: fix build with kernel <= 3.17
Compile fails when kernel version is <= 3.17 with error:
"dereferencing pointer to incomplete type". This is because struct
uio_device definition is not exposed in kernel earlier than 3.17.

This patch fixes it by using pointer of rte_uio_pci_dev as
dev_id instead of uio_device for irq device handler.

Fixes: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in VM")
Cc: stable@dpdk.org

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-10-16 13:07:11 +02:00
Thomas Monjalon
87607f45bd version: 17.11-rc1
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2017-10-14 01:29:59 +02:00
Nirmoy Das
47b1119fd1 kni: fix build on SLE12 SP3
build error:
build/lib/librte_eal/linuxapp/kni/kni_net.c:215:5: error:
‘struct net_device’ has no member named ‘trans_start’
  dev->trans_start = jiffies;

Signed-off-by: Nirmoy Das <ndas@suse.de>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-13 23:12:18 +01:00
Jingjing Wu
5f6ff30dc5 igb_uio: fix interrupt enablement after FLR in VM
If a VF is passed through to a Qemu VM by vfio-pci, then after FLR
in the VM the interrupt setting is not recovered correctly
on the host, as shown below:
 in VM guest:
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
 in Host:
        Capabilities: [70] MSI-X: Enable+ Count=5 Masked-

That was because pci_reset_function first reads the PCI configuration
space, issues the FLR, and then writes the PCI configuration space back
as restoration. But not all of these writes reach the host successfully,
because the vfio-pci driver doesn't allow direct writes to the PCI MSI-X
capability.

To fix this issue, we need to move the interrupt enablement from
igb_uio probe to the opening of the device file. This is also similar to
the behaviour of the vfio_pci kernel module code.

Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
Cc: stable@dpdk.org

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-13 22:35:29 +01:00
Markus Theil
6f4600eba9 igb_uio: fix legacy MSI masking
MSI masks contain a 1 if interrupt is masked, 0 if unmasked.
I got that wrong with the !!state calculation. For better
readability, the mask is now changed like in igbuio_msi_mask_irq.
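
The polarity described above can be illustrated with a small sketch. It only models the mask bits in a plain integer; the function name and the assumption that the vector number is below 32 are hypothetical, and no PCI config access is shown:

```c
#include <stdint.h>

/*
 * Return the new MSI mask bits for one vector (assumed < 32).
 * A set bit means the interrupt is masked, a cleared bit means unmasked.
 */
static uint32_t
msi_set_mask_bit(uint32_t mask_bits, unsigned int vector, int masked)
{
	uint32_t flag = UINT32_C(1) << vector;

	if (masked)
		return mask_bits | flag;
	return mask_bits & ~flag;
}
```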

Fixes: a8ea1e5fb647 ("igb_uio: fix unknown MSI symbols")

Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Tested-by: Markus Theil <markus.theil@tu-ilmenau.de>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-13 21:57:48 +02:00
Ferruh Yigit
a8ea1e5fb6 igb_uio: fix unknown MSI symbols
This patch partially reverts the commit d196343a258e and adds some
functions from Markus' previous version of the patch [1].

igb_uio uses pci_msi_unmask_irq() and pci_msi_mask_irq() kernel APIs
when kernel version is >= 3.19 because these APIs are implemented in
this Linux kernel version.

But these APIs are only exported beginning from Linux kernel 4.5, so before
this Linux kernel version the igb_uio kernel module is not usable
and gives the following warnings:
"igb_uio: Unknown symbol pci_msi_unmask_irq"
"igb_uio: Unknown symbol pci_msi_mask_irq"

The kernel version required for these APIs is raised to Linux kernel >= 4.5.

For older versions of the Linux kernel, unmask_msi_irq() and mask_msi_irq()
are used, but these functions are not exported at all.
Instead of these functions, switch back to the previous implementation in
igb_uio for MSI-X, and for MSI use igbuio_msi_mask_irq() from [1].

[1]
http://dpdk.org/dev/patchwork/patch/28144/

Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-13 15:50:13 +02:00
Santosh Shukla
07a6f5c2d3 eal: call plugin init before device parse
Default eal_init code calls
0. eal_plugins_init
1. eal_option_device_parse
2. rte_bus_scan

The IOVA commit cf408c224 failed to call eal_plugins_init before
eal_option_device_parse and rte_bus_scan, which introduced the
regression below in shared library mode:

with CONFIG_RTE_BUILD_SHARED_LIB=y:

'net_vhost0,iface=/tmp/vhost-user2' -d ./install/lib/librte_pmd_vhost.so
-- --portmask=1 --disable-hw-vlan -i --rxq=1 --txq=1 --nb-cores=1
--eth-peer=0,52:54:00:11:22:12
EAL: Detected 4 lcore(s)
ERROR: failed to parse device "net_vhost0"
EAL: Unable to parse device 'net_vhost0,iface=/tmp/vhost-user2'
PANIC in main():
Cannot init EAL

Fixes: cf408c224 ("eal: auto detect IOVA mode")

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-10-13 15:38:30 +02:00
Xiaoyun Li
84cc318424 eal/x86: select optimized memcpy at run-time
This patch dynamically selects the memcpy implementation at run-time based
on the CPU flags that the current machine supports. It uses function
pointers which are bound to the relevant functions at constructor time.
In addition, the AVX512 instruction set code is compiled only if the user
enables it in the config and the compiler supports it.
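
The general technique (a function pointer bound in a constructor according to CPU flags) can be sketched as below. This is not the rte_memcpy code: all names are hypothetical, the "AVX2" variant is a stand-in that just calls memcpy(), and the CPU check assumes the GCC/Clang builtins __builtin_cpu_init() and __builtin_cpu_supports() on x86.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Baseline implementation, always available. */
static void *
copy_generic(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Stand-in for a vectorized variant; a real one would use AVX2 code. */
static void *
copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Bound once, before main(), by the constructor below. */
static copy_fn_t copy_ptr = copy_generic;

static void __attribute__((constructor))
copy_select(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	/* Required before __builtin_cpu_supports() inside a constructor. */
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx2"))
		copy_ptr = copy_avx2;
#endif
}

/* Callers go through the pointer, so the choice costs one indirect call. */
static inline void *
copy_dispatch(void *dst, const void *src, size_t n)
{
	return copy_ptr(dst, src, n);
}
```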

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
2017-10-13 15:20:50 +02:00
Pablo de Lara
ea39ca97d2 eal/x86: fix FreeBSD build
lib/librte_eal/common/arch/x86/rte_cycles.c: In function 'rdmsr':
lib/librte_eal/common/arch/x86/rte_cycles.c:57:11:
error: unused parameter 'msr' [-Werror=unused-parameter]
 rdmsr(int msr, uint64_t *val)
           ^
lib/librte_eal/common/arch/x86/rte_cycles.c:57:26:
error: unused parameter 'val' [-Werror=unused-parameter]
 rdmsr(int msr, uint64_t *val)
                          ^
Fixes: ad3516bb4ae1 ("eal/x86: implement arch-specific TSC freq query")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2017-10-13 15:20:50 +02:00
Sergio Gonzalez Monroy
ad3516bb4a eal/x86: implement arch-specific TSC freq query
First, try to use CPUID Time Stamp Counter and Nominal Core Crystal
Clock Information Leaf to determine the tsc hz on platforms that
support it (does not require privileged user).

If the CPUID leaf is not available, then try to determine the tsc hz by
reading the MSR 0xCE (requires privileged user).

Default to the tsc hz estimation if both methods fail.
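
For the CPUID path, the arithmetic can be sketched as follows, assuming the register layout of leaf 0x15 (EAX = ratio denominator, EBX = ratio numerator, ECX = crystal clock in Hz). The function name is hypothetical, and the register values are passed in rather than read with the CPUID instruction so that the sketch stays portable:

```c
#include <stdint.h>

/*
 * Derive the TSC frequency from CPUID leaf 0x15 register values.
 * Returns 0 when the leaf does not enumerate the needed fields, so the
 * caller can fall back to another method.
 */
static uint64_t
tsc_hz_from_cpuid15(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
	if (eax == 0 || ebx == 0 || ecx == 0)
		return 0;
	/* TSC Hz = crystal Hz * numerator / denominator */
	return (uint64_t)ecx * ebx / eax;
}
```

For example, a 24 MHz crystal with a ratio of 176/2 yields 2112000000 Hz.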

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-13 13:07:38 +02:00
Jerin Jacob
15692396fd eal/ppc64: implement arch-specific TSC freq query
In ppc_64, rte_rdtsc() returns timebase register value which increments
at independent timebase frequency and hence not related to lcore cpu
frequency to derive TSC hz. Hence, we stick with master lcore frequency.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
2017-10-13 13:07:38 +02:00
Jerin Jacob
583152bc81 eal/armv8: implement arch-specific TSC freq query
Use cntvct_el0 system register to get the system counter frequency.

If the system is configured with RTE_ARM_EAL_RDTSC_USE_PMU then
return 0 (let the common code calibrate the tsc frequency).

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
2017-10-13 13:07:23 +02:00
Jerin Jacob
3dbc565e81 timer: honor arch-specific TSC frequency query
When calibrating the TSC frequency, first, probe the architecture specific
function. If not available, use the existing calibrate scheme.
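
The probe-then-fallback order can be sketched like this. The names are hypothetical and the convention that an arch-specific query returns 0 when it cannot determine the frequency is an assumption of the sketch; the two stub functions only exist to make the example self-contained:

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t (*tsc_hz_fn_t)(void);

/*
 * Ask the architecture-specific query first; if it is missing or
 * returns 0, use the generic calibration instead.
 */
static uint64_t
resolve_tsc_hz(tsc_hz_fn_t arch_query, tsc_hz_fn_t calibrate)
{
	uint64_t hz = (arch_query != NULL) ? arch_query() : 0;

	if (hz == 0)
		hz = calibrate();
	return hz;
}

/* Stub: an architecture that cannot report its frequency. */
static uint64_t
arch_unavailable(void)
{
	return 0;
}

/* Stub: a calibration loop that measured 2 GHz. */
static uint64_t
fake_calibrate(void)
{
	return UINT64_C(2000000000);
}
```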

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2017-10-13 13:07:17 +02:00
Pavan Bhagavatula
bc48589e47 eal: move bitmap from sched library
The librte_sched uses rte_bitmap to manage large arrays of bits in an
optimized way, so moving it to eal/common allows other libraries
and applications to use it.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2017-10-12 22:31:33 +02:00
Luca Boccassi
f642036f3a mk: sort headers before wildcard inclusion
In order to achieve fully reproducible builds, always use the same
inclusion order for headers in the Makefiles.

Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com>
2017-10-12 22:31:33 +02:00
Jianfeng Tan
7100dfe381 mem: honor IOVA mode for no-huge case
With the introduction of IOVA mode, the only blocker to run
with 4KB pages for NICs binding to vfio-pci, is that
RTE_BAD_PHYS_ADDR is not a valid IOVA address.

We can refine this by using VA as IOVA if it's IOVA mode.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2017-10-12 21:05:21 +01:00
Stephen Hemminger
2cb43002af ethdev: increase device internal name length
Allow sufficient space for UUID in string form (36+1).
Needed to use UUID with Hyper-V.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-12 01:36:58 +01:00
Jiayu Hu
119583797b gso: support TCP/IPv4 GSO
This patch adds GSO support for TCP/IPv4 packets. Supported packets
may include a single VLAN tag. TCP/IPv4 GSO doesn't check if input
packets have correct checksums, and doesn't update checksums for
output packets (the responsibility for this lies with the application).
Additionally, TCP/IPv4 GSO doesn't process IP fragmented packets.

TCP/IPv4 GSO uses two chained MBUFs, one direct MBUF and one indirect
MBUF, to organize an output packet. Note that we refer to these two
chained MBUFs as a two-segment MBUF. The direct MBUF stores the packet
header, while the indirect mbuf simply points to a location within the
original packet's payload. Consequently, use of the GSO library requires
multi-segment MBUF support in the TX functions of the NIC driver.

If a packet is GSO'd, TCP/IPv4 GSO reduces its MBUF refcnt by 1. As a
result, when all of its GSOed segments are freed, the packet is freed
automatically.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
2017-10-12 01:36:57 +01:00
Shreyansh Jain
63bdef1827 bus: ignore scan and probe failures
Bus scan is responsible for finding devices over *all* buses.
Some of these buses might not be able to scan but that should
not prevent other buses from being scanned.

Same is the case for probing. It is possible that some devices which
were scanned didn't have a specific driver. That should not prevent
other buses from being probed.
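
The "keep going on failure" loop can be sketched as below. This is an illustration with hypothetical names, not the EAL code: each bus is reduced to a scan callback, and the stub callbacks count how often they run so the behaviour is observable.

```c
#include <stddef.h>

typedef int (*bus_scan_t)(void);

/* Number of scan callbacks executed; only used by the stubs below. */
static unsigned int scans_run;

/*
 * Scan every bus. A failing bus is counted, but the loop continues so
 * that the remaining buses are still scanned.
 */
static unsigned int
scan_all(const bus_scan_t *scans, size_t n)
{
	unsigned int failed = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (scans[i]() != 0)
			failed++;
	return failed;
}

/* Stub: a bus whose scan succeeds. */
static int
scan_ok(void)
{
	scans_run++;
	return 0;
}

/* Stub: a bus that cannot scan. */
static int
scan_broken(void)
{
	scans_run++;
	return -1;
}
```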

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2017-10-12 00:29:06 +02:00
Pavan Nikhilesh
78666372fa eal: add function to check lcore role
This function can be used to check the role of a specific lcore.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2017-10-11 22:30:16 +02:00
Sergio Gonzalez Monroy
5b618b5b29 eal/x86: use cpuid builtin
GCC does have the __get_cpuid_count builtin which checks for maximum
supported leaf, but implementations differ between CLANG and GCC.

This change provides an implementation compatible with both GCC and
CLANG 3.4+.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2017-10-11 21:59:56 +02:00
Jonas Pfefferle
33604c3135 vfio: refactor PCI BAR mapping
Split pci_vfio_map_resource for primary and secondary processes.
Save all relevant mapping data in primary process to allow
the secondary process to perform mappings.

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2017-10-10 15:37:58 +02:00
Jonas Pfefferle
ed1e7e576b vfio: fix sPAPR IOMMU DMA window size
DMA window size needs to be big enough to span all memory segment's
physical addresses. We do not need multiple levels of IOMMU tables
as we already span ~70TB of physical memory with 16MB hugepages.

Signed-off-by: Jonas Pfefferle <jpf@zurich.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2017-10-10 15:36:04 +02:00
Patrick MacArthur
e3f141879e eal: copy raw strings taken from command line
Normally, command line argument strings are considered immutable, but
SPDK [1] and urdma [2] construct argv arrays to pass to rte_eal_init().
These strings are allocated using malloc() and freed after DPDK
initialization with free(). However, in the case of --file-prefix and
--huge-dir, DPDK takes the pointer to these strings in argv directly. If
a secondary process calls rte_eal_pci_probe() after rte_eal_init()
returns, as is done by SPDK, this causes a use-after-free error because
the strings have been freed by the calling code immediately after
rte_eal_init() returns.

This problem was observed when running SPDK example programs as a
secondary process and causes the secondary processes to fail:

Starting DPDK 16.11.1 initialization...
[ DPDK EAL parameters: identify -c 4 --file-prefix=spdk3260 --base-virtaddr=0x1000000000 --proc-type=auto ]
EAL: Detected 40 lcore(s)
EAL: Auto-detected process type: SECONDARY
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL:   probe driver: 8086:953 spdk_nvme
EAL:   cannot connect to primary process!
EAL: Error - exiting with code: 1
Cause: Requested device 0000:81:00.0 cannot be used

Running strace shows that the file prefix has been zero'd out by the
time that the secondary process attempts to probe the NVMe device.

The use-after-free errors can be easily detected with valgrind:

==8489== Invalid read of size 1
==8489==    at 0x4C30D22: strlen (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x58DB955: vfprintf (vfprintf.c:1637)
==8489==    by 0x59A4685: __vsnprintf_chk (vsnprintf_chk.c:63)
==8489==    by 0x59A45E7: __snprintf_chk (snprintf_chk.c:34)
==8489==    by 0x1246AB: get_socket_path.constprop.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x124B09: vfio_mp_sync_connect_to_primary (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x123BE4: vfio_get_group_fd.part.1 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x124366: vfio_setup_device (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x126C8A: pci_vfio_map_resource (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x12B115: pci_probe_all_drivers.part.0 (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x12B596: rte_eal_pci_probe (in /home/pmacarth/src/spdk/examples/nvme/identify/identify)
==8489==    by 0x11D5B5: spdk_pci_enumerate (pci.c:147)
==8489==  Address 0x63f362e is 14 bytes inside a block of size 32 free'd
==8489==    at 0x4C2ED5B: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x11E6FB: spdk_free_args (init.c:136)
==8489==    by 0x11EBF5: spdk_env_init (init.c:309)
==8489==    by 0x10D2AA: main (identify.c:976)
==8489==  Block was alloc'd at
==8489==    at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8489==    by 0x11E7D7: _sprintf_alloc (init.c:76)
==8489==    by 0x11EA78: spdk_build_eal_cmdline (init.c:251)
==8489==    by 0x11EA78: spdk_env_init (init.c:282)
==8489==    by 0x10D2AA: main (identify.c:976)
==8489==

Fix this by using strdup() to create separate memory buffers for these
strings. Note that this patch will cause valgrind to report memory
leaks of these buffers as there is nowhere to free them. Using static
buffers is an option but would make these strings have a fixed maximum
length whereas there is currently no limit defined by the API.
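
The strdup() approach can be sketched as follows. The struct and function names are hypothetical and do not mirror the EAL internal config; the point is only that the stored string is a private copy that stays valid after the caller frees its argv entry.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() */
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the internal configuration. */
struct config {
	char *file_prefix;
};

/*
 * Store a private copy of 'arg' instead of the caller's pointer.
 * Returns 0 on success, -1 if the copy cannot be allocated.
 */
static int
config_set_file_prefix(struct config *cfg, const char *arg)
{
	char *copy = strdup(arg);

	if (copy == NULL)
		return -1;
	free(cfg->file_prefix); /* free(NULL) is a no-op */
	cfg->file_prefix = copy;
	return 0;
}
```

Unlike this sketch, which frees a previous value on reassignment, the commit notes that the real buffers have no natural place to be freed, hence the valgrind leak reports.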

[1] http://spdk.io
[2] https://github.com/zrlio/urdma

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Patrick MacArthur <patrick@patrickmacarthur.net>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:25:13 +02:00
Seth Howell
7485e06c2a mem: check mmap failure
If mmap fails, it will return the value MAP_FAILED. Checking for this
return code allows us to properly identify mmap failures and report
them as such to the calling function.
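
A minimal sketch of the check, using an anonymous mapping and a hypothetical wrapper name. The key point is that mmap() signals failure with MAP_FAILED, not with NULL:

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map 'len' bytes of anonymous memory.
 * Returns the mapping address, or NULL if mmap() failed.
 */
static void *
map_anonymous(size_t len)
{
	void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	/* MAP_FAILED is (void *)-1, so a NULL check would miss the error. */
	if (addr == MAP_FAILED)
		return NULL;
	return addr;
}
```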

Signed-off-by: Seth Howell <seth.howell@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:17:04 +02:00
Xueming Li
41baec55a8 mem: fix malloc element free in debug mode
malloc_elem_free() is clearing(setting to 0) the trailer cookie when
RTE_MALLOC_DEBUG is enabled. In case of joining free neighbor element,
part of joined memory is not getting cleared due to missing the length
of trailer cookie in the middle.

This patch fixes calculation of free memory length to be cleared in
malloc_elem_free() by including trailer cookie.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:15:45 +02:00
Xueming Li
3cd4e0e883 mem: fix malloc debug config
This patch replaces broken macro RTE_LIBRTE_MALLOC_DEBUG with
RTE_MALLOC_DEBUG.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2017-10-09 23:15:45 +02:00
Xueming Li
f385306357 config: add option to enable asserts
Currently, enabling assertions requires setting CONFIG_RTE_LOG_LEVEL to
RTE_LOG_DEBUG. CONFIG_RTE_LOG_LEVEL is the default log level of the control
path, RTE_LOG_DP_LEVEL is the log level of the data path. It is not
obvious that assertions are controlled by the control path
LOG_LEVEL, especially for assertions used on the data path.

On the other hand, DPDK needs an assertion enabling switch w/o impacting
log output level, assuming "--log-level" not specified.

Assertion is an important API to balance DPDK high performance and
robustness. To promote assertion usage, it's valuable to decouple
assertions from CONFIG_RTE_LOG_LEVEL.

In one word, log is log, assertion is assertion, debug is hot pot :)

The rationale of this patch is to introduce a dedicated switch for
assertions: RTE_ENABLE_ASSERT
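
The idea of a dedicated compile-time switch can be sketched as below. The macro names APP_ENABLE_ASSERT and APP_ASSERT are hypothetical and are not the DPDK definitions; the sketch only shows an assertion that is gated by its own flag and is independent of any log level.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build with -DAPP_ENABLE_ASSERT to turn the checks on. */
#ifdef APP_ENABLE_ASSERT
#define APP_ASSERT(exp) do { \
	if (!(exp)) { \
		fprintf(stderr, "assert \"%s\" failed at %s:%d\n", \
			#exp, __FILE__, __LINE__); \
		abort(); \
	} \
} while (0)
#else
/* Compiles to nothing, so the data path pays no cost. */
#define APP_ASSERT(exp) do { } while (0)
#endif

/* Example user: the assertion documents the precondition b != 0. */
static int
checked_div(int a, int b)
{
	APP_ASSERT(b != 0);
	return b != 0 ? a / b : 0;
}
```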

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
2017-10-09 23:15:45 +02:00
Jianfeng Tan
f26ab687a7 eal: remove Xen dom0 support
We remove xen-specific code in EAL, including the option --xen-dom0,
memory initialization code, compiling dependency, etc.

Related documents are removed or updated, and bump the eal library
version.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
2017-10-09 01:54:29 +02:00
Jianfeng Tan
a7cb2e20d2 mem: remove API to get physical address in dom0
Previously, to get MFN address in dom0, this API is a wrapper to
obtain the "physical address".

As we will remove xen dom0 support, this API is not necessary.

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2017-10-09 01:52:37 +02:00
Hemant Agrawal
e1a45fc494 igb_uio: fix build on arm64 kernel
IGB_UIO compilation recently got enabled for ARM64 by default.

The igb_uio compilation against an ARM64 based stock 4.x (e.g. 4.13)
kernel fails with the following errors:

igb_uio.c: In function ‘igbuio_pci_irqcontrol’:
igb_uio.c:115:25: error: implicit declaration of function
‘irq_get_irq_data’ [-Werror=implicit-function-declaration]
  struct irq_data *irq = irq_get_irq_data(udev->info.irq);
                         ^
igb_uio.c:115:25: error: initialization makes pointer from integer without
a cast [-Werror=int-conversion]

Fixes: d196343a258e ("igb_uio: use kernel functions for masking MSI-X")

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
2017-10-08 17:19:08 +02:00
Tonghao Zhang
071925527d igb_uio: use UIO macro instead of hardcoded value
This is not a bugfix, but it helps developers
review and maintain the igb_uio code.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
2017-10-07 00:51:59 +02:00