Due to eal_alarm_callback() and rte_eal_alarm_set() use gettimeofday()
to get the current time, and gettimeofday() is affected by jumps.
For example, set up a rte_alarm which will be triggerd next second (
current time + 1 second) by rte_eal_alarm_set(). And the callback
function of this rte_alarm sets up another rte_alarm which will be
triggered next second (current time + 2 second).
Once we change the system time when the callback function is triggered,
it is possible that rte alarm functionalities work out of expectation.
Replace gettimeofday() with clock_gettime(CLOCK_MONOTONIC_RAW, &now)
could avoid this phenomenon.
Signed-off-by: Wen-Chi Yang <wolkayang@gmail.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
For virtio-net pmd, the interrupt management thread must be created after
this driver has initialised so that iopl() has been properly called and
its effects are inherited by all eal children threads.
Before this change, changing link status on a virtio-net device would
trigger a segfault in the interrupt thread :
$ mkdir -p /mnt/huge
$ echo 256 > /proc/sys/vm/nr_hugepages
$ mount -t hugetlbfs none /mnt/huge
$ lspci |grep Ethernet
00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
$ modprobe uio
$ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
$ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind
$ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id
$ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01 --total-num-mbufs 2048
[snip]
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 rte_virtio_pmd
Interactive-mode selected
Configuring Port 0 (socket 0)
Port 0: DE:AD:DE:01:02:03
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Done
testpmd>
Then, from qemu monitor:
(qemu) set_link virtio-net-pci.0 off
testpmd> Segmentation fault
Fixes: 565b85dcd9f4 ("eal: set iopl only when needed")
Reported-by: Stephen Hemminger <shemming@brocade.com>
Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Function rte_epoll_wait should return when underlying call
to epoll_wait times out.
Signed-off-by: Robert Sanford <rsanford@akamai.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
The function rte_eal_pci_close_one() was renamed rte_eal_pci_detach().
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The extended unified packet type is now part of the standard ABI.
As mbuf struct is changed, the mbuf library version is incremented.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The Rx interrupt feature is now part of the standard ABI.
Because of changes in rte_intr_handle and struct rte_eth_conf,
the eal and ethdev library versions are incremented.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The patch sets zero as the default value of pci device numa_node
if the socket could not be determined.
It provides the same default value as FreeBSD which has no NUMA support,
and makes the return value of rte_eth_dev_socket_id() be consistent
with the API description.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Do some cleanup of pci scan loop.
* check errors first
* don't initialize variables where not necessary
* cuddle else (follow existing style)
* chop off conditional after return
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Build log:
lib/librte_eal/common/eal_common_pci.c:188:4: error:
implicit declaration of function pci_config_space_set
The function rte_eal_pci_probe_one_driver, which calls
pci_config_space_set, was moved to eal_common_pci.c,
but pci_config_space_set was left in eal_pci.c with static specifier.
Fixes: 4d4ebca4 ("pci: merge probing and closing functions for linux and bsd")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
The patch exposes intr event fd create and release for PMD.
The device driver can assign the number of event associated with interrupt vector.
It also provides misc functions to check 1) allows other slowpath intr(e.g. lsc);
2) intr event on fastpath is enabled or not.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The intr handle type(RTE_INTR_HANDLE_UIO_INTX) was introduced by UIO pci generic.
When turning on the lsc interrupt, it complains fd read error.
The patch uses the correct read size in the case of RTE_INTR_HANDLE_UIO_INTX.
Fixes: 3f313bef3467 ("eal/linux: fix irq handling with igb_uio")
Reported-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch maps each of the eventfd to the interrupt vector of VFIO MSI-X.
Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch adds 'rte_intr_rx_ctl' to add or delete interrupt vector
events monitor on specified epoll instance.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When the event fds add to a specified epoll instance, 'eptrs' will hold
the rte_epoll_event object pointer.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event support to carry a raw user data and to register a callback
which is executed during wakeup.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch adds interrupt vectors support in rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and associated
event fds are set. Those event fds are stored in efds[].
'intr_vec' is reserved for device driver to initialize the vector
mapping table.
Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Some drivers need ability to access PCI config (for example for power
management). This adds an abstraction to do this for both Linux
and BSD.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Move common functions from BSD/Linux to eal_common_memory.c file.
BSD uses contigmem kernel module and Linux uses /proc/self/pagemap file.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move common functions from BSD/Linux to eal_common_timer.c.
BSD uses sysctl and Linux uses CLOCK_MONOTIC_RAW to calibrate TSC.
HPET is specific to Linux and not integrated in the common init.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Thomas: move inclusion used by ixgbe bypass]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Change the log level of startup messages. Anything that is
just normal activity (like getting virtual areas) is changed
to debug level. Anything that is a failure should be NOTICE
or ERR severity.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The read for events in the interrupt thread may get interrupted
by signals from application. Avoid generating stray log message.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
There are close and detach functions in ethdev.
To keep a consistent naming, PCI functions called by ethdev detach
must be named "detach" instead of "close".
Fix also comments which mix close and uninit names.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Implement rte_memzone_free which, as its name implies, would free a
memzone.
Currently memzone are tracked in an array and cannot be free.
To be able to reuse the same array to track memzones, we have to
change how we keep track of reserved memzones.
With this patch, any memzone with addr NULL is not used, so we also need
to change how we look for the next memzone entry free.
Add new unit test for rte_memzone_free API.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.
This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.
During initialization malloc sets all available memory as part of the heaps.
CONFIG_RTE_MALLOC_MEMZONE_SIZE was used to specify the default memory
block size to expand the heap. The option is not used/relevant anymore,
so we remove it.
Remove free_memseg field from internal mem config structure as it is
not used anymore.
Also remove code in ivshmem that was setting up free_memseg on init.
It would be possible to free memzones and therefore any other structure
based on memzones, ie. mempools
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.
Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.
This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
allowing to free memzones.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
In order to unify the packet type, the field of 'packet_type' in
'struct rte_mbuf' needs to be extended from 16 to 32 bits.
Accordingly, some fields in 'struct rte_mbuf' are re-organized to support
this change for Vector PMD.
As 'struct rte_kni_mbuf' for KNI should be right mapped to
'struct rte_mbuf', it should be modified accordingly.
In ixgbe PMD driver, corresponding changes are added for the mbuf changes,
especially the bit masks of packet type for 'ol_flags' are replaced by
unified packet type. In addition, more packet types (UDP, TCP and SCTP)
are supported in vectorized ixgbe PMD.
To avoid breaking ABI compatibility, all the changes would be enabled by
RTE_NEXT_ABI.
Note that around 2% performance drop (64B) was observed of doing 4 ports
(1 port per 82599 card) IO forwarding on the same SNB core.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
After code rework from bellow commit, logic expects hugepage_sz field to
always be set (ie. not zero value).
When using --no-huge, this field was left unset defaulting to zero.
Set hugepage_sz to RTE_PGSIZE_4K when using --no-huge.
Fixes: b3dfffd962ecd ("mem: allow multiple page sizes to be requested")
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
When using vfio, the probe fails for BAR > 0 after the
commit-id 90a1633b2 (eal/linux: allow to map BARs with MSI-X tables).
While debugging further, found that the BAR region offset and size read from
vfio are u64, but are assigned to uint32_t variables. This results in the u64
value getting truncated to 0 and passing wrong offset and size to mmap for
subsequent BAR regions.
The fix is to use unsigned long for the offset and size.
This is based on patch by Alejandro Lucero <alejandro.lucero@netronome.com>
posted at below:
http://dpdk.org/ml/archives/dev/2015-June/020201.html
and updated with diff from below to fix 32-bit compilation:
http://dpdk.org/ml/archives/dev/2015-July/020963.html
Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
The patch fixes vfio initialization issue introduced by below patch.
Root cause is that VFIO_PRESENT is inaccessible in eal common level.
To fix it, remove pci_map/unmap_device from common code, then implement
in linux and bsd code.
Fixes: 35b3313e322b ("pci: merge mapping functions for linux and bsd")
Reported-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Newer kernels make this unreadable for security reasons for non-roots.
Running the application will then fill the logs with
rte_mem_virt2phy: cannot open /proc/self/pagemap
messages.
However, there are cases when DPDK is and should be run as non-root,
without the need for virtual-to-physical address translations: a
typical example is when working with PCAP input/output. This patch
adds a start-time check for /proc/self/pagemap readability, and
directly returns an error code from rte_mem_virt2phy().
This way, there is only a one-time warning at startup instead of
constant warnings all the time.
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Using IBM advance toolchain on Ubuntu 14.04 (package 8.0-3), gcc is complaining
about out of bound accesses.
CC eal_hugepage_info.o
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:
In function ‘eal_hugepage_info_init’:
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
internal_config.hugepage_info[j].hugepage_sz)
^
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:349:37:
error: array subscript is above array bounds [-Werror=array-bounds]
if (internal_config.hugepage_info[j-1].hugepage_sz <
^
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
internal_config.hugepage_info[j].hugepage_sz)
Looking at the code, these warnings are invalid from my pov and they disappeared
when upgrading the toolchain to new version (8.0-4).
However, the code was buggy (sorting code is wrong), so fix this by using qsort
and adding a check on num_sizes to avoid potential out of bound accesses.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
With this, we should be checkpatch compliant.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Replace this while loop with a for loop and simplify error handling.
Indent is broken on purpose, fixed in next commit.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Rather than cast the huge pages number returned by get_num_hugepages, rework
this function so that it returns 0 when something goes wrong.
And no need for casts in log.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
The code in eal_hugepage_info.c is not reachable by secondary processes.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
This patch consolidates below functions, and implements these in common
eal code.
- rte_eal_pci_probe_one_driver()
- rte_eal_pci_close_one_driver()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
The patch consolidates below functions, and implemented in common
eal code.
- pci_map_device()
- pci_unmap_device()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The patch consolidates below functions, and implemented in common
eal code.
- pci_map_resource()
- pci_unmap_resource()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch consolidates below structures, and defines them in common code.
- struct pci_map
- struct mapped_pci_resources
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch adds a new function called pci_uio_map_resource_by_index().
The function hides how to map uio resource in linuxapp and bsdapp.
With the function, pci_uio_map_resource() will be more abstracted.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch adds new functions called pci_uio_alloc_resource() and
pci_uio_free_resource().
The functions hides how to prepare or free uio resource in linuxapp
and bsdapp. With the function, pci_uio_map_resource() will be more
abstracted.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch fixes below.
- bsdapp
- Use map_id in pci_uio_map_resource().
- Fix interface of pci_map_resource().
- Move path variable of mapped_pci_resource structure to pci_map.
- linuxapp
- Remove redundant error message of linuxapp.
'pci_uio_map_resource()' is implemented in both linuxapp and bsdapp,
but interface is different. The patch fixes the function of bsdapp
to do same as linuxapp. After applying it, file descriptor should be
opened and closed out of pci_map_resource().
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch fixes following memory leaks.
- When open() is failed, uio_res and fds won't be freed in
pci_uio_map_resource().
- When pci_map_resource() is failed but path is allocated correctly,
path and fds won't be freed in pci_uio_map_recource().
Also, some mapped resources should be freed.
- When pci_uio_unmap() is called, path should be freed.
Also, fixes below.
- When pci_map_resource() is failed, mapaddr will be MAP_FAILED.
In this case, pci_map_addr should not be incremented in
pci_uio_map_resource().
- To shrink code, move close().
- Remove fail variable.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
When pci_uio_unmap_resource() is called, a file descriptor that is used
for uio configuration should be closed.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>