The patch adds 'rte_intr_rx_ctl' to add or delete interrupt vector
event monitoring on a specified epoll instance.
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch adds 'rte_epoll_wait' and 'rte_epoll_ctl' for async event wakeup.
It defines 'struct rte_epoll_event' as the event param.
When event fds are added to a specified epoll instance, 'eptrs' holds
the rte_epoll_event object pointers.
The 'op' uses the same enum as epoll_wait/ctl does.
The epoll event can carry raw user data and register a callback
which is executed during wakeup.
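
The idea of an event that carries user data and a wakeup callback can be
sketched with plain Linux epoll and eventfd. This is only an illustration
under assumptions: 'struct demo_event' and every demo_* function are
hypothetical names and are not the rte_epoll_* API added by the patch.

```c
#include <stdint.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>

/* Hypothetical event descriptor: raw user data plus a wakeup callback. */
struct demo_event {
	int fd;
	void *user_data;
	void (*cb)(int fd, void *user_data);
};

/* Register ev->fd on epfd; epoll stores only the pointer to our descriptor. */
static int demo_epoll_add(int epfd, struct demo_event *ev)
{
	struct epoll_event e = { .events = EPOLLIN, .data = { .ptr = ev } };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, ev->fd, &e);
}

/* Wait for events and run each callback; returns the event count or -1. */
static int demo_epoll_wait(int epfd, int maxevents, int timeout_ms)
{
	struct epoll_event evs[8];
	int i, n;

	if (maxevents > 8)
		maxevents = 8;
	n = epoll_wait(epfd, evs, maxevents, timeout_ms);
	for (i = 0; i < n; i++) {
		struct demo_event *ev = evs[i].data.ptr;

		if (ev->cb != NULL)
			ev->cb(ev->fd, ev->user_data);
	}
	return n;
}

/* Example callback: drain the eventfd counter into *user_data. */
static void demo_drain(int fd, void *user_data)
{
	uint64_t v = 0;

	if (read(fd, &v, sizeof(v)) == (ssize_t)sizeof(v))
		*(uint64_t *)user_data += v;
}

/* Signal an eventfd with 'value' and return what the callback observed. */
static uint64_t demo_roundtrip(uint64_t value)
{
	uint64_t seen = 0;
	int epfd = epoll_create1(0);
	int efd = eventfd(0, EFD_NONBLOCK);
	struct demo_event ev = { efd, &seen, demo_drain };

	if (epfd >= 0 && efd >= 0 && demo_epoll_add(epfd, &ev) == 0 &&
	    write(efd, &value, sizeof(value)) == (ssize_t)sizeof(value))
		demo_epoll_wait(epfd, 1, 100);
	if (efd >= 0)
		close(efd);
	if (epfd >= 0)
		close(epfd);
	return seen;
}
```

Storing the descriptor pointer in the epoll data field is what lets the
wait loop find the callback without any lookup table.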
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
The patch adds interrupt vectors support in rte_intr_handle.
'vec_en' is set when interrupt vectors are detected and associated
event fds are set. Those event fds are stored in efds[].
'intr_vec' is reserved for device driver to initialize the vector
mapping table.
Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Some drivers need the ability to access PCI config space (for example,
for power management). This adds an abstraction to do this for both
Linux and BSD.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Move common functions from BSD/Linux to eal_common_memory.c file.
BSD uses contigmem kernel module and Linux uses /proc/self/pagemap file.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move common functions from BSD/Linux to eal_common_timer.c.
BSD uses sysctl and Linux uses CLOCK_MONOTONIC_RAW to calibrate TSC.
HPET is specific to Linux and not integrated in the common init.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Thomas: move inclusion used by ixgbe bypass]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Change the log level of startup messages. Anything that is
just normal activity (like getting virtual areas) is changed
to debug level. Anything that is a failure should be NOTICE
or ERR severity.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The read for events in the interrupt thread may get interrupted
by signals from the application. Avoid generating a stray log message.
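
A minimal sketch of the usual pattern, assuming a plain POSIX read(): the
call is retried while errno is EINTR, and only real failures are logged.
The names read_retry_eintr and demo_pipe_read are hypothetical and do not
come from the EAL interrupt thread.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read up to len bytes, silently retrying when a signal interrupts the call.
 * Returns the byte count, 0 on EOF, or -1 on a real error (which is logged).
 */
static ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
	ssize_t n;

	do {
		n = read(fd, buf, len);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		perror("read");
	return n;
}

/* Usage example: push two bytes through a pipe and read them back. */
static ssize_t demo_pipe_read(void)
{
	int p[2];
	char buf[8];
	ssize_t n = -1;

	if (pipe(p) != 0)
		return -1;
	if (write(p[1], "ok", 2) == 2)
		n = read_retry_eintr(p[0], buf, sizeof(buf));
	close(p[0]);
	close(p[1]);
	return n;
}
```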
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
There are close and detach functions in ethdev.
To keep the naming consistent, PCI functions called by ethdev detach
must be named "detach" instead of "close".
Also fix comments which mix close and uninit names.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Implement rte_memzone_free which, as its name implies, would free a
memzone.
Currently memzones are tracked in an array and cannot be freed.
To be able to reuse the same array to track memzones, we have to
change how we keep track of reserved memzones.
With this patch, any memzone with addr NULL is not used, so we also need
to change how we look for the next free memzone entry.
Add new unit test for rte_memzone_free API.
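
The "addr NULL means unused" convention can be sketched on a small
fixed-size table. This is an illustration only: the demo_* names, the table
size, and the return conventions are assumptions, not the rte_memzone
implementation.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_ZONES 4

struct demo_zone {
	void *addr;	/* NULL marks the slot as unused */
	size_t len;
};

static struct demo_zone demo_zones[DEMO_MAX_ZONES];

/* Return the first slot whose addr is NULL, or NULL if the table is full. */
static struct demo_zone *demo_find_free_slot(void)
{
	size_t i;

	for (i = 0; i < DEMO_MAX_ZONES; i++)
		if (demo_zones[i].addr == NULL)
			return &demo_zones[i];
	return NULL;
}

/* Record a reservation; returns the slot index, or -1 on failure. */
static int demo_zone_reserve(void *addr, size_t len)
{
	struct demo_zone *z;

	if (addr == NULL)
		return -1;
	z = demo_find_free_slot();
	if (z == NULL)
		return -1;
	z->addr = addr;
	z->len = len;
	return (int)(z - demo_zones);
}

/* Release a slot by resetting addr to NULL; returns 0, or -1 if not in use. */
static int demo_zone_free(int idx)
{
	if (idx < 0 || idx >= DEMO_MAX_ZONES || demo_zones[idx].addr == NULL)
		return -1;
	demo_zones[idx].addr = NULL;
	demo_zones[idx].len = 0;
	return 0;
}

/* Usage example: a freed slot is the one handed out by the next reserve. */
static int demo_reuse_after_free(void)
{
	static char a[16], b[16], c[16];

	memset(demo_zones, 0, sizeof(demo_zones));
	demo_zone_reserve(a, sizeof(a));	/* lands in slot 0 */
	demo_zone_reserve(b, sizeof(b));	/* lands in slot 1 */
	demo_zone_free(0);
	return demo_zone_reserve(c, sizeof(c));	/* reuses slot 0 */
}
```

Because freed entries can now sit in the middle of the array, the search
has to scan for a NULL addr rather than rely on a running count.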
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
In the current memory hierarchy, memsegs are groups of physically
contiguous hugepages, memzones are slices of memsegs and malloc further
slices memzones into smaller memory chunks.
This patch modifies malloc so it partitions memsegs instead of memzones.
Thus memzones would call malloc internally for memory allocation while
maintaining its ABI.
During initialization malloc sets all available memory as part of the heaps.
CONFIG_RTE_MALLOC_MEMZONE_SIZE was used to specify the default memory
block size to expand the heap. The option is not used/relevant anymore,
so we remove it.
Remove free_memseg field from internal mem config structure as it is
not used anymore.
Also remove code in ivshmem that was setting up free_memseg on init.
This makes it possible to free memzones and therefore any other structure
based on memzones, e.g. mempools.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Move malloc inside eal and create a new section in MAINTAINERS file for
Memory Allocation in EAL.
Create a dummy malloc library to avoid breaking applications that have
librte_malloc in their DT_NEEDED entries.
This is the first step towards using malloc to allocate memory directly
from memsegs. Thus, memzones would allocate memory through malloc,
which allows memzones to be freed.
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
In order to unify the packet type, the field of 'packet_type' in
'struct rte_mbuf' needs to be extended from 16 to 32 bits.
Accordingly, some fields in 'struct rte_mbuf' are re-organized to support
this change for Vector PMD.
As 'struct rte_kni_mbuf' for KNI must map directly onto
'struct rte_mbuf', it is modified accordingly.
In ixgbe PMD driver, corresponding changes are added for the mbuf changes,
especially the bit masks of packet type for 'ol_flags' are replaced by
unified packet type. In addition, more packet types (UDP, TCP and SCTP)
are supported in vectorized ixgbe PMD.
To avoid breaking ABI compatibility, all the changes would be enabled by
RTE_NEXT_ABI.
Note that a performance drop of around 2% (64B) was observed when doing
4-port (1 port per 82599 card) IO forwarding on the same SNB core.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
After the code rework in the commit below, the logic expects the
hugepage_sz field to always be set (i.e. to a non-zero value).
When using --no-huge, this field was left unset, defaulting to zero.
Set hugepage_sz to RTE_PGSIZE_4K when using --no-huge.
Fixes: b3dfffd962ecd ("mem: allow multiple page sizes to be requested")
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
When using vfio, the probe fails for BAR > 0 after the
commit-id 90a1633b2 (eal/linux: allow to map BARs with MSI-X tables).
While debugging further, found that the BAR region offset and size read from
vfio are u64, but are assigned to uint32_t variables. This results in the u64
value getting truncated to 0 and passing wrong offset and size to mmap for
subsequent BAR regions.
The fix is to use unsigned long for the offset and size.
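
The truncation can be reproduced in isolation. The sketch below assumes
that the region offset encodes the BAR index in the bits above bit 31 (a
shift of 40 is used here purely as an illustrative value); all demo_* names
are hypothetical.

```c
#include <stdint.h>

/* Assumption for illustration: the BAR index lives in the high offset bits. */
#define DEMO_OFFSET_SHIFT 40

/* 64-bit region offset for a given BAR index. */
static uint64_t demo_region_offset(unsigned int bar_index)
{
	return (uint64_t)bar_index << DEMO_OFFSET_SHIFT;
}

/* The buggy pattern: storing the 64-bit offset in a 32-bit variable. */
static uint32_t demo_offset_truncated(unsigned int bar_index)
{
	return (uint32_t)demo_region_offset(bar_index);
}
```

With such an encoding every BAR above 0 collapses to offset 0 once the
value is narrowed to 32 bits, which matches the failure described above.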
This is based on a patch by Alejandro Lucero <alejandro.lucero@netronome.com>
posted at:
http://dpdk.org/ml/archives/dev/2015-June/020201.html
and updated with the diff below to fix 32-bit compilation:
http://dpdk.org/ml/archives/dev/2015-July/020963.html
Fixes: 90a1633b2347 ("eal/linux: allow to map BARs with MSI-X tables")
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
The patch fixes a vfio initialization issue introduced by the patch below.
The root cause is that VFIO_PRESENT is inaccessible at the EAL common level.
To fix it, remove pci_map/unmap_device from common code and implement
them in the linux and bsd code.
Fixes: 35b3313e322b ("pci: merge mapping functions for linux and bsd")
Reported-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Newer kernels make /proc/self/pagemap unreadable for non-root users for
security reasons.
Running the application will then fill the logs with
rte_mem_virt2phy: cannot open /proc/self/pagemap
messages.
However, there are cases when DPDK is and should be run as non-root,
without the need for virtual-to-physical address translations: a
typical example is when working with PCAP input/output. This patch
adds a start-time check for /proc/self/pagemap readability, and
directly returns an error code from rte_mem_virt2phy().
This way, there is only a one-time warning at startup instead of
constant warnings all the time.
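
A minimal sketch of the probe-once approach, assuming the check is a plain
open() at startup. The demo_* names are hypothetical and the path is a
parameter only so that the sketch can be tried against any file.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int demo_pagemap_readable = -1;	/* -1: not probed yet */

/* Probe once at startup; warn a single time if the file cannot be opened. */
static int demo_probe_pagemap(const char *path)
{
	int fd = open(path, O_RDONLY);

	if (fd < 0) {
		fprintf(stderr, "warning: cannot open %s, "
			"virt2phys translation disabled\n", path);
		demo_pagemap_readable = 0;
	} else {
		close(fd);
		demo_pagemap_readable = 1;
	}
	return demo_pagemap_readable;
}

/*
 * Called on every translation request: returns 0 when translation may
 * proceed, or -1 without logging when the startup probe did not succeed.
 */
static int demo_virt2phys_check(void)
{
	return demo_pagemap_readable == 1 ? 0 : -1;
}
```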
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Using the IBM Advance Toolchain on Ubuntu 14.04 (package 8.0-3), gcc
complains about out-of-bounds accesses:
CC eal_hugepage_info.o
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:
In function ‘eal_hugepage_info_init’:
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
internal_config.hugepage_info[j].hugepage_sz)
^
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:349:37:
error: array subscript is above array bounds [-Werror=array-bounds]
if (internal_config.hugepage_info[j-1].hugepage_sz <
^
lib/librte_eal/linuxapp/eal/eal_hugepage_info.c:350:35:
error: array subscript is above array bounds [-Werror=array-bounds]
internal_config.hugepage_info[j].hugepage_sz)
Looking at the code, these warnings are invalid from my point of view, and
they disappeared when upgrading the toolchain to the new version (8.0-4).
However, the code was buggy (the sorting code is wrong), so fix this by using
qsort and adding a check on num_sizes to avoid potential out-of-bounds
accesses.
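
A sketch of sorting by page size with qsort plus a count check. The
descending order is inferred from the comparison quoted in the compiler
output above; the structure, the limit of 3 entries, and the demo_* names
are assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_SIZES 3

struct demo_hugepage_info {
	uint64_t hugepage_sz;
};

/* Comparator: larger sizes first; compares instead of subtracting. */
static int demo_cmp_size_desc(const void *a, const void *b)
{
	const struct demo_hugepage_info *x = a;
	const struct demo_hugepage_info *y = b;

	if (x->hugepage_sz < y->hugepage_sz)
		return 1;
	if (x->hugepage_sz > y->hugepage_sz)
		return -1;
	return 0;
}

/* Sort the first num entries; reject counts larger than the array. */
static int demo_sort_sizes(struct demo_hugepage_info *info, unsigned int num)
{
	if (num > DEMO_MAX_SIZES)
		return -1;
	qsort(info, num, sizeof(info[0]), demo_cmp_size_desc);
	return 0;
}

/* Usage example: after sorting, the largest page size comes first. */
static uint64_t demo_largest_first(void)
{
	struct demo_hugepage_info info[DEMO_MAX_SIZES] = {
		{ 2ULL << 20 }, { 1ULL << 30 }, { 16ULL << 20 }
	};

	demo_sort_sizes(info, DEMO_MAX_SIZES);
	return info[0].hugepage_sz;
}
```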
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
With this, we should be checkpatch compliant.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Replace this while loop with a for loop and simplify error handling.
Indent is broken on purpose, fixed in next commit.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Rather than casting the number of huge pages returned by get_num_hugepages,
rework this function so that it returns 0 when something goes wrong.
This also removes the need for casts in the log calls.
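
The "unsigned result, 0 on any failure" convention might look like the
sketch below. It is not the reworked get_num_hugepages; the demo_* names
and the sysfs-style single-number file format are assumptions.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a decimal count; anything malformed or out of range yields 0. */
static unsigned long demo_parse_count(const char *text)
{
	char *end = NULL;
	unsigned long n;

	if (text[0] < '0' || text[0] > '9')
		return 0;
	errno = 0;
	n = strtoul(text, &end, 10);
	if (errno != 0 || (*end != '\0' && *end != '\n'))
		return 0;
	return n;
}

/* Read a count from a single-number file; any failure yields 0. */
static unsigned long demo_read_count(const char *path)
{
	char buf[64];
	unsigned long n = 0;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return 0;
	if (fgets(buf, sizeof(buf), f) != NULL)
		n = demo_parse_count(buf);
	fclose(f);
	return n;
}
```

Since the return type is already unsigned, a caller can print it with
"%lu" directly and treat 0 as "no pages available".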
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
The code in eal_hugepage_info.c is not reachable by secondary processes.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
This patch consolidates the functions below and implements them in common
EAL code.
- rte_eal_pci_probe_one_driver()
- rte_eal_pci_close_one_driver()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
The patch consolidates the functions below and implements them in common
EAL code.
- pci_map_device()
- pci_unmap_device()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
The patch consolidates the functions below and implements them in common
EAL code.
- pci_map_resource()
- pci_unmap_resource()
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch consolidates the structures below and defines them in common code.
- struct pci_map
- struct mapped_pci_resources
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch adds a new function called pci_uio_map_resource_by_index().
The function hides how a uio resource is mapped in linuxapp and bsdapp.
With this function, pci_uio_map_resource() becomes more abstract.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch adds new functions called pci_uio_alloc_resource() and
pci_uio_free_resource().
The functions hide how a uio resource is prepared or freed in linuxapp
and bsdapp. With these functions, pci_uio_map_resource() becomes more
abstract.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch fixes the following.
- bsdapp
  - Use map_id in pci_uio_map_resource().
  - Fix interface of pci_map_resource().
  - Move path variable of mapped_pci_resource structure to pci_map.
- linuxapp
  - Remove redundant error message of linuxapp.
'pci_uio_map_resource()' is implemented in both linuxapp and bsdapp,
but the interfaces differ. The patch fixes the bsdapp function
to behave the same as linuxapp. After applying it, the file descriptor
is opened and closed outside of pci_map_resource().
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch fixes the following memory leaks.
- When open() fails, uio_res and fds are not freed in
  pci_uio_map_resource().
- When pci_map_resource() fails but path was allocated correctly,
  path and fds are not freed in pci_uio_map_resource().
  Also, some mapped resources should be freed.
- When pci_uio_unmap() is called, path should be freed.
It also fixes the following.
- When pci_map_resource() fails, mapaddr will be MAP_FAILED.
  In this case, pci_map_addr should not be incremented in
  pci_uio_map_resource().
- To shrink code, move close().
- Remove fail variable.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
When pci_uio_unmap_resource() is called, a file descriptor that is used
for uio configuration should be closed.
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
This patch fixes the coding style of the files below in linuxapp and bsdapp.
- eal_pci.c
- eal_pci_uio.c
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
The RTE_LOG(DEBUG, ...) messages in rte_eal_cpu_init() were printed
even when the log level on the command line was set to INFO or lower.
The problem was that the rte_eal_cpu_init() routine was called before
the command line args were scanned. Setting --log-level=7 now
correctly suppresses the messages from the rte_eal_cpu_init() routine.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Using the "physical_package_id" as a fallback for determining the
numa node of a core tends to be unreliable. Fix this by using a
detection routine which reads the numa information from
/sys/devices/system/node and just returns a numa node of 0 on
failure.
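
One way such a detection routine could work is to look for a cpu<N> entry
under each node<M> directory. The sketch below assumes that layout; the
function name, the node limit of 8, and the base-directory parameter are
hypothetical and exist only so the example can be tried on any directory.

```c
#include <stdio.h>
#include <unistd.h>

#define DEMO_MAX_NODES 8

/*
 * Return the NUMA node that lists cpu<core> under <base>/node<N>/,
 * or 0 when no node directory contains that core.
 */
static unsigned int demo_cpu_socket_id(const char *base, unsigned int core)
{
	char path[256];
	unsigned int node;

	for (node = 0; node < DEMO_MAX_NODES; node++) {
		int len = snprintf(path, sizeof(path), "%s/node%u/cpu%u",
				   base, node, core);

		if (len < 0 || (size_t)len >= sizeof(path))
			continue;
		if (access(path, F_OK) == 0)
			return node;
	}
	return 0;
}
```

A call such as demo_cpu_socket_id("/sys/devices/system/node", 0) falls back
to node 0 on systems without NUMA information.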
Reported-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
On Fedora 22, with GCC 5.1, errors are reported due to array accesses
being potentially out of bounds. This commit fixes this by ensuring the
bounds check in the loop takes account of the array size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Move xenvirt PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move ring PMD to drivers directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move pcap PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move af_packet PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The introduction of uio_pci_generic broke interrupt handling with
igb_uio. The igb_uio device uses the kernel read/write method to
enable/disable IRQs; uio_pci_generic has to use PCI intx
config read/write to enable/disable interrupts.
Since igb_uio uses MSI-X, the PCI intx config read/write won't
work.
Fixes: c112df6875a5 ("eal/linux: toggle interrupt for uio_pci_generic")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Set the internal event file descriptor to be non-blocking and not
inherited across exec. This prevents accidental hangs and
leaking the descriptor into an exec'd program.
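
Both properties can be set with fcntl() on an existing descriptor. This is
a generic POSIX sketch with hypothetical demo_* names; it does not claim to
show how the EAL descriptor is actually created.

```c
#include <fcntl.h>
#include <unistd.h>

/* Mark fd non-blocking and close-on-exec; returns 0 on success, -1 on error. */
static int demo_set_nonblock_cloexec(int fd)
{
	int fl = fcntl(fd, F_GETFL);

	if (fl < 0 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0)
		return -1;
	fl = fcntl(fd, F_GETFD);
	if (fl < 0 || fcntl(fd, F_SETFD, fl | FD_CLOEXEC) < 0)
		return -1;
	return 0;
}

/* Usage example: returns 1 when both flags are visible on a fresh pipe. */
static int demo_flags_applied(void)
{
	int p[2], ok;

	if (pipe(p) != 0)
		return 0;
	ok = demo_set_nonblock_cloexec(p[0]) == 0 &&
	     (fcntl(p[0], F_GETFL) & O_NONBLOCK) != 0 &&
	     (fcntl(p[0], F_GETFD) & FD_CLOEXEC) != 0;
	close(p[0]);
	close(p[1]);
	return ok;
}
```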
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
The PCI device id table is immutable and should be made const
in all drivers. The pseudo drivers can initialize their local
copy as necessary.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Kernel driver (kdrv) seems easier to understand than
passthrough driver (pt_driver). It's also more generic,
as a PMD could run on top of any PCI kernel driver if
that driver offered such support.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>