This patch adds methods that use hardware memory transactions (HTM) on
fast-path for rwlock (a.k.a. lock elision). Here the methods are implemented
for x86 using Restricted Transactional Memory instructions (Intel(r)
Transactional Synchronization Extensions). The implementation fall-backs to
the normal rwlock if HTM is not available or memory transactions fail. This is
not a replacement for all rwlock usages since not all critical sections
protected by locks are friendly to HTM. For example, an attempt to perform
a HW I/O operation inside a hardware memory transaction always aborts
the transaction since the CPU is not able to roll-back should the transaction
fail. Therefore, hardware transactional locks are not advised to be used around
rte_eth_rx_burst() and rte_eth_tx_burst() calls.
Signed-off-by: Roman Dementiev <roman.dementiev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch adds methods that use hardware memory transactions (HTM) on fast-path
for spinlocks (a.k.a. lock elision). Here the methods are implemented for x86
using Restricted Transactional Memory instructions (Intel(r) Transactional
Synchronization Extensions). The implementation fall-backs to the normal
spinlock if HTM is not available or memory transactions fail. This is not
a replacement for all spinlock usages since not all critical sections protected
by spinlocks are friendly to HTM. For example, an attempt to perform a HW I/O
operation inside a hardware memory transaction always aborts the transaction
since the CPU is not able to roll-back should the transaction fail.
Therefore, hardware transactional locks are not advised to be used around
rte_eth_rx_burst() and rte_eth_tx_burst() calls.
Signed-off-by: Roman Dementiev <roman.dementiev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
In containers like docker, current->pid returns current process's global
PID instead of its own PID under containers's PID namespace, and
get_net_ns_by_pid() suppose to accept a virtual PID under its own
namespace, so we should use task_pid_vnr(current) to get current process's
virtual PID instead of current->pid.
Signed-off-by: Wenfeng Liu <liuwf@arraynetworks.com.cn>
Acked-by: Helin Zhang <helin.zhang@intel.com>
We did some (very basic) tests with IGMP, which involves adding
multicast addresses to ETH interfaces. This is done via the ip tool,
an example can be found on e.g.,
http://superuser.com/questions/324824/linux-built-in-or-open-source-program-to-join-multicast-group
and this will fail on KNI interfaces because of an unimplemented ioctl
SIOCADDMULTI. The patch simply adds an empty callback for set_rx_mode
(typically used for setting up hardware) so that the ioctl succeeds.
This is the same thing as the Linux tap interface does.
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Loop processing packets dequeued from rx_q was using the number of
packets requested, not how many it actually received.
Variable rename to make code a little more clear
Signed-off-by: Jay Rolette <rolette@infiniteio.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
No reason to check out many entries are in kni->rx_q prior to
actually pulling them from the fifo. You can't dequeue more than
are there anyway. Max entries to dequeue is either the max batch
size or however much space is available on kni->free_q (lesser of the two).
Signed-off-by: Jay Rolette <rolette@infiniteio.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Do not need the 'safe' version of list_for_each_entry() if you are
not deleting from the list as you iterate over it.
Signed-off-by: Jay Rolette <rolette@infiniteio.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Implement .ndo_change_carrier to enable
DPDK applications to propagate link state changes to
kni virtual interfaces through sysfs
Signed-off-by: Vijayakumar Muthuvel Manickam <mmvijay@gmail.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Needed to run as non-root but with higher memory allocations, and
removes a constraint on no-huge mode being limited to 64M. A usage
example is if running with file input with the pcap PMD, which can be
done as non-root after this patch via e.g.,
./test-dpdk --no-huge -m 1024 -l 0,1 -n3
--vdev 'eth_pcap0,rx_pcap=eth-rx.pcap,tx_pcap=eth-tx.pcap'
Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Johan Faltstrom <johan.faltstrom@netinsight.net>
Acked-by: David Marchand <david.marchand@6wind.com>
Ran this code base through a script which:
- removes trailing whitespace
- removes space before tabs
- removes blank lines at end of file
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Using the "physical_package_id" as a fallback for determining the
numa node of a core tends to be unreliable. Fix this by using a
detection routine which reads the numa information from
/sys/devices/system/node and just returns a numa node of 0 on
failure.
Reported-by: Wang Sheng-Hui <shhuiw@gmail.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
On Fedora 22, with GCC 5.1, errors are reported due to array accesses
being potentially out of bounds. This commit fixes this by ensuring the
bounds check in the loop takes account of the array size.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
rte_pci.h depends upon stdio.h for the definition of the FILE type. Add
in #include <stdio.h> to the file to satisfy this dependency in cases
where the including C file does not already include stdio.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Marc Sune <marc.sune@bisdn.de>
Move xenvirt PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move ring PMD to drivers directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Move pcap pmd to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
move af_packet pmd to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The introduction of uio_pci_generic broke interrupt handling with
igb_uio. The igb_uio device uses the kernel read/write method to
enable disable IRQ's; the uio_pci_generic has to use PCI intx
config read/write to enable disable interrupts.
Since igb_uio uses MSI-X the PCI intx config read/write won't
work.
Fixes: c112df6875a5 ("eal/linux: toggle interrupt for uio_pci_generic")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Set internal event file descriptor to be non-block and not
inherited across exec. This prevents accidental hangs and
passing in another thread.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Compilation on FreeBSD with clang was broken, giving the error message:
lib/librte_eal/bsdapp/eal/eal_pci.c:438:16: fatal error: assigning to
'struct rte_pci_id *' from 'const struct rte_pci_id *' discards qualifiers
[-Wincompatible-pointer-types-discards-qualifiers]
for (id_table = dr->id_table ; id_table->vendor_id != 0; id_table++) {
^ ~~~~~~~~~~~~
This patch fixes the issue by adding "const" to the type of id_table.
Fixes: 6065355a03fc ("pci: make device id tables const")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This patch adds support for enic in the nic_uio driver so that enic
could be used on FreeBSD.
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Due to commit c0371da6 in kernel 3.19, which removed msg_iov
and msg_iovlen from struct msghdr, DPDK would not build.
Also, functions memcpy_toiovecend and memcpy_fromiovecend
were removed in commits ba7438ae and 57dd8a07, being substituted by
copy_from_iter and copy_to_iter.
This patch makes use of struct iov_iter, which has references
to msg_iov and msg_iovln, and makes use of copy_from_iter
and copy_to_iter.
Reported-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Due to API changes in function pointer ndo_bridge_setlink
(commit ad41faa8) and the rename of functions vlan_tx_*
(commit df8a39de) in kernel 4.0, DPDK would not build.
This patch adds the properly checks to fix the compilation.
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Closing /dev/io fd causes SIGBUS in inb/outb instructions
as the process loses the IOPL privileges once the fd is closed:
(gdb) bt
0 0x0000000000492f2c in outb (port=49170, data=0 '\000')
at /usr/include/machine/cpufunc.h:244
1 0x0000000000492f7a in outb_p (data=0 '\000', port=49170)
at /dpdk/dpdk-2.0.0/lib/librte_pmd_virtio/virtio_pci.h:211
2 0x000000000049328d in vtpci_set_status (hw=0x80331f380, status=0 '\000')
at /dpdk/dpdk-2.0.0/lib/librte_pmd_virtio/virtio_pci.c:130
3 0x00000000004931fe in vtpci_reset (hw=0x80331f380)
at /dpdk/dpdk-2.0.0/lib/librte_pmd_virtio/virtio_pci.c:108
4 0x00000000004a175e in eth_virtio_dev_init (eth_dev=0x831b80 <rte_eth_devices>)
at /dpdk/dpdk-2.0.0/lib/librte_pmd_virtio/virtio_ethdev.c:1150
5 0x0000000000462c09 in rte_eth_dev_init (pci_drv=0x79d1a0 <rte_virtio_pmd>,
pci_dev=0x802417560) at /dpdk/dpdk-2.0.0/lib/librte_ether/rte_ethdev.c:326
6 0x000000000046f03f in rte_eal_pci_probe_one_driver (dr=0x79d1a0 <rte_virtio_pmd>,
dev=0x802417560) at /dpdk/dpdk-2.0.0/lib/librte_eal/bsdapp/eal/eal_pci.c:487
7 0x0000000000475b06 in pci_probe_all_drivers (dev=0x802417560)
at /dpdk/dpdk-2.0.0/lib/librte_eal/common/eal_common_pci.c:116
8 0x0000000000475bb9 in rte_eal_pci_probe ()
at /dpdk/dpdk-2.0.0/lib/librte_eal/common/eal_common_pci.c:246
9 0x000000000046cd63 in rte_eal_init (argc=5, argv=0x7fffffffeaf0)
at /dpdk/dpdk-2.0.0/lib/librte_eal/bsdapp/eal/eal.c:554
10 0x0000000000404544 in main ()
Fixes: 8a312224bcde ("eal/bsd: fix fd leak")
Signed-off-by: Raz Amir <razamir22@gmail.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
According to the api, rte_log() / rte_vlog() are supposed to check the log level
and type but they were not doing so. This check was only done in the RTE_LOG
macro while this macro is only there to remove log messages at build time.
rte_log() always calls rte_vlog(), so do the check in rte_vlog() only.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
The old device id: IXGBE_DEV_ID_X550EM_X is split into 2 new device id:
IXGBE_DEV_ID_X550EM_X_10G_T and IXGBE_DEV_ID_X550EM_X_1G_T
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
probe and close both don't modify the rte_pci_addr structure
that is passed.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
The PCI device id table is immutable and should be made const
in all drivers. The pseudo drivers can initialize their local
copy as necessary.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Kernel driver (kdrv) seems easier to understand than
passthrough driver (pt_driver). It's also more generic
as a PMD could run on top of any PCI kernel driver if
it would offer such support.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
eal options OPT_CREATE_UIO_DEV does not need argument so set it to zero.
It needs to reset create_uio_dev explicitly.
Fixes: f7f97c16048e ("pci: add option --create-uio-dev to run without hotplug")
Signed-off-by: Haifeng Gao <gaohaifeng.gao@huawei.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Due to API changes in functions ndo_dflt_bridge_getlink
(commit 2c3c031c) and ndo_fdb_add (commit f6f6424b)
in kernel 3.19, DPDK would not build.
This patch solves the problem, by checking the kernel version
and adding the necessary new parameters.
Mind that function igb_ndo_fdb_add does not need the extra parameter
if USE_CONST_DEV_UC_CHAR is not set, since that macro is only defined
when kernel is greater or equal than 3.7
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In Suse11 SP3, there'll be errors for not found sse3 functions.
rte_memcpy.h: In function ‘rte_memcpy’:
rte_memcpy.h:625: error: implicit declaration of function ‘_mm_alignr_epi8’
rte_memcpy.h:625: error: nested extern declaration of ‘_mm_alignr_epi8’
rte_memcpy.h:625: error: incompatible type for argument 2 of ‘_mm_storeu_si128’
These functions defined in tmmintrin.h and should be included in.
Fixes: 9144d6bcdefd ("eal/x86: optimize memcpy for SSE and AVX")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Function pread need macro _XOPEN_SOURCE be defined.
Add _GNU_SOURCE will fix this issue.
error: implicit declaration of function ‘pread’
Fixes: 4a499c649590 ("eal/linux: enable uio_pci_generic support")
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
When including rte_version.h without string.h, there is a compilation error:
include/rte_version.h: error: implicit declaration of function ‘strlen’
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Fix a warning when the rte_common.h header is included in a compilation
using -Wbad-function-cast, such as in Open vSwitch where the
following warning is emitted repeatedly:
../rte_common.h: In function 'rte_is_aligned':
../rte_common.h:184:9: warning: cast from function call of
type 'uintptr_t' to non-matching type 'void *' [-Wbad-function-cast]
This change fixes the issue in rte_common.h by using the RTE_ALIGN_FLOOR
macro to get the aligned floor value with generic type casting.
Also removed the rte_align_floor_int() function and replaced it with
the RTE_PTR_ALIGN_FLOOR() macro.
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
x86intrin.h cannot be directly included as it is not
always available.
rte_common_vect.h handles compiler differences.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Commit a2348166ea18 ("tailq: move to dynamic tailq") introduced a bug in
uio/vfio resources list init.
These resources list were pointed at through a pointer initialised only once but
too early in the eal init (before tailqs init).
Fix this by "resolving" this pointer when used (which is well after tailqs
init).
Fixes: a2348166ea18 ("tailq: move to dynamic tailq")
Reported-by: Marvin Liu <yong.liu@intel.com>
Reported-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Tested-by: John McNamara <john.mcnamara@intel.com>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
The contigmem module was using an "int" type for specifying the
size of blocks of memory to be reserved. A 2GB block was therefore
overflowing the signed 32-bit value, making 1GB the largest block
size that could be reserved as a single unit.
The fix is to change the type used for the buffer/block size to
an "int64_t" value.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
There is no remaining reference to E_RTE_NO_TAILQ.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
No static entry remaining, the rte_tailq api is for "internal use" only, get rid
of the static slots.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This register system makes it possible to reserve a tailq for the dpdk
libraries.
The "dynamic" tailqs are right after the "static" tailqs in shared mem.
Primary process is responsible for writing the tailq names, so that secondary
processes can find them.
This is a temp commit, "static" tailqs are removed after conversion of all
users in next commits.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
A lot of places just protect against concurrent access and I can not see the
gain of having those macros.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>