This is a Linux-specific virtual PMD driver backed by an AF_PACKET
socket. This implementation uses mmap'ed ring buffers to limit copying
and user/kernel transitions. The PACKET_FANOUT_HASH behavior of
AF_PACKET is used for frame reception. In the current implementation,
Tx and Rx queues are always paired, and therefore are always equal
in number -- changing this would be a Simple Matter Of Programming.
Interfaces of this type are created with a command line option like
"--vdev=eth_af_packet0,iface=...". There are a number of options available
as arguments:
- Interface is chosen by "iface" (required)
- Number of queue pairs set by "qpairs" (optional, default: 1)
- AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
- AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
- AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
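
The option list above can be illustrated with a small standalone parser. This is only a sketch: the struct and function names are hypothetical, and the driver itself may parse its arguments differently. Only the option keys and default values come from the list above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the options listed above. */
struct af_packet_cfg {
	char iface[32];
	unsigned int qpairs;
	unsigned int blocksz;
	unsigned int framesz;
	unsigned int framecnt;
};

/* Parse "key=value,key=value"; returns 0 on success, -1 on error. */
static int
parse_af_packet_args(const char *args, struct af_packet_cfg *cfg)
{
	char buf[256];
	char *p = buf;

	assert(args != NULL && cfg != NULL);

	/* Defaults taken from the option list above. */
	memset(cfg, 0, sizeof(*cfg));
	cfg->qpairs = 1;
	cfg->blocksz = 4096;
	cfg->framesz = 2048;
	cfg->framecnt = 512;

	if (strlen(args) >= sizeof(buf))
		return -1;
	strcpy(buf, args);

	while (p != NULL && *p != '\0') {
		char *next = strchr(p, ',');
		char *val;

		if (next != NULL)
			*next++ = '\0';
		val = strchr(p, '=');
		if (val == NULL)
			return -1;
		*val++ = '\0';

		if (strcmp(p, "iface") == 0)
			snprintf(cfg->iface, sizeof(cfg->iface), "%s", val);
		else if (strcmp(p, "qpairs") == 0)
			cfg->qpairs = (unsigned int)strtoul(val, NULL, 10);
		else if (strcmp(p, "blocksz") == 0)
			cfg->blocksz = (unsigned int)strtoul(val, NULL, 10);
		else if (strcmp(p, "framesz") == 0)
			cfg->framesz = (unsigned int)strtoul(val, NULL, 10);
		else if (strcmp(p, "framecnt") == 0)
			cfg->framecnt = (unsigned int)strtoul(val, NULL, 10);
		else
			return -1; /* unknown key */
		p = next;
	}

	/* "iface" is the only required option. */
	return cfg->iface[0] != '\0' ? 0 : -1;
}
```

Passing "iface=eth0,qpairs=2" to this sketch fills in the interface name and queue pair count and leaves the three MMAP parameters at their defaults; a string without "iface" is rejected.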
Signed-off-by: John W. Linville <linville@tuxdriver.com>
[Thomas: disable because of incompatibility with some kernels]
It eliminates a race between threads using rte_alarm_cancel() and
rte_alarm_set().
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
According to the changes of the i40e base driver, two device
IDs (0x1573, 0x1582) are not supported anymore, and one new
device ID (0x1586) is supported. The list of i40e device IDs
supported by DPDK should be modified accordingly.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
KNI vhost compilation (CONFIG_RTE_KNI_VHOST=y) was broken.
rte_pktmbuf_mtod() is not used in the kernel context but is replaced
by a simple addition of the base address and the offset.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Following the big headers rework, all C++ stuff has moved to arch-specific
headers. The generic headers should not contain this so that this is done only
once.
There was a remaining #ifdef __cplusplus in "eal: split CPU cycle operation to
architecture specific" (fa4001c30e).
Reported-by: Keunhong Lee <dlrmsghd@gmail.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
No need to keep the same code duplicated for 32-bit and 64-bit x86.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Chao Zhu <bjzhuc@cn.ibm.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Architectures can have their own specific headers, so just install all headers
from the arch directory.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Chao Zhu <bjzhuc@cn.ibm.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the CPU flags related operations from DPDK and pushes them
to architecture specific arch directories, so that other processor
architectures can implement their own CPU flag functions to support DPDK.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the SSE based memory copy functions from DPDK and pushes
them to architecture specific arch directories. Other processor
architectures can implement their own vector based memory copy functions.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the spinlock operations from DPDK and pushes them to
architecture specific arch directories, so that support for other processor
architectures can be added to DPDK easily.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the prefetch operations from DPDK and pushes them to
architecture specific arch directories, so that other processor
architectures supporting DPDK can implement their own functions.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the CPU TSC read operations from DPDK and pushes them to
architecture specific arch directories, so that other processors that
don't have a TSC register can implement their own functions.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch splits the byte order operations from DPDK and pushes them to
architecture specific arch directories, so that support for other processor
architectures can be added to DPDK easily.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This patch first adds architecture specific directories to eal.
It then splits the atomic operations into architecture specific and generic files.
Architecture specific files are put into the corresponding architecture
directory and common headers are put into the generic directory.
Update documentation generation with new generic/ directory.
Signed-off-by: Chao Zhu <bjzhuc@cn.ibm.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
On FreeBSD, when initializing a secondary process,
EAL was complaining if there were ports not bound
to the nic_uio module and exiting the application. This
should not happen, as it is expected behaviour
and not an error.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In the case where a userspace reports itself as Ubuntu, but the
kernel isn't providing the expected version signature interface,
turn off Ubuntu specializations.
This situation happens often enough in development environments,
and with multi-distribution build servers (e.g. chroot, containers).
Signed-off-by: Alexander Guy <alexander@andern.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The maximum number of contiguous memory regions for FreeBSD is limited by
RTE_CONTIGMEM_MAX_NUM_BUFS; a pointer to each region is stored in
static void * contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS]
A user can specify a greater amount via hw.contigmem.num_buffers.
While the allocation logic will prevent this allocation from occurring, the logic
in contigmem_unload() will attempt to free hw.contigmem.num_buffers buffers and an
overrun occurs.
This patch limits the freeing to a maximum of RTE_CONTIGMEM_MAX_NUM_BUFS.
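
The clamp described above can be sketched in plain userspace C. This is an illustration under stated assumptions, not the kernel module code: MAX_NUM_BUFS is a hypothetical stand-in for RTE_CONTIGMEM_MAX_NUM_BUFS (its real value is not given here), and free() stands in for the kernel's release routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTE_CONTIGMEM_MAX_NUM_BUFS. */
#define MAX_NUM_BUFS 4

static void *buffers[MAX_NUM_BUFS];

/*
 * Free at most MAX_NUM_BUFS entries, whatever count the user requested.
 * Returns the number of slots actually visited.
 */
static int
free_buffers(int requested)
{
	int limit = requested < MAX_NUM_BUFS ? requested : MAX_NUM_BUFS;
	int i;

	if (limit < 0)
		limit = 0;
	assert(limit <= MAX_NUM_BUFS);

	for (i = 0; i < limit; i++) {
		free(buffers[i]); /* free(NULL) is a no-op */
		buffers[i] = NULL;
	}
	return limit;
}
```

With a requested count of 64, the loop in this sketch still stops after MAX_NUM_BUFS slots, so it never indexes past the static array.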
Signed-off-by: Alan Carew <alan.carew@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
While some applications may store metadata about packets in the packet
mbuf headroom, this is not a workable solution for packet metadata which
either:
* is larger than the headroom (or headroom is needed for adding pkt headers)
* needs to be shared or copied among packets
To support these use cases in applications, we reserve a general
"userdata" pointer field inside the second cache-line of the mbuf. This
is better than having the application store the pointer to the external
metadata in the packet headroom, as it saves an additional cache-line
from being used.
Apart from storing metadata, this field also provides a general 8-byte
scratch space inside the mbuf for any other application uses that are
applicable.
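
A minimal sketch of how an application might use such a field follows. The struct below is a hypothetical stand-in, not the real mbuf layout, and the names "udata64", "flow_meta", "pkt_attach_meta" and "pkt_meta" are assumptions made for illustration. It shows one metadata object shared by two packets, and the same 8 bytes reused as scratch space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application metadata shared among packets. */
struct flow_meta {
	uint32_t flow_id;
	uint32_t refcnt;
};

/* Hypothetical stand-in for the part of the mbuf holding the field. */
struct pkt {
	union {
		void *userdata;    /* pointer to external metadata */
		uint64_t udata64;  /* same bytes used as 8-byte scratch space */
	};
};

/* Point the packet at shared metadata and count the reference. */
static void
pkt_attach_meta(struct pkt *p, struct flow_meta *m)
{
	assert(p != NULL && m != NULL);
	p->userdata = m;
	m->refcnt++;
}

/* Retrieve the metadata previously attached to the packet. */
static struct flow_meta *
pkt_meta(const struct pkt *p)
{
	return p->userdata;
}
```

Because only a pointer is stored in the packet, the metadata can be larger than the headroom and can be referenced from any number of packets.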
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Recent Ubuntu 12.04.5 LTS is shipped with 3.13.0-36.63 as the only
supported kernel.
So skb_set_hash has been backported and conflicts with the KNI kcompat one.
Commit a09b359dac ("fix build on Ubuntu 14.04") describes the initial problem.
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
[Thomas: reorder conditions to ease reading]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The flag RTE_PCI_DRV_MULTIPLE was used to register an eth_driver allowing
multiple devices with a single PCI id.
It is now possible to register a pci_driver and create ethdev objects
using rte_eth_dev_allocate().
Suggested-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The function rte_snprintf() was deprecated in version 1.7.0
(commit 6f41fe75e2).
It's now totally removed.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
This field is not used anymore, remove it.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
There is no need for ioport access for applications that won't use virtio pmds.
Make rte_eal_iopl_init() non-static so that it can be called from the pmds that need it.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
From man(4) io:
"The initial implementation simply raised the IOPL of the current thread
when open(2) was called on the device. This behaviour is retained in the
current implementation as legacy support for both i386 and amd64."
http://www.freebsd.org/cgi/man.cgi?query=io&sektion=4
Nothing prevents us from closing it just after.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
KNI applies only to linux, so there should be no need for any kni files to
be present in the bsdapp eal folder.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Identify all options through the getopt_long return value.
This way, we only need a big switch/case.
Indentation is broken to ease commit review (fixed in next commit).
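
The approach can be sketched as follows. This is a simplified illustration, not the EAL parser: the option set, the "opts" struct and the value OPT_LOG_LEVEL_NUM are assumptions. Every long option gets its own return value (a short option character when one exists, otherwise a number above the char range), so a single switch handles everything.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical value for a long-only option, chosen above the char range. */
enum { OPT_LOG_LEVEL_NUM = 256 };

/* Hypothetical result structure. */
struct opts {
	int help;
	int log_level;
};

/* Returns 0 on success, -1 on an unknown or malformed option. */
static int
parse_opts(int count, char **vec, struct opts *o)
{
	static const struct option lgopts[] = {
		{ "help",      no_argument,       NULL, 'h' },
		{ "log-level", required_argument, NULL, OPT_LOG_LEVEL_NUM },
		{ NULL,        0,                 NULL, 0 }
	};
	int opt;

	assert(vec != NULL && o != NULL);
	optind = 1; /* start scanning from the first argument */

	while ((opt = getopt_long(count, vec, "h", lgopts, NULL)) != -1) {
		switch (opt) {
		case 'h': /* reached by both -h and --help */
			o->help = 1;
			break;
		case OPT_LOG_LEVEL_NUM:
			o->log_level = atoi(optarg);
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Since "--help" maps to the same return value as "-h", the short and long forms share one case label and no string comparison on the option name is needed.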
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
All common options are now in a single file.
Common usage() has been moved as well.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
We can handle both short and long options for those in the same case.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Following commit cac6d08c8b and 4bf3fe634a
(replace --use-device option by --pci-whitelist and --vdev),
this option is not available anymore.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Add a --log-level option to set the default eal log level.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
It is helpful when you want outside code to cooperate with and respect
log levels set in DPDK. Then you can avoid using duplicate incompatible
log code in the DPDK and non-DPDK parts of the app.
Signed-off-by: Matthew Hall <mhall@mhcomputing.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
[Thomas: add void to fix function signature]
This change splits the mbuf in two to move the pool and next pointers to
the second cache line. This frees up 16 bytes in first cache line.
The reason for this change is that we believe that there is no possible
way that we can ever fit all the fields we need to fit into a 64-byte
mbuf, and so we need to start looking at a 128-byte mbuf instead. Examples
of new fields that need to fit in include:
* 32 bits more for filter information for support for the new filters in
  the i40e driver (and possibly other future drivers)
* an additional 2-4 bytes for storing info on a second vlan tag to allow
  drivers to support double Vlan/QinQ
* 4 bytes for storing a sequence number to enable out of order packet
  processing and subsequent packet reordering
as well as potentially a number of other fields or splitting out fields
that are superimposed over each other right now, e.g. for the qos scheduler.
We also want to allow space for use by other non-Intel NIC drivers that may
be open-sourced to dpdk.org in the future too, where they support fields
and offloads that currently supported hardware doesn't.
If we accept the fact of a 2-cache-line mbuf, then the issue becomes
how to rework things so that we spread our fields over the two
cache lines while causing the lowest slow-down possible. The general
approach that we are looking to take is to focus the first cache
line on fields that are updated on RX, so that receive only deals
with one cache line. The second cache line can be used for application
data and information that will only be used on the TX leg. This would
allow us to work on the first cache line in RX as now, and have the
second cache line being prefetched in the background so that it is
available when necessary. Hardware prefetches should help us out
here. We may also move rarely used or slow-path RX fields, such
as those for chained mbufs with jumbo frames, to the second
cache line, depending upon the performance impact and bytes savings
achieved.
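
The layout idea can be sketched with a toy structure. This is not the real mbuf definition: the struct name and the selection of first-line fields are illustrative assumptions, and a 64-byte cache line is assumed. Only the placement of the pool and next pointers on the second cache line is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed cache line size */

/* Toy two-cache-line buffer header (hypothetical field selection). */
struct two_line_mbuf {
	/* First cache line: fields touched on RX. */
	void *buf_addr;
	uint16_t data_len;
	uint32_t pkt_len;
	uint64_t ol_flags;

	/* Second cache line: application and TX-only fields. */
	_Alignas(CACHE_LINE_SIZE) void *pool;
	struct two_line_mbuf *next;
};

/* Compile-time checks that the split falls where intended. */
static_assert(offsetof(struct two_line_mbuf, pool) == CACHE_LINE_SIZE,
	      "pool must start the second cache line");
static_assert(sizeof(struct two_line_mbuf) == 2 * CACHE_LINE_SIZE,
	      "structure must span exactly two cache lines");

/* Tell whether a field at the given offset lives in the first line. */
static int
in_first_line(size_t offset)
{
	return offset < CACHE_LINE_SIZE;
}
```

Compile-time assertions of this kind make an accidental spill of an RX field into the second cache line a build failure rather than a silent slow-down.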
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The offload flags field (ol_flags) was 16 bits wide and had no further room
for expansion. This patch increases the field size to 64 bits, using up
the remaining reserved space in the single-cache-line mbuf.
NOTE: none of the values for existing flags have been changed, i.e. no
new numbers have been explicitly reserved between existing flag
definitions.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
* Reorder the fields in the mbuf so that we have fields that are used
  together side-by-side in the structure. This means that we have a
  contiguous block of 8 bytes in the mbuf which are used to reset an mbuf
  on descriptor rearm, and a block of 16 bytes of data (excluding flags)
  which are set on RX from the received packet descriptor.
* Use dummy fields as appropriate to ensure alignment or to reserve gaps
  for later field additions.
* Place most items which are not used by fast-path RX separately at the end
  of the structure so they can later be moved to a separate cache line.
  [The l2/l3 length fields are not moved at this stage as doing so will
  cause overflow to the next cache line].
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The mbuf structure already contains a pointer to the beginning of the
buffer (m->buf_addr). There is no need to use 8 bytes again to store
another pointer to the beginning of the data.
Using a 16-bit unsigned integer is enough as we know that an mbuf is
never longer than 64KB. We gain 6 bytes in the structure thanks to
this modification.
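
The change can be illustrated with a reduced structure. This is a sketch only: "buf_hdr", "data_off" and "buf_data" are hypothetical names, and only m->buf_addr is taken from the text above. The data address is computed by adding the 16-bit offset to the buffer base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the mbuf: base pointer plus a 16-bit data offset. */
struct buf_hdr {
	void *buf_addr;     /* start of the buffer */
	uint16_t data_off;  /* offset of the data within the buffer */
};

/* Compute the start of the data from the base address and the offset. */
static void *
buf_data(const struct buf_hdr *m)
{
	assert(m != NULL && m->buf_addr != NULL);
	return (char *)m->buf_addr + m->data_off;
}
```

On a platform with 8-byte pointers, replacing the second pointer with a uint16_t accounts for the 6 bytes gained.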
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
* Updated to apply to latest on mainline.
* Disabled vector PMD in config as it relies heavily on the mbuf layout.
  This will be re-enabled in a subsequent commit once vPMD has been
  reworked to take account of mbuf changes.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
- pci_num_vf() is already defined in RHEL 6
- pci_intx_mask_supported is already defined in RHEL 6.3
- pci_check_and_mask_intx is already defined in RHEL 6.3
Signed-off-by: Guillaume Gaudonville <guillaume.gaudonville@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This reverts commit 399a3f0db8
"fix IRQ mode handling"
and part of commit 4a5c221f9d
"fix compability on old kernel"
MSI implementation is using irq_to_desc which is not exported before
kernel 3.4 and commit 3911ff30.
Let's revert it for release 1.7.1, waiting for another solution.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since Linux commit "set name_assign_type in alloc_netdev" (c835a677331495),
the function alloc_netdev takes a new parameter (name_assign_type)
whose default value is NET_NAME_UNKNOWN.
Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The sysfs directory for hugepages parsing was not closed properly in some
error cases.
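
The general pattern for this kind of fix is a single exit path that always closes the directory. The function below is a generic sketch, not the EAL code: its name, the prefix matching and the "malformed entry" condition are assumptions made only to create an error path inside the loop.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Count directory entries whose name starts with prefix.
 * Returns -1 if the directory cannot be opened or an entry is malformed.
 */
static int
count_prefixed_entries(const char *path, const char *prefix)
{
	DIR *dir;
	struct dirent *ent;
	size_t plen;
	int count = 0;

	assert(path != NULL && prefix != NULL);
	plen = strlen(prefix);

	dir = opendir(path);
	if (dir == NULL)
		return -1; /* nothing to close yet */

	while ((ent = readdir(dir)) != NULL) {
		if (strncmp(ent->d_name, prefix, plen) != 0)
			continue;
		if (ent->d_name[plen] == '\0') {
			/* Hypothetical error: prefix with no suffix. */
			count = -1;
			goto out; /* error path still reaches closedir() */
		}
		count++;
	}
out:
	closedir(dir);
	return count;
}
```

Routing every return after a successful opendir() through the "out" label means a new error case added later cannot leak the directory handle.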
Signed-off-by: Zhangkun <zhangk.zhangkun@huawei.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>