Checkin
a132a9cf2bcd440a974b9d3f5c44ba30b2c895a1 hash: use intrinsic
changed the rte_hash_crc.h from using the crc32 instruction via inline
assembly to using an intrinsic. The intrinsic should allow for better
compiler performance, but the change did not account for the fact that
the inline assembly being in AT&T syntax used the opposite operand
order of the intrinsic.
This turns out to not matter for correctness, because the CRC32
operation is commutative. However, it could potentially matter for
performance, because the loop is more efficient with the moving
pointer in the source operand and the accumulation in the destination
operand.
This was discovered by Jan Beulich when looking at the equivalent code
in the Linux kernel.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Reported-by: Pashupati Kumar <kumarp@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This reverts commit "log: get full path as syslog id" (494a02537f1)
and restore the original patch from Stephen Hemminger (04210699eee).
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
In rte_kvargs_process() and rte_kvargs_count(), if the key_match
argument is NULL, process all entries.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This argument can be useful when rte_kvargs_process() is called with
key=NULL, in this case the handler is invoked for all entries of the
kvlist.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The "value" argument is read-only and should be const.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
When we match a key in is_valid_key() and rte_kvargs_process(), do a
strict comparison (strcmp()) instead of using strstr(s1, s2) which tries
a find s1 in s2. This old behavior could lead to unexpected match, for
instance "cola" match "chocolate".
Surprisingly, no patch was needed on rte_kvargs_count() as it already
used strcmp().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Remove the rte_kvargs_add_pair() function whose only role was to check
if a key is duplicated. Having duplicated keys is now allowed by kvargs
API.
Also replace rte_strsplit() by more a standard function strtok_r() that
is easier to understand for people already knowing the libc. It also
avoids useless calls to strnlen(). The delimiters macros become strings
instead of chars due to the strtok_r() API.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Before the patch, a call to rte_kvargs_tokenize() resulted in a call to
strdup() to allocate a modifiable copy of the argument string. This
string was never freed, excepted in the error cases of
rte_kvargs_tokenize() where rte_free() was wrongly called instead of
free(). In other cases, freeing this string was impossible as the
pointer not saved.
This patch introduces rte_kvargs_free() in order to free the structure
properly. The pointer to the duplicated string is now kept in the
rte_kvargs structure. A call to rte_kvargs_parse() directly allocates
the structure, making rte_kvargs_init() useless.
The only drawback of this API change is that a key/value associations
cannot be added to an existing kvlist. But it's not used today, and
there is not obvious use case for that.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This value was not very useful as the size of the table is fixed (equals
RTE_KVARGS_MAX).
By the way, the memset in the initialization function was wrong (size
too short). Even if it was not really an issue since we rely on the
"count" field, it is now fixed by this patch.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Now that rte_kvargs is a generic library, there is no need to have an argument
for the driver name in rte_kvargs_tokenize() and rte_kvargs_parse()
prototypes. This argument was only used to log the driver name in case of
error. Instead, we can add a log in init function of pmd_pcap and pmd_ring.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The rte_kvargs library is a reworked copy of rte_eth_pcap_arg_parser,
so it provides the same service. Therefore we can use it and remove the
code of rte_eth_pcap_arg_parser.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This macro was used for blacklist parsing but is not used anymore
since commit 5a55b9ac91face71e9d665eecc716201d28745b0.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
1) In the EAL initialization phase, invoke the function rte_eal_cpu_init
to detect the set of running cores (and enable them by default) before
processing the [enabled] core mask option that is performed during the
parsing of EAL arguments.
2) In the function rte_eal_cpu_init():
- to parse the set of all running logical cores on the machine, do not
use the RTE_LCORE_FOREACH macro that considers the set of already
detected cores...
Instead, use a standard loop based on the RTE_MAX_LCORE constant.
- explicitely set to ROLE_RTE the role of each detected logical core
that is recorded in the EAL configuration, as all running cores are
enabled by default.
3) In the function eal_parse_coremask(), update the "lcore_count" field
of the EAL configuration with the effective number of logical cores
that are set in the mask of enabled logical cores.
Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Adding or subtracting a value to a pointer makes a new pointer
of unknown type.
So typeof() is replaced by (void*) in RTE_PTR_ADD() and RTE_PTR_SUB().
But RTE_PTR_ALIGN_* macros have in their explicit API to return a pointer
of the same type. Since RTE_PTR_ALIGN_CEIL is based on RTE_PTR_ADD, a
typeof() is added to keep the original behaviour.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Add an option to specify libraries to be loaded before probing the PCI.
For instance, testpmd -d librte_pmd_xxx.so can be used to enable xxx driver
support on testpmd without any recompilation of testpmd.
Plugins are loaded before creating threads because we want the threads to
inherit any property that could be set while loading a plugin, such as iopl().
Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
- rte_panic must be before rte_panic_ to be associated to its doc
- marker /**< must be used when commenting after the declaration only
- fix rte_string_fns.h title
- typos
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
The following changes are included in this patch for ixgbe:
* Support for a separate Vector Poll-Mode Driver component
* Refactoring to extract out definitions from .c file to separate .h
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
This is a fix for the ixgbe hardware offload flags not being set
when bulk alloc RX is used. The issue was caused by masking off
the bits that store the hardware offload values in the status_error
field to retrieve the done bit for the descriptor.
Commit 7431041062b9fd0555bac7ca4abebc99e3379fa5 in DPDK-1.3.0
introduced bulk dequeue, which included the bug.
Signed-off-by: Bryan Benson <bmbenson@amazon.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
Physical Function assignes Tx/Rx queues to each Virtual Function
according to different schemes[1]. By querying through mailbox,
VF is able to get number of Tx/Rx queues assigned to it.
Note that current Intel ixgbe driver ixgbe-3.18.7 does not fully
support mailbox message IXGBE_VF_GET_QUEUES. The service routine
for IXGBE_VF_GET_QUEUES must be fixed, otherwise PF always return
1 as Tx/Rx queue number.
[1] See section 7.2.1.2.1, 7.1.2.2 and 7.10.2.7.2 of Intel 82599 10
Gbe Controller Datasheet.
Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Following introduction of loopback mode, this mode should be explicitely
disabled in ixgbe_dev_rx_init() if not enabled.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
82599 has two loopback operation modes, Tx->Rx and Rx->Tx.
For the time being only Tx->Rx is supported.
The new field lpbk_mode added in struct rte_eth_conf defines loopback
operation mode for certain ethernet controller. By default the value
of lpbk_mode is 0, meaning loopback mode disabled.
Since each ethernet controller has its own definition of loopback modes,
API user has to check both datasheet and implementation of certain driver
so as to understand what are valid values to be set, and what are the
expected behaviors.
Check IXGBE_LPBK_82599_XXX which are defined in ixgbe_ethdev.h
for valid values of 82599 loopback mode.
Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Venky Venkatesan <venky.venkatesan@intel.com>
The 82576 has known issues which require the write threshold to be set to 1.
See:
http://download.intel.com/design/network/specupdt/82576_SPECUPDATE.pdf
If not then single packets will hang in transmit ring until more arrive.
Simple tests like ping will fail.
The workaround was in the wrong file (commit a30ebfbb8c3a).
Move it in igb one to restore original patch (7e9e49feea).
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It should be possible to enable RSS with one Rx queue.
RSS hash can be useful independently of the number of Rx queues.
Applications can use RSS hash to identify different IP flows.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
As explained in rte_ethdev.h, ETH_MQ_RX_NONE allows to not choose RSS, DCB
or VMDQ mode.
But the igb/ixgbe code always silently select the RSS mode with ETH_MQ_RX_NONE.
This patch fixes this incoherence between the API and the implementation.
Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
Poll Mode Driver for Paravirtual VMXNET3 NIC.
As a PMD, the VMXNET3 driver provides the packet reception and transmission
callbacks, vmxnet3_recv_pkts and vmxnet3_xmit_pkts. It does not support
scattered packet reception as part of vmxnet3_recv_pkts and
vmxnet3_xmit_pkts. Also, it does not support scattered packet reception as part of
the device operations supported.
The VMXNET3 PMD handles all the packet buffer memory allocation and resides in
guest address space and it is solely responsible to free that memory when not needed.
The packet buffers and features to be supported are made available to hypervisor via
VMXNET3 PCI configuration space BARs. During RX/TX, the packet buffers are
exchanged by their GPAs, and the hypervisor loads the buffers with packets in the RX
case and sends packets to vSwitch in the TX case.
The VMXNET3 PMD is compiled with vmxnet3 device headers. The interface is similar
to that of the other PMDs available in the Intel(R) DPDK API. The driver pre-allocates the
packet buffers and loads the command ring descriptors in advance. The hypervisor fills
those packet buffers on packet arrival and write completion ring descriptors, which are
eventually pulled by the PMD. After reception, the Intel(R) DPDK application frees the
descriptors and loads new packet buffers for the coming packets. The interrupts are
disabled and there is no notification required. This keeps performance up on the RX
side, even though the device provides a notification feature.
In the transmit routine, the Intel(R) DPDK application fills packet buffer pointers in the
descriptors of the command ring and notifies the hypervisor. In response the hypervisor
takes packets and passes them to the vSwitch. It writes into the completion descriptors
ring. The rings are read by the PMD in the next transmit routine call and the buffers
and descriptors are freed from memory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Update the KNI kernel driver so it can compile on more modern kernels
Also, rebaseline the ethtool support off updated igb kernel drivers
so that we get the latest bug fixes and device support.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
kni_net fixed to prevent losing packet bytes when doing loopback.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reported-by: Daniel Kaminsky <daniel.kaminsky@infinitelocality.com>
This provides a para-virtualization packet switching solution, based on the
Xen hypervisor’s Grant Table, which provides simple and fast packet
switching capability between guest domains and host domain based on
MAC address or VLAN tag.
This solution is comprised of two components; a Poll Mode Driver (PMD)
as the front end in the guest domain and a switching back end in the
host domain. XenStore is used to exchange configure information
between the PMD front end and switching back end,
including grant reference IDs for shared Virtio RX/TX rings, MAC
address, device state, and so on.
The front end PMD can be found in the Intel DPDK directory lib/
librte_pmd_xenvirt and back end example in examples/vhost_xen.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Core support for using the Intel DPDK with Xen Dom0 - including EAL
changes and mempool changes. These changes encompass how memory mapping
is done, including support for initializing a memory pool inside an
already-allocated block of memory.
KNI sample app updated to use KNI close function when used with Xen.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
These library changes provide a new Intel DPDK feature for communicating
with virtual machines using QEMU's IVSHMEM mechanism.
The feature works by providing a command line for QEMU to map several hugepages
into a single IVSHMEM device. For the guest to know what is inside any given IVSHMEM
device (and to distinguish between Intel(R) DPDK and non-Intel(R) DPDK IVSHMEM
devices), a metadata file is also mapped into the IVSHMEM segment. No work needs to
be done by the guest application to map IVSHMEM devices into memory; they are
automatically recognized by the Intel(R) DPDK Environment Abstraction Layer (EAL).
Changes in this patch:
* Changes to EAL to allow mapping of all hugepages in a memseg into a single file
* Changes to EAL to allow ivshmem devices to be transparently mapped in
the process running on the guest.
* New ivshmem library to create and manage metadata exported to guest VM's
* New ivshmem compilation targets
* Mempool and ring changes to allow export of structures to a VM and allow
a VM to attach to those structures.
* New autotests to unit tests this functionality.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
For certain functionality, e.g. Xen Dom0 support, it is required that
we can guarantee that memzones for descriptor rings won't cross 2M
boundaries. So add new memzone reserve function where we can pass in a
boundary condition parameter.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Extra space for future alignment was reserved twice.
It was introduced in version 1.3.0 (commit 916e4f4f4e45a1d3cdd473cf9ef).
Signed-off-by: Pei Chao <peichao85@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
In some cases, it is possible to not use hugepages.
So a simple malloc is used to initialize DPDK memory.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Damien Millescamps <damien.millescamps@6wind.com>
For multi-process applications, it can sometimes occur that part of the
address ranges used for memory mapping in the primary process are not
free in the secondary process, which causes the secondary processes to
abort on startup.
This patch adds in a memory hinting mechanism, where you can hint a
starting base address to the primary process for where you would like
the hugepage memory to be mapped. It is just a hint, so the memory will
not always go exactly where requested, but it should allow the memory
addresses used by a primary process to be adjusted up or down a little,
thereby fixing issues with secondary process startup.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Allow poll-mode drivers to maintain their own caches of mbufs, by allowing them
to check if it's ok to free an mbuf (to their local cache) without actually
freeing it back to the memory pool itself.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>