Updating functionality in EAL to support adding link bonding
devices via –vdev option. Link bonding devices will be
initialized after all physical devices have been probed and
initialized.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The function rte_snprintf serves no useful purpose. It is the
same as snprintf() for all valid inputs. Deprecate it and
replace all uses in current code.
Leave the tests for the deprecated function in place.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Mark the rte_log, cmdline_printf and rte_snprintf functions as
being printf-style functions. This causes compilation errors
due to mis-matched parameter types, so the parameter types are
fixed where appropriate.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Since commit "add FILE argument to debug functions" (591a9d7985c1230),
application which includes rte_memory.h without stdio.h will be hit
compilation failure:
/path/to/include/rte_memory.h:146:30: error: unknown type name ‘FILE’
void rte_dump_physmem_layout(FILE *f);
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Reviewed-by: Hayato Momma <h-momma@ce.jp.nec.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Only some devices support the link state interrupt configuration option.
Link state control does not work in virtual drivers
(virtio, vmxnet3, igbvf, and ixgbevf). Instead of having the application
try and guess whether it will work or not provide a driver flag that
can be checked instead.
Note: if device driver doesn't support link state control, what
would happen previously is that the code would never detect link
transitions. This prevents that.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: rename flag]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Problems with lib rte_malloc:
1. Rte_malloc searches a heap's entire free list looking for the best
fit, resulting in linear complexity.
2. Heaps store free blocks in a singly-linked list, resulting in
linear complexity when rte_free needs to remove an adjacent block.
3. The library inserts and removes free blocks with ad hoc, in-line
code, rather than using linked-list functions or macros.
4. The library wastes potential small blocks of size 64 and 128 bytes
(plus overhead of 64 bytes) as padding when reusing free blocks or
resizing allocated blocks.
This patch addresses those problems as follows:
1. Replace single free list with a handful of free lists. Each free
list contains blocks of a specified size range, for example:
list[0]: (0 , 2^8]
list[1]: (2^8 , 2^10]
list[2]: (2^10, 2^12]
list[3]: (2^12, 2^14]
list[4]: (2^14, MAX_SIZE]
When allocating a block, start at the first list that can contain
a big enough block. Search subsequent lists, if necessary.
Terminate the search as soon as we find a block that is big enough.
2. Use doubly-linked lists, so that we can remove free blocks in
constant time.
3. Use BSD LIST macros, as defined in sys/queue.h and the QUEUE(3)
man page.
4. Change code to utilize small blocks of data size 64 and 128, when
splitting larger blocks.
Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Unit tests for Packet Framework libraries.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
The Packet Framework pipeline library provides a standard methodology
(logically similar to OpenFlow) for rapid development of complex packet
processing pipelines out of ports, tables and actions.
A pipeline is constructed by connecting its input ports to its output ports
through a chain of lookup tables. As result of lookup operation into the
current table, one of the table entries (or the default table entry, in case
of lookup miss) is identified to provide the actions to be executed on the
current packet and the associated action meta-data.
The behavior of user actions is defined through the configurable table action
handler, while the reserved actions define the next hop for the current packet
(either another table, an output port or packet drop) and are handled
transparently by the framework.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the operations to be implemented by
any Packet Framework table.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
This file defines the port operations that have to be implemented
by Packet Framework ports.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked by: Ivan Boule <ivan.boule@6wind.com>
Moving interrupt type enum out of igb_uio and renaming it to be more
generic. Such a strange header naming and separation is done mostly to
make coming virtio patches easier to port to dpdk.org tree.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Rename the RTE_PCI_DRV_NEED_IGB_UIO to be more generic.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Currently, EAL does not distinguish between actual failures and expected
initialization errors. E.g. sometimes the driver fails to initialize
because it was not supposed to be initialized in the first place, such
as device not being managed by said driver.
This patch makes EAL fail on actual initialization errors while still
skipping over expected initialization errors.
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This adds the code for a new Intel DPDK library for packet distribution.
The distributor is a component which is designed to pass packets
one-at-a-time to workers, with dynamic load balancing. Using the RSS
field in the mbuf as a tag, the distributor tracks what packet tag is
being processed by what worker and then ensures that no two packets with
the same tag are in-flight simultaneously. Once a tag is not in-flight,
then the next packet with that tag will be sent to the next available
core.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
[Thomas: add doxygen @file comment]
Allows to lookup four IP addresses in an LPM table.
Uses SSE instrincts.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Pablo de Lara Guarch <pablo.de.lara.guarch@intel.com>
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Recent change to rte_dump_tailq (commit 591a9d7985c1230652),
which now uses a FILE parameter causes compilation to fail under FreeBSD
and sourced to a missing include of stdio.h.
Errors:
rte_tailq.h: unknown type name 'FILE' void rte_dump_tailq(FILE *f);
rte_memory.h: unknown type name 'FILE' void rte_dump_physmem_layout(FILE *f);
Signed-off-by: Alan Carew <alan.carew@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There was an indentation error in commit e57f20e051770861377ff
"make vdev init path generic for both virtual and pci devices"
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Currently, physical device pmds use a separate initalization path
(rte_pmd_init_all) while virtual devices use a constructor registration and
rte_eal_dev_init. Theres no reason to have them be separate. This patch
removes the vdev specific nomenclature from the vdev init path and makes it more
generic for use with all pmds. This is the first step in converting the
physical device pmds to using the same constructor based registration path that
the virtual devices use.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Rather than have each driver have to remember to add a constructor to it to make
sure its gets registered properly, wrap that process up in a macro to make
registration a one line affair. This also sets the stage for us to make
registration of vdev pmds and physical pmds a uniform process
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
When doing diagnostic function, it is useful to have a ability
to iterate over all memzones.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
It makes no sense to inline string functions, in fact snprintf
can't be inlined because the function supports variable number of
arguments.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: update includes]
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The DPDK dump functions are useful for remote debugging of an
applications. But when application runs as a daemon, stdout
is typically routed to /dev/null.
Instead change all these functions to take a stdio FILE * handle
instead. An application can then use open_memstream() to capture
the output.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Start development cycle for version 1.7.0.
This new development workflow introduces a new versioning scheme.
Instead of having releases r0, r1, r2, etc, there will be release
candidates. Last number has special meanings:
< 16 numbers are reserved for release candidates (RTE_VER_SUFFIX is -rc)
16 is reserved for the release (RTE_VER_SUFFIX must be unset)
> 16 numbers can be used locally (RTE_VER_SUFFIX must be set)
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
There is no need for a 'magic' field in struct rte_config, as this part of the
structure is local to each process. All threads of a process are synchronised
because of the run_once atomic.
So remove this field, as it is only adding confusion when reading code that
references 'magic' field from struct rte_mem_config.
Besides, there is no reference about the 'version' field, so remove it as well.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
There should be no real need for this initialised field as the whole structure
is set to 0 in rte_config_init() by primary process, and secondary processes
wait for this to happen before anything else (looking at mem_config magic).
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
We don't really need this field as it is only used when creating the memzone
object associated to this heap.
Removing numa_socket field makes things simpler and remove race condition.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Only the last feature was checked since commit 99f2cdf9ca10
(eal: fix %rbx corruption and simplify the code)
The return code for rte_cpu_get_flag_enabled is only checked on the termination
of the for loop that it is called inside, but should be checked for every
iteration it makes through the for loop. This is caused by some silly missing
brackets. Simply add them in
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Pablo De Lara Guarch <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Instead of having a list of virtual device drivers in EAL code, add an
API to register drivers. Thanks to this new registration method, we can
remove the references to pmd_ring, pmd_pcap and pmd_xenvirt in EAL code.
This also enables the ability to register a virtual device driver as
a shared library.
The registration is done in an init function flaged with
__attribute__((constructor)). The new convention is to name this
function rte_pmd_xyz_init(). The per-device init function is renamed
rte_pmd_xyz_devinit().
By the way the internal PMDs are now also .so/standalone ready. Let's do
it later on. It will be required to ease maintenance.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
The name "nonpci_devs" for virtual devices is ambiguous as a physical
device can also be non-PCI (ex: usb, sata, ...). A better name for this
file is "vdev" as it only deals with virtual devices.
This patch doesn't introduce any change except renaming.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Some PCI drivers may require some specific initialization arguments at
start-up.
Even if unused today, adding this feature seems coherent with virtual
devices in order to provide a full-featured rte_devargs framework. In
the future, it could be added in pmd_ixgbe or pmd_igb for instance to
enable debug of drivers or setting a specific operating mode at
start-up.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This commit changes the API of --use-device command line argument.
It changes the separators from ';' to ','. Indeed, ';' is not the best
choice as this character is also used to separate shell commands,
forcing the user to surround arguments with quotes.
This commit impacts both devargs and kvargs as each of them define
a separator in --use-device argument:
- devargs defines the separator between the device name or pci_id and
its arguments
- kvargs defines the separator between each key/value pairs in
arguments for drivers using the kvargs API to parse their arguments
The modification of devargs and kvargs is done in one commit to keep
the coherency of --use-device.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Remove old whitelist code:
- remove references to rte_pmd_ring, rte_pmd_pcap and pmd_xenvirt in
is_valid_wl_entry() as we want to be able to register external virtual
drivers as a shared library. Moreover this code was duplicated with
dev_types[] from eal_common_pci.c
- eal_common_whitelist.c was badly named: it was able to process PCI
devices white list and the registration of virtual devices
- the parsing code was complex: all arguments were prepended in
one string dev_list_str[4096], then split again
Use the newly introduced rte_devargs to get:
- the PCI white list
- the PCI black list
- the list of virtual devices
Rework the tests:
- a part of the whitelist test can be removed as it is now tested
in app/test/test_devargs.c
- the other parts are just reworked to adapt them to the new API
This commit induce a small API modification: it is not possible to specify
several devices per "--use-device" option. This notation was anyway a bit
cryptic. Ex:
--use-device="eth_ring0,eth_pcap0;iface=ixgbe0"
now becomes:
--use-device="eth_ring0" --use-device="eth_pcap0;iface=ixgbe0"
On the other hand, it is now possible to work in PCI blacklist mode and
instanciate virtual drivers, which was not possible before this patch.
Test result:
./app/test -c 0x15 -n 3 -m 64
RTE>>devargs_autotest
EAL: invalid PCI identifier <08:1>
EAL: invalid PCI identifier <00.1>
EAL: invalid PCI identifier <foo>
EAL: invalid PCI identifier <>
EAL: invalid PCI identifier <000f:0:0>
Test OK
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
This commit introduces a new API for storing device arguments given by
the user. It only adds the framework and the test. The modification of
EAL to use this new module is done in next commit.
The final goals:
- unify pci-blacklist, pci-whitelist, and virtual devices arguments
in one file
- allow to register a virtual device driver from a dpdk extension
provided as a shared library. For that we will require to remove
references to rte_pmd_ring and rte_pmd_pcap in argument parsing code
- clarify the API of eal_common_whitelist.c, and rework its code that is
often complex for no reason.
- support arguments for PCI devices and possibly future non-PCI devices
(other than virtual devices) without effort.
Test result:
echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
./app/test -c 0x15 -n 3 -m 64
RTE>>eal_flags_autotest
[...]
Test OK
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
To avoid confusion with virtual devices, rename device_list as
pci_device_list and driver_list as pci_driver_list.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Neil Horman reported that on x86-64 the upper half of %rbx would get
clobbered when the code was compiled PIC or PIE, because the
i386-specific code to preserve %ebx was incorrectly compiled.
However, the code is really way more complex than it needs to be. For
one thing, the CPUID instruction only needs %eax (leaf) and %ecx
(subleaf) as parameters, and since we are testing for bits, we might
as well list the bits explicitly. Furthermore, we can use an array
rather than doing a switch statement inside a structure.
Reported-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
Insert get_physaddr() into public API as rte_mem_virt2phy().
rte_mem_virt2phy() permits to obtain the physical address of any
virtual address mapped to the current process.
get_physaddr() was working only for addresses pointing exactly to
the first byte of a page.
Note that this function is very slow and shouldn't be called
after initialization to avoid a performance bottleneck.
The memory must be locked with mlock(). The function rte_mem_lock_page()
is a mlock() helper that lock the whole page.
A better name would be rte_mem_virt2phys but rte_mem_virt2phy is more
consistent with rte_mempool_virt2phy.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
This reverts commit 57c24af85d9eaa81549a212169605b4e2468a29f
which was wrongly rebased in 1.6.0 branch:
- commit log must be changed for 1.6.0
- it breaks building for 32-bit
A new version of this commit has to be done.
Applications can test versions, for compatibility, this way:
#if RTE_VERSION >= RTE_VERSION_NUM(1,2,3,4)
RTE_VERSION was already defined for use with rte_config.
It is moved in rte_version.h and updated to current version number.
Note that the first tag having this helper is 1.2.3r2.
Releases r0 have not this patch.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Intel 82546EB Gigabit ethernet controller is reported to be working
with copper.
Tested-by: Ognjen Joldzic <ognjen.joldzic@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>