Commit Graph

616 Commits

Author SHA1 Message Date
Stephen Hemminger
58f8a1d2e3 memzone: add iterator function
When writing diagnostic functions, it is useful to have the ability
to iterate over all memzones.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-05-16 16:02:55 +02:00
Stephen Hemminger
e5ac7c2ff3 eal: don't inline string functions
It makes no sense to inline string functions, in fact snprintf
can't be inlined because the function supports variable number of
arguments.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: update includes]
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-16 16:02:55 +02:00
Stephen Hemminger
591a9d7985 add FILE argument to debug functions
The DPDK dump functions are useful for remote debugging of an
application. But when the application runs as a daemon, stdout
is typically routed to /dev/null.

Instead, change all these functions to take a stdio FILE * handle.
An application can then use open_memstream() to capture
the output.
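
A minimal sketch of the capture pattern described above, assuming a POSIX
system. demo_dump() and demo_capture_dump() are hypothetical names used for
illustration only; they stand in for any dump function that takes a FILE *.

```c
#define _POSIX_C_SOURCE 200809L /* for open_memstream() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical dump function in the new style: it writes to any FILE *. */
static void demo_dump(FILE *f, unsigned int nb_items)
{
    assert(f != NULL);
    fprintf(f, "items=%u\n", nb_items);
}

/*
 * Run the dump into an in-memory stream and return the text.
 * The caller owns the returned buffer and must free() it.
 * Returns NULL if the stream cannot be created or closed.
 */
static char *demo_capture_dump(unsigned int nb_items)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *f = open_memstream(&buf, &len);

    if (f == NULL)
        return NULL;
    demo_dump(f, nb_items);
    if (fclose(f) != 0) {   /* fclose() finalizes buf and len */
        free(buf);
        return NULL;
    }
    return buf;
}
```

A daemon could then send the captured string over whatever management
channel it already has, instead of relying on stdout.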

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 16:02:55 +02:00
Stephen Hemminger
c738c6a644 spelling fixes
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-16 16:02:55 +02:00
Ivan Boule
168dfa6166 app/testpmd: add engine that replies to ARP and ICMP echo requests
Add a new specific packet processing engine in the "testpmd" application that
only replies to ARP requests and to ICMP echo requests.
For this purpose, a new "icmpecho" forwarding mode is provided that can be
dynamically selected with the following testpmd command:

    set fwd icmpecho

before starting the receipt of packets on the selected ports.

Then, the "icmpecho" engine performs the following actions on all received
packets:

- replies to a received ARP request by sending back on the RX port an ARP
  reply with a "sender hardware address" field containing the MAC address
  of the RX port,

- replies to an ICMP echo request by sending back on the RX port an ICMP echo
  reply, swapping the IP source and the IP destination address in the IP
  header,

- otherwise, simply drops the received packet.
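
The ICMP part of the actions above can be sketched as follows. This is an
illustration working on raw header bytes, not the testpmd code; all demo_*
names are hypothetical. It assumes an IPv4 header laid out as in RFC 791
(source address at byte 12, destination at byte 16) and the RFC 1071
checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RFC 1071 Internet checksum over a byte buffer, returned in host order. */
static uint16_t demo_inet_cksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Swap the IPv4 source (bytes 12..15) and destination (bytes 16..19). */
static void demo_swap_ipv4_addrs(uint8_t *ip_hdr)
{
    uint8_t tmp[4];

    memcpy(tmp, ip_hdr + 12, 4);
    memcpy(ip_hdr + 12, ip_hdr + 16, 4);
    memcpy(ip_hdr + 16, tmp, 4);
}

/*
 * Turn an ICMP echo request (type 8) into an echo reply (type 0) and
 * recompute the ICMP checksum. Returns 0 on success, or -1 when the
 * message is not an echo request (the caller would drop the packet).
 */
static int demo_icmp_echo_to_reply(uint8_t *icmp, size_t len)
{
    uint16_t ck;

    assert(icmp != NULL);
    if (len < 8 || icmp[0] != 8)
        return -1;
    icmp[0] = 0;            /* type: echo reply */
    icmp[2] = 0;            /* zero the checksum field before summing */
    icmp[3] = 0;
    ck = demo_inet_cksum(icmp, len);
    icmp[2] = (uint8_t)(ck >> 8);
    icmp[3] = (uint8_t)(ck & 0xff);
    return 0;
}
```

Swapping the two addresses leaves the IPv4 header checksum unchanged, since
the one's-complement sum does not depend on the order of the 16-bit words.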

When replying to a received packet that was encapsulated into a VLAN tunnel,
the reply is sent back with the same VLAN identifier.
By default, testpmd configures the VLAN header stripping RX option on each
port.
This option is not managed by the icmpecho engine, which won't detect
packets that were encapsulated into a VLAN.
To address this issue, the VLAN header stripping option must first be
switched off with the following testpmd command:

    vlan set strip off

When the "verbose" mode has been set with the testpmd command
"set verbose 1", the "icmpecho" engine displays information about each
received packet.

The "icmpecho" forwarding engine can also be used to simply check port
connectivity at the hardware level (check that cables are well-plugged)
and at the software level (receipt of VLAN packets, for instance).

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 13:25:16 +02:00
Julien Cretin
2612a4b935 mem: remove redundant check in optimize_object_size
The second condition of this logical OR:
    (get_gcd(new_obj_size, nrank * nchan) != 1 ||
    get_gcd(nchan, new_obj_size) != 1)
is redundant with the first condition.

We can show that the first condition is equivalent to its disjunction
with the second condition using these two results:

- R1: For all conditions A and B, if B implies A, then (A || B) is
  equivalent to A.

- R2: (get_gcd(nchan, new_obj_size) != 1) implies
  (get_gcd(new_obj_size, nrank * nchan) != 1)

We can show R1 with the following truth table (0 is false, 1 is true):
        +-----+-----++----------+-----+-------------+
        |  A  |  B  || (A || B) |  A  | B implies A |
        +-----+-----++----------+-----+-------------+
        |  0  |  0  ||     0    |  0  |      1      |
        |  0  |  1  ||     1    |  0  |      0      |
        |  1  |  0  ||     1    |  1  |      1      |
        |  1  |  1  ||     1    |  1  |      1      |
        +-----+-----++----------+-----+-------------+
                Truth table of (A || B) and A

We can show R2 by looking at the code of optimize_object_size and
get_gcd.

We see that:
- S1: (nchan >= 1) and (nrank >= 1).
- S2: get_gcd returns 0 only when both arguments are 0.

Let:
- X be get_gcd(new_obj_size, nrank * nchan).
- Y be get_gcd(nchan, new_obj_size).

Suppose:
- H1: get_gcd returns the greatest common divisor of its arguments.
- H2: (nrank * nchan) does not exceed UINT_MAX.

We prove (Y != 1) implies (X != 1) with the following steps:
- Suppose L0: (Y != 1). We have to show (X != 1).
- By H1, Y is the greatest common divisor of nchan and new_obj_size.
  In particular, we have L1: Y divides nchan and new_obj_size.
- By H2, we have L2: nchan divides (nrank * nchan)
- By L1 and L2, we have L3: Y divides (nrank * nchan) and
  new_obj_size.
- By H1 and L3, we have L4: (Y <= X).
- By S1 and S2, we have L5: (Y != 0).
- By L0 and L5, we have L6: (Y > 1).
- By L4 and L6, we have (X > 1) and thus (X != 1), which concludes.

R2 was also tested for all values of new_obj_size, nrank, and nchan
between 0 and 2000.
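
An exhaustive check of R2 can be sketched as below. demo_gcd() is a plain
Euclid implementation written for this illustration (it is assumed, per H1
and S2, to behave like get_gcd), and demo_check_r2() is a hypothetical
helper. The loops start nrank and nchan at 1, following S1, and the bound
is assumed small enough that nrank * nchan fits in a 32-bit unsigned int.

```c
#include <assert.h>

/* Euclid's algorithm; returns 0 only when both arguments are 0. */
static unsigned int demo_gcd(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/*
 * Check R2 for all 0 <= size <= limit and 1 <= nrank, nchan <= limit.
 * Returns 1 if R2 holds everywhere, 0 on the first counterexample.
 */
static int demo_check_r2(unsigned int limit)
{
    unsigned int size, nrank, nchan;

    assert(limit <= 2000);  /* keeps nrank * nchan small, as in H2 */
    for (size = 0; size <= limit; size++)
        for (nrank = 1; nrank <= limit; nrank++)
            for (nchan = 1; nchan <= limit; nchan++) {
                unsigned int x = demo_gcd(size, nrank * nchan);
                unsigned int y = demo_gcd(nchan, size);

                if (y != 1 && x == 1)   /* R2 violated */
                    return 0;
            }
    return 1;
}
```

A limit of a few dozen runs instantly; the full range up to 2000 is cubic
in the limit and takes correspondingly longer.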

This redundant condition was found using TrustInSoft Analyzer.

Signed-off-by: Julien Cretin <julien.cretin@trust-in-soft.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-14 11:21:44 +02:00
Didier Pallard
937cca79c9 mem: change default per socket memory allocation
Currently, if there is more memory in hugepages than the amount
requested by the dpdk application, the memory is allocated by taking as
much memory as possible from each socket, starting from the first one.
For example, if a system is configured with 8 GB in 2 sockets (4 GB per
socket), and dpdk requests only 4 GB of memory, all memory will be
taken from socket 0 (which has exactly 4 GB of free hugepages) even if
some cores are configured on socket 1 and there are free hugepages on
socket 1.

Change this behaviour to allocate memory on all sockets where some cores
are configured, spreading the memory amongst sockets using the following
ratio per socket:
    (number of cores configured on the socket / total number of configured cores)
    * requested memory
If this new algorithm fails, it falls back to the previous behaviour.
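
The ratio can be sketched as a small helper. demo_split_memory() is a
hypothetical function written for this illustration; it applies the ratio
with integer arithmetic and makes no attempt to redistribute rounding
remainders or to check hugepage availability, which the real code would
have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Split requested_mb across sockets in proportion to the number of cores
 * configured on each socket. Returns 0 on success, or -1 when no core is
 * configured at all.
 */
static int demo_split_memory(uint64_t requested_mb,
                             const unsigned int *cores_per_socket,
                             size_t nb_sockets,
                             uint64_t *mem_per_socket_mb)
{
    uint64_t total_cores = 0;
    size_t s;

    assert(cores_per_socket != NULL && mem_per_socket_mb != NULL);
    for (s = 0; s < nb_sockets; s++)
        total_cores += cores_per_socket[s];
    if (total_cores == 0)
        return -1;
    for (s = 0; s < nb_sockets; s++)
        mem_per_socket_mb[s] =
            requested_mb * cores_per_socket[s] / total_cores;
    return 0;
}
```

With 6 cores on socket 0, 2 cores on socket 1 and 4096 MB requested, this
gives 3072 MB and 1024 MB respectively.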

This algorithm is used when memory amount is specified globally using
-m option. Per socket memory allocation can always be done using
--socket-mem option.

It is implemented only for Linux, as the BSD part does not appear to be ready for NUMA.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Venky Venkatesan <venky.venkatesan@intel.com>
2014-05-14 11:06:49 +02:00
Olivier Matz
1d64e46eb8 ring: allow to initialize without memzone
Allow initializing a ring in already allocated memory. The rte_ring_create()
function, which allocates a ring in a rte_memzone, is still available and now
uses the new rte_ring_init() function in order to factorize the code.
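
The split between "initialize in caller-provided memory" and "allocate,
then initialize" can be sketched as follows. struct demo_ring and the
demo_ring_*() functions are hypothetical and much simpler than rte_ring;
malloc() stands in for the memzone allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical ring header followed by its object table. */
struct demo_ring {
    uint32_t size;      /* number of slots, a power of two */
    uint32_t mask;      /* size - 1, used to wrap indexes */
    uint32_t head;
    uint32_t tail;
    void *objs[];       /* object table follows the header */
};

/* Bytes needed for the header plus count object slots. */
static size_t demo_ring_memsize(uint32_t count)
{
    return sizeof(struct demo_ring) + (size_t)count * sizeof(void *);
}

/* Initialize a ring in memory the caller already owns. */
static int demo_ring_init(struct demo_ring *r, uint32_t count)
{
    assert(r != NULL);
    if (count == 0 || (count & (count - 1)) != 0)
        return -1;      /* count must be a power of two */
    memset(r, 0, demo_ring_memsize(count));
    r->size = count;
    r->mask = count - 1;
    return 0;
}

/* Allocate the memory, then reuse the init function. */
static struct demo_ring *demo_ring_create(uint32_t count)
{
    struct demo_ring *r = malloc(demo_ring_memsize(count));

    if (r == NULL)
        return NULL;
    if (demo_ring_init(r, count) != 0) {
        free(r);
        return NULL;
    }
    return r;
}
```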

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-05-13 14:40:09 +02:00
Olivier Matz
a182620042 ring: get size in memory
Add a function that returns the amount of memory occupied by a rte_ring
structure and its object table. This commit prepares the next one that
will allow allocating a ring dynamically.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-05-13 14:35:06 +02:00
David Marchand
5d8751b83b pci: remove deprecated RTE_EAL_UNBIND_PORTS option
RTE_EAL_UNBIND_PORTS was deprecated in DPDK 1.4.0 and removed in 1.6.0, but the
code was not removed.

The bind/unbind operations should not be handled by the eal.
These operations should be either done outside of dpdk or inside the PMDs
themselves as these are their problems.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:48 +02:00
David Marchand
ef6352833a pci: move RTE_PCI_DRV_FORCE_UNBIND handling out of #ifdef
Move RTE_PCI_DRV_FORCE_UNBIND flag handling out of RTE_EAL_UNBIND_PORTS section.
This had nothing to do with RTE_EAL_UNBIND_PORTS anyway.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:29 +02:00
David Marchand
d8deab8a88 pci: pci_switch_module cleanup
The pci_switch_module() function should only do what its name tells: unbind pci
devices and rebind them on the specified kernel driver.
Hence, it can not call pci_uio_map_resource().

Call to pci_uio_map_resource() should be moved to rte_eal_pci_probe_one_driver()
so that we can factorize code.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:22 +02:00
David Marchand
d3e6faf840 pci: rework interrupt fd init and fix fd leak
An fd leak happens in pci_map_resource when multiple bars are mapped.
Fix this by closing the fd unconditionally in this function and opening the
intr_handle fd in pci_uio_map_resource instead.
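
The "close unconditionally" part can be sketched as below. This is not the
EAL code: demo_map_and_close() is a hypothetical helper, and it uses
MAP_PRIVATE so that it can be tried safely on an ordinary file such as
/dev/zero. The point shown is only that a mapping stays valid after its
descriptor is closed, so nothing leaks when the function is called once
per BAR.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of path and close the descriptor on every path,
 * success or failure. Returns the mapping, or NULL on error.
 */
static void *demo_map_and_close(const char *path, size_t len)
{
    void *addr;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);      /* unconditional: the mapping does not need the fd */
    return (addr == MAP_FAILED) ? NULL : addr;
}
```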

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:20:47 +02:00
David Marchand
99d44c7e26 pci: remove virtio-uio workaround
virtio-uio does not need eal to map bars from uio device, so remove flag
RTE_PCI_DRV_NEED_IGB_UIO.
Then, move virtio-uio workaround out of generic eal_pci.c for linux
implementation.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:20:35 +02:00
David Marchand
8e5f9df258 pci: align bsd implementation on linux
bsd implementation lacks check on driver flags, fix this.
Besides, check on BAR0 is not needed and could cause trouble for devices that
have no BAR0.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:20:29 +02:00
David Marchand
8990aac31d pci: fix potential mem leaks
Looking at bsd implementation, we can see that there are some potential mem
leaks in linux implementation. Fix them.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:18:17 +02:00
Burakov, Anatoly
0c3977dd15 mem: take reserved hugepages into account
Some applications reserve hugepages for later use,
but DPDK doesn't take reserved pages into account
when calculating the number of available hugepages.

This patch adds reading from "resv_hugepages" file
in addition to "free_hugepages".
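
One way to combine the two counters is sketched below, assuming that the
usable count is the free count minus the reserved count, clamped at zero.
That formula and the demo_* names are assumptions made for illustration;
the commit text itself only says that reserved pages are taken into
account.

```c
#include <assert.h>
#include <stdio.h>

/* Parse one unsigned count, as found in a sysfs hugepages file. */
static int demo_read_count(FILE *f, unsigned long *out)
{
    assert(f != NULL && out != NULL);
    return (fscanf(f, "%lu", out) == 1) ? 0 : -1;
}

/* Free pages minus reserved pages, never going below zero. */
static unsigned long demo_usable_hugepages(unsigned long free_pages,
                                           unsigned long resv_pages)
{
    return (free_pages > resv_pages) ? free_pages - resv_pages : 0;
}
```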

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-13 10:11:18 +02:00
Thomas Monjalon
536ba2d8a8 version: 1.7.0-rc0
Start development cycle for version 1.7.0.

This new development workflow introduces a new versioning scheme.
Instead of having releases r0, r1, r2, etc, there will be release
candidates. The last number has a special meaning:
< 16 numbers are reserved for release candidates (RTE_VER_SUFFIX is -rc)
16 is reserved for the release (RTE_VER_SUFFIX must be unset)
> 16 numbers can be used locally (RTE_VER_SUFFIX must be set)
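
The three ranges can be expressed as a small classifier. The enum and
function names are hypothetical; only the thresholds come from the text
above.

```c
#include <assert.h>

enum demo_ver_kind {
    DEMO_VER_RC,        /* last number < 16: release candidate */
    DEMO_VER_RELEASE,   /* last number == 16: the release itself */
    DEMO_VER_LOCAL      /* last number > 16: local use */
};

/* Classify the last number of the version. */
static enum demo_ver_kind demo_classify_last_number(unsigned int last)
{
    if (last < 16)
        return DEMO_VER_RC;
    if (last == 16)
        return DEMO_VER_RELEASE;
    return DEMO_VER_LOCAL;
}
```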

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-13 10:11:17 +02:00
Wang Sheng-Hui
b20539d687 eal: print maximum and detected lcores
Print the maximum number of lcores as configured, and the number of lcores
detected on eal cpu init, as debug info, in addition to the per-lcore
detected/not-detected info.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
[Thomas: add BSD part]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-05 18:04:05 +02:00
Didier Pallard
2ef79b212a eal: remove useless output of undetected lcores
Increasing maximum number of lcores gives a huge place to undetected
lcores in output traces. Moreover, this output does not give any
interesting information, since list of undetected lcores can be deduced
from list of detected ones.
So remove output related to undetected cores.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-05 18:03:09 +02:00
David Marchand
69020660c3 eal: remove unused config fields
There is no need for a 'magic' field in struct rte_config, as this part of the
structure is local to each process. All threads of a process are synchronised
because of the run_once atomic.
So remove this field, as it is only adding confusion when reading code that
references 'magic' field from struct rte_mem_config.

Besides, there is no reference about the 'version' field, so remove it as well.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-05 18:02:56 +02:00
Thomas Monjalon
fa97553a69 version: 1.6.0r2
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-01 22:58:21 +02:00
Thomas Monjalon
ef4fb79bee eal: fix usage description for bsd
A line was forgotten when removing blacklist option in commit
"use devargs for vdev and PCI lists with bsd" (cd25fb0863).

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-30 23:20:01 +02:00
Maxime Leroy
9ad0e24942 eal: fix vdev allocation on non-0 numa socket
vdev ethdev can not be allocated on a numa socket that is not socket 0.
The reason comes from rte_eth_dev_allocate() which uses rte_socket_id() to
identify the socket on which vdev driver data should be allocated.
However, at this initialization step, rte_socket_id() always returns 0.

Looking at rte_socket_id(), it needs rte_lcore_id() which uses the per-core
global _lcore_id variable. This variable is initialised by
eal_thread_init_master.

So eal_thread_init_master should be called before rte_eal_vdev_init().

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-30 22:59:17 +02:00
David Marchand
c87215d045 malloc: simplify heap initialisation
There should be no real need for this initialised field as the whole structure
is set to 0 in rte_config_init() by primary process, and secondary processes
wait for this to happen before anything else (looking at mem_config magic).

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 11:31:22 +02:00
David Marchand
eeeebe4cd9 malloc: fix race condition on numa_socket field
We don't really need this field as it is only used when creating the memzone
object associated to this heap.
Removing the numa_socket field makes things simpler and removes the race condition.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 11:29:00 +02:00
David Marchand
8c910d01ae kni: fix build with debian kernel 3.2.57-2
Following debian kernel headers upgrade to 3.2.57, pci capability accessors
have been backported (upstream commit 8c0d3a02c1309eb6112d2e7c8172e8ceb26ecfca,
("PCI: Add accessors for PCI Express Capability", v3.7-rc1)).

It results in the same compilation error as redhat 6.x.
However, there is no clear way to determine we are building on a debian kernel.
So, rather than determine if we are building on a distribution kernel, look at
PCI_EXP_LNKSTA2 that appeared in this upstream commit.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:31:30 +02:00
Olivier Matz
22139b107b nic_uio: fix build with freebsd 10
Compiling the DPDK under FreeBSD gives the following error due to a
missing include <sys/rwlock.h>.

In file included from nic_uio.c:52:
@/vm/vm_pager.h:126:2: error: implicit declaration of function 'rw_assert' is invalid in C99
      [-Werror,-Wimplicit-function-declaration]
        VM_OBJECT_ASSERT_WLOCKED(object);
        ^
@/vm/vm_object.h:226:2: note: expanded from macro 'VM_OBJECT_ASSERT_WLOCKED'
        rw_assert(&(object)->lock, RA_WLOCKED)
        ^
In file included from nic_uio.c:52:
@/vm/vm_pager.h:126:2: error: use of undeclared identifier 'RA_WLOCKED'
@/vm/vm_object.h:226:29: note: expanded from macro 'VM_OBJECT_ASSERT_WLOCKED'
        rw_assert(&(object)->lock, RA_WLOCKED)
                                   ^
In file included from nic_uio.c:52:
@/vm/vm_pager.h:143:2: error: use of undeclared identifier 'RA_WLOCKED'
        VM_OBJECT_ASSERT_WLOCKED(object);
        ^
@/vm/vm_object.h:226:29: note: expanded from macro 'VM_OBJECT_ASSERT_WLOCKED'
        rw_assert(&(object)->lock, RA_WLOCKED)
                                   ^
In file included from nic_uio.c:52:
@/vm/vm_pager.h:167:2: error: use of undeclared identifier 'RA_WLOCKED'
        VM_OBJECT_ASSERT_WLOCKED(object);
        ^
@/vm/vm_object.h:226:29: note: expanded from macro 'VM_OBJECT_ASSERT_WLOCKED'
        rw_assert(&(object)->lock, RA_WLOCKED)
                                   ^
In file included from nic_uio.c:52:
@/vm/vm_pager.h:190:2: error: use of undeclared identifier 'RA_WLOCKED'
        VM_OBJECT_ASSERT_WLOCKED(m->object);
        ^
@/vm/vm_object.h:226:29: note: expanded from macro 'VM_OBJECT_ASSERT_WLOCKED'
        rw_assert(&(object)->lock, RA_WLOCKED)
                                   ^

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:31:30 +02:00
Olivier Matz
dd9a36772a devargs: allow to provide arguments per pci device for bsd
The bsdapp part was missing in commit 8e245de6ca.

Add the ability to pass some specific initialization arguments to PCI
devices at start-up.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:31:30 +02:00
Olivier Matz
4bf3fe634a devargs: replace --use-device option by --pci-whitelist and --vdev for bsd
The bsdapp part was missing in commit cac6d08c8b.

This commit splits the "--use-device" option into two new options:

- "--pci-whitelist or -w": add a PCI device in the white list
- "--vdev": instantiate a new virtual device

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:31:19 +02:00
Olivier Matz
c431df41bc devargs: use a comma to separate key/values for bsd
The bsdapp part was missing in commit a8b97e3a1d.

This commit changes the API of --use-device command line argument.
It changes the separators from ';' to ','.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:31:08 +02:00
Olivier Matz
cd25fb0863 devargs: use devargs for vdev and PCI lists with bsd
The bsdapp part was missing in commit 1220458951.

This patch removes old whitelist code and uses the newly introduced
rte_devargs to get the PCI white list, the PCI black list and the list
of virtual devices.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:30:54 +02:00
Olivier Matz
3944edad8b devargs: build common functions for bsd
The bsd part was missing in commit bf6dea0e04.

This commit introduces a new API for storing device arguments given by
the user. It only adds the framework and the test.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:28:19 +02:00
Olivier Matz
f070eb9a21 pci: rename device and driver lists for bsd
The bsdapp part was missing in commit 5b1f4a67dd.

To avoid confusion with virtual devices, rename device_list as
pci_device_list and driver_list as pci_driver_list.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:28:19 +02:00
Olivier Matz
9a9f7f8d3a mem: get dummy physical address in case of --no-huge with bsd
The bsdapp part was missing in commit 57c24af85d.

This commit adds a dummy rte_mem_virt2phy() to fix the compilation of
DPDK under BSD. This function is only used when the debug option
"--no-huge" is given, to get the physical address of mempools in memory.

As a result, it seems acceptable for now to implement a dummy function
to fix the compilation as the usual case (using contigmem module) works
properly.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:28:19 +02:00
Olivier Matz
eca9a994bc mem: get hugepages config for bsd
The bsdapp part was missing in c5e9eeca5a.

This commit allows external libraries and applications to know if
hugepages are enabled.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-30 01:28:18 +02:00
Pascal Mazon
894fd42e7f eal: do not try to load library from current directory
When loading a library "libfoo.so" (depending on "libbar.so", located in an
entirely different folder) with LD_LIBRARY_PATH="/path/to/libfoo.so", it
returns an error:

 EAL: ./libfoo.so: cannot open shared object file: No such file or directory

If the first dlopen() fails (here, because it can't find all dependencies),
the code attempts a second dlopen() that looks for "./libfoo.so". Because
that name contains a slash, it is treated as a pathname and LD_LIBRARY_PATH
is not searched. As a result, it fails because it cannot find "./libfoo.so".

The error message matches the error of the second dlopen(), not the first's.

Do not try to look for a different library ("./"-prefixed) than the one
provided in argument. Let the dynamic library management handle it, just
provide an appropriate LD_LIBRARY_PATH.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-18 00:38:38 +02:00
David Marchand
4f04db8b89 eal: check coremask against detected lcores
lcores that are set in coremask should be checked against lcores detected on
system. This way, we won't need to check them later.

Besides, if specifying an unavailable lcore, we currently panic in
eal_thread_loop() because pthread_setaffinity_np fails.
So this check will return an error with a more explicit message in
eal_parse_coremask().

"EAL: pthread_setaffinity_np failed
 PANIC in eal_thread_loop():
 cannot set affinity"

becomes:

"EAL: lcore 4 unavailable
 EAL: invalid coremask"
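
The check itself can be sketched with two bitmaps. This is an illustration
rather than the eal_parse_coremask() code: demo_first_unavailable_lcore()
is a hypothetical helper, and it assumes at most 64 lcores so that both
the coremask and the set of detected lcores fit in a uint64_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit i of each argument stands for lcore i.
 * Returns -1 when every lcore in coremask was detected, otherwise the
 * index of the first requested lcore that is not available.
 */
static int demo_first_unavailable_lcore(uint64_t coremask, uint64_t detected)
{
    uint64_t missing = coremask & ~detected;
    int i;

    for (i = 0; i < 64; i++)
        if (missing & (UINT64_C(1) << i))
            return i;
    return -1;
}
```

With coremask 0x15 (lcores 0, 2 and 4) on a system where only lcores 0 to
3 were detected, the helper returns 4, which corresponds to the
"lcore 4 unavailable" message.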

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-18 00:38:37 +02:00
Neil Horman
5d52944803 eal: fix check of all requested CPU features
Only the last feature was checked since commit 99f2cdf9ca
(eal: fix %rbx corruption and simplify the code)

The return code for rte_cpu_get_flag_enabled is only checked on the termination
of the for loop that it is called inside, but should be checked for every
iteration it makes through the for loop.  This is caused by some silly missing
brackets.  Simply add them in.
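
The shape of the bug can be reproduced in a few lines. The two functions
below are hypothetical and work on a plain array of flags rather than on
rte_cpu_get_flag_enabled(); they only illustrate how a missing pair of
braces moves the check out of the loop.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Buggy shape: without braces, only the assignment is the loop body,
 * so the check runs once and sees only the last feature.
 */
static int demo_all_enabled_buggy(const int *enabled, size_t n)
{
    int ret = 1;
    size_t i;

    assert(enabled != NULL);
    for (i = 0; i < n; i++)
        ret = enabled[i];
    if (ret == 0)       /* outside the loop */
        return 0;
    return 1;
}

/* Fixed shape: the check is inside the braces and runs every iteration. */
static int demo_all_enabled_fixed(const int *enabled, size_t n)
{
    size_t i;

    assert(enabled != NULL);
    for (i = 0; i < n; i++) {
        if (enabled[i] == 0)
            return 0;
    }
    return 1;
}
```

For the flags {0, 1}, the buggy version reports success because only the
last entry is examined, while the fixed version reports the missing
feature.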

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Pablo De Lara Guarch  <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-18 00:20:04 +02:00
Jean-Mickael Guerin
5578ace03c kni: more compatibility with RHEL 6.4/6.5
For RH 6.5:
- always include mdio.h to get the definitions of MDIO_EEE, ETHTOOL_GEEE
- is_link_local_ether_addr(), pcie_capability_clear_and_set_word(),  and
  ether_addr_equal() have been backported

For RH 6.4:
- same issue with ether_addr_equal()
- here ETH_GEE is defined without having the functions.

igb_ethtool.c:2441: error: implicit declaration of function ‘mmd_eee_adv_to_ethtool_adv_t’

Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-18 00:20:04 +02:00
Jean-Mickael Guerin
22367416b0 kni: disable FDB operations on RHEL 6.5
On RH 6.5:
igb_main.c:2298: error: unknown field ‘ndo_fdb_add’ specified in
initializer

FDB ops are present in RH 6.5 via the extension of netdev, so add the
ifdef inside the netdev ops definition of igb.

However, FDB functions are not set for RHEL 6.5: the implementation
relies on dev_mc_add_excl API which has not been backported.

Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-18 00:20:04 +02:00
Aaro Koskinen
74d62e73df kni: fix build with kernel 3.15
rxhash has been renamed to hash. In 3.14 and newer, we can use
skb_set_hash().

Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-17 18:00:32 +02:00
Stephen Hemminger
524f073a10 ivshmem: fix errors identified by hardening
Need to pass mode argument to open with O_CREAT.
Must check return value from ftruncate().
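
Both points can be sketched in one helper. demo_create_sized_file() is a
hypothetical function, and the 0600 mode is an assumption chosen for the
example, not necessarily the mode used in the ivshmem code.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) path with explicit permissions and set its size.
 * Returns an open descriptor, or -1 on error.
 */
static int demo_create_sized_file(const char *path, off_t size)
{
    int fd;

    assert(path != NULL);
    /* With O_CREAT, open() needs the third (mode) argument. */
    fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, size) < 0) {  /* do not ignore the result */
        close(fd);
        return -1;
    }
    return fd;
}
```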

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-17 15:48:44 +02:00
Olivier Matz
396b69e56a vdev: allow external registration of virtual device drivers
The registration of an external vdev driver (a .so library) is done in a
function that has the ((constructor)) attribute. This function is called
when dlopen(driver.so) is invoked.

As a result, we need to do the dlopen() before calling
rte_eal_vdev_init() that calls the initialization functions of all
registered drivers.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-11 16:17:57 +02:00
Olivier Matz
4c39baf297 vdev: new registration API
Instead of having a list of virtual device drivers in EAL code, add an
API to register drivers. Thanks to this new registration method, we can
remove the references to pmd_ring, pmd_pcap and pmd_xenvirt in EAL code.
This also enables the ability to register a virtual device driver as
a shared library.

The registration is done in an init function flagged with
__attribute__((constructor)). The new convention is to name this
function rte_pmd_xyz_init(). The per-device init function is renamed
rte_pmd_xyz_devinit().
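
The registration mechanism can be sketched as follows, assuming GCC or
Clang for the constructor attribute. The structure, the registry and every
demo_* name are hypothetical and only mirror the naming convention above;
they are not the DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical driver descriptor and registry. */
struct demo_vdev_driver {
    const char *name;
    int (*devinit)(const char *args);
    struct demo_vdev_driver *next;
};

static struct demo_vdev_driver *demo_driver_list;

/* Add a driver at the head of the registry. */
static void demo_driver_register(struct demo_vdev_driver *drv)
{
    assert(drv != NULL);
    drv->next = demo_driver_list;
    demo_driver_list = drv;
}

/* Look a driver up by name; NULL when it was never registered. */
static struct demo_vdev_driver *demo_driver_find(const char *name)
{
    struct demo_vdev_driver *d;

    for (d = demo_driver_list; d != NULL; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}

/* Per-device init, in the style of rte_pmd_xyz_devinit(). */
static int demo_pmd_xyz_devinit(const char *args)
{
    (void)args;
    return 0;
}

static struct demo_vdev_driver demo_pmd_xyz_drv = {
    .name = "eth_xyz",
    .devinit = demo_pmd_xyz_devinit,
};

/* Runs before main(), or when the shared object is loaded by dlopen(). */
__attribute__((constructor))
static void demo_pmd_xyz_init(void)
{
    demo_driver_register(&demo_pmd_xyz_drv);
}
```

Because the constructor runs at load time, the core code never has to name
the driver explicitly; it only walks the registry.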

By the way the internal PMDs are now also .so/standalone ready. Let's do
it later on. It will be required to ease maintenance.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-11 16:17:42 +02:00
Olivier Matz
9fa5e2b026 vdev: rename nonpci_devs as vdev
The name "nonpci_devs" for virtual devices is ambiguous as a physical
device can also be non-PCI (ex: usb, sata, ...). A better name for this
file is "vdev" as it only deals with virtual devices.

This patch doesn't introduce any change except renaming.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-11 14:05:08 +02:00
Olivier Matz
8e245de6ca devargs: allow to provide arguments per pci device
Some PCI drivers may require some specific initialization arguments at
start-up.

Even if unused today, adding this feature seems coherent with virtual
devices in order to provide a full-featured rte_devargs framework. In
the future, it could be added in pmd_ixgbe or pmd_igb for instance to
enable debug of drivers or setting a specific operating mode at
start-up.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:34 +02:00
Olivier Matz
cac6d08c8b devargs: replace --use-device option by --pci-whitelist and --vdev
This commit splits the "--use-device" option into two new options:

- "--pci-whitelist or -w": add a PCI device in the white list
- "--vdev": instantiate a new virtual device

Before the patch, the same option "--use-device" was used for these 2
use-cases.

By the way, we also add "--pci-blacklist" in addition to the existing
"-b" for coherency with the whitelist parameter.

Test result:

echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
./app/test -c 0x15 -n 3 -m 64
RTE>>eal_flags_autotest
[...]
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:34 +02:00
Olivier Matz
a8b97e3a1d devargs: use a comma instead of semicolon to separate key/values
This commit changes the API of --use-device command line argument.
It changes the separators from ';' to ','. Indeed, ';' is not the best
choice as this character is also used to separate shell commands,
forcing the user to surround arguments with quotes.

This commit impacts both devargs and kvargs as each of them define
a separator in --use-device argument:

- devargs defines the separator between the device name or pci_id and
   its arguments
- kvargs defines the separator between each key/value pairs in
   arguments for drivers using the kvargs API to parse their arguments

The modification of devargs and kvargs is done in one commit to keep
the coherency of --use-device.
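
The devargs-level separator can be sketched as a split at the first comma.
demo_split_devargs() is a hypothetical helper, not the rte_devargs parser;
the remaining "key=value" pairs would then be handed to the driver, which
splits them on commas again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "name,key=value,..." at the first comma.
 * Copies the device name into name[] and returns a pointer to the
 * argument string (an empty string when there are no arguments),
 * or NULL when the name does not fit in name[].
 */
static const char *demo_split_devargs(const char *s, char *name,
                                      size_t name_len)
{
    const char *comma;
    size_t n;

    assert(s != NULL && name != NULL);
    comma = strchr(s, ',');
    n = (comma != NULL) ? (size_t)(comma - s) : strlen(s);
    if (n >= name_len)
        return NULL;
    memcpy(name, s, n);
    name[n] = '\0';
    return (comma != NULL) ? comma + 1 : s + n;
}
```

With a comma as separator, an argument such as eth_pcap0,iface=ixgbe0 can
be typed on a shell command line without quoting.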

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:11 +02:00
Olivier Matz
1220458951 devargs: use devargs for vdev and PCI whitelist/blacklist
Remove old whitelist code:
- remove references to rte_pmd_ring, rte_pmd_pcap and pmd_xenvirt in
  is_valid_wl_entry() as we want to be able to register external virtual
  drivers as a shared library. Moreover this code was duplicated with
  dev_types[] from eal_common_pci.c
- eal_common_whitelist.c was badly named: it was able to process PCI
  devices white list and the registration of virtual devices
- the parsing code was complex: all arguments were prepended in
  one string dev_list_str[4096], then split again

Use the newly introduced rte_devargs to get:
- the PCI white list
- the PCI black list
- the list of virtual devices

Rework the tests:
- a part of the whitelist test can be removed as it is now tested
  in app/test/test_devargs.c
- the other parts are just reworked to adapt them to the new API

This commit induces a small API modification: it is no longer possible to
specify several devices per "--use-device" option. This notation was anyway
a bit cryptic. Ex:
  --use-device="eth_ring0,eth_pcap0;iface=ixgbe0"
  now becomes:
  --use-device="eth_ring0" --use-device="eth_pcap0;iface=ixgbe0"

On the other hand, it is now possible to work in PCI blacklist mode and
instantiate virtual drivers, which was not possible before this patch.

Test result:

./app/test -c 0x15 -n 3 -m 64
RTE>>devargs_autotest
EAL: invalid PCI identifier <08:1>
EAL: invalid PCI identifier <00.1>
EAL: invalid PCI identifier <foo>
EAL: invalid PCI identifier <>
EAL: invalid PCI identifier <000f:0:0>
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:59:34 +02:00
Olivier Matz
bf6dea0e04 devargs: introduce API and test
This commit introduces a new API for storing device arguments given by
the user. It only adds the framework and the test. The modification of
EAL to use this new module is done in next commit.

The final goals:

- unify pci-blacklist, pci-whitelist, and virtual devices arguments
  in one file
- allow to register a virtual device driver from a dpdk extension
  provided as a shared library. For that we will require to remove
  references to rte_pmd_ring and rte_pmd_pcap in argument parsing code
- clarify the API of eal_common_whitelist.c, and rework its code that is
  often complex for no reason.
- support arguments for PCI devices and possibly future non-PCI devices
  (other than virtual devices) without effort.

Test result:

echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
./app/test -c 0x15 -n 3 -m 64
RTE>>eal_flags_autotest
[...]
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:58:34 +02:00
Olivier Matz
5b1f4a67dd pci: rename device and driver lists
To avoid confusion with virtual devices, rename device_list as
pci_device_list and driver_list as pci_driver_list.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:58:31 +02:00
Didier Pallard
f283b30509 ixgbe: release software locked semaphores on initialization
It may happen that a DPDK application gets killed while holding
locks on the ethernet hardware, so that these locks are
never released. On the next restart of the application, DPDK
skips those ports because it cannot acquire the lock.
This may cause some ports (or even the complete board if SMBI is locked)
to be inaccessible from the DPDK application until the hardware is
rebooted.

This patch releases locks that are presumed to be still held because of
an improper exit of the application.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-04-09 18:30:02 +02:00
Didier Pallard
4c9d8ed203 igb: release software locked semaphores on initialization
It may happen that a DPDK application gets killed while holding
locks on the ethernet hardware, so that these locks are
never released. On the next restart of the application, DPDK
skips those ports because it cannot acquire the lock.
This may cause some ports (or even the complete board if SMBI is locked)
to be inaccessible from the DPDK application until the hardware is
rebooted.

This patch releases locks that are presumed to be still held because of
an improper exit of the application.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-04-09 18:30:02 +02:00
Bruce Richardson
d73d8f3ad4 timer: fix TSC frequency by not reading /proc/cpuinfo
This reverts commit da6fd0759c.
	"timer: get TSC frequency from /proc/cpuinfo"

The use of cpuinfo to determine the frequency of the TSC is not
advisable and leads to incorrect results when power management is
in use. This is because, while the TSC frequency does not change
in modern CPUs with constant_tsc support, the frequency of the core,
and hence the frequency of the core reported by cpuinfo *does* change.

Depending on the current frequency of core 0 when an application is
started, the EAL can get a wildly incorrect value for the TSC freq.
Since frequency is scaled down for power saving, any incorrect value
is likely to be lower than the default, which means that any delay
loops inside the code which rely on the TSC will be shorter than
planned. This can cause issues (reported on the mailing list by a number
of people) where ports are not initialized correctly due to delays being
too short.
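
The effect on delay loops can be worked through with a small sketch. The
helper names below are hypothetical and this is not EAL code; it only
reproduces the arithmetic of a delay computed from a frequency that is
lower than the real TSC frequency.

```c
#include <stdint.h>

/* Number of TSC cycles a delay loop waits for, computed from the
 * frequency the EAL *believes* the TSC has (hypothetical helper). */
uint64_t delay_cycles(uint64_t assumed_hz, uint64_t delay_us)
{
	return assumed_hz / 1000000u * delay_us;
}

/* Real duration, in microseconds, of a loop of 'cycles' TSC cycles,
 * given the true TSC frequency. */
uint64_t actual_delay_us(uint64_t true_hz, uint64_t cycles)
{
	return cycles * 1000000u / true_hz;
}
```

With a TSC running at 2.4 GHz and cpuinfo reporting a core scaled down to
1.2 GHz, a requested delay of 1000 us is converted to 1,200,000 cycles,
which really last only 500 us.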

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-09 14:21:36 +02:00
Neil Horman
99f2cdf9ca eal: fix %rbx corruption and simplify the code
Neil Horman reported that on x86-64 the upper half of %rbx would get
clobbered when the code was compiled PIC or PIE, because the
i386-specific code to preserve %ebx was incorrectly compiled.

However, the code is really way more complex than it needs to be.  For
one thing, the CPUID instruction only needs %eax (leaf) and %ecx
(subleaf) as parameters, and since we are testing for bits, we might
as well list the bits explicitly.  Furthermore, we can use an array
rather than doing a switch statement inside a structure.

Reported-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: H. Peter Anvin <hpa@linux.intel.com>
2014-04-02 14:38:40 +02:00
Mauro Annarumma
ce5c43f6b9 ixgbe: support flow director for X540
Flow director in X540 uses the same registers as in 82599.
So it just has to be enabled in the 82599 implementation.

Signed-off-by: Mauro Annarumma <mauroannarumma@hotmail.it>
Acked-by: Maxime Leroy <maxime.leroy@6wind.com>
2014-03-26 11:03:56 +01:00
Stephen Hemminger
4cf4c837db mempool: use GCC push/pop_options
The include file should not change the GCC compile options for
the whole file being compiled, but only for the one inline function
that needs it. Using the push_options/pop_options fixes this.
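
A minimal sketch of the pattern, assuming a GCC-compatible compiler. The
option ("O3") and the function sum_u32() are stand-ins chosen for
illustration; the commit message does not say which option the mempool
header changes.

```c
#include <stdint.h>

/* Save the current options, then change them for one function only. */
#pragma GCC push_options
#pragma GCC optimize ("O3")

/* Hypothetical stand-in for the one function that needs the option. */
uint32_t sum_u32(const uint32_t *v, unsigned n)
{
	uint32_t total = 0;
	unsigned i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}

/* Restore the options of the file that includes this header, so the
 * rest of that file is compiled as its author intended. */
#pragma GCC pop_options
```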

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-24 18:58:25 +01:00
Stephen Hemminger
2d32fef70b hash: make arg for jhash2 const
The argument to rte_jhash2() is not changed.

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-24 18:58:25 +01:00
Stephen Hemminger
156705307c mbuf: copy offload flags when doing attach/clone
rte_pktmbuf_attach copies the packet meta data but does not
copy the offload flags. This means that cloned packets lose
their offload settings such as vlan tag.

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-24 18:58:25 +01:00
Thomas Monjalon
1daf0aae7f vmxnet3: rename library
In order to distinguish clearly this implementation from the extension
vmxnet3-usermap, it is renamed to reflect its usage of uio framework.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Thomas Graf <tgraf@redhat.com>
2014-03-21 15:40:30 +01:00
Daniel Kan
18f02ff759 pci: fix igb_uio mapping for virtio_uio and vmxnet3_uio
Since commit 10ed994 (pci: use igb_uio mapping only when needed),
the flag RTE_PCI_DRV_NEED_IGB_UIO must be set even if RTE_EAL_UNBIND_PORTS
is disabled.
It was not the case for virtio_uio and vmxnet3_uio so the uio resources were
not mapped when RTE_EAL_UNBIND_PORTS was not defined.
Specifically, pci_uio_map_resource() was not called so
pci_dev->mem_resource was not mapped.

Signed-off-by: Daniel Kan <dan@nyansa.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-21 15:40:30 +01:00
David Marchand
a6bb9c8ced igb_uio: don't bind vmxnet3 and virtio devices if disabled
When not using vmxnet3-uio and virtio-uio PMDs, prevent igb_uio from binding
these devices. This way, vmxnet3 and virtio PMDs won't fail to initialize
because of a device silently bound to igb_uio.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-21 11:25:32 +01:00
Thomas Monjalon
5dbb84c0a8 virtio: rename library
In order to distinguish clearly this implementation from the extension
virtio-net-pmd, it is renamed to reflect its usage of uio framework.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Chris Wright <chrisw@redhat.com>
2014-03-20 17:50:51 +01:00
Stephen Hemminger
e6b87d19d9 get rid of DOS format end of lines
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-20 16:17:57 +01:00
Thomas Monjalon
3097de6e6b mem: get physical address of any pointer
Insert get_physaddr() into public API as rte_mem_virt2phy().

rte_mem_virt2phy() makes it possible to obtain the physical address of any
virtual address mapped to the current process.
get_physaddr() was working only for addresses pointing exactly to
the first byte of a page.
Note that this function is very slow and shouldn't be called
after initialization to avoid a performance bottleneck.

The memory must be locked with mlock(). The function rte_mem_lock_page()
is a mlock() helper that locks the whole page.

A better name would be rte_mem_virt2phys but rte_mem_virt2phy is more
consistent with rte_mempool_virt2phy.
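
A sketch of how such a lookup can work for an address that is not
page-aligned, assuming the Linux /proc/self/pagemap interface (one 64-bit
entry per virtual page, frame number in bits 0-54, "present" in bit 63).
The function names are hypothetical and this is not the DPDK
implementation. Recent kernels report a zero frame number to unprivileged
processes.

```c
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define PAGEMAP_PRESENT  (1ULL << 63)        /* page is in RAM */
#define PAGEMAP_PFN_MASK ((1ULL << 55) - 1)  /* bits 0-54: frame number */

/* Combine a pagemap entry with the offset of 'virt' inside its page.
 * Returns 0 when the page is not present. */
uint64_t entry_to_phys(uint64_t entry, uint64_t virt, uint64_t page_size)
{
	if (!(entry & PAGEMAP_PRESENT))
		return 0;
	return (entry & PAGEMAP_PFN_MASK) * page_size + (virt % page_size);
}

/* Look up 'virt' in /proc/self/pagemap. Returns 0 on any failure. */
uint64_t virt_to_phys(const void *virt)
{
	uint64_t page_size = (uint64_t)sysconf(_SC_PAGESIZE);
	uint64_t va = (uint64_t)(uintptr_t)virt;
	uint64_t entry = 0;
	FILE *f = fopen("/proc/self/pagemap", "rb");

	if (f == NULL)
		return 0;
	if (fseek(f, (long)(va / page_size * sizeof(entry)), SEEK_SET) != 0 ||
	    fread(&entry, sizeof(entry), 1, f) != 1)
		entry = 0;
	fclose(f);
	return entry_to_phys(entry, va, page_size);
}
```

Keeping the offset inside the page is what lets the lookup work for any
pointer rather than only for the first byte of a page.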

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-03-20 15:35:08 +01:00
Thomas Monjalon
53a9ca3c57 mem: revert "get physical address of any pointer"
This reverts commit 57c24af85d
which was wrongly rebased in 1.6.0 branch:
- commit log must be changed for 1.6.0
- it breaks building for 32-bit
A new version of this commit has to be done.
2014-03-20 15:35:08 +01:00
David Marchand
4b28dda3dc mem: fix build of virtual address hinting for 32-bit
The initial commit doesn't build for 32-bit:
8ea9ff83 (mem: allow virtual memory address hinting)

lib/librte_eal/linuxapp/eal/eal.c: In function ‘eal_parse_base_virtaddr’:
build/include/rte_common.h:133:22:
error: cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]
  RTE_PTR_ALIGN_FLOOR((typeof(ptr))RTE_PTR_ADD(ptr, (align) - 1), align)
                      ^

RTE_PTR_ALIGN_CEIL return type is the same as what we give it as input.
So instead of casting the returned value, cast 'addr' which should be the same
as base_virtaddr.
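
The idea can be sketched without the DPDK macros. The helpers below are
hypothetical; they do the rounding on uintptr_t, whose width always
matches the pointer width, so no pointer-to-integer size mismatch can
occur on 32-bit builds.

```c
#include <stdint.h>

/* Round 'v' up to the next multiple of 'align' (a power of two). */
uintptr_t align_ceil(uintptr_t v, uintptr_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Pointer variant: the result has the same type as the input, so the
 * caller casts the input once instead of casting the result. */
void *ptr_align_ceil(void *p, uintptr_t align)
{
	return (void *)align_ceil((uintptr_t)p, align);
}
```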

Reported-by: Mats Liljegren <mats.liljegren@enea.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-20 15:34:46 +01:00
David Marchand
2a315d6985 pcap: revert build patches
This reverts commits
a0cdfcf9 (use pcap-config to guess compilation flags),
ef5b2363 (fix build with empty LIBPCAP_CFLAGS) and
60191b89 (fix build when pcap_sendpacket is unavailable).

These patches create more problems than they solve (the initial problem
was a build error with too old pcap libraries).
Since old pcap libraries are not that common, just revert them.

Reported-by: Meir Tseitlin <mirots@gmail.com>
Reported-by: Mats Liljegren <mats.liljegren@enea.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-19 14:33:52 +01:00
Olivier Matz
266ffe3494 pcap: fix build error introduced by kvargs
Due to a merge conflict between commits 4c745617a1 and 9d5752d80,
rte_eth_pcap.c was not compiling with the following error:

rte_eth_pcap.c: In function 'rte_pmd_init_internals':
rte_eth_pcap.c:559:30: error: dereferencing pointer to incomplete type
rte_eth_pcap.c:560:15: error: dereferencing pointer to incomplete type
rte_eth_pcap.c:561:18: error: dereferencing pointer to incomplete type
rte_eth_pcap.c:603:47: error: dereferencing pointer to incomplete type
rte_eth_pcap.c: In function 'rte_pmd_pcap_init':
rte_eth_pcap.c:732:73: error: 'dict' undeclared (first use in this
  function)
rte_eth_pcap.c:732:73: note: each undeclared identifier is reported
  only once for each function it appears in

This commit replaces "struct args_dict" by "struct rte_kvargs" to fix
the compilation issue.

By the way, it also removes the declaration of these functions from
the header file as no other file in DPDK references any of them. This
avoids including <rte_kvargs.h> in rte_eth_pcap.h.

Reported-by: Meir Tseitlin <mirots@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-19 14:21:17 +01:00
David Marchand
f95b372558 version: 1.6.0r1
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 11:07:29 +01:00
Thomas Monjalon
c528a3b7d5 version: add 4th digit and helper macros
Applications can test versions, for compatibility, this way:
	#if RTE_VERSION >= RTE_VERSION_NUM(1,2,3,4)

RTE_VERSION was already defined for use with rte_config.
It is moved in rte_version.h and updated to current version number.

Note that the first tag having this helper is 1.2.3r2.
Releases r0 do not have this patch.
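
A sketch of how such a helper can be used, assuming an encoding of one
byte per version digit. MY_VERSION_NUM, MY_VERSION and has_new_api() are
hypothetical stand-ins so that the example builds without DPDK headers.

```c
/* Assumed encoding: four digits packed into one integer, most
 * significant digit first, so that versions compare numerically. */
#define MY_VERSION_NUM(a, b, c, d) ((a) << 24 | (b) << 16 | (c) << 8 | (d))

/* Stand-in for the version of the library being compiled against. */
#define MY_VERSION MY_VERSION_NUM(1, 6, 0, 1)

/* Select code at compile time depending on the library version. */
int has_new_api(void)
{
#if MY_VERSION >= MY_VERSION_NUM(1, 6, 0, 0)
	return 1;   /* 1.6.0r0 or later */
#else
	return 0;   /* older release */
#endif
}
```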

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-02-26 11:07:29 +01:00
Aaro Koskinen
645b0d13c9 kni: fix build with kernel 3.14
ether_addr_equal() was added in Linux 3.5. compare_ether_addr() was
deleted in 3.14. Start using ether_addr_equal() and provide our own
implementation for older kernels.

This fixes the compilation with Linux 3.14-rc1.
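
The compatibility pattern looks roughly like the user-space sketch below.
The two version macros are local stand-ins for the kernel's
<linux/version.h> definitions, and the chosen version is an arbitrary
example; the real kernel helper returns bool and is not reproduced here.

```c
#include <stdint.h>
#include <string.h>

/* Stand-ins for the kernel macros so the sketch builds in user space. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#define LINUX_VERSION_CODE KERNEL_VERSION(3, 2, 0)

#if LINUX_VERSION_CODE < KERNEL_VERSION(3, 5, 0)
/* Older kernels lack ether_addr_equal(): provide a local version that
 * compares the two 6-byte MAC addresses. */
int ether_addr_equal(const uint8_t *a, const uint8_t *b)
{
	return memcmp(a, b, 6) == 0;
}
#endif
```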

Signed-off-by: Aaro Koskinen <aaro.koskinen@nsn.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:29 +01:00
Adrien Mazarguil
205c33c45a kni: fix build with kernel < 3.3 with netdev_features_t backport
The netdev_features_t typedef appeared in Linux 3.3, but checking the kernel
version isn't enough with some distributions (such as Debian Wheezy) that
backported it into 3.2, causing a compilation failure due to redefinition.

Since the presence of a typedef can't be tested at compile time, this commit
adds type kni_netdev_features_t, which, depending on the kernel version,
translates either to u32 or netdev_features_t.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
fd52b47781 kni: fix build with 802.1p kernel support
C90 compilers forbid mixed declaration and code.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
27f76bdf27 ixgbe: remove residual fix about resetting big Tx queues
No need to keep residues of a fix which is replaced by another one.
This reverts commit 5a6d9897f9
(residual fix about resetting big Tx queues).

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Mats Liljegren
a02aa29dec pcap: save if_index of the bound device
Use command line parameters to get the name of the interface.
This name is converted into if_index, which is provided as
device info.

Signed-off-by: Mats Liljegren <mats.liljegren@enea.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
David Marchand
60191b8919 pcap: fix build when pcap_sendpacket is unavailable
Before libpcap 1.0.0, pcap_sendpacket was not available on Linux targets (unless
backported).
When using such a library, we won't be able to send packets on the wire, yet we
can still dump packets into a pcap file.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
David Marchand
fa2e9958db pcap: fix build with old libpcap
For backwards compatibility, pcap.h includes pcap/pcap.h.
Hence, to be compatible with older pcap libraries, we must include pcap.h.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
a007988d5d pcap: remove unused constant
RTE_ETH_PCAP_MBUFS is not used anymore since commit 6eb0ae218a
(pcap: fix mbuf allocation).

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Mats Liljegren
a12f21157a ethdev: introduce if_index in device info
This field is intended for pcap to describe the name of the interface
as known to Linux. It is an interface index, but can be translated into
an interface name using if_indextoname() function.

When using pcap, interrupt affinity becomes important, and this field
gives the application a chance to ensure that interrupt affinity is set
to the lcore handling the device.
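
An application could translate the index back to a name as sketched
below. The wrapper ifindex_to_name() is hypothetical; if_indextoname() is
the standard function from <net/if.h>.

```c
#include <net/if.h>
#include <stddef.h>

/* Translate an interface index into its name.
 * 'buf' must hold at least IF_NAMESIZE bytes.
 * Returns 'buf' on success, NULL if the index is unknown. */
const char *ifindex_to_name(unsigned int if_index, char *buf)
{
	if (if_index == 0)   /* 0 is never a valid interface index */
		return NULL;
	return if_indextoname(if_index, buf);
}
```

The resulting name can then be used to find the interrupts of the
interface and set their affinity.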

Signed-off-by: Mats Liljegren <mats.liljegren@enea.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
bc786a4e41 ethdev: fix non-reconfigurable pmd init
Some Poll-Mode Drivers (PMD) are not reconfigurable and,
thus, do not implement (rx|tx)_queue_release functions.
For these drivers, the functions rte_eth_dev_(rx|tx)_queue_config
must return an ENOTSUP error only when reconfiguring,
but not at initial configuration.

Move the FUNC_PTR_OR_ERR_RET check into the case of reconfiguration.
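
The resulting control flow can be sketched as follows. The device model
and function name are hypothetical and heavily simplified; only the
position of the check reflects the description above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down device model. */
struct dev {
	void (*rx_queue_release)(void *q); /* NULL if not reconfigurable */
	unsigned nb_rx_queues;             /* 0 until first configuration */
};

int rx_queue_config(struct dev *d, unsigned nb_queues)
{
	if (d->nb_rx_queues == 0) {
		/* Initial configuration: nothing to release, so a driver
		 * without rx_queue_release is acceptable. */
		d->nb_rx_queues = nb_queues;
		return 0;
	}
	/* Reconfiguration: old queues would have to be released first,
	 * so the callback is mandatory only on this path. */
	if (d->rx_queue_release == NULL)
		return -ENOTSUP;
	d->nb_rx_queues = nb_queues;
	return 0;
}
```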

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 11:07:28 +01:00
Ivan Boule
e659b6b439 ethdev: add pause frame counters for em/igb/ixgbe
Add into the `rte_eth_stats` data structure 4 (64-bit) counters
of XOFF/XON pause frames received and sent on a given port.

Update em, igb, and ixgbe drivers to return the value of the 4 XOFF/XON
counters through the `rte_eth_stats_get` function exported by the DPDK
API.

Display the value of the 4 XOFF/XON counters in the `testpmd` application.

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Ivan Boule
7238e63bce ethdev: add support for device offload capabilities
1) Return the device RX and TX offload capabilities in the
   rte_eth_dev_info data structure filled by the function rte_eth_dev_info_get

   The following initial set of RX offload capabilities are defined:
   - VLAN header stripping
   - IPv4 header checksum check
   - UDP checksum check
   - TCP checksum check
   - TCP large receive offload (LRO)

   The following initial set of TX offload capabilities are defined:
   - VLAN header insertion
   - IPv4 header checksum computation
   - UDP checksum computation
   - TCP checksum computation
   - SCTP checksum computation
   - TCP segmentation offload (Transmit Segmentation Offload)
   - UDP segmentation offload

2) Update the eth_dev_infos_get() function of the igb and ixgbe PMDs
   to return the offload capabilities which are supported by the
   device and that are effectively managed by the driver.
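
An application would typically test such capabilities as bit flags before
enabling an offload. In the sketch below the flag names, their values and
the structure are assumptions made for illustration, not the definitions
added by this commit.

```c
#include <stdint.h>

/* Illustrative capability bits (names and values are assumptions). */
#define RX_OFFLOAD_VLAN_STRIP  0x01u
#define RX_OFFLOAD_IPV4_CKSUM  0x02u
#define RX_OFFLOAD_UDP_CKSUM   0x04u
#define RX_OFFLOAD_TCP_CKSUM   0x08u
#define RX_OFFLOAD_TCP_LRO     0x10u

/* Hypothetical subset of the device info structure. */
struct dev_info {
	uint32_t rx_offload_capa;
	uint32_t tx_offload_capa;
};

/* True only if the device advertises every requested RX offload. */
int rx_offloads_supported(const struct dev_info *info, uint32_t wanted)
{
	return (info->rx_offload_capa & wanted) == wanted;
}
```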

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Olivier Matz
f7f97c1604 pci: add option --create-uio-dev to run without hotplug
When the user specifies --create-uio-dev in the DPDK EAL start options,
DPDK will create the /dev/uioX device itself instead of waiting for a
program (usually hotplug) to do it.

This option is useful in embedded environments where there is no hotplug
to do the work.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Olivier Matz
61410438da pci: split the function providing uio device and mappings
Add a new function pci_get_uio_dev() that parses /sys/bus/pci/devices
to get the uio device associated with a PCI device. This patch just
moves some code that was in pci_uio_map_resource() in the new function
without any functional change.

Thanks to this change, the next commit will be easier to understand.
Moreover it improves readability: having smaller functions helps to
understand what pci_uio_map_resource() does.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
5fe669202a pci: support 82546EB
Intel 82546EB Gigabit ethernet controller is reported to be working
with copper.

Tested-by: Ognjen Joldzic <ognjen.joldzic@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Damien Millescamps
050a84b9af pci: add flag to force unbind device
Some devices need to be unbound in order to be used via the PMD
without kernel module.

Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Thomas Monjalon
10ed99419b pci: use igb_uio mapping only when needed
Since DPDK 1.4, if RTE_EAL_UNBIND_PORTS is disabled, igb_uio mapping is
done for all devices (commit eee16c964c), breaking some non-Intel drivers.
But pci_uio_map_resource() should only be called for Intel devices
(using igb_uio kernel module).
The flag RTE_PCI_DRV_NEED_IGB_UIO is set for all those devices, even when
RTE_EAL_UNBIND_PORTS is disabled (fixes commit a22f5ce8fc).

Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Damien Millescamps <damien.millescamps@6wind.com>
2014-02-26 11:07:28 +01:00
David Marchand
1a40263998 pci: do not check BAR0 mapping
Since DPDK 1.4, BAR mapping is checked, which prevents initializing
drivers which do not use igb_uio mapping (see commit eee16c964c).

There is no need to check BAR mapping, especially since BAR0 is not required.
If BAR mapping failed, then pci_uio_map_resource would fail and we would not
reach this check. So get rid of the BAR0 check.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Damien Millescamps <damien.millescamps@6wind.com>
2014-02-26 11:07:27 +01:00
Damien Millescamps
1896b4ec5e mem: fix mempool for --no-huge
In --no-huge mode, mempool provides objects with their associated
header/trailer fitting in a standard page (usually 4KB).
This means all non-UIO drivers should work correctly in this mode,
since UIO drivers allocate ring sizes that cannot fit in a page.

Extend rte_mempool_virt2phy to obtain the correct physical address when
elements of the pool are not on the same physically contiguous memory region.

The reason for this patch is to be able to run on a kernel < 2.6.37 without
the need to patch it, since all kernels below that version are either buggy
or don't have huge page support at all (< 2.6.28).

Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2014-02-26 11:07:27 +01:00
Damien Millescamps
c5e9eeca5a mem: get hugepages config
Allow external libraries and applications to know if hugepages
are enabled.

Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2014-02-26 11:07:27 +01:00
Adrien Mazarguil
29a2ca7388 mem: get memzone from any CPU socket when hugepages are disabled
When huge pages are disabled, memory is allocated for a single, undefined
CPU socket using malloc(), causing rte_memzone_reserve_aligned() to fail
most of the time.

This patch makes that memory use SOCKET_ID_ANY instead of 0, and allows
it to be used in place of any socket ID specified by the user.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Damien Millescamps <damien.millescamps@6wind.com>
2014-02-26 11:07:27 +01:00
Olivier Matz
926edd634e mem: fix rte_malloc(SOCKET_ID_ANY), try to allocate on other nodes
Before this patch, rte_malloc(SOCKET_ID_ANY) was equivalent to
rte_malloc(this_socket). If the user specifies SOCKET_ID_ANY, it means that
memory can be allocated on any socket. So fix the behavior of rte_malloc() in
order to do that. The current CPU socket is still the default, but if it fails,
other sockets are tested.
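
The fallback order can be sketched with a toy allocator. Everything below
is hypothetical (two sockets, a byte counter per socket); it only shows
the "current socket first, then the others" policy described above.

```c
#include <stddef.h>

#define SOCKET_ID_ANY (-1)
#define NB_SOCKETS 2

/* Toy state: free bytes left on each socket (socket 0 is exhausted). */
static size_t socket_free[NB_SOCKETS] = { 0, 4096 };

static int alloc_on(int socket, size_t size)
{
	if (socket_free[socket] < size)
		return -1;
	socket_free[socket] -= size;
	return socket;
}

/* Returns the socket the memory came from, or -1 on failure. */
int malloc_socket(size_t size, int socket, int current_socket)
{
	int s, ret;

	if (socket != SOCKET_ID_ANY)
		return alloc_on(socket, size);       /* explicit request */
	ret = alloc_on(current_socket, size);        /* default: local */
	for (s = 0; ret < 0 && s < NB_SOCKETS; s++)
		if (s != current_socket)
			ret = alloc_on(s, size);     /* then the others */
	return ret;
}
```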

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:27 +01:00
Olivier Matz
9ac92a2693 mem: remove unneeded log
Remove an error log in memzone_reserve_aligned_thread_unsafe().
It is up to the caller to log the error, and this is already done
in DPDK code (especially in network drivers).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:27 +01:00
Didier Pallard
e29ad45703 mem: get physical address of any rte_malloc buffer
Get physical address of any rte_malloc allocated buffer using
function rte_malloc_virt2phy(addr).
The rte_memzone pointer is now stored in each allocated memory block
header to allow simple computation of physical address of a block
using the memzone it comes from.
The function rte_malloc_virt2phy has a dependency on rte_memory.h:
phys_addr_t must be defined.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:27 +01:00
Thomas Monjalon
2609f70224 mem: more const qualifiers in malloc API
Some functions don't modify their parameter which should be marked as const.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2014-02-26 11:07:27 +01:00
Damien Millescamps
57c24af85d mem: get physical address of any pointer
Extract rte_mem_virt2phy() from get_physaddr().

rte_mem_virt2phy() makes it possible to obtain the physical address of any
virtual address mapped to the current process calling this function.
Note that this function is very slow and shouldn't be called
after initialization to avoid a performance bottleneck.

The memory must be locked with mlock(). The function rte_mem_lock_page()
is a mlock() helper that locks the whole page.

A better name would be rte_mem_virt2phys but rte_mem_virt2phy is more
consistent with rte_mempool_virt2phy.

Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-02-26 11:07:27 +01:00
Didier Pallard
3314648f83 timer: add precise TSC function
According to Intel Developer's Manual:

"The RDTSC instruction is not a serializing instruction. It does not necessarily wait
 until all previous instructions have been executed before reading the counter. Simi-
 larly, subsequent instructions may begin execution before the read operation is
 performed. If software requires RDTSC to be executed only after all previous instruc-
 tions have completed locally, it can either use RDTSCP (if the processor supports that
 instruction) or execute the sequence LFENCE;RDTSC."

So add a rte_rdtsc_precise function that does a memory barrier before rdtsc to
synchronize operations and ensure that the TSC read is done at the expected place.
Use r/w memory barrier instead of lfence to serialize both loads and stores.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Reviewed-by: François-Frédéric Ozog <ff@ozog.com>
Reviewed-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:27 +01:00
Thomas Monjalon
da6fd0759c timer: get TSC frequency from /proc/cpuinfo
TSC frequency was guessed by reading CLOCK_MONOTONIC_RAW or sleeping 1 sec.
Now, read frequency from cpuinfo first.
Keep other methods as fallbacks.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 11:01:14 +01:00
Ivan Boule
fb022b85ba timer: check TSC reliability
Read flags from /proc/cpuinfo and warn if constant_tsc or nonstop_tsc is
not found.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
73a2bc5dba spinlock: fix build with clang
LLVM clang requires an explicitly sized "cmp" assembly instruction.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:01:14 +01:00
H. Peter Anvin
c4eedd9b53 hash: reverse the operand order to crc32
Checkin

a132a9cf2b hash: use intrinsic

changed the rte_hash_crc.h from using the crc32 instruction via inline
assembly to using an intrinsic.  The intrinsic should allow for better
compiler performance, but the change did not account for the fact that
the inline assembly being in AT&T syntax used the opposite operand
order of the intrinsic.

This turns out to not matter for correctness, because the CRC32
operation is commutative.  However, it could potentially matter for
performance, because the loop is more efficient with the moving
pointer in the source operand and the accumulation in the destination
operand.

This was discovered by Jan Beulich when looking at the equivalent code
in the Linux kernel.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Reported-by: Pashupati Kumar <kumarp@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:01:14 +01:00
Thomas Monjalon
d8a2bc71df log: remove app path from syslog id
This reverts commit "log: get full path as syslog id" (494a02537f)
and restores the original patch from Stephen Hemminger (04210699ee).

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2014-02-26 11:01:14 +01:00
Olivier Matz
38a901702f kvargs: make the NULL key to match all entries
In rte_kvargs_process() and rte_kvargs_count(), if the key_match
argument is NULL, process all entries.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
95418a30be kvargs: add the key in handler pameters
This argument can be useful when rte_kvargs_process() is called with
key=NULL, in this case the handler is invoked for all entries of the
kvlist.
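
A sketch of why the handler needs the key when every entry is processed.
The types and functions are a hypothetical miniature, not the rte_kvargs
API; only the "NULL matches all" rule and the key parameter come from the
description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a key/value list. */
struct kv { const char *key; const char *value; };

typedef int (*kv_handler)(const char *key, const char *value, void *opaque);

/* Call 'handler' on each pair whose key equals 'key_match';
 * a NULL 'key_match' selects every pair. */
int kv_process(const struct kv *pairs, size_t n, const char *key_match,
	       kv_handler handler, void *opaque)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (key_match != NULL && strcmp(pairs[i].key, key_match) != 0)
			continue;
		if (handler(pairs[i].key, pairs[i].value, opaque) < 0)
			return -1;
	}
	return 0;
}

/* Example handler: counts the pairs it is invoked for. A real handler
 * would use 'key' to tell the entries apart. */
int count_handler(const char *key, const char *value, void *opaque)
{
	(void)key;
	(void)value;
	(*(int *)opaque)++;
	return 0;
}
```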

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
d8e2337ac4 kvargs: add const attribute in handler parameters
The "value" argument is read-only and should be const.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
d24839b318 kvargs: be strict when matching a key
When we match a key in is_valid_key() and rte_kvargs_process(), do a
strict comparison (strcmp()) instead of using strstr(s1, s2), which tries
to find s2 in s1. This old behavior could lead to unexpected matches, for
instance "cola" matches "chocolate".

Surprisingly, no patch was needed on rte_kvargs_count() as it already
used strcmp().
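
The difference between the two comparisons can be checked with the sketch
below (hypothetical helper names, standard libc calls only).

```c
#include <string.h>

/* Loose match: true when 'key' occurs anywhere inside 'valid'. */
int key_matches_loose(const char *valid, const char *key)
{
	return strstr(valid, key) != NULL;
}

/* Strict match: true only when both strings are identical. */
int key_matches_strict(const char *valid, const char *key)
{
	return strcmp(valid, key) == 0;
}
```

With the valid key "chocolate", the loose version accepts "cola" because
the letters c-o-l-a appear inside it; the strict version rejects it.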

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
ac15c81315 kvargs: simpler parsing and allow duplicated keys
Remove the rte_kvargs_add_pair() function whose only role was to check
if a key is duplicated. Having duplicated keys is now allowed by kvargs
API.

Also replace rte_strsplit() by a more standard function, strtok_r(), that
is easier to understand for people already knowing the libc. It also
avoids useless calls to strnlen(). The delimiters macros become strings
instead of chars due to the strtok_r() API.
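
A sketch of parsing with two nested strtok_r() loops. The delimiter
values, the structure and the function name are assumptions made for
illustration; this is not the kvargs source.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

#define MAX_PAIRS   32
#define PAIRS_DELIM ";"   /* assumed; a string, as strtok_r() expects */
#define KV_DELIM    "="   /* assumed */

struct kv_pair { char *key; char *value; };

/* Split 'str' in place into key/value pairs; duplicated keys are kept.
 * Returns the number of pairs, or -1 on a malformed or excess entry. */
int kv_tokenize(char *str, struct kv_pair *pairs)
{
	char *outer, *inner, *tok;
	int n = 0;

	for (tok = strtok_r(str, PAIRS_DELIM, &outer); tok != NULL;
	     tok = strtok_r(NULL, PAIRS_DELIM, &outer)) {
		if (n == MAX_PAIRS)
			return -1;
		/* A separate save pointer keeps the inner split from
		 * disturbing the outer one. */
		pairs[n].key = strtok_r(tok, KV_DELIM, &inner);
		pairs[n].value = strtok_r(NULL, KV_DELIM, &inner);
		if (pairs[n].key == NULL || pairs[n].value == NULL)
			return -1;
		n++;
	}
	return n;
}
```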

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
ef8e5447e3 kvargs: rework API to fix memory leak
Before the patch, a call to rte_kvargs_tokenize() resulted in a call to
strdup() to allocate a modifiable copy of the argument string. This
string was never freed, except in the error cases of
rte_kvargs_tokenize() where rte_free() was wrongly called instead of
free(). In other cases, freeing this string was impossible as the
pointer was not saved.

This patch introduces rte_kvargs_free() in order to free the structure
properly. The pointer to the duplicated string is now kept in the
rte_kvargs structure. A call to rte_kvargs_parse() directly allocates
the structure, making rte_kvargs_init() useless.

The only drawback of this API change is that key/value associations
cannot be added to an existing kvlist. But this is not used today, and
there is no obvious use case for it.
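
The ownership rule can be sketched as follows. The structure and function
names are a hypothetical miniature of the reworked API; the point is that
the strdup() copy is stored in the structure and released with free().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the reworked structure. */
struct kvlist {
	char *str;   /* private, modifiable copy of the argument string */
};

/* Allocate the list and its private copy in one call. */
struct kvlist *kvlist_parse(const char *args)
{
	struct kvlist *l = malloc(sizeof(*l));

	if (l == NULL)
		return NULL;
	l->str = strdup(args);
	if (l->str == NULL) {
		free(l);
		return NULL;
	}
	/* ... l->str would be tokenized in place here ... */
	return l;
}

/* Release both the copy and the structure; NULL is accepted. */
void kvlist_free(struct kvlist *l)
{
	if (l == NULL)
		return;
	free(l->str);   /* strdup() memory must be returned with free() */
	free(l);
}
```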

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
03b611b713 kvargs: remove useless size field
This value was not very useful as the size of the table is fixed (equals
RTE_KVARGS_MAX).

By the way, the memset in the initialization function was wrong (size
too short). Even if it was not really an issue since we rely on the
"count" field, it is now fixed by this patch.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
991543e815 kvargs: remove driver name in arguments
Now that rte_kvargs is a generic library, there is no need to have an argument
for the driver name in rte_kvargs_tokenize() and rte_kvargs_parse()
prototypes. This argument was only used to log the driver name in case of
error. Instead, we can add a log in init function of pmd_pcap and pmd_ring.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:14 +01:00
Olivier Matz
07f7d55dd2 kvargs: use the new library in pmd_pcap
The rte_kvargs library is a reworked copy of rte_eth_pcap_arg_parser,
so it provides the same service. Therefore we can use it and remove the
code of rte_eth_pcap_arg_parser.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:13 +01:00
Olivier Matz
e1a00536c8 kvargs: add a new library to parse key/value arguments
Copy the code from rte_eth_pcap_arg_parser.[ch], without any functional
modifications, only:
- rename functions and structure
- restyle (indentation)
- add comments (doxygen style)
- add "const" or "static" attributes, remove unneeded "inline"

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 11:01:13 +01:00
Thomas Monjalon
46d4e6f7ad eal: remove unused macro for blacklist
This macro was used for blacklist parsing but is not used anymore
since commit 5a55b9ac91.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-02-26 11:01:13 +01:00
Ivan Boule
f563a3727b eal: fix recording of detected/enabled logical cores
1) In the EAL initialization phase, invoke the function rte_eal_cpu_init
   to detect the set of running cores (and enable them by default) before
   processing the [enabled] core mask option that is performed during the
   parsing of EAL arguments.

2) In the function rte_eal_cpu_init():
   - to parse the set of all running logical cores on the machine, do not
     use the RTE_LCORE_FOREACH macro that considers the set of already
     detected cores...
     Instead, use a standard loop based on the RTE_MAX_LCORE constant.
   - explicitly set to ROLE_RTE the role of each detected logical core
     that is recorded in the EAL configuration, as all running cores are
     enabled by default.

3) In the function eal_parse_coremask(), update the "lcore_count" field
   of the EAL configuration with the effective number of logical cores
   that are set in the mask of enabled logical cores.

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:01:13 +01:00
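Note on f563a3727b: point 3 of that message amounts to counting the bits of the core mask that correspond to detected cores. The sketch below assumes a hexadecimal mask string and uses hypothetical names (MAX_LCORE, count_enabled_lcores()); rejecting a mask that names an undetected core is an assumption of this sketch, not something the message states.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_LCORE 64 /* hypothetical stand-in for RTE_MAX_LCORE */

/*
 * Parse a hexadecimal core mask, mark the enabled cores and return how
 * many are enabled. Returns -1 if the mask is malformed or selects a
 * core that was not detected.
 */
static int count_enabled_lcores(const char *coremask,
				const uint8_t detected[MAX_LCORE],
				uint8_t enabled[MAX_LCORE])
{
	char *end = NULL;
	unsigned long long mask;
	unsigned i;
	int count = 0;

	errno = 0;
	mask = strtoull(coremask, &end, 16);
	if (errno != 0 || end == coremask || *end != '\0')
		return -1;

	/* Plain loop over all possible cores, not over "already enabled" ones. */
	for (i = 0; i < MAX_LCORE; i++) {
		enabled[i] = 0;
		if (((mask >> i) & 1ULL) == 0)
			continue;
		if (!detected[i])
			return -1;
		enabled[i] = 1;
		count++;
	}
	return count; /* value to store in the "lcore_count" field */
}
```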
Thomas Monjalon
894f5cc441 eal: fix type of pointer arithmetic result
Adding or subtracting a value to a pointer makes a new pointer
of unknown type.
So typeof() is replaced by (void*) in RTE_PTR_ADD() and RTE_PTR_SUB().

But the RTE_PTR_ALIGN_* macros explicitly promise in their API to return a
pointer of the same type. Since RTE_PTR_ALIGN_CEIL is based on RTE_PTR_ADD, a
typeof() is added to keep the original behaviour.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2014-02-26 11:01:13 +01:00
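Note on 894f5cc441: the distinction between the two macro families can be sketched as follows. The macro names here are local and hypothetical (PTR_ADD and so on), the code relies on the GNU __typeof__ extension, and align is assumed to be a power of two. The add/subtract macros yield void *, while the align macros cast back to the type of their argument.

```c
#include <stdint.h>

/* Pointer plus/minus a byte offset: the result has no meaningful type. */
#define PTR_ADD(ptr, x) ((void *)((uintptr_t)(ptr) + (x)))
#define PTR_SUB(ptr, x) ((void *)((uintptr_t)(ptr) - (x)))

/* Round down to a multiple of align, keeping the argument's type. */
#define PTR_ALIGN_FLOOR(ptr, align) \
	((__typeof__(ptr))((uintptr_t)(ptr) & ~((uintptr_t)(align) - 1)))

/*
 * Round up to a multiple of align. PTR_ADD returns void *, so an
 * explicit cast back to the argument's type is needed before flooring.
 */
#define PTR_ALIGN_CEIL(ptr, align) \
	PTR_ALIGN_FLOOR((__typeof__(ptr))PTR_ADD(ptr, (align) - 1), align)
```

With these definitions, PTR_ALIGN_CEIL(p, 64) on a char * yields a char *, whereas PTR_ADD(p, 5) yields a void * that the caller must cast deliberately.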
Damien Millescamps
f9a08f6502 eal: add support for shared object drivers
Add an option to specify libraries to be loaded before probing the PCI.

For instance, testpmd -d librte_pmd_xxx.so can be used to enable xxx driver
support on testpmd without any recompilation of testpmd.

Plugins are loaded before creating threads because we want the threads to
inherit any property that could be set while loading a plugin, such as iopl().

Signed-off-by: Damien Millescamps <damien.millescamps@6wind.com>
Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:01:13 +01:00
Thomas Monjalon
9b226335bb doc: fix some doxygen comments
- rte_panic must be before rte_panic_ to be associated to its doc
- marker /**< must be used when commenting after the declaration only
- fix rte_string_fns.h title
- typos

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-02-26 11:01:13 +01:00
Bruce Richardson
dc76ed2478 version: 1.6.0
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:47:59 +01:00
Bruce Richardson
4074a5d806 ixgbe: fix vf irq storm when running on Xen Dom0.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 10:22:33 +01:00
Bruce Richardson
429c6d86b3 ixgbe: prepare for vector pmd
The following changes are included in this patch for ixgbe:
* Support for a separate Vector Poll-Mode Driver component
* Refactoring to extract out definitions from .c file to separate .h

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:33 +01:00
Bruce Richardson
1550c20be0 ixgbe: minor rework offloading bits fix
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 10:22:33 +01:00
Bryan Benson
2c90eb650c ixgbe: fix offloading bits when Rx bulk alloc is used
This is a fix for the ixgbe hardware offload flags not being set
when bulk alloc RX is used. The issue was caused by masking off
the bits that store the hardware offload values in the status_error
field to retrieve the done bit for the descriptor.

Commit 7431041062 in DPDK-1.3.0
introduced bulk dequeue, which included the bug.

Signed-off-by: Bryan Benson <bmbenson@amazon.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 10:22:33 +01:00
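Note on 2c90eb650c: the class of bug described there can be sketched as below. The bit values and names (STAT_DD, STAT_VLAN, FLAG_VLAN, scan_buggy(), scan_fixed()) are hypothetical and do not reproduce the ixgbe descriptor layout; the sketch only contrasts masking before storing with masking at the point of the test.

```c
#include <stdint.h>

#define STAT_DD   0x01u /* hypothetical "descriptor done" bit */
#define STAT_VLAN 0x08u /* hypothetical hardware offload status bit */
#define FLAG_VLAN 0x0001u /* hypothetical packet flag reported to the app */

/* Translate hardware status bits into packet flags. */
static uint16_t status_to_flags(uint32_t status)
{
	return (status & STAT_VLAN) ? FLAG_VLAN : 0;
}

/* Buggy variant: the stored copy keeps only the done bit. */
static int scan_buggy(uint32_t status_error, uint16_t *flags)
{
	uint32_t s = status_error & STAT_DD; /* offload bits are lost here */

	if (!s)
		return 0;
	*flags = status_to_flags(s); /* always 0 */
	return 1;
}

/* Fixed variant: keep the full value, mask only when testing. */
static int scan_fixed(uint32_t status_error, uint16_t *flags)
{
	uint32_t s = status_error;

	if (!(s & STAT_DD))
		return 0;
	*flags = status_to_flags(s);
	return 1;
}
```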
Qinglai Xiao
27399312ef ixgbe: query assignment of VF queues
Physical Function assigns Tx/Rx queues to each Virtual Function
according to different schemes[1]. By querying through mailbox,
VF is able to get number of Tx/Rx queues assigned to it.

Note that current Intel ixgbe driver ixgbe-3.18.7 does not fully
support mailbox message IXGBE_VF_GET_QUEUES. The service routine
for IXGBE_VF_GET_QUEUES must be fixed, otherwise the PF always returns
1 as the Tx/Rx queue number.

[1] See section 7.2.1.2.1, 7.1.2.2 and 7.10.2.7.2 of Intel 82599 10
    Gbe Controller Datasheet.

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 10:22:33 +01:00
Bruce Richardson
f474e64f4d ixgbe: fix disabling loopback mode
Following the introduction of loopback mode, this mode should be explicitly
disabled in ixgbe_dev_rx_init() if not enabled.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 10:22:33 +01:00
Qinglai Xiao
db03592561 ixgbe: add Tx->Rx loopback mode for 82599
82599 has two loopback operation modes, Tx->Rx and Rx->Tx.
For the time being only Tx->Rx is supported.

The new field lpbk_mode added in struct rte_eth_conf defines loopback
operation mode for certain ethernet controller. By default the value
of lpbk_mode is 0, meaning loopback mode disabled.

Since each ethernet controller has its own definition of loopback modes,
the API user has to check both the datasheet and the implementation of the
driver to learn which values are valid and what behaviour to expect.

Check IXGBE_LPBK_82599_XXX which are defined in ixgbe_ethdev.h
for valid values of 82599 loopback mode.

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
Acked-by: Venky Venkatesan <venky.venkatesan@intel.com>
2014-02-26 10:22:33 +01:00
Bruce Richardson
682d65b8f5 igb: fix dual vlan ethertype
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 10:22:32 +01:00
Stephen Hemminger
8e7bd48f75 igb: restore workaround errata with wthresh on 82576
The 82576 has known issues which require the write threshold to be set to 1.
See:
	http://download.intel.com/design/network/specupdt/82576_SPECUPDATE.pdf

Otherwise, single packets will hang in the transmit ring until more arrive.
Simple tests like ping will fail.

The workaround was in the wrong file (commit a30ebfbb8c).
Move it to the igb one to restore the original patch (7e9e49feea).

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
3f6899edd7 igb/ixgbe: remove useless header inclusion
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-26 10:22:32 +01:00
Maxime Leroy
3aa1e71982 igb/ixgbe: allow RSS with only one Rx queue
It should be possible to enable RSS with one Rx queue.
RSS hash can be useful independently of the number of Rx queues.
Applications can use RSS hash to identify different IP flows.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 10:22:32 +01:00
Maxime Leroy
7a9b2b0998 igb/ixgbe: ETH_MQ_RX_NONE should disable RSS
As explained in rte_ethdev.h, ETH_MQ_RX_NONE allows to not choose RSS, DCB
or VMDQ mode.

But the igb/ixgbe code always silently selects the RSS mode with ETH_MQ_RX_NONE.
This patch fixes this inconsistency between the API and the implementation.

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
a18dc3319e pcap: add missing dependency on malloc
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
b9a4361fc5 virtio: mark functions as always inline
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
2cf3151ee4 virtio: various improvements
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
0978aa72cd virtio: add close function
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
dfaff37fc4 vmxnet3: import new vmxnet3 poll mode driver implementation
Poll Mode Driver for Paravirtual VMXNET3 NIC.
As a PMD, the VMXNET3 driver provides the packet reception and transmission
callbacks, vmxnet3_recv_pkts and vmxnet3_xmit_pkts. It does not support
scattered packet reception as part of vmxnet3_recv_pkts and
vmxnet3_xmit_pkts. Also, it does not support scattered packet reception as part of
the device operations supported.

The VMXNET3 PMD handles all the packet buffer memory allocation and resides in
guest address space and it is solely responsible to free that memory when not needed.
The packet buffers and features to be supported are made available to hypervisor via
VMXNET3 PCI configuration space BARs. During RX/TX, the packet buffers are
exchanged by their GPAs, and the hypervisor loads the buffers with packets in the RX
case and sends packets to vSwitch in the TX case.

The VMXNET3 PMD is compiled with vmxnet3 device headers. The interface is similar
to that of the other PMDs available in the Intel(R) DPDK API. The driver pre-allocates the
packet buffers and loads the command ring descriptors in advance. The hypervisor fills
those packet buffers on packet arrival and writes completion ring descriptors, which are
eventually pulled by the PMD. After reception, the Intel(R) DPDK application frees the
descriptors and loads new packet buffers for the coming packets. The interrupts are
disabled and there is no notification required. This keeps performance up on the RX
side, even though the device provides a notification feature.

In the transmit routine, the Intel(R) DPDK application fills packet buffer pointers in the
descriptors of the command ring and notifies the hypervisor. In response the hypervisor
takes packets and passes them to the vSwitch. It writes into the completion descriptors
ring. The rings are read by the PMD in the next transmit routine call and the buffers
and descriptors are freed from memory.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
d0c5ae3392 pci: add pci ids for vmxnet3 devices
pci ids for vmxnet3 devices added.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
b9ee370557 kni: update kernel driver ethtool baseline
Update the KNI kernel driver so it can compile on more modern kernels
Also, rebaseline the ethtool support off updated igb kernel drivers
so that we get the latest bug fixes and device support.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-26 10:22:32 +01:00
Bruce Richardson
e01d69d51c kni: fix packet loss in loopback mode
kni_net fixed to prevent losing packet bytes when doing loopback.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reported-by: Daniel Kaminsky <daniel.kaminsky@infinitelocality.com>
2014-02-26 10:22:12 +01:00
Bruce Richardson
9c495f7f18 kni: add kni close function
KNI close function added.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:19 +01:00
Bruce Richardson
47bd46112b xen: import xenvirt pmd and vhost_xen
This provides a para-virtualization packet switching solution, based on the
Xen hypervisor’s Grant Table, which provides simple and fast packet
switching capability between guest domains and host domain based on
MAC address or VLAN tag.

This solution comprises two components: a Poll Mode Driver (PMD)
as the front end in the guest domain and a switching back end in the
host domain. XenStore is used to exchange configuration information
between the PMD front end and the switching back end,
including grant reference IDs for shared Virtio RX/TX rings, MAC
address, device state, and so on.

The front end PMD can be found in the Intel DPDK directory
lib/librte_pmd_xenvirt and the back end example in examples/vhost_xen.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:19 +01:00
Bruce Richardson
148f963fb5 xen: core library changes
Core support for using the Intel DPDK with Xen Dom0 - including EAL
changes and mempool changes. These changes encompass how memory mapping
is done, including support for initializing a memory pool inside an
already-allocated block of memory.
KNI sample app updated to use KNI close function when used with Xen.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:19 +01:00
Bruce Richardson
40b966a211 ivshmem: library changes for mmaping using ivshmem
These library changes provide a new Intel DPDK feature for communicating
with virtual machines using QEMU's IVSHMEM mechanism.

The feature works by providing a command line for QEMU to map several hugepages
into a single IVSHMEM device. For the guest to know what is inside any given IVSHMEM
device (and to distinguish between Intel(R) DPDK and non-Intel(R) DPDK IVSHMEM
devices), a metadata file is also mapped into the IVSHMEM segment. No work needs to
be done by the guest application to map IVSHMEM devices into memory; they are
automatically recognized by the Intel(R) DPDK Environment Abstraction Layer (EAL).

Changes in this patch:
* Changes to EAL to allow mapping of all hugepages in a memseg into a single file
* Changes to EAL to allow ivshmem devices to be transparently mapped in
  the process running on the guest.
* New ivshmem library to create and manage metadata exported to guest VM's
* New ivshmem compilation targets
* Mempool and ring changes to allow export of structures to a VM and allow
  a VM to attach to those structures.
* New autotests to unit test this functionality.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:19 +01:00
Bruce Richardson
013615a784 mem: add bounded reserve function
For certain functionality, e.g. Xen Dom0 support, it is required that
we can guarantee that memzones for descriptor rings won't cross 2M
boundaries. So add new memzone reserve function where we can pass in a
boundary condition parameter.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:19 +01:00
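Note on 013615a784: the boundary condition can be sketched as a check that the first and last byte of a region fall in the same bound-sized window. The function names are hypothetical, bound is assumed to be a power of two (for example 2M), and bumping the start to the next boundary is one possible placement policy, not necessarily the one the memzone code uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Non-zero if [start, start + len) spans two bound-sized windows. */
static int crosses_boundary(uintptr_t start, size_t len, size_t bound)
{
	uintptr_t mask = ~((uintptr_t)bound - 1);

	return (start & mask) != ((start + len - 1) & mask);
}

/*
 * Return an address >= start where a region of len bytes does not cross
 * a bound-aligned boundary, or 0 if the request cannot be satisfied.
 */
static uintptr_t place_within_boundary(uintptr_t start, size_t len,
				       size_t bound)
{
	if (len == 0 || len > bound)
		return 0;
	if (crosses_boundary(start, len, bound))
		start = (start + bound - 1) & ~((uintptr_t)bound - 1);
	return start;
}
```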
Pei Chao
e0317809cd mem: remove duplicated lines
Extra space for future alignment was reserved twice.
It was introduced in version 1.3.0 (commit 916e4f4f4e).

Signed-off-by: Pei Chao <peichao85@gmail.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-25 21:29:19 +01:00
Thomas Monjalon
f622f3848f mem: fix log for --no-huge
In some cases, it is possible to not use hugepages.
So a simple malloc is used to initialize DPDK memory.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Damien Millescamps <damien.millescamps@6wind.com>
2014-02-25 21:29:19 +01:00
Bruce Richardson
8ea9ff834f mem: allow virtual memory address hinting
For multi-process applications, it can sometimes occur that part of the
address ranges used for memory mapping in the primary process are not
free in the secondary process, which causes the secondary processes to
abort on startup.
This patch adds in a memory hinting mechanism, where you can hint a
starting base address to the primary process for where you would like
the hugepage memory to be mapped. It is just a hint, so the memory will
not always go exactly where requested, but it should allow the memory
addresses used by a primary process to be adjusted up or down a little,
thereby fixing issues with secondary process startup.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
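Note on 8ea9ff834f: the "just a hint" behaviour matches how mmap() treats a non-NULL address without MAP_FIXED. The sketch below uses anonymous memory instead of hugepages and a hypothetical helper name, map_with_hint(); it only shows that the kernel may place the mapping elsewhere, so the caller has to look at the returned address.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Ask for len bytes near hint. Without MAP_FIXED the address is only a
 * suggestion, so the result may differ from hint. Returns NULL on failure.
 */
static void *map_with_hint(void *hint, size_t len)
{
	void *addr = mmap(hint, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return addr == MAP_FAILED ? NULL : addr;
}
```

A caller that needs the same layout in a secondary process would compare the returned address with the hint and decide whether the difference is acceptable.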
Bruce Richardson
48a0cc25f1 mbuf: rework check on mbuf freeing
Allow poll-mode drivers to maintain their own caches of mbufs, by allowing them
to check if it's ok to free an mbuf (to their local cache) without actually
freeing it back to the memory pool itself.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
657eabecd8 sched: use common macro RTE_DIM
Replace local DIM() macro with RTE_DIM in rte_red.c

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
c788bdb1a5 timer: missing optimization flag in compile
Timer library was missing the -O3 compile-time flag. This has been
added.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
cac5abc265 eal: fix printf format
Fix some format indicators in printf.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
5e8446dc6b eal: cleanup on mempool and memzone object names
Cleanup mempool and memzone object names so that we can more easily rename them
from headers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
896c37c5af eal: fix build with some gcc 4.4 toolchains
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
88f93f9340 eal: fix typo for RTE_EAL_ALLOW_INV_SOCKET_ID
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
ae9ba5bb6d eal: fix support for older gcc versions
Older versions of gcc don't support the cold attribute, so make its
presence conditional.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
16a6a44761 eal: fix cpuflags for latest microarch
Ensure that support for AVX2, HLE and RTM works with cpuflags.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
a3fd610463 eal: new common macros added
Added the following new macros/inline functions, which are both
generally useful and needed for later functionality:
* rte_align64pow2: aligns a 64bit parameter to next power of 2
* RTE_LEN2MASK: create mask of type <tp> with the first <ln> bits
* RTE_DIM: return the number of elements in an array.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
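Note on a3fd610463: the three helpers listed there can be sketched as follows. The names are local stand-ins (align64pow2, LEN2MASK, DIM), not the rte_ definitions, and LEN2MASK assumes a length between 1 and 64.

```c
#include <limits.h>
#include <stdint.h>

/* Round v up to the next power of two (v itself if already one). */
static inline uint64_t align64pow2(uint64_t v)
{
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	v |= v >> 32;
	return v + 1;
}

/* Mask of type tp with the lowest ln bits set (1 <= ln <= 64). */
#define LEN2MASK(ln, tp) \
	((tp)((uint64_t)-1 >> (sizeof(uint64_t) * CHAR_BIT - (ln))))

/* Number of elements in an array (not valid on a pointer). */
#define DIM(a) (sizeof(a) / sizeof((a)[0]))
```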
Bruce Richardson
6f5f8ecce1 eal: add rte_compiler_barrier() macro
The rte_ring functions used a compiler barrier to stop the compiler
reordering certain expressions. This is generally useful so is moved
to the common header file with the other barriers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
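Note on 6f5f8ecce1: a compiler barrier of this kind is commonly written as an empty asm statement with a "memory" clobber (GCC-style inline asm). The sketch below uses a hypothetical macro name and a hypothetical slot structure; it constrains only the compiler's reordering, not the CPU's.

```c
/* Prevent the compiler from moving memory accesses across this point. */
#define compiler_barrier() do { \
	__asm__ volatile ("" : : : "memory"); \
} while (0)

struct slot {
	int payload;
	volatile int ready;
};

/* Write the payload first, then publish it by setting the flag. */
static void publish(struct slot *s, int value)
{
	s->payload = value;
	compiler_barrier(); /* keep the payload store before the flag store */
	s->ready = 1;
}
```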
Bruce Richardson
e49680a87e mk: compilation fixes
Missing _GNU_SOURCE define for compilation of a number of files.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
764bf26873 add FreeBSD support
Changes to allow compilation and use on FreeBSD. Includes:
* contigmem and nic_uio driver for FreeBSD
* new EAL instance
* new "bsdapp" compilation target
* various compilation fixes due to differences between linux and freebsd

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:18 +01:00
Bruce Richardson
e9d48c0072 update Intel copyright years to 2014
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-02-25 21:29:14 +01:00
Intel
4f85a5697e version: 1.5.2
Signed-off-by: Intel
2014-01-15 16:40:47 +01:00
Intel
466e137457 lpm: fix sub-rule deletion
Restore group validation flag of the tbl8 entry
if sub-rule is replaced by an encompassing rule.

Signed-off-by: Intel
2014-01-15 16:37:43 +01:00
Richardson, Bruce
6eb0ae218a pcap: fix mbuf allocation
A static list of 64 mbufs was being reused in Rx function.
This caused two errors:
1) If more than 64 buffers were requested in a single burst,
   only the last 64 buffers are returned, the others are lost.
2) Application will free the mbuf being returned, but the receive
   function will reuse the buffer anyway. If some other allocation
   is done, there are suddenly multiple writers for the same mbuf.
It is fixed by allocating mbuf on demand.

At the same time, some length errors are fixed.

Reported-by: Mats Liljegren <mats.liljegren@enea.com>
Reported-by: Robert Sanford <rsanford@prolexic.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-01-15 15:55:05 +01:00
Intel
5a6d9897f9 ixgbe: residual fix about resetting big Tx queues
Index overflow when resetting big queues was partially fixed in
bcf457f8c0 (ixgbe: fix index overflow when resetting big Tx queues)
and better fixed in
e8ae856140 (igb/ixgbe: fix index overflow when resetting big queues)

But this version (1.5.2r0) has residues of the initial fix from 1.5.1r0.

Signed-off-by: Intel
2014-01-15 15:31:09 +01:00
Richardson, Bruce
b2595c4aa9 igb/ixgbe: fix build with ICC
ICC requires an initializer be given for the static variables,
so adding one in cases where one wasn't previously given.

This problem was introduced in commit e8ae856140
(fix index overflow when resetting big queues).

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-01-15 15:29:29 +01:00
Thomas Monjalon
e8ae856140 igb/ixgbe: fix index overflow when resetting big queues
Rings are reset with a loop because memset cannot be used without
issuing a warning about volatile casting.
The index of the loop was a 16-bit variable which is sufficient for
ring entries number but not for the byte size of the whole ring.
The overflow happens when rings are configured for 4096 entries
(descriptor size is 16 bytes). The result is an endless loop.

It is fixed by indexing ring entries and resetting all bytes of the entry
with a simple assignment.
The descriptor initializer is zeroed thanks to its static declaration.

There already was a fix for ixgbe Tx only
(commit bcf457f8c0).
It is reverted to use the same fix everywhere (Rx/Tx for igb/ixgbe).

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-01-03 17:08:09 +01:00
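Note on e8ae856140: the arithmetic behind the endless loop is that 4096 entries of 16 bytes make 65536 bytes, one more than a 16-bit index can reach, so a byte-indexed loop condition never becomes false. The sketch below uses a hypothetical 16-byte descriptor and hypothetical function names to show the entry-indexed reset with a plain struct assignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte descriptor. */
struct desc {
	uint64_t addr;
	uint64_t cmd;
};

/* Byte size of a ring; for 4096 entries this exceeds UINT16_MAX. */
static size_t ring_bytes(uint16_t nb_desc)
{
	return (size_t)nb_desc * sizeof(struct desc);
}

/*
 * Reset a ring entry by entry. The 16-bit index is enough because it
 * counts entries, not bytes, and assignment works on volatile memory
 * where memset would need a cast.
 */
static void ring_reset(volatile struct desc *ring, uint16_t nb_desc)
{
	static const struct desc zeroed = { 0, 0 };
	uint16_t i;

	for (i = 0; i < nb_desc; i++)
		ring[i] = zeroed;
}
```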
Intel
06bea48c36 version: 1.5.1
Signed-off-by: Intel
2013-11-24 21:31:36 +01:00
Intel
4d2ca079d4 app/test: rename pmac_acl as acl
Signed-off-by: Intel
2013-11-24 21:31:36 +01:00
Intel
a4a9fa474c kni: fix vhost build with kernel 3.7
Signed-off-by: Intel
2013-11-24 21:31:36 +01:00
Intel
c1c677a1ec kni: fix X540 init
KNI must not do hardware reset. But it was resetting X540 devices.

This bug was in the first KNI version
(commit 3fc5ca2f63).

Signed-off-by: Intel
2013-11-24 21:31:36 +01:00
Intel
8b401ff04f kni: add i354 support
Signed-off-by: Intel
2013-11-24 21:31:36 +01:00
Intel
bcf457f8c0 ixgbe: fix index overflow when resetting big Tx queues
The index of the loop was a 16-bit variable which is sufficient for
ring entries number but not for the byte size of the whole ring.
The overflow happens when queue rings are configured for 4096 entries
(descriptor size is 16 bytes). The result is an endless loop.

Signed-off-by: Intel
2013-11-24 21:31:16 +01:00
Intel
eba3a2e929 ixgbe: fix RSC disabling bit
Signed-off-by: Intel
2013-11-24 01:31:35 +01:00
Intel
940b3cc0c0 ixgbe: add MAC control forward
Signed-off-by: Intel
2013-11-24 01:31:35 +01:00
Intel
03c95df15a e1000: add MAC control forward
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
ff4a42ca64 ethdev: add MAC control forward
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
a1ebecb0a9 ixgbe: add 82599 bypass support
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
c3d0564cf0 ethdev: add bypass logic
An overview of bypass logic can be seen in this document:
http://www.intel.com/content/dam/www/public/us/en/documents/product-briefs/ethernet-server-bypass-adapter-x520-x540-family-brief.pdf

Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
df0f8da586 igb: fix PF build
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
e2aa75f170 igb: configure CRC stripping for i211 and i354
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
4f29810e3c igb: add i354 support
Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
38db3f7f50 e1000: update base driver
The base driver supports more NICs:
    - i210 flashless
    - i217
    - i218
    - i354

The new features are not automatically used by the DPDK PMD.

Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
dffbaf7880 e1000: revert fix for multicast in VF
Revert fix from commit 06cf9be95c.

Signed-off-by: Intel
2013-11-24 01:31:34 +01:00
Intel
1558bea6e3 e1000: more error checks
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
2fd4855f30 e1000: mark unused parameters
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
1d2d65121b e1000: minor changes
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
5037620be5 e1000: whitespace changes
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
f72751f2c7 virtio: minor changes
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
de9fa911b1 ring: use rte_atomic functions
Rather than directly calling intrinsics functions,
use functions from rte_atomic.h.

Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
a8d9a27ae4 ring: move log
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
a7bbd6e2fc ring: fix build without ixgbe
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
1e082d4350 pcap: fix build without ixgbe
Signed-off-by: Intel
2013-11-24 01:31:33 +01:00
Intel
718bf2ae4d mem: remove hugepage file on unmap
Signed-off-by: Intel
2013-11-23 23:48:41 +01:00
Intel
ce1f9117b3 pci: fix sysfs parsing for uio
Signed-off-by: Intel
2013-11-23 23:48:21 +01:00
Intel
53784a6090 sched: remove debug symbols
Signed-off-by: Intel
2013-11-21 10:12:10 +01:00
Intel
8d111d7425 meter: remove debug symbols
Signed-off-by: Intel
2013-11-21 10:12:10 +01:00
Intel
9cb80085a4 lpm: fix route adding
Signed-off-by: Intel
2013-11-21 10:12:01 +01:00
Intel
fed1491b40 eal: whitespace change
Signed-off-by: Intel
2013-11-20 10:27:00 +01:00
Intel
732a9d3035 eal: fix build with intrinsics and gcc < 4.8
Provide a fallback if the intrinsic __builtin_bswap16 is not available.

Signed-off-by: Intel
2013-11-20 10:27:00 +01:00
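Note on 732a9d3035: a fallback of this kind can be sketched with a compiler version test and a shift-based swap. The macro HAVE_BUILTIN_BSWAP16 and the function name are hypothetical; compilers that report an older GCC version simply take the portable path.

```c
#include <stdint.h>

#if defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 8))
#define HAVE_BUILTIN_BSWAP16 1
#endif

/* Swap the two bytes of a 16-bit value. */
static inline uint16_t bswap16(uint16_t x)
{
#ifdef HAVE_BUILTIN_BSWAP16
	return __builtin_bswap16(x);
#else
	return (uint16_t)((x << 8) | (x >> 8));
#endif
}
```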
Intel
ac33030524 mk: avoid multiple inclusion of rte_config.h
Signed-off-by: Intel
2013-11-19 16:20:09 +01:00
Intel
dcae8715b8 version: 1.5.0-pre-release
Signed-off-by: Intel
2013-10-09 16:16:16 +02:00
Intel
b23ffbaa82 kni: add vhost backend
Attach to vhost-net as raw socket backend.

Signed-off-by: Intel
2013-10-09 16:16:15 +02:00
Intel
904d29a135 kni: move FIFO functions
Move FIFO functions into kni_fifo.h in order to reuse them for vhost.

Signed-off-by: Intel
2013-10-09 16:16:15 +02:00
Intel
9c61145ff6 kni: allow multiple threads
In this new mode, each KNI device has its own kernel thread for Rx.
The core affinity is configurable.

Signed-off-by: Intel
2013-10-09 16:16:15 +02:00
Intel
fbf895d44c kni: identify device by name
Some old API functions based on port_id are deprecated.

Signed-off-by: Intel
2013-10-09 16:16:15 +02:00
Intel
0b44a857c8 kni: generate random MAC address if needed
Replace the address based on "\0KNIxy" by a random MAC.

Signed-off-by: Intel
2013-10-09 16:16:15 +02:00
Intel
4583570637 kni: fix build with kernel 3.10
- The flags NETIF_F_HW_VLAN_* have been renamed to NETIF_F_HW_VLAN_CTAG_*.
See Linux commit f646968f8f7c624587de729115d802372b9063dd.

- The VLAN protocol must be specified.
See Linux commits 86a9bad3ab6b6f858fd4443b48738cabbb6d094c
and 80d5c3689b886308247da295a228a54df49a44f6.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
47cda8c470 kni: clean logs
The debug is now disabled by default and can be enabled with
configuration option CONFIG_RTE_KNI_KO_DEBUG.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
abc3630ed3 kni: minor changes
Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
c1f86306a0 virtio: add new driver
This PMD can be used in a VM having virtio-net NIC.

Note: it is a different implementation than virtio-usermap extension.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
7ef0072910 ethdev: random MAC address
Factor out the code by moving the random_addr() function into a single place.
It will be reused for virtio.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
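Note on 7ef0072910: a random MAC address is normally made unicast and locally administered by adjusting two bits of the first byte. The sketch below uses rand() and a hypothetical function name, random_mac(); it is not the ethdev helper itself.

```c
#include <stdint.h>
#include <stdlib.h>

/* Fill addr with a random unicast, locally administered MAC address. */
static void random_mac(uint8_t addr[6])
{
	int i;

	for (i = 0; i < 6; i++)
		addr[i] = (uint8_t)(rand() & 0xff);
	addr[0] &= (uint8_t)~0x01u; /* clear the group bit: unicast */
	addr[0] |= 0x02u;           /* set the locally administered bit */
}
```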
Intel
7bd128eae2 eal: increase I/O privilege
Set I/O privilege to the highest level (3).
It is needed for virtio.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
4c173302c3 pcap: add new driver
This PMD uses libpcap to send/receive packets to/from any NIC.
It can also read/write to/from a file.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
e1e4017751 ring: add new driver
This PMD is a set of FIFOs using rte_ring without any NIC.
It can be used as a loopback.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
dc5026b485 ethdev: allow device without registered driver
It is needed for non-pci devices (ring and pcap).

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
0b33b68d12 ethdev: export allocate function
The function rte_eth_dev_allocate() was called by rte_eth_dev_init().
In order to use it for non-pci devices, it is now in public API.

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
2c502225c6 eal: introduce non-pci devices
This type of pseudo-device is needed for ring and pcap PMDs.
They are compatible with whitelist and are initialized in rte_eal_init().

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
fe3a45fd41 ixgbe: add VMDq support
Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
d52147ec28 igb: add VMDq support
Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
88ac4396ad ethdev: add VMDq support
Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
0105ba4c6b ixgbe: fix VF init without setup
In case of multi-process application, the secondary process can initialize
the driver without configuring queues. In this case the Rx/Tx functions
were not initialized because it was only done in queue setup.

Fix by reproducing the same behaviour as in eth_ixgbe_dev_init().

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
5caeb1b143 igb: fix VF init without setup
In case of multi-process application, the secondary process can initialize
the driver without configuring queues. In this case the Rx/Tx functions
were not initialized because it was only done in queue setup.

Fix by reproducing the same behaviour as in eth_igb_dev_init().

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
a84f185a8a e1000: fix descriptor overflow
Allow rxq->rx_tail + offset > 65535
in eth_em_rx_descriptor_done().

Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
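Note on a84f185a8a: the overflow arises when the sum of two 16-bit values is stored back into a 16-bit variable before the wrap-around test. The sketch below is an assumption about the shape of the fix, with a hypothetical function name: the sum is held in a 32-bit variable and wrapped against the ring size.

```c
#include <stdint.h>

/*
 * Index of the descriptor "offset" entries after rx_tail in a ring of
 * nb_desc entries, or -1 if offset is outside the ring.
 */
static int32_t desc_index(uint16_t rx_tail, uint16_t offset, uint16_t nb_desc)
{
	uint32_t desc;

	if (offset >= nb_desc)
		return -1;
	desc = (uint32_t)rx_tail + offset; /* may exceed 65535 */
	if (desc >= nb_desc)
		desc -= nb_desc;
	return (int32_t)desc;
}
```

For example, with rx_tail 65000, offset 1000 and a ring of 65535 entries, the 32-bit sum is 66000 and wraps to 465, whereas a 16-bit sum would silently become 464.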
Intel
b967915a03 e1000: minor changes
Signed-off-by: Intel
2013-10-09 16:16:14 +02:00
Intel
02331c16ec ethdev: reset unsupported stats
Initialize statistics structure to 0 before passing it to the PMD.
This way, the unsupported fields will be 0.

Signed-off-by: Intel
2013-10-09 16:14:52 +02:00
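
The zero-before-fill pattern described in 02331c16ec can be sketched as follows. The structure, the callback type and the function names are hypothetical stand-ins, not the ethdev API; the point is only that clearing the structure in the generic layer makes every field the driver does not write read as 0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical statistics structure with three counters. */
struct demo_stats {
    uint64_t ipackets;
    uint64_t opackets;
    uint64_t ierrors;
};

/* Hypothetical driver callback type. */
typedef void (*demo_stats_get_t)(struct demo_stats *stats);

/* Hypothetical driver that supports only two of the three counters. */
static void demo_driver_stats(struct demo_stats *stats)
{
    stats->ipackets = 10;
    stats->opackets = 7;
    /* ierrors is not supported by this driver and is left untouched. */
}

/* Generic layer: clear the structure, then let the driver fill it. */
static void demo_stats_get(demo_stats_get_t drv, struct demo_stats *stats)
{
    memset(stats, 0, sizeof(*stats));
    drv(stats);
}
```

Even if the caller passes an uninitialized structure, `ierrors` reads as 0 after `demo_stats_get()` returns.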
Intel
d15808aa1e mem: retrieve mempool cache only when needed
It is an optimization for the single consumer case,
or when cache is too small,
or when cache is disabled.

Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
Intel
7c60fd9ef6 mem: retry malloc with smaller block size when failure
rte_malloc tries to allocate memzone blocks with a minimum size.
If that fails, it retries with a smaller size than the standard one.
It fails only if it cannot allocate a block of the requested size.

Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
Intel
9b15ba895b timer: use a skip list
The skip list algorithm improves scalability.

Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
Intel
ef8ff191a4 lpm: rework rules storage
Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
Intel
0260e5e43f sched: minor changes
Do not define grinder_credits_check() if it is not used.

Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
Intel
5140eb165f eal: use pause only with SSE2
The pause instruction is part of SSE2 extensions.
Note that some compilers define _mm_pause as "rep; nop" instead of "pause".
For compatible processors, they are equivalent.

http://www.intel.com/Assets/PDF/manual/325383.pdf:
"
When executing a spin-wait loop, a Pentium 4 or Intel Xeon processor suffers
a severe performance penalty when exiting the loop because it detects a
possible memory order violation.
The PAUSE instruction provides a hint to the processor that the code sequence
is a spin-wait loop. The processor uses this hint to avoid the memory order
violation in most situations, which greatly improves processor performance.
"

Signed-off-by: Intel
2013-10-09 16:04:09 +02:00
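
A spin-wait loop of the kind quoted in 5140eb165f might look like the sketch below. It assumes the compiler defines `__SSE2__` when SSE2 is enabled and uses the `_mm_pause()` intrinsic from `<emmintrin.h>`; the macro name, the function name and the spin limit are hypothetical, and the fallback is simply an empty statement.

```c
#include <stdatomic.h>

#ifdef __SSE2__
#include <emmintrin.h>
/* SSE2 available: emit the PAUSE hint inside the spin loop. */
#define DEMO_PAUSE() _mm_pause()
#else
/* No SSE2: fall back to doing nothing between polls. */
#define DEMO_PAUSE() do { } while (0)
#endif

/* Poll `flag` until it becomes non-zero or `max_spins` polls have
 * failed; returns the number of pauses that were executed. */
static unsigned demo_spin_until_set(atomic_int *flag, unsigned max_spins)
{
    unsigned spins = 0;

    while (!atomic_load_explicit(flag, memory_order_acquire) &&
           spins < max_spins) {
        DEMO_PAUSE();
        spins++;
    }
    return spins;
}
```

The spin limit exists only to keep the sketch bounded; a real spin-wait would normally loop until another thread sets the flag.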
Intel
cd5b32ee33 eal: allow to whitelist devices
The new option --use-device is a PCI whitelist.
It is the opposite of the -b option.

Signed-off-by: Intel
2013-10-09 16:03:59 +02:00
Intel
5a55b9ac91 eal: allow to blacklist address without domain prefix
These 2 formats are now accepted:
    domain:bus:device.function
           bus:device.function

Signed-off-by: Intel
2013-10-09 15:46:52 +02:00
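
Accepting a PCI address both with and without the domain prefix, as in 5a55b9ac91 (`domain:bus:device.function` or `bus:device.function`), can be sketched with two `sscanf()` attempts. This is an illustration rather than the EAL parser: the structure and function names are hypothetical, the domain is assumed to default to 0 when omitted, and `%x` is more permissive than a strict parser would be (it tolerates leading whitespace and a `0x` prefix).

```c
#include <stdio.h>

/* Hypothetical PCI address holder. */
struct demo_pci_addr {
    unsigned domain;
    unsigned bus;
    unsigned devid;
    unsigned function;
};

/* Parse "domain:bus:device.function" or "bus:device.function" (hex
 * fields). Returns 0 on success, -1 on a malformed string. */
static int demo_parse_pci(const char *s, struct demo_pci_addr *a)
{
    char tail; /* catches trailing garbage after the function number */

    /* Long form: exactly four fields and nothing after them. */
    if (sscanf(s, "%x:%x:%x.%x%c", &a->domain, &a->bus,
               &a->devid, &a->function, &tail) == 4)
        return 0;

    /* Short form: the domain is omitted and assumed to be 0. */
    a->domain = 0;
    if (sscanf(s, "%x:%x.%x%c", &a->bus, &a->devid,
               &a->function, &tail) == 3)
        return 0;

    return -1;
}
```

For example, `demo_parse_pci("01:00.0", &addr)` and `demo_parse_pci("0000:01:00.0", &addr)` both succeed and fill in the same address.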
Intel
0058a97b6a eal: rework CPU mask parsing
The CPU mask was limited to "unsigned long long".
The limit was removed, and parsing/init is now stricter.

Signed-off-by: Intel
2013-10-09 15:46:52 +02:00
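
One way to parse a CPU mask without the "unsigned long long" limit mentioned in 0058a97b6a is to walk the hexadecimal string one digit at a time, starting from the least significant digit. The sketch below is an assumption about the general approach, not the EAL code; `DEMO_MAX_LCORE` and the function name are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_LCORE 128 /* hypothetical upper bound on core ids */

/* Parse a hex core mask of arbitrary length into per-core flags.
 * Returns the number of enabled cores, or -1 on invalid input. */
static int demo_parse_coremask(const char *mask,
                               unsigned char enabled[DEMO_MAX_LCORE])
{
    size_t len, i;
    int count = 0;

    if (mask == NULL)
        return -1;
    if (mask[0] == '0' && (mask[1] == 'x' || mask[1] == 'X'))
        mask += 2; /* skip an optional 0x prefix */
    len = strlen(mask);
    if (len == 0)
        return -1;

    memset(enabled, 0, DEMO_MAX_LCORE);
    for (i = 0; i < len; i++) {
        /* Digit i, counted from the right, covers cores 4*i .. 4*i+3. */
        unsigned char c = (unsigned char)mask[len - 1 - i];
        int val, bit;

        if (!isxdigit(c))
            return -1; /* reject anything that is not a hex digit */
        val = isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
        for (bit = 0; bit < 4; bit++) {
            size_t core = i * 4 + (size_t)bit;

            if (!(val & (1 << bit)))
                continue;
            if (core >= DEMO_MAX_LCORE)
                return -1; /* bit set beyond the supported range */
            enabled[core] = 1;
            count++;
        }
    }
    return count;
}
```

Because each digit is handled independently, a mask such as `10000000000000000` (bit 64 set) parses correctly even though it does not fit in 64 bits.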
Intel
e740686611 eal: minor changes
Signed-off-by: Intel
2013-10-09 15:46:52 +02:00
Intel
e25e4d7ef1 mk: shared libraries
Allow building shared libraries (.so) instead of static ones (.a).

Signed-off-by: Intel
2013-10-09 15:35:36 +02:00
Intel
1c1d4d7a92 doc: whitespace changes in licenses
Signed-off-by: Intel
2013-10-09 14:51:55 +02:00
Intel
cd449b929b update version to 1.4.1
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
0b3144a905 ixgbe: fix DCB setup
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
11c378b199 ixgbe: check DD bit for specific RX descriptor
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
c32ee651a4 igb: check DD bit of specific RX descriptor
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
6a6f2b57a3 ethdev: check DD bit of specific RX descriptor
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
525ef82c4e ixgbe: RX queue count is not implemented for VF
It was introduced by mistake in version 1.4.0.

Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
90b2eeb80a ixgbe: use DD bit to count RX available descriptors
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
0f6b7c7f7a igb: use DD bit to count RX available descriptors
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
c25e53e079 ethdev: fix doxygen comment for rte_eth_rx_queue_count
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
9dc8cd6ef7 pci: check driver probe return code
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00
Intel
096ff2e82d mem: rework huge page mapping for secondary process
Signed-off-by: Intel
2013-09-17 14:16:10 +02:00