122 Commits

Author SHA1 Message Date
Helin Zhang
eeae544a51 pci: access to specific bits via sysfs
Enabling 'Extended Tag' and resetting 'Max Read Request Size' in PCI
config space have a big impact on i40e performance. Some BIOS
implementations allow changing them, others do not. Two sysfs files,
'extended_tag' and 'max_read_request_size', are added so that they can
be changed with 'echo' from user space.
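
As an illustration of the bits involved, here is a minimal sketch that
assumes the standard PCIe Device Control register layout (Extended Tag
Field Enable in bit 8, Max_Read_Request_Size in bits 14:12). The helper
names are hypothetical and are not taken from the patch itself.

```c
#include <stdint.h>

/* PCIe Device Control register fields (standard PCIe capability layout). */
#define PCI_EXP_DEVCTL_EXT_TAG      0x0100u /* bit 8: Extended Tag Field Enable */
#define PCI_EXP_DEVCTL_READRQ_MASK  0x7000u /* bits 14:12: Max_Read_Request_Size */
#define PCI_EXP_DEVCTL_READRQ_SHIFT 12

/* Return devctl with the Extended Tag bit set (enable != 0) or cleared. */
static uint16_t devctl_set_extended_tag(uint16_t devctl, int enable)
{
	return enable ? (uint16_t)(devctl | PCI_EXP_DEVCTL_EXT_TAG)
		      : (uint16_t)(devctl & ~PCI_EXP_DEVCTL_EXT_TAG);
}

/*
 * Return devctl with Max_Read_Request_Size set to `bytes`.
 * Valid sizes are powers of two from 128 to 4096; the field stores
 * log2(bytes / 128). Any other size leaves the register unchanged.
 */
static uint16_t devctl_set_max_read_request(uint16_t devctl, unsigned bytes)
{
	unsigned code = 0;
	unsigned size = 128;

	while (size < bytes && code < 5) {
		size <<= 1;
		code++;
	}
	if (size != bytes)
		return devctl;
	return (uint16_t)((devctl & ~PCI_EXP_DEVCTL_READRQ_MASK) |
			  (code << PCI_EXP_DEVCTL_READRQ_SHIFT));
}
```

Both helpers only compute the new register value; reading and writing
the config space itself is left to the kernel module.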

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Signed-off-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Heqing Zhu <heqing.zhu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
2014-06-17 18:21:09 +02:00
Anatoly Burakov
317fe51f6e eal: add command line option to select vfio interrupt type
Unlike igb_uio, VFIO interrupt type is not set by kernel module
parameters but is set up via ioctl() calls at runtime. This warrants
a new EAL command-line parameter. It will have no effect if VFIO is
not compiled, but will set VFIO interrupt type to either "legacy", "msi"
or "msix" if VFIO support is compiled. Note that VFIO initialization
will fail if the interrupt type selected is not supported by the system.

If the interrupt type parameter wasn't specified, VFIO will try all
interrupt types (starting with MSI-X).

In unit tests, we don't know if VFIO is compiled (eal_vfio.h header is
internal to Linuxapp EAL), so we check this flag regardless.
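
A rough sketch of how such a parameter value could be parsed. The enum
and function names are hypothetical; only the three accepted strings
come from the description above. INTR_MODE_NONE stands for "not
specified", in which case the caller would try every type.

```c
#include <stddef.h>
#include <string.h>

enum intr_mode {
	INTR_MODE_NONE = 0,	/* not specified: try all types */
	INTR_MODE_LEGACY,
	INTR_MODE_MSI,
	INTR_MODE_MSIX
};

/* Parse the option value; returns 0 on success, -1 on an unknown name. */
static int parse_intr_mode(const char *arg, enum intr_mode *mode)
{
	static const struct {
		const char *name;
		enum intr_mode value;
	} map[] = {
		{ "legacy", INTR_MODE_LEGACY },
		{ "msi",    INTR_MODE_MSI },
		{ "msix",   INTR_MODE_MSIX },
	};
	size_t i;

	if (arg == NULL || mode == NULL)
		return -1;
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(arg, map[i].name) == 0) {
			*mode = map[i].value;
			return 0;
		}
	}
	return -1;	/* *mode is left untouched */
}
```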

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
5da473e965 pci: enable vfio device binding
Add support for binding VFIO devices if RTE_PCI_DRV_NEED_MAPPING is set
for this driver. Try VFIO first, if not mapped then try IGB_UIO too.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
2f4adfad0a vfio: add multiprocess support
Since VFIO cannot be used to map the same device twice, secondary
processes receive the device/group fd's by means of communicating over a
local socket. Only group and container fd's should be sent, as device
fd's can be obtained via ioctl() calls on the group fd.

For multiprocess, VFIO distinguishes between existing but unused groups
(e.g. groups that aren't bound to VFIO driver) and non-existing groups in
order to know if the secondary process requests a valid group, or if
secondary process requests something that doesn't exist.

VFIO multiprocess sync communicates over a simple protocol. It defines
two requests - request for group fd, and request for container fd.
Possible replies are: SOCKET_OK (an OK signal), SOCKET_ERR (error
signal) and SOCKET_NO_FD (a signal that indicates that the requested
VFIO group is valid, but no fd is present for that group - indicating
that the respective group is simply not bound to VFIO driver).

Here is the logic in a nutshell:

1. secondary process sends SOCKET_REQ_CONTAINER or SOCKET_REQ_GROUP
1a. in case of SOCKET_REQ_GROUP, client also then sends group number
2. primary process receives message
2a. in case of invalid group, SOCKET_ERR is sent back to secondary
2b. in case of unbound group, SOCKET_NO_FD is sent back to secondary
2c. in case of valid group, SOCKET_OK is sent and followed by fd
3. socket is closed

in case of any error, socket is closed and SOCKET_ERR is sent.
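
The reply selection on the primary side can be sketched as below. The
message names come from the description above, but their numeric
values, the group_state enum and the function name are assumptions
made for illustration. The socket I/O and fd passing are omitted.

```c
enum vfio_sock_msg {
	SOCKET_REQ_CONTAINER = 0x100,	/* values are assumed */
	SOCKET_REQ_GROUP,
	SOCKET_OK,
	SOCKET_NO_FD,
	SOCKET_ERR
};

/* Hypothetical result of looking up the requested group number. */
enum group_state {
	GROUP_INVALID,	/* no such IOMMU group */
	GROUP_UNBOUND,	/* group exists but is not bound to VFIO */
	GROUP_BOUND	/* group exists and an fd is available */
};

/* Pick the reply code for a request (steps 2a-2c of the list above). */
static enum vfio_sock_msg primary_reply(enum vfio_sock_msg request,
					enum group_state state)
{
	switch (request) {
	case SOCKET_REQ_CONTAINER:
		return SOCKET_OK;	/* container fd follows */
	case SOCKET_REQ_GROUP:
		if (state == GROUP_INVALID)
			return SOCKET_ERR;
		if (state == GROUP_UNBOUND)
			return SOCKET_NO_FD;
		return SOCKET_OK;	/* group fd follows */
	default:
		return SOCKET_ERR;	/* unknown request */
	}
}
```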

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
ff0b67d1c8 vfio: DMA mapping
Adding code to support VFIO mapping (primary processes only). Most of
the things are done via ioctl() calls on either /dev/vfio/vfio (the
container) or a /dev/vfio/$GROUP_NR (IOMMU group).

In a nutshell, the code does the following:
1. creates a VFIO container (an entity that allows sharing IOMMU DMA
   mappings between devices)
2. checks if a given PCI device is a member of an IOMMU group (if it's
   not, this indicates that the device isn't bound to VFIO)
3. calls open() on the group file to obtain a group fd
4. checks if the group is viable (that is, if all the devices in the
   same IOMMU group are either bound to VFIO or not bound to anything)
5. adds the group to a container
6. sets up DMA mappings (only done once, mapping whole DPDK hugepage
   memory for DMA, with a 1:1 correspondence of IOVA to PA)
7. gets the actual PCI device fd from the group fd (can fail, which
   simply means that this particular device is not bound to VFIO)
8. maps BARs (the MSI-X BAR cannot be mmapped, so it is skipped)
9. sets up interrupt structures (but does not enable them!)
10. enables PCI bus mastering
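
Steps 3 to 5 can be sketched with the ioctls from <linux/vfio.h>. This
is a simplified illustration, not the code from the patch: the function
name is hypothetical, and the container fd is assumed to have been
opened from /dev/vfio/vfio beforehand.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/vfio.h>

/*
 * Open an IOMMU group file (e.g. /dev/vfio/$GROUP_NR), check that the
 * group is viable and add it to the container.
 * Returns the group fd on success, -1 on any failure.
 */
static int vfio_attach_group(int container_fd, const char *group_path)
{
	struct vfio_group_status status;
	int group_fd;

	group_fd = open(group_path, O_RDWR);		/* step 3 */
	if (group_fd < 0)
		return -1;

	memset(&status, 0, sizeof(status));
	status.argsz = sizeof(status);
	if (ioctl(group_fd, VFIO_GROUP_GET_STATUS, &status) < 0 ||
	    !(status.flags & VFIO_GROUP_FLAGS_VIABLE) ||	/* step 4 */
	    ioctl(group_fd, VFIO_GROUP_SET_CONTAINER,
		  &container_fd) < 0) {			/* step 5 */
		close(group_fd);
		return -1;
	}
	return group_fd;
}
```

The device fd of step 7 would then be requested from the returned group
fd with VFIO_GROUP_GET_DEVICE_FD.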

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
5c782b3928 vfio: interrupts
Creating code to handle VFIO interrupts in EAL interrupts (supports all
types of interrupts).

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
157bf937f5 vfio: header for build support
Add VFIO compilation option to linuxapp config.

Adding a header that will determine if VFIO support should be compiled
in. If VFIO is enabled in config (and it's enabled by default), then the
header will also check for kernel version. If VFIO is enabled in config
and if the kernel version is 3.6+, then VFIO_PRESENT will be defined.
This is the macro that should be used to determine if VFIO support is
being compiled in.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: HuilongX Xu <huilongx.xu@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
71d74422e2 pci: rename RTE_PCI_DRV_NEED_IGB_UIO to RTE_PCI_DRV_NEED_MAPPING
Rename the RTE_PCI_DRV_NEED_IGB_UIO to be more generic.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
6bf883260d pci: distinguish between legitimate failures and non-fatal errors
Currently, EAL does not distinguish between actual failures and expected
initialization errors. E.g. sometimes the driver fails to initialize
because it was not supposed to be initialized in the first place, such
as device not being managed by said driver.

This patch makes EAL fail on actual initialization errors while still
skipping over expected initialization errors.
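
One way to express the distinction is a return-code convention, shown
here as a sketch. The convention (0 for success, positive for "not
handled by this driver", negative for a real failure) and all names are
assumptions for illustration; the three probe functions are stand-ins.

```c
#include <stddef.h>

typedef int (*probe_fn)(void);

static int probe_ok(void)       { return 0; }	/* initialized */
static int probe_not_mine(void) { return 1; }	/* expected: skip */
static int probe_broken(void)   { return -1; }	/* real failure */

/*
 * Run every probe. Skips are ignored, a real failure aborts.
 * Returns the number of initialized devices, or -1 on failure.
 */
static int probe_all(const probe_fn *probes, size_t n)
{
	int initialized = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = probes[i]();

		if (ret < 0)
			return -1;	/* actual initialization error */
		if (ret > 0)
			continue;	/* device not managed by this driver */
		initialized++;
	}
	return initialized;
}
```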

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
2b730d4f0a pci: fix code style
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
67c536bdad pci: move uio mapping in a dedicated file
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
46a6fa8793 pci: rework uio mapping to prepare for vfio
Separating mapping code and calls to open. This is preparatory work
for the VFIO patch, since it will need to map BARs too but does not use
the path in mapped_pci_resource. Also, renaming structs to be more generic.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
f15addb79e mem: make --no-huge use mmap instead of malloc
This makes it possible to run DPDK without hugepage memory when VFIO
is used, as VFIO uses virtual addresses to set up DMA mappings.

Technically, malloc is just fine, but we want to guarantee that
memory will be page-aligned, so using mmap to be safe.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:10 +02:00
Anatoly Burakov
a05a193109 eal: remove useless compilation flag
eal_hpet.c was renamed to eal_timer.c and, thanks to code changes, does
not need the -Wno-return-type any more.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-16 15:02:09 +02:00
Bruce Richardson
3031749c2d remove trailing whitespaces
This commit removes trailing whitespace from lines in files. Almost all
files are affected, as the BSD license copyright header had trailing
whitespace on 4 lines in it [hence the number of files reporting 8 lines
changed in the diffstat].

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: remove spaces before tabs in libs]
[Thomas: remove more trailing spaces in non-C files]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-06-11 00:29:34 +02:00
Jijiang Liu
5ebbb17281 xen: reserve memory at installing dom0_mm.ko
The patch changes the way of reserving memory in Dom0 driver.

It reserves memory when the rte_dom0_mm.ko kernel module is installed,
instead of requesting memory dynamically during DPDK application startup.
In addition, the driver now first requests memory in 4M blocks and,
if that fails, falls back to requesting 2M blocks.

The main reasons for these changes are as follows:
First, to reduce the impact of increased memory fragmentation
after the system has been running for a long time.
Second, to reduce the number of memory segments.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-29 11:43:11 +02:00
Ouyang Changchun
0748be2cf9 ethdev: queue start and stop
This patch adds an API to support queue start and stop functionality for RX/TX.
It allows RX and TX queues to be started or stopped one by one, instead of
starting and stopping all of them at the same time.

Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Tested-by: Waterman Cao <waterman.cao@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-28 16:00:55 +02:00
Neil Horman
e57f20e051 eal: make vdev init path generic for both virtual and pci devices
Currently, physical device pmds use a separate initialization path
(rte_pmd_init_all) while virtual devices use a constructor registration and
rte_eal_dev_init.  There's no reason to have them be separate.  This patch
removes the vdev specific nomenclature from the vdev init path and makes it more
generic for use with all pmds.  This is the first step in converting the
physical device pmds to using the same constructor based registration path that
the virtual devices use.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-20 14:28:16 +02:00
Stephen Hemminger
e5ac7c2ff3 eal: don't inline string functions
It makes no sense to inline string functions; in fact, snprintf
can't be inlined because the function takes a variable number of
arguments.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: update includes]
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-16 16:02:55 +02:00
Stephen Hemminger
591a9d7985 add FILE argument to debug functions
The DPDK dump functions are useful for remote debugging of an
application. But when the application runs as a daemon, stdout
is typically routed to /dev/null.

Change all these functions to take a stdio FILE * handle
instead. An application can then use open_memstream() to capture
the output.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
[Thomas: fix quota_watermark example]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-16 16:02:55 +02:00
Stephen Hemminger
c738c6a644 spelling fixes
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-05-16 16:02:55 +02:00
Didier Pallard
937cca79c9 mem: change default per socket memory allocation
Currently, if there is more memory in hugepages than the amount
requested by dpdk application, the memory is allocated by taking as much
memory as possible from each socket, starting from first one.
For example, if a system is configured with 8 GB in 2 sockets (4 GB per
socket), and dpdk is requesting only 4GB of memory, all memory will be
taken in socket 0 (which has exactly 4GB of free hugepages) even if some
cores are configured on socket 1, and there are free hugepages on socket
1.

Change this behaviour to allocate memory on all sockets where some cores
are configured, spreading the memory amongst sockets using the following
ratio per socket:
  (number of cores configured on the socket / total number of configured
  cores) * requested memory
If this new algorithm fails, it falls back to the previous behaviour.
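
The ratio can be sketched as follows. The function name is
hypothetical, sizes are in MB, and any remainder lost to integer
division is simply dropped here; how the patch handles rounding is not
shown in this log.

```c
#include <stdint.h>

/*
 * Split requested_mb across sockets in proportion to the number of
 * cores configured on each socket.
 * Returns 0 on success, -1 if no core is configured at all.
 */
static int split_memory_by_cores(uint64_t requested_mb,
				 const unsigned *cores_per_socket,
				 unsigned n_sockets,
				 uint64_t *mb_per_socket)
{
	unsigned total_cores = 0;
	unsigned i;

	for (i = 0; i < n_sockets; i++)
		total_cores += cores_per_socket[i];
	if (total_cores == 0)
		return -1;

	for (i = 0; i < n_sockets; i++)
		mb_per_socket[i] =
			requested_mb * cores_per_socket[i] / total_cores;
	return 0;
}
```

With 4096 MB requested and 3 cores on socket 0 plus 1 core on socket 1,
this yields 3072 MB and 1024 MB.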

This algorithm is used when memory amount is specified globally using
-m option. Per socket memory allocation can always be done using
--socket-mem option.

It is implemented only for Linux, as the BSD part does not look ready for NUMA.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Venky Venkatesan <venky.venkatesan@intel.com>
2014-05-14 11:06:49 +02:00
David Marchand
5d8751b83b pci: remove deprecated RTE_EAL_UNBIND_PORTS option
RTE_EAL_UNBIND_PORTS was deprecated in DPDK 1.4.0 and removed in 1.6.0, but the
code was not removed.

The bind/unbind operations should not be handled by the eal.
These operations should be either done outside of dpdk or inside the PMDs
themselves as these are their problems.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:48 +02:00
David Marchand
ef6352833a pci: move RTE_PCI_DRV_FORCE_UNBIND handling out of #ifdef
Move RTE_PCI_DRV_FORCE_UNBIND flag handling out of RTE_EAL_UNBIND_PORTS section.
This had nothing to do with RTE_EAL_UNBIND_PORTS anyway.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:29 +02:00
David Marchand
d8deab8a88 pci: pci_switch_module cleanup
The pci_switch_module() function should only do what its name tells: unbind pci
devices and rebind them on the specified kernel driver.
Hence, it can not call pci_uio_map_resource().

Call to pci_uio_map_resource() should be moved to rte_eal_pci_probe_one_driver()
so that we can factorize code.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:21:22 +02:00
David Marchand
d3e6faf840 pci: rework interrupt fd init and fix fd leak
A fd leak happens in pci_map_resource when multiple bars are mapped.
Fix this by closing the fd unconditionally in this function and opening
the intr_handle fd in pci_uio_map_resource instead.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:20:47 +02:00
David Marchand
99d44c7e26 pci: remove virtio-uio workaround
virtio-uio does not need eal to map bars from uio device, so remove flag
RTE_PCI_DRV_NEED_IGB_UIO.
Then, move virtio-uio workaround out of generic eal_pci.c for linux
implementation.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:20:35 +02:00
David Marchand
8990aac31d pci: fix potential mem leaks
Looking at bsd implementation, we can see that there are some potential mem
leaks in linux implementation. Fix them.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-13 13:18:17 +02:00
Burakov, Anatoly
0c3977dd15 mem: take reserved hugepages into account
Some applications reserve hugepages for later use,
but DPDK doesn't take reserved pages into account
when calculating the number of available hugepages.

This patch adds reading from "resv_hugepages" file
in addition to "free_hugepages".
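
The calculation can be sketched like this. The helper names are
hypothetical, and clamping the result to zero when more pages are
reserved than free is an assumption, not something stated above.

```c
#include <stdio.h>

/* Read one unsigned decimal value from a sysfs-style file. */
static int read_ulong(const char *path, unsigned long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (f == NULL)
		return -1;
	ok = (fscanf(f, "%lu", value) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Pages DPDK may actually use: free pages minus reserved pages. */
static unsigned long usable_hugepages(unsigned long free_pages,
				      unsigned long resv_pages)
{
	return free_pages > resv_pages ? free_pages - resv_pages : 0;
}
```

A caller would read "free_hugepages" and "resv_hugepages" from the same
hugepage directory with read_ulong() and pass both to
usable_hugepages().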

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-13 10:11:18 +02:00
Wang Sheng-Hui
b20539d687 eal: print maximum and detected lcores
Print the maximum number of lcores as configured, and the number of
lcores detected at EAL CPU init, as debug info in addition to the
existing per-lcore detected/not-detected info.

Signed-off-by: Wang Sheng-Hui <shhuiw@gmail.com>
[Thomas: add BSD part]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-05 18:04:05 +02:00
Didier Pallard
2ef79b212a eal: remove useless output of undetected lcores
Increasing the maximum number of lcores makes undetected lcores take up
a lot of space in output traces. Moreover, this output does not give any
interesting information, since list of undetected lcores can be deduced
from list of detected ones.
So remove output related to undetected cores.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-05-05 18:03:09 +02:00
David Marchand
69020660c3 eal: remove unused config fields
There is no need for a 'magic' field in struct rte_config, as this part of the
structure is local to each process. All threads of a process are synchronised
because of the run_once atomic.
So remove this field, as it is only adding confusion when reading code that
references 'magic' field from struct rte_mem_config.

Besides, there is no reference to the 'version' field, so remove it as well.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-05-05 18:02:56 +02:00
Maxime Leroy
9ad0e24942 eal: fix vdev allocation on non-0 numa socket
vdev ethdev can not be allocated on a numa socket that is not socket 0.
The reason comes from rte_eth_dev_allocate() which uses rte_socket_id() to
identify the socket on which vdev driver data should be allocated.
However, at this initialization step, rte_socket_id() always returns 0.

Looking at rte_socket_id(), it needs rte_lcore_id() which uses the per-core
global _lcore_id variable. This variable is initialised by
eal_thread_init_master.

So eal_thread_init_master should be called before rte_eal_vdev_init().

Signed-off-by: Maxime Leroy <maxime.leroy@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-30 22:59:17 +02:00
Pascal Mazon
894fd42e7f eal: do not try to load library from current directory
When loading a library "libfoo.so" (depending on "libbar.so", located in an
entirely different folder), with LD_LIBRARY_PATH="/path/to/libfoo.so", it
returns an error:

 EAL: ./libfoo.so: cannot open shared object file: No such file or directory

If the first dlopen() fails (here, because it can't find all dependencies),
the code falls back to a second dlopen() that looks for "./libfoo.so". This
turns on pathname matching, which does not use LD_LIBRARY_PATH. As a result,
it fails because it cannot find "./libfoo.so".

The error message matches the error of the second dlopen(), not the first's.

Do not try to look for a different library ("./"-prefixed) than the one
provided in argument. Let the dynamic library management handle it, just
provide an appropriate LD_LIBRARY_PATH.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-18 00:38:38 +02:00
David Marchand
4f04db8b89 eal: check coremask against detected lcores
lcores that are set in coremask should be checked against lcores detected on
system. This way, we won't need to check them later.

Besides, if specifying an unavailable lcore, we currently panic in
eal_thread_loop() because pthread_setaffinity_np fails.
So this check will return an error with a more explicit message in
eal_parse_coremask().

"EAL: pthread_setaffinity_np failed
 PANIC in eal_thread_loop():
 cannot set affinity"

becomes :

"EAL: lcore 4 unavailable
 EAL: invalid coremask"
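
A sketch of such a check. The function name and the fixed limit of 64
lcores are assumptions for illustration; the message text mirrors the
new output quoted above.

```c
#include <stdint.h>
#include <stdio.h>

#define MAX_LCORE 64	/* assumed limit for this sketch */

/*
 * Verify that every lcore set in coremask was detected on the system.
 * detected[i] is non-zero if lcore i exists.
 * Returns 0 if the mask is usable, -1 otherwise.
 */
static int check_coremask(uint64_t coremask, const int detected[MAX_LCORE])
{
	unsigned i;

	if (coremask == 0)
		return -1;
	for (i = 0; i < MAX_LCORE; i++) {
		if (!(coremask & (UINT64_C(1) << i)))
			continue;
		if (!detected[i]) {
			fprintf(stderr, "EAL: lcore %u unavailable\n", i);
			return -1;
		}
	}
	return 0;
}
```

On a system with lcores 0 to 3, a coremask of 0x15 selects lcores 0, 2
and 4 and is rejected because of lcore 4.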

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-04-18 00:38:37 +02:00
Stephen Hemminger
524f073a10 ivshmem: fix errors identified by hardening
Need to pass mode argument to open with O_CREAT.
Must check return value from ftruncate().
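
Both points in one short sketch. The function name and the 0600 mode
are illustrative choices, not taken from the patch.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create (or open) a file and set its size.
 * Returns the fd on success, -1 on failure.
 */
static int create_sized_file(const char *path, off_t size)
{
	/* O_CREAT requires the third (mode) argument. */
	int fd = open(path, O_CREAT | O_RDWR, 0600);

	if (fd < 0)
		return -1;
	/* ftruncate() can fail, so its result must be checked. */
	if (ftruncate(fd, size) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```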

Signed-off-by: Stephen Hemminger <shemming@brocade.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-17 15:48:44 +02:00
Olivier Matz
396b69e56a vdev: allow external registration of virtual device drivers
The registration of an external vdev driver (a .so library) is done in a
function that has the ((constructor)) attribute. This function is called
when dlopen(driver.so) is invoked.

As a result, we need to do the dlopen() before calling
rte_eal_vdev_init() that calls the initialization functions of all
registered drivers.
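
The ordering requirement follows from how constructor-based
registration works, sketched below with hypothetical names. The
constructor runs when the object is loaded (at program start, or during
dlopen() for a shared library), so the driver is already in the list
when the init loop walks it. The attribute is a GCC/Clang extension.

```c
#include <stddef.h>

struct vdev_driver {
	const char *name;
	int (*init)(void);
	struct vdev_driver *next;
};

static struct vdev_driver *driver_list;

static void vdev_driver_register(struct vdev_driver *drv)
{
	drv->next = driver_list;
	driver_list = drv;
}

static int demo_init(void)
{
	return 0;
}

static struct vdev_driver demo_driver = {
	.name = "eth_demo0",
	.init = demo_init,
	.next = NULL,
};

/* Runs automatically when this object is loaded. */
__attribute__((constructor))
static void demo_register(void)
{
	vdev_driver_register(&demo_driver);
}

/* Call init on every registered driver; returns the count or -1. */
static int vdev_init_all(void)
{
	struct vdev_driver *drv;
	int count = 0;

	for (drv = driver_list; drv != NULL; drv = drv->next) {
		if (drv->init() < 0)
			return -1;
		count++;
	}
	return count;
}
```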

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-11 16:17:57 +02:00
Olivier Matz
9fa5e2b026 vdev: rename nonpci_devs as vdev
The name "nonpci_devs" for virtual devices is ambiguous as a physical
device can also be non-PCI (ex: usb, sata, ...). A better name for this
file is "vdev" as it only deals with virtual devices.

This patch doesn't introduce any change except renaming.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-11 14:05:08 +02:00
Olivier Matz
8e245de6ca devargs: allow to provide arguments per pci device
Some PCI drivers may require some specific initialization arguments at
start-up.

Even if unused today, adding this feature seems coherent with virtual
devices in order to provide a full-featured rte_devargs framework. In
the future, it could be added in pmd_ixgbe or pmd_igb for instance to
enable debug of drivers or setting a specific operating mode at
start-up.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:34 +02:00
Olivier Matz
cac6d08c8b devargs: replace --use-device option by --pci-whitelist and --vdev
This commit splits the "--use-device" option in two new options:

- "--pci-whitelist" or "-w": add a PCI device to the white list
- "--vdev": instantiate a new virtual device

Before the patch, the same option "--use-device" was used for these 2
use-cases.

By the way, we also add "--pci-blacklist" in addition to the existing
"-b" for consistency with the whitelist parameter.

Test result:

echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
./app/test -c 0x15 -n 3 -m 64
RTE>>eal_flags_autotest
[...]
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:34 +02:00
Olivier Matz
a8b97e3a1d devargs: use a comma instead of semicolon to separate key/values
This commit changes the API of --use-device command line argument.
It changes the separators from ';' to ','. Indeed, ';' is not the best
choice as this character is also used to separate shell commands,
forcing the user to surround arguments with quotes.

This commit impacts both devargs and kvargs as each of them define
a separator in --use-device argument:

- devargs defines the separator between the device name or pci_id and
   its arguments
- kvargs defines the separator between key/value pairs in
   arguments for drivers using the kvargs API to parse their arguments

The modification of devargs and kvargs is done in one commit to keep
the coherency of --use-device.
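
A sketch of the devargs-level split with the new separator. The
function name is hypothetical; the real parser is not shown in this
log.

```c
#include <string.h>

/*
 * Split "name,key=value,key=value" in place at the first comma.
 * After the call, spec holds only the device name; the returned
 * pointer holds the arguments (an empty string if there are none).
 */
static char *devargs_split(char *spec)
{
	char *comma = strchr(spec, ',');

	if (comma == NULL)
		return spec + strlen(spec);	/* points at the '\0' */
	*comma = '\0';
	return comma + 1;
}
```

Since ',' has no special meaning to the shell, a value such as
eth_pcap0,iface=ixgbe0 can be passed without quoting.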

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 15:50:11 +02:00
Olivier Matz
1220458951 devargs: use devargs for vdev and PCI whitelist/blacklist
Remove old whitelist code:
- remove references to rte_pmd_ring, rte_pmd_pcap and pmd_xenvirt in
  is_valid_wl_entry() as we want to be able to register external virtual
  drivers as a shared library. Moreover this code was duplicated with
  dev_types[] from eal_common_pci.c
- eal_common_whitelist.c was badly named: it was able to process PCI
  devices white list and the registration of virtual devices
- the parsing code was complex: all arguments were prepended in
  one string dev_list_str[4096], then split again

Use the newly introduced rte_devargs to get:
- the PCI white list
- the PCI black list
- the list of virtual devices

Rework the tests:
- a part of the whitelist test can be removed as it is now tested
  in app/test/test_devargs.c
- the other parts are just reworked to adapt them to the new API

This commit induces a small API modification: it is no longer possible to
specify several devices per "--use-device" option. This notation was anyway
a bit cryptic. Ex:
  --use-device="eth_ring0,eth_pcap0;iface=ixgbe0"
  now becomes:
  --use-device="eth_ring0" --use-device="eth_pcap0;iface=ixgbe0"

On the other hand, it is now possible to work in PCI blacklist mode and
instantiate virtual drivers, which was not possible before this patch.

Test result:

./app/test -c 0x15 -n 3 -m 64
RTE>>devargs_autotest
EAL: invalid PCI identifier <08:1>
EAL: invalid PCI identifier <00.1>
EAL: invalid PCI identifier <foo>
EAL: invalid PCI identifier <>
EAL: invalid PCI identifier <000f:0:0>
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:59:34 +02:00
Olivier Matz
bf6dea0e04 devargs: introduce API and test
This commit introduces a new API for storing device arguments given by
the user. It only adds the framework and the test. The modification of
EAL to use this new module is done in next commit.

The final goals:

- unify pci-blacklist, pci-whitelist, and virtual devices arguments
  in one file
- allow registering a virtual device driver from a dpdk extension
  provided as a shared library. For that we will need to remove
  references to rte_pmd_ring and rte_pmd_pcap in argument parsing code
- clarify the API of eal_common_whitelist.c, and rework its code that is
  often complex for no reason.
- support arguments for PCI devices and possibly future non-PCI devices
  (other than virtual devices) without effort.

Test result:

echo 100 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
echo 100 > /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages
./app/test -c 0x15 -n 3 -m 64
RTE>>eal_flags_autotest
[...]
Test OK

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:58:34 +02:00
Olivier Matz
5b1f4a67dd pci: rename device and driver lists
To avoid confusion with virtual devices, rename device_list as
pci_device_list and driver_list as pci_driver_list.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-10 14:58:31 +02:00
Bruce Richardson
d73d8f3ad4 timer: fix TSC frequency by not reading /proc/cpuinfo
This reverts commit da6fd0759cbeb5fc14991a79e40105b9f6b99059.
	"timer: get TSC frequency from /proc/cpuinfo"

The use of cpuinfo to determine the frequency of the TSC is not
advisable and leads to incorrect results when power management is
in use. This is because, while the TSC frequency does not change
in modern cpus with constant_tsc support, the frequency of the core,
and hence the frequency of the core reported by cpuinfo *does* change.

Depending on the current frequency of core 0 when an application is
started, the EAL can get a wildly incorrect value for the TSC freq.
Since frequency is scaled down for power saving, any incorrect value
is likely to be lower than the default, which means that any delay
loops inside the code which rely on the TSC will be shorter than
planned. This can cause issues (reported on the mailing list by a number
of people) where ports are not initialized correctly due to delays being
too short.
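
A worked example of the effect, with made-up frequencies: a delay loop
sized from an underestimated TSC frequency spins for too few cycles.
Function names are hypothetical, and frequencies are assumed to be at
least 1 MHz.

```c
#include <stdint.h>

/* Cycles a delay loop spins for, given the frequency EAL believes in. */
static uint64_t delay_cycles(uint64_t assumed_tsc_hz, uint64_t us)
{
	return assumed_tsc_hz / 1000000u * us;
}

/* Real duration in microseconds of spinning for `cycles` TSC ticks. */
static uint64_t actual_delay_us(uint64_t cycles, uint64_t true_tsc_hz)
{
	if (true_tsc_hz < 1000000u)
		return 0;	/* outside the assumed range */
	return cycles / (true_tsc_hz / 1000000u);
}
```

If the TSC really runs at 2.4 GHz but the core was scaled down to
1.2 GHz when cpuinfo was read, a requested 100 us delay spins for
120000 cycles, which is only 50 us of real time.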

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-04-09 14:21:36 +02:00
Thomas Monjalon
3097de6e6b mem: get physical address of any pointer
Insert get_physaddr() into public API as rte_mem_virt2phy().

rte_mem_virt2phy() makes it possible to obtain the physical address of
any virtual address mapped to the current process.
get_physaddr() was working only for addresses pointing exactly to
the first byte of a page.
Note that this function is very slow and shouldn't be called
after initialization to avoid a performance bottleneck.

The memory must be locked with mlock(). The function rte_mem_lock_page()
is a mlock() helper that locks the whole page.

A better name would be rte_mem_virt2phys but rte_mem_virt2phy is more
consistent with rte_mempool_virt2phy.
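
For readers unfamiliar with the mechanism, here is a sketch of a
virtual-to-physical lookup through /proc/self/pagemap on Linux. It is
an assumption that the function works this way; the commit message does
not say so. The name virt2phys_sketch is hypothetical. Each pagemap
entry is 64 bits wide, with the page frame number in bits 0-54 and a
"present" flag in bit 63.

```c
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define BAD_PHYS_ADDR ((uint64_t)-1)

/* Translate any mapped virtual address, not only page-aligned ones. */
static uint64_t virt2phys_sketch(const void *virt)
{
	long page_size = sysconf(_SC_PAGESIZE);
	uintptr_t vaddr = (uintptr_t)virt;
	uint64_t entry;
	off_t offset;
	ssize_t got;
	int fd;

	if (page_size <= 0)
		return BAD_PHYS_ADDR;
	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return BAD_PHYS_ADDR;

	/* One 8-byte entry per virtual page. */
	offset = (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(entry);
	got = pread(fd, &entry, sizeof(entry), offset);
	close(fd);
	if (got != (ssize_t)sizeof(entry))
		return BAD_PHYS_ADDR;
	if (!(entry & (UINT64_C(1) << 63)))
		return BAD_PHYS_ADDR;	/* page not present in RAM */

	/* Frame number times page size, plus the offset inside the page. */
	return (entry & ((UINT64_C(1) << 55) - 1)) * (uint64_t)page_size +
	       (vaddr % (uintptr_t)page_size);
}
```

On recent kernels the frame number reads as zero for unprivileged
processes, so the result is only meaningful with sufficient privileges.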

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-03-20 15:35:08 +01:00
Thomas Monjalon
53a9ca3c57 mem: revert "get physical address of any pointer"
This reverts commit 57c24af85d9eaa81549a212169605b4e2468a29f
which was wrongly rebased in 1.6.0 branch:
- commit log must be changed for 1.6.0
- it breaks building for 32-bit
A new version of this commit has to be done.
2014-03-20 15:35:08 +01:00
David Marchand
4b28dda3dc mem: fix build of virtual address hinting for 32-bit
The initial commit doesn't build for 32-bit:
8ea9ff83 (mem: allow virtual memory address hinting)

lib/librte_eal/linuxapp/eal/eal.c: In function ‘eal_parse_base_virtaddr’:
build/include/rte_common.h:133:22:
error: cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]
  RTE_PTR_ALIGN_FLOOR((typeof(ptr))RTE_PTR_ADD(ptr, (align) - 1), align)
                      ^

RTE_PTR_ALIGN_CEIL return type is the same as what we give it as input.
So instead of casting the returned value, cast 'addr' which should be the same
as base_virtaddr.
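
The underlying portability rule can be sketched as follows: pointer
arithmetic through an integer should go through uintptr_t, which has
the size of a pointer on both 32-bit and 64-bit builds. These helpers
are hypothetical and are not the rte_common.h macros.

```c
#include <stdint.h>

/* Round ptr down to a multiple of align (align must be a power of two). */
static void *ptr_align_floor(void *ptr, uintptr_t align)
{
	return (void *)((uintptr_t)ptr & ~(align - 1));
}

/* Round ptr up to a multiple of align (align must be a power of two). */
static void *ptr_align_ceil(void *ptr, uintptr_t align)
{
	return ptr_align_floor((void *)((uintptr_t)ptr + align - 1), align);
}
```

Casting a pointer to a fixed 64-bit integer type instead is what
triggers -Wpointer-to-int-cast on a 32-bit build.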

Reported-by: Mats Liljegren <mats.liljegren@enea.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-03-20 15:34:46 +01:00
Olivier Matz
f7f97c1604 pci: add option --create-uio-dev to run without hotplug
When the user specifies --create-uio-dev in dpdk eal start options,
DPDK will create the /dev/uioX device itself instead of waiting for a
program to do it (which is usually hotplug).

This option is useful in embedded environments where there is no hotplug
to do the work.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00
Olivier Matz
61410438da pci: split the function providing uio device and mappings
Add a new function pci_get_uio_dev() that parses /sys/bus/pci/devices
to get the uio device associated with a PCI device. This patch just
moves some code that was in pci_uio_map_resource() in the new function
without any functional change.

Thanks to this change, the next commit will be easier to understand.
Moreover it improves readability: having smaller functions helps to
understand what pci_uio_map_resource() does.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-02-26 11:07:28 +01:00