Commit Graph

1432 Commits

Author SHA1 Message Date
Keith Wiles
d8b8517893 ethdev: fix hotplug check for Rx and Tx callbacks
Some checks with rte_eth_dev_is_valid_port() were missed when merging
hotplug and callbacks features.

Fixes: c282abd2a6 ("ethdev: remove assumption that port will not be detached")

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2015-03-05 21:30:47 +01:00
Cunming Liang
352078e8e1 ixgbe: check rxd number to avoid mbuf leak
The mbuf leak happens when the assigned number of rx descriptor is not
power of 2 in vector mode.
As it's presumed on vpmd rx (for rx_tail wrap), adding condition check
to prevent it.
The root cause reference code in *_recv_raw_pkts_vec* as below.
"rxq->rx_tail = (uint16_t)(rxq->rx_tail & (rxq->nb_rx_desc - 1));".

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-03-05 20:06:06 +01:00
Stephen Hemminger
490a1f0a25 enic: remove useless cast
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-03-04 21:50:42 +01:00
Stephen Hemminger
d886477be7 eal/linux: remove useless memset
The path variable is set via snprintf, and does not need to
memset before that.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-03-04 21:50:42 +01:00
Stephen Hemminger
944127d1ed eal/bsd: remove useless assignments
If variable is set in the next line, it doesn't need to be
initialized.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2015-03-04 21:50:42 +01:00
Pawel Wodkowski
03b2f7716b cmdline: fix parameter type
Fix warning reported during static analysis about size_t to int cast
when passing parameters to parse_set_list().

This patch fix code formating errors that give checkpatch.pl errors
after generating patch.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-03-04 11:19:37 +01:00
Pawel Wodkowski
460a165075 ring: fix memory leak
Free kvlist on function exit to avoid memory leak during devinit.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-03-04 11:19:37 +01:00
Pawel Wodkowski
c34af7424e kvargs: fix freeing behaviour for null
By convention free() functions should ignore NULL parameter. This patch
add this behaviour for rte_kvargs_free().

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-03-04 11:19:37 +01:00
Pawel Wodkowski
5663c25dcc timer: fix callback declaration inconsistency
This patch remove inconsistency between declaration of type
rte_timer_cb_t, field f in struct rte_timer and function
__rte_timer_reset().

Although compiler treat both of them the same, the static analysis tool
like complain about that.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-03-04 11:19:37 +01:00
Thomas Monjalon
4b1b380213 bond: remove debug function to fix link with shared lib
The function print_client_stats was used in the example without being
clearly exported in the map file. So it breaks linking with shared library
when debug is enabled.
It's better to remove this function as it probably could be implemented
with statistics API.

Fixes: cc7e8ae84f ("add example application for link bonding mode 6")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-03-04 11:19:25 +01:00
Thomas Monjalon
a91d686358 mlx4: mute auto config in quiet mode
If verbose is off, auto-config-h.sh script should be quiet.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-03-04 11:19:21 +01:00
Thomas Monjalon
9d740b97f2 mlx4: fix build with mempool debug enabled
The mempool header forces error on -Wcast-qual and makes verbs.h failing.
Let's include verbs before as a system header.

Fixes: 7fae69eeff ("mlx4: new poll mode driver")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2015-03-04 11:19:07 +01:00
Thomas Monjalon
80edfef4a8 virtio: fix build with debug enabled
With CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT=y:
	error: ‘devname’ undeclared (first use in this function)

Fixes: da978dfdc4 ("virtio: use port IO to get PCI resource")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-03-04 11:18:53 +01:00
Thomas Monjalon
459132bc2c virtio: fix build with mempool debug enabled
The mempool header forces error on -Wcast-qual:
	error: cast discards ‘const’ qualifier from pointer target type

Let's fix it by removing const qualifier of pci driver from commit
	5e9f6d1340 ("pci: reference driver structure for each device")
It's needed because the driver flags are changed depending on using uio or not.
Actually these driver flags should be directly attached to each device.

Fixes: da978dfdc4 ("virtio: use port IO to get PCI resource")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-03-04 11:18:36 +01:00
Thomas Monjalon
ae6ad6b0c4 fm10k: fix build with debug enabled
error: implicit declaration of function ‘RTE_LOG’

Fixes: a6061d9e70 ("fm10k: register PF driver")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-03-04 11:18:36 +01:00
Thomas Monjalon
22e34d1211 mempool: fix build with debug enabled
error: format ‘%p’ expects argument of type ‘void *’,
but argument 5 has type ‘const struct rte_mempool *’ [-Werror=format=]

mp type is (const struct rte_mempool *) and must be casted into a simpler
type to be printed.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-03-04 11:18:27 +01:00
Thomas Monjalon
8cf61ec829 eal/linux: fix build
Compilation fails in some distributions because of missing unistd.h
needed for pread/pwrite (seen with Suse):
	lib/librte_eal/linuxapp/eal/eal_pci_uio.c:62:2:
	error: implicit declaration of function ‘pread’

Fixes: 4a499c6495 ("eal/linux: enable uio_pci_generic support")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-03-03 23:21:33 +01:00
Pawel Wodkowski
a001589ec1 devargs: fix null dereferencing on failure
On failure devargs->args should not be accessed if devargs is NULL.

Fixes: c07691ae10 ("devargs: remove limit on parameters length")

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-03-02 19:40:54 +01:00
Neil Horman
c3615e4a80 eal: clean up export of socket id variable
Theres no need to export this variable.  Its set and queried from an API call
that doesn't exist in the hot path.  Instead just export the rte_socket_id
symbol and make the variable private to protect it from type changes.  We should
do this with the other exported variables too, but I think its too late in the
release cycle to do that.

tested using distributor_autotest (which uses rte_socket_id), successfully.
Only tested on linux, as I don't currently have a bsd system spun up, but the
changes are symmetric, and should be fine

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-03-02 19:40:20 +01:00
Jijiang Liu
ea5c23df30 i40e: advertise TSO capability
Advertise the DEV_TX_OFFLOAD_TCP_TSO flag in the PMD features. It means
that the i40e PMD supports the offload of TSO.

Test report: http://www.dpdk.org/ml/archives/dev/2015-March/014467.html

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
2015-03-02 07:36:30 +01:00
Jijiang Liu
6fa72b444c i40e: enable TSO support
This patch enables i40e TSO feature for both non-tunneling packet and
tunneling packet.

Test report: http://www.dpdk.org/ml/archives/dev/2015-March/014467.html

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Grzegorz Galkowski <grzegorz.galkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
2015-03-02 07:36:13 +01:00
Jijiang Liu
1f02d682d3 i40e: move Tx offloads parameters to separate structure
The structure size is u64 so it could be used with single cpu operation.

Test report: http://www.dpdk.org/ml/archives/dev/2015-March/014467.html

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Grzegorz Galkowski <grzegorz.galkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
2015-03-02 07:35:26 +01:00
Tetsuya Mukawa
e34550c8b9 null: fix build with gcc-4.7
This patch fixes following errors with gcc-4.7.
 lib/librte_pmd_null/rte_eth_null.c:302:28:
     error: array subscript is above array bounds

Reported-by: John McNamara <john.mcnamara@intel.com>
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: John McNamara <john.mcnamara@intel.com>
2015-02-28 00:22:00 +01:00
Tetsuya Mukawa
2929553854 null: fix build with icc
This patch fixes following errors with icc.
 rte_eth_null.c(47): error #83:
     type qualifier specified more than once

Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: John McNamara <john.mcnamara@intel.com>
2015-02-28 00:19:46 +01:00
Tetsuya Mukawa
911920812f ethdev: fix build with icc
This patch fixes following errors with icc.
  error #188: enumerated type mixed with another type
    return -1;

Fixes: 92d94d3744 ("ethdev: attach or detach port")

Reported-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: John McNamara <john.mcnamara@intel.com>
2015-02-28 00:14:51 +01:00
Changchun Ouyang
a8e3219a92 virtio: fix build on freebsd
This patch fixes the compilation issue on freebsd:
lib/librte_pmd_virtio/virtio_ethdev.c: In function 'virtio_resource_init':
lib/librte_pmd_virtio/virtio_ethdev.c:1071:56: error: unused parameter 'pci_dev' [-Werror=unused-parameter]

Fixes: da978dfdc4 ("virtio: use port IO to get PCI resource")

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
2015-02-28 00:12:37 +01:00
Thomas Monjalon
00c685634b mlx4: fix build
There was a missing change due by hotplug integration.

Fixes: 9f1653e7b7 ("ethdev: add device type")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-26 18:02:24 +01:00
Thomas Monjalon
ae438314b1 mlx4: remove old version compatibility
No need to check DPDK version. It just has to work on HEAD.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-26 18:02:17 +01:00
Thomas Monjalon
7609e66093 ethdev: fix build without hotplug
After setting CONFIG_RTE_LIBRTE_EAL_HOTPLUG=n, GCC stop compiling:
rte_ethdev.c:430:1: error: ‘rte_eth_dev_get_device_type’ defined but not used
rte_ethdev.c:438:1: error: ‘rte_eth_dev_save’ defined but not used
rte_ethdev.c:450:1: error: ‘rte_eth_dev_get_changed_port’ defined but not used
rte_ethdev.c:464:1: error: ‘rte_eth_dev_get_addr_by_port’ defined but not used
rte_ethdev.c:481:1: error: ‘rte_eth_dev_get_name_by_port’ defined but not used
rte_ethdev.c:503:1: error: ‘rte_eth_dev_is_detachable’ defined but not used

The hotplug option allows to build in environment (BSD) not yet
supported by this new feature.
It should be removed when BSD will be supported.
Waiting this day, let's fix build with hotplug disabled.

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-26 11:55:10 +01:00
Thomas Monjalon
b67578ccdf version: 2.0.0-rc1
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-26 00:41:57 +01:00
Tetsuya Mukawa
78d9a89884 null: support port hotplug
This patch adds port hotplug support to Null PMD.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-02-26 00:32:31 +01:00
Tetsuya Mukawa
c743e50c47 null: new poll mode driver
Null PMD is a driver of the virtual device particularly designed to measure
performance of DPDK PMDs. When an application call rx, Null PMD just allocates
mbufs and returns those. Also tx, the PMD just frees mbufs.

The PMD has following options.
- size: specify packe size allocated by RX. Default packet size is 64.
- copy: specify 1 or 0 to enable or disable copy while RX and TX.
	Default value is 0(disabled).
	This option is used for emulating more realistic data transfer.
	Copy size is equal to packet size.

To use the PMD, enable CONFIG_RTE_BUILD_SHARED_LIB in config file. Then
compile the PMD as shared library. The library can be linked using '-d'
option when an application invokes.

Here is an example.
$ sudo ./testpmd -c f -n 4 -d librte_pmd_null.so \
	--vdev 'eth_null0' --vdev 'eth_null1' -- -i --no-flush-rx

If testpmd is compiled with CONFIG_RTE_BUILD_SHARED_LIB, it may need to
specify more libraries using '-d' option.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-02-26 00:31:45 +01:00
Tetsuya Mukawa
c956caa6ea pcap: support port hotplug
This patch adds finalization code to free resources allocated by the
PMD.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-02-26 00:19:16 +01:00
Tetsuya Mukawa
92d94d3744 ethdev: attach or detach port
These functions are used for attaching or detaching a port.
When rte_eth_dev_attach() is called, the function tries to realize the
device name as pci address. If this is done successfully,
rte_eth_dev_attach() will attach physical device port. If not, attaches
virtual devive port.
When rte_eth_dev_detach() is called, the function gets the device type
of this port to know whether the port is come from physical or virtual.
And then specific detaching function will be called.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
9f1653e7b7 ethdev: add device type
This new parameter is needed to keep device type like PCI or virtual.
Port detaching processes are different between PCI device and virtual
device.
RTE_ETH_DEV_PCI indicates device type is PCI. RTE_ETH_DEV_VIRTUAL
indicates device is virtual.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
6c6829475b ethdev: close device
The patch adds function pointer to rte_pci_driver and eth_driver
structure. These function pointers are used when ports are detached.
Also, the patch adds rte_eth_dev_uninit(). So far, it's not called
by anywhere, but it will be called when port hotplug function is
implemented.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
36ec8585b2 ethdev: release port
This patch adds rte_eth_dev_release_port(). The function is used for
changing an attached status of the device that has specified name.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
c282abd2a6 ethdev: remove assumption that port will not be detached
To remove assumption, do like followings.

This patch adds "RTE_PCI_DRV_DETACHABLE" to drv_flags of rte_pci_driver
structure. The flags indicate the driver can detach devices at runtime.
Also, remove assumption that port will not be detached.

To remove the assumption.
- Add 'attached' member to rte_eth_dev structure.
  This member is used for indicating the port is attached, or not.
  DEV_ATTACHED indicates a port is attached.
  DEV_DETACHED indicates a port is detached.
- Add rte_eth_dev_allocate_new_port().
  This function is used for allocating new port.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
0fe11ec592 eal: add vdev init and uninit
The patch adds following functions.
- rte_eal_vdev_init();
- rte_eal_vdev_uninit();
- rte_eal_parse_devargs_str().
These functions are used for driver initialization and finalization.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
dbe6b4b61b pci: probe or close device
- Add pci_close_all_drivers()
  The function tries to find a driver for the specified device, and
  then close the driver.
- Add rte_eal_pci_probe_one() and rte_eal_pci_close_one()
  The functions are used for probe and close a device.
  First the function tries to find a device that has the specified
  PCI address. Then, probe or close the device.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:08:25 +01:00
Tetsuya Mukawa
b8be05722f pci: unmap igb_uio resources
The patch adds functions for unmapping igb_uio resources. The patch is only
for Linux and igb_uio environment. VFIO and BSD are not supported.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-26 00:03:07 +01:00
Tetsuya Mukawa
c0ce0577e8 pci: consolidate address comparisons
This patch replaces pci_addr_comparison() and memcmp() of pci addresses by
rte_eal_compare_pci_addr().

To compare PCI addresses, rte_eal_compare_pci_addr() doesn't use memcmp().
This is because sizeof(struct rte_pci_addr) returns 6, but actually
this structure is like below.

struct rte_pci_addr {
        uint16_t domain;                /**< Device domain */
        uint8_t bus;                    /**< Device bus */
        uint8_t devid;                  /**< Device ID */
        uint8_t function;               /**< Device function. */
};

If the structure is dynamically allocated in a function without bzero,
last 1 byte may have value. As a result, memcmp may not work.
To avoid such a case, rte_eal_compare_pci_addr() compare following values.

        dev_addr = (addr->domain << 24) | (addr->bus << 16) |
                                (addr->devid << 8) | addr->function;

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-25 23:38:00 +01:00
Michael Qiu
cc4fe5c45d pci: select memory mapping from driver type
With the driver type flag in struct rte_pci_dev, we do not need
to always map uio devices with vfio related function when
vfio enabled.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-02-25 23:38:00 +01:00
Michael Qiu
d9a8cd9595 pci: add kernel driver type
Currently, dpdk has no ability to know which type of driver(
vfio-pci/igb_uio/uio_pci_generic) the device used. It only can
check whether vfio is enabled or not statically.

It really useful to have the flag, because different type need to
handle differently in runtime. For example, pci memory map,
pot hotplug, and so on.

This patch add a flag field for pci device to solve above issue.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-02-25 23:38:00 +01:00
Panu Matilainen
4dc41c7be6 ixgbe: fix build with gcc 5
gcc 5 supports a new logical-not-parentheses warning which
ixgbe_common.c triggers, causing build failure with -Werror.
Since this source must not be modified, silence the warning instead.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-25 16:24:15 +01:00
Adrien Mazarguil
7fae69eeff mlx4: new poll mode driver
This PMD manages all variants of Mellanox ConnectX-3 (EN 40, EN 10, Pro EN
40) as well as their virtual functions in SR-IOV context through IB Verbs
(libibverbs) and the dedicated user-space driver (libmlx4).

It is disabled by default due to dependencies on these libraries and only
supports Linux userland at the moment partly because /sys (sysfs) support is
required.

Also claim responsibility in the MAINTAINERS file.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
2015-02-25 16:07:57 +01:00
Zhihong Wang
9144d6bcde eal/x86: optimize memcpy for SSE and AVX
Main code changes:

1. Differentiate architectural features based on CPU flags
    a. Implement separated move functions for SSE/AVX/AVX2 to make full utilization of cache bandwidth
    b. Implement separated copy flow specifically optimized for target architecture

2. Rewrite the memcpy function "rte_memcpy"
    a. Add store aligning
    b. Add load aligning based on architectural features
    c. Put block copy loop into inline move functions for better control of instruction order
    d. Eliminate unnecessary MOVs

3. Rewrite the inline move functions
    a. Add move functions for unaligned load cases
    b. Change instruction order in copy loops for better pipeline utilization
    c. Use intrinsics instead of assembly code

4. Remove slow glibc call for constant copies

Test report: http://dpdk.org/ml/archives/dev/2015-January/011848.html

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Tested-by: Jingguo Fu <jingguox.fu@intel.com>
Reviewed-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-25 11:50:53 +01:00
Robert Sanford
7085f6c738 timer: fix reset return value
- API rte_timer_reset() should return -1 when the timer is in the
RUNNING or CONFIG state. Instead, it ignores the return value of
internal function __rte_timer_reset() and always returns 0.
We change rte_timer_reset() to return the value returned by
__rte_timer_reset().

- Enhance timer stress test 2 to report how many timer reset
collisions occur, i.e., how many times rte_timer_reset() fails
due to a timer being in the CONFIG state.

Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-02-25 10:43:27 +01:00
Robert Sanford
bf2fef39b5 timer: pause in reset sync
In rte_timer_reset_sync(), insert rte_pause() into loop that waits
for rte_timer_reset() to succeed.

Signed-off-by: Robert Sanford <rsanford2@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-02-25 10:39:42 +01:00
Cunming Liang
9bacd17535 eal: fix missing symbol in version map
As per_lcore__socket_id and rte_sys_gettid are missing in version map,
it causes compiling error when CONFIG_RTE_BUILD_SHARED_LIB is enabled.

Fixes: ef76436c68 ("eal: get unique thread id")
Fixes: 9e29251b2a ("eal: thread affinity API")

Reported-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: John McNamara <john.mcnamara@intel.com>
2015-02-25 09:59:41 +01:00
Bruce Richardson
54991196e8 eal/linux: remove unnecessary check for primary instance
In pci_uio_map_resource we check that we are in a primary process
before calling pci_uio_set_bus_master. However, there is already
an earlier check which means that we are always in a primary instance
at this point in the code, so the check can be removed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-24 22:34:06 +01:00
Bruce Richardson
0fbbb63fbf eal/linux: populate uio maps from pci resources array
Rather than scanning the resource file in sysfs a second time, we
can pull the information on physical addresses of BARs from the
pci resource information already present in the dev structure.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-24 22:31:25 +01:00
Bruce Richardson
9e67561acd eal/linux: mmap uio resources using resourceX files
Instead of distinguishing the BAR mappings via offset within a single
file, originally /dev/uioX, switch to mapping each individual bar via
the appropriately numbered resourceX file.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-24 22:29:43 +01:00
Pawel Wodkowski
597b0f74e2 jobstats: new library
This library provide API to measure time spend in particular parts of
code and to calculate optimal polling time.

To calculate a those statistics application code need to be divided into
parts (called jobs) that do something. It is up to application to decide
what is considered a job.

Series of jobs must be surrounded with the rte_jobstats_context_start()
and rte_jobstats_context_finish() calls. After that, jobs might be
started.  Each job must be surrounded with rte_jobstats_start() and
rte_jobstats_finish() calls.

After job finishes its execution, period in which it should be called
again is adjusted. It might be used to minimize time wasted on
unnecessary polls/calls. Adjustment is based on data provided by job
itself (ex: number of packets it processed).

After all jobs in serie are executed fallowing statistics are updated
and might be used by application. Statistics can be reset. Some of
provided statistic data:
 - total/min/max execution - time spent in executing jobs.
 - total/min/max management - time spent outside execution area. This
value might be used to measure overhead of scheduling jobs. This time
also contains overhead of rte_jobstats library itself.
 - number of loops that executed at least one job
 - executed jobs
 - time when statistics were reset.

Each job provide total/min/max execution time and execution count
statistics.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-24 22:12:35 +01:00
David Marchand
23df14d1ba devargs: restore empty devargs
Following commit c07691ae10, an implicit change has been done in the
devargs API.
This triggers problem in virtual pmds that did not check for parameters
validity as it was implicitely valid.

Fix this by restoring the empty argument as "" and add a note in the api.
Restore associated tests.

Fixes: c07691ae10 ("devargs: remove limit on parameters length")

Reported-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 20:23:11 +01:00
Cunming Liang
4e01799aea ring: add optional yield to avoid spin forever
Add a sched_yield() syscall if the thread spins for too long,
waiting other thread to finish its operations on the ring.
That gives pre-empted thread a chance to proceed and finish
with ring enqueue/dequeue operation.
The purpose is to reduce contention on the ring.
By ring_perf_test, it doesn't shows additional perf penalty.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:23:07 +01:00
Cunming Liang
b4bee5f66a ring: support non-EAL thread
ring debug stat won't take care non-EAL thread.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:23:02 +01:00
Cunming Liang
30e6399892 mempool: support non-EAL thread
For non-EAL thread, bypass per lcore cache, directly use ring pool.
It allows using rte_mempool in either EAL thread or any user pthread.
As in non-EAL thread, it directly rely on rte_ring and it's none preemptive.
It doesn't suggest to run multi-pthread/cpu which compete the rte_mempool.
It will get bad performance and has critical risk if scheduling policy is RT.
Haven't found significant performance decrease by mempool_perf_test.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:57 +01:00
Cunming Liang
6295e793aa timer: support non-EAL thread
Allow to setup timers only for EAL (lcore) threads (__lcore_id < MAX_LCORE_ID).
E.g. – dynamically created thread will be able to reset/stop timer for lcore thread,
but it will be not allowed to setup timer for itself or another non-lcore thread.
rte_timer_manage() for non-lcore thread would simply do nothing and return straightway.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:52 +01:00
Cunming Liang
ca2e2dab07 spinlock: support non-EAL thread
In non-EAL thread, lcore_id always be LCORE_ID_ANY.
It can't be used as unique id for recursive spinlock.
Then use rte_gettid() to replace it.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:48 +01:00
Cunming Liang
fd4a5ce87d log: support non-EAL thread
For those non-EAL thread, *_lcore_id* is invalid and probably larger than RTE_MAX_LCORE.
The patch adds the check and allows only EAL thread using EAL per thread log level and log type.
Others shares the global log level.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:43 +01:00
Cunming Liang
3e10f2368a eal: initialize lcore and socket id
Set _lcore_id and _socket_id to (-1) by default.
For those non EAL thread, _lcore_id shall always be LCORE_ID_ANY.
The libraries using _lcore_id as index need to take care.
_socket_id always be SOCKET_ID_ANY until the thread changes the affinity
by rte_thread_set_affinity().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:39 +01:00
Cunming Liang
b94580d688 malloc: avoid unknown socket id
Add check for rte_socket_id(), avoid get unexpected return like (-1).
By using rte_malloc_socket(), socket id is assigned by socket_arg.
If socket_arg set to SOCKET_ID_ANY, it expects to use the socket id to which the current cores belongs.
As the thread may affinity on a cpuset, the cores in the cpuset may belongs to different NUMA nodes.
The value of _socket_id probably be SOCKET_ID_ANY(-1), the case is not expected in origin malloc_get_numa_socket().

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:34 +01:00
Cunming Liang
8baacdd30e eal: apply thread affinity by assigned cpuset
EAL threads use assigned cpuset to set core affinity during startup.
It keeps 1:1 mapping, if no '--lcores' option is used.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:29 +01:00
Cunming Liang
9e29251b2a eal: thread affinity API
1. add two TLS *_socket_id* and *_cpuset*
2. add one internal API, eal_cpu_socket_id/eal_thread_dump_affinity
3. add two public API, rte_thread_set/get_affinity
4. update EAL version map for EAL public API

The API works for both EAL thread and non EAL thread.
When calling rte_thread_set_affinity, the *_socket_id* and
*_cpuset* of calling thread will be updated if the thread
successfully set the cpu affinity.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:24 +01:00
Cunming Liang
ef76436c68 eal: get unique thread id
The rte_gettid() wraps the linux and freebsd syscall gettid().
It provides a persistent unique thread id for the calling thread.
It will save the unique id in TLS on the first time.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:19 +01:00
Cunming Liang
f8e0f0163a eal: get socket id from cpu id
It defines eal_cpu_socket_id() which exposing the origin private cpu_socket_id().
The function is only used inside EAL. It returns socket_id of the specified cpu_id.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:15 +01:00
Cunming Liang
a9b1c67a2c eal: fix strnlen return value with icc
The problem is that strnlen() here may return invalid value with 32bit icc.
(actually it returns it’s second parameter,e.g: sysconf(_SC_ARG_MAX)).
It starts to manifest hwen max_len parameter is > 2M and using icc –m32 –O2 (or above).

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:22:08 +01:00
Cunming Liang
53e54bf817 eal: new option --lcores for cpu assignment
It supports one new eal long option '--lcores' for EAL thread cpuset assignment.

The format pattern:
	--lcores='<lcores[@cpus]>[<,lcores[@cpus]>...]'
lcores, cpus could be a single digit/range or a group.
'(' and ')' are necessary if it's a group.
If not supply '@cpus', the value of cpus uses the same as lcores.

e.g. '1,2@(5-7),(3-5)@(0,2),(0,6),7-8' means starting 9 EAL thread as below
  lcore 0 runs on cpuset 0x41 (cpu 0,6)
  lcore 1 runs on cpuset 0x2 (cpu 1)
  lcore 2 runs on cpuset 0xe0 (cpu 5,6,7)
  lcore 3,4,5 runs on cpuset 0x5 (cpu 0,2)
  lcore 6 runs on cpuset 0x41 (cpu 0,6)
  lcore 7 runs on cpuset 0x80 (cpu 7)
  lcore 8 runs on cpuset 0x100 (cpu 8)

Test report: http://dpdk.org/ml/archives/dev/2015-February/013383.html

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:21:59 +01:00
Cunming Liang
798a71d703 eal: add cpuset into lcore config
The patch adds 'cpuset' into per-lcore configure 'lcore_config[]',
as the lcore no longer always 1:1 pinning with physical cpu.
The lcore now stands for a EAL thread rather than a logical cpu.

It doesn't change the default behavior of 1:1 mapping, but allows to
affinity the EAL thread to multiple cpus.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:21:54 +01:00
Cunming Liang
0e2e511b38 enic: fix bsd namespace conflict
Some macros already been defined by freebsd 'sys/param.h'.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:21:50 +01:00
Cunming Liang
ca31321cde eal/bsd: fix namespace conflict
Fix namespace with EAL prefix.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:21:14 +01:00
Cunming Liang
d55b8f3a49 eal/bsd: standardize init sequence between linux and bsd
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-24 20:20:52 +01:00
Panu Matilainen
6eb85c0e44 mk: fix build with Debian/Ubuntu-specific gcc version
Commit 71f0ab1849 broke compilation
on some versions of Debian and Ubuntu where gcc has been modified
to only emit MAJOR.MINOR part of the version from 'gcc -dumpversion'.
Drop the micro-version from gcc version comparisons to work around
this, it wasn't being used for anything anyway.

Fixes: 71f0ab1849 ("mk: rework gcc version detection to permit versions newer than 4.x")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-24 12:11:16 +01:00
Thomas Monjalon
a906cf28fd eal: add help option
Help is printed with -h or --help.

Help is also printed for an unknown option.
This was broken since the rework of options.

Fixes: 489a9d6c9f ("merge bsd and linux common options parsing")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-02-24 12:08:01 +01:00
Thomas Monjalon
97bf974ca2 eal: sort and align options lists
Options listing in usage help was a mess.
The main usage line is fixed and shorter.
The options in usage output are logically sorted (cpu/mem/dev/proc),
aligned and lightly reworded.
The options in declarations are alphabetically sorted.
Code in swith statement is not moved.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-02-24 11:57:33 +01:00
Stephen Hemminger
4dc01c1dd7 enic: change probe log message level
Drivers should be silent on boot.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-24 03:57:32 +01:00
Stephen Hemminger
518b590803 enic: replace use of printf with log
Device driver should log via DPDK log, not to printf which is
sends to /dev/null in a daemon application.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: David Marchand <david.marchand@6wind.com>
[Thomas: include rte_log.h]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 03:56:44 +01:00
Panu Matilainen
71f0ab1849 mk: rework gcc version detection to permit versions newer than 4.x
Separately comparing major and minor versions becomes seriously clumsy
when with major version changes, convert the entire version string into
a numeric value (ie 4.6.0 becomes 460 and 5.0.0 becomes 500) and use
that for comparisons, eliminate unnecessary negations while at it.
This makes the comparisons simpler, more obvious and makes gcc 5.0
naturally recognized at least as capable as newest 4.x.

This three-digit scheme would run into trouble if gcc ever went to
two-digit version segments, but that hasn't happened in the last 10+
years so it seems like a safe assumption.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 03:47:29 +01:00
Keith Wiles
9903387f32 ixgbe: remove unused function causing error with clang
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 03:18:55 +01:00
Jeff Shaw
44a7fe6e1b fm10k: fix clang warning flags
This commit fixes the following error which was reported when
compiling with clang by removing the option.

error: unknown warning option '-Wno-unused-but-set-variable'

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2015-02-24 03:13:32 +01:00
Jeff Shaw
e8f85d7c7d fm10k: fix build with unused debug function
This commit fixes the following error which was reported when
compiling with clang by moving the function inside an
RTE_LIBRTE_FM10K_DEBUG_RX ifdef block.

error: unused function 'dump_rxd'

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
2015-02-24 03:09:57 +01:00
Sergio Gonzalez Monroy
e1545b393a mbuf: fix a couple of doxygen comments
Fix a couple of doxygen comments in mbuf structure:
 - seqn had no doxygen syntax.
 - usr was not generating proper link to function.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 03:00:31 +01:00
Stefan Puiu
4db87f9739 lib: fix C++11 compilation
In C++11 concatenated string literals need to have a space in between.
Found with clang++-3.4, IIRC g++-4.8 also complains about this.

Sample error message:
error: invalid suffix on literal; C++11 requires a space between literal
and identifier [-Wreserved-user-defined-literal]

Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
2015-02-24 02:46:52 +01:00
Hemant Agrawal
3e12a98fe3 kni: optimize Rx burst
The current implementation of rte_kni_rx_burst polls the fifo for buffers.
Irrespective of success or failure, it allocates the mbuf and try to put them into the alloc_q
if the buffers are not added to alloc_q, it frees them.
This waste lots of cpu cycles in allocating and freeing the buffers if alloc_q is full.

The logic has been changed to:
1. Initially allocand add buffer(burstsize) to alloc_q
2. Add buffers to alloc_q only when you are pulling out the buffers.

Signed-off-by: Hemant Agrawal <hemant@freescale.com>
Reviewed-by: Jay Rolette <rolette@infiniteio.com>
2015-02-24 02:26:24 +01:00
Igor Ryzhov
e128e53879 lpm: fix overflow issue
LPM table overflow may occur if table is full and added rule has
the biggest depth that already have some rules.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-24 02:08:19 +01:00
Ildar Mustafin
dc783e74cf pipeline: fix port meta for non-default entries
Signed-off-by: Ildar Mustafin <imustafin@bk.ru>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-02-24 02:01:13 +01:00
Huawei Xie
dbfa62d63f vhost: support dynamically registering server
* support calling rte_vhost_driver_register after rte_vhost_driver_session_start
* add mutext to protect fdset from concurrent access
* add busy flag in fdentry. this flag is set before cb and cleared after cb is finished.

mutex lock scenario in vhost:

* event_dispatch(in rte_vhost_driver_session_start) runs in a separate thread, infinitely
processing vhost messages through cb(callback).
* event_dispatch acquires the lock, get the cb and its context, mark the busy flag,
and releases the mutex.
* vserver_new_vq_conn cb calls fdset_add, which acquires the mutex and add new fd into fdset.
* vserver_message_handler cb frees data context, marks remove flag to request to delete
connfd(connection fd) from fdset.
* after cb returns, event_dispatch
  1. clears busy flag.
  2. if there is remove request, call fdset_del, which acquires mutex, checks busy flag, and
removes connfd from fdset.
* rte_vhost_driver_unregister(not implemented) runs in another thread, acquires the mutex,
calls fdset_del to remove fd(listenerfd) from fdset. Then it could free data context.

The above steps ensures fd data context isn't freed when cb is using.

VM(s) should have been shutdown before rte_vhost_driver_unregister.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:17 +01:00
Huawei Xie
54292e9520 vhost: support ifname for vhost-user
for vhost-cuse, ifname is the name of the tap device
for vhost-user, ifname is the name of the unix domain socket path

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:16 +01:00
Huawei Xie
8f972312b8 vhost: support vhost-user
In rte_vhost_driver_register(), vhost unix domain socket listener fd is created
and added to polled(based on select) fdset.

In rte_vhost_driver_session_start(), fds in the fdset are checked for
processing. If there is new connection from qemu, connection fd accepted is
added to polled fdset. The listener and connection fds in the fdset are
then both checked. When there is message on the connection fd, its
callback vserver_message_handler is called to process vhost-user messages.

To support identifying which virtio is from which guest VM, we could call
rte_vhost_driver_register with different socket path. Virtio devices from
same VM will connect to VM specific socket. The socket path information is
stored in the virtio_net structure.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:15 +01:00
Huawei Xie
fbf7e07ca1 vhost: add select based event driven processing
for more generic event driven processing, refer to:
	http://libevent.org/

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:14 +01:00
Huawei Xie
9464a44160 vhost: implement cuse memory table
remove set_memory_table ops

vhost-cuse or vhost-user will both implement their own set_memory_region handler.

In current vhost-cuse implementation, guest numa memory isn't supported.
Assume that guest memory is backed by only one file.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
2015-02-24 01:38:14 +01:00
Huawei Xie
c89d3e5afd vhost: make host memory mapping more generic
This functions accepts a virtual address and pid(qemu), and maps it into
current process(vhost)'s address space.

The memory behind the virtual address should be backed by a file,
and virtual address should be the starting address.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:13 +01:00
Huawei Xie
6ca9df2812 vhost: copy host memory mapping to a new cuse file
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:12 +01:00
Huawei Xie
c2f60667bf vhost: move fd copying into cuse subdirectory
File descriptor is copied from qemu process into vhost process.
vhost-user doesn't need eventfd kernel module to copy fds between processes.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
2015-02-24 01:38:11 +01:00
Huawei Xie
34f4c46dc4 vhost: rename header file
Rename vhost-net-cdev.h to vhost-net.h.
This file defines common operations provided by virtio-net(.c).

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:10 +01:00
Huawei Xie
6ee36ead58 vhost: move cuse related handling in a subdirectory
Create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse.

vhost-cuse driver will be divided into two parts: cuse driver specific message
handling(in cuse directory) and common message handling(in virtio-net.c).

vhost ioctl message is pre-processed in cuse and then sent to virtio-net
if is not terminated.

virtio-net.c provides common message handling for both vhost-cuse and vhost-user.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:09 +01:00
Huawei Xie
04d696037a vhost: enable virtio control channel Rx mode
VIRTIO_NET_F_CTRL_RX is dependant on VIRTIO_NET_F_CTRL_VQ.
Observed that virtio-net driver in guest would crash with only CTRL_RX enabled.

In virtnet_send_command:

	/* Caller should know better */
	BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
		(out + in > VIRTNET_SEND_COMMAND_SG_MAX));

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:07 +01:00
Bruce Richardson
4dc294158c ethdev: support optional Rx and Tx callbacks
Add optional support for inline processing of packets inside the RX
or TX call. For an RX callback, what happens is that we get a set of
packets from the NIC and then pass them to a callback function, if
configured, to allow additional processing to be done on them, e.g.
filling in more mbuf fields, before passing back to the application.
On TX, the packets are similarly post-processed before being handed
to the NIC for transmission.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 00:38:27 +01:00
Bruce Richardson
1a9a0b9b02 ethdev: rename interrupt callbacks field
The 'callbacks' member of the rte_eth_dev structure has been renamed
to 'link_intr_cbs' to make it clear that it refers to callbacks from
NIC interrupts. This allows us to add other types of callbacks to
the structure without ambiguity.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 00:38:14 +01:00