Commit Graph

1548 Commits

Author SHA1 Message Date
Hemant Agrawal
3e12a98fe3 kni: optimize Rx burst
The current implementation of rte_kni_rx_burst polls the fifo for buffers.
Irrespective of success or failure, it allocates the mbuf and try to put them into the alloc_q
if the buffers are not added to alloc_q, it frees them.
This waste lots of cpu cycles in allocating and freeing the buffers if alloc_q is full.

The logic has been changed to:
1. Initially allocand add buffer(burstsize) to alloc_q
2. Add buffers to alloc_q only when you are pulling out the buffers.

Signed-off-by: Hemant Agrawal <hemant@freescale.com>
Reviewed-by: Jay Rolette <rolette@infiniteio.com>
2015-02-24 02:26:24 +01:00
Igor Ryzhov
e128e53879 lpm: fix overflow issue
LPM table overflow may occur if table is full and added rule has
the biggest depth that already have some rules.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-24 02:08:19 +01:00
Ildar Mustafin
dc783e74cf pipeline: fix port meta for non-default entries
Signed-off-by: Ildar Mustafin <imustafin@bk.ru>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2015-02-24 02:01:13 +01:00
Huawei Xie
dbfa62d63f vhost: support dynamically registering server
* support calling rte_vhost_driver_register after rte_vhost_driver_session_start
* add mutext to protect fdset from concurrent access
* add busy flag in fdentry. this flag is set before cb and cleared after cb is finished.

mutex lock scenario in vhost:

* event_dispatch(in rte_vhost_driver_session_start) runs in a separate thread, infinitely
processing vhost messages through cb(callback).
* event_dispatch acquires the lock, get the cb and its context, mark the busy flag,
and releases the mutex.
* vserver_new_vq_conn cb calls fdset_add, which acquires the mutex and add new fd into fdset.
* vserver_message_handler cb frees data context, marks remove flag to request to delete
connfd(connection fd) from fdset.
* after cb returns, event_dispatch
  1. clears busy flag.
  2. if there is remove request, call fdset_del, which acquires mutex, checks busy flag, and
removes connfd from fdset.
* rte_vhost_driver_unregister(not implemented) runs in another thread, acquires the mutex,
calls fdset_del to remove fd(listenerfd) from fdset. Then it could free data context.

The above steps ensures fd data context isn't freed when cb is using.

VM(s) should have been shutdown before rte_vhost_driver_unregister.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:17 +01:00
Huawei Xie
54292e9520 vhost: support ifname for vhost-user
for vhost-cuse, ifname is the name of the tap device
for vhost-user, ifname is the name of the unix domain socket path

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:16 +01:00
Huawei Xie
8f972312b8 vhost: support vhost-user
In rte_vhost_driver_register(), vhost unix domain socket listener fd is created
and added to polled(based on select) fdset.

In rte_vhost_driver_session_start(), fds in the fdset are checked for
processing. If there is new connection from qemu, connection fd accepted is
added to polled fdset. The listener and connection fds in the fdset are
then both checked. When there is message on the connection fd, its
callback vserver_message_handler is called to process vhost-user messages.

To support identifying which virtio is from which guest VM, we could call
rte_vhost_driver_register with different socket path. Virtio devices from
same VM will connect to VM specific socket. The socket path information is
stored in the virtio_net structure.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:15 +01:00
Huawei Xie
fbf7e07ca1 vhost: add select based event driven processing
for more generic event driven processing, refer to:
	http://libevent.org/

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:14 +01:00
Huawei Xie
9464a44160 vhost: implement cuse memory table
remove set_memory_table ops

vhost-cuse or vhost-user will both implement their own set_memory_region handler.

In current vhost-cuse implementation, guest numa memory isn't supported.
Assume that guest memory is backed by only one file.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
2015-02-24 01:38:14 +01:00
Huawei Xie
c89d3e5afd vhost: make host memory mapping more generic
This functions accepts a virtual address and pid(qemu), and maps it into
current process(vhost)'s address space.

The memory behind the virtual address should be backed by a file,
and virtual address should be the starting address.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:13 +01:00
Huawei Xie
6ca9df2812 vhost: copy host memory mapping to a new cuse file
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:12 +01:00
Huawei Xie
c2f60667bf vhost: move fd copying into cuse subdirectory
File descriptor is copied from qemu process into vhost process.
vhost-user doesn't need eventfd kernel module to copy fds between processes.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
2015-02-24 01:38:11 +01:00
Huawei Xie
34f4c46dc4 vhost: rename header file
Rename vhost-net-cdev.h to vhost-net.h.
This file defines common operations provided by virtio-net(.c).

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:10 +01:00
Huawei Xie
6ee36ead58 vhost: move cuse related handling in a subdirectory
Create vhost_cuse directory and move vhost-net-cdev.c into vhost_cuse.

vhost-cuse driver will be divided into two parts: cuse driver specific message
handling(in cuse directory) and common message handling(in virtio-net.c).

vhost ioctl message is pre-processed in cuse and then sent to virtio-net
if is not terminated.

virtio-net.c provides common message handling for both vhost-cuse and vhost-user.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:09 +01:00
Huawei Xie
04d696037a vhost: enable virtio control channel Rx mode
VIRTIO_NET_F_CTRL_RX is dependant on VIRTIO_NET_F_CTRL_VQ.
Observed that virtio-net driver in guest would crash with only CTRL_RX enabled.

In virtnet_send_command:

	/* Caller should know better */
	BUG_ON(!virtio_has_feature(vi->vdev, VIRTIO_NET_F_CTRL_VQ) ||
		(out + in > VIRTNET_SEND_COMMAND_SG_MAX));

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2015-02-24 01:38:07 +01:00
Bruce Richardson
4dc294158c ethdev: support optional Rx and Tx callbacks
Add optional support for inline processing of packets inside the RX
or TX call. For an RX callback, what happens is that we get a set of
packets from the NIC and then pass them to a callback function, if
configured, to allow additional processing to be done on them, e.g.
filling in more mbuf fields, before passing back to the application.
On TX, the packets are similarly post-processed before being handed
to the NIC for transmission.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 00:38:27 +01:00
Bruce Richardson
1a9a0b9b02 ethdev: rename interrupt callbacks field
The 'callbacks' member of the rte_eth_dev structure has been renamed
to 'link_intr_cbs' to make it clear that it refers to callbacks from
NIC interrupts. This allows us to add other types of callbacks to
the structure without ambiguity.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-24 00:38:14 +01:00
Jingjing Wu
2ccabd8cd1 i40e: enable internal switch of PF
This patch enables PF's internal switch by setting ALLOWLOOPBACK
flag when VEB is created. With this patch, traffic from PF can be
switched on the VEB.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/013237.html

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
2015-02-15 07:39:43 +01:00
Jingjing Wu
ac9491a96d i40e: fix vsi configuration
In i40e_vsi_config_tc_queue_mapping, should add a flag to indicate
another valid setting by OR operation, but not set this flag to
valid_sections, otherwise it will overwrite the flags set before.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/013237.html

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Tested-by: Min Cao <min.cao@intel.com>
2015-02-15 07:39:12 +01:00
Helin Zhang
c9223a2bf5 i40e: workaround for XL710 performance
On XL710, performance number is far from the expectation on recent
firmware versions, if promiscuous mode is disabled, or promiscuous
mode is enabled and port MAC address is equal to the packet
destination MAC address. The fix for this issue may not be
integrated in the following firmware version. So the workaround in
software driver is needed. For XL710, it needs to modify the initial
values of 3 internal only registers, which are the same as X710.
Note that the values for X710 and XL710 registers could be different,
and the workaround can be removed when it is fixed in firmware in
the future.

Test report: http://www.dpdk.org/ml/archives/dev/2015-February/012749.html

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
2015-02-15 07:32:44 +01:00
Dan Aloni
90a1633b23 eal/linux: allow to map BARs with MSI-X tables
While VFIO doesn't allow us to map complete BARs with MSI-X tables,
it does allow us to map around them in PAGE_SIZE granularity. There
might be adapters that provide their registers in the same BAR
but on a different page. For example, Intel's NVME adapter, though
not a network adapter, provides only one MMIO BAR that contains
the MSI-X table.

Signed-off-by: Dan Aloni <dan@kernelim.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2015-02-23 21:57:31 +01:00
Sergio Gonzalez Monroy
4769bc5a27 mbuf: remove build option to disable refcnt
This patch removes all references to RTE_MBUF_REFCNT, setting the refcnt
field in the mbuf struct permanently.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-23 19:31:24 +01:00
Sergio Gonzalez Monroy
e8b9ef877e mbuf: introduce indirect attached flag
Currently for mbufs with refcnt, we cannot free mbufs with external memory
buffers (ie. vhost zero copy), as they are recognized as indirect
attached mbufs and therefore we free the direct mbuf it points to,
resulting in an error in the case of external memory buffers.

We solve the issue by introducing the IND_ATTACHED_MBUF flag, which indicates
that the mbuf is an indirect attached mbuf pointing to another mbuf.
When we free an mbuf, we only free the direct mbuf if the flag is set.
Freeing an mbuf with external buffer is the same as freeing a non attached mbuf.
The flag is set during attach and clear on detach.

So in the case of vhost zero copy where we have mbufs with external
buffers, by default we just free the mbuf and it is up to the user to deal with
the external buffer.

This patch would allow the removal of the RTE_MBUF_REFCNT config option,
setting refcnt for all mbufs permanently.

The patch also modifies the vhost example as it was using the
RTE_MBUF_INDIRECT macro to detect if it was an mbuf with external buffer.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-23 19:27:06 +01:00
Marc Sune
16eaf25232 kni: add build option to disable preempting
This patch introduces CONFIG_RTE_KNI_PREEMPT_DEFAULT flag. When set to 'no',
KNI kernel thread(s) do not call schedule_timeout_interruptible(), which
improves overall KNI performance at the expense of CPU cycles (polling).

Default values is 'yes', maintaining the same behaviour as of now.

Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-23 18:49:19 +01:00
Yerden Zhumabekov
614289298d hash: slice CRC data into 8-byte pieces
Calculating hash for data of variable length is more efficient
when that data is sliced into 8-byte pieces. The rest part of data
is hashed using CRC32 functions with either 8 and 4 byte operands.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:30:05 +01:00
Yerden Zhumabekov
8bae1da2af hash: fallback to software CRC32 implementation
Initially, SSE4.2 support is detected via the constructor function.

Added rte_hash_crc_set_alg() function to detect and set CRC32
implementation if necessary. SSE4.2 is allowed by default.

rte_hash_crc_*byte() functions reworked so they choose available
CRC32 implementation in the runtime.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:24:05 +01:00
Yerden Zhumabekov
d2b989045f hash: add CRC function for 8 bytes
SSE4.2 provides CRC32 intrinsic with 8-byte operand.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:23:13 +01:00
Yerden Zhumabekov
e068294180 hash: replace built-in functions implementing SSE4.2
Give up using built-in intrinsics and use our own assembly
implementation. Remove #include entry as well.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:15:58 +01:00
Yerden Zhumabekov
00bf774bab hash: add assembly implementation of CRC32 intrinsics
Added:
- crc32c_sse42_u32() emits 'crc32l' asm instruction;
- crc32c_sse42_u64() emits 'crc32q' asm instruction;
- crc32c_sse42_u64_mimic(), wrapper in case of run on 32-bit platform.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:14:41 +01:00
Yerden Zhumabekov
d983cf4169 hash: add software CRC32 implementation
Add lookup tables for CRC32 algorithm, crc32c_1word() and
crc32c_2words() functions returning hash of 32-bit and 64-bit
operand.

Signed-off-by: Yerden Zhumabekov <e_zhumabekov@sts.kz>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-02-23 18:13:09 +01:00
Jijiang Liu
352378ede8 i40e: support NVGRE in Rx tunnel filtering
The filter types supported are listed below for NVGRE packet:
   1. Inner MAC and Inner VLAN ID.
   2. Inner MAC address, inner VLAN ID and tenant ID.
   3. Inner MAC and tenant ID.
   4. Inner MAC address.
   5. Outer MAC address, tenant ID and inner MAC address.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2015-02-23 16:29:58 +01:00
Jijiang Liu
09ba608848 ether: add transparent ethernet bridging type
Add an Ethernet type definition for Transparent Ethernet Bridging.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2015-02-23 16:25:55 +01:00
Helin Zhang
b12964f621 ethdev: unification of RSS offload types
RSS offload types were defined separately for 1/10G and 40G NICs,
and have no relationship with flow types. The modifications are to
unify all RSS offload types for all PMDs. Unified RSS offload types
have new and common names which can be used for any PMD or
applications, and decouple from specific hardwares.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: merge with fm10k]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-22 23:56:20 +01:00
Helin Zhang
7fa96d696f ethdev: unification of flow types
Flow types was defined actually for i40e hardware specifically,
and wasn't able to be used for defining RSS offload types of all
PMDs. It removed the enum flow types, and uses macros instead
with new names. The new macros can be used for defining RSS
offload types later. Also modifications are made in i40e and
testpmd accordingly.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: merge with new flow director API]
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-22 23:56:20 +01:00
Helin Zhang
ea4c89aeee ethdev: fix size of flow type mask array
It wrongly calculates the size of the flow type mask array. The fix
is to align the flow type maximum index ID with the number of
element bit width, and then divide the number of element bit width.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 23:56:20 +01:00
Helin Zhang
38eef41d6c ethdev: minor comment changes
Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 23:56:20 +01:00
Helin Zhang
138c9c4023 i40e: remove some useless line breaks
Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 23:56:20 +01:00
Thomas Monjalon
12cf4d6e0a ethdev: remove old ethertype filter ABI
The old ethertype filter API was removed in commit 75db20648,
but was still in (newly integrated) version map for ABI.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-22 04:06:16 +01:00
Jingjing Wu
083b5148e8 ethdev: remove old ntuple filter API
Following structures are removed:
 - rte_2tuple_filter
 - rte_5tuple_filter
Following APIs are removed:
 - rte_eth_dev_add_2tuple_filter
 - rte_eth_dev_remove_2tuple_filter
 - rte_eth_dev_get_2tuple_filter
 - rte_eth_dev_add_5tuple_filter
 - rte_eth_dev_remove_5tuple_filter
 - rte_eth_dev_get_5tuple_filter
It also move macros TCP_*_FLAG to rte_eth_ctrl.h, and removes the macro
TCP_UGR_FLAG which is duplicated with TCP_URG_FLAG.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
[Thomas: remove also from version map]
2015-02-22 04:05:36 +01:00
Jingjing Wu
4c54a7e7bd ixgbe: migrate ntuple filter to new API
This patch defines new functions dealing with ntuple filters which is
corresponding to 5tuple in HW.
It removes old functions which deal with 5tuple filters.
Ntuple filter is dealt with through entrance ixgbe_dev_filter_ctrl.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-22 04:04:53 +01:00
Jingjing Wu
fdaa76971a igb: migrate ntuple filter to new API
This patch defines new functions dealing with ntuple filters which is
corresponding to 2tuple filter for 82580 and i350 in HW, and to 5tuple
filter for 82576 in HW.
It removes old functions which deal with 2tuple and 5tuple filters in igb driver.
Ntuple filter is dealt with through entrance eth_igb_filter_ctrl.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-22 04:04:53 +01:00
Jingjing Wu
5e4944a1d6 ethdev: new ntuple filter API
This patch defines ntuple filter type RTE_ETH_FILTER_NTUPLE and its structure rte_eth_ntuple_filter.
It also corrects the typo TCP_UGR_FLAG to TCP_URG_FLAG

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-22 04:04:53 +01:00
Jingjing Wu
222928b3dd ethdev: remove old syn filter API
Structure rte_syn_filter is removed.
Following APIs are removed:
  - rte_eth_dev_add_syn_filter
  - rte_eth_dev_remove_syn_filter
  - rte_eth_dev_get_syn_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: remove also from version map]
2015-02-22 03:09:05 +01:00
Jingjing Wu
afad6c82c6 ixgbe: migrate syn filter to new API
This patch defines new functions dealing with syn filter.
It removes old functions which deal with syn filter.
Syn filter is dealt with through entrance ixgbe_dev_filter_ctrl.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 03:09:04 +01:00
Jingjing Wu
dbd7cfcfe1 igb: migrate syn filter to new API
This patch defines new functions dealing with syn filter.
It removes old functions of syn filter in igb driver.
Syn filter is dealt with through entrance eth_igb_filter_ctrl.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 03:09:04 +01:00
Jingjing Wu
fd28727da2 ethdev: new syn filter API
This patch defines syn filter type RTE_ETH_FILTER_SYN and its structure rte_eth_syn_filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 02:53:33 +01:00
Jingjing Wu
942524d605 ethdev: remove old flex filter API
Structure rte_flex_filter is removed.
Following APIs are removed:
  - rte_eth_dev_add_flex_filter
  - rte_eth_dev_remove_flex_filter
  - rte_eth_dev_get_flex_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
[Thomas: remove also from version map]
2015-02-22 02:53:29 +01:00
Jingjing Wu
231d43909a igb: migrate flex filter to new API
This patch defines new functions dealing with flex filter.
It removes old functions of flex filter in igb driver.
Syn filter is dealt with through entrance eth_igb_filter_ctrl.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 02:53:24 +01:00
Jingjing Wu
2d72306de1 ethdev: new flex filter API
This patch defines flex filter type RTE_ETH_FILTER_FLEXIBLE and its structure rte_eth_flex_filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2015-02-22 02:48:31 +01:00
Jingjing Wu
81f38e676e ixgbe: support flow director flush
This patch implement RTE_ETH_FILTER_FLUSH operation to delete all
flow director filters in ixgbe driver.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:34:53 +01:00
Jingjing Wu
fc84329941 ixgbe: migrate flow director info and statistic to new API
This patch changes the get info operation to be implemented through
filter_ctrl API and RTE_ETH_FILTER_INFO/RTE_ETH_FILTER_STATS ops.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:31:51 +01:00
Jingjing Wu
76c6f89e80 ixgbe: support new flow director masks
This patch implement the mask configuration of flow director filter,
by using the mask defined in rte_fdir_conf instead of callback function
fdir_set_masks.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:29:05 +01:00
Jingjing Wu
2d4c1a9ea2 ethdev: add new flow director masks
This patch defines structure rte_eth_fdir_masks.
It extends rte_fdir_conf and rte_eth_fdir_info to contain mask's configuration.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:24:32 +01:00
Jingjing Wu
299191e0c9 ethdev: remove flexbytes offset from flow director
This patch removes the flexbytes_offset from rte_fdir_conf, because
the flexible payload setting is done by flex_conf instead of flexbytes_offset.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:14:31 +01:00
Jingjing Wu
d54a988826 ixgbe: support flexpayload configuration of flow director
This patch implement the flexpayload configuration of flow director filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 01:14:21 +01:00
Jingjing Wu
1145eec608 ethdev: extend flow type and flexible payload type for flow director
This patch adds RTE_ETH_FLOW_TYPE_RAW and RTE_ETH_RAW_PAYLOAD to support the
flexible payload is started from the beginning of the packet.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 00:59:11 +01:00
Jingjing Wu
f9072f8b90 ixgbe: migrate flow director filtering to new API
This patch changes the add/delete/update operations to be implemented through
filter_ctrl API and RTE_ETH_FILTER_ADD/RTE_ETH_FILTER_DELETE/RTE_ETH_FILTER_UPDATE ops.
It also removes the callback functions:
 - ixgbe_eth_dev_ops.fdir_add_signature_filter
 - ixgbe_eth_dev_ops.fdir_update_signature_filter
 - ixgbe_eth_dev_ops.fdir_remove_signature_filter
 - ixgbe_eth_dev_ops.fdir_add_perfect_filter
 - ixgbe_eth_dev_ops.fdir_update_perfect_filter
 - ixgbe_eth_dev_ops.fdir_remove_perfect_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-22 00:58:03 +01:00
Danny Zhou
c112df6875 eal/linux: toggle interrupt for uio_pci_generic
enable/disable interrupt by manipulating a control bit of command
register on NIC's PCIe configuration space.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Tested-by: Qun Wan <qun.wan@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-20 23:34:40 +01:00
Danny Zhou
4a499c6495 eal/linux: enable uio_pci_generic support
Change the EAL PCI code so that it can work with both the
uio_pci_generic in-tree driver, as well as the igb_uio
DPDK-specific driver.

This involves changes to
1) Modify method of retrieving BAR resource mapping information
2) Mapping using resource files in /sys rather than /dev/uio*
2) Setup bus master bit in NIC's PCIe configuration space for
uio_pci_generic.

Signed-off-by: Danny Zhou <danny.zhou@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-20 23:34:31 +01:00
Daniel Mrzyglod
200866003a bond: rename mode 5
This patch modify mode older name from
BONDING_MODE_ADAPTIVE_TRANSMIT_LOAD_BALANCING to BONDING_MODE_TLB
This patch also changes order of TEST_ASSERT macro in
test_tlb_verify_slave_link_status_change_failover.

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-20 23:07:02 +01:00
Michal Jastrzebski
457ecf2953 bond: add debug info for mode 6
This patch add some debug information when using link bonding mode 6.
It prints basic information about ARP packets on RX and TX (MAC, ip,
packet number, arp packet type).
If CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB == y.
If CONFIG_RTE_LIBRTE_BOND_DEBUG_ALB_L1 is enabled instead of previous
one, use show command to see IPv4 balancing from clients.

Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-20 23:07:01 +01:00
Maciej Gajdzica
06fe78b98c bond: add mode 6
This mode includes adaptive TLB and receive load balancing (RLB). In RLB
the bonding driver intercepts ARP replies send by local system and
overwrites its source MAC address, so that different peers send data to
the server on different slave interfaces. When local system sends ARP
request, it saves IP information from it. When ARP reply from that peer
is received, its MAC is stored, one of slave MACs assigned and ARP reply
send to that peer.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-20 22:17:52 +01:00
Maciej Gajdzica
31db4d38de net: change arp header struct declaration
Changed MAC address type from uint8_t[6] to struct ether_addr and IP
address type from uint8_t[4] to uint32_t to make it consistent with
other DPDK code using MAC and IP addresses. It allows us to use
is_same_ether_addr and ether_addr_copy functions on MAC addresses in ARP header.  Also
removed union from arp_hdr struct to make calls to arp_data items
shorter. Updated test-pmd to match new arp_hdr version.

Signed-off-by: Maciej Gajdzica <maciejx.t.gajdzica@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
[Thomas: doxygenize comments]
2015-02-20 22:10:52 +01:00
Ouyang Changchun
e033c005c2 virtio: remove Rx hotspots from early return
Remove those hotspots which is unnecessary when early returning occurs.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:21 +01:00
Ouyang Changchun
da978dfdc4 virtio: use port IO to get PCI resource
Make virtio not require UIO for some security reasons, this is to match
6WIND's virtio-net-pmd.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:20 +01:00
Stephen Hemminger
32c118fd00 virtio: free mbuf's with threshold
This makes virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbuf's until the ring entry was reused which caused
more memory usage than needed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:19 +01:00
Stephen Hemminger
5186fb1f37 virtio: set MAC address
Need to have do special things to set default mac address.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:16 +01:00
Stephen Hemminger
1b306359e5 virtio: suport multiple MAC addresses
Virtio support multiple MAC addresses.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:13 +01:00
Stephen Hemminger
9f4f2846ef virtio: support vlan filtering
Virtio supports vlan filtering.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:10 +01:00
Stephen Hemminger
a85786dc81 virtio: fix states handling during initialization
Change order of initialization to match Linux kernel.
Don't blow away control queue by doing reset when stopped.

Calling dev_stop then dev_start would not work.
Dev_stop was calling virtio reset and that would clear all queues
and clear all feature negotiation.
Resolved by only doing reset on device removal.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:19:05 +01:00
Stephen Hemminger
fb9d170496 virtio: move allocation before initialization
If allocation fails, don't want to leave virtio device stuck
in middle of initialization sequence.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:57 +01:00
Ouyang Changchun
894bb9cb0e virtio: use soft vlan strip with mergeable Rx packets
To keep the consistent logic with normal Rx path, the mergeable
Rx path also needs software vlan strip/decap if it is enabled.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:56 +01:00
Stephen Hemminger
1d5ced9176 virtio: use soft vlan encap/decap
Implement VLAN stripping in software. This allows application
to be device independent.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:52 +01:00
Stephen Hemminger
c974021a59 ether: add soft vlan encap/decap
It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate this in software. This allows application
to be device independent.

Avoid discarding shared mbufs. Make a copy in rte_vlan_insert() of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:47 +01:00
Stephen Hemminger
9ffd50826f virtio: remove redundant alignment field
Since vq_alignment is constant (always 4K), it does not
need to be part of the vring struct.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:44 +01:00
Stephen Hemminger
c7bca33e42 virtio: remove unnecessary adapter structure
Cleanup virtio code by eliminating unnecessary nesting of
virtio hardware structure inside adapter structure.
Also allows removing unneeded macro, making code clearer.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:42 +01:00
Stephen Hemminger
8d6d7e5cb3 virtio: support link state interrupt
Virtio has link state interrupt which can be used.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:39 +01:00
Stephen Hemminger
069739780b virtio: allow starting with link down
Starting driver with link down should be ok, it is with every
other driver. So just allow it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:35 +01:00
Ouyang Changchun
8c09c20fb4 virtio: fix update of vring descriptor index
Updating the vring descriptor index should be done before notifying host;
Remove 2 duplicated store memory barriers in both Rx and Tx path because there is
store memory barrier in vq_update_avail_idx function;
Notify the host only if packets actually transmitted(nb_tx > 0).

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:34 +01:00
Stephen Hemminger
faf79b7854 virtio: use weaker barriers
The DPDK driver only has to deal with the case of running on PCI
and with SMP. In this case, the code can use the weaker barriers
instead of using hard (fence) barriers. This will help performance.
The rationale is explained in Linux kernel virtio_ring.h.

To make it clearer that this is a virtio thing and not some generic
barrier, prefix the barrier calls with virtio_.

Add missing (and needed) barrier between updating ring data
structure and notifying host.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:26 +01:00
Stephen Hemminger
dec08c28c0 virtio: check packet headroom at compile time
Better to check at compile time than fail at runtime.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:22 +01:00
Stephen Hemminger
a8c0f835a7 virtio: make a function local
Make vtpci_get_status a local function as it is used in one file.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:18:09 +01:00
Stephen Hemminger
41d7fcf5e3 virtio: rearrange resource initialization
For clarity make the setup of PCI resources for Linux into a function rather
than block of code #ifdef'd in middle of dev_init.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-02-20 19:17:45 +01:00
Panu Matilainen
3c21004855 i40e: fix build with gcc 5
Eliminate ambiguity in the condition which trips up a "logical not
is only applied to the left..." warning from gcc 5, causing build
failure with -Werror. Besides non-ambiguous, the condition is
far more obvious this way.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-20 15:10:09 +01:00
Thomas Monjalon
485e82c3f1 ixgbe: forbid building vpmd without Rx bulk alloc
CONFIG_RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC is a prerequisite
of CONFIG_RTE_IXGBE_INC_VECTOR.

Reported-by: Alexander Belyakov <abelyako@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-20 12:04:05 +01:00
Bruce Richardson
51d4554896 ixgbe: fix vector receive of chained mbuf
When the vector pmd was receiving a mix of packets of various sizes,
some of which were split across multiple mbufs, there was an issue
with reassembly of the jumbo frames. This was due to a skipped increment
when using "continue" in a while loop. Changing the loop to a "for"
loop fixes this problem, by ensuring the increment is always performed.

Reported-by: Prashant Upadhyaya <praupadhyaya@gmail.com>
Reported-by: Martin Weiser <martin.weiser@allegro-packets.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Martin Weiser <martin.weiser@allegro-packets.com>
2015-02-20 11:58:07 +01:00
Xuelin Shi
3658eb95fe ixgbe: fix big endian access
ixgbe is little endian, but cpu maybe not.
add necessary conversions.
    rte_cpu_to_le_32(...) for PCI write
    rte_le_to_cpu_32(...) for PCI read.

Signed-off-by: Xuelin Shi <xuelin.shi@freescale.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-20 11:52:29 +01:00
Xuelin Shi
89626cb570 e1000: fix big endian access
e1000 is little endian, but cpu maybe not.
add necessary conversions.

rte_cpu_to_le_32(...) for PCI write
rte_le_to_cpu_32(...) for PCI read.

Signed-off-by: Xuelin Shi <xuelin.shi@freescale.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-20 11:44:52 +01:00
Sergio Gonzalez Monroy
a8c50a37b3 eal/linux: fix fscanf format mismatch
Variables are unsigned int but format scans for signed int.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-20 11:08:29 +01:00
Sergio Gonzalez Monroy
90f1adc3bd eal: fix dynamic link with virtio
Building shared libraries and using virtio PMD results in undefined
reference to 'rte_eal_iopl_init'.

Add missing function to eal version map.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-20 11:04:52 +01:00
Olivier Matz
ac09cd9732 cmdline: fix check in port list parsing
The argument ressize contains the size of the result buffer which
should be large enough to store the parsed result of a token. In
this case, it should be larger or equal to sizeof(cmdline_portlist_t)
(4 bytes), not PORTLIST_TOKEN_SIZE which is the max size of the token
string.

This is not a critical, it fixes cases where the total length of the
parsed instruction is greater than the maximum.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2015-02-20 10:50:16 +01:00
Michal Jastrzebski
960d1081c0 bond: change warning
Remove function name from warning.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 19:07:34 +01:00
Tomasz Kulasek
6107d0f738 bond: fix symbol export
rte_eth_bond_8023ad_setup and rte_eth_bond_8023ad_conf_get entries need
to be exported to be used by test application for shared libraries
compilation.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 18:58:55 +01:00
Tomasz Kulasek
f111e60427 ring: fix MAC per device
Initialization procedure fix to allow per device MAC configuration.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 18:58:55 +01:00
Tomasz Kulasek
bc6e95d597 ring: add MAC address add/remove
Added MAC addr add/remove dummy callbacks.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 18:58:55 +01:00
Tomasz Kulasek
7ab04035e6 ring: add link up/down
Link up and down implementation.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 18:58:55 +01:00
Declan Doherty
0e0a4018a7 bond: fix memory leak on kvargs processing failure
identified by klockwork scan

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-02-18 18:12:58 +01:00
Jia Yu
a720aadd44 ethdev: return status of stats read
rte_eth_stats_get is to get physical device statistics. Without
return status, caller does not know whether function fails or not
(i.e. invalid port_id, driver has no stats_get callback). Caller
cannot differiente normal 0 stats or error case. This fix adds
a return status to the function.

Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-18 17:50:15 +01:00
Jia Yu
99610782ed ethdev: update link status after start
Since LSR interrupt is disabled by pmd drivers, link status in rte_eth_device is always down.
Bond slave_configure() enables LSR interrupt on devices to get notification if link status
changes. However, the LSC interrupt at device start time is still lost.

In this fix, call link_update to read link status from hardware
register at device start time.

Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-02-18 17:50:15 +01:00
Sergio Gonzalez Monroy
b70b56032b reorder: new library
This library provides reordering capability for out of order mbufs based
on a sequence number in the mbuf structure.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Signed-off-by: Richardson Bruce <bruce.richardson@intel.com>
Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-18 16:52:05 +01:00
David Marchand
c07691ae10 devargs: remove limit on parameters length
As far as I know, there is no reason why we should have a limit on the length of
parameters that can be given for a device.
Remove this limit by using dynamic allocations.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-18 12:16:38 +01:00
David Marchand
0215a4c61f devargs: indent and cleanup
Prepare for next commit.
Fix some indent issues, refactor error code.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-02-18 12:16:20 +01:00
Jeff Shaw
4c287332c3 fm10k: add PF and VF interrupt handling
1. Add functions to enable PF/VF interrupt.
2. Add function to process error message passed from interrupt.
2. Add 2 interrupt handling functions, one for PF and one for VF.
2. Enable interrupt after completing initialization of NIC.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
c0eb314107 fm10k: add VF support
fm10k pmd driver will support both PF and VF device with single
copy of code. The reason is NIC maps registers with same
function in PF and VF to same PCI I/O address. Then, PF/VF drivers
use same address to access registers belonging to it, HW will
translate the request to correct units.

For some functionalities that are unique to PF, driver will check
current driver type and behave correctly.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
fbf3e0f911 fm10k: add vlan filter
Add fm10k_vlan_filter_set to set vlan.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
c82dd0a7bf fm10k: add scatter receive
1. Add fm10k_recv_scattered_pkts function to receive jumbo frame
   and multi-segment packets.
2. Configure correct receive function in rx_init and dev_init.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
57033cdf8f fm10k: add PF RSS
1. Configure RSS in fm10k_dev_rx_init function.
2. Add fm10k_rss_hash_update and fm10k_rss_hash_conf_get to get
   and inquery RSS configuration.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
4b61d3bfa9 fm10k: add receive and tranmit
1. Add fm10k_recv_pkts and fm10k_xmit_pkts functions.
2. Link app function pointer to actual fm10k recv/xmit
   functions.
3. Change Makefile to compile new file fm10k_rxtx.c

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
9ae6068c86 fm10k: add dev start/stop
1. Add function to initialize RX queues.
2. Add function to initialize TX queues.
3. Add fm10k_dev_start, fm10k_dev_stop and fm10k_dev_close
   functions.
4. Add function to close mailbox service.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
387bf3006c fm10k: add Rx/Tx single queue start/stop
1. Add 4 functions fm10k_dev_rx_queue_start,
   fm10k_dev_rx_queue_stop, fm10k_dev_tx_queue_start,
   and fm10k_dev_tx_queue_stop.
2. verify Rx packet buffer alignment is valid.
   Hardware requires specific alignment for Rx packet buffers. At
   least one of the following two conditions must be satisfied.
       1) Address is 512B aligned
       2) Address is 8B aligned and buffer does not cross 4K boundary.

   Alignment is checked by the driver when the Rx queue is reset. It
   is assumed that if an entire descriptor ring can be filled with
   buffers containing valid alignment, then all buffers in that mempool
   have valid address alignment. It is the responsibility of the user
   to ensure all buffers have valid alignment, as it is the user who
   creates the mempool.

   It is assumed the buffer needs only to store a maximum size Ethernet
   frame.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
98068e0e04 fm10k: add Tx queue setup/release
Add fm10k_tx_queue_setup and fm10k_tx_queue_release functions.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
6cfe8969c9 fm10k: add Rx queue setup/release
Add fm10k_rx_queue_setup and fm10k_rx_queue_release functions.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
7e72cbfa32 fm10k: add reta query/update
1. Add fm10k_reta_update and fm10k_reta_query functions.
2. Add fm10k_link_update and fm10k_dev_infos_get functions.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
a6061d9e70 fm10k: register PF driver
1. Add init function to scan and initialize fm10k PF device.
2. Add implementation to register fm10k pmd PF driver.
3. Add 3 functions fm10k_dev_configure, fm10k_stats_get and
   fm10k_stats_get.
4. Add fm10k.h to define macros and basic data structure.
5. Add fm10k_logs.h to control log message output.
6. Change config/common_bsdapp and config/common_linuxapp, add
   macros to control fm10k pmd driver compile for linux and bsd.
7. Add Makefile.
8. Change lib/Makefile to add fm10k driver into compile list.
9. Change mk/rte.app.mk to add fm10k lib into link.
10. Add ABI version of librte_pmd_fm10k

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
fe92cff407 eal: add fm10k device id
Add fm10k device ID list into rte_pci_dev_ids.h.

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Jeff Shaw
7223d200c2 fm10k: add base driver
Base driver is developed and maintained by Intel ND team, includes
basic functional service to Intel Ethernet Switch FM10000 Series
of silicons.
Any suggestion on bug fix and improvement within this directory is
welcome, but need this team to change and update.

Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
2015-02-17 15:25:30 +01:00
Olivier Matz
3e11f931b8 i40e: add debug logs for Tx context descriptors
This could be useful to have this values for debug purposes.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:18 +01:00
Olivier Matz
0c0c43705c i40e: fix offloading of outer checksum for IPIP tunnel
When offloading the checksums of ipip tunnels, m->l2_len is set to 0
as there is no tunnel or inner l2 header. Since this is a valid value
remove the test.

By the way, also remove the same test with l3_len because at this
point, it is expected that the software provides proper values in the
mbuf. It should avoid a test in dataplane processing and therefore
slightly increase performance.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:18 +01:00
Jijiang Liu
2af384931c i40e: advertise outer IPv4 checksum capability
Advertise the DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM flag in the PMD
features. It means that the i40e PMD supports the offload of outer IP
checksum when transmitting tunneling packet.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-02-16 19:21:18 +01:00
Jijiang Liu
bb58239767 ethdev: add outer IP checksum capability flag
If the flag is advertised by a PMD, the NIC supports the outer IP
checksum TX offload of tunneling packets, therefore an application can
set the PKT_TX_OUTER_IP_CKSUM flag in mbufs when transmitting on this
port.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2015-02-16 19:21:18 +01:00
Olivier Matz
c9433176b3 mbuf: remove UDP tunnel flag
Since previous commit, the flag PKT_TX_UDP_TUNNEL_PKT is not used by any PMD,
remove it from mbuf API and from csumonly (testpmd). In csumonly, the
PKT_TX_OUTER_IP_CKSUM flag is already set for vxlan checksum, providing
enough information to the underlying driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
59cfa76a22 i40e: remove the use of UDP tunnel flag
The definition of the flag PKT_TX_UDP_TUNNEL_PKT in rte_mbuf.h was:
  TX packet is an UDP tunneled packet. It must be specified when using
  outer checksum offload (PKT_TX_OUTER_IP_CKSUM)

This flag was used to tell the NIC that the offload type is UDP
(I40E_TXD_CTX_UDP_TUNNELING flag). In the datasheet, it says it's
required to specify the tunnel type in the register. However, some tests
(see [1]) showed that it also works without this flag.

Moreover, it is not explained how the hardware use this
information. From a network perspective, this information is useless for
calculating the outer IP checksum as it does not depend on the payload.

Having this flag in the API would force the application to specify the
tunnel type for something that looks only useful for this PMD. It will
limit the number of possible tunnel types (we would need a flag for each
tunnel type) and therefore prevent to support outer IP checksum for
proprietary tunnels.

Finally, if a hardware advertises "I support outer IP checksum", it must
be supported for any payload types.

This has been validated by [2], knowing that the ipip test case was fixed
after this test report [3].

[1] http://dpdk.org/ml/archives/dev/2015-January/011380.html
[2] http://dpdk.org/ml/archives/dev/2015-January/011475.html
[3] http://dpdk.org/ml/archives/dev/2015-January/011610.html

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
e3f0151fc6 i40e: enable Tx checksum only for offloaded packets
From i40e datasheet:

  The IP header type and its offload. In case of tunneling, the IIPT
  relates to the inner IP header. See also EIPT field for the outer
  (External) IP header offload.

  00 - non IP packet or packet type is not defined by software
  01 - IPv6 packet
  10 - IPv4 packet with no IP checksum offload
  11 - IPv4 packet with IP checksum offload

Therefore it is not needed to fill the IIPT field if no offload is
requested (we can keep the value to 00). For instance, the linux driver
code does not set it when (skb->ip_summed != CHECKSUM_PARTIAL). We can
do the same in the dpdk driver.

The function i40e_txd_enable_checksum() that fills the offload registers
can only be called for packets requiring an offload.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
609dd68ef1 mbuf: enhance the API documentation of offload flags
Based on http://dpdk.org/ml/archives/dev/2015-January/011127.html

Also adapt the csum forward engine code to the API.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Olivier Matz
fac4c750b2 mbuf: remove flag alias for IP checksum
The alias PKT_TX_IPV4_CSUM is only used in one place of i40e driver.
Remove it and only keep the legacy flag PKT_TX_IP_CSUM.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2015-02-16 19:21:17 +01:00
Cunming Liang
b0920f481a ixgbe: fix link issue in loopback mode
In loopback mode, it's expected force link up even when there's no cable connect.
But in codes, setup_sfp() rewrites the related register.
It causes in the case 'multispeed_fiber', it can't link up without cable connect.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Patrick Lu <patrick.lu@intel.com>
2015-02-15 17:09:31 +01:00
Michael Qiu
ed2547b68f pci: fix max VFs for non igb_uio drivers
max_vfs will only be created by igb_uio driver, for other
drivers like vfio or pci_uio_generic, max_vfs will miss.

But sriov_numvfs is not driver related, just get the vf numbers
from that field.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2015-02-13 14:48:16 +01:00
Panu Matilainen
f30dcd03f9 cmdline: fix link due to missing symbol in version map
cmdline_token_portlist_ops fell through cracks in the initial symbol
versioning patch, breaking pktgen build.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-02-04 23:24:44 +01:00
Neil Horman
133b75923b mk: add library version extension
To differentiate libraries that break ABI, we add a library version number
suffix to the library, which must be incremented when a given libraries ABI is
broken.  This patch enforces that addition, sets the initial abi soname
extension to 1 for each library and creates a symlink to the base SONAME so that
the test applications will link properly.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2015-02-03 16:56:58 +01:00
Neil Horman
9d41beed24 lib: provide initial versioning
Add linker version script files to each DPDK library to put a stake in the
ground from which we can start cleaning up API's

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2015-02-03 16:56:58 +01:00
Neil Horman
166a743c53 compat: add infrastructure to support symbol versioning
Add initial pass header files to support symbol versioning.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2015-02-03 16:56:58 +01:00
Helin Zhang
782c8c92f1 i40e: add hash configuration
Hash filter control has been implemented for i40e. It includes
getting/setting,
- global hash configurations (hash function type, and symmetric
  hash enable per flow type)
- symmetric hash enable per port

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-02 16:13:54 +01:00
Helin Zhang
a98ff8a3c3 ethdev: add hash configuration
In order to support hash filter configuration, filter type of hash
is added, also the corresponding structures, macros and definitions
are added.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-02 16:13:54 +01:00
Helin Zhang
d0f2f39bf4 ethdev: move some comments
Added code style fixes.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-02 16:13:54 +01:00
Helin Zhang
d27e5d3cde i40e: use constant as default hash keys
Calculating the default RSS hash keys at run time is not needed
at all, and may have race conditions. The alternative is to use
array of random values which were generated manually as the
default hash keys.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-02 16:13:19 +01:00
Declan Doherty
ecd9d5193b bond: remove offload flags from transmit policy checks
The Link bonding library is incorrectly using receive packet type flags
in the transmit policy hashing functions, which would cause packets
generated locally to be incorrectly distributed across the slave
devices. This patch completely removes the dependency on the packet
type flags and uses the ether_type from either the Ethernet header or
the VLAN headers for branching.

This patch also includes the associate changes in the test suite and in
the packet_burst_generator code to remove the dependences on the packet
type flags.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
2015-02-02 12:30:33 +01:00
Pawel Wodkowski
ac1593b22a bond: fix headers for C++
Add missing declarations to rte_bond_8023ad.h.
Remove 'extern "C"' declarations from bond private header file.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2015-02-02 12:30:33 +01:00
Thomas Monjalon
7e60e08397 acl: remove standalone header
This is a duplication of some EAL parts for a standalone packaging
which is not documented.
Packaging should be done outside of DPDK.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-02-02 12:30:33 +01:00
Konstantin Ananyev
17f520d2cf acl: add comments about internal layout
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:12:16 +01:00
Konstantin Ananyev
f3d24368ef acl: remove unused constant
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:26 +01:00
Konstantin Ananyev
62945e029e acl: introduce config parameter for performance/space trade-off
If at build phase we don't make any trie splitting,
then temporary build structures and resulting RT structure might be
much bigger than current.
>From other side - having just one trie instead of multiple can speedup
search quite significantly.
>From my measurements on rule-sets with ~10K rules:
RT table up to 8 times bigger, classify() up to 80% faster
than current implementation.
To make it possible for the user to decide about performance/space trade-off -
new parameter for build config structure (max_size) is introduced.
Setting it to the value greater than zero, instructs  rte_acl_build() to:
- make sure that size of RT table wouldn't exceed given value.
- attempt to minimise number of tries in the table.
Setting it to zero maintains current behaviour.
That introduces a minor change in the public API, but I think the possible
performance gain is too big to ignore it.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:26 +01:00
Konstantin Ananyev
a0e3310e7a acl: deduplicate some SSE and AVX2 code
Vector code reorganisation/deduplication:
To avoid maintaining two nearly identical implementations of calc_addr()
(one for SSE, another for AVX2), replace it with a new macro that suits
both SSE and AVX2 code-paths.
Also remove no needed any more MM_* macros.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
cf59b29bb9 acl: move SSE dwords shuffle
Reorganise SSE code-path a bit by moving lo/hi dwords shuffle
out from calc_addr().
That allows to make calc_addr() for SSE and AVX2 practically identical
and opens opportunity for further code deduplication.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
4269eae463 acl: use scalar method fastest for some cases
Previous improvements made scalar method the fastest one
for tiny bunch of packets (< 4).
That allows us to remove specific vector code-path for small number of packets
(search_sse_2) and always use scalar method for such cases.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
5dd71363bf acl: add AVX2 classify method
Introduce new classify() method that uses AVX2 instructions.

>From my measurements:
On HSW boards when processing >= 16 packets per call,
AVX2 method outperforms it's SSE counterpart by 10-25%,
(depending on the ruleset).

When build with the compilers that don't support AVX2 instructions,
make rte_acl_classify_avx2() do nothing and return an error.
At runtime, if librte_acl was build with the compiler that supports AVX2,
this method is selected as default one on HW that supports AVX2.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
da826b7135 eal: introduce ymm type for AVX 256-bit
New data type to manipulate 256 bit AVX values.
Rename field in the rte_xmm to keep common naming across SSE/AVX fields.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
3858b90d82 acl: deduplicate a bit of RT code
Move common check for input parameters up into rte_acl_classify_alg().

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
5c0e6c3de5 acl: make scalar RT code more similar to vector one
Make classify_scalar to behave in the same way as it's vector counterpart:
move match check out of the inner loop, etc.
That makes scalar and vector code look more identical.
Plus it improves scalar code performance.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
a726650857 acl: simplify match nodes allocation
Right now we allocate indexes for all types of nodes, except MATCH,
at 'gen final RT table' stage.
For MATCH type nodes we are doing it at building temporary tree stage.
This is totally unnecessary and makes code more complex and error prone.
Rework the code and make MATCH indexes being allocated at the same stage
as all others.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
ec51901a0b acl: introduce DFA nodes compression (group64) for identical entries
Introduced division of whole 256 child transition enties
into 4 sub-groups (64 kids per group).
So 2 groups within the same node with identical children,
can use one set of transition entries.
That allows to compact some DFA nodes and get space savings in the RT table,
without any negative performance impact.
>From what I've seen an average space savings: ~20%.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
d4132664d8 acl: fix overwritten matches
There was a bug at build phase that can cause matches beeing overwritten.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
c11f17d7f9 acl: remove build phase heuristic with negative performance effect
Current rule-wildness based heuristics can cause unnecessary splits of
the ruleset.
That might have negative performance effect:
more tries to traverse, bigger RT tables.
After removing it, on some test-cases with big rulesets (~10K)
observed ~50% speedup.
No difference for smaller rulesets.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
efb2529385 acl: make data indexes long enough to survive idle transitions
Make data_indexes long enough to survive idle transitions.
That allows to simplify match processing code.
Also fix incorrect size calculations for data indexes.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Konstantin Ananyev
47b1402088 acl: fix build in standalone mode
Fix compilation with RTE_LIBRTE_ACL_STANDALONE=y

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-28 17:11:25 +01:00
Michael Qiu
bfca21f8a0 ixgbe: add queue start failure check
For ixgbe, when queue start fails, for example, mbuf allocate
failure, the device will still start successfully, which could be
an issue.

Add return status check of queue start to avoid this issue.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 15:09:29 +01:00
Ouyang Changchun
d0bd0db1ce ixgbe: remove an useless check in VF RSS
To follow up the comments from Pawel Wodkowski, remove this unnecessary check,
as check_mq_mode has already check the queue number in device configure stage,
if the queue number of vf is not correct, it will return error code and exit,
so it doesn't need check again here in device start stage (note: pf_host_configure
is called in device start stage).

Fixes: 42d2f78abc ("configure VF RSS")

Suggested-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-01-27 12:36:24 +01:00
Stephen Hemminger
359ae1cc65 ixgbe: initialize link status on initialization
The link_status variable is not set when device is initialized.
This can lead to problems with link never being reported as up
if using some SFP modules where the link is instantly on.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:36:24 +01:00
Stephen Hemminger
4e4510dfff ethdev: remove useless stats zeroing in drivers
The rte_eth_stats_get is the only API that should call the device
statistics function directly, and it already does a memset of the
resulting structure since commit 02331c16ec. Therefore doing
memset() in the driver is redundant and should be removed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[David: remove also in igbvf and pcap PMDs]
Acked-By: David Marchand <david.marchand@6wind.com>
2015-01-27 12:36:23 +01:00
Marc Sune
9f516206d9 ip_frag: fix header for C++
Add missing extern 'C' decls in rte_ip_frag.h.

Fixes: 601e279df0 ("move fragmentation/reassembly headers into a library")

Signed-off-by: Marc Sune <marc.sune@bisdn.de>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:32:45 +01:00
Pablo de Lara
d10096c692 power: fix link with application
rte_power_freq_min function did not include "extern" keyword,
causing linking errors.

Fixes: 445c6528b5 ("power: common interface for guest and host")

Reported-by: Ildar Mustafin <imustafin@bk.ru>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:30:36 +01:00
Remi Pommarel
616dc65bea pcap: fix device name
Ethernet device's data should contain the virtual device name for pcap port.
This name is correctly set by rte_eth_dev_allocate() at initialization time,
but it is directly lost.

Fixes: 83b4113693 ("ethdev: add unique name to devices")

Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:25:32 +01:00
Remi Pommarel
dd9fecd0bc eal: fix enabled core number with -l option
When using core list argument to define which core to enable (ie -l) the
core_num field of the rte configuration is not updated the same way as using
coremask. This causes rte_lcore_num() to yield different value from the one
using coremask.

Fixes: d888cb8b96 ("add core list input format")

Signed-off-by: Remi Pommarel <repk@triplefau.lt>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-27 12:22:40 +01:00
Jingjing Wu
75db20648e ethdev: remove old ethertype filter
Structure rte_ethertype_filter is removed.
Following APIs are removed:
  - rte_eth_dev_add_ethertype_filter
  - rte_eth_dev_remove_ethertype_filter
  - rte_eth_dev_get_ethertype_filter

It is replaced by filter_ctrl API.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-01-20 09:14:53 +01:00
Jingjing Wu
748198ae82 ixgbe: use generic filter control for ethertype filter
This patch removes old functions which deal with ethertype filter in ixgbe driver.
It also defines ixgbe_dev_filter_ctrl which is binding to filter_ctrl API,
and ethertype filter can be dealt with through this new entrance.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-01-20 09:14:53 +01:00
Jingjing Wu
dd5c2dd7e5 igb: use generic filter control for ethertype filter
This patch removes old functions which deal with ethertype filter in igb driver.
It also defines eth_igb_filter_ctrl which is binding to filter_ctrl API,
and ethertype filter can be dealt with through this new entrance.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2015-01-20 09:14:53 +01:00
Declan Doherty
8e3e065016 mem: fix alignment parameter check
In commit 2fc8d6d the behaviour of function rte_is_power_of_2 was
changed to not return true for 0. memzone_reserve_aligned_thread_unsafe
and rte_malloc_socket both make the assumption that for align = 0
!rte_is_power_of_2(align) will return false. This patch adds a check
that align parameter is non-zero before doing the power of 2 check.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
[Thomas: use && operator instead of ternary ?: and fix precedence with parens]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-19 09:36:55 +01:00
Ouyang Changchun
42d2f78abc ixgbe: configure VF RSS
It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable VF RSS.

The psrtype will determine how many queues the received packets will distribute to,
and the value of psrtype should depends on both facet: max VF rxq number which
has been negotiated with PF, and the number of rxq specified in config on guest.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-01-18 23:13:57 +01:00
Ouyang Changchun
db858dd021 ethdev: check VMDq RSS mode
Check mq mode for VMDq RSS, handle it correctly instead of returning an error;
Also remove the limitation of per pool queue number has max value of 1, because
the per pool queue number could be 2 or 4 if it is VMDq RSS mode;

The number of rxq specified in config will determine the mq mode for VMDq RSS.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-01-18 23:12:44 +01:00
Ouyang Changchun
46b14c2311 ixgbe: get VF queue number
Get the available Rx and Tx queue number when receiving IXGBE_VF_GET_QUEUES
message from VF.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-01-18 23:12:06 +01:00
Ouyang Changchun
6972eae614 ixgbe: negotiate VF API version
Negotiate API version with VF when receiving the IXGBE_VF_API_NEGOTIATE message.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-01-18 23:11:41 +01:00
Ouyang Changchun
d0a4d79832 ixgbe: code cleanup
Put global register configuring out of loop for queue; also fix typo and indent.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Reviewed-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-01-18 23:10:46 +01:00
Tomasz Kulasek
b4f60f5557 eal/linux: check fscanf return when parsing modules list
The lack of result checking of fscanf function, breaks compilation
for default "-Werror=unused-result" flag.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2015-01-16 17:35:40 +01:00
Bruce Richardson
01e66999ef nic_uio: fix thread structure compatibility for future FreeBSD
Replace d_thread_t with struct thread in nic_uio.

Ref: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=196691
Quote:
"The d_thread_t typedef is a compat shim to support FreeBSD 4.x.
I'm planning to remove this shim from 11 and dpdk is very unlikely
to ever be ported to 4.x.
If it does it will need far more changes than just d_thread_t"

Reported-by: John Baldwin <jhb@freebsd.org>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2015-01-16 14:43:16 +01:00
Declan Doherty
7b00a204bb bond: fix vlan flag interpretation
This patch contains a fix for link bonding handling of vlan tagged packets in mode 3 and 5.
Currently xmit_slave_hash function misinterprets the PKT_RX_VLAN_PKT flag to mean that
there is a vlan tag within the packet when in actually means that there is a valid entry
in the vlan_tci field in the mbuf.

- Fixed VLAN tag support in hashing functions.
- Adds support for TCP in layer 4 header hashing.
- Splits transmit hashing function into separate functions for each policy to
  reduce branching and to make the code clearer.
- Fixed incorrect flag set in test application packet generator.

Test report: http://dpdk.org/ml/archives/dev/2015-January/010792.html

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
2015-01-15 13:41:39 +01:00
Michael Qiu
021d935a03 vfio: avoid enabling while the module is not loaded
When vfio module is not loaded when kernel support vfio feature,
the routine still try to open the container to get file
description.

This action is not safe, and of course got error messages:

EAL: Detected 40 lcore(s)
EAL:   unsupported IOMMU type!
EAL: VFIO support could not be initialized
EAL: Setting up memory...

This may make user confuse, this patch make it reasonable
and much more smooth to user.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2015-01-15 13:41:39 +01:00
Stephen Hemminger
985cba0c5c log: remove unnecessary stubs
The read/seek/close stub functions are unnecessary on the
log stream.  Per glibc fopencookie man page:

       cookie_read_function_t *read
              If *read is a null pointer, then reads from  the  custom  stream
              always return end of file.

       cookie_seek_function_t *seek
              If *seek is a null pointer, then it is not possible  to  perform
              seek operations on the stream.

       cookie_close_function_t *close
              If  *close is NULL, then no special action is performed when the
              stream is closed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-15 13:41:39 +01:00
Vlad Zolotarov
8cf8fe6c3e mem: search only dpdk hugetlbfs maps
When scanning the hugetlbfs maps search only for the DPDK maps.
This will allow the application create its own hugetlbfs mappings
and use the DPDK facilities on the same hugetlbfs mount point.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-15 13:41:39 +01:00
Pawel Wodkowski
8d62edecaa ethdev: fix missing parenthesis in mac check
Fix check introduced in commit 4bdefaade6 (VMDQ enhancements).

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-15 13:41:39 +01:00
Ravi Kerur
2fc8d6daa4 eal: fix check for power of 2 in 0 case
rte_is_power_of_2 returns true for 0 and 0 is not power_of_2.
Fix by checking for n.

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2015-01-15 13:41:39 +01:00
Thomas Monjalon
ec0b5f4fbd version: 2.0.0-rc0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-01-15 10:55:25 +01:00
Thomas Monjalon
6fb3161060 version: 1.8.0
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-20 00:38:39 +01:00
Balazs Nemeth
494a1f991e ixgbevf: fix link state
This patch fixes checking the link state of a virtual function. If the
state has already been checked, it does not need to be checked
again. Previously, get_link_status in the ixgbe_hw struct was used to
track if the information had already been retrieved, but this field
was always set to false (signifying that the information was
up-to-date). The problem was introduced by commit 8ef32003 which was
part of a patch set to update the ixgbe portion of the PMD. This patch
does not break consistency with the ixgbevf driver. Instead, it fixes
the problem at the level of DPDK.

Applications that rely on the reported link speed could fail without
this patch. The qos_sched example application provided with DPDK did
not run when virtual functions were used. The output for this example
application is shown below:

EAL: Error - exiting with code: 1
  Cause: Unable to config sched subport 0, err=-2

The problem and the effect of the patch can been seen by running the
l2fwd example application using the following command:

sudo ./build/l2fwd -c 0x3 -n 4 -- -p 0x3 -T 0

Before the patch has been applied (with both links up):
...
Checking link statusdone
Port 0 Link Up - speed 100 Mbps - half-duplex

Port 1 Link Up - speed 100 Mbps - half-duplex

L2FWD: entering main loop on lcore 1
...

After the patch has been applied (with both links up):
...
Checking link statusdone
Port 0 Link Up - speed 10000 Mbps - full-duplex
Port 1 Link Up - speed 10000 Mbps - full-duplex
L2FWD: entering main loop on lcore 1
...

Before the patch has been applied (with link 0 down, link 1 up):
...
Checking link statusdone
Port 0 Link Up - speed 100 Mbps - half-duplex

Port 1 Link Up - speed 100 Mbps - half-duplex

L2FWD: entering main loop on lcore 1
...

After the patch has been applied (with link 0 down, link 1 up):
...
Checking link status............................................................
..............................done
Port 0 Link Down
Port 1 Link Up - speed 10000 Mbps - full-duplex
...

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2014-12-19 23:30:26 +01:00
Michael Qiu
740b397145 ixgbe: fix secondary process start
EAL:   probe driver: 8086:10fb rte_ixgbe_pmd
EAL:   PCI memory mapped at 0x7f18c2a00000
EAL:   PCI memory mapped at 0x7f18c2a80000
Segmentation fault (core dumped)

This is introduced by commit: 46bc9d75
	ixgbe: fix multi-process support
When start primary process with command line:
./app/test/test -n 1 -c ffff -m 64
then start the second one:
./app/test/test -n 1 --proc-type=secondary --file-prefix=rte
This segment-fault will occur.

Root cause is test app on primary process only starts device, but
the queue need initialized by manually command line.
So the tx queue is still NULL when secondary process startup.

Reported-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-12-19 23:30:25 +01:00
Sujith Sankar
390ea71091 enic: use eal to manage interrupts
This patch removes the interrupt registration code which was under the flag
VFIO_PRESENT and relies on the rte_lib code for the same.
This also ignores the initial trigger of ISR from the lib.

Signed-off-by: Sujith Sankar <ssujith@cisco.com>
2014-12-19 23:30:25 +01:00
Daniel Mrzyglod
dd6590fe2f af_packet: fix possible memory leak
In rte_pmd_init_internals, we are mapping memory but not released
if error occurs it could produce memory leak.
Add unmmap function to release memory.

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
Acked-by: John W. Linville <linville@tuxdriver.com>
2014-12-19 23:30:25 +01:00
Daniel Mrzyglod
74ab022182 af_packet: fix memory allocation checks
In rte_eth_af_packet.c we are we are missing NULL pointer
checks after calls to allocate memory for queues.
Add checking NULL pointer and error handling.

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-12-18 22:52:39 +01:00
Neil Horman
53d81e2d43 xenvirt: fix build break on ethernet address parsing
Back in commit aaa662e75c ("cmdline: fix overflow on bsd"),
the author failed to fixup a call to cmdline_parse_etheraddr in xenvirt.
This patch makes the needed correction to avoid a build break.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-18 22:52:39 +01:00
Ciara Loftus
f5a31522f0 vhost: add interface name for virtio
This patch fixes the issue whereby when using userspace vhost ports
in the context of vSwitching, the name provided to the hypervisor/QEMU
of the vhost tap device needs to be exposed in the library, in order
for the vSwitch to be able to direct packets to the correct device.
This patch introduces an 'ifname' member to the virtio-net structure
which is populated with the tap device name when QEMU is brought up
with a vhost device.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Anthony Fee <anthonyx.fee@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2014-12-18 22:52:38 +01:00
Jincheng Miao
a8aac461cb kni: fix build on CentOS 6.6
From CentOS 6.6, function skb_set_hash is introduced, this breaks
the previous assumption. So modify RHEL_RELEASE_VERSION from 7.0
to 6.6 to fix build for rte_kni.ko.

Related mail from Barak Enat:
http://dpdk.org/ml/archives/dev/2014-December/010124.html

building error likes:
  CC [M]  lib/librte_eal/linuxapp/kni/e1000_82575.o
In file included from lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_osdep.h:41,
                 from lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_hw.h:31,
                 from lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_api.h:31,
                 from lib/librte_eal/linuxapp/kni/e1000_82575.c:38:
lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3870: error: conflicting types for ‘skb_set_hash’
include/linux/skbuff.h:620: note: previous definition of ‘skb_set_hash’ was here

Reported-by: Barak Enat <barak@saguna.net>
Signed-off-by: Jincheng Miao <jincheng.miao@gmail.com>
2014-12-18 22:52:38 +01:00
Thomas Monjalon
7d6378efb4 version: 1.8.0-rc6
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-18 00:33:19 +01:00
Declan Doherty
a0399ce10f bond: check null before use
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2014-12-18 00:26:08 +01:00
Declan Doherty
86b426c82d bond: fix pci table allocation check
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2014-12-18 00:26:08 +01:00
Declan Doherty
3b6581a3d3 bond: check bounds before assigning active slave count
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2014-12-18 00:26:08 +01:00
Helin Zhang
8ae50a6ba9 i40e: fix build with some toolchains
Compile warning which is treated as error occurs on Oracle Linux
(kernel 2.6.39, gcc 4.4.7) as below, or RHEL, CentOS. Aliasing
'struct i40e_aqc_debug_reg_read_write' should be avoided. Use the
elements inside that structure directly can fix the issue.

lib/librte_pmd_i40e/i40e_ethdev.c: In function 'eth_i40e_dev_init':
lib/librte_pmd_i40e/i40e_ethdev.c:5318: error: dereferencing pointer
  'cmd' does break strict-aliasing rules
lib/librte_pmd_i40e/i40e_ethdev.c:5314: note: initialized from here

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
2014-12-18 00:26:08 +01:00
Jincheng Miao
190a61b14c kni: fix build with kernel < 2.6.35 and vhost debug enabled
Seen on RHEL-6.5:
lib/librte_eal/linuxapp/kni/kni_vhost.c:222:
	error: ‘struct socket’ has no member named ‘wq’
lib/librte_eal/linuxapp/kni/kni_vhost.c:313:
	error: implicit declaration of function ‘sk_sleep’
lib/librte_eal/linuxapp/kni/kni_vhost.c:313:
	error: passing argument 1 of ‘__wake_up’ makes pointer from integer without a cast
include/linux/wait.h:146: note: expected ‘struct wait_queue_head_t *’
	but argument is of type ‘int’
lib/librte_eal/linuxapp/kni/kni_vhost.c:580:
	error: assignment makes pointer from integer without a cast

RHEL6.5 kernel is based on 2.6.32. But there are two changing
from 2.6.35:
1. socket struct is changed
It wrappered previous wait_queue_head_t of socket to
struct socket_wq. So for the kernel older than 2.6.35, we should
directly use socket->wait instead.

2. new function sk_sleep()
This function is implemented from 2.6.35 to obtain wait queue
from struct sock. This patch adds a macro in kni/compat.h
to be compatible with older kernels.

Patch is tested in RHEL6.5 and RHEL7.0 with:
CONFIG_RTE_LIBRTE_KNI=y
CONFIG_RTE_KNI_KO_DEBUG=y
CONFIG_RTE_KNI_VHOST=y
CONFIG_RTE_KNI_VHOST_MAX_CACHE_SIZE=1024
CONFIG_RTE_KNI_VHOST_VNET_HDR_EN=y
CONFIG_RTE_KNI_VHOST_DEBUG_RX=y
CONFIG_RTE_KNI_VHOST_DEBUG_TX=y

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
2014-12-18 00:26:08 +01:00
Stephen Hemminger
db949b51d8 ethdev: fix mtu comment
In commit 59d0ecdbf0 ("MTU accessors"),
max_frame_size was replaced with mtu.
Default size is ETHER_MTU = 1500.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-18 00:26:08 +01:00
Thomas Monjalon
3c3028111f version: 1.8.0-rc5
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-17 01:04:07 +01:00
Bruce Richardson
b5af5c7967 af_packet: fix crash on initialization failure
The cleanup code on error checks for *internals being NULL only after
using the pointer to perform other cleanup. Fix this by moving the
clean-up based on the pointer inside the check for NULL.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 01:04:06 +01:00
Pablo de Lara
c7ba32604d vmxnet3: fix default Tx configuration
Since commit fbde27f19a "get default Rx/Tx configuration from dev info",
a default RX/TX configuration can be used for all PMDs.
In case of vmxnet3, the whole structure was zeroed and not filled out.
The PMD does not support multi segments or offload functions,
so txq_flags should have those flags set.

Test report: http://dpdk.org/ml/archives/dev/2014-December/009933.html

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Xiaonan Zhang <xiaonanx.zhang@intel.com>
2014-12-17 01:04:06 +01:00
Helin Zhang
973273c7a4 i40e: workaround for X710 performance
On X710, performance number is far from the expectation on recent
firmware versions. The fix for this issue may not be integrated in
the following firmware version. So the workaround in software driver
is needed. It needs to modify the initial values of 3 internal only
registers. Note that the workaround can be removed when it is fixed
in firmware in the future.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
2014-12-17 01:04:06 +01:00
Stephen Hemminger
a775949fbb ixgbe: support X540 VF
Add missing setup for X540 MAC type when setting up VF.
Additional check exists in Linux driver but not in DPDK.

Signed-off-by: Bill Hong <bhong@brocade.com>
Signed-off-by: Stephen Hemminger <shemming@brocade.com>
2014-12-17 01:04:01 +01:00
Bruce Richardson
46bc9d7513 ixgbe: fix multi-process support
When using multiple processes, the TX function used in all processes
should be the same, otherwise the secondary processes cannot transmit
more than tx-ring-size - 1 packets.
To achieve this, we extract out the code to select the ixgbe TX function
to be used into a separate function inside the ixgbe driver, and call
that from a secondary process when it is attaching to an
already-configured NIC.

Testing with symmetric MP app shows that we are able to RX and TX from
both primary and secondary processes once this patch is applied.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
2014-12-17 00:40:37 +01:00
Bruce Richardson
199e6b913e ixgbe: fix array overflow in vector Rx
Switch the order of the conditions in a while loop, so we check the
range of "i" against the max, before using it to index into the array.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 00:40:37 +01:00
Shu Shen
0d6e2d783d igb_uio: fix build with kernel 3.18
This patch fixes build failing with undefined symbol _PAGE_IOMAP with
kernel 3.18.

The Xen-specific _PAGE_IOMAP PTE flag was removed in kernel 3.18 and
could be used for other purpose in future. This patch ensures that
_PAGE_IOMAP flag is only used for kernels before 3.18.

Signed-off-by: Shu Shen <shu.shen@radisys.com>
Acked-by: Jincheng Miao <jmiao@redhat.com>
2014-12-17 00:40:37 +01:00
Pablo de Lara
b443307ad3 ring: fix return type in enqueue and dequeue burst functions
Enqueue and dequeue burst functions always return a positive
value (including 0), so return type should be unsigned,
instead of int.

Fixed also API doc for one of the functions.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-17 00:40:37 +01:00
Konstantin Ananyev
ca68c8190f net: fix IPv6 checksum
For rte_ipv6_phdr_cksum() gcc 4.8.* with "-O3" not always generates
correct code.
Sometimes it 'forgets' to put len and proto fields of psd_header on the stack.
To overcome that problem and speedup things a bit, refactored rte_raw_cksum()
by splitting ipv6 pseudo-header csum calculation into 3 phases:
1. calc sum for src & dst addresses
2. add sum for proto & len.
3. finalise sum
That makes gcc to generate valid code and helps to avoid any copying.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-17 00:40:37 +01:00
Bruce Richardson
d3e454a86f cfgfile: fix read of empty file
If the file to be read by the cfgfile is empty, i.e. no configuration
data, but possibly comments present, the cfgfile should not mark the
last processed section (curr_section) as having N entries, since there
is no last processed section.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 00:40:37 +01:00
Bruce Richardson
8b27765d67 eal: use safe snprintf to print version
When printing the version string to a local variable, use snprintf for
safety over sprintf. This is general good practice even if the values
to print are all hard-coded.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-17 00:40:37 +01:00
Pawel Wodkowski
60a3df650d eal: fix unused value warning in memcpy macro
GCC 4.5.1 from SUSE throws this error:
	lib/librte_pmd_enic/enic_main.c:862:2: error: value computed is not used

This change use statements in expressions C extension provided by gcc to avoid
'value computed is not used' warning/error when size is not known at compile
time.

Reported-by: Michael Qiu <michael.qiu@intel.com>
Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
[Thomas: apply same fix to ppc_64]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-17 00:40:37 +01:00
Thomas Monjalon
4c8b417151 version: 1.8.0-rc4
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:44:25 +01:00
Helin Zhang
0b04f881a8 i40e: fix RSS RETA query
There is a bug in querying reta, of storing the data to the correct entry.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
2014-12-11 01:42:03 +01:00
Helin Zhang
77f127bcda enic: fix build with gcc 4.7.2
Compile warnings/errors was found on gcc 4.7.2 as follows. Variables
was reported of being used but uninitialized. Assigning an initial
value to it is needed.

lib/librte_pmd_enic/vnic/vnic_dev.c: In function vnic_dev_get_mac_addr:
lib/librte_pmd_enic/vnic/vnic_dev.c:393:16: error: a1 may be used uninitialized
	in this function [-Werror=uninitialized]
lib/librte_pmd_enic/vnic/vnic_dev.c:629:10: note: a1 was declared here
lib/librte_pmd_enic/vnic/vnic_dev.c: In function vnic_dev_set_mac_addr:
lib/librte_pmd_enic/vnic/vnic_dev.c:393:16: error: a1 may be used uninitialized
	in this function [-Werror=uninitialized]
lib/librte_pmd_enic/vnic/vnic_dev.c:980:10: note: a1 was declared here

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:03 +01:00
Declan Doherty
f2ef6f21ee bond: fix mac assignment to slaves
Adding call to mac_address_slaves_update from the lsc handler when the
first slave become active to propagate any mac changes made while
devices are inactive

Changed removing slave logic to use memmove instead of memcpy to move
data within the same array, as this was corrupting the slave array.

Adding unit test to cover failing assignment scenarios

Test report: http://dpdk.org/ml/archives/dev/2014-December/009623.html

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
2014-12-11 01:42:02 +01:00
Jincheng Miao
442f3bed6a xen: fix build with kernel 3.18
From upstream kernel commit 3db2e9cd, strict_strto* serial functions
are removed. So that we should directly used kstrtoul instead.

Add xen_dom0/compat.h to be compatible with older kernel.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Jincheng Miao
ee19d51ae5 kni: fix build with kernel 3.18
From upstream kernel commit 3db2e9cd, strict_strto* serial functions
are removed. So that we should directly used kstrtoul instead.

Add kni/compat.h to be compatible with older kernel.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Jincheng Miao
6573369dd7 igb_uio: fix build with kernel 3.18
From upstream kernel commit 3db2e9cd, strict_strto* serial functions
are removed. So that we should directly used kstrtoul instead.

kstrtoul exists from RHEL6.4, so for compatibility with old kernel and RHEL,
add some logic to igb_uio/compat.h.

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Jincheng Miao
81ab433fff igb_uio: fix build with long term kernel and RHEL
Function pci_num_vf() is introduced from upstream linux-2.6.34. So
this patch make compatible with longterm kernel linux-2.6.32.63.

For RHEL, function pci_num_vf() begins from RHEL5 update9. And
it is stub-defined when CONFIG_PCI_IOV is not enabled.

So dropped the CONFIG_PCI_IOV checking of commit 11ba0426.

For other distro like RHEL behaved to pci_num_vf(), we could simply
append following condition macro:
(!(defined(OTHER_RELEASE_CODE) && \
 OTHER_RELEASE_CODE >= OTHER_RELEASE_VERSION(X, Y)))

Signed-off-by: Jincheng Miao <jmiao@redhat.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Mark Kavanagh
18f5900198 ethdev: fix build with libc ip6 header
The name of the rte_eth_fdir_flow's rte_eth_ipv6_flow attribute,
'ip6_flow', clashes with a macro defined in
/usr/include/netinet/ip6.h, such that when DPDK is linked with an
application that uses the afforementioned header, the macro is
expanded within the DPDK struct, causing a compilation error.

Rename the relevant attribute in DPDK to resolve this.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-12-11 01:42:02 +01:00
Michael Qiu
2b039d5f20 net: fix build with gcc 4.4.7 and strict aliasing
include/rte_ip.h:161: error: dereferencing pointer ‘u16’
        does break strict-aliasing rules
include/rte_ip.h:157: note: initialized from here
        ...

The root cause is that, compile enable strict aliasing by default,
while in function rte_raw_cksum() try to convert 'const char *'
to 'const uint16_t *'.

This workaround is to solve the compile issue of GCC strict-aliasing (two
different type pointers should not be point to the same memory address).

For GCC 4.4.7 it will definitely occurs if  flags "-fstrict-aliasing"
and "-Wall" used.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
[Thomas: add workaround comment]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Michael Qiu
42c035e7bb eal: fix build with icc
lib/librte_eal/linuxapp/eal/eal.c(461): error #2259: non-pointer
conversion from "long long" to "void *" may lose significant bits
   RTE_PTR_ALIGN_CEIL((uintptr_t)addr, RTE_PGSIZE_16M);

The root cause is that "RTE_PGSIZE_16M" is defined as unsigned long long.
But in i686 platform "void *" is 32-bit.
It is safe to cast to size_t and make it works in both 32 & 64-bit
platform.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Michael Qiu
3736db4f95 eal: fix build for 32-bit system
lib/librte_eal/linuxapp/eal/eal_memory.c:324:4: error: comparison
is always false due to limited range of data type [-Werror=type-limits]
    || (hugepage_sz == RTE_PGSIZE_16G)) {
    ^

This was introuduced by commit b77b5639:
        mem: add huge page sizes for IBM Power

The root cause is that size_t is 32-bit in i686 platform,
but RTE_PGSIZE_16M and RTE_PGSIZE_16G are always 64-bit.

Force hugepage_sz to always 64-bit to avoid this issue.

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Jia Yu
3a52e64742 lib: fix cache alignment of structures
Include rte_memory.h for lib files that use __rte_cache_aligned
attribute.

Consider the following code:

	struct per_core_foo {
		...
	} __rte_cache_aligned;

	struct global_foo {
		struct per_core_foo foo[RTE_MAX_CORE];
	};

If __rte_cache_aligned is not defined (rte_memory.h is not included),
the code compiles but the structure is not aligned... it defines the
structure and creates a global variable called __rte_cache_aligned.
And this can lead to really bad things if this code is in a .h that
is included by files that may or may not include rte_memory.h

Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-11 01:42:02 +01:00
Thomas Monjalon
505c3d03dc version: 1.8.0-rc3
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-06 11:25:18 +01:00
Huawei Xie
bc8e0f501c i40e: use macros for vlan filtering registers
Add two macros I40E_VFTA_IDX and I40E_VFTA_BIT for vlan filter search and set.
Add vlan_id check in vlan filter search and set function.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
2014-12-06 11:08:35 +01:00
Huawei Xie
fdef8f927f i40e: fix vlan filtering
">> 5" rather than ">> 4"

vlan id is a 12 bit value.
VFTA is 128 x 32 bit array (128 double word array) which could store 2^12 vlan bits.
Each bit represents whether corresponding vlan tag is set in the VSI.
Use high 7 bits as the index for the double word array.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
2014-12-06 11:08:35 +01:00
Konstantin Ananyev
51e16682cf ixgbe: do not override buffer length
The template mbuf_initializer is hard coded with a buflen which
might have been set differently by the application at the time of
mbuf pool creation.

- move buf_len fields out of rearm_data marker.
- make ixgbe_recv_pkts_vec() not touch buf_len field at all
(as all other RX functions behave).

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
2014-12-05 22:57:18 +01:00
Jean-Mickael Guerin
661dfdf09f ixgbe: fix setup of mbuf initializer template
Add a compiler barrier to make sure all fields covered by
the marker rearm_data are assigned before the read.

Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")

Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-05 22:49:25 +01:00
Bruce Richardson
dc018ce69b enic: fix uninitialized variable
The variable notify_pa is only initialized inside one branch of
an if statement, triggering a compiler error with clang 3.3 on FreeBSD.

  CC vnic/vnic_dev.o
lib/librte_pmd_enic/vnic/vnic_dev.c:777:6: fatal error: variable 'notify_pa'
      is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
        if (!vnic_dev_in_reset(vdev)) {

Fix this issue by adding "= 0" to the variable definition.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-05 22:09:23 +01:00
Bruce Richardson
06554d0240 enic: fix initialization error with clang
This patch fixes the following compiler error raised by clang 3.3
on FreeBSD 10:

  CC enic_clsf.o
lib/librte_pmd_enic/enic_clsf.c:99:25: fatal error: missing field 'u' initializer [-Wmissing-field-initializers]
        struct filter fltr = {0};

It fixes it by changing the initializer to set a named field to zero,
thereby automatically setting the rest of the unnamed fields also to
zero.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-05 22:09:23 +01:00
John W. Linville
a3a03e13a6 af_packet: add compile-time checks for kernel-specific options
This allows the PMD to compile with kernels that don't support the
options in question.  The "#if defined(...)" lines are a bit ugly,
but I don't know of any better way to accomplish the task.

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-12-05 22:09:23 +01:00
Bruce Richardson
58507670cf table: fix lookup with incomplete bitmask
When a lookup was done on a table_array structure with an incomplete
bitmask, the results was always zero hits. This was because the
pkts_mask value was cleared as we process each entry, and the result
was assigned at the end of the loop, when pkts_mask was zero.
Changing the assignment to occur at the start, before the pkts_mask
gets cleared, fixes this issue.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2014-12-05 16:55:00 +01:00
Jingjing Wu
e689a63caa i40e: setup flow director only if enabled
In order not to affect the FVL's performance by default setting, this
patch moves the flow director initialization from i40e_pf_setup to
i40e_dev_configure according to the mode in fdir configure info.
Then the resources used for flow director will be only setup if it is enabled.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2014-12-05 16:55:00 +01:00
Jijiang Liu
c14236f210 mbuf: replace inner fields by outer fields semantic
Replace the inner_l2_len and the inner_l3_len field with the
outer_l2_len and outer_l3_len field, and rework csum forward engine
and i40e PMD due to these changes.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Jijiang Liu
1c3b7c33e9 mbuf: add Tx offloading flags for tunnels
Replace PKT_TX_VXLAN_CKSUM with PKT_TX_UDP_TUNNEL_PKT in order to indicate
a packet is an UDP tunneling packet, and introduce 3 TX offload flags for
outer IP TX checksum, which are PKT_TX_OUTER_IP_CKSUM, PKT_TX_OUTER_IPV4
and PKT_TX_OUTER_IPV6 respectively.
Rework csum forward engine and i40e PMD due to these changes.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Jijiang Liu
711ba9e23e mbuf: remove aliasing of Tx offloading flags with Rx ones
The reason of redefining the PKT_TX_IPV4 and the PKT_TX_IPV6 is listed below,
It will avoid to send a packet with a bad info:
  - we receive a Ether/IP6/IP4/L4/data packet
  - the driver sets PKT_RX_IPV6_HDR
  - the stack decapsulates IP6
  - the stack sends the packet, it has the PKT_TX_IPV6 flag but it's an IPv4 packet.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-05 16:55:00 +01:00
Thomas Monjalon
72c605807a eal: detect endianness
There is no standard to check endianness.
So we need to try different checks.
Previous trials were done in testpmd (see commits
51f694dd40 and 64741f237c) without full success.
This one is not guaranteed to work everywhere so it could
evolve when exceptions are found.

If endianness is not detected, there is a fallback on x86
to little endian. It could be forced before doing detection
but it would add some arch-dependent code in the generic header.

The option CONFIG_RTE_ARCH_BIG_ENDIAN introduced for IBM Power only
(commit a982ec81d8) can be removed. A compile-time check is better.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
2014-12-05 16:55:00 +01:00
Alan Carew
aaa662e75c cmdline: fix overflow on bsd
When using test-pmd with flow director in FreeBSD, the application will
segfault/Bus error while parsing the command-line. This is due to how
each commands result structure is represented during parsing, where the offsets
for each tokens value is stored in a character array(char result_buf[BUFSIZ])
in cmdline_parse()(./lib/librte_cmdline/cmdline_parse.c).

The overflow occurs where BUFSIZ is less than the size of a commands result
structure, in this case "struct cmd_pkt_filter_result"
(app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes as
opposed to 8192 bytes on Linux.

The problem can be reproduced by running test-pmd on FreeBSD:
./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
And adding a filter:
add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
0x800 vlan 0 queue 0 soft 0x17

This patch removes the OS dependency on BUFSIZ and defines and uses a
library #define CMDLINE_PARSE_RESULT_BUFSIZE 8192

Added boundary checking to ensure this buffer size cannot overflow, with
an error message being produced.

Suggested-by: Olivier Matz <olivier.matz@6wind.com>
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-05 16:54:53 +01:00
Thomas Monjalon
29d03f7aa3 cmdline: revert fix overflow on bsd
Revert commit a0547e0a75 because it is an old version
of the patch and was applied by error.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-04 16:15:44 +01:00
Thomas Monjalon
b70ae1899f enic: fix warnings
A lot of warnings were not seen because $(WERROR_FLAGS) was not set
in the Makefile. But they appear with toolchains that enforce more checks.

-Wno-deprecated seems useless.
-Wno-strict-aliasing is added to avoid false positives.

This patch cleans up unused variable, unused functions, wrong types,
static declarations, etc. A lot of functions have unused parameters;
it suggests that more clean-up could be needed.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Sujith Sankar <ssujith@cisco.com>
2014-12-04 13:04:24 +01:00
Chao Zhu
8436acd6ba kni: fix build on IBM Power
Because of different cache line size, the alignment of struct
rte_kni_mbuf in rte_kni_common.h doesn't work on IBM Power. This patch
changed from 64 to RTE_CACHE_LINE_SIZE micro to do the alignment.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-12-04 12:59:03 +01:00
Alan Carew
a0547e0a75 cmdline: fix overflow on bsd
When using test-pmd with flow director in FreeBSD, the application will
segfault/Bus error while parsing the command-line. This is due to how
each commands result structure is represented during parsing, where the offsets
for each tokens value is stored in a character array(char result_buf[BUFSIZ])
in cmdline_parse()(./lib/librte_cmdline/cmdline_parse.c).

The overflow occurs where BUFSIZ is less than the size of a commands result
structure, in this case "struct cmd_pkt_filter_result"
(app/test-pmd/cmdline.c) is 1088 bytes and BUFSIZ on FreeBSD is 1024 bytes as
opposed to 8192 bytes on Linux.

This patch removes the OS dependency on BUFSIZ and defines and uses a
library #define CMDLINE_PARSE_RESULT_BUFSIZE 8192

The problem can be reproduced by running test-pmd on FreeBSD:
./testpmd -c 0x3 -n 4 -- -i --portmask=0x3 --pkt-filter-mode=perfect
And adding a filter:
add_perfect_filter 0 udp src 192.168.0.0 1024 dst 192.168.0.0 1024 flexbytes
0x800 vlan 0 queue 0 soft 0x17

Signed-off-by: Alan Carew <alan.carew@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-12-03 20:45:19 +01:00
Takayuki Usui
999284fcc3 kni: create interface in current network namespace
With this patch, KNI interface (e.g. vEth0) is created in the
network namespace where the DPDK application is running.
Otherwise, all interfaces are created in the default namespace
in the host.

put_net() is required, since get_net_ns_by_pid() increments
the reference counter of the network namespace with get_net().

Signed-off-by: Takayuki Usui <takayuki@midokura.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
2014-12-03 20:45:19 +01:00
Helin Zhang
2f423f6385 i40e: fix build with 16-byte descriptors
The compile error will occur as below when set 'RTE_LIBRTE_I40E_16BYTE_RX_DESC=y'.
'fd_id' should be used to replace 'fd', as 'fd' is not defined in that structure
at all. In addition, local variable of 'flexbl' and 'flexbh' must be used only if
32 bytes RX descriptor is selected.

error logs:
lib/librte_pmd_i40e/i40e_rxtx.c: In function i40e_rxd_build_fdir:
lib/librte_pmd_i40e/i40e_rxtx.c:431:28: error: volatile union <anonymous> has no member named fd
lib/librte_pmd_i40e/i40e_rxtx.c:427:19: error: unused variable flexbl [-Werror=unused-variable]
lib/librte_pmd_i40e/i40e_rxtx.c:427:11: error: unused variable flexbh [-Werror=unused-variable]

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2014-12-03 20:45:19 +01:00
Dennis Marinus
e77339f400 table: fix maybe-uninitialized variable with gcc lto
This patch fixes a maybe-uninitialized warning when compiling DPDK with
GCC 4.9 + Link Time Optimization.

Signed-off-by: Dennis Marinus <dmarinus@amazon.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-02 12:34:09 +01:00
Thomas Monjalon
c1aa5e44e6 ixgbe: fix build with bypass and debug enabled
Since commit aae1047905 ("use the right debug macro"),
DEBUGOUT was replaced by PMD_DRV_LOG which requires at least
2 arguments. But the level argument was missing.

Commit 7a10de5e27 fixed the logs but not the macros FUNC_PTR_OR_*
which are not preprocessed if RTE_LIBRTE_IXGBE_DEBUG_DRIVER is disabled.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-12-02 12:28:11 +01:00
Sujith Sankar
91f7dd5d5e enic: fix build with clang
This patch fixes the warnings and error reported by clang compiler on Linux.

Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-01 12:24:04 +01:00
Olivier Matz
06c57d54ef ixgbe: fix bitfield assignation with clang
Commit 1224decaa4 ("support TCP segmentation offload")
changed the way the bitfields are assigned in ixgbe, example:

  tx_offload_mask.l2_len = ~0;

This result in a compilation error with clang:

  error: implicit truncation from 'int' to bitfield
    changes value from -1 to 127 [-Werror,-Wbitfield-constant-conversion]

Replacing the '=' with a '|=' fixes the issue.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-12-01 12:18:33 +01:00
Sujith Sankar
2d3ba69654 enic: fix build by using standard integer types
ENIC PMD was giving compilation errors on ppc_64-power8-linuxapp-gcc because
of types such as u_int32_t.  This patch replaces all those with uint32_t and
similar ones.

Reported-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-28 16:59:21 +01:00
Pablo de Lara
d68ca91ed8 bond: fix build with gcc 4.3
GCC 4.3 complains that slow_pkts array in bond_ethdev_tx_burst_8023ad
may be used uninitialized, so it has been initialized to NULL.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-28 16:19:25 +01:00
Balazs Nemeth
c2941df015 ixgbe: fix mbuf failure statistics in vector Rx
The statistics that is reported through the rx_nombuf fields in struct
rte_eth_stats was not set when the vector PMD was used. The statistics
should report the number of mbufs that could _not_ be allocated during
rearm of the RX queue. The non-vector PMD reports it correctly. The
use of either vector PMD or non-vector PMD depends on runtime
configuration. Hence it is possible that a change in configuration
would disable this statistics. To prevent this from happening, the
statistics should be reported by both implementations.

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-28 16:19:25 +01:00
Thomas Monjalon
ae518a0fe5 version: 1.8.0-rc2
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-27 22:50:06 +01:00
Jia Yu
4f1e87a332 bond: set offload capabilities flags
Before the fix, bond device's offload capabilities are unset. This fix
takes the minimum common set of slave devices' capabilities as bond
device's capabilities. For simplicity, we ensure all slave devices
to have a capability before bond device can claim this capability,
even if some slave devices are unused (i.e. linked down, standby).

Signed-off-by: Jia Yu <jyu@vmware.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
2014-11-27 22:50:06 +01:00
Daniel Mrzyglod
7c76a747e6 bond: add mode 5
Add support for mode 5 (Transmit load balancing) into pmd driver

This patch add support for Adaptive transmit load balancing (mode 5) to the
librte_pmd_bond library. This mode provides an adaptive transmit load
balancing. It dynamically changes the transmitting slave, according to the
computed load.

Further details are described here:
https://www.kernel.org/doc/Documentation/networking/bonding.txt
In implementation callback is used for sorting slave order - providing
statistics for burst function about slave bandwith usage  and sort
interfaces due to usage.

Difference in this implementation vs Linux implementation:
- We Are trying send all pkts – If one interface hasn’t send packets we are
trying to send rest of packets by other slaves sorted previously by callback
function.

Some implementation details:
- Every 100ms is taken obytes statistics from every slave.
- Every 10 ms the slaves in  table are sorted and updated by callback -
bandwidth and successfully transmitted bytes from previous iteration which
happens every 100 ms
- There is callback function which updates this statistics for transparency and
for rather intensive computation involved in this mode.

Test report: http://dpdk.org/ml/archives/dev/2014-November/008729.html

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2014-11-27 21:38:57 +01:00
Pawel Wodkowski
46fb436836 bond: add mode 4
This patch set add support for dynamic link aggregation (mode 4) to the
librte_pmd_bond library. This mode provides auto negotiation/configuration
of peers and well as link status changes monitoring using out of band
LACP (link aggregation control protocol) messages. For further details of
LACP specification see the IEEE 802.3ad/802.1AX standards. It is also
described here
https://www.kernel.org/doc/Documentation/networking/bonding.txt.

In this implementation we have an array of mode 4 settings for each slave.
There is also assumption that for every port is one aggregator (it might
be unused if better is found).

Difference in this implementation vs Linux implementation:
- this implementation it is not directly based on state machines but current
  state is calculated from actor and partner states (and other things too).

Some implementation details:
- during rx burst every packet Is checked if this is LACP or marker packet.
  If it is LACP frame it is passed to mode 4 logic using slaves rx ring  and
  removed from rx buffer before it is returned
- in tx burst, packets from mode 4 (if any) are injected into each slave.
- there is a timer running in background to process/produce mode 4
  frames form rx/to tx functions.

Some requirements for this mode:
- for LACP mode to work rx and tx burst functions must be invoked
  at least in 100ms intervals
- provided buffer to rx burst should be at least 2x slave count size. This is
  not needed but might increase performance especially during initial
  handshake.

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
2014-11-27 21:20:58 +01:00
Sujith Sankar
80083092aa enic: fix vfio inclusion
Inclusion of vfio.h was giving compilation errors if kernel version is less
than 3.6.0 and if RTE_EAL_VFIO was in config.

Removed inclusion of vfio.h and replaced RTE_EAL_VFIO with VFIO_PRESENT.

Reported-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-27 21:20:42 +01:00
Thomas Monjalon
f97ae91fcb enic: fix dependencies
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-27 19:12:43 +01:00
Thomas Monjalon
d07180f211 net: fix conflict with libc
It was impossible to include netinet/in.h and rte_ip.h
because the IP protocols were redefined.
It is removed because useless.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Ivan Boule <ivan.boule@6wind.com>
2014-11-27 19:03:27 +01:00
Keith Wiles
1665d6310d mempool: avoid dump crash with null pointer
Check the FILE *f and rte_mempool *mp pointers for NULL.

Signed-off-by: Keith Wiles <keith.wiles@windriver.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-27 17:30:20 +01:00
Sergio Gonzalez Monroy
fdf20fa7be add prefix to cache line macros
CACHE_LINE_SIZE is a macro defined in machine/param.h in FreeBSD and
conflicts with DPDK macro version.
Adding RTE_ prefix to avoid conflicts.
CACHE_LINE_MASK and CACHE_LINE_ROUNDUP are also prefixed.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
[Thomas: updated on HEAD, including PPC]
2014-11-27 16:21:11 +01:00
Sergio Gonzalez Monroy
be04c70727 eal/bsd: remove unused HPET support
The HPET support in the BSD EAL was copied directly from the Linux version,
but did not actually work on FreeBSD. We replace this old code with a simple
compiler message that informs the user that we don't support HPET on BSD if
they enable such support in the build-time configuration file.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2014-11-27 16:21:11 +01:00
Sergio Gonzalez Monroy
7343f7498f eal/bsd: use sysctl to get TSC frequency
BSD provides the TSC frequency value through sysctl.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
2014-11-27 16:21:01 +01:00
David Marchand
f9462cf0b9 eal: no more bare metal environment
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-27 13:09:51 +01:00
Thomas Monjalon
df2e9cec22 mbuf: sort TCP segmentation offload flag
Due to reordering conflicts, the TSO flag was not sorted.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-27 10:39:21 +01:00
David Marchand
63dd47b9a5 eal/linux: fix remaining checks for 64-bit architectures
RTE_ARCH_X86_64 can not be used as a way to determine if we are building for
64bits cpus. Instead, RTE_ARCH_64 should be used.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
2014-11-27 08:46:22 +01:00
jingjing.wu
be646317d7 i40e: add ethertype filter
Handle the RTE_ETH_FILTER_ADD and RTE_ETH_FILTER_DELETE operations
on ethertype filter.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2014-11-26 23:24:43 +01:00
jingjing.wu
07f02cdb08 ethdev: add ethertype filter
A new structure of ethertype filter is defined in rte_eth_ctrl.h
for filter_ctrl api

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
2014-11-26 23:24:43 +01:00
Sujith Sankar
df2fd00e29 enic: build integration
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
[Thomas: enable for BSD - not tested]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-26 23:07:11 +01:00
Sujith Sankar
fefed3d1e6 enic: new driver
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-26 23:07:11 +01:00
Sujith Sankar
9913fbb91d enic/base: common code
VNIC common code is partially shared with ENIC kernel mode driver.

Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-26 23:07:11 +01:00
Sujith Sankar
c73ddb41e7 enic: license
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
2014-11-26 23:07:11 +01:00
Chao Zhu
d05e7115f4 mem: support layout of IBM Power
The mmap of hugepage files on IBM Power starts from high address to low
address. This is different from x86. This patch modified the memory
segment detection code to get the correct memory segment layout on Power
architecture. This patch also added a commond ARCH_PPC_64 definition for
64 bit systems.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:10 +01:00
Chao Zhu
b77b563972 mem: add huge page sizes for IBM Power
IBM Power architecture has different huge page sizes (16MB, 16GB) than
x86.This patch defines RTE_PGSIZE_16M and RTE_PGSIZE_16G in the
rte_page_sizes enum variable and adds huge page size support of DPDK
for IBM Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:10 +01:00
Chao Zhu
f0fa947c61 eal/linux: disable iopl operation for IBM Power
iopl() call is mostly for the i386 architecture. In Power and other
architecture, it doesn't exist. This patch modified rte_eal_iopl_init()
and make it return -1 for Power and other architecture. Thus
rte_config.flags will not contain EAL_FLG_HIGH_IOPL flag for other
architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:10 +01:00
Chao Zhu
9ae1553856 eal/ppc: cpu flag checks for IBM Power
IBM Power processor doesn't have CPU flag hardware registers. This patch
uses aux vector software register to get CPU flags and add CPU flag
checking support for IBM Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:10 +01:00
Chao Zhu
7302a1724b eal/ppc: vector memcpy for IBM Power
The SSE based memory copy in DPDK only support x86. This patch adds
altivec based memory copy functions for IBM Power architecture. This
patch includes altivec.h which requires GCC version>= 4.8.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:10 +01:00
Chao Zhu
d464c8af66 eal/ppc: spinlock operations for IBM Power
This patch adds spinlock operations for IBM Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:09 +01:00
Chao Zhu
529c7f5c8c eal/ppc: prefetch operations for IBM Power
This patch add architecture specific prefetch operations for IBM Power
architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:09 +01:00
Chao Zhu
85997d60b1 eal/ppc: cpu cycle operations for IBM Power
IBM Power architecture doesn't have TSC register to get CPU cycles. This
patch implements the time base register read instead of TSC register of
x86 on IBM Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:09 +01:00
Chao Zhu
de6fff135e eal/ppc: byte order operations for IBM Power
This patch adds architecture specific byte order operations for IBM Power
architecture. Power architecture support both big endian and little
endian. This patch also adds a RTE_ARCH_BIG_ENDIAN micro.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:09 +01:00
Chao Zhu
05c3fd7110 eal/ppc: atomic operations for IBM Power
This patch adds architecture specific atomic operation file for IBM
Power architecture CPU.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2014-11-26 21:50:09 +01:00
Olivier Matz
1224decaa4 ixgbe: support TCP segmentation offload
Implement TSO (TCP segmentation offload) in ixgbe driver. The driver is
now able to use PKT_TX_TCP_SEG mbuf flag and mbuf hardware offload infos
(l2_len, l3_len, l4_len, tso_segsz) to configure the hardware support of
TCP segmentation.

In ixgbe, when doing TSO, the IP length must not be included in the TCP
pseudo header checksum. A new function ixgbe_fix_tcp_phdr_cksum() is
used to fix the pseudo header checksum of the packet before giving it to
the hardware.

In the patch, the tx_desc_cksum_flags_to_olinfo() and
tx_desc_ol_flags_to_cmdtype() functions have been reworked to make them
clearer. This should not impact performance as gcc (version 4.8 in my
case) is smart enough to convert the tests into a code that does not
contain any branch instruction.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:56 +01:00
Olivier Matz
4199fdea60 mbuf: generic support for TCP segmentation offload
Some of the NICs supported by DPDK have a possibility to accelerate TCP
traffic by using segmentation offload. The application prepares a packet
with valid TCP header with size up to 64K and deleguates the
segmentation to the NIC.

Implement the generic part of TCP segmentation offload in rte_mbuf. It
introduces 2 new fields in rte_mbuf: l4_len (length of L4 header in bytes)
and tso_segsz (MSS of packets).

To delegate the TCP segmentation to the hardware, the user has to:

- set the PKT_TX_TCP_SEG flag in mbuf->ol_flags (this flag implies
  PKT_TX_TCP_CKSUM)
- set the flag PKT_TX_IPV4 or PKT_TX_IPV6
- set PKT_TX_IP_CKSUM if it's IPv4, and set the IP checksum to 0 in
  the packet
- fill the mbuf offload information: l2_len, l3_len, l4_len, tso_segsz
- calculate the pseudo header checksum without taking ip_len in account,
  and set it in the TCP header, for instance by using
  rte_ipv4_phdr_cksum(ip_hdr, ol_flags)

The API is inspired from ixgbe hardware (the next commit adds the
support for ixgbe), but it seems generic enough to be used for other
hw/drivers in the future.

This commit also reworks the way l2_len and l3_len are used in igb
and ixgbe drivers as the l2_l3_len is not available anymore in mbuf.

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:56 +01:00
Olivier Matz
6006818cfb net: new checksum functions
Introduce new functions to calculate checksums. These new functions
are derivated from the ones provided csumonly.c but slightly reworked.
There is still some room for future optimization of these functions
(maybe SSE/AVX, ...).

This API will be modified in tbe next commits by the introduction of
TSO that requires a different pseudo header checksum to be set in the
packet.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
e4a1c50e69 mbuf: get the name of offload flags
In test-pmd (rxonly.c), the code is able to dump the list of ol_flags.
The issue is that the list of flags in the application has to be
synchronized with the flags defined in rte_mbuf.h.

This patch introduces 2 new functions rte_get_rx_ol_flag_name()
and rte_get_tx_ol_flag_name() that returns the name of a flag from
its mask. It also fixes rxonly.c to use this new functions and to
display the proper flags.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
b161f72107 mbuf: remove too specific flags mask
This definition is specific to Intel PMD drivers and its definition
"indicate what bits required for building TX context" shows that it
should not be in the generic rte_mbuf.h but in the PMD driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
b029fd236d mbuf: add help about Tx checksum flags
Describe how to use hardware checksum API.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 19:35:55 +01:00
Olivier Matz
340e52d9bd mbuf: reorder Tx flags
The tx mbuf flags are now ordered from the lowest value to the
the highest. Add comments to explain where to add new flags.

By the way, move the PKT_TX_VXLAN_CKSUM at the right place.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-26 18:54:41 +01:00
Olivier Matz
0e8d6a29b4 ixgbe: fix flags variable size to 64 bits
Since commit 4332beee9 "mbuf: expand ol_flags field to 64-bits", the
packet flags are now 64 bits wide. Some occurences were forgotten in
the ixgbe driver.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-26 18:54:15 +01:00
Olivier Matz
b22d99894a igb/ixgbe: fix IP checksum calculation
According to Intel® 82599 10 GbE Controller Datasheet (Table 7-38), both
L2 and L3 lengths are needed to offload the IP checksum.

Note that the e1000 driver does not need to be patched as it already
contains the fix.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-26 18:49:53 +01:00
Alan Carew
75bf9c8e82 power: integration of vm power management
librte_power now contains both rte_power_acpi_cpufreq and rte_power_kvm_vm
implementations.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
210c383e24 power: packet format for vm power management
Provides a command packet format for host and guest.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
445c6528b5 power: common interface for guest and host
Moved the current librte_power implementation to rte_power_acpi_cpufreq, with
renaming of functions only.
Added rte_power_kvm_vm implementation to support Power Management from a VM.

librte_power now hides the implementation based on the environment used.
A new call rte_power_set_env() can explicidly set the environment, if not
called then auto-detection takes place.

rte_power_kvm_vm is subset of the librte_power APIs, the following is supported:
 rte_power_init(unsigned lcore_id)
 rte_power_exit(unsigned lcore_id)
 rte_power_freq_up(unsigned lcore_id)
 rte_power_freq_down(unsigned lcore_id)
 rte_power_freq_min(unsigned lcore_id)
 rte_power_freq_max(unsigned lcore_id)

The other unsupported APIs return -ENOTSUP

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:04 +01:00
Alan Carew
cd0d5547e8 power: vm communication channels in guest
Allows for the opening of Virtio-Serial devices on a VM, where a DPDK
application can send packets to the host based monitor. The packet formatted is
specified in channel_commands.h
Each device appears as a serial device in path
/dev/virtio-ports/virtio.serial.port.<agent_type>.<lcore_num> where each lcore
in a DPDK application has exclusive to a device/channel.
Each channel is opened in non-blocking mode, after a successful open a test
packet is send to the host to ensure the host side is monitoring.

Signed-off-by: Alan Carew <alan.carew@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-26 17:27:03 +01:00
Anatoly Burakov
c4f136db8e eal/linux: map pci memory resources after hugepages
Multi-process DPDK application must mmap hugepages and PCI resources
into the same virtual address space. By default the virtual addresses
are chosen by the primary process automatically when calling the mmap.
But sometimes the chosen virtual addresses aren't usable in secondary
process - for example, secondary process is linked with more libraries
than primary process, and the library occupies the same address space
that the primary process has requested for PCI mappings.

This patch makes EAL try and map PCI BARs right after the hugepages
(instead of location chosen by mmap) in virtual memory, so that PCI BARs
have less chance of ending up in random places in virtual memory.

Signed-off-by: Liang Xu <liang.xu@cinfotech.cn>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 18:16:41 +01:00
Simon Kuenzer
fcbda6d4b0 eal: add option --master-lcore
Enable users to specify the lcore id that is used as master lcore.

Signed-off-by: Simon Kuenzer <simon.kuenzer@neclab.eu>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-25 14:06:40 +01:00
Patrick Lu
5583037a79 eal: get relative core index
EAL -c option allows the user to enable any lcore in the system.
Often times, the user app wants to know 1st enabled core, 2nd
enabled core, etc, rather than phyical core ID (rte_lcore_id().)

The new API rte_lcore_index() will return an index from enabled lcores
starting from zero.

Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:33:35 +01:00
Didier Pallard
d888cb8b96 eal: add core list input format
In current version, used cores can only be specified using a bitmask.
It will now be possible to specify cores in 2 different ways:
- Using a bitmask (-c [0x]nnn): bitmask must be in hex format
- Using a list in following format: -l <c1>[-c2][,c3[-c4],...]

The letter -l can stand for lcore or list.

-l 0-7,16-23,31 being equivalent to -c 0x80FF00FF

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:33:35 +01:00
Thomas Monjalon
8552f950ee eal: factorize configuration adjustment
Some adjustments are done after options parsing and are common
to Linux and BSD.

Remove process_type adjustment in rte_config_init() because
it is already done in eal_parse_args().
eal_proc_type_detect() is kept duplicated because it open a
file descriptor which is used later in each eal.c.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:33:35 +01:00
Thomas Monjalon
f91fc65d12 eal: factorize options sanity check
No need to have duplicated check for common options.

Some flags are set for options -c and -m in order to simplify the
checks.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:33:35 +01:00
Thomas Monjalon
341befa2a2 eal: factorize internal config reset
Now that internal config structure is common to Linux and BSD,
we can have a common function to initialize it.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:33:31 +01:00
Thomas Monjalon
33e25b3394 eal: fix header guards
Some guards are missing or have a wrong name.
Others have LINUXAPP in their name but are now common.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-25 13:30:23 +01:00