Commit Graph

4518 Commits

Author SHA1 Message Date
Keith Wiles
213af31e09 mempool: reduce structure size if no cache needed
The rte_mempool structure is changed, which will cause an ABI change
for this structure. Providing backward compat is not reasonable
here as this structure is used in multiple defines/inlines.

Allow mempool cache support to be dynamic depending on if the
mempool being created needs cache support. Saves about 1.5M of
memory used by the rte_mempool structure.

Allocating small mempools which do not require cache can consume
larges amounts of memory if you have a number of these mempools.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-17 08:31:33 +02:00
Olivier Matz
fbfd99551c mbuf: add raw allocation function
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-05-17 08:31:33 +02:00
Olivier Matz
8e483216fd doc: move release notes instructions as comments
We don't want to have this instructions in the generated docs, so use
comments. It's also less confusing for people adding entries in the
documentation.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2016-05-17 07:56:30 +02:00
Mauricio Vasquez B
b4e0f64ff2 examples: remove useless check of port count
The rte_eth_dev_count() function will never return a value greater
than RTE_MAX_ETHPORTS, so that checking is useless.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2016-05-16 21:01:51 +02:00
Daniel Mrzyglod
6b3ac0da69 examples/ip_pipeline: fix initialization of firewall bulk handler
bulk functions expect that all memory is set with zeros

Fixes: 67ebdbef0c ("examples/ip_pipeline: add bulk update of firewall rules")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 21:01:51 +02:00
Marcin Kerlin
535b969d31 examples/ip_pipeline: fix options array overflow
In the function app_init_eal(struct app params * app) number of
entries into array exceeds the size of the array if the conditions
are fulfilled.

Coverity issue: 124567
Fixes: 7f64b9c004 ("examples/ip_pipeline: rework config file syntax")

Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 21:01:51 +02:00
Marcin Kerlin
d532ddccce examples/distributor: check mempool error
Value returned from a function is not checked for errors before being used.

Coverity issue: 13207
Fixes: 07db4a9750 ("examples/distributor: new sample app")

Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
2016-05-16 21:01:51 +02:00
Michal Kobylinski
30e1c9fab9 examples/netmap_compat: fix infinite loop
Infinite loop: The loop does not have a normal termination condition,
so will continue until an abnormal condition arises.
In rte_netmap_poll: Infinite loop with unsatisfiable exit condition.

Coverity issue: 30701
Fixes: 06371afe ("examples/netmap_compat: import netmap compatibility example")

Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
2016-05-16 21:01:51 +02:00
Slawomir Mrozowicz
e752649c5b examples/qos_meter: check flow configuration error
Calling rte_meter_srtcm_config without checking return value.

Coverity issue: 30693
Fixes: e6541fdec8 ("meter: initial import")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 18:49:12 +02:00
Slawomir Mrozowicz
d52b5e735a examples/qos_sched: fix lcore limit
In expression 1ULL << i, left shifting by more than 63 bits
has undefined behavior. The shift amount, i, is as much as 127.

Coverity issue: 30690
Fixes: de3cfa2c98 ("sched: initial import")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 18:49:12 +02:00
Slawomir Mrozowicz
f2c41f26f8 examples/qos_sched: fix error message
rx_port in pconf->rx_port looks like a copy-paste error.

Coverity issue: 30699
Fixes: de3cfa2c98 ("sched: initial import")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 18:49:12 +02:00
Slawomir Mrozowicz
cb9234082a examples/qos_sched: fix negative loop bound in option parsing
negative_returns: Using unsigned variable n_tokens in a loop exit condition.

Coverity issue: 30704
Fixes: de3cfa2c98 ("sched: initial import")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 18:49:12 +02:00
Slawomir Mrozowicz
035b6a4f4b examples/qos_sched: fix out-of-bounds option parsing
overrun-local: Overrunning array tokens of 8 8-byte elements
at element index 4294967294 (byte offset 34359738352)
using index i (which evaluates to 4294967294).

Coverity issue: 30708
Fixes: de3cfa2c98 ("sched: initial import")

Signed-off-by: Slawomir Mrozowicz <slawomirx.mrozowicz@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2016-05-16 18:49:11 +02:00
Daniel Mrzyglod
3c4a6d00b1 examples/kni: print release error
In kni_free_kni: Value returned from a function is not checked for errors
before being used

Coverity issue: 30692
Fixes: b475eb0bc4 ("examples/kni: new parameters")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2016-05-16 14:48:09 +02:00
Daniel Mrzyglod
42b3f50558 examples/vm_power_manager: ensure domain name is null terminated
In add_vm: The string buffer may not have a null terminator if the source
string's length is equal to the buffer size

Coverity issue: 30691
Fixes: e8ae9b6625 ("examples/vm_power: channel manager and monitor in host")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
2016-05-16 14:48:09 +02:00
Daniel Mrzyglod
f87a9ddc4a power: fix error messages
Function strerror(errno) has built strings only for non-negative errno values.
for negative values of errno it describe error as "Unknown error -errno"
to be more descriptive i put string "channel not found" taken from header.

The negative argument will be interpreted as a very large unsigned value.

Coverity issue: 13266
Coverity issue: 13269
Fixes: 445c6528b5 ("power: common interface for guest and host")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
2016-05-16 14:17:41 +02:00
Jan Viktorin
c4936eee7a config: disable qede for ARMv7
The qede PMD driver is failing when building for ARMv7:

drivers/net/qede/base/ecore_dev.c:1150:6:
	error: variable ‘prs_reg’ set but not used
[...]
drivers/net/qede/base/ecore_dev.c:2492:13:
	error: variable ‘p_phys’ set but not used
[...]
drivers/net/qede/base/ecore_dev.c:2532:39:
	error: variable ‘pbl_size’ set but not used

Fixes: 3eae93a9bf ("qede: enable PMD build")

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
2016-05-13 17:38:08 +02:00
Panu Matilainen
84740d785c qede: fix build with gcc 6
With gcc >= 6.0, qede base driver fails to build with:
drivers/net/qede/base/ecore_cxt.c: In function 'ecore_cdu_init_common':
cc1: error: left shift of negative value [-Werror=shift-negative-value]

Since the base drivers are untouchable, work around by disabling
the warning.

Fixes: ec94dbc573 ("qede: add base driver")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
2016-05-13 17:38:07 +02:00
Ferruh Yigit
4de35ac07e qede: fix build with icc
fix errors:
icc: command line warning #10006: ignoring unknown option '-Wno-unused-value'
icc: command line warning #10006: ignoring unknown option '-Wno-format-nonliteral'
icc: command line warning #10006: ignoring unknown option '-Wno-shift-negative-value'
qede/base/ecore_dev.c(1643): error #188: enumerated type mixed with another type
        return 0;
               ^

Fixes: ec94dbc573 ("qede: add base driver")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
2016-05-13 17:38:07 +02:00
Daniel Mrzyglod
a5e20775a7 vhost: fix name not null terminated
Fix issue reported by Coverity.
Coverity ID 124556

If the buffer is treated as a null terminated string in later operations,
a buffer overflow or over-read may occur.

In vhost_set_ifname: The string buffer may not have a null terminator if
the source string's length is equal to the buffer size

Fixes: 54292e9520 ("vhost: support ifname for vhost-user")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:25:30 +02:00
Yuanhan Liu
56fe86f8dc examples/vhost: embed statistics into device structure
Embed dev_statistics into vhost_dev struct, which could clean
the code a bit.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
273ecdbc06 examples/vhost: clean switch worker
switch_worker() is the last piece of code that is messy yet it touches
virtio/vhost device.

Here do a cleanup, so that we will be less painful for later vhost ABI
refactoring.

The cleanup is straight forward: break long lines, move some code into
functions. The last, comment a bit on switch_worker().

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
bdb19b771e examples/vhost: fix mbuf allocation failure
It has always been a mystery (at least to me before) that how many
mbuf is enough while creating an mbuf pool. While current macro
NUM_MBUFS_PER_PORT gives your some insights, it's not that accurate:
it doesn't consider the case we may receive a big packet, say 64K
when TSO is enabled.

We actually have tried to fix it once before, with commit 5499c1fc9b
("examples/vhost: fix mbuf allocation"), but it just workarounded it
by enlarging it a bit so that the case described in the commit log
by passes. So, while trying to fix it ultimately, I'm thinking how
big is big enough, and what are the factors need consider to figure
out a proper value.

Therefore, here you are. I introduced a helper function to create
the mbuf pool, and do the "how many mbufs are needed" calculation
there. Also, I put detailed comments how that comes, to serve as
the guidelines.

Fixes: 9fd72e3cbd ("examples/vhost: add virtio offload")
Fixes: 5499c1fc9b ("examples/vhost: fix mbuf allocation")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
9c5ef51207 examples/vhost: handle broadcast packet
Every time I do a VM2VM iperf test with vhost example, I have to set
the arp table manually, as vhost-switch just ignores the broadcast
packet, leaving the ARP request not served.

Here we do a transmit a broadcast packet (such as ARP request) to
every vhost device, as well as the physical port, to fix above
arp table issue.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
da9e15837e examples/vhost: use MAC compare helper
rte_ether.h already provides a helper function to do mac address
compare. No need to define our own, use it directly.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
45657a5c68 examples/vhost: use tailq to link vhost devices
To simplify code and logic.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
7b63233bee examples/vhost: remove unused macro and struct
Interestingly, DESC_PER_CACHELINE has never been used since the
introduction of vhost example. Remove it.

vlan_ethhdr struct and VLAN_ETH_HLEN macro reference had been removed
by commit 4d50b6acbd ("examples/vhost: adapt Tx routing to lib"), but
had forgot to remove the definition.

Fixes: 4d50b6acbd ("examples/vhost: adapt Tx routing to lib")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:40 +02:00
Yuanhan Liu
68363d8585 examples/vhost: remove the non-working zero copy code
It's reported that it's has not been working for a long while. And due
to it's complex, it's better to redesign it than to fix it to make it
work again.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 20:22:39 +02:00
Jianfeng Tan
2963d99a8b virtio: fix memory leak of virtqueue memzones
When virtio was proposed in DPDK, there is no API to free memzones.
But this has changed since rte_memzone_free() has been implemented by
commit ff909fe21f ("mem: introduce memzone freeing").

This patch is to make sure memzones in struct virtqueue, like mz and
virtio_net_hdr_mz, are freed when queue is released or setup fails.

Fixes: c1f86306a0 ("virtio: add new driver")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 11:22:39 -07:00
Jianfeng Tan
4166bbf631 virtio: simplify queue allocation
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 11:22:33 -07:00
Jianfeng Tan
62a785a68e virtio: fix overwritten driver flags
The "drv_flags" is set with device as the input, which means different
device (say, modern vs legacy) could end up with a different value. And
the fact that "drv_flags" is shared by all devices means that every time
we add a new device, it simply overwrites the value configured from the
last device.

Therefore, when two virtio devices have different flags, it may lead to
wrong result, such as virtio would set irq config when it's not supported.

Making the flag per device (using "dev->data->dev_flags") could let us
have different value for each device, which would avoid the above issue.

Fixes: da978dfdc4 ("virtio: use port IO to get PCI resource")

Reported-by: David Marchand <david.marchand@6wind.com>
Suggested-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:57:10 -07:00
Huawei Xie
e928fd0bb0 virtio: optimize avail ring update
Avail ring is updated by the frontend and consumed by the backend.
There are frequent core to core cache transfers for the avail ring.

This optmization avoids avail ring entry index update if the entry
already holds the same value.
As DPDK virtio PMD implements FIFO free descriptor list (also for
performance reason of CACHE), in which descriptors are allocated
from the head and freed to the tail, with this patch in most cases
avail ring will remain the same, then it would be valid in both caches
of frontend and backend.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:57:10 -07:00
Tetsuya Mukawa
8180554d82 vhost: fix linkage of driver with library
Currently, vhost PMD doesn't have linkage for librte_vhost, even though
it depends on librte_vhost APIs. This causes a linkage error if below
conditions are fulfilled.

 - DPDK libraries are compiled as shared libraries.
 - DPDK application doesn't link librte_vhost.
 - Above application tries to link vhost PMD using '-d' DPDK option.

The patch adds linkage for librte_vhost to vhost PMD not to cause an
above error.

Fixes: ee584e9710 ("vhost: add driver on top of the library")

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:57:10 -07:00
Panu Matilainen
c6417ce61f mk: add build-time library directory to linker path
This is a pre-requisite for adding DT_NEEDED dependencies
between internal libraries.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
2016-05-10 10:57:10 -07:00
Huawei Xie
fac0b224c8 virtio: fix mbuf headroom size check
check merge-able header as it is supported.
previously we don't support merge-able feature, so non merge-able
header is checked.

Fixes: 13ce5e7eb9 ("virtio: mergeable buffers")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:53:28 -07:00
Jianfeng Tan
2c0eb46f51 virtio: fix segfault on Tx desc flags setup
After the do-while loop, idx could be VQ_RING_DESC_CHAIN_END (32768)
when it's the last vring desc buf we can get. Therefore, following
expresssion could lead to a segfault error, as it tries to access
beyond the desc memory boundary.

    start_dp[idx].flags &= ~VRING_DESC_F_NEXT;

This bug could be reproduced easily with "set fwd txonly" in the
guest PMD, where the dequeue on host is slower than the guest Tx,
that running out of free desc buf is pretty easy.

The fix is straightforward and easy, just remove it, as we have
already set desc flags properly inside the do-while loop.

Fixes: dd856dfcb9 ("virtio: use any layout on Tx")

[Yuanhan Liu: commit log reword]
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:53:28 -07:00
Jianfeng Tan
e908312704 virtio: fix newline under debug mode
Issue: output of appliations and debug info of DPDK may be mixed up
in same line when enabling below debug options of virtio:
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER

This patch adds "\n" in the tail of definitions like PMD_RX_LOG,
PMD_TX_LOG, and PMD_DRV_LOG, and removes some "\n" when using these
macros.

Fixes: c1f86306a0 ("virtio: add new driver")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-05-10 10:52:01 -07:00
Jingjing Wu
db340cf2ef i40e: fix VLAN stripping from inner header
Previously, for tunnel packets, such as VXLAN/NVGRE, the vlan
tags of the inner header will be stripped without putting vlan
info to descriptor, what is not expected behaviour.
This patch fixes it by changing hardware configuration to leave
the inner packet alone.

Fixes: 4861cde461 ("i40e: new poll mode driver")
Fixes: a778a1fa2e ("i40e: set up and initialize flow director")

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
2016-05-06 15:51:22 +02:00
Beilei Xing
b9a10de8d3 i40evf: expose queue info functions reused from PF
Add three new functions to the vf ops structure:
* rx_queue_count
* rxq_info_get
* txq_info_get
In all cases, the corresponding PF APIs can be reused.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2016-05-06 15:51:22 +02:00
John Daley
4a3259d620 enic: fix offset for Rx mbuf data
The code to provide mbufs for RX used m->data_off instead of
RTE_PKTMBUF_HEADROOM as the position inside the mbuf for the data to be
written. As the mbuf is uninitialised, this could potentially cause Rx
data to be placed at the wrong address in the mbuf - or even outside it.

Fixes: 947d860c82 ("enic: improve Rx performance")

Signed-off-by: John Daley <johndale@cisco.com>
2016-05-06 15:51:22 +02:00
Alejandro Lucero
6e138189ce nfp: expose device hotplug capability
RTE_PCI_DRV_DETACHABLE flag is required to indicate that a device
can be detached during execution.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2016-05-06 15:51:22 +02:00
Alejandro Lucero
142854c621 nfp: fix freeing multi-mbuf packets
mbufs were not properly released post-tx when they are chained.

Fixes: b812daadad ("nfp: add Rx and Tx")

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2016-05-06 15:51:22 +02:00
Alejandro Lucero
ceccc5e78b nfp: serialize access for hardware reconfiguration
Some apps calling some functions from different threads at the
same time could lead to reconfig problems. Reconfig mechanism is
based on a hardware queue where incrementing a counter signals the
firmware to do the reconfig. If there is a second increment before the
first one has been processed the firmware will stop and a device
reset is necessary.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2016-05-06 15:51:22 +02:00
Panu Matilainen
aace9d0bcf mk: cleanup leftover references to malloc library
librte_malloc was long since merged into librte_eal, mop up the
leftover references to it from driver Makefiles.

Fixes: 2f9d47013e ("mem: move librte_malloc to eal/common")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>

Applied with qede Makefile update too.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2016-05-06 15:51:22 +02:00
Björn Töpel
1bbcc5d211 i40evf: report error for unsupported CRC stripping config
On hosts running a non-DPDK PF driver, the VF has no means of changing
the HW CRC strip setting for a RX queue. It's implicitly enabled.

This patch checks if the host is running a non-DPDK PF kernel driver,
and returns an error, if HW CRC stripping was not requested in the port
configuration.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2016-05-06 15:51:22 +02:00
Daniel Mrzyglod
39c7226265 i40e: remove unneeded NULL check
In i40evf_config_vlan_pvid the check for NULL for the dev value is
unnecessary, since this value is passed in from the ethdev API which
will ensure that a valid rte_eth_dev structure is provided.
Furthermore, all code paths leading to this function already use the
dev value.

Issue identified by Coverity.
  Coverity ID 13302:
  There may be a null pointer dereference, or else the comparison against
  null is unnecessary.

  In i40evf_config_vlan_pvid: All paths that lead to this null pointer
  comparison already dereference the pointer earlier

Fixes: 2b12431b53 ("i40e: add vlan stripping and insertion to VF")

Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2016-05-06 15:51:22 +02:00
Bruce Richardson
35245e4438 i40e: simplify SSE packet length extraction code
In Table 8-16 of the "Intel® Ethernet Controller XL710 Datasheet" it is
stated that when the whole packet is written to a single buffer, the
header length field in the descriptor will be 0. This means that when
extracting the packet/data_len field from the descriptor in the driver
we do not need to mask out the extra header-length bits.

Inside the vector driver, this reduces the need to pull all four pktlen
fields into a single register to work on. Instead of a shift and mask,
we now need to only do a shift. Therefore, we can work on each descriptor
independently, processing each using one shift intrinsic and a blend.

This change makes the code shorter and easier to read, so we can pull it
into the main descriptor processing loop instead of needing its own
function. This in turn makes the descriptor processing in the loop as a
whole slightly easier to read as it's more linear.

In terms of performance, in testing this change shows little effect, with
single-core perf tests showing a very slight improvement.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
2016-05-06 15:51:22 +02:00
Bruce Richardson
0b6493fbb0 i40e: improve performance of vector PMD
An analysis of the i40e code using Intel® VTune™ Amplifier 2016 showed
that the code was unexpectedly causing stalls due to "Loads blocked by
Store Forwards". This can occur when a load from memory has to wait
due to the prior store being to the same address, but being of a smaller
size i.e. the stored value cannot be directly returned to the loader.
[See ref: https://software.intel.com/en-us/node/544454]

These stalls are due to the way in which the data_len values are handled
in the driver. The lengths are extracted using vector operations, but those
16-bit lengths are then assigned using scalar operations i.e. 16-bit
stores.

These regular 16-bit stores actually have two effects in the code:
* they cause the "Loads blocked by Store Forwards" issues reported
* they also cause the previous loads in the RX function to actually be a
load followed by a store to an address on the stack, because the 16-bit
assignment can't be done to an xmm register.

By converting the 16-bit store operations into a sequence of SSE blend
operations, we can ensure that the descriptor loads only occur once, and
avoid both the additional stores and loads from the stack, as well as the
stalls due to the blocked loads.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
2016-05-06 15:51:22 +02:00
Bruce Richardson
ad981b6c7a i40e: require SSE4.1 support for vector driver
Later commits to improve the driver will make use of the SSE4.1
_mm_blend_epi16 intrinsic, so:
* set the compilation level to always have SSE4.1 support,
* and add in a runtime check for SSE4.1 as part of the condition checks
  for vector driver selection.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
2016-05-06 15:51:22 +02:00
Rami Rosen
4bc0650ef6 i40e: remove redundant fdir forward declarations
This patch removes several redundant forward declarations
in i40e_fdir.c.

Fixes: a778a1fa2e ("i40e: set up and initialize flow director")

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
2016-05-06 15:51:22 +02:00