981 Commits

Author SHA1 Message Date
Jingjing Wu
d69be32d4d ethdev: structures to add or delete flow director
define structures to add or delete flow director filter
  - struct rte_eth_fdir_filter

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-25 00:06:03 +01:00
Jingjing Wu
f05ec7d77e i40e: initialize flow director flexible payload setting
set flexible payload related registers to default value at initialization time.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-25 00:05:54 +01:00
Jingjing Wu
71d35259ff i40e: tear down flow director
release fortville resources on flow director, includes
 - queue 0 pair release
 - release vsi

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 23:47:05 +01:00
Jingjing Wu
a778a1fa2e i40e: set up and initialize flow director
set up fortville resources to support flow director, includes
 - queue 0 pair allocated and set up for flow director
 - create vsi
 - reserve memzone for flow director programming packet

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 23:46:00 +01:00
Helin Zhang
2abd11d2e9 i40evf: support querying and updating redirection table
Support of updating/querying redirection table has been added for VF.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 23:15:51 +01:00
Helin Zhang
66c594904a ethdev: support multiple sizes of redirection table
40G NICs support redirection table sizes (128/512/64 entries) that
differ from the fixed 128 entries of 1G and 10G NICs, so support
for multiple redirection table sizes is needed.
This patch includes:
* Redefine 'struct rte_eth_rss_reta' in ethdev.
  - To 'struct rte_eth_rss_reta_entry64' which contains 64
    entries and 64 bits mask.
  - Array of above new structure can be used for any number of
    redirection table entries, as long as the number is multiple
    of 64. This is quite flexible for the future expanding of
    redirection table.
* Redefinition of relevant interfaces in ethdev.
  - Interface of reta update has been redefined with new parameters.
  - Interface of reta query has been redefined with new parameters.
* Rework of 1G PMD in igb.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 10G PMD in ixgbe.
  - reta update has been reworked.
  - reta query has been reworked.
* Rework of 40G PMD (PF only) in i40e.
  - reta update has been reworked.
  - reta query has been reworked.
* Implement relevant commands in testpmd.
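
A minimal sketch of how an array of 64-entry groups can address a redirection table of any size that is a multiple of 64. The type `reta_entry64` and the helper `reta_set` are hypothetical stand-ins written for illustration; they are not the actual 'struct rte_eth_rss_reta_entry64' definition or an ethdev call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RETA_GROUP_SIZE 64 /* entries per group, as described above */

/* Hypothetical stand-in for 'struct rte_eth_rss_reta_entry64'. */
struct reta_entry64 {
	uint64_t mask;                 /* bit i set => reta[i] is to be applied */
	uint8_t reta[RETA_GROUP_SIZE]; /* queue index for each entry */
};

/*
 * Map redirection table entry 'idx' to 'queue' in an array of groups
 * covering 'reta_size' entries. Returns 0 on success, -1 if the size is
 * not a multiple of 64 or the index is out of range.
 */
static int
reta_set(struct reta_entry64 *conf, uint16_t reta_size,
	 uint16_t idx, uint8_t queue)
{
	assert(conf != NULL);
	if (reta_size % RETA_GROUP_SIZE != 0 || idx >= reta_size)
		return -1;

	/* Group number and position inside the group. */
	conf[idx / RETA_GROUP_SIZE].mask |=
		UINT64_C(1) << (idx % RETA_GROUP_SIZE);
	conf[idx / RETA_GROUP_SIZE].reta[idx % RETA_GROUP_SIZE] = queue;
	return 0;
}
```

With this layout a 512-entry table is simply an array of 8 groups, and a 128-entry table an array of 2.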

Test report: http://dpdk.org/ml/archives/dev/2014-November/008362.html

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Erlu Chen <erlu.chen@intel.com>
2014-11-24 22:59:15 +01:00
Helin Zhang
a887690986 i40e: add redirection table size in device info
The 'dev_infos_get' ops now return the redirection table size for
both PF and VF. They also return the default RX/TX configurations
of the VF, which were previously missing.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 22:59:15 +01:00
Helin Zhang
2144f6630f ixgbe: add redirection table size in device info
As more and more information differs between PF and VF, the
'dev_infos_get' ops have been implemented separately for each. In
addition, they now return the redirection table size.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 22:59:15 +01:00
Helin Zhang
1f35bdcbfd igb: add redirection table size in device info
As more and more information differs between PF and VF, the
'dev_infos_get' ops have been implemented separately for each. In
addition, a new field 'reta_size' has been added to
'struct rte_eth_dev_info' for returning the redirection table size.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 22:59:15 +01:00
Helin Zhang
03fce3d063 i40e: support setting hash lookup table size
Add support for setting the hash lookup table size according
to the hardware capability.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 22:59:14 +01:00
Helin Zhang
9097952eae i40evf: fix code style
Fix of several code style issues.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-24 22:59:14 +01:00
Declan Doherty
a45b288ef2 bond: support link status polling
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
2014-11-24 21:44:02 +01:00
Declan Doherty
620f98d66f bond: free mbufs on Tx burst failure
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
2014-11-24 21:43:50 +01:00
Declan Doherty
2a61ae793a bond: fix naming inconsistency
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-24 21:43:42 +01:00
Declan Doherty
2493e691c7 bond: remove switch statement from Rx burst
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2014-11-24 21:43:12 +01:00
Declan Doherty
76d29903f5 bond: support link status interrupt
Adding support for lsc interrupt from bonded device to link
bonding library with supporting unit tests in the test application.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: SunX Jiajia <sunx.jiajia@intel.com>
2014-11-24 21:40:29 +01:00
John W. Linville
364e08f2bb af_packet: add PMD for AF_PACKET-based virtual devices
This is a Linux-specific virtual PMD driver backed by an AF_PACKET
socket.  This implementation uses mmap'ed ring buffers to limit copying
and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
AF_PACKET is used for frame reception.  In the current implementation,
Tx and Rx queues are always paired, and therefore are always equal
in number -- changing this would be a Simple Matter Of Programming.

Interfaces of this type are created with a command line option like
"--vdev=eth_af_packet0,iface=...".  There are a number of options available
as arguments:

 - Interface is chosen by "iface" (required)
 - Number of queue pairs set by "qpairs" (optional, default: 1)
 - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
 - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
 - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)

Signed-off-by: John W. Linville <linville@tuxdriver.com>
[Thomas: disable because of incompatibility with some kernels]
2014-11-24 16:39:49 +01:00
Sergio Gonzalez Monroy
9c2c27f8c4 cmdline: fix for bsd
Some features of the cmdline were broken in FreeBSD as a result of
termios not being compiled.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-24 13:17:49 +01:00
Sergio Gonzalez Monroy
89a233209e xenvirt: fix reference to old mbuf field
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
data is not an mbuf field anymore.

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-24 13:17:49 +01:00
Pawel Wodkowski
a1a57f3e11 alarm: make cancellation thread-safe
It eliminates a race between threads using rte_alarm_cancel() and
rte_alarm_set().

Signed-off-by: Pawel Wodkowski <pawelx.wodkowski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-24 13:17:49 +01:00
Balazs Nemeth
5bc3c265cb table: fix pointer calculations at initialization
During initialization of rte_table_hash_ext and rte_table_hash_lru, a
contiguous region of memory is allocated to store meta data, buckets,
extended buckets, keys, stack of keys, stack of extended buckets and
data entries. The size of each region depends on the hash table
configuration.

The address of each region is calculated using offsets relative to the
beginning of the memory region. Without this patch, the offsets
contain the size of the table meta data (sizeof(struct
rte_table_hash)). These addresses are stored in pointers which are
used when entries are added or deleted and lookups are performed.

Instead of adding these offsets to the address of the beginning of the
memory region, they are added to the address of the end of the meta
data (= address of the beginning of the memory region + sizeof(struct
rte_table_hash)). The resulting addresses are off by sizeof(struct
rte_table_hash) bytes. As a consequence, memory past the allocated
region can be accessed by the add, delete and lookup operations.

This patch corrects the address calculation by not including the size
of the meta data in the offsets.
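
The corrected calculation might be sketched as follows. The struct `table_meta` and the function `table_create` are hypothetical and much reduced; the real struct rte_table_hash has more regions and different fields. The point shown is that offsets added to the end of the meta data must start at zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table header followed by the data regions. */
struct table_meta {
	size_t total_size; /* size of the whole allocation in bytes */
	uint8_t *buckets;  /* start of the bucket region */
	uint8_t *keys;     /* start of the key region */
	uint8_t memory[];  /* regions begin right after the meta data */
};

static struct table_meta *
table_create(size_t bucket_bytes, size_t key_bytes)
{
	size_t total = sizeof(struct table_meta) + bucket_bytes + key_bytes;
	/*
	 * Offsets are relative to t->memory, so they must NOT include
	 * sizeof(struct table_meta). Counting it twice would push every
	 * region, and the last one past the end of the allocation.
	 */
	size_t bucket_offset = 0;
	size_t key_offset = bucket_offset + bucket_bytes;
	struct table_meta *t = calloc(1, total);

	if (t == NULL)
		return NULL;
	t->total_size = total;
	t->buckets = &t->memory[bucket_offset];
	t->keys = &t->memory[key_offset];

	/* Sanity check: the last region ends inside the allocation. */
	assert(t->keys + key_bytes <= (uint8_t *)t + total);
	return t;
}
```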

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2014-11-24 13:17:49 +01:00
Balazs Nemeth
8595428e50 table: fix incorrect initialization
During initialization of rte_table_hash_ext and rte_table_hash_lru,
t->data_size_shl is calculated.  This member contains the number of
bits to shift left during calculation of the location of entries in
the hash table.  To determine the number of bits to shift left, the
size of the entry (as provided to the rte_table_hash_ext_create and
rte_table_hash_lru_create) has to be used instead of the size of the
key.
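
As an illustration only, the shift could be derived as below. The helpers `size_to_shift` and `entry_offset` are hypothetical names, and the sketch assumes the entry size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of bits to shift left so that (index << shift) == index * size.
 * 'size' is assumed to be a non-zero power of two.
 */
static uint32_t
size_to_shift(uint32_t size)
{
	uint32_t shift = 0;

	assert(size != 0 && (size & (size - 1)) == 0);
	while ((UINT32_C(1) << shift) < size)
		shift++;
	return shift;
}

/*
 * Byte offset of data entry 'index'. The shift is derived from the
 * entry size; deriving it from the key size would compute wrong
 * locations whenever the two sizes differ.
 */
static uint32_t
entry_offset(uint32_t index, uint32_t entry_size)
{
	return index << size_to_shift(entry_size);
}
```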

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2014-11-24 13:17:49 +01:00
Balazs Nemeth
14f2544cda table: fix checking extended buckets in unoptimized case
If a key is not found in a bucket and the bucket has been extended,
the extended buckets also have to be checked for potentially matching
keys. The extended buckets are checked at the end of the lookup. In
most cases, this logic is skipped as it is uncommon to have buckets in
an extended state.

In case the lookup is performed with less than 5 packets, an
unoptimized version is run instead (the optimized version requires at
least 5 packets). The extended buckets should also be checked in this
case instead of simply ignoring the extended buckets.
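
A simplified, hypothetical sketch of a single-key lookup that follows the chain of extended buckets. The struct `ext_bucket` and the function `lookup_one` are invented for illustration and do not reflect the actual bucket layout in librte_table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEYS_PER_BUCKET 4

/* Hypothetical bucket with a link to its extension, if any. */
struct ext_bucket {
	struct ext_bucket *next;        /* extended bucket, or NULL */
	uint32_t keys[KEYS_PER_BUCKET];
	int32_t vals[KEYS_PER_BUCKET];
	uint8_t used;                   /* bit i set => slot i holds a key */
};

/* Returns 1 and stores the value in *val on a hit, 0 on a miss. */
static int
lookup_one(const struct ext_bucket *head, uint32_t key, int32_t *val)
{
	const struct ext_bucket *b;
	int i;

	assert(val != NULL);
	/* Walk the first bucket and every extended bucket behind it. */
	for (b = head; b != NULL; b = b->next) {
		for (i = 0; i < KEYS_PER_BUCKET; i++) {
			if (((b->used >> i) & 1) && b->keys[i] == key) {
				*val = b->vals[i];
				return 1;
			}
		}
	}
	return 0;
}
```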

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2014-11-24 13:17:49 +01:00
Balazs Nemeth
fb20a4bd0f table: fix empty bucket removal during entry deletion
When an entry is deleted from an extensible rte_table_hash, the bucket
that stored the entry can become empty. If this is the case, the
bucket needs to be removed from the chain of buckets.

During removal of the bucket, the chain should be updated first. If
the bucket that will be removed is cleared first, the chain is broken
and the information to update the chain is lost.
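
The ordering can be shown with a small sketch on a singly linked chain. The struct `chain_bucket` and the function `bucket_remove` are hypothetical, not the code touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bucket in a singly linked chain. */
struct chain_bucket {
	struct chain_bucket *next;
	int n_entries;
};

/* Unlink the empty bucket 'b', which directly follows 'prev'. */
static void
bucket_remove(struct chain_bucket *prev, struct chain_bucket *b)
{
	assert(prev != NULL && b != NULL && prev->next == b);

	/* 1. Update the chain while b->next is still intact. */
	prev->next = b->next;
	/* 2. Only now clear the removed bucket. */
	memset(b, 0, sizeof(*b));
}
```

Swapping the two steps would copy an already zeroed `b->next` into `prev->next` and cut off the rest of the chain.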

Signed-off-by: Balazs Nemeth <balazs.nemeth@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2014-11-24 13:17:49 +01:00
Yong Wang
2e84937377 vmxnet3: leverage data ring on Tx path
Data_ring is a pre-mapped guest ring buffer that vmxnet3
backend has access to directly without a need for buffer
address mapping and unmapping during packet transmission.
It is useful in reducing device emulation cost on the tx
path.  There is some additional cost in the guest
driver for the packet copy, but overall it's a win.

This patch leverages the data_ring for packets with a
length less than or equal to the data_ring entry size
(128B).  For larger packets, we won't use the data_ring
as that requires one extra tx descriptor and it's not
clear if doing this will be beneficial.
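
The size-based decision might look roughly like this. The names `data_ring_entry` and `tx_try_data_ring` are hypothetical; the sketch shows only the length check and the copy, not the descriptor handling of the real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DATA_RING_ENTRY_SIZE 128 /* bytes, as stated above */

/* Hypothetical pre-mapped data ring slot. */
struct data_ring_entry {
	uint8_t data[DATA_RING_ENTRY_SIZE];
};

/*
 * Returns 1 if the packet was copied into the data ring slot, or 0 if
 * it is too large and the caller should use the regular Tx path.
 */
static int
tx_try_data_ring(struct data_ring_entry *ring, uint32_t slot,
		 const void *pkt, uint32_t pkt_len)
{
	assert(ring != NULL && pkt != NULL);
	if (pkt_len > DATA_RING_ENTRY_SIZE)
		return 0;
	memcpy(ring[slot].data, pkt, pkt_len);
	return 1;
}
```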

Performance results show that this patch significantly
boosts vmxnet3 64B tx performance (pkt rate) for l2fwd
application on an Ivy Bridge server by >20% at which
point we start to hit some bottleneck on the rx side.

Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:32:27 +01:00
Yong Wang
14680e3747 vmxnet3: improve Rx performance
This patch includes two small performance optimizations
on the rx path:

(1) It adds unlikely hints on various infrequent error
paths to the compiler to make branch prediction more
efficient.

(2) It also moves a constant assignment out of the pkt
polling loop.  This saves one branch per packet.
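
Both ideas can be illustrated with a hypothetical polling loop. The types and the function `rx_poll` are invented for this sketch, and `UNLIKELY` is a local macro built on the GCC/Clang `__builtin_expect` builtin (DPDK provides its own `unlikely()` macro for the same purpose).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tell the compiler that the condition is expected to be false. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical, much reduced descriptor, packet and queue types. */
struct rx_desc { uint16_t len; uint8_t err; };
struct rx_pkt { uint16_t len; uint8_t port; };
struct rx_queue { uint8_t port_id; };

/* Copies the good descriptors to 'out' and returns how many there were. */
static unsigned
rx_poll(const struct rx_queue *q, const struct rx_desc *descs, unsigned n,
	struct rx_pkt *out)
{
	unsigned i, nb_rx = 0;
	uint8_t port;

	assert(q != NULL && descs != NULL && out != NULL);
	port = q->port_id; /* loop-invariant value read once, before the loop */

	for (i = 0; i < n; i++) {
		/* Infrequent error path, hinted as unlikely. */
		if (UNLIKELY(descs[i].err != 0))
			continue;
		out[nb_rx].len = descs[i].len;
		out[nb_rx].port = port;
		nb_rx++;
	}
	return nb_rx;
}
```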

Performance evaluation configs:
- On the DPDK-side, it's running some l3 forwarding app
inside a VM on ESXi with one core assigned for polling.
- On the client side, pktgen/dpdk is used to generate
64B tcp packets at line rate (14.8M PPS).

Performance results on a Nehalem box (4cores@2.8GHzx2)
shown below.  CPU usage is collected factoring out the
idle loop cost.
- Before the patch, ~900K PPS with 65% CPU of a core
used for DPDK.
- After the patch, only 45% of a core used, while
maintaining the same packet rate.

Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:32:01 +01:00
Yong Wang
d768f6273c vmxnet3: add Rx check offloads
Only supports IPv4 so far.

Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:31:43 +01:00
Yong Wang
5aecdc17a9 vmxnet3: fix stop/restart
This change makes vmxnet3 consistent with other pmds in
terms of dev_stop behavior: rather than releasing tx/rx
rings, it only resets the ring structure and releases the
pending mbufs.

Verified with various tests (test-pmd and pktgen) over
vmxnet3 that dev stop/restart works fine.

Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:31:18 +01:00
Yong Wang
3604496377 vmxnet3: add vlan Tx offload
Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:31:06 +01:00
Yong Wang
b3e03223f1 vmxnet3: fix vlan Rx stripping
Shouldn't reset vlan_tci to 0 if a valid VLAN tag is stripped.

Signed-off-by: Yong Wang <yongwang@vmware.com>
2014-11-14 17:30:51 +01:00
Thomas Monjalon
4b9bb6b71a acl: fix code typos
Replace indicies by indices.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-14 17:23:50 +01:00
Thomas Monjalon
7eef9194ab acl: fix comments typos
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-14 17:23:50 +01:00
Qinglai Xiao
ecb6c4559e distributor: enhance and fix tag matching
With introduction of in_flight_bitmask, the whole 32 bits of tag can be
used. Furthermore, this patch fixes the integer overflow when finding
the matched tags.
The maximum number of workers is now defined as 64, which is the length of a
double-word. The link between number of workers and RTE_MAX_LCORE is
now removed. Compile time check is added to ensure the
RTE_DISTRIB_MAX_WORKERS is less than or equal to size of double-word.
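
A hedged sketch of tag matching with a 64-bit in-flight bitmask and a compile-time bound on the worker count. `MAX_WORKERS` and `match_tag` are hypothetical names that only mirror the idea described above, not the librte_distributor implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORKERS 64 /* mirrors the role of RTE_DISTRIB_MAX_WORKERS */

/* Compile-time check: one bit per worker must fit in the bitmask. */
_Static_assert(MAX_WORKERS <= sizeof(uint64_t) * CHAR_BIT,
	       "worker bitmask must fit in 64 bits");

/*
 * Returns a bitmask of workers that currently process 'tag'. Since the
 * bitmask says which workers are busy, no tag value has to be reserved
 * to mean "idle", so all 32 bits of the tag are usable.
 */
static uint64_t
match_tag(const uint32_t *in_flight_tags, uint64_t in_flight_bitmask,
	  uint32_t tag)
{
	uint64_t match = 0;
	unsigned i;

	assert(in_flight_tags != NULL);
	for (i = 0; i < MAX_WORKERS; i++) {
		/* A plain (1 << i) is an int shift and overflows for
		 * i >= 31, so a 64-bit constant is shifted instead. */
		if (in_flight_tags[i] == tag)
			match |= UINT64_C(1) << i;
	}
	return match & in_flight_bitmask;
}
```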

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-13 12:26:10 +01:00
Qinglai Xiao
9f2e99d171 mbuf: add usr alias for hash
This field is added for librte_distributor. Users of librte_distributor
are advised to set the value of mbuf->hash.usr before calling
rte_distributor_process. The value of usr is the tag that identifies
the flow.

Signed-off-by: Qinglai Xiao <jigsaw@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-13 12:26:10 +01:00
Helin Zhang
8dae34c15c eal: update i40e supported devices
According to the changes of the i40e base driver, two device
IDs (0x1573, 0x1582) are not supported anymore, and one new
device ID (0x1586) is supported. The list of i40e device IDs
DPDK supported should be modified accordingly.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-13 10:26:00 +01:00
Cunming Liang
5974ee01f4 ixgbe: fix reconfiguration of Rx method
The scattered_rx configuration is updated in dev_start().
For the sequence "stop, reconfigure, then restart", the new
configuration is expected to be used.
But during reconfiguration, the stored data may still be the old one.
This patch always clears the configuration in dev_stop(),
which ensures that the best Rx routine is always selected.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
2014-11-13 00:48:16 +01:00
Cunming Liang
5e8ae7fc91 ethdev: fix Rx/Tx return in debug mode
By definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
return the number of packets.
When RTE_LIBRTE_ETHDEV_DEBUG is enabled, the retval of FUNC_PTR_OR_ERR_RTE was
set to -ENOTSUP, which is confusing.
This patch always returns 0, whether there is no packet or an error occurred.

Signed-off-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-13 00:48:16 +01:00
Cunming Liang
ec3d82db2d ether: new function to format mac address
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
2014-11-13 00:48:16 +01:00
Ouyang Changchun
90924caf08 vhost: enable promiscuous and multicast
This is to enable user space vhost receiving and forwarding broadcast
and multicast packets:
Use new option in command line to enable promisc mode;
Enable 2 bits in VMDQ RX mode: ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-11-12 00:10:23 +01:00
Ouyang Changchun
cd91b7348d virtio: support promiscuous and allmulticast
Add code to support enabling and disabling promiscuous and allmulticast modes.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
2014-11-12 00:10:23 +01:00
Ouyang Changchun
38da13a9c3 ixgbe: VMDQ Rx mode
Configure the PFVML2FLT register in the ixgbe PMD to enable it to receive broadcast and multicast packets;
also factorize the common logic with ixgbe_set_pool_rx_mode.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-12 00:10:23 +01:00
Ouyang Changchun
8d74cfc4d2 igb: VMDQ Rx mode
Configure the VM offload register in the igb PMD to enable it to receive broadcast and multicast packets.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-12 00:10:17 +01:00
Ouyang Changchun
7e1fceb51d ethdev: VMDQ Rx mode
Add a VMDQ Rx mode field to the Rx config struct; it holds flags from ETH_VMDQ_ACCEPT_*.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-12 00:10:12 +01:00
Jia Yu
5bad0b917e kni: add build-time checks for mbuf mapping
This check is added to avoid breakage from future data structure changes.

Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-10 10:28:46 +01:00
Thomas Monjalon
4ffab9b998 kni: fix build
Since commit 08b563ffb1 ("mbuf: replace data pointer by an offset"),
KNI vhost compilation (CONFIG_RTE_KNI_VHOST=y) was broken.

rte_pktmbuf_mtod() is not used in the kernel context but is replaced
by a simple addition of the base address and the offset.
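
The replacement amounts to pointer arithmetic like the following. The struct `mini_mbuf` and the helper `mbuf_data` are hypothetical, reduced stand-ins; only the "base address plus offset" idea is taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of an mbuf that stores a data offset. */
struct mini_mbuf {
	void *buf_addr;    /* start of the buffer */
	uint16_t data_off; /* offset of the packet data inside the buffer */
};

/* Address of the packet data: base address plus offset. */
static inline void *
mbuf_data(const struct mini_mbuf *m)
{
	assert(m != NULL);
	return (uint8_t *)m->buf_addr + m->data_off;
}
```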

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2014-11-10 10:28:46 +01:00
Bruce Richardson
98d5a1318a distributor: add comments
Add in some additional comments around more complex areas of the code
so as to make the code easier to read and understand.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2014-11-07 15:04:59 +01:00
David Marchand
1c1dc182da eal: fix C++ compilation after headers rework
Following the big headers rework, all C++ stuff has moved to arch-specific
headers. The generic headers should not contain this so that this is done only
once.
There was a remaining #ifdef __cplusplus in "eal: split CPU cycle operation to
architecture specific" (fa4001c30e).

Reported-by: Keunhong Lee <dlrmsghd@gmail.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-07 11:57:16 +01:00
Thomas Monjalon
ee63ac39f8 i40e: fix build with icc
Since commit d798a94 ("mac vlan filter"),
ICC reports this error:
	lib/librte_pmd_i40e/i40e_ethdev.c(1763): error #188:
	enumerated type mixed with another type

Indeed, RTE_ETH_FILTER_NONE comes from enum rte_filter_type but
enum rte_filter_op is expected.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2014-11-07 00:08:27 +01:00
Helin Zhang
d6b1972909 i40evf: support configurable crc stripping
Configurable CRC stripping needs to be supported in VF,
and the configuration should be finally set in relevant
RX queue context with PF host support.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-06 23:50:14 +01:00
Helin Zhang
9c7aeb45f4 i40e: support configurable crc stripping
Support of configurable crc stripping in context of
VF RX queues.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2014-11-06 23:50:14 +01:00