Commit Graph

62 Commits

Author SHA1 Message Date
Rich Lane
610e0a8b62 virtio: use zeroed memory for simple Tx header
For simple TX the virtio-net header must be zeroed, but it was using memory
that had been initialized with indirect descriptor tables. This resulted in
"unsupported gso type" errors from librte_vhost.

We can use the same memory for every descriptor to save cachelines in the
vswitch.

Fixes: 6dc5de3a ("virtio: use indirect ring elements")

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2016-04-06 12:27:57 +02:00
Marc Sune
1131900006 ethdev: use constants for link duplex
Some duplex values are replaced from 0 to half-duplex when link is down.

Some drivers are still using their own constants for duplex modes.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
2016-04-01 21:38:34 +02:00
Thomas Monjalon
09419f235e ethdev: use constants for link state
Define and use ETH_LINK_UP and ETH_LINK_DOWN where appropriate.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2016-04-01 21:38:34 +02:00
Kyle Larose
3eabd79c50 virtio: fix Rx ring descriptor starvation
Virtio has an mbuf descriptor ring containing mbufs to be used for
receiving traffic. When the host queues traffic to be sent to the guest, it
consumes these descriptors. If none exist, it discards the packet.

The virtio pmd allocates mbufs to the descriptor ring every time it
successfully receives a packet. However, it never does it if it does not
receive a valid packet. If the descriptor ring is exhausted, and the mbuf
mempool does not have any mbufs free (which can happen for various reasons,
such as queueing along the processing pipeline), then the receive call will
not allocate any mbufs to the descriptor ring, and when it finishes, the
descriptor ring will be empty. The ring being empty means that we will
never receive a packet again, which means we will never allocate mbufs to
the ring: we are stuck.

Ultimately, the problem arises because there is a dependency between
receiving packets and making the descriptor ring not be empty, and a
dependency between the descriptor ring not being empty, and receiving
packets.

To fix the problem, this pakes makes virtio always try to allocate mbufs
to the descriptor ring, if necessary, when polling for packets. Do this by
removing the early exit if no packets were received. Since the packet loop
later will do nothing if there are no packets, this is fine.

I reproduced the problem by pushing packets through a pipelined systems
(such as the client_server sample application) after artificially
decreasing the size of the mbuf pool and introducing a delay in a secondary
stage.

Without the fix, the process stops receiving packets fairly quicky. With
the fix, it continues to receive packets.

Fixes: c1f86306a0 ("virtio: add new driver")

Signed-off-by: Kyle Larose <klarose@sandvine.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-03-25 19:01:37 +01:00
Huawei Xie
0bb159ad74 virtio: remove redundant function names in log
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2016-03-16 19:05:46 +01:00
Stephen Hemminger
17cbf09fe1 virtio: optimize Tx enqueue
All the error checks in virtqueue_enqueue_xmit are already done
by the caller. Therefore they can be removed to improve performance.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-03-16 19:05:35 +01:00
Stephen Hemminger
dd856dfcb9 virtio: use any layout on Tx
Virtio supports a feature that allows sender to put transmit
header prepended to data.  It requires that the mbuf be writeable, correct
alignment, and the feature has been negotiatied.  If all this works out,
then it will be the optimum way to transmit a single segment packet.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-03-16 19:05:25 +01:00
Stephen Hemminger
6dc5de3a6a virtio: use indirect ring elements
The virtio ring in QEMU/KVM is usually limited to 256 entries
and the normal way that virtio driver was queuing mbufs required
nsegs + 1 ring elements. By using the indirect ring element feature
if available, each packet will take only one ring slot even for
multi-segment packets.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-03-16 19:05:25 +01:00
Igor Ryzhov
64a7619ee8 virtio: remove broadcast packets from multicast statistics
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>

Applied with coding standards fixes:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2016-03-16 18:52:18 +01:00
Huawei Xie
3b1e3e4e36 virtio: fix descriptors pointing to the same buffer
The virtio_net_hdr desc all pointed to the same buffer. It doesn't cause
issue because in the simple TX mode we don't use the header. This patch
makes the header desc point to different buffer.

Fixes: b4ae9c505f ("virtio: optimize ring layout")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-03-16 18:52:18 +01:00
Bernard Iremonger
c680a4a88c virtio: fix crash in statistics functions
This initialisation of nb_rx_queues and nb_tx_queues has been removed
from eth_virtio_dev_init.

The nb_rx_queues and nb_tx_queues were being initialised in
eth_virtio_dev_init before the tx_queues and rx_queues arrays were
allocated.

The arrays are allocated when the ethdev port is configured and the
nb_tx_queues and nb_rx_queues are initialised.

If any of the following functions were called before the ethdev
port was configured there was a segmentation fault because
rx_queues and tx_queues were NULL:

rte_eth_stats_get
rte_eth_stats_reset
rte_eth_xstats_get
rte_eth_xstats_reset

Fixes: 823ad64795 ("virtio: support multiple queues")

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-03-16 18:52:18 +01:00
Jianfeng Tan
9a0615af77 virtio: fix restart
Fix the issue that virtio device cannot be started after stopped.

The field, hw->started, should be changed by virtio_dev_start/stop instead
of virtio_dev_close.

Fixes: a85786dc81 ("virtio: fix states handling during initialization")

Reported-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
2016-03-16 18:52:18 +01:00
Yuanhan Liu
36ea36efb4 virtio: fix query of legacy features
Declare dst as type uint32_t instead of uint64_t, otherwise, we will get
a random upper 32 bit feature bits, as the following io port read reads
lower 32 bit only. It could lead a feature bits that include VIRTIO_F_VERSION_1
(the 32th bit) for legacy virtio, which is obviously wrong.

Fixes: b8f04520ad ("virtio: use PCI ioport API")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
2016-03-14 23:16:15 +01:00
Huawei Xie
ac5e1d838d virtio: skip error when probing kernel managed device
virtio PMD could use IO port to configure the virtio device without
using UIO/VFIO driver in legacy mode.

There are two issues with previous implementation:
1) virtio PMD will take over the virtio device(s) blindly even if not
intended for DPDK.
2) driver conflict between virtio PMD and virtio-net kernel driver.

This patch checks if there is kernel driver other than UIO/VFIO managing
the virtio device before using port IO.

If legacy_virtio_resource_init fails and kernel driver other than
VFIO/UIO is managing the device, return 1 to tell the upper layer we
don't take over this device.
For all other IO port mapping errors, return -1.

Note than if VFIO/UIO fails, now we don't fall back to port IO.

Fixes: da978dfdc4 ("virtio: use port IO to get PCI resource")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
2016-03-10 00:36:51 +01:00
Ravi Kerur
d6b324c00f mbuf: get DMA address
Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include rte_mbuf.h file directly/indirectly hence no
additioanl header file inclusion is necessary.

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2016-03-04 16:01:15 +01:00
Santosh Shukla
69d308e1c0 virtio: restrict vector Rx/Tx to x86 SSSE3
Temporary implementation to let virtio operate in non-vec mode for archs
which doesn't support _ssse_ cpuflag.

todo:
1) Move virtio_recv_pkts_vec() implementation to
   drivers/virtio/virtio_vec_<arch>.h file.
2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_<arch>.h
   files to provide vectored/non-vectored rx/tx apis.

Fixes: fc3d66212f ("virtio: add vector Rx")
Fixes: c121c8d6d3 ("virtio: add simple Tx")
Fixes: 8d8393fb18 ("virtio: pick simple Rx/Tx")

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-03-03 14:00:28 +01:00
David Marchand
b8f04520ad virtio: use PCI ioport API
Move all os / arch specifics to eal.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-02-16 22:55:44 +01:00
David Marchand
7a66c72d6c virtio: fix check when mapping PCI resources
According to the api, rte_eal_pci_map_device is only successful when
returning 0.

Fixes: 6ba1f63b5a ("virtio: support specification 1.0")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-02-16 22:55:44 +01:00
David Marchand
25294cd3a6 virtio: fix FreeBSD build
Fixes: c52afa68d7 ("virtio: move left PCI stuff in the right file")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2016-02-16 22:55:44 +01:00
Huawei Xie
693f715da4 remove extra parentheses in return statement
fix the error reported by checkpatch:
  "ERROR: return is not a function, parentheses are not required"

remove parentheses in return like:
  "return (logical expressions)"

remove parentheses in return a function like:
  "return (rte_mempool_lookup(...))"

Fixes: 6307b909b8 ("lib: remove extra parenthesis after return")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
2016-02-10 15:47:50 +01:00
Yuanhan Liu
b86af7b1b5 virtio: move ioport macros
virtio_pci.c is the only file references macros VIRTIO_READ/WRITE_REG_X.
Move them there.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:50 +01:00
Yuanhan Liu
6ba1f63b5a virtio: support specification 1.0
Modern (v1.0) virtio pci device defines several pci capabilities.
Each cap has a configure structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.

Firstly, we map the pci resources by rte_eal_pci_map_device().
We then could easily locate a cfg structure by:

    cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;

Therefore, the entrance of enabling modern (v1.0) pci device support
is to iterate the pci capability lists, and to locate some configs
we care; and they are:

- common cfg

  For generic virtio and virtqueue configuration, such as setting/getting
  features, enabling a specific queue, and so on.

- nofity cfg

  Combining with `queue_notify_off' from common cfg, we could use it to
  notify a specific virt queue.

- device cfg

  Where virtio_net_config structure is located.

- isr cfg

  Where to read isr (interrupt status).

If any of above cap is not found, we fallback to the legacy virtio
handling.

If succeed, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing a (or few) specific
configuration space from above 4 cfg structures. And that's basically
how this patch works.

Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.

Last, set the VIRTIO_F_VERSION_1 feature flag.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:50 +01:00
Yuanhan Liu
1905e101dc virtio: retrieve header size from device setting
The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
not hardcode hdr_size to "sizeof(struct virtio_net_hdr)" any more
at virtio_recv_pkts(), otherwise, there would be a mismatch of
hdr size from rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading a packet corruption.

Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
3891f233f7 virtio: switch to 64 bit features
Switch to 64 bit features, which virtio 1.0 supports.

While legacy virtio only supports 32 bit features, it complains aloud
and quit when trying to setting > 32 bit features.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
c52afa68d7 virtio: move left PCI stuff in the right file
virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
d5bbeefca8 virtio: introduce PCI implementation structure
Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementation regarding to a
specific pci action, such as read host status.

With that, this patch reimplements all exported pci functions, in
a way like:

	vtpci_foo_bar(struct virtio_hw *hw)
	{
		hw->vtpci_ops->foo_bar(hw);
	}

So that we need pay attention to those pci related functions only
while adding virtio 1.0 support.

This patch introduced a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: just sets
hw->vtpci_ops to legacy_ops as we don't support 1.0 yet.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
4c2277ff45 virtio: define offset as size_t type
offset arg of vtpci_read/write_dev_config is derived from offsetof(),
which is of size_t type, instead of uint64_t. So, define it as size_t
type.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
c47787cfaa virtio: do not set vring address again at queue startup
As we have already set up it at virtio_dev_queue_setup(), and a vq
restart will not reset the settings.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2016-02-03 16:07:49 +01:00
Yuanhan Liu
6ed346a462 virtio: fix wrong queue index
We should provide VIRTIO_PCI_QUEUE_SEL with vq->vq_queue_idx,
but not vq->queue_id.

vq->queue_id is the queue id from rte_eth_rx/tx_queue_setup(),
which always starts from 0 no matter which queue it is. However,
for virtio, even number is for RX queue, and odd number is for
TX queue.

Fixes: 5382b188fb ("virtio: add queue release")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-12-09 22:02:33 +01:00
Igor Ryzhov
e1cf0d0853 ethdev: fix reset of Rx mbuf allocation failures
The rx_mbuf_alloc_failed counter was only cleared by virtio driver.
Now it is cleared by common rte_eth_stats_reset function for all
drivers at once.

Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
2015-12-07 04:55:31 +01:00
Bernard Iremonger
d15339b928 virtio: fix link state interrupt
call rte_eth_copy_pci_info() after the RTE_PCI_DRV_INTR_LSC
has been initialised.

Fixes: eeefe73f0a ("drivers: copy PCI device info to ethdev data")

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-12-07 01:03:12 +01:00
Stephen Hemminger
43ffe8aa86 virtio: clean up Tx space checks
The space check for transmit ring only needs a single conditional.
I.e only need to recheck for space if there was no space in first check.

This can help performance and simplifies loop.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-12-07 01:03:12 +01:00
Stephen Hemminger
c21835fab8 virtio: fix Rx mbuf initialization
The virtio driver was not initializing all the fields in
the receive mbuf. This would cause bugs where previous usage
of mbuf would leave stale TCI and offload flags.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-12-07 01:03:12 +01:00
Jerin Jacob
4c02e453cc eal: introduce SMP memory barriers
This commit introduce rte_smp_mb(), rte_smp_wmb() and rte_smp_rmb(), in
order to enable memory barriers between lcores.
The patch does not provide any functional change for IA, the goal is to
have infrastructure for weakly ordered machines like ARM to work on DPDK.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2015-11-18 22:44:01 +01:00
Bernard Iremonger
eeefe73f0a drivers: copy PCI device info to ethdev data
Use new function rte_eth_copy_pci_info.
Copy device info for the following pdevs:

bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2015-11-03 18:39:26 +01:00
Ivan Boule
b5b0467ca8 virtio: fix size of MAC address array
Make the virtio PMD allocate the array of unicast MAC addresses with
the maximum of entries (VIRTIO_MAX_MAC_ADDRS) that it exports.

Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
2015-11-03 11:40:58 +01:00
Harry van Haaren
76d4c652e0 virtio: add extended stats
Add xstats() functions and statistic strings to virtio PMD.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
2015-11-03 00:19:25 +01:00
Huawei Xie
8d8393fb18 virtio: pick simple Rx/Tx
simple rx/tx func is chose when merge-able rx is disabled and user
specifies single segment and no offload support.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2015-11-02 15:34:44 +01:00
Huawei Xie
c121c8d6d3 virtio: add simple Tx
Bulk free of mbufs when clean used ring.
Shift operation of idx could be saved if vq_free_cnt means
free slots rather than free descriptors.

TODO: rearrange vq data structure, pack the stats var together so that
we could use one vec instruction to update all of them.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2015-11-02 15:34:03 +01:00
Huawei Xie
fc3d66212f virtio: add vector Rx
With fixed avail ring, we don't need to get desc idx from avail ring.
virtio driver only has to deal with desc ring.
This patch uses vector instruction to accelerate processing desc ring.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2015-11-02 15:33:43 +01:00
Huawei Xie
cab0461234 virtio: fill Rx avail ring with blank mbufs
Add software RX ring in virtqueue.
Add fake_mbuf in virtqueue for wraparound processing.
Fill avail ring with blank mbufs in virtio_dev_vring_start

Add virtio_rxtx.h header file for RTE_VIRTIO_PMD_MAX_BURST.
Would move all rx/tx related declarations into this header file in future.

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2015-11-02 15:32:19 +01:00
Huawei Xie
b4ae9c505f virtio: optimize ring layout
In DPDK based switching environment, mostly vhost runs on a dedicated core
while virtio processing in guest VMs runs on different cores.
Take RX for example, with generic implementation, for each guest buffer,
a) virtio driver allocates a descriptor from free descriptor list
b) modify the entry of avail ring to point to allocated descriptor
c) after packet is received, free the descriptor

When vhost fetches the avail ring, it need to fetch the modified L1 cache from
virtio core, which is a heavy cost in current CPU implementation.

This idea of this optimization is:
    allocate the fixed descriptor for each entry of avail ring, so avail ring will
always be the same during the run.
This removes L1M cache transfer from virtio core to vhost core for avail ring.
(Note we couldn't avoid the cache transfer for descriptors).
Besides, descriptor allocation and free operation is eliminated.
This also makes vector procesing possible to further accelerate the processing.

This is the layout for the avail ring(take 256 ring entries for example), with
each entry pointing to the descriptor with the same index.
                    avail
                    idx
                    +
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 | ... |  254  | 255  |  avail ring
+-+--+-+--+-+-+---------+---+--+---+
  |    |    |       |   |      |
  |    |    |       |   |      |
  v    v    v       |   v      v
+-+--+-+--+-+-+---------+---+--+---+
| 0  | 1  | 2 | ... |  254  | 255  |  desc ring
+----+----+---+-------------+------+
                    |
                    |
+----+----+---+-------------+------+
| 0  | 1  | 2 |     |  254  | 255  |  used ring
+----+----+---+-------------+------+
                    |
                    +

This is the ring layout for TX.
As we need one virtio header for each xmit packet, we have 128 slots available.

                         ++
                         ||
                         ||
+-----+-----+-----+--------------+------+------+------+
|  0  |  1  | ... |  127 || 128  | 129  | ...  | 255  |   avail ring
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 128 | 129 | ... |  255 || 128  | 129  | ...  | 255  |   desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
   |     |            |  ||  |      |             |
   v     v            v  ||  v      v             v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
|  0  |  1  | ... |  127 ||  0   |  1   | ...  | 127  |   desc ring for tx dat
+-----+-----+-----+--------------+------+------+------+
                         ||
                         ||
                         ++

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
2015-11-02 15:31:42 +01:00
Changchun Ouyang
6d7740e2c1 virtio: fix deadloop after wrong config read
The old code adjusts the config bytes we want to read depending on
what kind of features we have, but we later cast the entire buf we
read with "struct virtio_net_config", which is obviously wrong.

The wrong config reading results to a dead loop at virtio_send_command()
while starting testpmd.

The right way to go is to read related config bytes when corresponding
feature is set, which is exactly what this patch does.

Fixes: 823ad64795 ("virtio: support multiple queues")

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
2015-10-26 21:23:53 +01:00
Stephen Hemminger
1e7bd2380f virtio: fix Coverity unsigned warnings
There are some places in virtio driver where uint16_t or int are used
where it would be safer to use unsigned.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2015-10-21 16:14:02 +02:00
Stephen Hemminger
954ea11540 virtio: do not report link state feature unless available
If host does not support virtio link state (like current DPDK vhost)
then don't set the flag. This keeps applications from incorrectly
assuming that link state is available when it is not. It also
avoids useless "guess what works in the config".

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-10-21 16:12:32 +02:00
Bernard Iremonger
ce8e121870 virtio: fix crash when releasing null queue
if input parameter vq is NULL, hw = vq->hw, causes a segmentation fault.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2015-10-20 23:29:37 +02:00
Stephen Hemminger
27b31d130e virtio: small cleanups
Some minor cleanups.
  * pass constant to virtio_dev_queue_setup
  * fix message on rx_queue_setup
  * get rid of extra double spaces

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-07-22 10:55:26 +02:00
Stephen Hemminger
945884f14b virtio: fix queue size and number of descriptors
The virtual queue ring size and the number of slots actually usable
are separate parameters. In the most common environment (QEMU)
the virtual queue ring size is 256, but some environments the
ring maybe much larger.

The ring size comes from the host and the driver must use the
actual size passed.

The number of descriptors can be either zero to use the whole
available ring, or some value smaller. This is used to limit
the number of mbufs allocated for the receive ring. If more
descriptors are requested than available the size is silently
truncated.

Note: the ring size (from host) must be a power of two, but
the number of descriptors used can be any size from 1 to the
size of the virtual ring.

Fixes: d78deadae4 ("virtio: fix ring size negotiation")

Reported-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
2015-07-22 10:33:50 +02:00
Bernard Iremonger
941d64b5bf virtio: free queue memory when closing
Add function virtio_free_queues() and call from virtio_dev_close()
Use virtio_dev_rx_queue_release() and virtio_dev_tx_queue_release()

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2015-07-19 22:24:42 +02:00
Bernard Iremonger
5382b188fb virtio: add queue release
Add functions virtio_dev_queue_release(), virtio_dev_rx_queue_release() and
virtio_dev_tx_queue_release().

Use queue_release in virtio_dev_uninit().

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
2015-07-19 22:24:42 +02:00