Added neon based Rx vector implementation.
Selection of the new handler based neon availability at runtime.
Updated the release notes and MAINTAINERS file.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Introduced cpuflag based run-time detection to select the
SSE based simple Rx handler
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Removed unnecessary compile time dependency on "use_simple_rxtx".
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The rxq/txq for the queue_release callback could be NULL, say when
rte_eth_dev_configure() fails that the queue is not setup at all.
Do a simple NULL check would fix the crash issue.
Fixes: 01ad44fd374f ("net/virtio: split Rx/Tx queue")
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The support of virtio-user changed the way the mbuf dma address is
retrieved, using a physical address in case of virtio-pci and a virtual
address in case of virtio-user.
This change introduced some possible memory corruption in packets,
replacing:
m->buf_physaddr + RTE_PKTMBUF_HEADROOM
by:
m->buf_physaddr + m->data_off (through a macro)
This patch fixes this issue, restoring the original behavior.
By the way, it also rework the macros, adding a "VIRTIO_" prefix and
API comments.
Fixes: f24f8f9fee8a ("net/virtio: allow virtual address to fill vring descriptors")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch is related to how to calculate relative address for vhost
backend.
The principle is that: based on one or multiple shared memory regions,
vhost maintains a reference system with the frontend start address,
backend start address, and length for each segment, so that each
frontend address (GPA, Guest Physical Address) can be translated into
vhost-recognizable backend address. To make the address translation
efficient, we need to maintain as few regions as possible. In the case
of VM, GPA is always locally continuous. But for some other case, like
virtio-user, GPA continuous is not guaranteed, therefore, we use virtual
address here.
It basically means:
a. when set_base_addr, VA address is used;
b. when preparing RX's descriptors, VA address is used;
c. when transmitting packets, VA is filled in TX's descriptors;
d. in TX and CQ's header, VA is used.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
We keep a common vq structure, containing only vq related fields,
and then split others into RX, TX and control queue respectively.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
[Jianfeng Tan: found and fixed 2 bugs]
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The commit dd856dfcb9e74 introduced an optimization that prepends virtio
header to mbuf data. It can be used when the tx mbuf is writeable, so we
need to check that the mbuf is direct (i.e. it embeds its own data).
Fixes: dd856dfcb9e7 ("virtio: use any layout on Tx")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Many drivers provide their own implementation of rte_mbuf_raw_alloc(),
duplicating the code. Introduce a new public function in rte_mbuf to
allocate a raw mbuf (uninitialized).
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
After the do-while loop, idx could be VQ_RING_DESC_CHAIN_END (32768)
when it's the last vring desc buf we can get. Therefore, following
expresssion could lead to a segfault error, as it tries to access
beyond the desc memory boundary.
start_dp[idx].flags &= ~VRING_DESC_F_NEXT;
This bug could be reproduced easily with "set fwd txonly" in the
guest PMD, where the dequeue on host is slower than the guest Tx,
that running out of free desc buf is pretty easy.
The fix is straightforward and easy, just remove it, as we have
already set desc flags properly inside the do-while loop.
Fixes: dd856dfcb9e ("virtio: use any layout on Tx")
[Yuanhan Liu: commit log reword]
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Issue: output of appliations and debug info of DPDK may be mixed up
in same line when enabling below debug options of virtio:
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX
CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER
This patch adds "\n" in the tail of definitions like PMD_RX_LOG,
PMD_TX_LOG, and PMD_DRV_LOG, and removes some "\n" when using these
macros.
Fixes: c1f86306a026 ("virtio: add new driver")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
For simple TX the virtio-net header must be zeroed, but it was using memory
that had been initialized with indirect descriptor tables. This resulted in
"unsupported gso type" errors from librte_vhost.
We can use the same memory for every descriptor to save cachelines in the
vswitch.
Fixes: 6dc5de3a ("virtio: use indirect ring elements")
Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Virtio has an mbuf descriptor ring containing mbufs to be used for
receiving traffic. When the host queues traffic to be sent to the guest, it
consumes these descriptors. If none exist, it discards the packet.
The virtio pmd allocates mbufs to the descriptor ring every time it
successfully receives a packet. However, it never does it if it does not
receive a valid packet. If the descriptor ring is exhausted, and the mbuf
mempool does not have any mbufs free (which can happen for various reasons,
such as queueing along the processing pipeline), then the receive call will
not allocate any mbufs to the descriptor ring, and when it finishes, the
descriptor ring will be empty. The ring being empty means that we will
never receive a packet again, which means we will never allocate mbufs to
the ring: we are stuck.
Ultimately, the problem arises because there is a dependency between
receiving packets and making the descriptor ring not be empty, and a
dependency between the descriptor ring not being empty, and receiving
packets.
To fix the problem, this pakes makes virtio always try to allocate mbufs
to the descriptor ring, if necessary, when polling for packets. Do this by
removing the early exit if no packets were received. Since the packet loop
later will do nothing if there are no packets, this is fine.
I reproduced the problem by pushing packets through a pipelined systems
(such as the client_server sample application) after artificially
decreasing the size of the mbuf pool and introducing a delay in a secondary
stage.
Without the fix, the process stops receiving packets fairly quicky. With
the fix, it continues to receive packets.
Fixes: c1f86306a026 ("virtio: add new driver")
Signed-off-by: Kyle Larose <klarose@sandvine.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
All the error checks in virtqueue_enqueue_xmit are already done
by the caller. Therefore they can be removed to improve performance.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Virtio supports a feature that allows sender to put transmit
header prepended to data. It requires that the mbuf be writeable, correct
alignment, and the feature has been negotiatied. If all this works out,
then it will be the optimum way to transmit a single segment packet.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The virtio ring in QEMU/KVM is usually limited to 256 entries
and the normal way that virtio driver was queuing mbufs required
nsegs + 1 ring elements. By using the indirect ring element feature
if available, each packet will take only one ring slot even for
multi-segment packets.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Igor Ryzhov <iryzhov@nfware.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Applied with coding standards fixes:
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
The virtio_net_hdr desc all pointed to the same buffer. It doesn't cause
issue because in the simple TX mode we don't use the header. This patch
makes the header desc point to different buffer.
Fixes: b4ae9c505f2e ("virtio: optimize ring layout")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include rte_mbuf.h file directly/indirectly hence no
additioanl header file inclusion is necessary.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
not hardcode hdr_size to "sizeof(struct virtio_net_hdr)" any more
at virtio_recv_pkts(), otherwise, there would be a mismatch of
hdr size from rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading a packet corruption.
Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
As we have already set up it at virtio_dev_queue_setup(), and a vq
restart will not reset the settings.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The space check for transmit ring only needs a single conditional.
I.e only need to recheck for space if there was no space in first check.
This can help performance and simplifies loop.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The virtio driver was not initializing all the fields in
the receive mbuf. This would cause bugs where previous usage
of mbuf would leave stale TCI and offload flags.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Add xstats() functions and statistic strings to virtio PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
simple rx/tx func is chose when merge-able rx is disabled and user
specifies single segment and no offload support.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
With fixed avail ring, we don't need to get desc idx from avail ring.
virtio driver only has to deal with desc ring.
This patch uses vector instruction to accelerate processing desc ring.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Add software RX ring in virtqueue.
Add fake_mbuf in virtqueue for wraparound processing.
Fill avail ring with blank mbufs in virtio_dev_vring_start
Add virtio_rxtx.h header file for RTE_VIRTIO_PMD_MAX_BURST.
Would move all rx/tx related declarations into this header file in future.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
In DPDK based switching environment, mostly vhost runs on a dedicated core
while virtio processing in guest VMs runs on different cores.
Take RX for example, with generic implementation, for each guest buffer,
a) virtio driver allocates a descriptor from free descriptor list
b) modify the entry of avail ring to point to allocated descriptor
c) after packet is received, free the descriptor
When vhost fetches the avail ring, it need to fetch the modified L1 cache from
virtio core, which is a heavy cost in current CPU implementation.
This idea of this optimization is:
allocate the fixed descriptor for each entry of avail ring, so avail ring will
always be the same during the run.
This removes L1M cache transfer from virtio core to vhost core for avail ring.
(Note we couldn't avoid the cache transfer for descriptors).
Besides, descriptor allocation and free operation is eliminated.
This also makes vector procesing possible to further accelerate the processing.
This is the layout for the avail ring(take 256 ring entries for example), with
each entry pointing to the descriptor with the same index.
avail
idx
+
|
+----+----+---+-------------+------+
| 0 | 1 | 2 | ... | 254 | 255 | avail ring
+-+--+-+--+-+-+---------+---+--+---+
| | | | | |
| | | | | |
v v v | v v
+-+--+-+--+-+-+---------+---+--+---+
| 0 | 1 | 2 | ... | 254 | 255 | desc ring
+----+----+---+-------------+------+
|
|
+----+----+---+-------------+------+
| 0 | 1 | 2 | | 254 | 255 | used ring
+----+----+---+-------------+------+
|
+
This is the ring layout for TX.
As we need one virtio header for each xmit packet, we have 128 slots available.
++
||
||
+-----+-----+-----+--------------+------+------+------+
| 0 | 1 | ... | 127 || 128 | 129 | ... | 255 | avail ring
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| | | || | | |
v v v || v v v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 128 | 129 | ... | 255 || 128 | 129 | ... | 255 | desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| | | || | | |
v v v || v v v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 0 | 1 | ... | 127 || 0 | 1 | ... | 127 | desc ring for tx dat
+-----+-----+-----+--------------+------+------+------+
||
||
++
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
There are some places in virtio driver where uint16_t or int are used
where it would be safer to use unsigned.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Some minor cleanups.
* pass constant to virtio_dev_queue_setup
* fix message on rx_queue_setup
* get rid of extra double spaces
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Add functions virtio_dev_queue_release(), virtio_dev_rx_queue_release() and
virtio_dev_tx_queue_release().
Use queue_release in virtio_dev_uninit().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
The parameter tx_free_thresh is not consistent between the drivers:
some use it as rte_eth_tx_burst() requires, some release buffers when
the number of free descriptors drop below this value.
Let's use it as most fast-path code does, which is the latter, and update
comments throughout the code to reflect that.
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Move virtio PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>