Virtio requires pseudo-header checksum in TCP/UDP checksum to do
offload, but it was lost when Tx prepare is introduced. Also
rte_validate_tx_offload() should be used to validate Tx offloads.
Also it is incorrect to do virtio_tso_fix_cksum() after prepend
to mbuf without taking prepended size into account, since layer 2/3/4
lengths provide incorrect offsets after prepend.
Fixes: 4fb7e803eb1a ("ethdev: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Dilshod Urazov <dilshod.urazov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This patch fixes the multi-process support for virtio-user.
Currently virtio-user just provides some limited secondary
process supports. Only some basic operations can be done in
secondary process on virtio-user port, e.g. getting port stats.
Actions which will trigger the communication with vhost backend
can't be done in secondary process for now, as the fds are
not synced between processes. The processing of server mode
devargs is also moved into virtio_user_dev_init().
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Fixes: ee27edbe0c10 ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
VIRTIO_F_ORDER_PLATFORM is required to use proper memory barriers
in case of HW vhost implementations like vDPA.
DMA barriers (rte_cio_*) are sufficent for that purpose.
Previously known as VIRTIO_F_IO_BARRIER.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds support for in-order path when meargeable buffers
feature hasn't been negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
This implements the transmit path for devices with
support for packed virtqueues.
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The virtio features VIRTIO_NET_F_CSUM, VIRTIO_NET_F_HOST_TSO4
and VIRTIO_NET_F_HOST_TSO6 are supported by the virtio PMD.
But they are missing in the supported feature set. And since
below commit:
commit 4174a7b59d05 ("net/virtio: improve Tx offload features negotiation")
Virtio PMD will announce the Tx offloading capabilities based
on the features read from the device. And virtio-user won't
report the features which are not in virtio-PMD's supported
feature set. So since that commit, virtio-user won't announce
the DEV_TX_OFFLOAD_UDP_CKSUM, DEV_TX_OFFLOAD_TCP_CKSUM and
DEV_TX_OFFLOAD_TCP_TSO offloading capabilities even if the
vhost backend supports them.
This patch adds these missing features, and virtio-user will
report them if the backend supports them.
Fixes: 142678d42959 ("net/virtio-user: fix wrongly get/set features")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This macro isn't used any more after below commit:
Fixes: a4996bd89c42 ("ethdev: new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
This patch improves the Tx offload features selection depending
on whether the application request for offloads.
When the application doesn't request for Tx offload features,
the corresponding features bits aren't negotiated.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
IN_ORDER Rx function depends on merge-able feature. Descriptors
allocation and free will be done in bulk.
Virtio dequeue logic:
dequeue_burst_rx(burst mbufs)
for (each mbuf b) {
if (b need merge) {
merge remained mbufs
add merged mbuf to return mbufs list
} else {
add mbuf to return mbufs list
}
}
if (last mbuf c need merge) {
dequeue_burst_rx(required mbufs)
merge last mbuf c
}
refill_avail_ring_bulk()
update_avail_ring()
return mbufs list
IN_ORDER Tx function can support offloading features. Packets which
matched "can_push" option will be handled by simple xmit function. Those
packets can't match "can_push" will be handled by original xmit function
with in-order flag.
Virtio enqueue logic:
xmit_cleanup(used descs)
for (each xmit mbuf b) {
if (b can inorder xmit) {
add mbuf b to inorder burst list
continue
} else {
xmit inorder burst list
xmit mbuf b by original function
}
}
if (inorder burst list not empty) {
xmit inorder burst list
}
update_avail_ring()
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When live migration is done, for the backup VM, either the virtio
frontend or the vhost backend needs to send out gratuitous RARP packet
to announce its new network location.
This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support live
migration scenario where the vhost backend doesn't have the ability to
generate RARP packet.
Brief introduction of the work flow:
1. QEMU finishes live migration, pokes the backup VM with an interrupt.
2. Virtio interrupt handler reads out the interrupt status value, and
realizes it needs to send out RARP packet to announce its location.
3. Pause device to stop worker thread touching the queues.
4. Inject a RARP packet into a Tx Queue.
5. Ack the interrupt via control queue.
6. Resume device to continue packet processing.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
This patch adds dev_pause, dev_resume and inject_pkts APIs to allow
driver to pause the worker threads and inject special packets into
Tx queue. The next patch will be based on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
DPDK has already the definition of Ethernet numeric link speeds in Mbps
in the file Rte_ethdev.h, it is unnecessary to rededine virtio specific
link speeds macros again.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
In rx/tx queue setup functions, some code is executed only if
use_simple_rxtx == 1. The value of this variable can change depending on
the offload flags or sse support. If Rx queue setup is called before Tx
queue setup, it can result in an invalid configuration:
- dev_configure is called: use_simple_rxtx is initialized to 0
- rx queue setup is called: queues are initialized without simple path
support
- tx queue setup is called: use_simple_rxtx switch to 1, and simple
Rx/Tx handlers are selected
Fix this by postponing a part of Rx/Tx queue initialization in
dev_start(), as it was the case in the initial implementation.
Fixes: 48cec290a3d2 ("net/virtio: move queue configure code to proper place")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
The patch change the prototype of callback function
(rte_intr_callback_fn) by removing the unnecessary parameter.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
So far, virtio-user with vhost-user as the backend can only support
client mode. So when vhost user backend is down, i.e., unix socket
connection is broken, the connection cannot be re-connected. We will
forcely set the link state to be down.
Note: virtio-user with vhost-kernel as the backend still cannot
support lsc now as we fail to find a way to monitor the backend, tap
device, up/down events.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
This patch implements support for the Virtio MTU feature.
When negotiated, the host shares its maximum supported MTU,
which is used as initial MTU and as maximum MTU the application
can set.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Under interrupt mode, rx_descriptor_done is used as an indicator
for applications to check if some number of packets are ready to
be received.
This patch enables this by checking used ring's local consumed idx
with shared (with backend) idx.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Before the commit 86d59b21468a ("net/virtio: support LRO"), features
in virtio PMD, is decided and properly set at device initialization
and will not be changed. But afterward, features could be changed in
virtio_dev_configure(), and will be re-negotiated if it's changed.
In virtio-user, device features is obtained at driver probe phase
only once, but we did not store it. So the added feature bits in
re-negotiation will fail.
To fix it, we store it down, and will be used to feature negotiation
either at device initialization phase or device configure phase.
Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The only piece of code of virtio_dev_rxtx_start() is actually doing
queue configure/setup work. So, move it to corresponding queue_setup
callback.
Once that is done, virtio_dev_rxtx_start() becomes an empty function,
thus it's being removed.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Queue allocation should be done once, since the queue related info (such
as vring addreess) will only be informed to the vhost-user backend once
without virtio device reset.
That means, if you allocate queues again after the vhost-user negotiation,
the vhost-user backend will not be informed any more. Leading to a state
that the vring info mismatches between virtio PMD driver and vhost-backend:
the driver switches to the new address has just been allocated, while the
vhost-backend still sticks to the old address has been assigned in the init
stage.
Unfortunately, that is exactly how the virtio driver is coded so far: queue
allocation is done at queue_setup stage (when rte_eth_tx/rx_queue_setup is
invoked). This is wrong, because queue_setup can be invoked several times.
For example,
$ start_testpmd.sh ... --txq=1 --rxq=1 ...
> port stop 0
> port config all txq 1 # just trigger the queue_setup callback again
> port config all rxq 1
> port start 0
The right way to do is allocate the queues in the init stage, so that the
vring info could be persistent with the vhost-user backend.
Besides that, we should allocate max_queue pairs the device supports, but
not nr queue pairs firstly configured, to make following case work.
$ start_testpmd.sh ... --txq=1 --rxq=1 ...
> port stop 0
> port config all txq 2
> port config all rxq 2
> port start 0
Since the allocation is switched to init stage, the free should also
moved from the rx/tx_queue_release to dev close stage. That leading we
could do nothing an empty rx/tx_queue_release() implementation.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Move the configuration of control queue in the configure callback.
This is needed by next commit, which introduces the reinitialization
of the device in the configure callback to change the feature flags.
Therefore, the control queue will have to be restarted at the same
place.
As virtio_dev_cq_queue_setup() is called from a place where
config->max_virtqueue_pairs is not available, we need to store this in
the private structure. It replaces max_rx_queues and max_tx_queues which
have the same value. The log showing the value of max_rx_queues and
max_tx_queues is also removed since config->max_virtqueue_pairs is
already displayed above.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Negotiate VIRTIO_F_IOMMU_PLATFORM to have IOMMU support.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Virtio indirect descriptors are supported by the data-path
but the feature bit is never set during feature negociation.
This patch simply adds VIRTIO_RING_F_INDIRECT_DESC back to
the supported features bit mask, hence enabling the use of
indirect descriptors when the feature is negociated with the
device.
Signed-off-by: Pierre Pfister <ppfister@cisco.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Add a new virtual device named virtio-user, which can be used just like
eth_ring, eth_null, etc. To reuse the code of original virtio, we do
some adjustment in virtio_ethdev.c, such as remove key _static_ of
eth_virtio_dev_init() so that it can be reused in virtual device; and
we add some check to make sure it will not crash.
Configured parameters include:
- queues (optional, 1 by default), number of queue pairs, multi-queue
not supported for now.
- cq (optional, 0 by default), not supported for now.
- mac (optional), random value will be given if not specified.
- queue_size (optional, 256 by default), size of virtqueues.
- path (madatory), path of vhost user.
When enable CONFIG_RTE_VIRTIO_USER (enabled by default), the compiled
library can be used in both VM and container environment.
Examples:
path_vhost=<path_to_vhost_user> # use vhost-user as a backend
sudo ./examples/l2fwd/build/l2fwd -c 0x100000 -n 4 \
--socket-mem 0,1024 --no-pci --file-prefix=l2fwd \
--vdev=virtio-user0,mac=00:01:02:03:04:05,path=$path_vhost -- -p 0x1
Known issues:
- Control queue and multi-queue are not supported yet.
- Cannot work with --huge-unlink.
- Cannot work with no-huge.
- Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8)
hugepages.
- Root privilege is a must (mainly becase of sorting hugepages according
to physical address).
- Applications should not use file name like HUGEFILE_FMT ("%smap_%d").
- Cannot work with vhost-net backend.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
We keep a common vq structure, containing only vq related fields,
and then split others into RX, TX and control queue respectively.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
[Jianfeng Tan: found and fixed 2 bugs]
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Some duplex values are replaced from 0 to half-duplex when link is down.
Some drivers are still using their own constants for duplex modes.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Modern (v1.0) virtio pci device defines several pci capabilities.
Each cap has a configure structure corresponding to it, and the
cap.bar and cap.offset fields tell us where to find it.
Firstly, we map the pci resources by rte_eal_pci_map_device().
We then could easily locate a cfg structure by:
cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;
Therefore, the entrance of enabling modern (v1.0) pci device support
is to iterate the pci capability lists, and to locate some configs
we care; and they are:
- common cfg
For generic virtio and virtqueue configuration, such as setting/getting
features, enabling a specific queue, and so on.
- nofity cfg
Combining with `queue_notify_off' from common cfg, we could use it to
notify a specific virt queue.
- device cfg
Where virtio_net_config structure is located.
- isr cfg
Where to read isr (interrupt status).
If any of above cap is not found, we fallback to the legacy virtio
handling.
If succeed, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing a (or few) specific
configuration space from above 4 cfg structures. And that's basically
how this patch works.
Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.
Last, set the VIRTIO_F_VERSION_1 feature flag.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Bulk free of mbufs when clean used ring.
Shift operation of idx could be saved if vq_free_cnt means
free slots rather than free descriptors.
TODO: rearrange vq data structure, pack the stats var together so that
we could use one vec instruction to update all of them.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
With fixed avail ring, we don't need to get desc idx from avail ring.
virtio driver only has to deal with desc ring.
This patch uses vector instruction to accelerate processing desc ring.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Some minor cleanups.
* pass constant to virtio_dev_queue_setup
* fix message on rx_queue_setup
* get rid of extra double spaces
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Add functions virtio_dev_queue_release(), virtio_dev_rx_queue_release() and
virtio_dev_tx_queue_release().
Use queue_release in virtio_dev_uninit().
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Change the features from bit mask to bit number. This allows the
DPDK driver to use the definitions from Linux (yes the header
files already use a license compatiable with DPDK). This makes DPDK
driver handle future feature bit changes.
Get rid of double negative code in the feature bit intialization.
Instead just have a new define with the list of feature bits implemented.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Move virtio PMD to drivers/net directory
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>