..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2016 Intel Corporation.

Vhost Sample Application
========================

The vhost sample application demonstrates integration of the Data Plane
Development Kit (DPDK) with the Linux* KVM hypervisor by implementing the
vhost-net offload API. The sample application performs simple packet
switching between virtual machines based on Media Access Control (MAC)
address or Virtual Local Area Network (VLAN) tag. The splitting of Ethernet
traffic from an external switch is performed in hardware by the Virtual
Machine Device Queues (VMDQ) and Data Center Bridging (DCB) features of
the Intel® 82599 10 Gigabit Ethernet Controller.

Testing steps
-------------

This section shows how to test a typical PVP (physical-VM-physical) case
with this vhost-switch sample. Packets are first received from the physical
NIC port and enqueued to the VM's Rx queue. The guest testpmd, running in
its default forwarding mode (io forward), puts those packets into the Tx
queue. The vhost-switch example, in turn, dequeues the packets and sends
them back out of the same physical NIC port.

Build
~~~~~

To compile the sample application see :doc:`compiling`.

The application is located in the ``vhost`` sub-directory.

.. note::
   In this example, you need to build DPDK both on the host and inside the
   guest.

Start the vswitch example
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: console

    ./vhost-switch -l 0-3 -n 4 --socket-mem 1024 \
        -- --socket-file /tmp/sock0 --client \
        ...

See the `Parameters`_ section for an explanation of these parameters.

.. _vhost_app_run_vm:

Start the VM
~~~~~~~~~~~~

.. code-block:: console

    qemu-system-x86_64 -machine accel=kvm -cpu host \
        -m $mem -object memory-backend-file,id=mem,size=$mem,mem-path=/dev/hugepages,share=on \
        -mem-prealloc -numa node,memdev=mem \
        \
        -chardev socket,id=char1,path=/tmp/sock0,server \
        -netdev type=vhost-user,id=hostnet1,chardev=char1 \
        -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:00:00:14 \
        ...

.. note::
    For basic vhost-user support, QEMU 2.2 (or above) is required. Some
    specific features might need a higher version, such as QEMU 2.7
    (or above) for the reconnect feature.

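Because vhost-switch is started with ``--client`` in this walkthrough, QEMU
is the side that creates ``/tmp/sock0``. As a minimal sketch, the socket can
be polled for after launching QEMU. The ``wait_for_socket`` helper below is
hypothetical and is not part of DPDK or QEMU; it only uses the standard
``test -S`` check.

.. code-block:: shell

    # wait_for_socket PATH [TIMEOUT_SECONDS]
    # Hypothetical helper: poll once per second until PATH exists and is a
    # UNIX socket. Returns 0 when the socket appears, 1 when the timeout
    # (default 10 seconds) expires first.
    wait_for_socket() {
        sock_path=$1
        timeout=${2:-10}
        elapsed=0
        while [ ! -S "$sock_path" ]; do
            if [ "$elapsed" -ge "$timeout" ]; then
                return 1
            fi
            sleep 1
            elapsed=$((elapsed + 1))
        done
        return 0
    }

For example, ``wait_for_socket /tmp/sock0 30 || echo "QEMU did not create the socket"``
could be run on the host right after starting the VM.
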
.. _vhost_app_run_dpdk_inside_guest:

Run testpmd inside guest
~~~~~~~~~~~~~~~~~~~~~~~~

Make sure you have DPDK built inside the guest. Also make sure the
corresponding virtio-net PCI device is bound to a uio driver, which
can be done as follows:

.. code-block:: console

    modprobe uio_pci_generic
    $RTE_SDK/usertools/dpdk-devbind.py -b uio_pci_generic 0000:00:04.0

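The PCI address ``0000:00:04.0`` in the bind command depends on how the VM
is configured, so it may differ in your guest. The sketch below is one way
to look it up from sysfs inside the guest. ``list_virtio_net`` is a
hypothetical helper, not a DPDK tool; it assumes the standard virtio PCI
vendor ID ``0x1af4`` and the virtio-net device IDs ``0x1000`` (transitional)
and ``0x1041`` (modern).

.. code-block:: shell

    # list_virtio_net [SYSFS_PCI_DIR]
    # Hypothetical helper: print the PCI address of every device under
    # SYSFS_PCI_DIR (default /sys/bus/pci/devices) whose vendor ID is
    # 0x1af4 and whose device ID is a virtio-net ID (0x1000 or 0x1041).
    list_virtio_net() {
        pci_dir=${1:-/sys/bus/pci/devices}
        for dev in "$pci_dir"/*; do
            # Skip entries without readable vendor/device attributes.
            [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
            vendor=$(cat "$dev/vendor")
            device=$(cat "$dev/device")
            if [ "$vendor" = "0x1af4" ]; then
                case "$device" in
                    0x1000|0x1041) basename "$dev" ;;
                esac
            fi
        done
    }

Each printed address can then be passed to ``dpdk-devbind.py -b uio_pci_generic``.
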
Then start testpmd for packet forwarding testing.

.. code-block:: console

    ./x86_64-native-gcc/app/testpmd -l 0-1 -- -i
    > start tx_first

Inject packets
--------------

When a virtio-net device is connected to vhost-switch, it is assigned a VLAN
tag, starting from 1000. Make sure your packet generator is configured with
the right MAC address and VLAN tag. If everything works, you should see a
log like the following on the vhost-switch console::

    VHOST_DATA: (0) mac 52:54:00:00:00:14 and vlan 1000 registered

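To configure the packet generator, the MAC address and VLAN tag can be read
off that log line by hand or extracted with a small filter. The sketch below
assumes the line has exactly the shape shown above; ``parse_registration``
is a hypothetical helper, and reading the number in parentheses as a device
ID is an assumption.

.. code-block:: shell

    # parse_registration LINE
    # Hypothetical helper: print "<id> <mac> <vlan>" for a vhost-switch
    # registration log line. Returns 1 and prints nothing if LINE does not
    # match the expected shape.
    parse_registration() {
        printf '%s\n' "$1" | sed -n \
            's/.*VHOST_DATA: (\([0-9][0-9]*\)) mac \([0-9A-Fa-f:]*\) and vlan \([0-9][0-9]*\) registered.*/\1 \2 \3/p' \
            | grep .
    }

For the sample line above, the helper prints ``0 52:54:00:00:00:14 1000``.
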
.. _vhost_app_parameters:

Parameters
----------

**--socket-file path**
Specifies the vhost-user socket file path.

**--client**
DPDK vhost-user acts in client mode when this option is given. In client
mode, QEMU creates the socket file; otherwise, DPDK creates it. Put simply,
whichever side is the server creates the socket file.

**--vm2vm mode**
The vm2vm parameter sets the mode of packet switching between guests in
the host.

- 0 disables vm2vm, implying that a VM's packets always go to the NIC port.
- 1 means normal MAC-lookup packet routing.
- 2 means hardware-mode packet forwarding between guests. Packets are allowed
  to go to the NIC port, and the hardware L2 switch determines, based on the
  packet's destination MAC address and VLAN tag, which guest the packet
  should be forwarded to or whether it should be sent externally.

**--mergeable 0|1**
Set 0/1 to disable/enable the mergeable Rx feature. It is disabled by default.

**--stats interval**
The stats parameter controls the printing of virtio-net device statistics.
The parameter specifies the interval (in seconds) at which statistics are
printed; an interval of 0 seconds disables statistics.

**--rx-retry 0|1**
The rx-retry option enables/disables enqueue retries when the guest's Rx queue
is full. This feature resolves packet loss that is observed at high data
rates, by allowing the receive path to delay and retry. This option is
enabled by default.

**--rx-retry-num num**
The rx-retry-num option specifies the number of retries on an Rx burst. It
takes effect only when rx retry is enabled. The default value is 4.

**--rx-retry-delay usec**
The rx-retry-delay option specifies the timeout (in microseconds) between
retries on an Rx burst. It takes effect only when rx retry is enabled. The
default value is 15.

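The two retry options combine into an upper bound on how long one Rx burst
can be held up. The sketch below works through that arithmetic under a
simplifying assumption: every retry waits for the full delay and no retry
ends early. ``rx_retry_max_wait_us`` is a hypothetical helper, not part of
the sample application.

.. code-block:: shell

    # rx_retry_max_wait_us [NUM] [DELAY_US]
    # Hypothetical helper: print the assumed worst-case wait per Rx burst,
    # in microseconds, as NUM * DELAY_US. Defaults mirror the documented
    # defaults of --rx-retry-num (4) and --rx-retry-delay (15).
    rx_retry_max_wait_us() {
        retries=${1:-4}
        delay_us=${2:-15}
        echo $((retries * delay_us))
    }

With the defaults, this gives 4 x 15 = 60 microseconds per burst. Raising
either value trades lower packet loss for more time spent waiting in the
receive path.
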
**--dequeue-zero-copy**
Dequeue zero copy is enabled when this option is given. Note that if the NIC
is bound to a driver with IOMMU enabled, dequeue zero copy cannot work in
VM2NIC mode (vm2vm=0), because the IOMMU DMA mapping for guest memory is
currently not set up.

**--vlan-strip 0|1**
The VLAN strip option has been removed, because different NICs behave
differently when VLAN strip is disabled. Such a feature, which depends
heavily on hardware, was dropped from this example to reduce confusion.
VLAN strip is now always enabled and cannot be disabled.

**--builtin-net-driver**
When this option is given, a very simple vhost-user net driver is used, which
demonstrates how to use the generic vhost APIs. It is disabled by default.

Common Issues
-------------

* QEMU fails to allocate memory on hugetlbfs, with an error like the
  following::

      file_ram_alloc: can't mmap RAM pages: Cannot allocate memory

  When running QEMU, the above error indicates that it has failed to allocate
  memory for the virtual machine on hugetlbfs. This is typically due to
  insufficient free hugepages to satisfy the allocation request. The
  number of free hugepages can be checked as follows:

  .. code-block:: console

      cat /sys/kernel/mm/hugepages/hugepages-<pagesize>/free_hugepages

  The command above shows how many hugepages are free to support QEMU's
  allocation request.

* Failed to build DPDK in VM

  Make sure the ``-cpu host`` QEMU option is given.

* Device start fails if NIC's max queues > the default number of 128

  The mbuf pool size depends on the MAX_QUEUES configuration. If the NIC's
  max queue number is larger than 128, device start fails due to
  insufficient mbufs.

  To make it work, change the default number as shown below, setting it
  according to the NIC's properties::

      make EXTRA_CFLAGS="-DMAX_QUEUES=320"

* Option "builtin-net-driver" is incompatible with QEMU

  QEMU vhost net device start fails if the protocol feature is not negotiated.
  The DPDK virtio-user PMD can be used as a replacement for QEMU.

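For the hugetlbfs allocation failure listed above, a rough pre-flight check
is to compare the number of hugepages the VM needs with the
``free_hugepages`` counter in sysfs. The helpers below are a hypothetical
sketch, not part of DPDK or QEMU. They assume the VM memory is given in
megabytes, the hugepage size in kilobytes (for example 2048), and that the
sysfs directory is named ``hugepages-<size>kB``.

.. code-block:: shell

    # hugepages_needed MEM_MB PAGE_KB
    # Print how many hugepages of PAGE_KB kilobytes are needed to back
    # MEM_MB megabytes of guest memory, rounded up.
    hugepages_needed() {
        mem_kb=$(( $1 * 1024 ))
        page_kb=$2
        echo $(( (mem_kb + page_kb - 1) / page_kb ))
    }

    # hugepages_free PAGE_KB [SYSFS_DIR]
    # Print the free hugepage count for the given page size.
    hugepages_free() {
        dir=${2:-/sys/kernel/mm/hugepages}
        cat "$dir/hugepages-${1}kB/free_hugepages"
    }

    # enough_hugepages MEM_MB PAGE_KB [SYSFS_DIR]
    # Return 0 if enough hugepages are free for the request, 1 otherwise.
    enough_hugepages() {
        [ "$(hugepages_free "$2" "$3")" -ge "$(hugepages_needed "$1" "$2")" ]
    }

For example, a 1024 MB guest backed by 2048 kB pages needs 512 free pages,
so ``enough_hugepages 1024 2048 || echo "not enough free hugepages"`` can be
run before starting QEMU.
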