446 Commits

Author SHA1 Message Date
Tiwei Bie
58e90a9113 vhost: fix return value on enqueue path
Fixes: 62250c1d0978 ("vhost: extract split ring handling from Rx and Tx functions")
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")
Cc: stable@dpdk.org

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-09-14 20:08:41 +02:00
Ilya Maximets
0d7853a4da vhost-user: drop connection on message handling failures
There are a lot of cases where vhost-user massage handling
could fail and end up in a fully not recoverable state. For
example, allocation failures of shadow used ring and batched
copy array are not recoverable and leads to the segmentation
faults like this on the receiving/transmission path:

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7f913fecf0 (LWP 43625)]
  in copy_desc_to_mbuf () at /lib/librte_vhost/virtio_net.c:760
  760       batch_copy[vq->batch_copy_nb_elems].dst =

This could be easily reproduced in case of low memory or big
number of vhost-user ports.

Fix that by propagating error to the upper layer which will
end up with disconnection in case we can not report to
the message sender when the error happens.

Fixes: f689586bc060 ("vhost: shadow used ring update")
Cc: stable@dpdk.org

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-09-14 20:08:41 +02:00
Tiwei Bie
77de7c781c vhost: fix vhost interrupt support
When VIRTIO_RING_F_EVENT_IDX is negotiated, we need to
update the avail event to enable the notification.

Fixes: 3f8ff12821e4 ("vhost: support interrupt mode")
Cc: stable@dpdk.org

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-09-14 20:08:41 +02:00
Ilya Maximets
28925156d9 vhost: fix zmbufs array leak after NUMA realloc
'numa_realloc()' allocates 'zmbufs' even if zero copy mode
is not configured. This leads to memory leak, because array
is freed only for zero copy case.

Fixes: 2651726defb7 ("vhost: do deep copy while reallocating queue")
CC: stable@dpdk.org

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-09-12 19:10:09 +02:00
Ilya Maximets
d02f2092a3 vhost: suppress error if NUMA is not available
It's a common case that 'get_mempolicy' fails on systems
without NUMA support. No need to flag an error in log for
this situation.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-08-28 15:27:39 +02:00
Maxime Coquelin
af53db4867 vhost: flush IOTLB cache on new mem table handling
IOTLB entries contain the host virtual address of the guest
pages. When receiving a new VHOST_USER_SET_MEM_TABLE request,
the previous regions get unmapped, so the IOTLB entries, if any,
will be invalid. It does cause the vhost-user process to
segfault.

This patch introduces a new function to flush the IOTLB cache,
and call it as soon as the backend handles a VHOST_USER_SET_MEM
request.

Fixes: 69c90e98f483 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2018-08-05 01:47:47 +02:00
Tiwei Bie
adead74939 vhost: remove unused variable
The nr_updated is just increased and not really used.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
2018-08-02 04:41:49 +02:00
Tiwei Bie
0989161b26 vhost: release locks on RARP packet failure
Fixes: eefac9536a90 ("vhost: postpone device creation until rings are mapped")
Cc: stable@dpdk.org

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2018-07-26 10:02:52 +02:00
Tiwei Bie
14962a9c4f vhost: fix overflow on shadow used ring
The shadow used ring's size is the same as the vq's size,
so we shouldn't try more than "vq size" times. Besides,
the element pointed by avail->idx isn't available to the
device, so we will return error when try "vq size" times.

Fixes: 24e4844048e1 ("vhost: unify Rx mergeable and non-mergeable paths")
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
2018-07-26 10:02:50 +02:00
Jiayu Hu
6e2fad861a vhost: fix return value on dequeue path
This patch fixes the incorrect return value for rte_vhost_dequeue_burst()
when virtqueue is not enabled or virtqueue address translation fails.

Fixes: 62250c1d0978 ("vhost: extract split ring handling from Rx and Tx functions")

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-26 10:02:43 +02:00
Tiwei Bie
04651f72d1 vhost: fix buffer length calculation
Fixes: fd68b4739d2c ("vhost: use buffer vectors in dequeue path")

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
2018-07-23 23:55:26 +02:00
Dan Gora
9f976204f5 vhost/crypto: use function to access mbuf private area
Use rte_mbuf_to_priv() to access the private data area in the mbuf.

Signed-off-by: Dan Gora <dg@adax.com>
2018-07-13 23:14:41 +02:00
Maxime Coquelin
b1cce26af1 vhost: add notification for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
ae999ce49d vhost: add Tx support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
a922401f35 vhost: add Rx support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
2f3225a7d6 vhost: add vector filling support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
cf3af2be49 vhost: create descriptor mapping function
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
37f5e79a27 vhost: add shadow used ring support for packed rings
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
3e2f9700bb vhost: append shadow used ring function names with split
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
62250c1d09 vhost: extract split ring handling from Rx and Tx functions
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
2c22b14388 vhost: clear batch copy index at copy time
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
d6315ce796 vhost: make indirect desc table copy desc type agnostic
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
7e47bba30a vhost: clear shadow used table index at flush time
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Yuanhan Liu
2d1541e2b6 vhost: add vring address setup for packed queues
Add code to set up packed queues when enabled.

Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Jens Freimann
d3211c98c4 vhost: add helpers for packed virtqueues
Add some helper functions to check descriptor flags
and check if a vring is of type packed.

Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Jens Freimann
297b1e7350 vhost: add virtio packed virtqueue defines
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
611994fc1b vhost: improve prefetching in enqueue path
This is an optimization to prefetch next buffer while the
current one is being processed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
c1058a6b16 vhost: prefetch first descriptor in dequeue path
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
380a2adff3 vhost: improve prefetching in dequeue path
This is an optimization to prefetch next buffer while the
current one is being processed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
fd68b4739d vhost: use buffer vectors in dequeue path
To ease packed ring layout integration, this patch makes
the dequeue path to re-use buffer vectors implemented for
enqueue path.

Doing this, copy_desc_to_mbuf() is now ring layout type
agnostic.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
915cf94042 vhost: use shadow used ring in dequeue path
Relax used ring contention by reusing the shadow used
ring feature used by enqueue path.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Marvin Liu
22d2e78840 vhost: advertise support in-order feature
If devices always use descriptors in the same order in which they have
been made available. These devices can offer the VIRTIO_F_IN_ORDER
feature. If negotiated, this knowledge allows devices to notify the use
of a batch of buffers to virtio driver by only writing used ring index.

Vhost user device has supported this feature by default. If vhost
dequeue zero is enabled, should disable VIRTIO_F_IN_ORDER as vhost can’t
assure that descriptors returned from NIC are in order.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Tiwei Bie
ad0fdb696a vhost: fix potential null pointer dereference
Coverity issue: 293097
Fixes: d90cf7d111ac ("vhost: support host notifier")

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Maxime Coquelin
e11411b52a vhost: fix missing increment of log cache count
The log_cache_nb_elem was never incremented, resulting
in all dirty pages to be missed during live migration.

Fixes: c16915b87109 ("vhost: improve dirty pages logging performance")
Cc: stable@dpdk.org

Reported-by: Peng He <xnhp0320@icloud.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-03 01:35:58 +02:00
Maxime Coquelin
63b113afa5 vhost: use SMP memory barrier before kicking guest
vhost_vring_call() used rte_mb(), which translates into
mfence instruction on x86.

This patch changes to use rte_smp_mb(), which changed recently
to translate into a locked ADD instruction for performance
reason.

The measured gain is up to 3% with the testpmd benchmarks.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-06-15 12:27:25 +02:00
Tonghao Zhang
2396806765 vhost: introduce new function helper
Introduce an new common helper to avoid redundancy.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 12:27:25 +02:00
Tiwei Bie
d90cf7d111 vhost: support host notifier
When a vDPA device is attached, vhost user will try to
register host notifiers to QEMU to allow notifications
to be delivered between the driver in the guest and the
vDPA device in the host directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 09:49:39 +02:00
Tonghao Zhang
76b1e4cec7 vhost: refine new device function
Make sure find avalid device id before allocating
virtio_net, if not, return directly. It may avoid
allocating and freeing virtio_net when there is
not valid device id.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 09:49:09 +02:00
Maxime Coquelin
c89e52d9c8 vhost: improve batched copies performance
Instead of copying batch_copy_nb_elems into the stack,
this patch uses it directly.

Small performance gain of 3% is seen when running PVP
benchmark.

Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-14 19:27:50 +02:00
Maxime Coquelin
24e4844048 vhost: unify Rx mergeable and non-mergeable paths
This patch reworks the vhost enqueue path so that a single
code path is used for both Rx mergeable or non-mergeable cases.

Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-14 19:27:50 +02:00
Maxime Coquelin
c16915b871 vhost: improve dirty pages logging performance
This patch caches all dirty pages logging until the used ring index
is updated.

The goal of this optimization is to fix a performance regression
introduced when the vhost library started to use atomic operations
to set bits in the shared dirty log map. While the fix was valid
as previous implementation wasn't safe against concurrent accesses,
contention was induced.

With this patch, during migration, we have:
1. Less atomic operations as only a single atomic OR operation
per 32 or 64 (depending on CPU) pages.
2. Less atomic operations as during a burst, the same page will
be marked dirty only once.
3. Less write memory barriers.

Fixes: 897f13a1f726 ("vhost: make page logging atomic")
Cc: stable@dpdk.org

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-05-17 14:19:05 +02:00
Fan Zhang
3c79609fda vhost/crypto: handle virtually non-contiguous buffers
This patch enables the handling of buffers non-contiguous in
virtual address space in the vhost_crypto. Instead of using
rte_vhost_va_from_guest_pa(), the host virtual address is
converted by vhost_iova_to_vva() for wider use cases.

For copy mode, the copy length is limited to the chunk size,
next chunks VAs being fetched afterward.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-17 12:29:05 +02:00
Fan Zhang
2017bc3356 vhost/crypto: fix descriptor move
This patch fixes the redundant descriptor move in the copy mode
of vhost crypto. Originally the redundant descriptor move will
cause the message parsing error.

Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-17 12:29:05 +02:00
Maxime Coquelin
d5022533c2 vhost: retranslate vring addr when memory table changes
When the vhost-user master sends memory updates using
VHOST_USER_SET_MEM request, the user backends unmap and then
mmap again the memory regions in its address space.

If the ring addresses have already been translated, it needs to
be translated again as they point to unmapped memory.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:50 +01:00
Fan Zhang
dc3530ec0d vhost/crypto: fix symmetric ciphering
A bracket was misplaced in a condition check, this patch
fixes it.

Coverity issue: 277232, 277237
Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
Maxime Coquelin
8eac207d49 vhost: fix header copy to discontiguous desc buffer
In the loop to copy virtio-net header to the descriptor buffer,
destination pointer was incremented instead of the source
pointer.

Fixes: fb3815cc614d ("vhost: handle virtually non-contiguous buffers in Rx-mrg")
Fixes: 6727f5a739b6 ("vhost: handle virtually non-contiguous buffers in Rx")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
Tonghao Zhang
bfbf0143d1 vhost: fix typo in comment
Fixes: 3670686ab99f ("vhost: fix race for connection fd")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
Tonghao Zhang
52d874dc67 vhost: fix crash on closing in client mode
when rte_vhost_driver_unregister detstroy the vsocket, we
should set it to NULL after freeing it, because in client mode,
the conn may be added to reconnect thread while vsocket is
destroyed. In one case, if qemu create vhostuser port as a
server with the same unix path, the reconnect thread will
reconnect to it while vsocket is destroyed.

To fix this:
1. set vsocket to NULL after free it.
2. remove the reconnection from reconnection thread in suitable
   position.

Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:30:48 +01:00
Tonghao Zhang
8b4b949144 vhost: fix dead lock on closing in server mode
When qemu close the unix socket fd of the vhostuser as a
server, and then immediately delete the vhostuser port on
openvswitch. There will be a deadlock.

A thread (fdset event thread):       B thread:
1. fdset_event_dispatch              rte_vhost_driver_unregister
2. set the fd busy to 1.             lock vsocket->conn_mutex
3. vhost_user_read_cb                fdset_del waits busy changed to 0.
4. vhost peer closed, remove the
   conn from vsocket->conn_list:
   lock vsocket->conn_mutex

5. set the fd busy to 0

Fixes: 65388b43f592 ("vhost: fix fd leaks for vhost-user server mode")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:29:59 +01:00
Ferruh Yigit
04db1d0da7 lib: clear experimental version tag in linker scripts
Remove version tag from experimental block in linker version scripts
(.map files).

That label is not used by linker and information only. It is useful
for version blocks but not useful for experimental block but confusing.
Removing those labels.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-05-14 03:37:28 +02:00