Commit Graph

434 Commits

Author SHA1 Message Date
Maxime Coquelin
b1cce26af1 vhost: add notification for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
ae999ce49d vhost: add Tx support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
a922401f35 vhost: add Rx support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
2f3225a7d6 vhost: add vector filling support for packed ring
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
cf3af2be49 vhost: create descriptor mapping function
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
37f5e79a27 vhost: add shadow used ring support for packed rings
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
3e2f9700bb vhost: append shadow used ring function names with split
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
62250c1d09 vhost: extract split ring handling from Rx and Tx functions
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
2c22b14388 vhost: clear batch copy index at copy time
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
d6315ce796 vhost: make indirect desc table copy desc type agnostic
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Maxime Coquelin
7e47bba30a vhost: clear shadow used table index at flush time
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Yuanhan Liu
2d1541e2b6 vhost: add vring address setup for packed queues
Add code to set up packed queues when enabled.

Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:19:29 +02:00
Jens Freimann
d3211c98c4 vhost: add helpers for packed virtqueues
Add some helper functions to check descriptor flags
and check if a vring is of type packed.

Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Jens Freimann
297b1e7350 vhost: add virtio packed virtqueue defines
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
611994fc1b vhost: improve prefetching in enqueue path
This is an optimization to prefetch next buffer while the
current one is being processed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
c1058a6b16 vhost: prefetch first descriptor in dequeue path
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
380a2adff3 vhost: improve prefetching in dequeue path
This is an optimization to prefetch next buffer while the
current one is being processed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
fd68b4739d vhost: use buffer vectors in dequeue path
To ease packed ring layout integration, this patch makes
the dequeue path reuse the buffer vectors implemented for
the enqueue path.

Doing this, copy_desc_to_mbuf() is now ring layout type
agnostic.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Maxime Coquelin
915cf94042 vhost: use shadow used ring in dequeue path
Relax used ring contention by reusing the shadow used
ring feature used by the enqueue path.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-10 23:13:36 +02:00
Marvin Liu
22d2e78840 vhost: advertise support for in-order feature
If devices always use descriptors in the same order in which they have
been made available, they can offer the VIRTIO_F_IN_ORDER feature. If
negotiated, this knowledge allows devices to notify the use of a batch
of buffers to the virtio driver by only writing the used ring index.

The vhost-user device supports this feature by default. If vhost
dequeue zero copy is enabled, VIRTIO_F_IN_ORDER should be disabled, as
vhost can’t assure that descriptors returned from the NIC are in order.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Tiwei Bie
ad0fdb696a vhost: fix potential null pointer dereference
Coverity issue: 293097
Fixes: d90cf7d111 ("vhost: support host notifier")

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Maxime Coquelin
e11411b52a vhost: fix missing increment of log cache count
The log_cache_nb_elem was never incremented, causing
all dirty pages to be missed during live migration.

Fixes: c16915b871 ("vhost: improve dirty pages logging performance")
Cc: stable@dpdk.org

Reported-by: Peng He <xnhp0320@icloud.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-07-03 01:35:58 +02:00
Maxime Coquelin
63b113afa5 vhost: use SMP memory barrier before kicking guest
vhost_vring_call() used rte_mb(), which translates into
mfence instruction on x86.

This patch switches to rte_smp_mb(), which was recently changed
to translate into a locked ADD instruction for performance
reasons.

The measured gain is up to 3% with the testpmd benchmarks.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-06-15 12:27:25 +02:00
Tonghao Zhang
2396806765 vhost: introduce new function helper
Introduce a new common helper to avoid redundancy.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 12:27:25 +02:00
Tiwei Bie
d90cf7d111 vhost: support host notifier
When a vDPA device is attached, vhost user will try to
register host notifiers to QEMU to allow notifications
to be delivered between the driver in the guest and the
vDPA device in the host directly.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 09:49:39 +02:00
Tonghao Zhang
76b1e4cec7 vhost: refine new device function
Make sure a valid device id is found before allocating
virtio_net; if not, return directly. This avoids
allocating and freeing virtio_net when there is
no valid device id.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-15 09:49:09 +02:00
Maxime Coquelin
c89e52d9c8 vhost: improve batched copies performance
Instead of copying batch_copy_nb_elems into the stack,
this patch uses it directly.

A small performance gain of 3% is seen when running the PVP
benchmark.

Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-14 19:27:50 +02:00
Maxime Coquelin
24e4844048 vhost: unify Rx mergeable and non-mergeable paths
This patch reworks the vhost enqueue path so that a single
code path is used for both Rx mergeable and non-mergeable cases.

Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-06-14 19:27:50 +02:00
Maxime Coquelin
c16915b871 vhost: improve dirty pages logging performance
This patch caches all dirty pages logging until the used ring index
is updated.

The goal of this optimization is to fix a performance regression
introduced when the vhost library started to use atomic operations
to set bits in the shared dirty log map. While the fix was valid
as previous implementation wasn't safe against concurrent accesses,
contention was induced.

With this patch, during migration, we have:
1. Fewer atomic operations, as only a single atomic OR operation
is needed per 32 or 64 (depending on CPU) pages.
2. Fewer atomic operations, as during a burst the same page will
be marked dirty only once.
3. Fewer write memory barriers.

Fixes: 897f13a1f7 ("vhost: make page logging atomic")
Cc: stable@dpdk.org

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
2018-05-17 14:19:05 +02:00
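The caching idea described in c16915b871 can be sketched roughly as below. This is a minimal illustration, not the code from the commit: the names `dirty_log`, `log_cache_entry`, `log_cache_page`, `log_cache_sync` and the `CACHE_MAX` capacity are hypothetical, a 64-bit bitmap word is assumed, and a full cache is simply flushed before inserting. Dirty bits are accumulated per bitmap word in a private cache, then published with one atomic OR per word when the used ring index is about to be updated.

```c
#include <stdatomic.h>
#include <stdint.h>

#define BITS_PER_WORD 64  /* assumption: 64-bit bitmap words */
#define CACHE_MAX 8       /* hypothetical cache capacity */

struct log_cache_entry {
    uint32_t offset;      /* index of the word in the shared dirty bitmap */
    uint64_t val;         /* dirty bits accumulated for that word */
};

struct dirty_log {
    _Atomic uint64_t *map;                  /* shared dirty bitmap */
    struct log_cache_entry cache[CACHE_MAX];
    uint16_t nb_elem;                       /* number of valid cache entries */
};

/* Publish the cached bits: one atomic OR per cached word, then reset. */
static void log_cache_sync(struct dirty_log *log)
{
    for (uint16_t i = 0; i < log->nb_elem; i++)
        atomic_fetch_or(&log->map[log->cache[i].offset], log->cache[i].val);
    log->nb_elem = 0;
}

/* Record a dirty page in the private cache without touching shared memory. */
static void log_cache_page(struct dirty_log *log, uint64_t page)
{
    uint32_t offset = (uint32_t)(page / BITS_PER_WORD);
    uint64_t bit = UINT64_C(1) << (page % BITS_PER_WORD);

    /* A page whose word is already cached costs no new entry. */
    for (uint16_t i = 0; i < log->nb_elem; i++) {
        if (log->cache[i].offset == offset) {
            log->cache[i].val |= bit;
            return;
        }
    }

    /* Simplification for this sketch: flush when the cache is full. */
    if (log->nb_elem == CACHE_MAX)
        log_cache_sync(log);

    log->cache[log->nb_elem].offset = offset;
    log->cache[log->nb_elem].val = bit;
    log->nb_elem++;  /* the element count must be advanced for the entry to be flushed */
}
```

A caller would invoke `log_cache_page()` for each page written during a burst and `log_cache_sync()` once before updating the used ring index.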
Fan Zhang
3c79609fda vhost/crypto: handle virtually non-contiguous buffers
This patch enables the handling of buffers non-contiguous in
virtual address space in the vhost_crypto. Instead of using
rte_vhost_va_from_guest_pa(), the host virtual address is
converted by vhost_iova_to_vva() for wider use cases.

For copy mode, the copy length is limited to the chunk size,
with the next chunks' VAs fetched afterwards.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-17 12:29:05 +02:00
Fan Zhang
2017bc3356 vhost/crypto: fix descriptor move
This patch fixes the redundant descriptor move in the copy mode
of vhost crypto. The redundant descriptor move caused
message parsing errors.

Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-17 12:29:05 +02:00
Maxime Coquelin
d5022533c2 vhost: retranslate vring addr when memory table changes
When the vhost-user master sends memory updates using
VHOST_USER_SET_MEM request, the user backend unmaps and then
mmaps again the memory regions in its address space.

If the ring addresses have already been translated, they need to
be translated again as they point to unmapped memory.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:50 +01:00
Fan Zhang
dc3530ec0d vhost/crypto: fix symmetric ciphering
A bracket was misplaced in a condition check; this patch
fixes it.

Coverity issue: 277232, 277237
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
Maxime Coquelin
8eac207d49 vhost: fix header copy to discontiguous desc buffer
In the loop to copy virtio-net header to the descriptor buffer,
destination pointer was incremented instead of the source
pointer.

Fixes: fb3815cc61 ("vhost: handle virtually non-contiguous buffers in Rx-mrg")
Fixes: 6727f5a739 ("vhost: handle virtually non-contiguous buffers in Rx")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
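The pointer handling fixed by 8eac207d49 can be illustrated with a small, self-contained sketch. It is not the vhost code: `struct chunk` and `copy_hdr_to_chunks` are hypothetical names standing in for the pieces of a descriptor buffer that are not contiguous in host virtual memory. Each chunk supplies its own destination address, so the only pointer that must advance across iterations is the source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one host-virtually contiguous piece of a descriptor buffer. */
struct chunk {
    void *addr;
    size_t len;
};

/* Copy up to 'len' header bytes into successive chunks; returns bytes copied. */
static size_t copy_hdr_to_chunks(const void *hdr, size_t len,
                                 const struct chunk *chunks, size_t nr_chunks)
{
    const unsigned char *src = hdr;
    size_t remain = len;

    for (size_t i = 0; i < nr_chunks && remain > 0; i++) {
        size_t n = remain < chunks[i].len ? remain : chunks[i].len;

        memcpy(chunks[i].addr, src, n);
        src += n;      /* advance the source, not the destination */
        remain -= n;
    }

    return len - remain;
}
```

If the destination were advanced instead, every chunk would receive the first bytes of the header again.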
Tonghao Zhang
bfbf0143d1 vhost: fix typo in comment
Fixes: 3670686ab9 ("vhost: fix race for connection fd")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:31:03 +01:00
Tonghao Zhang
52d874dc67 vhost: fix crash on closing in client mode
When rte_vhost_driver_unregister destroys the vsocket, we
should set it to NULL after freeing it, because in client mode
the conn may be added to the reconnect thread while the vsocket
is being destroyed. In one case, if QEMU creates a vhostuser port
as a server with the same unix path, the reconnect thread will
reconnect to it while the vsocket is destroyed.

To fix this:
1. Set vsocket to NULL after freeing it.
2. Remove the reconnection from the reconnect thread at a suitable
   position.

Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:30:48 +01:00
Tonghao Zhang
8b4b949144 vhost: fix deadlock on closing in server mode
When QEMU closes the unix socket fd of the vhostuser as a
server, and the vhostuser port is then immediately deleted on
openvswitch, a deadlock occurs.

A thread (fdset event thread):       B thread:
1. fdset_event_dispatch              rte_vhost_driver_unregister
2. set the fd busy to 1.             lock vsocket->conn_mutex
3. vhost_user_read_cb                fdset_del waits busy changed to 0.
4. vhost peer closed, remove the
   conn from vsocket->conn_list:
   lock vsocket->conn_mutex

5. set the fd busy to 0

Fixes: 65388b43f5 ("vhost: fix fd leaks for vhost-user server mode")
Cc: stable@dpdk.org

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-14 22:29:59 +01:00
Ferruh Yigit
04db1d0da7 lib: clear experimental version tag in linker scripts
Remove version tag from experimental block in linker version scripts
(.map files).

That label is not used by the linker and is informational only. It is
useful for version blocks, but for the experimental block it is only
confusing, so those labels are removed.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-05-14 03:37:28 +02:00
Fan Zhang
613e827fb2 vhost/crypto: fix checks while moving descriptors
This patch fixes the final condition check while moving virtqueue
descriptors.

Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-27 19:49:20 +02:00
Fan Zhang
d4cc4c65df vhost/crypto: fix missing head correction
This patch fixes the missing head descriptor correction for
indirect descriptors.

Fixes: 0aee242841 ("vhost/crypto: move to safe GPA translation API")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-27 19:49:07 +02:00
Xiao Wang
dfdf4b84b8 vhost: fix vDPA set features
We should call set_features callback after setting features in virtio_net
structure, otherwise vDPA driver cannot get the right features.

Fixes: 07718b4f87 ("vhost: adapt library for selective datapath")

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-27 18:01:00 +01:00
Maxime Coquelin
bb77d555d4 vhost: revert avoid concurrency when logging dirty pages
This reverts commit 394313fff3.

While the patch did solve the concurrency issue, it induces more
page copies as some clean pages are marked as dirty for
performance reasons. Moreover, as there is no more contention
doing the logging, the rate of packets that can be processed is
higher, leading to even more pages being dirtied.

It has been reported that with more than one queue pair, and
with a relatively low packet rate (1Mpps), the live migration
never converges until the flow is stopped.

Until a better solution is found, it is better to revert to the
old behaviour, i.e. using atomic operations for dirty pages
logging.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-27 18:01:00 +01:00
Maxime Coquelin
996096e629 vhost/crypto: fix build with gcc 4.7.2
Build error has been reported by Intel build system:
SUSE12SP3_64 / Linux 3.7.10-1 / GCC 4.7.2
lib/librte_vhost/vhost_crypto.c: In function ‘rte_vhost_crypto_set_zero_copy’:
lib/librte_vhost/vhost_crypto.c:1192:2: error:
comparison of unsigned expression < 0 is always false

As enums can be either signed or unsigned, this patch removes
the negative check and casts the upper limit check to unsigned.

Fixes: 939066d965 ("vhost/crypto: add public function implementation")

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-27 11:31:39 +02:00
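The approach described in 996096e629 can be sketched as follows. The enum and function names here are hypothetical and chosen for illustration only. Whether the compiler picks a signed or an unsigned underlying type for the enum, casting the value to unsigned before the upper-bound comparison rejects negative inputs too, because they wrap to very large values, so no separate `< 0` test is needed.

```c
#include <stdbool.h>

/* Hypothetical option enum; the last enumerator marks the upper limit. */
enum zero_copy_mode {
    ZERO_COPY_DISABLE = 0,
    ZERO_COPY_ENABLE,
    ZERO_COPY_MAX
};

/* Range check that compiles cleanly for signed or unsigned enum types. */
static bool zero_copy_mode_is_valid(enum zero_copy_mode mode)
{
    /* A negative value converts to a large unsigned value and fails here. */
    return (unsigned int)mode < (unsigned int)ZERO_COPY_MAX;
}
```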
Olivier Matz
6383d2642b eal: set name when creating a control thread
To avoid code duplication, add a parameter to rte_ctrl_thread_create()
to specify the name of the thread.

This requires adding a wrapper for the thread start routine in
rte_thread_init(), which first waits until the thread is configured.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-04-25 00:51:31 +02:00
Olivier Matz
9e5afc72c9 eal: add function to create control threads
Many parts of DPDK use their own management threads. Introduce a new
wrapper for thread creation that will be extended in the next commits
to set the name and affinity.

To be consistent with other DPDK APIs, the return value is negative in
case of error, which was not the case for pthread_create().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-04-25 00:51:31 +02:00
Olivier Matz
dec7b1884a use sizeof to avoid double use of a length define
Only a cosmetic change: the *_LEN defines are already used
when defining the buffer. Using sizeof() ensures that the length
stays consistent, even if the definition is modified.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2018-04-25 00:51:31 +02:00
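The pattern behind dec7b1884a can be shown with a minimal sketch. `WORKER_NAME_LEN`, `struct worker` and `worker_set_name` are hypothetical names, not taken from the commit. The length define appears only where the array is declared, and every later use takes `sizeof()` of the array itself, so the bound cannot drift if the declaration changes.

```c
#include <stdio.h>

#define WORKER_NAME_LEN 16  /* hypothetical length define */

struct worker {
    char name[WORKER_NAME_LEN];  /* the only place the define is used */
};

/* Format a name into the fixed buffer; returns 0 on success, -1 if truncated. */
static int worker_set_name(struct worker *w, unsigned int id)
{
    /* sizeof(w->name) follows the array definition automatically. */
    int n = snprintf(w->name, sizeof(w->name), "worker-%u", id);

    return (n < 0 || (size_t)n >= sizeof(w->name)) ? -1 : 0;
}
```

Note that this only works on a real array: applied to a pointer parameter, `sizeof()` would yield the pointer size instead of the buffer length.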
Maxime Coquelin
9553e6e408 vhost: deprecate unsafe GPA translation API
This patch marks rte_vhost_gpa_to_vva() as deprecated because
it is unsafe. Applications relying on this API should move
to the new rte_vhost_va_from_guest_pa() API, and check the
returned length to avoid out-of-bounds accesses.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-23 17:12:13 +02:00
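The length check recommended in 9553e6e408 can be modelled with a self-contained sketch. The code below does not call the DPDK API: `struct mem_region`, `va_from_guest_pa` and `map_desc` are hypothetical stand-ins that imitate a translation function which reports, through an in/out length, how many bytes are actually contiguous from the translated address. The caller must compare that length with what it asked for before touching the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guest memory region: guest-physical range and host mapping. */
struct mem_region {
    uint64_t guest_phys_addr;
    uint64_t size;
    uint64_t host_user_addr;
};

/*
 * Model of a safe translation: returns the host address for 'gpa' and
 * shrinks *len to the number of bytes mapped contiguously from there.
 * Returns 0 and sets *len to 0 when 'gpa' is not mapped.
 */
static uint64_t va_from_guest_pa(const struct mem_region *regs, size_t nregs,
                                 uint64_t gpa, uint64_t *len)
{
    for (size_t i = 0; i < nregs; i++) {
        const struct mem_region *r = &regs[i];

        if (gpa >= r->guest_phys_addr && gpa - r->guest_phys_addr < r->size) {
            uint64_t off = gpa - r->guest_phys_addr;
            uint64_t avail = r->size - off;

            if (*len > avail)
                *len = avail;
            return r->host_user_addr + off;
        }
    }

    *len = 0;
    return 0;
}

/* Caller side: refuse the buffer unless the whole requested length is mapped. */
static int map_desc(const struct mem_region *regs, size_t nregs,
                    uint64_t gpa, uint64_t want, uint64_t *va)
{
    uint64_t len = want;

    *va = va_from_guest_pa(regs, nregs, gpa, &len);
    if (*va == 0 || len != want)
        return -1;  /* unmapped, or the buffer crosses the end of the region */

    return 0;
}
```

Returning an error on a short length is the simplest policy; a caller could instead loop and translate the remainder chunk by chunk.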
Maxime Coquelin
0aee242841 vhost/crypto: move to safe GPA translation API
This patch uses the new rte_vhost_va_from_guest_pa() API
to ensure the whole descriptor buffer is mapped contiguously
in the application virtual address space.

It does not handle buffers discontiguous in host virtual
address space, but only returns an error.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-23 17:12:13 +02:00
Maxime Coquelin
fb3815cc61 vhost: handle virtually non-contiguous buffers in Rx-mrg
This patch enables the handling of buffers non-contiguous in
process virtual address space in the enqueue path when mergeable
buffers are used.

When the virtio-net header doesn't fit in a single chunk, it is
computed in a local variable and copied to the buffer chunks
afterwards.

For packet content, the copy length is limited to the chunk
size, with the next chunks' VAs fetched afterwards.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-23 17:12:13 +02:00
Maxime Coquelin
6727f5a739 vhost: handle virtually non-contiguous buffers in Rx
This patch enables the handling of buffers non-contiguous in
process virtual address space in the enqueue path when mergeable
buffers aren't used.

When the virtio-net header doesn't fit in a single chunk, it is
computed in a local variable and copied to the buffer chunks
afterwards.

For packet content, the copy length is limited to the chunk
size, with the next chunks' VAs fetched afterwards.

This issue has been assigned CVE-2018-1059.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-04-23 17:12:13 +02:00