This patch fixes the incorrect packet content copy in the
chaining mode. Originally the content before cipher offset is
overwritten by all zeros. This patch fixes the problem by
making sure the correct write back source and destination
settings during set up.
Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
We should apply for RO access when receiving packets from the
VM and apply for RW access when sending packets to the VM.
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
For packed ring layout, we need save avail index and its wrap
counter value. At restore time, the used index and its wrap counter
are set to available's ones, as the ring procressing is stopped
at vring base get time.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Currently, postcopy_ufd is initialized to 0 implicitly, so fd 0
could be closed unexpectedly by vhost_backend_cleanup(). Fix this
issue by initializing postcopy_ufd to -1 explicitly.
Fixes: 9eefef3b5970 ("vhost: introduce postcopy advise message")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In both split and packed dequeue paths, flush_shadow_used_ring
and vhost_ring_call variants gets called even if not packets
have been dequeued, and so no descriptors updates happened.
It has an impact on CPU pipeline, as memory barriers are used
in these functions.
This patch don't call these functions if no descriptors have
been dequeued. The performance gain with split ring when
dequeue zero-copy is disabled should be null, but should be
noticeable with packed ring or dequeue zero-copy enabled.
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Fixes: 915cf9404225 ("vhost: use shadow used ring in dequeue path")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Tested-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
We should return the length of the buffers described by
the current descriptor chain after filling the buffer
vector. So we need to zero the *len first.
Fixes: 2f3225a7d69b ("vhost: add vector filling support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Currently it's not possible to build DPDK as shared library with
cryptodev disabled since vhost is trying to link with rte_crypto,
but rte_crypto and rte_hash are only needed when you build vhost_crypto
and so only when cryptodev is enabled.
This patch fix this by linking rte_vhost with rte_crypto and rte_hash
only when cryptodev is enabled.
Fixes: b4ca81298613 ("vhost/crypto: fix build without cryptodev")
Fixes: 939066d96563 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Postcopy live-migration feature requires the application to
not populate the guest memory. As the vhost library cannot
prevent the application to that (e.g. preventing the
application to call mlockall()), the feature is disabled by
default.
The application should only enable the feature if it does not
force the guest memory to be populated.
In case the user passes the RTE_VHOST_USER_POSTCOPY_SUPPORT
flag at registration but the feature was not compiled,
registration fails.
For the same reason, postcopy and dequeue zero copy features
are not compatible, so don't advertize postcopy support if
dequeue zero copy is requested.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The master sends this message before stopping handling
userfaults, so that the backend closes the userfaultfd.
The master waits for the slave to acknowledge the request
with an empty 64bits payload for synchronization purpose.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The VHOST_USER_SET_MEM_TABLE payload is copied when handled,
whereas it could directly be referenced.
This is not very important, but next, we'll need to update the
payload and send it back to Qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
This patch opens a userfaultfd and sends it back to Qemu's
VHOST_USER_POSTCOPY_ADVISE request.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Postcopy live-migration features relies on userfaultfd,
which was only introduced in kernel v4.3.
This patch introduces a new define to allow building vhost
library on kernels not supporting userfaultfd.
With legacy build system, user has to explicitly set
CONFIG_RTE_LIBRTE_VHOST_POSTCOPY to 'y'.
With Meson build system, RTE_LIBRTE_VHOST_POSTCOPY gets
automatically defined if userfaultfd kernel header is
present.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Passing userfault fds to Qemu will be required for postcopy
live-migration feature.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This is not used for now, but will be needed for the
special handling of VHOST_USER_SET_MEM_TABLE message
once postcopy will be supported.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As soon as some ancillary data (fds) are received, it is copied
without checking its length.
This patch adds the number of fds received to the message,
which is set in read_vhost_message().
This is preliminary work to support sending fds to Qemu.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
When the memory table gets updated, the rings addresses need
to be translated again. If it fails, we need to exit cleanly
by unmapping memory regions.
Fixes: d5022533c20a ("vhost: retranslate vring addr when memory table changes")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
QEMU doesn't expect any payload for the reply of
VHOST_USER_SET_LOG_BASE request, so don't send any.
Note that the Vhost-user specification isn't clear about
it and would need to be fixed.
Fixes: 54f9e32305d4 ("vhost: handle dirty pages logging request")
Cc: stable@dpdk.org
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
For messages that require a reply, a second ack should not be
sent when reply-ack protocol feature is negotiated, even if
the corresponding flag is set in the message.
The code is compliant with the spec but it isn't clear it is,
so this patch adds a comment to make it explicit.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
VHOST_USER_GET_PROTOCOL_FEATURES, VHOST_USER_GET_VRING_BASE
and VHOST_USER_SET_LOG_BASE require replies, so their handlers
should return VH_RESULT_REPLY, not VH_RESULT_OK.
Fixes: 0bff510b5ea6 ("vhost: unify message handling function signature")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Return of message handling has now changed to an enum that can
take non-negative value that is not zero in case a reply is
needed. But the code checking the variable afterwards has not
been updated, leading to success messages handling being
treated as errors.
External post and pre callbacks return type needs also to be
changed to the new enum, so that its handling is consistent.
This is done in this patch alongside with the convertion of
its only user, vhost-crypto backend.
Fixes: 0bff510b5ea6 ("vhost: unify message handling function signature")
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As APIs in rte_vdpa.h are public, we need to add doxygen comments
to all APIs and structures.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The notification can't be disabled in packed ring when
application tries to disable notification, because the
device event flags field is overwritten by an unexpected
value. This patch fixes this issue.
Fixes: b1cce26af1dc ("vhost: add notification for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
It's used to get number of available registered vDPA devices.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When performing enqueue operations on the split and packed rings,
if the reserved buffer length from the descriptor table exceeds
65535, the returned length by fill_vec_buf_split/_packed()
overflows. This patch is to avoid this corner case.
Fixes: f689586bc060 ("vhost: shadow used ring update")
Fixes: fd68b4739d2c ("vhost: use buffer vectors in dequeue path")
Fixes: 2f3225a7d69b ("vhost: add vector filling support for packed ring")
Fixes: 37f5e79a271d ("vhost: add shadow used ring support for packed rings")
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")
Fixes: ae999ce49dcb ("vhost: add Tx support for packed ring")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Introduce vhost_message_handlers, which maps the message request
type to the message handler. Then replace the switch construct
with a map and call.
Failing vhost_user_set_features is fatal and all processing should
stop immediately and propagate the error to the upper layers. Change
the code accordingly to reflect that.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Each vhost-user message handling function will return an int result
which is described in the new enum vh_result: error, OK and reply.
All functions will now have two arguments, virtio_net double pointer
and VhostUserMsg pointer.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As VhostUserMsg structure is reused to generate the reply, move the
relevant fields update into the respective message handling functions.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Do not use the typedef version of struct VhostUserMsg. Also unify the
related parameter name.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There are a lot of cases where vhost-user massage handling
could fail and end up in a fully not recoverable state. For
example, allocation failures of shadow used ring and batched
copy array are not recoverable and leads to the segmentation
faults like this on the receiving/transmission path:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f913fecf0 (LWP 43625)]
in copy_desc_to_mbuf () at /lib/librte_vhost/virtio_net.c:760
760 batch_copy[vq->batch_copy_nb_elems].dst =
This could be easily reproduced in case of low memory or big
number of vhost-user ports.
Fix that by propagating error to the upper layer which will
end up with disconnection in case we can not report to
the message sender when the error happens.
Fixes: f689586bc060 ("vhost: shadow used ring update")
Cc: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When VIRTIO_RING_F_EVENT_IDX is negotiated, we need to
update the avail event to enable the notification.
Fixes: 3f8ff12821e4 ("vhost: support interrupt mode")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
'numa_realloc()' allocates 'zmbufs' even if zero copy mode
is not configured. This leads to memory leak, because array
is freed only for zero copy case.
Fixes: 2651726defb7 ("vhost: do deep copy while reallocating queue")
CC: stable@dpdk.org
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
It's a common case that 'get_mempolicy' fails on systems
without NUMA support. No need to flag an error in log for
this situation.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
IOTLB entries contain the host virtual address of the guest
pages. When receiving a new VHOST_USER_SET_MEM_TABLE request,
the previous regions get unmapped, so the IOTLB entries, if any,
will be invalid. It does cause the vhost-user process to
segfault.
This patch introduces a new function to flush the IOTLB cache,
and call it as soon as the backend handles a VHOST_USER_SET_MEM
request.
Fixes: 69c90e98f483 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
The shadow used ring's size is the same as the vq's size,
so we shouldn't try more than "vq size" times. Besides,
the element pointed by avail->idx isn't available to the
device, so we will return error when try "vq size" times.
Fixes: 24e4844048e1 ("vhost: unify Rx mergeable and non-mergeable paths")
Fixes: a922401f35cc ("vhost: add Rx support for packed ring")
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
This patch fixes the incorrect return value for rte_vhost_dequeue_burst()
when virtqueue is not enabled or virtqueue address translation fails.
Fixes: 62250c1d0978 ("vhost: extract split ring handling from Rx and Tx functions")
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Acked-by: Tiwei Bie <tiwei.bie@intel.com>
Fixes: fd68b4739d2c ("vhost: use buffer vectors in dequeue path")
Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>