This patch adds support for the new Virtio device status
Vhost-user protocol feature.
Getting such information in the backend helps to know
when the driver is done with the device configuration
and so makes the initialization phase more robust.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Performing large memory copies usually takes up a major part of CPU
cycles and becomes the hot spot in vhost-user enqueue operation. To
offload the large copies from CPU to the DMA devices, asynchronous
APIs are introduced, with which the CPU just submits copy jobs to
the DMA without waiting for the copies to complete. Thus, there is
no CPU intervention during data transfer. We can save precious CPU
cycles and improve the overall throughput for vhost-user based
applications. This patch introduces registration/un-registration
APIs for the vhost async data enqueue operation. Along with the
implementation of the registration APIs, it also defines the data
structures and the prototypes of the async callback functions
required by the async enqueue data path.
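As a rough illustration of the submit-then-poll model described above,
here is a minimal standalone sketch. The structure and function names
(copy_job, async_ops, toy_dma, toy_submit, toy_poll) are hypothetical and
are not the definitions added by this patch; the "DMA engine" is a toy
that only performs the copies when it is polled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical copy job and callback table; not the library's definitions. */
struct copy_job {
	void *dst;
	const void *src;
	uint32_t len;
};

struct async_ops {
	/* Queue up to count jobs; return how many were accepted. */
	uint16_t (*submit)(void *ctx, const struct copy_job *jobs,
			   uint16_t count);
	/* Return how many previously submitted jobs have finished. */
	uint16_t (*poll_completed)(void *ctx, uint16_t max);
};

/* Toy "DMA engine": copies are deferred until the queue is polled. */
#define TOY_QUEUE_LEN 8
struct toy_dma {
	struct copy_job q[TOY_QUEUE_LEN];
	uint16_t pending;
};

static uint16_t
toy_submit(void *ctx, const struct copy_job *jobs, uint16_t count)
{
	struct toy_dma *d = ctx;
	uint16_t n = 0;

	/* Accept jobs until the queue is full; the caller does not wait. */
	while (n < count && d->pending < TOY_QUEUE_LEN)
		d->q[d->pending++] = jobs[n++];
	return n;
}

static uint16_t
toy_poll(void *ctx, uint16_t max)
{
	struct toy_dma *d = ctx;
	uint16_t done = 0;

	/* Perform the oldest pending copies, up to max of them. */
	while (done < max && done < d->pending) {
		memcpy(d->q[done].dst, d->q[done].src, d->q[done].len);
		done++;
	}
	/* Drop the finished jobs from the head of the queue. */
	memmove(d->q, d->q + done, (d->pending - done) * sizeof(d->q[0]));
	d->pending -= done;
	return done;
}

static const struct async_ops toy_ops = { toy_submit, toy_poll };
```

The point of the split is that submit returns immediately, so the
enqueue path can keep working and only check for completions later.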
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
The vhost library provides an infrastructure to help DPDK
users manage vhost devices.
One part of this infrastructure is the feature enablement APIs.
Some feature bits may be defined only in the internal file vhost.h in
case the kernel version doesn't include them.
Hence, users running on an old kernel may not be able to manage these
features.
Move all the feature bit definitions to the API file rte_vhost.h.
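A minimal sketch of the kind of guarded definition this refers to,
assuming the usual #ifndef fallback pattern. The bit numbers are the ones
assigned by the virtio specification; feature_enabled() is a hypothetical
helper added only for illustration.

```c
#include <stdint.h>

/* Fallbacks for feature bits that older kernel headers may not define. */
#ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
#define VIRTIO_NET_F_GUEST_ANNOUNCE 21
#endif

#ifndef VIRTIO_NET_F_MQ
#define VIRTIO_NET_F_MQ 22
#endif

#ifndef VIRTIO_NET_F_MTU
#define VIRTIO_NET_F_MTU 3
#endif

/* Hypothetical helper: test one bit in a 64-bit feature mask. */
static inline int
feature_enabled(uint64_t features, unsigned int bit)
{
	return (int)((features >> bit) & 1);
}
```

With the definitions in the public header, an application can build a
feature mask such as (1ULL << VIRTIO_NET_F_MQ) regardless of the kernel
headers installed on the build machine.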
Fixes: db69be54b6ff ("vhost: hide internal code")
Fixes: 8d286dbeb8d7 ("vhost: fix multiple queue not enabled for old kernels")
Fixes: 3d3c6590b58c ("vhost: enable virtio MTU feature")
Fixes: 704098fc478c ("vhost: fix build with old kernels")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
Fix typos in doc/guides/eventdevs/index.rst,
drivers/net/mlx5/mlx5.c and lib/librte_vhost/rte_vhost.h.
Bugzilla ID: 477
Fixes: 0857b9421138 ("doc: add event device and software eventdev")
Fixes: 039253166a57 ("vhost: add device op when notification to guest is sent")
Fixes: ad74bc619504 ("net/mlx5: support multiport IB device during probing")
Cc: stable@dpdk.org
Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
This patch fixes the VHOST_USER_PROTOCOL_F_CONFIG flag missing from
vhost crypto during initialization.
Newer QEMU versions require this feature to be enabled.
Fixes: 939066d96563 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This message is used to notify QEMU that it should fetch the backend
configuration. For example, vhost-user-blk uses this message to notify
the guest OS that the capacity of the backend has changed.
The need_reply flag is not mandatory because it would block the sender
thread while the master process sends a get_config message to fetch the
configuration; this would need an extra thread to process the vhost
message.
Signed-off-by: Li Feng <fengli@smartx.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
rte_vhost_dequeue_burst supports two ways of dequeuing data.
If the data fits into a buffer, then all data is copied and a
single linear buffer is returned. Otherwise it allocates
additional mbufs and chains them together to return a
multi-segment mbuf.
While that covers most use cases, it forces applications that
need to work with larger data sizes to support multi-segment
mbufs. The non-linear characteristic brings complexity and
performance implications to the application.
To resolve the issue, add support for attaching an external buffer
to a pktmbuf and let the host indicate during registration whether
attaching an external buffer to a pktmbuf is supported and whether
only linear buffers are supported.
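The resulting choice between buffer layouts can be sketched as follows.
This is a standalone illustration, not the library code: the enum and
function names are hypothetical, and dropping the packet when only linear
buffers are allowed and no external buffer can be used is an assumption
about the intended behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical outcome names, for illustration only. */
enum buf_choice {
	BUF_LINEAR,   /* data fits in one mbuf */
	BUF_EXTERNAL, /* attach one external buffer large enough for the data */
	BUF_CHAINED,  /* allocate extra mbufs and chain them */
	BUF_DROP,     /* linear-only requested and no linear option is left */
};

static enum buf_choice
choose_buffer(uint32_t data_len, uint32_t mbuf_room,
	      bool extbuf_supported, bool linear_only)
{
	if (data_len <= mbuf_room)
		return BUF_LINEAR;
	if (extbuf_supported)
		return BUF_EXTERNAL;
	if (linear_only)
		return BUF_DROP; /* assumption: never hand back a chain */
	return BUF_CHAINED;
}
```

An application that sets neither option keeps the previous behaviour:
small packets are linear and large ones are chained.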
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch introduces two APIs: one for getting the inflight
ring and the other for getting the base.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch introduces three APIs to operate on the inflight
ring: set, set last and clear. Both split and packed rings
are covered.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch shows how to check the inflight ring and construct
the resubmit information. It also covers destroying the resubmit info.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds the inflight queue region structures for both
the split and packed rings.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds the packed ring to rte_vhost_vring.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds the inflight message description and
the inflight shared fd protocol feature flag.
Signed-off-by: Lin Li <lilin24@baidu.com>
Signed-off-by: Xun Ni <nixun@baidu.com>
Signed-off-by: Yu Zhang <zhangyu31@baidu.com>
Signed-off-by: Jin Yu <jin.yu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add IOVA versions of the dirty page logging functions.
Note that the API-facing rte_vhost_log_write() is not modified,
so make explicit that it expects addresses in GPA space.
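For context, dirty page logging in GPA space boils down to setting one
bit per touched guest page in a bitmap. The sketch below is a standalone
illustration with hypothetical names; the 4096-byte log granularity is an
assumption. An IOVA variant would presumably translate the address to a
GPA first and then perform the same marking.

```c
#include <stdint.h>

#define LOG_PAGE_SIZE 4096u /* assumed log granularity */

/* Set one bit per dirty page for the GPA range [gpa, gpa + len). */
static void
log_write_gpa(uint8_t *log_base, uint64_t log_size, uint64_t gpa,
	      uint64_t len)
{
	uint64_t page, last;

	if (len == 0)
		return;

	page = gpa / LOG_PAGE_SIZE;
	last = (gpa + len - 1) / LOG_PAGE_SIZE;
	for (; page <= last; page++) {
		if (page / 8 >= log_size)
			break; /* range falls outside the bitmap */
		log_base[page / 8] |= (uint8_t)(1u << (page % 8));
	}
}
```

Passing an IOVA to such a function when an IOMMU is in use would mark
the wrong pages, which is why the expected address space matters.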
Fixes: 69c90e98f483 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This function is listed under EXPERIMENTAL in the
rte_vhost_version.map, so it needs to be marked
with __rte_experimental in the header file as well.
Found by check-experimental-syms.sh when trying to compile
DPDK with -finstrument-functions. This script didn't
catch this in the normal case, since the function is
declared __rte_always_inline.
This also requires updating the vhost_scsi example to allow
use of this newly marked experimental API.
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds an operation callback which gets called every time
the library wakes up the guest through an eventfd_write() call.
This can be used by 3rd party applications, like OVS, to track the
number of times interrupts were generated. This might be of
interest to find out whether system calls were made in the fast path.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Putting a '__attribute__((deprecated))' in the middle of a function
prototype does not produce the expected result with gcc (while clang
is fine with this syntax).
$ cat deprecated.c
void * __attribute__((deprecated)) incorrect() { return 0; }
__attribute__((deprecated)) void *correct(void) { return 0; }
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
$ gcc -o deprecated.o -c deprecated.c
deprecated.c: In function ‘main’:
deprecated.c:3:1: warning: ‘correct’ is deprecated (declared at
deprecated.c:2) [-Wdeprecated-declarations]
int main(int argc, char *argv[]) { incorrect(); correct(); return 0; }
^
Move the tag to a separate line and make it the first thing in function
prototypes.
This is not perfect, but we will trust reviewers to catch the other,
harder-to-detect patterns.
sed -i \
-e '/^\([^#].*\)\?__rte_experimental */{' \
-e 's//\1/; s/ *$//; i\' \
-e __rte_experimental \
-e '/^$/d}' \
$(git grep -l __rte_experimental -- '*.h')
Special mention for rte_mbuf_data_addr_default():
There is either a bug or a (not yet understood) issue with gcc.
gcc won't drop this inline when unused and rte_mbuf_data_addr_default()
calls rte_mbuf_buf_addr() which itself is experimental.
This results in a build warning when not accepting experimental apis
from sources just including rte_mbuf.h.
For this specific case, we hide the call to rte_mbuf_buf_addr() under
the ALLOW_EXPERIMENTAL_API flag.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Add a missing include with the defines for vhost-user driver features.
Fixes: 5fbb3941da9f ("vhost: introduce driver features related APIs")
Cc: stable@dpdk.org
Signed-off-by: Noa Ezra <noae@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
External backends may have specific requests to handle, and so
we don't want the vhost-user lib to handle these requests as
errors.
This patch also changes the experimental API by introducing
RTE_VHOST_MSG_RESULT_NOT_HANDLED so that vhost-user lib
can report an error if a message is handled neither by
the vhost-user library nor by the external backend.
The logic changes a bit so that if the callback returns
ERR, OK or REPLY, the message is considered handled
by the external backend, so it won't be handled by the
vhost-user library.
It is still possible for an external backend to listen
to requests that have to be handled by the vhost-user
library, like SET_MEM_TABLE, but the callback has to
return NOT_HANDLED in that case.
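The dispatch rule described above can be sketched as follows. This is a
standalone model with hypothetical names (msg_result, dispatch and the
two example handlers); it is not the library code and the numeric values
of the result codes are assumptions.

```c
#include <stddef.h>

/* Hypothetical mirror of the result codes named above. */
enum msg_result {
	MSG_RESULT_ERR = -1,
	MSG_RESULT_OK = 0,
	MSG_RESULT_REPLY = 1,
	MSG_RESULT_NOT_HANDLED = 2,
};

typedef enum msg_result (*msg_handler)(int request);

/*
 * Offer the request to the external backend first. Any result other
 * than NOT_HANDLED means the backend consumed it. Otherwise fall back
 * to the library handler. A final NOT_HANDLED is an error for the caller.
 */
static enum msg_result
dispatch(int request, msg_handler external, msg_handler library)
{
	enum msg_result ret = MSG_RESULT_NOT_HANDLED;

	if (external != NULL)
		ret = external(request);
	if (ret != MSG_RESULT_NOT_HANDLED)
		return ret;
	if (library != NULL)
		ret = library(request);
	return ret;
}

/* Example handlers: the backend owns request 42, the library request 1. */
static enum msg_result
example_external(int request)
{
	return request == 42 ? MSG_RESULT_OK : MSG_RESULT_NOT_HANDLED;
}

static enum msg_result
example_library(int request)
{
	return request == 1 ? MSG_RESULT_REPLY : MSG_RESULT_NOT_HANDLED;
}
```

A backend that only wants to observe a library-owned request can do its
bookkeeping in the callback and still return NOT_HANDLED.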
Vhost-crypto backend is also adapted to this API change.
Suggested-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The rte_vhost_driver_set_protocol_features API is to be used
by external backends to advertise the vhost-user protocol
features they support.
It has to be called after rte_vhost_driver_register() and
before rte_vhost_driver_start().
Example of usage to advertise the VHOST_USER_PROTOCOL_F_FOOBAR
protocol feature:
const char *path = "/tmp/vhost-user";
uint64_t protocol_features;
rte_vhost_driver_register(path, 0);
rte_vhost_driver_get_protocol_features(path, &protocol_features);
protocol_features |= VHOST_USER_PROTOCOL_F_FOOBAR;
rte_vhost_driver_set_protocol_features(path, protocol_features);
rte_vhost_driver_start(path);
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
External message callbacks are used e.g. by vhost crypto
to parse crypto-specific vhost-user messages.
We are now publishing the API to register those callbacks,
so that other backends outside of DPDK can use them as well.
Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The postcopy live-migration feature requires the application to
not populate the guest memory. As the vhost library cannot
prevent the application from doing that (e.g. by preventing the
application from calling mlockall()), the feature is disabled by
default.
The application should only enable the feature if it does not
force the guest memory to be populated.
In case the user passes the RTE_VHOST_USER_POSTCOPY_SUPPORT
flag at registration but the feature was not compiled in,
registration fails.
For the same reason, postcopy and dequeue zero copy features
are not compatible, so don't advertise postcopy support if
dequeue zero copy is requested.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
When a vDPA device is attached, vhost user will try to
register host notifiers to QEMU to allow notifications
to be delivered between the driver in the guest and the
vDPA device in the host directly.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch marks rte_vhost_gpa_to_vva() as deprecated because
it is unsafe. Applications relying on this API should move
to the new rte_vhost_va_from_guest_pa() API, and check the
returned length to avoid out-of-bounds accesses.
This issue has been assigned CVE-2018-1059.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The new rte_vhost_va_from_guest_pa() API takes an extra len
parameter, used to specify the size of the range to be mapped.
The effective mapped range is returned via the len parameter.
This issue has been assigned CVE-2018-1059.
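To illustrate why the length matters, here is a standalone sketch of a
length-aware translation. The mem_region structure and the function below
are hypothetical stand-ins, not the library's definitions; the idea is
that the caller asks for a range and gets back how much of it is really
contiguous in the matched region.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical region descriptor, for illustration only. */
struct mem_region {
	uint64_t guest_phys_addr;
	uint64_t host_user_addr;
	uint64_t size;
};

/*
 * Translate gpa to a host virtual address. On return, *len is clamped
 * to the number of bytes available in the matched region. Returns 0 and
 * sets *len to 0 when gpa is not mapped.
 */
static uint64_t
va_from_guest_pa(const struct mem_region *regs, size_t nregs,
		 uint64_t gpa, uint64_t *len)
{
	size_t i;

	for (i = 0; i < nregs; i++) {
		const struct mem_region *r = &regs[i];

		if (gpa >= r->guest_phys_addr &&
		    gpa - r->guest_phys_addr < r->size) {
			uint64_t off = gpa - r->guest_phys_addr;
			uint64_t avail = r->size - off;

			if (*len > avail)
				*len = avail;
			return r->host_user_addr + off;
		}
	}

	*len = 0;
	return 0;
}
```

A caller that requested N bytes must compare the returned length with N
before touching the buffer, and handle the shortfall instead of reading
past the end of the region.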
Reported-by: Yongji Xie <xieyongji@baidu.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds virtio-crypto spec user message structure to
vhost_user.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds APIs to enable live migration for non-builtin data paths.
On the source side, last_avail/used_idx from the device need to be set
into the virtio_net structure, and the log_base and log_size from the
virtio_net structure need to be set into the device.
On the destination side, last_avail/used_idx need to be read from the
virtio_net structure and set into the device.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adapts vhost lib for selective datapath by calling device ops
at the corresponding stage.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds APIs for datapath configuration.
The device ID (did) of the vhost-user socket can be set to identify the
backend device; in this case each vhost-user socket can have only one
connection. The did is set to -1 by default when the software datapath
is used.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch exports vhost-user protocol features to support device driver
development.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Users of librte_vhost currently implement the vring call operation
themselves. Each caller performs the operation slightly differently.
This patch introduces a new librte_vhost API called
rte_vhost_vring_call() that performs the operation so that vhost-user
applications don't have to duplicate it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Replace the BSD license header with the SPDX tag for files
with only an Intel copyright on them.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
QEMU versions from v2.7.0 to v2.9.0 have their reply-ack protocol
feature implementation broken with multiqueue. The reply-ack
protocol feature is optional except for the IOMMU feature.
This patch introduces a new RTE_VHOST_USER_IOMMU_SUPPORT flag to
enable the VIRTIO_F_IOMMU_PLATFORM virtio feature.
By default, the IOMMU support is now disabled.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Add new callbacks to notify about socket connection status.
As destroy_device is used for a virtqueue processing *pause* as well as
for a connection close, the user cannot distinguish between the two.
Consider the following scenario:
  rte_vhost: received SET_VRING_BASE message,
             calling destroy_device() as usual
  user:      end-user asks to remove the device (together with socket file),
             OK, device is not *in use* - that's NOT the behavior we want
             calling rte_vhost_driver_unregister() etc.
Instead of changing new_device/destroy_device callbacks and breaking
the ABI, a set of new functions new_connection/destroy_connection
has been added.
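One way an application might use the two pairs of callbacks together is
sketched below. The state structure and handler names are hypothetical
and this is only a model of the bookkeeping; it does not use the library
types.

```c
#include <stdbool.h>

/* Hypothetical per-socket state kept by an application. */
struct sock_state {
	bool connected;      /* between new_connection and destroy_connection */
	bool device_running; /* between new_device and destroy_device */
};

static void
on_new_connection(struct sock_state *s)
{
	s->connected = true;
}

static void
on_destroy_connection(struct sock_state *s)
{
	s->connected = false;
	s->device_running = false;
}

static void
on_new_device(struct sock_state *s)
{
	s->device_running = true;
}

static void
on_destroy_device(struct sock_state *s)
{
	s->device_running = false;
}

/* A stopped device on a live connection is a pause, not a disconnect. */
static bool
device_is_paused(const struct sock_state *s)
{
	return s->connected && !s->device_running;
}
```

With only new_device/destroy_device, the paused and disconnected states
would look identical to the application.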
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
This patch implements the ops rx_queue_count for vhost PMD by adding
a helper function rte_vhost_rx_queue_count in vhost lib.
The ops rx_queue_count gets vhost RX queue avail count and helps to
understand the queue fill level.
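The fill level of a ring is typically the distance between two
free-running 16-bit indexes. A minimal standalone sketch, with a
hypothetical function name and assuming both indexes wrap at 65536:

```c
#include <stdint.h>

/*
 * Number of entries made available but not yet consumed. The cast keeps
 * the subtraction modulo 2^16, so the result stays correct when the
 * available index has wrapped around and the consumer index has not.
 */
static inline uint16_t
ring_entries_pending(uint16_t avail_idx, uint16_t last_avail_idx)
{
	return (uint16_t)(avail_idx - last_avail_idx);
}
```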
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Different drivers use internal macros like force_inline for the
compiler's always-inline feature.
Standardize this through the __rte_always_inline macro.
Verified the change by comparing the output binary file.
No difference found in the output binary file with this change.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Exported headers must allow compilation with the strictest flags. This
commit addresses the following errors:
In file included from /tmp/check-includes.sh.20132.c:1:0:
build/include/rte_vhost.h:73:30: error: ISO C forbids zero-size array
'regions' [-Werror=pedantic]
[...]
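For reference, the usual ISO C replacement for a zero-size trailing array
is a C99 flexible array member. The sketch below uses hypothetical
structure names and is only meant to show the pattern and how such a
structure is allocated.

```c
#include <stdint.h>
#include <stdlib.h>

struct region {
	uint64_t guest_phys_addr;
	uint64_t size;
};

struct memory {
	uint32_t nregions;
	struct region regions[]; /* flexible array member, not regions[0] */
};

/* Allocate the header plus room for nregions trailing elements. */
static struct memory *
memory_alloc(uint32_t nregions)
{
	struct memory *mem;

	mem = calloc(1, sizeof(*mem) + nregions * sizeof(mem->regions[0]));
	if (mem != NULL)
		mem->nregions = nregions;
	return mem;
}
```

The flexible array member contributes nothing to sizeof(struct memory),
so the allocation size has to account for the elements explicitly.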
Also:
- Add C++ awareness to rte_vhost.h for consistency with rte_eth_vhost.h.
- Move Linux includes into C++ block to prevent linking issues with
exported symbols.
- Update check-includes.sh following the removal of rte_virtio_net.h.
Finally, update check-includes.sh to ignore rte_vhost.h and rte_eth_vhost.h
from now on since the Linux headers they depend on are not clean enough:
In file included from /usr/include/linux/vhost.h:17:0,
from build/include/rte_vhost.h:43,
from build/include/rte_eth_vhost.h:44,
from /tmp/check-includes.sh.20132.c:1:
/usr/include/linux/virtio_ring.h: In function 'vring_init':
/usr/include/linux/virtio_ring.h:146:16: error: pointer of type 'void *'
used in arithmetic [-Werror=pointer-arith]
[...]
In file included from build/include/rte_vhost.h:43:0,
from build/include/rte_eth_vhost.h:44,
from /tmp/check-includes.sh.20132.c:1:
/usr/include/linux/vhost.h: At top level:
/usr/include/linux/vhost.h:73:3: error: ISO C99 doesn't support unnamed
structs/unions [-Werror=pedantic]
[...]
Fixes: eb32247457fe ("vhost: export guest memory regions")
Fixes: a798beb47c8e ("vhost: rename header file")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The rte_mbuf struct is likely to be used only in the vhost-user
net driver. However, now that vhost-user is generic enough to be used
for implementing other drivers (such as vhost-user SCSI), those drivers
also have to include <rte_mbuf.h>. Otherwise, the build will be broken.
We can work around it by using forward declarations, so that
non-net drivers won't need to include <rte_mbuf.h>.
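A forward declaration is enough whenever a header only passes pointers
to the type around. A minimal sketch follows; burst_is_empty() is a
hypothetical function used only to show that the header compiles without
the full structure definition.

```c
#include <stddef.h>

/* Forward declarations: enough for prototypes that only use pointers. */
struct rte_mbuf;
struct rte_mempool;

/* Hypothetical helper; it never dereferences the incomplete type. */
static inline int
burst_is_empty(struct rte_mbuf **pkts, unsigned int count)
{
	return pkts == NULL || count == 0;
}
```

Only code that actually looks inside an mbuf, such as the net driver,
then needs to include <rte_mbuf.h>.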
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Rename "rte_virtio_net.h" to "rte_vhost.h", so that it is not virtio
net specific.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>