This message is used to enable/disable a specific vring queue pair.
The first queue pair is enabled by default.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Do not use VIRTIO_RXQ or VIRTIO_TXQ anymore; use the queue_id
instead, which will be set to a proper value for a specific queue
when we have multiple queue support enabled.
For now, queue_id is still set with VIRTIO_RXQ or VIRTIO_TXQ,
so it should not break anything.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
All queue pairs, including the default (the first) queue pair,
are allocated dynamically, when a vring_call message is received
first time for a specific queue pair.
This is a refactor work for enabling vhost-user multiple queue;
it should not break anything as it does no functional changes:
we don't support mq set, so there is only one mq at max.
This patch is based on Changchun's patch.
Signed-off-by: Ouyang Changchun <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Add VHOST_USER_GET_QUEUE_NUM message
to tell the frontend (qemu) how many queue pairs we support.
And it is initiated to VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The two protocol features messages are introduced by qemu vhost
maintainer(Michael) for extendting vhost-user interface. Here is
an excerpta from the vhost-user spec:
Any protocol extensions are gated by protocol feature bits,
which allows full backwards compatibility on both master
and slave.
The vhost-user multiple queue features will be treated as a vhost-user
extension, hence, we have to implement the two messages first.
VHOST_USER_PROTOCOL_FEATURES is initialized to 0, as we don't support
any yet.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Huawei Xie <huawei.xie@intel.com>
virtio-net search for it's device in reset_owner.
The function don't check the return result of get_config_ll_entry.
Using get_config_ll_entry in reset_owner don't show any error when the
device is not found. This patch fix this by using get_device instead
instead of get_config_ll_entry.
In user_get_vring_base, get_device return is not checked and may cause
segfault when device is not found.
Signed-off-by: Jerome Jutteau <jerome.jutteau@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
virtio-net clean and init device after a VHOST_USER_RESET_OWNER.
This reset device identifier to 0 and break ll_root listing logic.
This patch keep the old device identifier and re-write it on the cleaned
device.
Signed-off-by: Jerome Jutteau <jerome.jutteau@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
In merge-able RX path, vhost injects interrupts to guest for each packet.
This should degrade performance a lot.
This patch fixes this issue by injecting one interrupt for a batch of packets.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
According to eventfd man page:
typedef uint64_t eventfd_t;
int eventfd_read(int fd, eventfd_t *value);
int eventfd_write(int fd, eventfd_t value);
eventfd_t is defined for the second arg(value), but not for fd.
Here I redefine those fd fields to `int' type, which also removes
the redundant (int) cast. And as the man page stated, we should
cast 1 to eventfd_t type for eventfd_write().
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
The vHost eventlink driver is a kernel module that requires a kernel
source/build directory to build the ko. Convert the fixed kernel build
directory specifier to one which may be user specified on the command-line.
Signed-off-by: Aaron Conole <aconole@redhat.com>
This patch originates from the patch:
"Patch for Qemu wrapper for US-VHost to ensure Qemu process ends when
VM is shutdown", http://dpdk.org/ml/archives/dev/2014-June/003606.html
Also update the vhost sample guide doc.
Signed-off-by: Claire Murphy <claire.k.murphy@intel.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
This commit removes the dev-index, so update the doc for this change:
Fixes: 17b8320a3e ("vhost: remove index parameter")
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
It adds more readable log info if a socket fails to bind to
local socket file name.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
rte_vhost_driver_unregister API will remove the listenfd from event list,
and then close it.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Peng Sun <peng.a.sun@intel.com>
In the event handler of connection fd, the connection fd could be possibly
closed. The event dispatch loop would then try to remove the fd from fdset.
Between these two actions, another thread might register a new listenfd
reusing the val of just closed fd, so we couldn't call fdset_del which would
wrongly clean up the new listenfd. A new function fdset_del_slot is provided
to cleanup the fd at the specified location.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
When we get the address of vring descriptor table in VHOST_SET_VRING_ADDR
message, will try to reallocate vhost device and virt queue to the same
numa node.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
This patch simply applies the transform previously committed in
scripts/cocci/mtod-offset.cocci. No other modifications have been
made here.
Signed-off-by: Cyril Chemparathy <cchemparathy@ezchip.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Remove these unnecessary vring descriptor length updating, vhost should
not change them.
virtio in front end should assign value to desc.len for both rx and tx.
Test report: http://dpdk.org/ml/archives/dev/2015-June/018610.html
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Extract codes into a function:
update_secure_len which is used to accumulate the buffer len in the
vring descriptors and to fill struct buf_vec.
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Vring enqueue need consider the 2 cases:
1. use separate descriptors to contain virtio header and actual data,
e.g. the first descriptor is for virtio header, and then followed
by descriptors for actual data.
2. virtio header and some data are put together in one descriptor,
e.g. the first descriptor contain both virtio header and part of
actual data, and then followed by more descriptors for rest of packet
data, current DPDK based virtio-net pmd implementation is this case;
So does vring dequeue, it should not assume vring descriptor is chained
or not chained, it should use desc->flags to check whether it is chained
or not. This patch also fixes TX corrupt issue when vhost co-work with
virtio-net driver which uses one single vring descriptor (header and data
are in one descriptor) for virtio tx process on default.
Test report: http://dpdk.org/ml/archives/dev/2015-June/018610.html
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
When we migrate VM, without this feature, qemu will report error:
"migrate: Migration disabled: vhost lacks VHOST_F_LOG_ALL feature".
Signed-off-by: Krishna Murthy <krishna.j.murthy@intel.com>
Update of used->idx and read of avail->flags could be reordered.
Memory fence should be used to ensure the order, otherwise guest could see
a stale used->idx value after it toggles the interrupt suppression flag.
After guest sets the interrupt suppression flag, it will check if there
is more buffer to process through used->idx. If it sees a stale value,
it will exit the processing while host won't send interrupt to guest.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Reviewed-by: Luke Gorrie <luke@snabb.co>
The virtio_net header file includes the mbuf header file, but it does not
need to do so as it only uses pointers to the struct rte_mbuf type, and
does not use any of the mbuf internals, nor any of the mbuf functions or
macros. Therefore the inclusion is unnecessary, and can be replaced by a
forward declaration of the mbuf type.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Due to increased `struct file's reference counter subsequent call
to `filp_close' does not free the `struct file'. Prepend `fput' call
to decrease the reference counter.
Signed-off-by: Pavel Boldin <pboldin@mirantis.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
When failed to malloc buffer from mempool we just update last_used_idx but
not used->idx so after many times vhost thought have handle all packets
but virtio_net thought vhost have not handle all packets and will not
update avail->idx.
Signed-off-by: Haifeng Lin <haifeng.lin@huawei.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Fix the error "missing initializer" and "cast to pointer from integer of different size".
For the pointer to integer cast issue, need to investigate changing the typeof mapped_address.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Let's make sure people will not forget to set and unset VIRTIO_DEV_RUNNING.
Signed-off-by: Benoît Canet <benoit.canet@nodalink.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
Turn on CONFIG_RTE_LIBRTE_VHOST to enable vhost.
vhost-user is turned on by default. Turn off CONFIG_RTE_LIBRTE_VHOST_USER to
enable vhost-cuse implementation.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Previous vhost implementation wrongly name kickfd as callfd and callfd as kickfd.
It is functional correct, but causes confusion.
Exchange kickfd and callfd to avoid confusion.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Changchun Ouyang <changchun.ouyang@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
This patch reorder the code a bit to use loop instead of goto.
Besides, remove abudant check 'fd != -1'.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
combine sleep into select when there is no file descriptors to be monitored.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch fixes the segfault issue in the case vhost receives
new VHOST_SET_MEM_TABLE message without VHOST_VRING_GET_VRING_BASE
(which we uses as the stop message).
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tommy Long <thomas.long@intel.com>
In C++11 concatenated string literals need to have a space in between.
Found with clang++-3.4, IIRC g++-4.8 also complains about this.
Sample error message:
error: invalid suffix on literal; C++11 requires a space between literal
and identifier [-Wreserved-user-defined-literal]
Signed-off-by: Stefan Puiu <stefan.puiu@gmail.com>
Reviewed-by: John McNamara <john.mcnamara@intel.com>
* support calling rte_vhost_driver_register after rte_vhost_driver_session_start
* add mutext to protect fdset from concurrent access
* add busy flag in fdentry. this flag is set before cb and cleared after cb is finished.
mutex lock scenario in vhost:
* event_dispatch(in rte_vhost_driver_session_start) runs in a separate thread, infinitely
processing vhost messages through cb(callback).
* event_dispatch acquires the lock, get the cb and its context, mark the busy flag,
and releases the mutex.
* vserver_new_vq_conn cb calls fdset_add, which acquires the mutex and add new fd into fdset.
* vserver_message_handler cb frees data context, marks remove flag to request to delete
connfd(connection fd) from fdset.
* after cb returns, event_dispatch
1. clears busy flag.
2. if there is remove request, call fdset_del, which acquires mutex, checks busy flag, and
removes connfd from fdset.
* rte_vhost_driver_unregister(not implemented) runs in another thread, acquires the mutex,
calls fdset_del to remove fd(listenerfd) from fdset. Then it could free data context.
The above steps ensures fd data context isn't freed when cb is using.
VM(s) should have been shutdown before rte_vhost_driver_unregister.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
for vhost-cuse, ifname is the name of the tap device
for vhost-user, ifname is the name of the unix domain socket path
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
In rte_vhost_driver_register(), vhost unix domain socket listener fd is created
and added to polled(based on select) fdset.
In rte_vhost_driver_session_start(), fds in the fdset are checked for
processing. If there is new connection from qemu, connection fd accepted is
added to polled fdset. The listener and connection fds in the fdset are
then both checked. When there is message on the connection fd, its
callback vserver_message_handler is called to process vhost-user messages.
To support identifying which virtio is from which guest VM, we could call
rte_vhost_driver_register with different socket path. Virtio devices from
same VM will connect to VM specific socket. The socket path information is
stored in the virtio_net structure.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
remove set_memory_table ops
vhost-cuse or vhost-user will both implement their own set_memory_region handler.
In current vhost-cuse implementation, guest numa memory isn't supported.
Assume that guest memory is backed by only one file.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Przemyslaw Czesnowicz <przemyslaw.czesnowicz@intel.com>
This functions accepts a virtual address and pid(qemu), and maps it into
current process(vhost)'s address space.
The memory behind the virtual address should be backed by a file,
and virtual address should be the starting address.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>