numam-dpdk

Author	SHA1	Message	Date
Jiayu Hu	57b4eafa1d	vhost: support Explicit Congestion Notification In virtio, Explicit Congestion Notification (ECN) includes two parts: guest ECN and host ECN. Guest ECN means the frontend can handle TSO packets which have ECN set, and host ECN means the backend can handle TSO packets which have ECN set. The ECN features are rarely used. However, virtio-net enables them by default, and vhost-net supports both. To make live migration from vhost-net to vhost-user possible, this patch announces support for guest and host ECN in vhost-user. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Xiao Wang	c3ffdba0e8	vhost: use API to make RARP packet Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>	2018-01-16 18:47:49 +01:00
Olivier Matz	da51d2f6b8	vhost: fix error code check when creating thread On error, pthread_create() returns a positive number (errno). Fix the test on the return value. Fixes: af1475918124 ("vhost: introduce API to start a specific driver") Fixes: e623e0c6d8a5 ("vhost: add reconnect ability") Cc: stable@dpdk.org Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2018-01-16 18:47:49 +01:00
Tonghao Zhang	ae0b1de941	vhost: add reconnect thread name for client mode This patch adds the name for vhost-user reconnect thread. It can help us to know whether the thread is running. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Junjie Chen	e37ff95440	vhost: support virtqueue interrupt/notification suppression The driver can suppress interrupts when the VIRTIO_F_EVENT_IDX feature bit is negotiated. The driver sets vring flags to 0, and MAY use used_event in the available ring to advise the device to delay interrupts until the index specified by used_event is reached. The device ignores the lower bit of vring flags, and sends an interrupt when the index reaches used_event. The device can suppress notifications in a manner analogous to the way the driver suppresses interrupts. The device manipulates flags or avail_event in the used ring in the same way the driver manipulates flags or used_event in the available ring. Signed-off-by: Junjie Chen <junjie.j.chen@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Maxime Coquelin	e291093235	vhost: destroy unused virtqueues when multiqueue not negotiated QEMU sends VHOST_USER_SET_VRING_CALL requests for all queues declared in QEMU command line before the guest is started. It has the effect in DPDK vhost-user backend to allocate vrings for all queues declared by QEMU. If the first driver being used does not support multiqueue, the device never changes to VIRTIO_DEV_RUNNING state as only the first queue pair is initialized. One driver impacted by this bug is virtio-net's iPXE driver which does not support VIRTIO_NET_F_MQ feature. It is safe to destroy unused virtqueues in SET_FEATURES request handler, as it is ensured the device is not in running state at this stage, so virtqueues aren't being processed. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Maxime Coquelin	467fe22df9	vhost: extract virtqueue cleaning and freeing functions This patch extracts needed code for vhost_user.c to be able to clean and free virtqueues unitary. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Maxime Coquelin	59fe5e17d9	vhost: propagate set features handling error Not propagating VHOST_USER_SET_FEATURES request handling error may result in unpredictable behavior, as host and guests features may no more be synchronized. This patch fixes this by reporting the error to the upper layer, which would result in the device being destroyed and the connection with the master to be closed. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Maxime Coquelin	07f8db29b8	vhost: prevent features to be changed while device is running As section 2.2 of the Virtio spec states about features negotiation: "During device initialization, the driver reads this and tells the device the subset that it accepts. The only way to renegotiate is to reset the device." This patch implements a check to prevent illegal features change while the device is running. One exception is the VHOST_F_LOG_ALL feature bit, which is enabled when live-migration is initiated. But this feature is not negotiated with the Virtio driver, but directly with the Vhost master. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Laszlo Ersek <lersek@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Jiayu Hu	6d18505efa	vhost: support UDP Fragmentation Offload In virtio, UDP Fragmentation Offload (UFO) includes two parts: host UFO and guest UFO. Guest UFO means the frontend can receive large UDP packets, and host UFO means the backend can receive large UDP packets. This patch supports host UFO and guest UFO for vhost-user. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Stefan Hajnoczi	6c299bb732	vhost: introduce vring call API Users of librte_vhost currently implement the vring call operation themselves. Each caller performs the operation slightly differently. This patch introduces a new librte_vhost API called rte_vhost_vring_call() that performs the operation so that vhost-user applications don't have to duplicate it. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Stefan Hajnoczi	413a8fee30	vhost: add vring call helper Extract the callfd eventfd signal operation so virtio_net.c does not have to repeat it multiple times. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Jiayu Hu	ee1bc7d0dc	vhost: support Generic Segmentation Offload In virtio, Generic Segmentation Offload (GSO) is the feature for the backend, which means the backend can receive packets with any GSO type. Virtio-net enables the GSO feature by default, and vhost-net supports it. To make live migration from vhost-net to vhost-user possible, this patch enables GSO for vhost-user. Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Junjie Chen	803aeecef1	vhost: fix dequeue zero copy with virtio1 This fixes dequeue zero copy not working with Qemu version >= 2.7. Since Qemu 2.7, virtio devices use the virtio-1 protocol, and the zero copy code path forgets to add the offset to the buffer address. Fixes: b0a985d1f340 ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Junjie Chen <junjie.j.chen@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Jianfeng Tan	cab278dee9	vhost: fix crash In a running VM, operations (like device attach/detach) will trigger QEMU to resend set_mem_table to the vhost-user backend. DPDK vhost-user handles this message rudely by unmapping all existing regions and mapping new ones. This might lead to a segfault if a pmd thread is just trying to touch those unmapped memory regions. But in most cases, except VM memory hotplug, QEMU still sends the set_mem_table message even if the memory regions are not changed, as QEMU vhost-user filters out those not backed by file (fd > 0). To fix this case, we add a check in the handler to see if the memory regions are really changed; if not, we just keep the old memory regions. Fixes: 8f972312b8f4 ("vhost: support vhost-user") CC: stable@dpdk.org Reported-by: Yang Zhang <zy107165@alibaba-inc.com> Reported-by: Xin Long <longxin.xl@alibaba-inc.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2018-01-16 18:47:49 +01:00
Bruce Richardson	369991d997	lib: use SPDX tag for Intel copyright files Replace the BSD license header with the SPDX tag for files with only an Intel copyright on them. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2018-01-04 22:41:39 +01:00
Maxime Coquelin	002d6a7e55	vhost: add flag to enable IOMMU support Qemu versions from v2.7.0 to v2.9.0 have their reply-ack protocol feature implementation broken with multiqueue. The reply-ack protocol feature is optional except for the IOMMU feature. This patch introduces a new RTE_VHOST_USER_IOMMU_SUPPORT flag to enable the VIRTIO_F_IOMMU_PLATFORM virtio feature. By default, the IOMMU support is now disabled. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>	2017-11-07 14:19:11 +01:00
Maxime Coquelin	6ea069651e	vhost: disable reply-ack feature if IOMMU disabled If the application has disabled VIRTIO_F_IOMMU_PLATFORM, disable VHOST_USER_PROTOCOL_F_REPLY_ACK protocol feature that is only mandatory with IOMMU for now. This is done to provide a way for the application to support multiqueue with old Qemu versions (v2.7.0 to v2.9.0) that have reply-ack feature broken. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org> Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com> Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>	2017-11-07 14:13:47 +01:00
Maxime Coquelin	5a4933e56b	vhost: postpone ring address translations at kick time only If multiple queue pairs are created but not all are used, the device is never started, as unused queues aren't enabled and their ring addresses aren't translated. The device is changed to running state when all ring addresses are translated. This patch fixes this by postponing ring address translation to kick time unconditionally, whether VHOST_USER_F_PROTOCOL_FEATURES is negotiated or not. Reported-by: Lei Yao <lei.a.yao@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-11-07 02:33:05 +01:00
Santosh Shukla	df6e0a06a3	drivers/net: rename physical address type to IOVA Renamed data type from phys_addr_t to rte_iova_t. Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>	2017-11-06 22:44:26 +01:00
Santosh Shukla	455da54539	mbuf: rename physical address to IOVA Rename buf_physaddr to buf_iova. Keep the deprecated name in an anonymous union to avoid breaking the API. Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Olivier Matz <olivier.matz@6wind.com>	2017-11-06 22:44:26 +01:00
Thomas Monjalon	62196f4e09	mem: rename address mapping function to IOVA The function rte_mem_virt2phy() is kept and used in functions which works only with physical addresses. For all other calls this function is replaced by rte_mem_virt2iova() which does a direct mapping (no conversion) in the VA case. Note: the new function rte_mem_virt2iova() function matches the behaviour implemented in rte_mem_virt2phy() by the commit 680f6c12600f ("mem: honor IOVA mode in virt2phy") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>	2017-11-06 22:24:19 +01:00
Tiwei Bie	1d8161ba02	vhost: fix dequeue offload support When offload is enabled, vhost needs to access the first mbuf to get the packet info, e.g. TCP header. So we couldn't delay the data copy in this case. Fixes: e5c494a7a22b ("vhost: batch small guest memory copies") Reported-by: Lei Yao <lei.a.yao@intel.com> Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-24 21:31:23 +02:00
Maxime Coquelin	5cd690e4fd	vhost: fix vring addresses not translated Commit 3ea7052f4b1b ("vhost: postpone rings addresses translation") moves rings addresses translation to either vring kick or enable time, depending on whether protocol features are enabled or not. This is done to not interpret ring information as long as the vring is not fully initialized. The problem is that with old QEMU versions, like v2.5, the ring is enabled before addresses are sent, so addresses are never translated. This patch fixes the issue by doing the translation in VHOST_USER_SET_VRING_ADDR handling if the ring is already enabled. Fixes: 3ea7052f4b1b ("vhost: postpone rings addresses translation") Reported-by: Lei Yao <lei.a.yao@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-24 21:26:10 +02:00
Olivier Matz	cbc12b0a96	mk: do not generate LDLIBS from directory dependencies The list of libraries in LDLIBS was generated from the DEPDIRS-xyz variable. This is valid when the subdirectory name match the library name, but it's not always the case, especially for PMDs. The patches removes this feature and explicitly adds the proper libraries in LDLIBS. Some DEPDIRS-xyz variables become useless, remove them. Reported-by: Gage Eads <gage.eads@intel.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Gage Eads <gage.eads@intel.com>	2017-10-24 02:14:57 +02:00
Maxime Coquelin	86fe881c03	vhost: fetch ring address after NUMA reallocation In case of NUMA reallocation, the virtqueue struct is reallocated on another socket, meaning that its address changes. In translate_ring_addresses(), the addr pointer was not fetched again after the reallocation, so it pointed to freed memory. This patch just fetches the addr pointer again after the reallocation. Reported-by: Lei Yao <lei.a.yao@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2017-10-13 22:08:21 +02:00
Maxime Coquelin	b9c07b3141	vhost: fix IOTLB on NUMA realloc In case of NUMA reallocation, the virtqueue's iotlb list is broken, as its head changes but the first iotlb entry in the list still points to the previous head pointer. Also, in case of reallocation, we want the IOTLB cache mempool to be on the new socket. This patch performs a full re-init of the IOTLB cache when the mempool already exists, and calls the IOTLB cache init function in case the virtqueue is being reallocated on a new socket. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2017-10-13 22:08:21 +02:00
Maxime Coquelin	1aadb2f6b1	vhost: fix deadlock on IOTLB miss An optimization was done to only take the iotlb cache lock once per packet burst instead of once per IOVA translation. With this, IOTLB miss requests are sent to Qemu with the lock held, which can cause a deadlock if the socket buffer is full, and if Qemu is waiting for an IOTLB update to be done. Holding the lock is not necessary when sending an IOTLB miss request, as it is not manipulating the IOTLB cache list, which the lock protects. Let's just release it while sending the IOTLB miss. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2017-10-13 22:08:21 +02:00
Bruce Richardson	d65b3b1668	vhost: fix false-positive warning from clang 5 When compiling with clang extra warning flags, such as used by default with meson, a warning is given in iotlb.c: lib/librte_vhost/iotlb.c:318:6: warning: variable 'socket' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized] This is a false positive, as the socket value will be initialized by the call to get_mempolicy in the case where the NUMA build-time flag is set, and in cases where it is not set, "if (ret)" will always be true as ret is initialized to -1 and never changed. However, this is not immediately obvious, and is perhaps a little fragile, as it will break if other code using ret is subsequently added above the call to get_mempolicy by someone unaware of this subtle dependency. Therefore, we can fix the warning and make the code more robust by explicitly initializing socket to zero, and moving the extra condition check on the return from get_mempolicy() into the #ifdef. Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>	2017-10-11 13:56:34 +02:00
Maxime Coquelin	3494ed045e	vhost: distinguish master and slave requests This patch adds an union in VhostUserMsg to distinguish between master and slave initiated requests, instead of casting slave requests as master request. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:54:31 +02:00
Dariusz Stojaczyk	efba12a78d	vhost: add user callbacks for socket open/close Added new callbacks to notify about socket connection status. As destroy_device is used for virtqueue processing pause as well as connection close, the user has no distinction between those. Consider the following scenario: rte_vhost: received SET_VRING_BASE message, calling destroy_device() as usual user: end-user asks to remove the device (together with socket file), OK, device is not in use - that's NOT the behavior we want calling rte_vhost_driver_unregister() etc. Instead of changing new_device/destroy_device callbacks and breaking the ABI, a set of new functions new_connection/destroy_connection has been added. Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com> Reviewed-by: Jens Freimann <jfreimann@redhat.com>	2017-10-10 15:54:31 +02:00
Kuba Kozak	66a6210124	vhost: check poll error code Add return value check for poll() call. Coverity issue: 140740 Fixes: 59317cef249c ("vhost: allow many vhost-user ports") Cc: stable@dpdk.org Signed-off-by: Kuba Kozak <kubax.kozak@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:54:31 +02:00
Maxime Coquelin	69c90e98f4	vhost: enable IOMMU support Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:53:27 +02:00
Maxime Coquelin	36031f80cc	vhost: invalidate vring in case of matching IOTLB invalidate As soon as a page used by a ring is invalidated, the access_ok flag is cleared, so that processing threads try to map them again. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	eefac9536a	vhost: postpone device creation until rings are mapped Translating the start addresses of the rings is not enough, we need to be sure all the ring is made available by the guest. It depends on the size of the rings, which is not known on SET_VRING_ADDR reception. Furthermore, we need to be safe against vring pages invalidates. This patch introduces a new access_ok flag per virtqueue, which is set when all the rings are mapped, and cleared as soon as a page used by a ring is invalidated. The invalidation part is implemented in a following patch. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	09927b5249	vhost: translate ring addresses when IOMMU enabled When IOMMU is enabled, the ring addresses set by the VHOST_USER_SET_VRING_ADDR requests are guest's IO virtual addresses, whereas Qemu virtual addresses when IOMMU is disabled. When enabled and the required translation is not in the IOTLB cache, an IOTLB miss request is sent, but being called by the vhost-user socket handling thread, the function does not wait for the requested IOTLB update. The function will be called again on the next IOTLB update message reception if matching the vring addresses. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	3ea7052f4b	vhost: postpone rings addresses translation This patch postpones rings addresses translations and checks, as addresses sent by the master should not be interpreted as long as the ring is not started and enabled[0]. When protocol features aren't negotiated, the ring is started in enabled state, so the addresses translations are postponed to vhost_user_set_vring_kick(). Otherwise, it is postponed to when ring is enabled, in vhost_user_set_vring_enable(). [0]: http://lists.nongnu.org/archive/html/qemu-devel/2017-05/msg04355.html Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	b0098b5e21	vhost: fix dereferencing invalid pointer after realloc numa_realloc() reallocates the virtio_net device structure and updates the vhost_devices[] table with the new pointer if the rings are allocated different NUMA node. Problem is that vhost_user_msg_handler() still dereferences old pointer afterward. This patch prevents this by fetching again the dev pointer in vhost_devices[] after messages have been handled. Fixes: af295ad4698c ("vhost: realloc device and queues to same numa node as vring desc") Cc: stable@dpdk.org Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	321203a54b	vhost: enable rings at the right time When VHOST_USER_F_PROTOCOL_FEATURES is negotiated, the ring is not enabled when started, but enabled through dedicated VHOST_USER_SET_VRING_ENABLE request. When not negotiated, the ring is started in enabled state, at VHOST_USER_SET_VRING_KICK request time. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	62fdb8255a	vhost: use the guest IOVA to host VA helper Replace rte_vhost_gpa_to_vva() calls with vhost_iova_to_vva(), which requires to also pass the mapped len and the access permissions needed. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	fed67a20ac	vhost: introduce guest IOVA to backend VA helper This patch introduces vhost_iova_to_vva() function to translate guest's IO virtual addresses to backend's virtual addresses. When IOMMU is enabled, the IOTLB cache is queried to get the translation. If missing from the IOTLB cache, an IOTLB_MISS request is sent to Qemu, and IOTLB cache is queried again on IOTLB event notification. When IOMMU is disabled, the passed address is a guest's physical address, so the legacy rte_vhost_gpa_to_vva() API is used. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	e95f34d380	vhost: handle IOTLB update and invalidate requests Vhost-user device IOTLB protocol extension introduces VHOST_USER_IOTLB message type. The associated payload is the vhost_iotlb_msg struct defined in Kernel, which in this case can be either an IOTLB update or invalidate message. On IOTLB update, the virtqueues get notified of a new entry. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	76e99bfc4c	vhost: initialize vrings IOTLB caches The per-virtqueue IOTLB cache init is done at virtqueue init time. init_vring_queue() now takes vring id as parameter, so that the IOTLB cache mempool name can be generated. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	01a4bb55f9	vhost: support IOTLB miss slave requests Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	f72c2ad63a	vhost: add pending IOTLB miss request list and helpers In order to be able to handle other ports or queues while waiting for an IOTLB miss reply, a pending list is created so that waiter can return and restart later on with sending again a miss request. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	d012d1f293	vhost: add IOTLB helper functions Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	06903abc0d	vhost: add IOMMU-related macros for old kernels These defines and enums have been introduced in upstream kernel v4.8, and backported to RHEL 7.4. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	275c3f9447	vhost: support slave requests channel Currently, only QEMU sends requests, the backend sends replies. In some cases, the backend may need to send requests to QEMU, like IOTLB miss events when IOMMU is supported. This patch introduces a new channel for such requests. QEMU sends a file descriptor of a new socket using VHOST_USER_SET_SLAVE_REQ_FD. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	a0563bd2e3	vhost: prepare for slave requests send_vhost_message() is currently only used to send replies, so it modifies message flags to prepare the reply. With the upcoming channel for backend initiated requests, this function can be used to send requests. This patch introduces a new send_vhost_reply() that does the message flags modifications, and makes send_vhost_message() generic. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
Maxime Coquelin	25bf7a0b09	vhost: make error handling consistent in Rx path In the non-mergeable receive case, when copy_mbuf_to_desc() call fails the packet is skipped, the corresponding used element len field is set to vnet header size, and it continues with next packet/desc. It could be a problem because it does not know why it failed, and assume the desc buffer is large enough. In mergeable receive case, when copy_mbuf_to_desc_mergeable() fails, packets burst is simply stopped. This patch makes the non-mergeable error path to behave as the mergeable one, as it seems the safest way. Also, doing this way will simplify pending IOTLB miss requests handling. Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>	2017-10-10 15:52:27 +02:00
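
Note on commit e37ff95440 (interrupt/notification suppression): the index comparison described in that message can be sketched in C as below. This is a minimal illustration based on the vring_need_event() helper from the virtio specification, not the code from the patch itself. The names need_event() and should_interrupt() and the event_idx_negotiated parameter are hypothetical; only the flag value VRING_AVAIL_F_NO_INTERRUPT (1) comes from the virtio ring definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag the driver sets in the available ring to ask for no interrupts. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/*
 * Returns true if the ring index moved past event_idx while going from
 * old_idx to new_idx. All arithmetic is modulo 2^16, so the test also
 * holds when the 16-bit index wraps around.
 */
static inline bool
need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}

/*
 * Hypothetical device-side decision: with VIRTIO_F_EVENT_IDX negotiated
 * the device looks at used_event only; otherwise it honours the
 * NO_INTERRUPT flag written by the driver.
 */
static inline bool
should_interrupt(bool event_idx_negotiated, uint16_t avail_flags,
		 uint16_t used_event, uint16_t new_used, uint16_t old_used)
{
	if (event_idx_negotiated)
		return need_event(used_event, new_used, old_used);
	return !(avail_flags & VRING_AVAIL_F_NO_INTERRUPT);
}
```

For instance, with used_event set to 5, moving the used index from 5 to 6 crosses the requested index and should raise an interrupt, while the same move with used_event set to 10 should not.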

325 Commits
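
Note on commit da51d2f6b8 (error code check when creating a thread): pthread_create() returns 0 on success and a positive error number on failure, and it does not set errno, so a test such as "ret < 0" never fires. The sketch below shows the corrected style of check. The helper names worker() and start_worker() are hypothetical and are not taken from the vhost sources.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body used only for illustration. */
static void *
worker(void *arg)
{
	(void)arg;
	return NULL;
}

/*
 * Starts a thread and reports failure correctly: the return value of
 * pthread_create() is itself the error number, so compare it against 0
 * rather than testing for a negative value.
 */
static int
start_worker(pthread_t *tid)
{
	int ret = pthread_create(tid, NULL, worker, NULL);

	if (ret != 0) {
		fprintf(stderr, "pthread_create failed: %s\n", strerror(ret));
		return -1;
	}
	return 0;
}
```

A caller would typically follow a successful start_worker() with pthread_join() or pthread_detach() on the returned thread ID.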