A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.
[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Start a new release cycle with empty release notes.
The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Async copy fails when single ring buffer vector is split on multiple
physical pages. This happens because current hpa address translation
function doesn't handle multi-page buffers. A new gpa to hpa address
conversion function, which returns the hpa on the first hitting host
pages, is implemented in this patch. Async data path recursively calls
this new function to construct a multi-segments async copy descriptor
for ring buffers crossing physical page boundaries.
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If rte_vhost_enable_guest_notification is called before
the virtqueue is ready, the configuration is lost.
This patch fixes this by saving the guest notification
enablement value requested by the application, and apply
it before the virtqueue is made ready to the application.
Fixes: 604052ae5395 ("net/vhost: support queue update")
Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
The async copy device callbacks are used by async APIs to transfer data
and check completion status. Async APIs return the number of packets
successfully processed to the caller applications and no error
(negative) value is allowed for API return value. Thus, negative return
values from async device callbacks don't have meaningful usage, while
adding overhead in checking the return value validity. This patch change
the callback return values from "int" to "uint32_t" to get aligned with
async API definition.
Fixes: 78639d54563a ("vhost: introduce async enqueue registration API")
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
zmbufs should be set to NULL when getting freed to avoid double free on
the same buffer pointer
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In async enqueue copy, a packet could be split into multiple copy
segments. When polling the copy completion status, current async data
path assumes the async device callbacks are aware of the packet
boundary and return completed segments only if all segments belonging
to the same packet are done. Such assumption are not generic to common
async devices and may degrade the copy performance if async callbacks
have to implement it in software manner.
This patch adds tracking of the completed copy segments at vhost side.
If async copy device reports partial completion of a packets, only
vhost internal record is updated and vring status keeps unchanged
until remaining segments of the packet are also finished. The async
copy device is no longer necessary to care about the packet boundary.
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Vring should not be touched if vq is disabled. This patch adds the vq
status check in async enqueue polling to avoid accessing to a disabled
queue.
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds the check of dev pointer in vhost async enqueue
completion poll. If a NULL dev pointer detected, the poll function
returns immediately.
Coverity issue: 360839
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch adds support to the new Virtio device get status
Vhost-user message.
The driver can send this new message to read the device status.
One of the uses of this message is to ensure the feature negotiation has
succeeded. According to the virtio spec, after completing the feature
negotiation, the driver sets the FEATURE_OK status bit and re-reads it
to ensure the device has accepted the features.
This patch also clears the FEATURE_OK status bit if the feature
negotiation has failed to let the driver know about his failure.
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch adds support to the new Virtio device status
Vhost-user protocol feature.
Getting such information in the backend helps to know
when the driver is done with the device configuration
and so makes the initialization phase more robust.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch checks whether vDPA device configuration
succeed and does not set the CONFIGURED flag if it
didn't.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Some of the vDPA callbacks have to be implemented
for vDPA to work properly.
This patch marks them as mandatory in the API doc and
simplify code calling these ops with removing
unnecessary checks that are now done at registration
time.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch is a small refactoring, as preliminary work
for adding support to Virtio status support.
No functional change here.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Before checking whether the device is ready is done
a check on whether the RUNNING flag is set. Then the
READY flag is set if virtio_is_ready() returns true.
While it seems to not cause any issue, it makes more
sense to check whether the READY flag is set and not
the RUNNING one.
Fixes: c0674b1bc898 ("vhost: move the device ready check at proper place")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Restrict pointer aliasing to allow the compiler to vectorize loop
more aggressively.
With this patch, a 9.6% improvement is observed in throughput for
the packed virtio-net PVP case, and a 2.8% improvement in throughput
for the packed virtio-user PVP case. All performance data are measured
on ThunderX-2 platform under 0.001% acceptable packet loss with 1 core
on both vhost and virtio side.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch implements async enqueue data path for split ring. 2 new
async data path APIs are defined, by which applications can submit
and poll packets to/from async engines. The async engine is either
a physical DMA device or it could also be a software emulated backend.
The async enqueue data path leverages callback functions registered by
applications to work with the async engine.
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Performing large memory copies usually takes up a major part of CPU
cycles and becomes the hot spot in vhost-user enqueue operation. To
offload the large copies from CPU to the DMA devices, asynchronous
APIs are introduced, with which the CPU just submits copy jobs to
the DMA but without waiting for its copy completion. Thus, there is
no CPU intervention during data transfer. We can save precious CPU
cycles and improve the overall throughput for vhost-user based
applications. This patch introduces registration/un-registration
APIs for vhost async data enqueue operation. Together with the
registration APIs implementations, data structures and the prototype
of the async callback functions required for async enqueue data path
are also defined.
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.
It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
The vhost library provide an infrastructure in order to help the DPDK
users to manage vhost devices.
One of the infrastructure parts is the features enablement APIs.
Some features bits may be defined only in the internal file vhost.h in
case the kernel version doesn't include them.
Hence, user running on old kernel may not be able to manage thus
features.
Move all the feature bits definitions to the API file rte_vhost.h.
Fixes: db69be54b6ff ("vhost: hide internal code")
Fixes: 8d286dbeb8d7 ("vhost: fix multiple queue not enabled for old kernels")
Fixes: 3d3c6590b58c ("vhost: enable virtio MTU feature")
Fixes: 704098fc478c ("vhost: fix build with old kernels")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When virtq call or kick file descriptors are changed in the device
configuration when the queue is ready, the application and the vDPA
driver should be notified to be aligned to the new file descriptors.
Notify the state to be disabled before the file descriptor update and
return it back to be enabled after the update.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Some vDPA drivers' basic configurations should be updated when the
guest memory is hotplugged.
Close vDPA device before hotplug operation and recreate it after the
hotplug operation is done.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Some guest drivers may not configure disabled virtio queues.
In this case, the vhost management never notifies the application and
the vDPA device readiness because it waits to the device to be ready.
The current ready state means that all the virtio queues should be
configured regardless the enablement status.
In order to support this case, this patch changes the ready state:
The device is ready when at least 1 queue pair is configured and
enabled.
So, now, the application and vDPA driver are notifies when the first
queue pair is configured and enabled.
Also the queue notifications will be triggered according to the new
ready definition.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
No need to take access lock in the vhost-user message handler when
vDPA driver controls all the data-path of the vhost device.
It allows the vDPA set_vring_state operation callback to configure
guest notifications.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
As an arrangement to per queue operations in the vDPA device it is
needed to change the next experimental API:
The API ``rte_vhost_host_notifier_ctrl`` was changed to be per queue
instead of per device.
A `qid` parameter was added to the API arguments list.
Setting the parameter to the value RTE_VHOST_QUEUE_ALL configures the
host notifier to all the device queues as done before this patch.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch split the vDPA header file in two, making
rte_vdpa_device structure opaque to the application.
Applications should only include rte_vdpa.h, while drivers
should include both rte_vdpa.h and rte_vdpa_dev.h.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This API is no more useful, this patch removes it.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch is preliminary work to make the vDPA device
structure opaque to the user application. Some callbacks
of the vDPA devices are used to query capabilities before
attaching to a Vhost port. This patch introduces wrappers
for these ops.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
There is no more notion of device ID outside of vdpa.c.
We can now move from array to linked-list model for keeping
track of the vDPA devices.
There is no point in using array here, as all vDPA API are
used from the control path, so no performance concerns.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
vDPA is no more used outside of the vDPA internals,
so remove rte_vdpa_get_device() API that is now useless.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch replaces the use of vDPA device ID with
vDPA device pointer. The goals is to remove the vDPA
device ID to avoid confusion with the Vhost ID.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch is a preliminary step to get rid of the
vDPA device ID. It makes vDPA callbacks to use the
vDPA device struct as a reference instead of the ID.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch makes the vDPA framework to no more
support only PCI devices, but any devices by relying
on the generic device name as identifier.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
This patch introduces vDPA device class. It will enable
application to iterate over the vDPA devices.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
The vDPA device offloads all the datapath of the vhost
device to the HW device.
In order to expose to the user traffic information this
patch introduces new 3 APIs to get traffic statistics, the
device statistics name and to reset the statistics per
virtio queue.
The statistics are taken directly from the vDPA driver
managing the HW device and can be different for each vendor
driver.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As announced during v20.05 release cycle, this
patch makes reply-ack protocol feature to be enabled
unconditionally.
This protocol feature makes the communication between the
master and the slave more robust, avoiding for example
possible undefined behaviour with VHOST_USER_SET_MEM_TABLE.
Also, reply-ack support will be required for upcoming
VHOST_USER_SET_STATUS request.
Note that this protocol feature was disabled by default
because Qemu version 2.7.0 to 2.9.0 had a bug causing a
deadlock when reply-ack was negotiated and multiqueue
enabled. These Qemu version are now very old and no more
maintained, so we can reasonably consider we no more
support them.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch fixes the situation where vhost-user cannot start as server
with dequeue_zero_copy enabled.
Using flag instead of vsocket->is_server to determine whether vhost-user
is in client mode. Because vsocket->is_server is not ready at this time.
Fixes: 715070ea10e6 ("vhost: prevent zero-copy with incompatible client mode")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Removed the typing error in doc/guides/eventdevs/index.rst,
drivers/net/mlx5/mlx5.c and in lib/librte_vhost/rte_vhost.h
Bugzilla ID: 477
Fixes: 0857b9421138 ("doc: add event device and software eventdev")
Fixes: 039253166a57 ("vhost: add device op when notification to guest is sent")
Fixes: ad74bc619504 ("net/mlx5: support multiport IB device during probing")
Cc: stable@dpdk.org
Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
vhost buffer allocation is successful for packets that fit
into a linear buffer. If it fails, vhost library is expected
to drop the current packet and skip to the next.
The patch fixes the error scenario by skipping to next packet.
Note: Drop counters are not currently supported.
Fixes: c3ff0ac70acb ("vhost: improve performance by supporting large buffer")
Cc: stable@dpdk.org
Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Vhost will create temporary file when receiving VHOST_USER_GET_INFLIGHT_FD
message. Malicious guest can send endless this message to drain out the
resource of host.
When receiving VHOST_USER_GET_INFLIGHT_FD message repeatedly, closing the
file created during the last handling of this message.
CVE-2020-10726
Fixes: d87f1a1cb7b666550 ("vhost: support inflight info sharing")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
A malicious container which has direct access to the vhost-user socket
can keep sending VHOST_USER_GET_INFLIGHT_FD messages which may cause
leaking resources until resulting a DOS. Fix it by unmapping the
dev->inflight_info->addr before assigning new mapped addr to it.
CVE-2020-10726
Fixes: d87f1a1cb7b6 ("vhost: support inflight info sharing")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Malicious guest can construct desc with invalid address and zero buffer
length. That will request vhost to check both translated address and
translated data length. This patch will add missed address check.
CVE-2020-10725
Fixes: 75ed51697820 ("vhost: add packed ring batch dequeue")
Fixes: ef861692c398 ("vhost: add packed ring batch enqueue")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
transform_cipher_param() and transform_chain_param() handle
the payload data for the VHOST_USER_CRYPTO_CREATE_SESS
message. These payloads have to be validated, since it
could come from untrusted sources.
Two buffers and their lengths are defined in this payload,
one the the auth key and one for the cipher key. But above
functions do not validate the key length inputs, which could
lead to read out of bounds, as buffers have static sizes of
64 bytes for the cipher key and 512 bytes for the auth key.
This patch adds necessary checks on the key length field
before being used.
CVE-2020-10724
Fixes: e80a98708166 ("vhost/crypto: add session message handler")
Cc: stable@dpdk.org
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
vhost_user_check_and_alloc_queue_pair() is used to extract
a vring index from a payload. This function validates the
index and is called early on in when performing message
handling. Most message handlers depend on it correctly
validating the vring index.
Depending on the message type the vring index is in
different parts of the payload. The function contains a
switch/case for each type and copies the index. This is
stored in a uint16. This index is then validated. Depending
on the message, the source index is an unsigned int. If
integer truncation occurs (uint->uint16) the top 16 bits
of the index are never validated.
When they are used later on (e.g. in
vhost_user_set_vring_num() or vhost_user_set_vring_addr())
it can lead to out of bound indexing. The out of bound
indexed data gets written to, and hence this can cause
memory corruption.
This patch fixes this vulnerability by declaring vring
index as an unsigned int in
vhost_user_check_and_alloc_queue_pair().
CVE-2020-10723
Fixes: 160cbc815b41 ("vhost: remove a hack on queue allocation")
Cc: stable@dpdk.org
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
vhost_user_set_log_base() is a message handler that is
called to handle the VHOST_USER_SET_LOG_BASE message.
Its payload contains a 64 bit size and offset. Both are
added up and used as a size when calling mmap().
There is no integer overflow check. If an integer overflow
occurs a smaller memory map would be created than
requested. Since the returned mapping is mapped as writable
and used for logging, a memory corruption could occur.
CVE-2020-10722
Fixes: fbc4d248b198 ("vhost: fix offset while mmaping log base address")
Cc: stable@dpdk.org
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
and backend are assumed to be implemented in software, that is they can
run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers like rte_smp_r/wmb, other than
rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1)
and yields better performance.
For the above case, this patch helps yielding even better performance
by replacing the two-way barriers with C11 one-way barriers for avail
index in split ring.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
and backend are assumed to be implemented in software, that is they can
run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers like rte_smp_r/wmb, other than
rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1)
and yields better performance.
For the above case, this patch helps yielding even better performance
by replacing the two-way barriers with C11 one-way barriers for used
index in split ring.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If Tx zero copy enabled, gpa to hpa mapping table is updated one by
one. This will harm performance when guest memory backend using 2M
hugepages. Now utilize binary search to find the entry in mapping
table, meanwhile set the threshold to 256 entries for linear search.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>