This patch adds statistics for IOTLB hits and misses.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch adds a new virtqueue statistic for guest
notifications. It is useful to deduce from hypervisor side
whether the corresponding guest Virtio device is using
Kernel Virtio-net driver or DPDK Virtio PMD.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch introduces new APIs for the application
to query and reset per-virtqueue statistics. The
patch also introduces generic counters.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
When choosing IOVA as PA mode, IOVA is likely to be discontinuous,
which requires page by page mapping for DMA devices. To be consistent,
this patch implements page by page mapping instead of mapping at the
region granularity for both IOVA as VA and PA mode.
Fixes: 7c61fa08b7 ("vhost: enable IOMMU for async vhost")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch renames the host_phys_addr to host_iova in guest_page
struct. The host_phys_addr is iova, it depends on the DPDK
IOVA mode.
Fixes: e246896178 ("vhost: get guest/host physical address mappings")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Since dmadev is introduced in 21.11, to avoid the overhead of vhost DMA
abstraction layer and simplify application logics, this patch integrates
dmadev in asynchronous data path.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Async copy fails when looking up hpa in the gpa to hpa mapping table.
This happens because the gpa is matched exactly in the merged
mapping table, and the merge loses the mapping entries.
A new range comparison method is introduced to solve this issue.
Fixes: 6563cf9238 ("vhost: fix async copy on multi-page buffers")
Cc: stable@dpdk.org
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
As previously announced, this patch renames struct
vhost_device_ops to struct rte_vhost_device_ops.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch marks the vDPA driver APIs as internal and
rename the corresponding header file to vdpa_driver.h.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
This patch increases the number of IO vectors for the
asynchronous data path from 512 to 2048. It has been
reported during testing the starvation of IO vectors
during iperf benchmark with 64KB packet size.
As there are no direct relationship between
VHOST_MAX_ASYNC_VEC and BUF_VECTOR_MAX, this patch also
assign VHOST_MAX_ASYNC_VEC value directly instead of being
a multiple of BUF_VECTOR_MAX.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
vhost_poll_enqueue_completed() assumes some inflight
packets could have been completed in a previous call but
not returned to the application. But this is not the case,
since check_completed_copies callback is never called with
more than the current count as argument.
In other words, async->last_pkts_n is always 0. Removing it
greatly simplifies the function.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
IO vectors and their iterators arrays were part of the
async metadata but not their indexes.
In order to makes this more consistent, the patch adds the
indexes to the async metadata. Doing that, we can avoid
triggering DMA transfer within the loop as it IO vector
index overflow is now prevented in the async_mbuf_to_desc()
function.
Note that previous detection mechanism was broken
since the overflow already happened when detected, so OOB
memory access would already have happened.
With this changes done, virtio_dev_rx_async_submit_split()
and virtio_dev_rx_async_submit_packed() can be further
simplified.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
This patch introduces rte_vhost_iovec struct that contains
both source and destination addresses since we always have
a 1:1 mapping between source and destination. While using
the standard iovec struct might have seemed better, having
to duplicate IO vectors and its iterators is memory
inefficient and make the implementation more complex.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
This patch splits the iterator arrays in two, one for
source and one for destination. The goal is make the code
easier to understand.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
IO vectors implementation is unnecessarily complex, mixing
source and destinations vectors in the same array.
This patch declares two arrays, one for the source and one
for the destination. It also gets rid of seg_awaits variable
in both packed and split implementation, which is the same
as iovec_idx.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
This patch moves async_inflight_info struct to internal
header since it should not be part of the API.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
This patch moves async-related metadata from vhost_virtqueue
to a dedicated struct. It makes it clear which fields are
async related, and also saves some memory when async feature
is not in use.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Async DMA map status flag was added to prevent the unnecessary unmap
when DMA devices bound to kernel driver. This brings maintenance cost
for a lot of code. This patch removes the DMA map status by using
rte_errno instead.
This patch relies on the following patch to fix a partial
unmap check in vfio unmapping API.
[1] https://www.mail-archive.com/dev@dpdk.org/msg226464.html
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The use of IOMMU has many advantages, such as isolation and address
translation. This patch extends the capability of DMA engine to use
IOMMU if the DMA engine is bound to vfio.
When set memory table, the guest memory will be mapped
into the default container of DPDK.
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yvonne Yang <yvonnex.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Copy threshold has been introduced in async vhost data
path to select the appropriate copy engine to do copies
for higher efficiency.
However, it may cause packets ordering issues and also
introduces performance unpredictability.
Therefore, this patch removes copy threshold support in
async vhost data path.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch reworks the async configuration structure to improve code
readability. In addition, add preserved padding fields on the structure
for future usage.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch saves the NUMA node the virtqueue is allocated
on at init time, in order to allocate all other data on the
same node.
While most of the data are allocated before numa_realloc()
is called and so the data will be reallocated properly, some
data like the log cache are most likely allocated after.
For the virtio device metadata, we decide to allocate them
on the same node as the VQ 0.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
The vhost library currently configures Tx offloading (PKT_TX_*) on any
packet received from a guest virtio device which asks for some offloading.
This is problematic, as Tx offloading is something that the application
must ask for: the application needs to configure devices
to support every used offloads (ip, tcp checksumming, tso..), and the
various l2/l3/l4 lengths must be set following any processing that
happened in the application itself.
On the other hand, the received packets are not marked wrt current
packet l3/l4 checksumming info.
Copy virtio rx processing to fix those offload flags with some
differences:
- accept VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP,
- ignore anything but the VIRTIO_NET_HDR_F_NEEDS_CSUM flag (to comply with
the virtio spec),
Some applications might rely on the current behavior, so it is left
untouched by default.
A new RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS flag is added to enable the
new behavior.
The vhost example has been updated for the new behavior: TSO is applied to
any packet marked LRO.
Fixes: 859b480d5a ("vhost: add guest offload setting")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
For now async vhost data path only supports split ring. This patch
enables packed ring in async vhost data path to make async vhost
compatible with virtio 1.1 spec.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
There is no reason for the DPDK libraries to all have 'librte_' prefix on
the directory names. This prefix makes the directory names longer and also
makes it awkward to add features referring to individual libraries in the
build - should the lib names be specified with or without the prefix.
Therefore, we can just remove the library prefix and use the library's
unique name as the directory name, i.e. 'eal' rather than 'librte_eal'
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>