GCC 12 raises the following warning:
../drivers/vdpa/ifc/ifcvf_vdpa.c: In function ‘vdpa_enable_vfio_intr’:
../drivers/vdpa/ifc/ifcvf_vdpa.c:383:62: error: writing 4 bytes into a
region of size 0 [-Werror=stringop-overflow=]
383 | fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
../drivers/vdpa/ifc/ifcvf_vdpa.c:348:14: note: at offset 32 into
destination object ‘irq_set_buf’ of size 32
348 | char irq_set_buf[MSIX_IRQ_SET_BUF_LEN];
| ^~~~~~~~~~~
Validate the number of vrings to avoid an out-of-bounds access.
Bugzilla ID: 855
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
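For illustration only, the intended guard is essentially a bounds check
before filling the VFIO irq set buffer; the limit name below is a
placeholder, not necessarily the driver's actual constant:

  /* Sketch: refuse to fill irq_set_buf beyond its capacity.
   * MAX_VRINGS_SKETCH stands in for the real per-driver queue limit. */
  if (nr_vring > MAX_VRINGS_SKETCH) {
      DRV_LOG(ERR, "too many vrings %d (max %d)",
              nr_vring, MAX_VRINGS_SKETCH);
      return -1;
  }
  for (i = 0; i < nr_vring; i++)
      fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;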
The register address layout differs between net and blk devices.
Since most of the code is re-used, add checks so that, where the
register addresses differ, net and blk devices go through their
respective code paths.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Create a thread to poll and relay config space change interrupts.
Use VHOST_USER_SLAVE_CONFIG_CHANGE_MSG to inform QEMU.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
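For illustration, a relay thread of roughly this shape can poll for a
config space change and forward it through the vhost-user slave channel;
the polling helper here is hypothetical, not the driver's actual code:

  #include <stdbool.h>
  #include <rte_vhost.h>

  /* Sketch of the relay loop: poll_config_irq() is a placeholder for
   * reading the device's config-change interrupt. */
  static void *
  config_change_relay(void *arg)
  {
      int vid = *(int *)arg;

      while (poll_config_irq(vid) == 0)
          /* Sends VHOST_USER_SLAVE_CONFIG_CHANGE_MSG to QEMU. */
          rte_vhost_slave_config_change(vid, false);
      return NULL;
  }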
Log the virtio blk device config space information at vDPA launch,
before QEMU connects.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Add SW live-migration support to the block device.
For a block device, it is critical that no request
is dropped. So when the virtio blk device is
paused, make sure the hardware last_avail_idx and
last_used_idx are the same. This indicates that all
requests have received acks and there is no inflight IO.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
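A rough sketch of the pause condition described above, with placeholder
accessors standing in for the hardware index reads:

  /* Wait until the device has acked every outstanding request: the
   * queue is only treated as paused once avail and used indexes match. */
  while (hw_last_avail_idx(hw, qid) != hw_last_used_idx(hw, qid))
      usleep(10); /* no inflight IO may remain before live migration */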
For the net device type, only the rxq interrupts need to be relayed.
For the block device type, all queues are used for both read and write
requests, so the interrupts of all queues need to be relayed.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
For the virtio blk device, re-use part of the ifc driver ops.
Implement ifcvf_blk_get_config for the virtio blk device.
Support the VHOST_USER_PROTOCOL_F_CONFIG feature for the virtio
blk device.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Re-use the vdpa/ifc code and distinguish blk and net devices by PCI device ID.
Blk and net devices are implemented with their proper features and ops.
Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
In order to speed up device suspend and resume, make the statistics
counters persistent across reconfigurations until the device gets removed.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch supports the device cleanup callback API, which is called when
the device is disconnected from the VM. Cached resources like the VM MR and
VQ memory are released.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Normally, resources do not change during device suspend and resume.
When large resources are allocated to the VM, such as a huge memory size
or many queues, the time spent on releasing and recreating them becomes
significant. To speed this up, this patch reuses resources like the VM MR
and VirtQ memory when they have not changed.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
To speed up device resume, create reusable resources during the device
probe stage and release them when the device is removed. The reused
resources include TIS, TD, VAR doorbell mmap, error handling event
channel and interrupt handler, UAR, Rx event channel, NULL MR, steering
domain and table.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When QEMU suspends a VM, the HW notifier is unmapped while the vCPU
thread may still be active and writing to the notifier through the kick
socket. The PMD kick handler thread then tries to install the HW notifier
through the client socket. In such a case, it times out and slows down
the device close.
This patch skips the HW notifier installation if the VQ or device is in
the middle of shutdown.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In Ctrl+C handling, the kick handling thread sometimes gets endless
EAGAIN errors and falls into a deadlock.
Kicks happen frequently in a real system due to busy traffic or retry
mechanisms. This patch simplifies the handling to always kick the
firmware and to skip setting the hardware notifier on a potential device
error; the notifier can be set in the next successful kick request.
Fixes: 62c813706e ("vdpa/mlx5: map doorbell")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Improve the devargs handling in two aspects:
- Parse the devargs string only once.
- Return an error and report unknown keys.
The common driver parses the devargs string once into a dictionary, then
provides it to all the drivers' probe functions. Each driver marks within
it which keys it has used, then the common driver receives the updated
dictionary and reports unknown devargs.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The buffer for the MCDI channel is allocated using
rte_memzone_reserve_aligned with the zone name 'mcdi'. Since multiple
MCDI channels are needed to support multiple VFs and
rte_memzone_reserve_aligned expects unique zone names, append the PCI
address to the zone name to make it unique.
Signed-off-by: Abhimanyu Saini <asaini@xilinx.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
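Roughly, the fix amounts to the following sketch; the name format, and
the length, socket and alignment arguments are illustrative, not taken
from the patch:

  char mz_name[RTE_MEMZONE_NAMESIZE];

  /* Append the PCI address to get a unique zone name per device,
   * e.g. "mcdi_0000:3b:00.1" instead of the fixed name "mcdi". */
  snprintf(mz_name, sizeof(mz_name), "mcdi_%s", pci_dev->device.name);
  mz = rte_memzone_reserve_aligned(mz_name, len, socket_id, 0, align);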
When sva is NULL, sfc_vdpa_info(sva, ...) causes a NULL
dereference. Use SFC_VDPA_GENERIC_LOG() to avoid that.
See the macros sfc_vdpa_info and SFC_VDPA_GENERIC_LOG
defined in drivers/vdpa/sfc/sfc_vdpa_log.h for details.
Fixes: 5e7596ba7c ("vdpa/sfc: introduce Xilinx vDPA driver")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: b11961363b ("vdpa/sfc: support device configure and close")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Functions like free, rte_free, and rte_mempool_free
already handle NULL pointers, so the checks here are not necessary.
Remove redundant NULL pointer checks before free functions,
found by nullfree.cocci.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When the event thread polls traffic and a virtq is stopping, the FW loses
synchronization of the virtq indexes.
This causes a live-migration failure in the synchronization between the
HOST indexes and the GUEST indexes.
Unset the event thread before the queue stop in the LM process.
Fixes: 31b9c29c86 ("vdpa/mlx5: support close and config operations")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Some drivers currently have their own checks and give inconsistent
reasons when an internal dependency is unavailable.
drivers/meson.build also checks for internal dependencies via 'deps'.
Let's rely on it for consistency and smaller code.
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Long Li <longli@microsoft.com>
The return value of "mlx5_os_wrapped_mkey_create" is checked in the
caller. A zero means success without any error.
Fix the typo in the if-condition to avoid a misjudgment.
Fixes: 398ea8450c ("vdpa/mlx5: workaround dirty bitmap MR creation")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Due to a kernel issue in direct MKEY creation using the DevX API, this
patch switches the virtio MR creation to the Verbs API.
Fixes: cc07a42da2 ("vdpa/mlx5: prepare memory regions")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Due to kernel driver/FW issues in direct MKEY creation using the DevX
API, this patch switches the dirty bitmap MR creation to use a wrapped
MKEY instead.
Fixes: 9d39e57f21 ("vdpa/mlx5: support live migration")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
The number of WQEBBs was provided to the QP creation, and the QP size was
calculated by multiplying the number of WQEBBs by 64, which is the send
WQE size. When creating an RQ in the QP (i.e., in the vdpa driver), the
queue size was bigger than needed because the receive WQE size is 16.
Provide the queue size to the QP creation instead of the number of WQEBBs.
Fixes: f9213ab12c ("common/mlx5: share DevX queue pair operations")
Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The DevX interface for QP creation expects the number of WQEBBs.
Wrongly, the number of descriptors was provided to the QP creation.
In addition, the QP size must be a power of 2, which was not guaranteed.
Provide the number of WQEBBs to the QP creation API.
Round up the SQ size to a power of 2.
Rename (sq/rq)_size to num_of_(send/receive)_wqes.
Fixes: 6152534e21 ("crypto/mlx5: support queue pairs operations")
Cc: stable@dpdk.org
Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
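For instance, the rounding could be done with the helper from
rte_common.h; the variable names below only echo the renaming above and
are not necessarily the ones used in the patch:

  #include <rte_common.h>

  /* The QP size must be a power of 2: round the requested descriptor
   * count up, e.g. 1000 -> 1024 send WQEs. */
  uint32_t num_of_send_wqes = rte_align32pow2(nb_desc);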
The rdma-core library can map the doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":
- as regular cached memory, when the variable is either missing or set
  to zero. This type of mapping may cause significant doorbell register
  write latency and requires an explicit memory write barrier to
  mitigate this issue and prevent write combining.
- as non-cached memory, when the variable is present and set to a
  non-zero value. This type of mapping may cause a performance impact
  under heavy load conditions, but the explicit write memory barrier is
  not required and omitting it may improve core performance.
The UAR creation function maps the doorbell in one of the above ways
according to the system. At run time, it always adds an explicit memory
barrier after writing to the doorbell.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.
Commit [1] solved this problem for the Tx queue. At run time, it checks
the mapping type and issues the memory barrier after writing to the Tx
doorbell register only when needed. The mapping type is extracted
directly from the uar_mmap_offset field in the queue properties.
This patch shares this code between the drivers and extends the above
solution to each of them.
[1] commit 8409a28573
("net/mlx5: control transmit doorbell register mapping")
Fixes: f8c97babc9 ("compress/mlx5: add data-path functions")
Fixes: 8e196c08ab ("crypto/mlx5: support enqueue/dequeue operations")
Fixes: 4d4e245ad6 ("regex/mlx5: support enqueue")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
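Conceptually, the shared run-time check amounts to the sketch below;
the db_nc flag and the helper name are illustrative, not the shared
code's exact API:

  /* Ring a doorbell: the trailing barrier is only needed when the UAR
   * was mapped as regular (write-combining capable) memory. */
  static inline void
  doorbell_ring(volatile uint64_t *db_addr, uint64_t val, int db_nc)
  {
      rte_wmb();      /* make the queue descriptors visible first */
      *db_addr = val; /* write the doorbell register */
      if (!db_nc)
          rte_wmb();  /* flush write combining on a cached mapping */
  }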
The UAR mapping type can be affected by the tx_db_nc devarg, which may
cause the environment variable MLX5_SHUT_UP_BF to be set.
So, both the MLX5_SHUT_UP_BF value and the UAR mapping parameter affect
the UAR cache mode.
Wrongly, the devarg was taken into account for MLX5_SHUT_UP_BF but not
for the UAR mapping parameter in all the drivers except the net one.
Take the tx_db_nc devarg into account for all the drivers.
Fixes: ca1418ce39 ("common/mlx5: share device context object")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add support for unicast and broadcast MAC filter configuration.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Implement the vDPA ops get_notify_area to get the notify
area info of the queue.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
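For reference, a get_notify_area callback generally has this shape; the
device lookup and fields below are placeholders, not the driver's actual
implementation:

  /* Sketch: report the doorbell area of queue 'qid' so the vhost
   * library can mmap it for the guest driver. */
  static int
  vdpa_get_notify_area(int vid, int qid, uint64_t *offset, uint64_t *size)
  {
      struct my_vdpa_dev *dev = lookup_dev_by_vid(vid); /* illustrative */

      if (dev == NULL)
          return -1;
      *offset = dev->notify_offset[qid]; /* offset within the PCI BAR */
      *size = dev->notify_size;          /* typically one host page */
      return 0;
  }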
Implement vDPA ops get_queue_num to get the maximum number
of queues supported by the device.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Add new vDPA PMD to support vDPA operations of Xilinx devices.
This patch implements probe and remove functions.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch marks the vDPA driver APIs as internal and
renames the corresponding header file to vdpa_driver.h.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Remove direct access to the interrupt handle structure fields and use
the respective get/set APIs instead.
Update all the drivers to access the interrupt handle fields this way.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
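In practice, the change replaces direct field accesses with the accessor
APIs, roughly:

  /* Before: poking into the interrupt handle structure directly. */
  fd = intr_handle->fd;

  /* After: the handle is opaque and accessed only via get/set APIs. */
  fd = rte_intr_fd_get(intr_handle);
  rte_intr_fd_set(intr_handle, new_fd);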
VAR is the device memory space for the virtio queue doorbells;
QEMU can mmap it directly to speed up doorbell pushes.
On a busy system, QEMU takes time to release the VAR resources during
driver shutdown. If vdpa is restarted quickly, the VAR allocation fails
with error 28 (ENOSPC) since the VAR is a singleton resource per device.
This patch adds a retry mechanism for the VAR allocation.
Fixes: 4cae722c1b ("vdpa/mlx5: move virtual doorbell alloc to probe")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
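The retry itself is conceptually simple; a sketch, with the allocation
call standing in for the actual device API and the retry count and delay
chosen arbitrarily:

  /* Retry VAR allocation: the previous owner (QEMU) may still be
   * releasing this singleton per-device resource. */
  for (retry = 0; retry < 3; retry++) {
      var = alloc_var(dev); /* placeholder for the device allocation call */
      if (var != NULL)
          break;
      usleep(100 * 1000);   /* give QEMU time to release the VAR */
  }
  if (var == NULL)
      return -1;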
After a vDPA application restart, QEMU restores the VQ with its used and
available indexes, and new incoming packets trigger the virtio driver to
handle buffers. Under heavy traffic, there may be no buffer available
for the firmware to receive new packets, so no Rx interrupt is generated
and the driver is stuck endlessly waiting for an interrupt.
As a firmware workaround, this patch sends a notification after VQ setup
to ask the driver to handle the used buffers and fill in new ones.
Fixes: bff7350110 ("vdpa/mlx5: prepare virtio queues")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add the HCA attributes structure as a field of the device config structure.
It is queried during common probing, and the timestamp format fields are
updated accordingly.
Each driver uses the HCA attributes from the common device config
structure instead of querying them itself.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Create a shared Protection Domain in the common area and add it and its
PDN as fields of the common device structure.
Use this Protection Domain in all the drivers and remove the PD and PDN
fields from their private structures.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add an option to get the IB device after disabling RoCE. It is relevant
when the vDPA class is present in the device arguments list.
Use the common device context in the vDPA driver and remove the ctx
field from its private structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>