33048 Commits

Author SHA1 Message Date
David Marchand
ea2810fc21 vdpa/mlx5: fix leak on event thread creation
As stated in the manual, pthread_attr_init return value should be
checked.
Besides, a pthread_attr_t should be destroyed once unused.

In practice, we may have no leak (from what I read in glibc current code),
but this may change in the future.
Stick to a correct use of the API.

Fixes: 5cf3fd3af4df ("vdpa/mlx5: add CPU core parameter to bind polling thread")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-07-08 11:15:32 +02:00
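The API usage described in the vdpa/mlx5 commit above can be sketched in plain POSIX C. This is a minimal illustration, not the driver code: `start_detached_thread` and `noop_thread` are hypothetical names, and the detached state is only an example attribute.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: start fn as a detached thread. */
static int
start_detached_thread(pthread_t *tid, void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	int ret;

	/* pthread_attr_init() can fail, so check its return value. */
	ret = pthread_attr_init(&attr);
	if (ret != 0)
		return ret; /* attr was never initialized: nothing to destroy */

	ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
	if (ret == 0)
		ret = pthread_create(tid, &attr, fn, arg);

	/* Destroy the attribute object on every path once it is unused. */
	pthread_attr_destroy(&attr);
	return ret;
}

/* Trivial thread body used only for demonstration. */
static void *
noop_thread(void *arg)
{
	return arg;
}
```

The helper returns 0 on success or the error number reported by the failing pthread call.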
Bruce Richardson
ac847b437c doc: add reference to virtio-user from KNI guide
To help encourage use of virtio-user in place of KNI, put a reference to
the relevant howto section at the top of the KNI doc.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-08 11:15:32 +02:00
Bruce Richardson
99c58d238d doc: add code example for virtio-user exception path
The HOWTO guide for using virtio-user as an exception path to the kernel
only provided an example of how testpmd may be used for that purpose.
However, a real application wanting to use virtio-user as exception path
would likely want to create such devices from code within the app
itself. Therefore, we update the doc with instructions and a code
snippet showing how this may be done.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-08 11:15:11 +02:00
Bruce Richardson
decb35d890 doc: rework section on virtio-user as exception path
This patch extensively reworks the howto guide on using virtio-user for
exception packets. Changes include:

* rename "exceptional path" to "exception path"
* remove references to uio and just reference vfio-pci
* simplify testpmd command-lines, giving a basic usage example first
  before adding on detail about checksum or TSO parameters
* give a complete working example showing traffic flowing through the
  whole system from a testpmd loopback using the created TAP netdev
* replace use of "ifconfig" with Linux standard "ip" command
* other general rewording.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-08 10:48:21 +02:00
Raja Zidane
14d460b888 examples/link_status_interrupt: fix stats refresh rate
TIMER_MILLISECOND is defined as the number of CPU cycles per millisecond.
The current definition is correct only for cores with frequency of 2GHz.

Use DPDK API to get CPU frequency, and to define timer period.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Signed-off-by: Omar Awaysa <omara@nvidia.com>
2022-07-08 16:44:04 +02:00
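A small sketch of the idea in the link_status_interrupt commit above: derive the timer period from the timer frequency instead of hard-coding a 2 GHz value. In a DPDK application the frequency would presumably come from `rte_get_timer_hz()`; here it is a plain parameter, and `ms_to_cycles` is a hypothetical helper name.

```c
#include <stdint.h>

#define MS_PER_S 1000ULL

/* Number of timer cycles in ms milliseconds for a timer running at timer_hz. */
static uint64_t
ms_to_cycles(uint64_t timer_hz, uint64_t ms)
{
	return timer_hz / MS_PER_S * ms;
}
```

With a 2 GHz timer, one millisecond is 2,000,000 cycles, which matches the old hard-coded value; any other frequency gives a different result.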
Zhipeng Lu
2f655c9710 config/arm: add Phytium FT-2000+
Add configs for the Phytium server.

Signed-off-by: Zhipeng Lu <luzhipeng@cestc.cn>
Reviewed-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
2022-07-08 14:43:26 +02:00
Thomas Monjalon
204638154e version: 22.07-rc3
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2022-07-05 22:28:46 +02:00
Raslan Darawsheh
a442ca2d23 app/regex: fix mbuf size for multi-segment buffer
When allocating multi segmented buffers, and in case there is
a remainder in total buf len, the actual job len might be more
than expected job_len.

This adds additional space in the mbuf in the multi seg case,
to allow the remaining memory to be stored in one segment.

Fixes: c1d1b94eec58 ("app/regex: fix number of matches")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2022-07-05 22:19:03 +02:00
Thierry Herbelot
1afdf9edc8 app/regex: avoid division by zero
Check that nb_jobs is not zero before using it for a division.

Fixes: f5cffb7eb7fb6 ("app/regex: read data file once at startup")
Cc: stable@dpdk.org

Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
2022-07-05 22:03:39 +02:00
Junfeng Guo
834d99f388 raw/ntb: add PPD status check for Sapphire Rapids
Add PPD (PCIe Port Definition) status check for SPR (Sapphire Rapids).

Note that NTB on SPR has the same device ID as NTB on ICX, while
the field offsets of the PPD Control Register are different. Here, we use
the PCI device revision ID to distinguish the HW platform (ICX/SPR)
and check the Port Config Status and Port Definition accordingly.

+---------------------------+--------------------+--------------------+
|          Fields           | Bit Range (on ICX) | Bit Range (on SPR) |
+---------------------------+--------------------+--------------------+
| Port Configuration Status | 12                 | 14                 |
| Port Definition           | 9:8                | 10:8               |
+---------------------------+--------------------+--------------------+

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
2022-07-05 21:55:24 +02:00
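The bit ranges in the table of the raw/ntb commit above can be turned into a short field-extraction sketch. The platform enum and function names are hypothetical; the commit selects the platform from the PCI revision ID, whose actual values are not given here, so the platform is passed in directly.

```c
#include <stdint.h>

/* Hypothetical platform selector (the driver derives it from the PCI revision ID). */
enum ntb_platform { NTB_PLAT_ICX, NTB_PLAT_SPR };

/* Port Configuration Status: bit 12 on ICX, bit 14 on SPR. */
static unsigned int
ppd_port_cfg_status(uint32_t reg, enum ntb_platform plat)
{
	unsigned int bit = (plat == NTB_PLAT_SPR) ? 14 : 12;

	return (reg >> bit) & 0x1;
}

/* Port Definition: bits 9:8 on ICX, bits 10:8 on SPR. */
static unsigned int
ppd_port_definition(uint32_t reg, enum ntb_platform plat)
{
	uint32_t mask = (plat == NTB_PLAT_SPR) ? 0x7 : 0x3;

	return (reg >> 8) & mask;
}
```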
Kevin Laatz
b29427649b dma/idxd: fix null dereference in PCI remove
The 'info' struct was being declared as a NULL pointer. If a NULL
pointer is passed to 'rte_dma_info_get', EINVAL is returned and the
struct is not populated. This subsequently causes a segfault when
dereferencing 'info'.

This patch fixes the issue by simply declaring 'info' on the stack and
passing its address to 'rte_dma_info_get'.

Fixes: 9449330a8458 ("dma/idxd: create dmadev instances on PCI probe")
Cc: stable@dpdk.org

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-07-05 21:37:25 +02:00
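The pattern behind the dma/idxd null-dereference fix above, sketched without DPDK: the output struct lives on the stack and its address is passed to the getter. `struct dev_info`, `dev_info_get` and `count_vchans` are hypothetical stand-ins, with the getter mimicking the described behavior of returning EINVAL for a NULL pointer.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device info structure. */
struct dev_info {
	uint16_t nb_vchans;
};

/* Mock getter: rejects a NULL output pointer and leaves nothing populated. */
static int
dev_info_get(struct dev_info *info)
{
	if (info == NULL)
		return -EINVAL;
	info->nb_vchans = 1;
	return 0;
}

static int
count_vchans(void)
{
	struct dev_info info; /* storage on the stack, not a NULL pointer */

	if (dev_info_get(&info) != 0)
		return -1;
	return info.nb_vchans; /* safe: info was populated */
}
```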
Kevin Laatz
1a57c8d553 dma/idxd: fix partial freeing in PCI close
During PCI device close, any allocated memory needs to be free'd.
Currently, one of the free's is being called on an incorrect idxd_dmadev
struct member, namely 'batch_idx_ring'.

At device creation, memory is allocated for both 'batch_comp_ring' and
'batch_idx_ring' simultaneously. Calling free only on 'batch_idx_ring'
meant the first half of this memory was not being free'd, leading to the
memleak.

This patch fixes this memleak by calling free on 'batch_comp_ring' which
will free the memory for both rings.

Fixes: 9449330a8458 ("dma/idxd: create dmadev instances on PCI probe")
Cc: stable@dpdk.org

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-07-05 21:34:38 +02:00
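A simplified sketch of the allocation layout described in the dma/idxd commit above, using the C library allocator rather than DPDK memory. The struct and function names are hypothetical. Two rings share one allocation, so only the base pointer may be passed to the free routine.

```c
#include <stdint.h>
#include <stdlib.h>

struct rings {
	uint64_t *comp_ring; /* start of the single allocation */
	uint16_t *idx_ring;  /* points into the same allocation */
};

/* Allocate both rings in one block: n completion entries, then n indices. */
static int
rings_alloc(struct rings *r, size_t n)
{
	r->comp_ring = calloc(n, sizeof(uint64_t) + sizeof(uint16_t));
	if (r->comp_ring == NULL)
		return -1;
	r->idx_ring = (uint16_t *)(r->comp_ring + n);
	return 0;
}

/* Free through the base pointer, which releases the memory of both rings. */
static void
rings_free(struct rings *r)
{
	free(r->comp_ring);
	r->comp_ring = NULL;
	r->idx_ring = NULL;
}
```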
Kevin Laatz
8a6eb404c4 dma/idxd: fix memory leak in PCI close
ASAN reports a memory leak for the 'pci' pointer in the 'idxd_dmadev'
struct.

This is fixed by free'ing the struct when the last queue on the PCI
device is being closed.

Fixes: 9449330a8458 ("dma/idxd: create dmadev instances on PCI probe")
Cc: stable@dpdk.org

Reported-by: Xingguang He <xingguang.he@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2022-07-05 21:19:00 +02:00
Kai Ji
81e3122fe1 crypto/openssl: fix 3.0 EVP_PKEY usage in RSA operations
EVP_PKEY functions need to be called twice for RSA sign
and verify operations in the 3.0 EVP API. The original OpenSSL
1.x routines are untouched. OPENSSL_API_COMPAT is
also removed, as the driver now supports the OpenSSL 3.0 library
as well when it is detected on the host.

Fixes: d7bd42f6db19 ("crypto/openssl: update RSA routine with 3.0 EVP API")

Signed-off-by: Kai Ji <kai.ji@intel.com>
2022-07-05 18:30:24 +02:00
Rebecca Troy
c6de7bb69b crypto/qat: fix secure session check
Currently when running the dpdk-perf-test with DOCSIS
security sessions, a segmentation fault occurs. This
is due to the check being made that the session is not
equal to op->sym->sec_session. This check passes the
first time but on the second iteration fails and doesn't
create the build_request.

This commit fixes that error by getting the ctx first
from the private session data and then comparing ctx,
rather than op->sym->sec_session, with the sess.

Fixes: fb3b9f492205 ("crypto/qat: rework burst data path")
Cc: stable@dpdk.org

Signed-off-by: Rebecca Troy <rebecca.troy@intel.com>
Signed-off-by: Kai Ji <kai.ji@intel.com>
2022-07-05 17:44:10 +02:00
Spike Du
95ff465009 vdpa/mlx5: use common interrupt management
Replace vDPA interrupt handle creation logic
with mlx5-common interrupt management function.

Signed-off-by: Spike Du <spiked@nvidia.com>
2022-07-05 20:15:28 +02:00
Raja Zidane
5ddb903824 net/mlx5: reject negative integrity item configuration
Negative integrity item refers to condition when the item value mask
is set, but value spec is cleared:
    ... integrity value mask l4_ok value spec 0 ...

The ethdev library defines integrity bits `l3_ok` and `l4_ok` as accumulators
for all hardware L3 and L4 integrity verifications respectively.
Hardware `l3_ok` and `l4_ok` integrity bits refer to L3 and L4
network headers only.
Integrity bits `l3_ok` and `l4_ok` are not compatible between
ethdev library and hardware.

PMD translations for ethdev `l3_ok` are:
 IPv4: `l3_ok` and `l3_csum_ok`
 IPv6: `l3_ok`
ethdev `l4_ok` is translated into PMD `l4_ok` and `l4_csum_ok` bits.

Positive IPv4 `l3_ok` flow item configuration is translated into
a single matcher that ANDs the corresponding hardware bits.
Negative IPv4 `l3_ok` is translated into 2 hardware conditions where
each condition probes a single integrity bit:
  ethdev::l3_ok is 0 => MLX5::l3_ok is 0 OR MLX5::l3_csum_ok is 0
MLX5 hardware does not support an OR condition in a flow rule item,
so negative IPv4 `l3_ok` must be translated into 2 flow rules.
Similarly, a negative ethdev `l4_ok` condition is also translated into 2
hardware rules.

Current PMD roadmap does not allow implicit flow rule split.

Bugzilla ID: 948
Cc: stable@dpdk.org

Suggested-by: Raja Zidane <rzidane@nvidia.com>
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2022-07-05 20:04:02 +02:00
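The "negative integrity item" condition from the net/mlx5 commit above (mask bit set, spec bit cleared) can be detected with one bitwise expression. This is a sketch only: the bit positions, the function name and the -ENOTSUP return code are assumptions, not the PMD's actual definitions.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical bit positions for the two integrity flags. */
#define INTEGRITY_L3_OK (1u << 0)
#define INTEGRITY_L4_OK (1u << 1)

/*
 * Return 0 if the item is acceptable, or -ENOTSUP if any l3_ok/l4_ok bit
 * is selected by the mask while its spec value is 0 (a negative match).
 */
static int
integrity_item_validate(uint32_t spec, uint32_t mask)
{
	uint32_t negative = mask & ~spec & (INTEGRITY_L3_OK | INTEGRITY_L4_OK);

	return negative ? -ENOTSUP : 0;
}
```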
Dmitry Kozlyuk
e96d3d02d6 common/mlx5: fix non-expandable global MR cache
The number of memory regions (MR) that the MLX5 PMD can use
was limited to 512 per IB device, the size of the global MR cache,
which was fixed at compile time.
The cache allows searching for an MR LKey by address efficiently,
therefore it is the last place searched on the data path
(the global MR database is skipped because it would be slow).
If the application logic caused the PMD to create more than 512 MRs,
which can be the case with external memory,
those MRs would never be found on data path
and later cause a HW failure.

The cache size was fixed because at the time of overflow
the EAL memory hotplug lock may be held,
which prevents allocating a larger cache
(it must reside in DPDK memory for multi-process support).
This patch adds logic to release the necessary locks,
extend the cache, and repeat the attempt to insert new entries.

`mlx5_mr_btree` structure had `overflow` field
that was set when a cache (not only the global one)
could not accept new entries.
However, it was only checked for the global cache,
because caches of upper layers were dynamically expandable.
With the global cache size limitation removed, this field is not needed.
Cache size was previously limited by 16-bit indices.
Use the space in the structure previously occupied by the `overflow` field
to extend indices to 32 bits.
With this patch, it is the HW and RAM that limit the number of MRs.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-07-05 20:03:44 +02:00
Michael Baum
740a28366c net/mlx5: add test for external Rx queue
Add mlx5 internal test for map and unmap external RxQs.
This patch adds to testpmd app a runtime function to test the mapping
API.

  testpmd> mlx5 port (port_id) ext_rxq map (sw_queue_id) (hw_queue_id)
  testpmd> mlx5 port (port_id) ext_rxq unmap (sw_queue_id)

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-07-05 20:02:57 +02:00
Michael Baum
85d9252e55 net/mlx5: add test for remote PD and CTX
Add an mlx5 internal command in testpmd, similar to the run-time function
"port attach", which takes an additional parameter named "socket" and adds
2 devargs before attaching the port.

The arguments are "cmd_fd" and "pd_handle", used to import a device
created outside the PMD. The testpmd application imports it using IPC and
updates the devargs list before attaching.

These arguments were added in
the commit 9d936f4f1a5e ("common/mlx5: support remote PD and CTX")

The syntax is:

  testpmd> mlx5 port attach (identifier) socket=(path)

Where "path" is the IPC socket path agreed with the remote process.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Reviewed-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
2022-07-05 20:01:33 +02:00
Tomasz Duszynski
2ddf4b110c common/cnxk: allow changing PTP mode on CN10K
Since firmware has added support for toggling PTP mode on 10k platforms,
userspace code should allow doing that as well.

Cc: stable@dpdk.org

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2022-07-05 18:51:59 +02:00
Kumara Parameshwaran
bdf2f895a6 gro: fix identifying fragmented packets
A packet with RTE_PTYPE_L4_FRAG(0x300) contains both RTE_PTYPE_L4_TCP
(0x100) & RTE_PTYPE_L4_UDP (0x200). A fragmented packet as defined in
rte_mbuf_ptype.h cannot be recognized as other L4 types and hence the
GRO layer should not use IS_IPV4_TCP_PKT or IS_IPV4_UDP_PKT for
RTE_PTYPE_L4_FRAG. Hence, if the packet type is RTE_PTYPE_L4_FRAG the
IP header should be parsed to recognize the appropriate IP type and
invoke the respective gro handler.

Fixes: 1ca5e6740852 ("gro: support UDP/IPv4")
Cc: stable@dpdk.org

Signed-off-by: Kumara Parameshwaran <kumaraparamesh92@gmail.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
2022-07-05 18:30:38 +02:00
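The pitfall described in the gro commit above can be reproduced with the values quoted in the message. The macros below are local stand-ins for the `RTE_PTYPE_L4_*` constants, and the 0xf00 L4 mask is an assumption. The sketch shows that a per-bit test misclassifies a fragment as TCP, while comparing the whole L4 field does not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins using the values quoted in the commit message. */
#define PTYPE_L4_TCP  0x00000100u
#define PTYPE_L4_UDP  0x00000200u
#define PTYPE_L4_FRAG 0x00000300u
#define PTYPE_L4_MASK 0x00000f00u /* assumed width of the L4 field */

/* Bit test: also true for FRAG, because 0x300 contains the 0x100 bit. */
static bool
is_tcp_bit_test(uint32_t ptype)
{
	return (ptype & PTYPE_L4_TCP) == PTYPE_L4_TCP;
}

/* Field comparison: true only when the L4 field is exactly TCP. */
static bool
is_tcp_exact(uint32_t ptype)
{
	return (ptype & PTYPE_L4_MASK) == PTYPE_L4_TCP;
}

/* A fragment needs its IP header parsed to find the real L4 protocol. */
static bool
is_l4_frag(uint32_t ptype)
{
	return (ptype & PTYPE_L4_MASK) == PTYPE_L4_FRAG;
}
```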
Harry van Haaren
6550113be6 service: fix lingering active status
This commit fixes an issue where calling rte_service_lcore_stop()
would result in a service's "active on lcore" status becoming stale.

The stale status would result in rte_service_may_be_active() always
returning "1", indicating that the service may still be active.

This is fixed by ensuring the "active on lcore" status of each service
is set to 0 when an lcore is stopped.

Fixes: e30dd31847d2 ("service: add mechanism for quiescing")
Fixes: 8929de043eb4 ("service: retrieve lcore active state")

Reported-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2022-07-05 16:24:43 +02:00
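A toy model of the service fix above, with hypothetical names and table sizes rather than the library's internal state: each service keeps an "active on lcore" flag per lcore, and stopping an lcore clears that lcore's flag for every service so that the "may be active" query cannot observe stale state.

```c
#include <stdint.h>

/* Hypothetical sizes for the demonstration table. */
#define MAX_SERVICES 4
#define MAX_LCORES 8

static uint8_t active_on_lcore[MAX_SERVICES][MAX_LCORES];

/* On lcore stop, clear the per-service active flag for that lcore. */
static void
lcore_stop(unsigned int lcore)
{
	unsigned int s;

	for (s = 0; s < MAX_SERVICES; s++)
		active_on_lcore[s][lcore] = 0;
}

/* Return 1 if the service may still be running on any lcore, else 0. */
static int
service_may_be_active(unsigned int service)
{
	unsigned int l;

	for (l = 0; l < MAX_LCORES; l++)
		if (active_on_lcore[service][l])
			return 1;
	return 0;
}
```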
Zhichao Zeng
e097bf80e4 net/igc: support multi-process
The Rx function was not specified in the secondary process, causing the
secondary process to segfault in a multi-process environment.

This patch specifies RX/TX functions in "dev_init" to support secondary
processes.

Fixes: 66fde1b943eb ("net/igc: add skeleton")
Cc: stable@dpdk.org

Signed-off-by: Zhichao Zeng <zhichaox.zeng@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2022-07-05 03:55:32 +02:00
Kevin Liu
5bd74df1db net/i40e: fix QinQ enablement
Enable double VLAN by default after firmware v8.3,
and do not allow disabling double VLAN in subsequent
operations.

Fixes: 38e9762be16a ("net/i40e: add outer VLAN processing")

Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2022-07-05 03:55:32 +02:00
Yiding Zhou
196f35f81c net/iavf: fix VF reset
When the VF is in closed state, the vf_reset flag can not be reverted
if the VF is reset asynchronously. This prevents all virtchnl commands
from executing, causing subsequent calls to iavf_dev_reset() to fail.

So the vf_reset flag needs to be reverted even when VF is in closed state.

Fixes: 676d986b4b86 ("net/iavf: fix crash after VF reset failure")
Cc: stable@dpdk.org

Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2022-07-05 03:55:32 +02:00
Yuying Zhang
b84d7eb886 net/ice: fix memory allocation insufficiency
The current code doesn't allocate memory for the lookup element needed
to add the packet flag. This patch adds one lookup item to the list to
fix this memory issue.

Fixes: 8b95092b7f69 ("net/ice/base: fix direction of flow that matches any")
Cc: stable@dpdk.org

Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2022-07-05 03:55:32 +02:00
Rakesh Kudurumalla
f120772811 net/cnxk: fix extended statistics
This fix replaces the use of roc_nix_num_xstats_get(), which is a
compile-time RoC API, with the runtime RoC API roc_nix_xstats_names_get(),
resolving the xstat count difference between cn9k and cn10k when
displaying xstats for ethdev ports.

Fixes: 825bd1d9d8e6 ("common/cnxk: update extra stats for inline device")
Cc: stable@dpdk.org

Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
2022-07-04 14:47:03 +02:00
Nithin Dabilpuram
e47565a78e doc: add environment variables in cnxk guide
Add list of environment variables used by cnxk drivers.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2022-07-04 14:46:55 +02:00
Satheesh Paul
3b1a48f1ed common/cnxk: fix GRE tunnel parsing
After parsing GRE tunnel, parse subsequent protocols
(for example, TCP or UDP) as tunneled versions.

Fixes: c34ea71b878 ("common/cnxk: add NPC parsing API")
Cc: stable@dpdk.org

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
2022-07-04 14:46:46 +02:00
Yuan Wang
23ab0c59bc net/virtio-user: fix Rx interrupts with multi-queue
The callfds[] array stores eventfds sequentially for Rx and Tx vq.

Fixes: d61138d4f0e2 ("drivers: remove direct access to interrupt handle")
Cc: stable@dpdk.org

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2022-07-04 10:28:05 +02:00
Herakliusz Lipiec
0a666b8631 examples/vhost: update usage message
Update the vhost usage message to align with the documentation.

Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Herakliusz Lipiec
5f4f26d315 doc: update vhost application guide
The vhost sample app documentation describes parameters that are not in
the code and omits parameters that exist.
Also switch the order of the sections on running vhost and the VM,
since the --client parameter in the sample line
requires a socket to be created by the VM.
Remove uio references and replace them with vfio-pci.

Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Herakliusz Lipiec
25651c5647 examples/vhost: update makefile to match Meson build
The Meson build system creates a vhost binary, but the Makefile
and docs refer to it as vhost-switch. Update the Makefile
to match Meson, and the docs accordingly.

Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
David Marchand
36c525a035 vhost: prefix logs with context
We recently improved the log messages in the vhost library, adding some
context that helps filtering for a given vhost-user device.
However, some parts of the code were missed, and some later code changes
broke this new convention (fixes were sent previous to this patch).

Change the VHOST_LOG_CONFIG/DATA helpers and always ask for a string
used as context. This should help limit regressions on this topic.

Most of the time, the context is the vhost-user device socket path.
For the rest, when a message cannot be related to a vhost-user device,
generic names were chosen:
- "dma", for vhost-user async DMA operations,
- "device", for vhost-user device creation and lookup,
- "thread", for threads management.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
David Marchand
481a2c7ef2 vhost: improve some datapath log messages
Those messages were missed when adding socket context.
Fix this.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
David Marchand
bb15129da9 vhost: restore device information in log messages
Device information in the log messages was dropped.

Fixes: 52ade97e3641 ("vhost: fix physical address mapping")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
David Marchand
1ef468a7e9 vhost: add some trailing newline in log messages
VHOST_LOG_* macros don't append a newline.
Add missing ones.

Fixes: e623e0c6d8a5 ("vhost: add reconnect ability")
Fixes: af1475918124 ("vhost: introduce API to start a specific driver")
Fixes: 2dfeebe26546 ("vhost: check return of mutex initialization")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Jiayu Hu
1e4bcee9ba vhost: check DMA info return
This patch checks the return value of rte_dma_info_get()
called in rte_vhost_async_dma_configure().

Coverity issue: 379066
Fixes: 53d3f4778c1d ("vhost: integrate dmadev in asynchronous data-path")
Cc: stable@dpdk.org

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Abhimanyu Saini
2eb13ddea3 vdpa/sfc: fix sync between QEMU and vhost-user
When a DPDK app is running in the VF, it sometimes rings the doorbell
before dev_config has had a chance to complete and hence it misses
the event. As a workaround, ring the doorbell when vDPA reports the
notify_area to QEMU.

Fixes: 630be406dcbf ("vdpa/sfc: get queue notify area info")
Cc: stable@dpdk.org

Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Abhimanyu Saini <absaini@amd.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
Yuan Wang
193edd75a9 net/vhost: fix deadlock on vring state change
If vring state changes after pmd starts working, the locked vring
notifies pmd, thus calling update_queuing_status(), the latter
will wait for pmd to finish accessing vring, while pmd is also
waiting for vring to be unlocked, thus causing deadlock.

Actually, update_queuing_status() only needs to wait when
destroying/stopping the device, but not in other cases.

This patch adds a flag for whether or not to wait to fix this issue.

Fixes: 1ce3c7fe149f ("net/vhost: emulate device start/stop behavior")
Cc: stable@dpdk.org

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Xuan Ding
b6eee3e834 vhost: fix sync dequeue offload
This patch fixes the missing virtio net header copy in sync
dequeue path caused by refactoring, which affects dequeue
offloading.

Fixes: 6d823bb302c7 ("vhost: prepare sync for descriptor to mbuf refactoring")

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Herakliusz Lipiec
c8a3ee49c9 doc: fix readability in vhost guide
Fix grammar issues and readability in the vhost library programmer's guide.

Fixes: 768274ebbd5e ("vhost: avoid populate guest memory")
Cc: stable@dpdk.org

Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Wisam Jaddo
2cf6f9aac9 vdpa/mlx5: add ConnectX-6 LX device ID
This adds ConnectX-6 LX to the list of supported
Mellanox devices that run the MLX5 vdpa PMD.

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
Yuan Wang
1907ce4bae examples/vhost: fix retry logic on Rx path
drain_eth_rx() uses rte_vhost_avail_entries() to calculate
the available entries to determine if a retry is required.
However, this function only works with split rings; calling it
on packed rings returns the wrong value and causes
unnecessary retries, resulting in a significant performance penalty.

This patch fixes that by using the difference between tx/rx burst
as the retry condition.

Fixes: be800696c26e ("examples/vhost: use burst enqueue and dequeue from lib")
Cc: stable@dpdk.org

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
2022-07-01 15:49:49 +02:00
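The retry condition described in the examples/vhost commit above can be sketched independently of the ring layout: retry only while fewer packets were enqueued than were received. The callback type, the helper names and the retry limit are hypothetical; the real example calls the vhost enqueue API instead of a callback.

```c
#include <stdint.h>

/* Hypothetical enqueue callback: returns how many of count packets were accepted. */
typedef uint16_t (*enqueue_fn)(void *ctx, uint16_t count);

/* Retry while the enqueued count is below the received count. */
static uint16_t
enqueue_with_retry(enqueue_fn enqueue, void *ctx, uint16_t rx_count,
		unsigned int max_retries)
{
	uint16_t sent = enqueue(ctx, rx_count);
	unsigned int retry = 0;

	while (sent < rx_count && retry++ < max_retries)
		sent += enqueue(ctx, (uint16_t)(rx_count - sent));
	return sent;
}

/* Demo backend that accepts at most *ctx packets per call. */
static uint16_t
limited_enqueue(void *ctx, uint16_t count)
{
	uint16_t room = *(uint16_t *)ctx;

	return count < room ? count : room;
}
```

Because the condition compares burst counts only, it behaves the same for split and packed rings.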
Andy Pei
b90574b10e vhost: fix virtio block vDPA live migration IO drop
In the virtio blk vDPA live migration use case, before the live
migration process, QEMU sets the call fd on the vDPA back-end. QEMU
and the vDPA back-end stand by until live migration starts.
During the live migration process, QEMU sets the kick fd and a new call
fd. However, once the kick fd is set on the vDPA back-end, the
vDPA back-end configures the device and the data path starts. The new
call fd then triggers a kind of "re-configuration", and this
"re-configuration" causes IO drops.
After this patch, the vDPA back-end configures the device only after
both the kick fd and the call fd are set, ensuring no IO drops.
This patch only impacts virtio blk vDPA devices and does not impact
net devices.

Fixes: 7015b6577178 ("vdpa/ifc: add block device SW live-migration")

Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
Xuan Ding
741eda9d57 doc: clean vhost async note
This patch moves the 'Recommended IOVA mode in async datapath'
section under 'Vhost asynchronous data path' as a sub-section,
which makes the doc cleaner.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Andy Pei
98c6096187 vdpa/ifc: fix vhost message size check
For vhost message VHOST_USER_GET_CONFIG, we do not check
payload size in vhost lib, we check payload size in driver
specific ops.
For ifc vdpa driver, we just need to make sure payload size
is not smaller than sizeof(struct virtio_blk_config).

Fixes: 856d03bcdc54 ("vdpa/ifc: add block operations")

Signed-off-by: Andy Pei <andy.pei@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2022-07-01 15:49:49 +02:00
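The size check described in the vdpa/ifc commit above amounts to rejecting a payload buffer smaller than the config structure before copying into it. In this sketch `struct blk_config` is an abbreviated, hypothetical stand-in for `struct virtio_blk_config`, and `get_config` is a hypothetical driver op.

```c
#include <stdint.h>
#include <string.h>

/* Abbreviated, hypothetical stand-in for struct virtio_blk_config. */
struct blk_config {
	uint64_t capacity;
	uint32_t size_max;
	uint32_t seg_max;
};

/* Copy the device config into payload if the buffer is large enough. */
static int
get_config(const struct blk_config *dev_cfg, uint8_t *payload, uint32_t size)
{
	if (size < sizeof(*dev_cfg))
		return -1; /* payload too small to hold the config */
	memcpy(payload, dev_cfg, sizeof(*dev_cfg));
	return 0;
}
```

A payload larger than the structure is accepted; only an undersized one is refused.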
Xuan Ding
9851c4e339 doc: add vhost async enqueue API usage
This patch documents the correct usage for async enqueue APIs.
The rte_vhost_poll_enqueue_completed() needs to be
called in time to notify the guest of completed packets and
avoid packet loss.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:49 +02:00
Yuan Wang
41f9a1757f net/virtio-user: fix socket non-blocking mode
The virtio-user initialization requires the unix socket to receive backend
messages in blocking mode. However, vhost_user_update_link_state() sets
the same socket to nonblocking via fcntl, which affects all threads.
Enabling the rxq interrupt can cause both of these behaviors to occur
concurrently, with the result that the initialization may fail
because no messages are received on the nonblocking socket.

Thread 1:
virtio_init_device()
--> virtio_user_start_device()
	--> vhost_user_set_memory_table()
		--> vhost_user_check_reply_ack()

Thread 2:
virtio_interrupt_handler()
--> vhost_user_update_link_state()

Fix that by replacing O_NONBLOCK with the recv per-call option
MSG_DONTWAIT.

Fixes: ef53b6030039 ("net/virtio-user: support LSC")
Cc: stable@dpdk.org

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2022-07-01 15:49:32 +02:00
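The difference between the two approaches in the virtio-user commit above can be shown with plain POSIX sockets. `recv_nowait` is a hypothetical helper, not the driver function: it polls once with the per-call MSG_DONTWAIT flag, so the descriptor's own flags stay untouched and other threads keep their blocking reads.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to read without blocking, leaving the descriptor's flags unchanged.
 * Returns the number of bytes read, 0 if nothing is pending (or the peer
 * closed the connection), or -1 on any other error.
 */
static ssize_t
recv_nowait(int fd, void *buf, size_t len)
{
	ssize_t n = recv(fd, buf, len, MSG_DONTWAIT);

	if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
		return 0;
	return n;
}
```

After a call on an empty socket, `fcntl(fd, F_GETFL)` still reports the descriptor without O_NONBLOCK, unlike the fcntl-based approach that changes the mode for every user of the socket.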