Commit Graph

211 Commits

Author SHA1 Message Date
Thomas Monjalon
d599bf8209 vdpa/mlx5: migrate to bus-agnostic common interface
Replace PCI-specific handling with bus-agnostic structures.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Thomas Monjalon
bb060bb545 vdpa/mlx5: define driver name as macro
Use a macro for the PMD driver name.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-07-22 00:11:14 +02:00
Rongwei Liu
630a587bfb net/mlx5: support matching on VXLAN reserved field
This adds matching on the reserved field of VXLAN
header (the last 8-bits). The capability from rdma-core
is detected by creating a dummy matcher using misc5
when the device is probed.

For non-zero groups and FDB domain, the capability is
detected from rdma-core, meanwhile for NIC domain group
zero it's relying on the HCA_CAP from FW.

Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Raslan Darawsheh <rasland@nvidia.com>
2021-07-13 15:06:43 +02:00
Xueming Li
ff09f80697 vdpa/mlx5: fix TSO offload without checksum
Packet was corrupted when TSO requested without CSUM update.

Enables CSUM automatically if only TSO requested.

Fixes: 2aa8444b00 ("vdpa/mlx5: support stateless offloads")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-06-30 13:39:23 +02:00
Xueming Li
35d4f17b3d devargs: add common key definition
Add common devargs key definition for "bus", "class" and "driver".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-07-05 16:33:18 +02:00
Matan Azrad
2838aa76ba vdpa/mlx5: fix device unplug
The vDPA PCI device unplug process should release all the private
device resources and also to unregister the device.

The device unregistration was missed what remained the device data
invalid in the rte_vhost library.

Unregister the device in unplug process via the remove operation.

Fixes: 95276abaaf ("vdpa/mlx5: introduce Mellanox vDPA driver")
Cc: stable@dpdk.org

Reported-by: Eli Britstein <elibr@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>
Tested-by: Eli Britstein <elibr@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-05-18 10:15:19 +02:00
Shiri Kuzin
9f39076b71 common/mlx5: fix mkey attributes initialization
The crypto driver added new fields to the mkey attributes struct:
crypto_en and set_remote_rw.

The entire mkey struct was not initialized, only specific fields in it,
which caused the new added fields not to be initialized resulting in a
mkey creation error.

This is fixed by initializing the entire mkey attributes struct to 0
which will prevent this issue from reoccurring if any fields are added
to the mkey struct in the future.

Fixes: 0111a74e13 ("common/mlx5: adjust DevX mkey fields for crypto")

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-05-09 09:06:31 +02:00
David Marchand
eeded2044a log: register with standardized names
Let's try to enforce the convention where most drivers use a pmd. logtype
with their class reflected in it, and libraries use a lib. logtype.

Introduce two new macros:
- RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is
  used in a component. It is associated to the default name provided
  by the build system,
- RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used,
  and then the passed name is appended to the default name,

RTE_LOG_REGISTER is left untouched for existing external users
and for components that do not comply with the convention.

There is a new Meson variable log_prefix to adapt the default name
for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes.

Note: achieved with below commands + reverted change on net/bonding +
edits on crypto/virtio, compress/mlx5, regex/mlx5

$ git grep -l RTE_LOG_REGISTER drivers/ |
  while read file; do
    pattern=${file##drivers/};
    class=${pattern%%/*};
    pattern=${pattern#$class/};
    drv=${pattern%%/*};
    case "$class" in
      baseband) pattern=pmd.bb.$drv;;
      bus) pattern=bus.$drv;;
      mempool) pattern=mempool.$drv;;
      *) pattern=pmd.$class.$drv;;
    esac
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

$ git grep -l RTE_LOG_REGISTER lib/ |
  while read file; do
    pattern=${file##lib/};
    pattern=lib.${pattern%%/*};
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER(\(.*\), '$pattern'\.\(.*\),/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-05-11 15:17:55 +02:00
Matan Azrad
99f9d799ce vdpa/mlx5: improve interrupt management
The driver should notify the guest for each traffic burst detected by CQ
polling.

The CQ polling trigger is defined by `event_mode` device argument,
either by busy polling on all the CQs or by blocked call to HW
completion event using DevX channel.

Also, the polling event modes can move to blocked call when the
traffic rate is low.

The current blocked call uses the EAL interrupt API suffering a lot
of overhead in the API management and serve all the drivers and
libraries using only single thread.

Use blocking FD of the DevX channel in order to do blocked call
directly by the DevX channel FD mechanism.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-05-04 10:22:17 +02:00
Thomas Monjalon
91d7e76462 vdpa/mlx5: improve portability of thread naming
The function pthread_setname_np is non-portable,
so it may be unavailable in old glibc or other systems.
The function rte_thread_setname is workarounding portability issues.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
2021-04-28 05:13:49 +02:00
Shiri Kuzin
c31f3f7f7b common/mlx5: share Verbs device match function
The get_ib_device_match function iterates over the list of ib devices
returned by the get_device_list glue function and returns the ib device
matching the provided address.

Since this function is in use by several drivers, in this patch we
share the function in common part.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-05-04 22:49:37 +02:00
Chengwen Feng
a011555fb8 vdpa/ifc: set notify and vring relay thread names
This patch supports set notify and vring relay thread name which is
helpful for debugging.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-21 15:57:51 +02:00
Bruce Richardson
4ad4b20a79 drivers: change indentation in build files
Switch from using tabs to 4 spaces for meson.build indentation.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-21 14:04:09 +02:00
Bruce Richardson
cf995efc53 drivers: clean up build lists
Ensure all lists of drivers are standardized:
* one driver per line
* lists double-indented with spaces (as they are line continuations)
* elements in alphabetical order
* opening and closing list brackets "[" & "]" on own lines
* last element has trailing comma

Any code snippets in the list files is adjusted to single-indent using
whitespace to correspond to the new style also.

The lists of standard library dependencies per class, and other short
lists are not formatted one-per-line as these lists are not expected to
grow beyond 2 or 3 entries.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-21 12:37:55 +02:00
Thomas Monjalon
b164198729 drivers: align log names
The log levels are configured by using the name of the logs.
Some drivers are aligned to follow a common log name standard:
	pmd.class.driver[.sub]
Some "common" drivers skip the "class" part:
	pmd.driver.sub

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
2021-04-08 18:32:31 +02:00
Matan Azrad
846ec2ea75 vdpa/mlx5: fix virtq cleaning
The HW virtq object can be destroyed either when the device is closed or
when the state of the virtq becomes disabled.

Some parameters of the virtq should continue to be managed when the
virtq state is changed but all of them must be initialized when the
device is closed.

Wrongly, the enable parameter stayed on when the device is closed what
might cause creation of invalid virtq in the next time a device is
assigned to the driver.

Clean all the virtqs memory when the device is closed.

Fixes: c47d6e8333 ("vdpa/mlx5: support queue update")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-03-31 10:37:10 +02:00
Xiao Wang
629d75653b vdpa/ifc: check PCI config read
The return value of rte_pci_read_config should be checked.

Coverity issue: 302860
Fixes: a3f8150eac ("net/ifcvf: add ifcvf vDPA driver")
Cc: stable@dpdk.org

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-03-31 08:39:14 +02:00
Thomas Monjalon
41b5a7a849 vdpa/mlx5: replace pthread functions unavailable in musl
1/ The function pthread_yield() does not exist in musl libc,
and can be replaced with sched_yield() after including sched.h.

2/ The function pthread_attr_setaffinity_np() does not exist in musl libc,
and can be replaced with pthread_setaffinity_np() after pthread_create().

Fixes: b7fa0bf4d5 ("vdpa/mlx5: fix polling threads scheduling")
Fixes: 5cf3fd3af4 ("vdpa/mlx5: add CPU core parameter to bind polling thread")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-23 08:41:05 +01:00
Thomas Monjalon
924e6b7634 drivers: replace page size definitions with function
The page size is often retrieved from the macro PAGE_SIZE.
If PAGE_SIZE is not defined, it is either using hard coded default,
or getting the system value from the UNIX-only function sysconf().

Such definitions are replaced with the generic function
rte_mem_page_size() defined for each supported OS.

Removing PAGE_SIZE definitions will fix dlb drivers for musl libc,
because #ifdef checks were missing, causing redefinition errors.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Boyer <aboyer@pensando.io>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
2021-03-23 08:41:05 +01:00
Viacheslav Ovsiienko
044423c4db vdpa/mlx5: support timestamp format
This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.

The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.

Fixes: 95276abaaf ("vdpa/mlx5: introduce Mellanox vDPA driver")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-03-16 10:05:36 +01:00
Thomas Monjalon
1b9e9826ad common/mlx5: remove extra line feed in log messages
The macro DRV_LOG already includes a terminating line feed character
defined in PMD_DRV_LOG_.
The extra line feeds added in some messages are removed.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-03-15 14:30:57 +01:00
Matan Azrad
b7fa0bf4d5 vdpa/mlx5: fix polling threads scheduling
When the event mode is with 0 fixed delay, the polling-thread will never
give-up CPU.

So, when multi-polling-threads are active, the context-switch between
them will be managed by the system which may affect latency according to
the time-out decided by the system.

In order to fix multi-devices polling thread scheduling, this patch
forces rescheduling for each CQ poll iteration.

Move the polling thread to SCHED_RR mode with maximum priority to
complete the fairness.

Fixes: 6956a48cab ("vdpa/mlx5: set polling mode default delay to zero")

Signed-off-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
2021-02-10 22:17:47 +01:00
Matan Azrad
f00e5a15af vdpa/mlx5: fix configuration mutex cleanup
When the vDPA device is closed, the driver polling thread is canceled.
The polling thread locks the configuration mutex while it polls the CQs.

When the cancellation happens, it may terminate the thread inside the
critical section what remains the configuration mutex locked.

After device close, the driver may be configured again, in this case,
for example, when the first queue state is updated, the driver tries to
lock the mutex again and deadlock appears.

Initialize the mutex after the polling thread cancellation.

Fixes: 99abbd62c2 ("vdpa/mlx5: fix queue update synchronization")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-29 18:16:10 +01:00
Bruce Richardson
762bfccc8a config: remove compatibility build defines
As announced in the deprecation note, remove all compatibility build
defines from previous make/meson versions and use only the standardized
ones - RTE_LIB_<name> for libraries, and RTE_<CLASS>_<NAME> for drivers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-20 01:43:25 +01:00
Michael Baum
0e41abd198 vdpa/mlx5: move DevX CQ creation to common
Using common function for DevX CQ creation.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-14 10:12:36 +01:00
Xueming Li
1f93bee4e7 vdpa/mlx5: add hardware queue moderation
The next parameters control the HW queue moderation feature.
This feature helps to control the traffic performance and latency
trade-off.

Each packet completion report from HW to SW requires CQ processing by SW
and triggers interrupt for the guest driver. Interrupt report and
handling cost CPU cycles and time and the amount of this affects
directly on packet performance and latency.

hw_latency_mode parameters [int]
  0, HW default.
  1, Latency is counted from the first packet completion report.
  2, Latency is counted from the last packet completion.
hw_max_latency_us parameters [int]
  0 - 4095, The maximum time in microseconds that packet completion
  report can be delayed.
hw_max_pending_comp parameter [int]
  0 - 65535, The maximum number of pending packets completions in an HW
queue.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:56 +01:00
Xueming Li
05421ec938 vdpa/mlx5: set default event mode to polling
For better performance and latency, this patch sets default event
handling mode to polling mode which uses dedicate thread per device to
poll and process event.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Xueming Li
5cf3fd3af4 vdpa/mlx5: add CPU core parameter to bind polling thread
This patch adds new device argument to specify cpu core affinity to
event polling thread for better latency and throughput. The thread
could be also located by name "vDPA-mlx5-<id>".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Xueming Li
c9a189f4ea vdpa/mlx5: default polling mode delay time to zero
To improve performance and latency, this patch sets Rx polling mode
default delay time to zero.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Xueming Li
6956a48cab vdpa/mlx5: set polling mode default delay to zero
To improve throughput and latency, this patch allows Rx polling timer
delay to 0us.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Tal Shnaiderman
981746264e common/mlx5: wrap event channel functions per OS
Wrap the API to create/destroy event channel and to subscribe an event
with OS calls. In Linux those calls are implemented by glue functions
while in Windows they are not supported.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-01-08 16:03:07 +01:00
Raslan Darawsheh
3ea12cad71 common/mlx5: fix name for ConnectX VF device ID
Starting ConnectX-6 Dx, the VF device ID is generic
and not per chip.

https://pci-ids.ucw.cz/v2.2/pci.ids
101e  ConnectX Family mlx5Gen Virtual Function

This means that all will have the same VF device ID.

Fixes: 5fc66630be ("net/mlx5: add ConnectX6-DX device ID")
Cc: stable@dpdk.org

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-20 21:10:05 +01:00
Viacheslav Ovsiienko
b9aa4ba7ce vdpa/mlx5: fix UAR allocation
This patch provides the UAR allocation workaround for the
hosts where UAR allocation with Write-Combining memory
mapping type fails.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-14 10:56:30 +01:00
Tal Shnaiderman
e82ddd28e3 common/mlx5: split PCI relaxed ordering for read and write
The current DevX implementation of the relaxed ordering feature is
enabling relaxed ordering usage only if both relaxed ordering read AND
write are supported.  In that case both relaxed ordering read and write
are activated.

This commit will optimize the usage of relaxed ordering by enabling it
when the read OR write features are supported.  Each relaxed ordering
type will be activated according to its own capability bit.

This will align the DevX flow with the verbs implementation of
ibv_reg_mr when using the flag IBV_ACCESS_RELAXED_ORDERING

Fixes: 53ac93f71a ("net/mlx5: create relaxed ordering memory regions")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2020-11-04 19:16:24 +01:00
Xueming Li
c783fd433c vdpa/mlx5: specify lag port affinity
If set TIS lag port affinity to auto, firmware assign port affinity on
each creation with Round Robin. In case of 2 PFs, if create virtq,
destroy and create again, then each virtq will get same port affinity.

To resolve this fw limitation, this patch sets create TIS with specified
affinity for each PF.

Fixes: bff7350110 ("vdpa/mlx5: prepare virtio queues")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-11-03 23:35:05 +01:00
Xueming Li
0474419bae vdpa/mlx5: handle hardware error
When hardware error happens, vdpa didn't get such information and leave
driver in silent: working state but no response.

This patch subscribes firmware virtq error event and try to recover max
3 times in 3 seconds, stop virtq if max retry number reached.

When error happens, PMD log in warning level. If failed to recover,
outputs error log. Query virtq statistics to get error counters report.

Acked-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-11-03 23:35:05 +01:00
Raslan Darawsheh
6ca37b06e9 common/mlx5: add ConnectX-7 and Bluefield-3 device IDs
This adds the ConnectX-7 and Bluefield-3 device ids to the list of
supported Mellanox devices that run the MLX5 PMDs.
The devices is still in development stage.

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2020-11-03 23:35:04 +01:00
Bruce Richardson
a8d0d473a0 build: replace use of old build macros
Use the newer macros defined by meson in all DPDK source code, to ensure
there are no errors when the old non-standard macros are removed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-19 22:15:44 +02:00
Bruce Richardson
a20b2c01a7 build: standardize component names and defines
As discussed on the dpdk-dev mailing list[1], we can make some easy
improvements in standardizing the naming of the various components in DPDK,
and their associated feature-enabled macros.

Following this patch, each library will have the name in format,
'librte_<name>.so', and the macro indicating that library is enabled in the
build will have the form 'RTE_LIB_<NAME>'.

Similarly, for libraries, the equivalent name formats and macros are:
'librte_<class>_<name>.so' and 'RTE_<CLASS>_<NAME>', where class is the
device type taken from the relevant driver subdirectory name, i.e. 'net',
'crypto' etc.

To avoid too many changes at once for end applications, the old macro names
will still be provided in the build in this release, but will be removed
subsequently.

[1] http://inbox.dpdk.org/dev/ef7c1a87-79ab-e405-4202-39b7ad6b0c71@solarflare.com/t/#u

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2020-10-19 22:15:34 +02:00
Bruce Richardson
63b3907833 build: remove library name from version map file name
Since each version map file is contained in the subdirectory of the library
it refers to, there is no need to include the library name in the filename.
This makes things simpler in case of library renaming.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Rosen Xu <rosen.xu@intel.com>
2020-10-19 22:13:59 +02:00
Maxime Coquelin
4e1b5092ad vdpa/ifc: fix build with recent kernels
VIRTIO_F_IOMMU_PLATFORM is now defined in recent kernel
headers, causing build issue.

Let's define it in the IFC vDPA driver only if it wasn't already.

Fixes: a3f8150eac ("net/ifcvf: add ifcvf vDPA driver")
Cc: stable@dpdk.org

Reported-by: Brandon Lo <blo@iol.unh.edu>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-10-02 18:38:33 +02:00
Matan Azrad
4fb86eb5e8 vdpa/mlx5: fix completion queue polling
The CQ polling is done in order to notify the guest about new traffic
bursts and to release FW resources for the next bursts management.

When HW is faster than SW, it may be that all the FW resources are busy
in SW due to late polling.
In this case, due to wrong WQE counter masking, the fullness
calculation of the completions number is 0 while the queue is full.

Change the WQE counter masking to 16-bit wideness instead of the CQ
size mask as defined by the CQE format.

Fixes: c5f714e50b ("vdpa/mlx5: optimize completion queue poll")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Matan Azrad
9c0e15a117 vdpa/mlx5: fix completion queue assertion
The CQ configuration enables the collapse feature in HW what cause HW to
write all the completions in the first CQE.
When this feature is enabled the HW doesn't switch the owner bit when it
starts a new cycle of the CQ, not like working without the collapse
feature.

The current SW CQ polling wrongly added an assertion to validate the
owner bit switch what causes a panic in debug mode.

Remove the aforementioned assertion.

Fixes: c5f714e50b ("vdpa/mlx5: optimize completion queue poll")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Xueming Li
e8671aca20 vdpa/mlx5: fix event channel setup
During vDPA device setup, if some error happens, event channel
release stucks at polling event channel.

Event channel fd is set to non-blocking in cqe setup, so if any
error happens before this function and after event channel created,
the pooling before releasing resources will stuck.

This patch moves event channel to non-blocking mode right after
creation.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Thomas Monjalon
4f86c0ba19 version: 20.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-08-12 11:32:16 +02:00
Matan Azrad
118494d3ad vdpa/mlx5: fix virtio queue unset
When a virtq is destroyed, the SW should be able to continue the virtq
processing from where the HW stopped.

The current destroy behavior in the driver saves the virtq state (used
and available indexes) only when LM is requested.
So, when LM is not requested the queue state is not saved and the SW
indexes stay invalid.

Save the virtq state in the virtq destroy process.

Fixes: bff7350110 ("vdpa/mlx5: prepare virtio queues")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-08-05 18:33:35 +02:00
Xueming Li
99abbd62c2 vdpa/mlx5: fix queue update synchronization
The driver CQ event management is done by non vhost library thread,
either the dpdk host thread or the internal vDPA driver thread.

When a queue is updated the CQ may be destroyed and created by the vhost
library thread via the queue state operation.

When the queue update feature was added, it didn't synchronize the CQ
management to the queue update what may cause invalid memory access.

Add the aforementioned synchronization by a new per device configuration
mutex.

Fixes: c47d6e8333 ("vdpa/mlx5: support queue update")

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-08-05 18:12:10 +02:00
Chenbo Xia
e2a1a08a76 vdpa/ifc: support vring update after device config
The device ready state in vhost lib is now defined as the state
that first queue pair is ready. And kick/callfd may be updated
by QEMU when ifc device is configured.

Although now ifc driver only supports one queue pair, it still
has to update callfd when working with QEMU. This patch fixes
this vring update problem by implementing the set_vring_state
callback.

Suggested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Xueming Li
c47463272f vdpa/mlx5: fix event queue number query
Vdpa example failed on vq setup, the api to get event queue of specified
core failed.

Internal api devx_query_eqn expects index of event queue vectors, no
need to use cpu id. As the doorbell handling thread is per device, it's
sufficient to use default event queue.

This patch uses the default id(0) as event queue index.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Xueming Li
b887250ba8 vdpa/mlx5: fix completion queue initialization
Vdpa device failed to initialize 2nd VQ during setup. From FW syndrome,
unsupported CQE size was specified in CQ initialization attributes.

The unsupported CQE size comes from uninitialized stack struct data, and
the struct has new fields defined recently which are not initialized in
vdpa code.

This patch initializes cq creation attributes with zero to avoid such
random data.

Fixes: 79a7e409a2 ("common/mlx5: prepare support of packet pacing")

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Matan Azrad
ca4cc612d7 vdpa/mlx5: fix notification timing
The issue is relevant only for the timer event modes: 0 and 1.

When the HW finishes to consume a burst of the guest Rx descriptors,
it creates a CQE in the CQ.
When traffic stops, the mlx5 driver arms the CQ to get a notification
when a specific CQE index is created - the index to be armed is the
next CQE index which should be polled by the driver.

The mlx5 driver configured the kernel driver to send notification to
the guest callfd in the same time of the armed CQE event.
It means that the guest was notified only for each first CQE in a
poll cycle, so if the driver polled CQEs of all the virtio queue
available descriptors, the guest was not notified again for the rest
because there was no any new CQE to trigger the guest notification.

Hence, the Rx queues might be stuck when the guest didn't work with
poll mode.

Remove prior kernel notification, and do manual notification after CQ
polling.

Fixes: a9dd7275a1 ("vdpa/mlx5: optimize notification events")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Matan Azrad
d2a58c2402 vdpa/mlx5: fix steering update in virtq unset
When a virtq is destroyed by the driver, it must be removed from the
steering RQT which holds its reference.

The driver didn't remove the virtq from RQT before destroying it what
caused HW syndrome in virtq unset.

Remove the virtq from RQT before destroying it.

Fixes: 9f09b1ca15 ("vdpa/mlx5: recreate a virtq becoming enabled")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Matan Azrad
581e312d69 vdpa/mlx5: fix live migration termination
There are a lot of per virtq operations in the live migration
handling.

Before the driver support for queue update, when a virtq was not valid,
all the LM handling was terminated.

But now, when the driver supports queue update, the virtq can be invalid
as legal stage.

Skip invalid virtq in LM handling.

Fixes: c47d6e8333 ("vdpa/mlx5: support queue update")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Bing Zhao
8c8c3b01c3 vdpa/mlx5: fix compatibility with MISC4
When dynamic flex parser feature is introduced, the support for misc
parameters 4 of flow table entry (FTE) match set is needed. The
structure of "mlx5_ifc_fte_match_param_bits" is extended with
"mlx5_ifc_fte_match_set_misc4_bits" at the end of it. The total size
of the FTE match set will be changed into 384 bytes from 320 bytes.
Low level user space driver (rdma-core) will have the validation of
the length of FTE match set. In the old release that no MISC4
supported in the rdma-core, and this will break the backward
compatibility, even if the MISC4 is not used in most cases, like
in vDPA driver.
In order not to break the compatibility old rdma-core, the length
adjustment needs to be done. In mlx5 vDPA driver, the lengths of
the matcher and value are both set to 320 without MISC4. There is
no need to change the structure definition, all bytes of the MISC4
will be discarded if it is not needed. Since the MISC4 parameter
is aligned with a 64B boundary and so does the whole FTE match set
parameter, there is no need to take any padding and alignment into
consideration when calculating the size.

Fixes: daa38a8924 ("net/mlx5: add flow translation of eCPRI header")

Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Parav Pandit
f6d099d7da common/mlx5: remove class check from class drivers
Now that mlx5_pci PMD checks for enabled classes and performs
probe(), remove() of associated classes, individual class driver
does not need to check if other driver is enabled.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-28 19:01:30 +02:00
Parav Pandit
392bf9084d common/mlx5: register class drivers through common layer
Migrate mlx5 net, vdpa and regex PMD to start using mlx5 common class
driver.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-07-28 19:01:30 +02:00
Parav Pandit
8208800163 common/mlx5: avoid class constructor priority
mlx5_common is shared library between mlx5 net, VDPA and regex PMD.
It is better to use common initialization helper instead of using
RTE_PRIORITY_CLASS priority.

Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-07-28 18:52:11 +02:00
Maxime Coquelin
112510aece vdpa/mlx5: enable status protocol feature
This patch advertises VHOST_USER_PROTOCOL_F_STATUS
support in the MLX5 driver so that that the protocol
feature is negotiated.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-07-11 06:18:52 +02:00
Maxime Coquelin
0c88dfa106 vdpa/ifc: enable status protocol feature
This patch advertises VHOST_USER_PROTOCOL_F_STATUS
support in the IFC driver so that that the protocol
feature is negotiated.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-07-11 06:18:52 +02:00
Jerin Jacob
a7551b6c60 log: remove unneeded logtype declaration
RTE_LOG_REGISTER macro already declares the log type.
Remove the unneeded log type declaration.

Fixes: 9c99878aa1 ("log: introduce logtype register macro")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-07-07 13:18:23 +02:00
Jerin Jacob
9c99878aa1 log: introduce logtype register macro
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.

It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-07-03 15:52:51 +02:00
Matan Azrad
c47d6e8333 vdpa/mlx5: support queue update
Last changes in vDPA device management by vhost library may cause queue
ready state update after the device configuration.

So, there is chance that some queue configuration information will be
known only after the device was configured.

Add support to reconfigure a queue after the device configuration
according to the queue state update and the configuration changes.

Adjust the host notifier and the guest notification configuration to be
per queue and to be applied in the enablement process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:31 +02:00
Matan Azrad
0329868d6a vhost: support host notifier queue configuration
As an arrangement to per queue operations in the vDPA device it is
needed to change the next experimental API:

The API ``rte_vhost_host_notifier_ctrl`` was changed to be per queue
instead of per device.

A `qid` parameter was added to the API arguments list.

Setting the parameter to the value RTE_VHOST_QUEUE_ALL configures the
host notifier to all the device queues as done before this patch.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
edc6391e45 vdpa/mlx5: control completion queue event mode
The CQ polling is necessary in order to manage guest notifications when
the guest doesn't work with poll mode (callfd != -1).

The CQ polling scheduling method can affect the host CPU utilization and
the traffic bandwidth.

Define 3 modes to control the CQ polling scheduling:

1. A timer thread which automatically adjusts its delays to the coming
   traffic rate.
2. A timer thread with fixed delay time.
3. Interrupts: Each CQE burst arms the CQ in order to get an interrupt
   event in the next traffic burst.

When traffic becomes off, mode 3 is taken automatically.

The interrupt management takes a lot of CPU cycles but forward traffic
event to the guest very fast.

Timer thread save the interrupt overhead but may add delay for the guest
notification.

Add device arguments to control on the mode.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
c5f714e50b vdpa/mlx5: optimize completion queue poll
The vDPA driver uses a CQ in order to know when traffic works were
completed by the HW.

Each traffic burst completion adds a CQE to the CQ.

When the vDPA driver detects CQEs in the CQ, it triggers the guest
notification for the corresponding queue and consumes all of them.

There is collapse feature in the HW that configures the HW to write all
the CQEs in the first entry of the CQ.

Using this feature, the vDPA driver can read only the first CQE,
validate that the completion counter inside the CQE was changed and if
so, to notify the guest.

Use CQ collapse feature in order to improve the poll utilization.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
a9dd7275a1 vdpa/mlx5: optimize notification events
When the virtio guest driver doesn't work with poll mode, the driver
creates event mechanism in order to schedule completion notifications
for each virtq burst traffic.

When traffic comes to a virtq, a CQE will be added to the virtq CQ by
the FW.
The driver requests interrupt for the next CQE index, and when interrupt
is triggered, the driver polls the CQ and notifies the guest by virtq
callfd writing.

According to the described method, the interrupts will be triggered for
each burst of traffic. The burst size depends on interrupt latency.

Interrupts management takes a lot of CPU cycles and using it for each
traffic burst takes big portion of CPU capacity.

When traffic is on, using timer for CQ poll scheduling instead of
interrupts saves a lot of CPU cycles.

Move CQ poll scheduling to be done by timer in case of running traffic.
Request interrupts only when traffic is off.

The timer scheduling management is done by a new dedicated thread uses
a usleep command.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
a49f758d11 vhost: split vDPA header file
This patch split the vDPA header file in two, making
rte_vdpa_device structure opaque to the application.

Applications should only include rte_vdpa.h, while drivers
should include both rte_vdpa.h and rte_vdpa_dev.h.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
2263f13941 vhost: replace vDPA device ID in Vhost
This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
81a6b7fe06 vhost: replace device ID in vDPA ops
This patch is a preliminary step to get rid of the
vDPA device ID. It makes vDPA callbacks to use the
vDPA device struct as a reference instead of the ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
38f8ab0bbc vhost: make vDPA framework bus agnostic
This patch makes the vDPA framework to no more
support only PCI devices, but any devices by relying
on the generic device name as identifier.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
441476b000 vdpa/mlx5: support MTU feature
The guest virtio device may request MTU updating when the vhost backend
device exposes a capability to support it.

Expose the MTU feature capability.

At configuration time, check the requested MTU and update it in the HW
device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
04e7beeb12 vdpa/mlx5: adjust virtio queue protection domain
In other to fill the new requirement for virtq
configuration, set the single PD managed by the driver for
all the virtqs.

Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Matan Azrad
7de66d823e vdpa/mlx5: support virtio queue statistics get
Add support for statistics operations.

A DevX counter object is allocated per virtq in order to
manage the virtq statistics.

The counter object is allocated before the virtq creation
and destroyed after it, so the statistics are valid only in
the life time of the virtq.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Tal Shnaiderman
33031608e8 bus/pci: introduce Windows support with stubs
Addition of stub eal and bus/pci functions to compile
bus/pci for Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-06-30 00:02:54 +02:00
Ophir Munk
72f7566056 common/mlx5: move glue files under Linux directory
The glue file mlx5_glue.c is based on Linux specifics APIs.
Move it (including file mlx5_glue.h) to common/mlx5/linux directory.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-06-03 17:19:26 +02:00
Matan Azrad
11e2ed9e43 vdpa/mlx5: fix PCI address comparison
A regular memcmp function was used to compare between two objects of
type `struct rte_pci_addr`.

Due to the alignment rules of compiler structure builders, some memory
is not initiated in the structure even though all the fields were
initiated.

Therefore, the comparison may fail even though the PCI addresses are
identical and to cause false failure in probe.

Use the dedicated API to compare 2 PCI addresses.

Fixes: 75dd0ae917 ("vdpa/mlx5: disable RoCE")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Tested-by: Noa Ezra <noae@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Matan Azrad
9f09b1ca15 vdpa/mlx5: recreate a virtq becoming enabled
The virtq configurations may be changed when it moves from disabled
state to enabled state.

Listen to the state callback even if the device is not configured.
Recreate the virtq when it moves from disabled state to enabled state
and when the device is configured.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Matan Azrad
7497873f23 vdpa/mlx5: separate virtq stop
In live migration, before logging the virtq, the driver queries the
virtq indexes after moving it to suspend mode.

Separate this method to new function mlx5_vdpa_virtq_stop as a
preparation for reusing.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Matan Azrad
c2eb33aaf9 vdpa/mlx5: manage virtqs by array
As a preparation to listen the virtqs status before the device is
configured, manage the virtqs structures in array instead of list.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Ray Kinsella
105f3039c7 version: reference next ABI 21 for recent additions
Change references to ABI 20.0.1 to use ABI v21, see
https://doc.dpdk.org/guides/contributing/abi_policy.html#general-guidelines

"Major ABI versions are declared no more frequently than yearly.
Compatibility with the major ABI version is mandatory in subsequent
releases until a new major ABI version is declared."

Combined ABI policy and versioning in maintainers, add map files to the
filter to more closely monitor future ABI changes.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
2020-05-05 00:25:34 +02:00
Matan Azrad
e60b10f260 vdpa/mlx5: add logs
Add log prints to improve driver status following.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-04-21 13:57:08 +02:00
Matan Azrad
f6f1f024b0 vdpa/mlx5: validate notifier configuration
When both, direct and indirect notifier management cannot be
configured, return an error.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-04-21 13:57:08 +02:00
Matan Azrad
bd09eb7b61 vdpa/mlx5: support direct HW notifications
Add support for the next 2 callbacks:
get_vfio_device_fd and get_notify_area.

This will allow direct HW doorbell ringing from guest and will save CPU
usage in host.

By this patch, the QEMU will map the physical address of the virtio
device in guest directly to the physical address of the HW device
doorbell.

The guest doorbell write is 2 bytes transaction while some Mellanox nics
support only 4 bytes transactions.

Remove ConnectX-5 and BF1 devices support which don't support 2B
doorbell writes for HW triggering.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-04-21 13:57:08 +02:00
Matan Azrad
4cae722c1b vdpa/mlx5: move virtual doorbell alloc to probe
The configure and close operations may be called a lot of time by vhost
library according to the virtio connections in the guest.

VAR is the device memory space for the virtio queues doorbells.
Each VAR page can be shared for more than one queue while its owner must
synchronize the writes to it.

The mlx5 driver allocates single VAR page for all its queues.

Therefore, it is better to allocate it in probe device level instead of
creating and destroying it per new connection.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-04-21 13:57:08 +02:00
Asaf Penso
6d6cd38f45 vdpa/mlx5: set default queue indices
The rte_vhost_get_vring_base function is being called to get the values
of last_avail_idx and last_used_idx.
These fields will not have the correct values in case the function
returns an error.

Adding a check for the function return value, and in the case of an
error, set the fields to be zero and print a warning message.

Fixes: bff7350110 ("vdpa/mlx5: prepare virtio queues")
Cc: stable@dpdk.org

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-04-21 13:57:08 +02:00
Maxime Coquelin
0265a1980f vhost: prefix vDPA enum value for PCI address type
In order to avoid potential conflicts, rename the PCI_ADDR
enum value to VDPA_ADDR_PCI in vdpa_addr_type_enum.

All symbols referencing this enum are experimental, so it
does not break API policy.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-04-21 13:57:07 +02:00
Dekel Peled
a4e6ea97a5 common/mlx5: fix RSS key copy to TIR context
In function mlx5_devx_cmd_create_tir(), the 40 bytes of RSS key are
copied in 10 iterations, 4 bytes each time using the MLX5_SET macro.
As result the RSS key is copied into TIR context in swapped byte order.
This patch fixes the issue, using memcpy() to copy the RSS key as is.
The struct member mlx5_devx_tir_attr.rx_hash_toeplitz_key is updated
to byte array type.

Fixes: c3aea272ee ("net/mlx5: create advanced Rx object via DevX")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-04-21 13:57:05 +02:00
Shiri Kuzin
53ac93f71a net/mlx5: create relaxed ordering memory regions
In the current state, when preforming read/write
transactions we must wait for a completion in order
to run the next transaction, and all transactions are
performed by order.

Relaxed Ordering is a PCI optimization which by enabling it
we allow the system to perform read/writes in a different
order without having to wait for completion and improve
the performance in that matter.

This commit introduces the creation of relaxed ordering
memory regions in mlx5.
As relaxed ordering is an optimization, drivers that
do not support it can simply ignore it and therefore
it is enabled by default.

Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-04-21 13:57:05 +02:00
Thomas Monjalon
ef5baf3486 replace packed attributes
There is a common macro __rte_packed for packing structs,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:16:46 +02:00
Pavan Nikhilesh
acec04c4b2 build: disable experimental API check internally
Remove setting ALLOW_EXPERIMENTAL_API individually for each Makefile and
meson.build. Instead, enable ALLOW_EXPERIMENTAL_API flag across app, lib
and drivers.
This changes reduces the clutter across the project while still
maintaining the functionality of ALLOW_EXPERIMENTAL_API i.e. warning
external applications about experimental API usage.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-04-14 16:22:34 +02:00
Matan Azrad
d76a17f7b8 vdpa/mlx5: fix guest notification timing
When the HW finishes to consume the guest Rx descriptors, it creates a
CQE in the CQ.

The mlx5 driver arms the CQ to get notifications when a specific CQE
index is created - the index to be armed is the next CQE index which
should be polled by the driver.

The mlx5 driver configured the kernel driver to send notification to the
guest callfd in the same time it arrives to the mlx5 driver.

It means that the guest was notified only for each first CQE in a poll
cycle, so if the driver polled CQEs of all the virtio queue available
descriptors, the guest was not notified again for the rest because
there was no any new cycle for polling.

Hence, the Rx queues might be stuck when the guest didn't work with
poll mode.

Move the guest notification to be after the driver consumes all the
SW own CQEs.
By this way, guest will be notified only after all the SW CQEs are
polled.

Also init the CQ to be with HW owner in the start.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")

Signed-off-by: Matan Azrad <matan@mellanox.com>
2020-02-25 10:48:04 +01:00
Matan Azrad
06da8cccb6 vdpa/mlx5: fix event setup
The completion event mechanism should work only if at least one of the
virtqs has valid callfd to be notified on.

When all the virtqs works with poll mode, the event mechanism should not
be configured.

The driver didn't take it into account and crashed in the above case.

Do not configure event interrupt when all the virtqs are in poll mode.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-02-25 10:47:59 +01:00
Matan Azrad
30b6974441 vdpa/mlx5: fix completion queue arming
The mlx5 vDPA driver manages QP and CQ in order to forward the HW event
to the guest by the callfd file descriptor for each virtq.

The driver arms the CQ for the next CQE index that should be
completed by the HW in order to create completion event.

In the SW completion event handler, the driver arms the CQ again for the
next index,

The CQE index in the CQ doorbell and in the CQ doorbell record was
masked incorrectly with the CQ size mask while it should be masked only
with 0xFFFFFF mask.

Remove the CQ size mask, stay only with 0xFFFFFF mask.

Fixes: 8395927cdf ("vdpa/mlx5: prepare HW queues")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-19 13:51:06 +01:00
Raslan Darawsheh
58b4a2b13e net/mlx5: add BlueField-2 device ID
This adds new device id to the list of Mellanox devices
that runs mlx5 PMD.
- BlueField-2 integrated ConnectX-6 Dx network controller

This device is not ready yet, it is in development stage.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-02-19 13:51:06 +01:00
Matan Azrad
3840320242 vdpa/mlx5: fix ABI version
Changed the ABI version to 20.0.1.

Fixes: 95276abaaf ("vdpa/mlx5: introduce Mellanox vDPA driver")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-02-12 09:26:43 +01:00
Matan Azrad
75dd0ae917 vdpa/mlx5: disable RoCE
In order to support virtio queue creation by the FW, RoCE mode
should be disabled in the device.

Do it by netlink which is like the devlink tool commands:
	1. devlink dev param set pci/[pci] name enable_roce value false
	   cmode driverinit
	2. devlink dev reload pci/[pci]
Or by sysfs which is like:
	echo 0 >  /sys/bus/pci/devices/[pci]/roce_enable

The IB device is matched again after ROCE disabling.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
31b9c29c86 vdpa/mlx5: support close and config operations
Support dev_conf and dev_conf operations.
These operations allow vdpa traffic.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
9d39e57f21 vdpa/mlx5: support live migration
Add support for live migration feature by the HW:
	Create a single Mkey that maps the memory address space of the
		VHOST live migration log file.
	Modify VIRTIO_NET_Q object and provide vhost_log_page,
		dirty_bitmap_mkey, dirty_bitmap_size, dirty_bitmap_addr
		and dirty_bitmap_dump_enable.
	Modify VIRTIO_NET_Q object and move state to SUSPEND.
	Query VIRTIO_NET_Q and get hw_available_idx and hw_used_idx.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
62c813706e vdpa/mlx5: map doorbell
The HW supports only 4 bytes doorbell writing detection.
The virtio device set only 2 bytes when it rings the doorbell.

Map the virtio doorbell detected by the virtio queue kickfd to the HW
VAR space when it expects to get the virtio emulation doorbell.

Use the EAL interrupt mechanism to get notification when a new event
appears in kickfd by the guest and write 4 bytes to the HW doorbell space
in the notification callback.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
af72fdb546 vdpa/mlx5: support queue state operation
Add support for set_vring_state operation.

Using DevX API the virtq state can be changed as described in PRM:
	enable - move to ready state.
	disable - move to suspend state.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
a5a1d98ddc vdpa/mlx5: add basic steering configurations
Add a steering object to be managed by a new file mlx5_vdpa_steer.c.

Allow promiscuous flow to scatter the device Rx packets to the virtio
queues using RSS action.

In order to allow correct RSS in L3 and L4, split the flow to 7 flows
as required by the device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
2aa8444b00 vdpa/mlx5: support stateless offloads
Add support for the next features in virtq configuration:
	VIRTIO_F_RING_PACKED,
	VIRTIO_NET_F_HOST_TSO4,
	VIRTIO_NET_F_HOST_TSO6,
	VIRTIO_NET_F_CSUM,
	VIRTIO_NET_F_GUEST_CSUM,
	VIRTIO_F_VERSION_1,

These features support depends in the DevX capabilities reported by the
device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
bff7350110 vdpa/mlx5: prepare virtio queues
The HW virtq object represents an emulated context for a VIRTIO_NET
virtqueue which was created and managed by a VIRTIO_NET driver as
defined in VIRTIO Specification.

Add support to prepare and release all the basic HW resources needed
the user virtqs emulation according to the rte_vhost configurations.

This patch prepares the basic configurations needed by DevX commands to
create a virtq.

Add new file mlx5_vdpa_virtq.c to manage virtq operations.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
8395927cdf vdpa/mlx5: prepare HW queues
As an arrangement to the vitrio queues creation, a 2 QPs and CQ may be
created for the virtio queue.

The design is to trigger an event for the guest and for the vdpa driver
when a new CQE is posted by the HW after the packet transition.

This patch add the basic operations to create and destroy the above HW
objects  and to trigger the CQE events when a new CQE is posted.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
cc07a42da2 vdpa/mlx5: prepare memory regions
In order to map the guest physical addresses used by the virtio device
guest side to the host physical addresses used by the HW as the host
side, memory regions are created.

By this way, for example, the HW can translate the addresses of the
packets posted by the guest and to take the packets from the correct
place.

The design is to work with single MR which will be configured to the
virtio queues in the HW, hence a lot of direct MRs are grouped to single
indirect MR.

Create functions to prepare and release MRs with all the related
resources that are required for it.

Create a new file mlx5_vdpa_mem.c to manage all the MR related code
in the driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
f7aaf477d6 vdpa/mlx5: support features get operations
Add support for get_features and get_protocol_features operations.

Part of the features are reported by the DevX capabilities.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
d830dc1642 vdpa/mlx5: support queues number operation
Support get_queue_num operation to get the maximum number of queues
supported by the device.

This number comes from the DevX capabilities.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
95276abaaf vdpa/mlx5: introduce Mellanox vDPA driver
Add a new driver to support vDPA operations by Mellanox devices.

The first Mellanox devices which support vDPA operations are
ConnectX-6 Dx and Bluefield1 HCA for their PF ports and VF ports.

This driver is depending on rdma-core like the mlx5 PMD, also it is
going to use mlx5 DevX to create HW objects directly by the FW.
Hence, the common/mlx5 library is linked to the mlx5_vdpa driver.

This driver will not be compiled by default due to the above
dependencies.

Register a new log type for this driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-02-05 09:51:21 +01:00
Matan Azrad
5c060bf178 drivers: move ifc to vDPA directory
A new vDPA class was recently introduced.

IFC driver implements the vDPA operations,
hence it should be moved to the vDPA class.

Move it.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-01-14 00:09:33 +01:00
Matan Azrad
3df349b7af drivers: introduce vDPA class
The vDPA (vhost data path acceleration) drivers provide support for
the vDPA operations introduced by the rte_vhost library.

Any driver which provides the vDPA operations should be moved\added to
the vdpa class under drivers/vdpa/.

Create the general files for vDPA class in drivers and in documentation.

The management tree for vDPA drivers is
git://dpdk.org/next/dpdk-next-virtio.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-01-13 23:28:00 +01:00