6227 Commits

Author SHA1 Message Date
Honnappa Nagarahalli
e1c9850f55 eal/armv8: force inlining of timer API
Change the inline functions to use __rte_always_inline to be
consistent with rest of the inline functions.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-07-07 13:21:26 +02:00
Honnappa Nagarahalli
97c910139b eal/armv8: fix timer frequency calibration with PMU
get_tsc_freq uses 'nanosleep' system call to calculate the CPU
frequency. However, 'nanosleep' results in the process getting
un-scheduled. The kernel saves and restores the PMU state. This
ensures that the PMU cycles are not counted towards a sleeping
process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
in incorrect CPU frequency calculation. This logic is replaced
with generic counter based loop.

Bugzilla ID: 450
Fixes: f91bcbb2d9a6 ("eal/armv8: use high-resolution cycle counter")
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-07-07 13:20:50 +02:00
Jerin Jacob
a7551b6c60 log: remove unneeded logtype declaration
RTE_LOG_REGISTER macro already declares the log type.
Remove the unneeded log type declaration.

Fixes: 9c99878aa1b1 ("log: introduce logtype register macro")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-07-07 13:18:23 +02:00
Hemant Agrawal
c843fca96c rawdev: remove remaining experimental tags
The experimental tags were removed, but the comment
is still having API classification as EXPERIMENTAL

Fixes: 931cc531aad2 ("rawdev: remove experimental tag")
Cc: stable@dpdk.org

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-07-07 12:54:22 +02:00
David Marchand
74f4d6424d lib: remind experimental status in headers
The following libraries are experimental, all of their functions can
be changed or removed:

- librte_bbdev
- librte_bpf
- librte_compressdev
- librte_fib
- librte_flow_classify
- librte_graph
- librte_ipsec
- librte_node
- librte_rcu
- librte_rib
- librte_stack
- librte_telemetry

Their status is properly announced in MAINTAINERS.
Remind this status in their headers in a common fashion (aligned to ABI
docs).

Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-07 12:49:10 +02:00
David Marchand
7762e0139b build: remove special versioning for non stable libraries
Having a special versioning for experimental/internal libraries put a
additional maintenance cost while this status is already announced in
MAINTAINERS and the library headers/documentation.
Following discussions and vote at 05/20 TB meeting [1], use a single
versioning for all libraries in DPDK.

Note: for the ABI check, an exception [2] had been added when tweaking
this special versioning [3].
Prefer explicit libabigail rules (which will be dropped in 20.11).

1: https://mails.dpdk.org/archives/dev/2020-May/168450.html
2: https://git.dpdk.org/dpdk/commit/?id=23d7ad5db41c
3: https://git.dpdk.org/dpdk/commit/?id=ec2b8cd7ed69

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-07 12:48:25 +02:00
Tal Shnaiderman
f6d7f40576 mbuf: build on Windows
Build the lib for Windows.
Export needed EAL functions used by the lib.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-07-07 01:47:24 +02:00
Tal Shnaiderman
9af385fd89 eal: support endianness detection on Windows
Inclusion of the endian.h header is set only for Linux OS.

Windows endianness will be determined by the predefined
__BYTE_ORDER__ macro.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
2020-07-07 01:42:27 +02:00
Fady Bader
080bda1a0a mempool: build on Windows
Some EAL functions are used by mempool lib but not exported on Windows.
The functions are exported.
Added mempool to supported libraries for Windows compilation.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-07-07 01:28:12 +02:00
Fady Bader
972848da5f mempool: use generic memory syscall wrappers
Using generic memory management calls instead of Unix memory management
calls for mempool.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-07-07 01:24:55 +02:00
Fady Bader
2f59f3b085 eal: disable function versioning on Windows
Function versioning implementation is not supported by Windows.
Function versioning is disabled on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-07 01:23:29 +02:00
Alan Dewar
83415d4fd8 sched: fix port time rounding
The QoS scheduler works off port time that is computed from the number
of CPU cycles that have elapsed since the last time the port was
polled.   It divides the number of elapsed cycles to calculate how
many bytes can be sent, however this division can generate rounding
errors, where some fraction of a byte sent may be lost.

Lose enough of these fractional bytes and the QoS scheduler
underperforms.  The problem is worse with low bandwidths.

To compensate for this rounding error this fix doesn't advance the
port's time_cpu_cycles by the number of cycles that have elapsed,
but by multiplying the computed number of bytes that can be sent
(which has been rounded down) by number of cycles per byte.
This will mean that port's time_cpu_cycles will lag behind the CPU
cycles momentarily.  At the next poll, the lag will be taken into
account.

Fixes: de3cfa2c98 ("sched: initial import")
Cc: stable@dpdk.org

Signed-off-by: Alan Dewar <alan.dewar@att.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
2020-07-07 00:58:31 +02:00
Ori Kam
f66967d218 regexdev: implement API functions
This commit implements all the RegEx public API.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Guy Kaneti <guyk@marvell.com>
2020-07-07 00:24:52 +02:00
Ori Kam
b25246beae regexdev: add core functions
This commit introduce the API that is needed by the RegEx devices in
order to work with the RegEX lib.

During the probe of a RegEx device, the device should configure itself,
and allocate the resources it requires.
On completion of the device init, it should call the
rte_regex_dev_register in order to register itself as a RegEx device.

Signed-off-by: Ori Kam <orika@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Guy Kaneti <guyk@marvell.com>
2020-07-07 00:24:51 +02:00
Ori Kam
9c26771bc5 regexdev: add core structures
This commit introduce the rte_regexdev_core.h file.
This file holds internal structures and API that are used by
the regexdev.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Guy Kaneti <guyk@marvell.com>
2020-07-07 00:24:39 +02:00
Jerin Jacob
bab9497ef7 regexdev: introduce API
As RegEx usage become more used by DPDK applications, for example:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition.
* Dmemory introspection.
* Natural Language Processing (NLP)
* Sentiment Analysis.
* Big data database acceleration.
* Computational storage.

Number of PMD providers started to work on HW implementation,
along side with SW implementations.

This lib adds the support for those kind of devices.

The RegEx Device API is composed of two parts:
- The application-oriented RegEx API that includes functions to setup
  a RegEx device (configure it, setup its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx poll Mode Driver (PMD) to simultaneously register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regexdev_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regexdev_rule_db_update()
    | |             | |<-------rte_regexdev_rule_db_compile_activate()
    | | Rules 0..n  | |<-------rte_regexdev_rule_db_import()
    | +-------------+ |------->rte_regexdev_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is â~@~\RegExâ~@~].

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pair to
transmit a burst of pattern matching request and receive a burst of
receive the pattern matching response. The pattern matching
request/response embedded in *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database
contains a set of rules that compiled in device specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Group of rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier
provided at the time of rule creation for the application to identify
the rule upon match.

Scan: A pattern matching request through *enqueue* API.

It may possible that a given RegEx device may not support all the
features
of PCRE. The application may probe unsupported features through
struct rte_regexdev_info::pcre_unsup_flags

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which assume to not be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operates on same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical core on different queue
pair. It is the responsibility of the upper level application to
enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regexdev*.
RegEx devices are dynamically registered during the PCI/SoC device
probing phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regexdev* structure and
a new device identifier are allocated for that device. Then, the
regexdev_init() function supplied by the RegEx driver matching the
probed device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware
or software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regexdev* structure is effectively registered.
Otherwise, both the *rte_regexdev* structure and the device identifier
are freed.

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following
order:
    - rte_regexdev_configure()
    - rte_regexdev_queue_pair_setup()
    - rte_regexdev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching job, dequeue
pattern matching response, get the stats, update the rule database,
get/set device attributes and so on

If the application wants to change the configuration (i.e. call
rte_regexdev_configure() or rte_regexdev_queue_pair_setup()), it must
call rte_regexdev_stop() first to stop the device and then do the
reconfiguration before calling rte_regexdev_start() again. The enqueue and
dequeue functions should not be invoked when the device is stopped.

Finally, an application can close a RegEx device by invoking the
rte_regexdev_close() function.

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of
type *regexdev_ops*.
The address of the *regexdev_ops* structure is stored in the
*rte_regexdev* structure by the device init function of the RegEx driver,
which is invoked during the PCI/SoC device probing phase, as explained
earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regexdev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regexdev_ops* structure of the *rte_regexdev*
structure.

For performance reasons, the address of the fast-path functions of the
RegEx driver is not contained in the *regexdev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regexdev*
structure to avoid an extra indirect memory access during their
invocation.

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching
request to the RegEx device and the *dequeue* operation gets a burst of
pattern matching response for the ones submitted through *enqueue*
operation.

Typical application utilisation of the RegEx device API will follow the
following programming flow.

- rte_regexdev_configure()
- rte_regexdev_queue_pair_setup()
- rte_regexdev_rule_db_update() Needs to invoke if precompiled rule
  database not
  provided in rte_regexdev_config::rule_db for rte_regexdev_configure()
  and/or application needs to update rule database.
- rte_regexdev_rule_db_compile_activate() Needs to invoke if
  rte_regexdev_rule_db_update function was used.
- Create or reuse exiting mempool for *rte_regex_ops* objects.
- rte_regexdev_start()
- rte_regexdev_enqueue_burst()
- rte_regexdev_dequeue_burst()

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
2020-07-07 00:24:38 +02:00
David Marchand
0fc601af3a trace: simplify trace point registration
RTE_TRACE_POINT_DEFINE and RTE_TRACE_POINT_REGISTER must come in pairs.
Merge them and let RTE_TRACE_POINT_REGISTER handle the constructor part.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-07-05 21:34:21 +02:00
Bruce Richardson
06c7871dde eal: restrict default plugin path to shared lib mode
When using statically linked DPDK binaries, the EAL checks the default PMD
path and tries to load any drivers there, despite the fact that all drivers
are normally linked into the binary.  This behaviour can cause issues if
the PMD path and lib dir is configured to a non-standard location which is
not in the ld.so.conf paths, e.g. a build with prefix set to a home
directory location. In a case such as this, EAL will try and
(unnecessarily) load the .so driver files but that load will fail as their
dependent libraries, such as ethdev, for example, will not be found.

Because of this, it is better if statically linked DPDK apps do not load
drivers from the standard paths automatically. The user can always have
this behaviour by explicitly specifying the path using -d flag, if so
desired.

Not loading the libraries automatically can also prevent potential issues
with a user building and running a statically-linked DPDK binary based off
a private copy of DPDK, while there exists on the same machine a
system-wide installation of DPDK in the default locations. Without this
change, the system-installed drivers will be loaded to the binary alongside
the statically-linked drivers, which is not what the user would have
intended.

To detect whether we are in a statically or dynamically linked binary, we
can have EAL try to get a dlopen handle to its own shared library, by
calling dlopen with the RTLD_NOLOAD flag. This will return NULL if there is
no such shared lib loaded i.e. the code is executing from a static library,
or a handle to the lib if it is loaded.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
2020-07-05 21:32:40 +02:00
Bruce Richardson
e84921fb56 eal: cache last directory permissions checked
When loading a directory of drivers, we check the same hierarchy multiple
times. If we just cache the last directory checked, this avoids repeated
checks of the same path, since all drivers in that path have been added to
the list consecutively.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-05 21:32:40 +02:00
Bruce Richardson
7b69fd1e95 eal: forbid loading drivers from insecure paths
Any paths on the system which are world-writable are insecure and should
not be used for loading drivers. Therefore, whenever an absolute or
relative driver path is passed to EAL, check for world-writability and
don't load any drivers from that path if it is insecure. Drivers loaded
from system locations i.e. those passed without any path info and found
automatically by the loader, are excluded from these checks as system paths
are assumed to be secure.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-05 21:32:40 +02:00
Bruce Richardson
49b536fc30 eal: load only shared libs from driver plugin directories
When we pass a "-d" flag to EAL pointing to a directory, we attempt to load
all files in that directory as driver plugins, irrespective of file type.
This procludes using e.g. the build/drivers directory, as a driver source
since it contains static libs and other files as well as the shared
objects.

By filtering out any files whose filename does not end in ".so", we can
improve usability by allowing other non-driver files to be present in the
driver directory.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-05 21:32:40 +02:00
Bruce Richardson
efa75d61d2 eal: remove unnecessary null-termination in plugin path
Since strlcpy always null-terminates, and the buffer is zeroed before copy
anyway, there is no need to explicitly zero the end of the character
array, or to limit the bytes that strlcpy can write.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2020-07-05 21:32:40 +02:00
Thomas Monjalon
de321d5918 build: remove special handling for node library
The node library had a need of being linked as a whole
to make some constructors effective.
Now that all libraries are linked with --whole-archive,
there is no need to have this library separate.

Fixes: e2db26f76673 ("build: always link whole DPDK static libraries")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Jerin Jacob <jerinj@marvell.com>
2020-07-05 10:52:11 +02:00
Jerin Jacob
9c99878aa1 log: introduce logtype register macro
Introduce the RTE_LOG_REGISTER macro to avoid the code duplication
in the logtype registration process.

It is a wrapper macro for declaring the logtype, registering it and
setting its level in the constructor context.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-07-03 15:52:51 +02:00
Bruce Richardson
e2db26f766 build: always link whole DPDK static libraries
To ensure all constructors are included in static build, we need to pass
the --whole-archive flag when linking, which is used with the
"link_whole" meson option. Since we use link_whole for all libs, we no
longer need to track the lib as part of the static dependency, just the
path to the headers for compiling.

After this patch is applied, all DPDK .a files are inside
--whole-archive/--no-whole-archive flags, but external dependencies and
shared libs being linked against remain outside.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Andrzej Ostruszka <aostruszka@marvell.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-07-01 19:30:52 +02:00
Morten Brørup
4b7284a71f ring: optimize empty test
Testing if the ring is empty is as simple as comparing the producer and
consumer pointers.

In theory, this optimization reduces the number of potential cache misses
from 3 to 2 by not having to read r->mask in rte_ring_count().

The modification of this function were also discussed in the RFC here:
https://mails.dpdk.org/archives/dev/2020-April/165752.html

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-07-01 11:46:09 +02:00
Morten Brørup
36e3cfbbef ring: cleanup coding style
Fix coding style violations that checkpatch will complain about.

Add missing "int" after "unsigned".
Add missing spaces around "+=" and "+".
Remove superfluous type cast of numerical constant.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-07-01 11:46:09 +02:00
Feifei Wang
708d904451 ring: fix tail in peek API for ST mode
The value of *tail should be the prod->tail not prod->head. After
modification, it can record 'tail' so head/tail can be updated
accordingly.

Fixes: 664ff4b1729b ("ring: introduce peek style API")

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-07-01 10:41:19 +02:00
Feifei Wang
c8d909aeb7 ring: fix bulk enqueue for HTS/RTS ring modes
Remove the unwanted call to "_rte_ring_do_enqueue_elem" to allow for
correct handling of RTS/HTS modes.

Fixes: e6ba4731c0f3 ("ring: introduce RTS ring mode")

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-07-01 10:41:19 +02:00
Matan Azrad
da2b788041 vhost: fix features definition location
The vhost library provide an infrastructure in order to help the DPDK
users to manage vhost devices.

One of the infrastructure parts is the features enablement APIs.

Some features bits may be defined only in the internal file vhost.h in
case the kernel version doesn't include them.

Hence, user running on old kernel may not be able to manage thus
features.

Move all the feature bits definitions to the API file rte_vhost.h.

Fixes: db69be54b6ff ("vhost: hide internal code")
Fixes: 8d286dbeb8d7 ("vhost: fix multiple queue not enabled for old kernels")
Fixes: 3d3c6590b58c ("vhost: enable virtio MTU feature")
Fixes: 704098fc478c ("vhost: fix build with old kernels")
Cc: stable@dpdk.org

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:31 +02:00
Matan Azrad
b213af9aa4 vhost: notify virtq file descriptor update
When virtq call or kick file descriptors are changed in the device
configuration when the queue is ready, the application and the vDPA
driver should be notified to be aligned to the new file descriptors.

Notify the state to be disabled before the file descriptor update and
return it back to be enabled after the update.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:31 +02:00
Matan Azrad
127f9c6f7b vhost: handle memory hotplug with vDPA devices
Some vDPA drivers' basic configurations should be updated when the
guest memory is hotplugged.

Close vDPA device before hotplug operation and recreate it after the
hotplug operation is done.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
d0fcc38f5f vhost: improve device readiness notifications
Some guest drivers may not configure disabled virtio queues.

In this case, the vhost management never notifies the application and
the vDPA device readiness because it waits to the device to be ready.

The current ready state means that all the virtio queues should be
configured regardless the enablement status.

In order to support this case, this patch changes the ready state:
The device is ready when at least 1 queue pair is configured and
enabled.

So, now, the application and vDPA driver are notifies when the first
queue pair is configured and enabled.

Also the queue notifications will be triggered according to the new
ready definition.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
9f2016b2ce vhost: skip access lock when vDPA is configured
No need to take access lock in the vhost-user message handler when
vDPA driver controls all the data-path of the vhost device.

It allows the vDPA set_vring_state operation callback to configure
guest notifications.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
0329868d6a vhost: support host notifier queue configuration
As an arrangement to per queue operations in the vDPA device it is
needed to change the next experimental API:

The API ``rte_vhost_host_notifier_ctrl`` was changed to be per queue
instead of per device.

A `qid` parameter was added to the API arguments list.

Setting the parameter to the value RTE_VHOST_QUEUE_ALL configures the
host notifier to all the device queues as done before this patch.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
a49f758d11 vhost: split vDPA header file
This patch split the vDPA header file in two, making
rte_vdpa_device structure opaque to the application.

Applications should only include rte_vdpa.h, while drivers
should include both rte_vdpa.h and rte_vdpa_dev.h.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
e91ac959fa vhost: remove vDPA device count API
This API is no more useful, this patch removes it.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
8d44fc3a81 vhost: introduce wrappers for some vDPA ops
This patch is preliminary work to make the vDPA device
structure opaque to the user application. Some callbacks
of the vDPA devices are used to query capabilities before
attaching to a Vhost port. This patch introduces wrappers
for these ops.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
08a4f9bab3 vhost: use linked list for vDPA devices
There is no more notion of device ID outside of vdpa.c.
We can now move from array to linked-list model for keeping
track of the vDPA devices.

There is no point in using array here, as all vDPA API are
used from the control path, so no performance concerns.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
f6d587754c vhost: remove useless vDPA API
vDPA is no more used outside of the vDPA internals,
so remove rte_vdpa_get_device() API that is now useless.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
0f700f90ad vhost: replace device ID in applications
This patch replaces the use of vDPA device ID with
vDPA device pointer. The goals is to remove the vDPA
device ID to avoid confusion with the Vhost ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
2263f13941 vhost: replace vDPA device ID in Vhost
This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
81a6b7fe06 vhost: replace device ID in vDPA ops
This patch is a preliminary step to get rid of the
vDPA device ID. It makes vDPA callbacks to use the
vDPA device struct as a reference instead of the ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
38f8ab0bbc vhost: make vDPA framework bus agnostic
This patch makes the vDPA framework to no more
support only PCI devices, but any devices by relying
on the generic device name as identifier.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Maxime Coquelin
383fb5a9c7 vhost: introduce vDPA device class
This patch introduces vDPA device class. It will enable
application to iterate over the vDPA devices.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
2020-06-30 14:52:30 +02:00
Ruifeng Wang
4eb25acdb7 eal/arm: add vcopyq intrinsic for aarch32
vcopyq_laneq_u32 should be implemented for aarch32 which doesn't have
the intrinsic.
This fixes build of examples/l3fwd for armv7.

Fixes: 3c4b4024c225 ("arch/arm: add vcopyq_laneq_u32 for old gcc")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-30 14:52:30 +02:00
Matan Azrad
1cb4415751 vhost: introduce operation to get vDPA queue stats
The vDPA device offloads all the datapath of the vhost
device to the HW device.

In order to expose to the user traffic information this
patch introduces new 3 APIs to get traffic statistics, the
device statistics name and to reset the statistics per
virtio queue.

The statistics are taken directly from the vDPA driver
managing the HW device and can be different for each vendor
driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-06-30 14:52:29 +02:00
Maxime Coquelin
d1c074bd76 vhost: enable reply-ack systematically
As announced during v20.05 release cycle, this
patch makes reply-ack protocol feature to be enabled
unconditionally.

This protocol feature makes the communication between the
master and the slave more robust, avoiding for example
possible undefined behaviour with VHOST_USER_SET_MEM_TABLE.

Also, reply-ack support will be required for upcoming
VHOST_USER_SET_STATUS request.

Note that this protocol feature was disabled by default
because Qemu version 2.7.0 to 2.9.0 had a bug causing a
deadlock when reply-ack was negotiated and multiqueue
enabled. These Qemu version are now very old and no more
maintained, so we can reasonably consider we no more
support them.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-06-30 14:52:29 +02:00
Fady Bader
8d05adbd54 ring: enable for Windows
Building ring on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
2020-06-30 01:30:22 +02:00
Tasnim Bashar
5652a4ad24 eal/windows: fix thread handle
Casting thread ID to handle is not accurate way to get thread handle.
Need to use OpenThread function to get thread handle from thread ID.

pthread_setaffinity_np and pthread_getaffinity_np functions
for Windows are affected because of it.

Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
2020-06-30 00:50:26 +02:00