Commit Graph

6199 Commits

Author SHA1 Message Date
Stephen Hemminger
0cc917d809 hash: check flags on creation for future proofing
All API's should check that they support the flag values
passed. If an application passes an invalid flag it could
cause problems in later ABI.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
2020-06-16 17:46:39 +02:00
Stephen Hemminger
3e71b3d456 ring: check flag settings for future proofing
All API's should check that they support the flag values passed.
These checks ensure that the extra bits can safely be used
without risk of ABI breakage.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-06-16 17:46:39 +02:00
Joyce Kong
7f3aa08639 eal: introduce bit operations API
Bitwise operation APIs are defined and used in a lot of PMDs,
which caused a huge code duplication. To reduce duplication,
this patch consolidates them into a common API family.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
2020-06-16 14:16:56 +02:00
Dmitry Kozlyuk
2a5d547a4a eal/windows: implement basic memory management
Basic memory management supports core libraries and PMDs operating in
IOVA as PA mode. It uses a kernel-mode driver, virt2phys, to obtain
IOVAs of hugepages allocated from user-mode. Multi-process mode is not
implemented and is forcefully disabled at startup. Assign myself as a
maintainer for Windows file and memory management implementation.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:30:54 +02:00
Dmitry Kozlyuk
c08bd191b1 eal/windows: initialize hugepage info
Add hugepages discovery ("large pages" in Windows terminology)
and update documentation for required privilege setup. Only 2MB
hugepages are supported and their number is estimated roughly
due to the lack or unstable status of suitable OS APIs.
Assign myself as maintainer for the implementation file.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:30:32 +02:00
Dmitry Kozlyuk
b8a36b0866 eal/windows: improve CPU and NUMA node detection
1. Map CPU cores to their respective NUMA nodes as reported by system.
2. Support systems with more than 64 cores (multiple processor groups).
3. Fix magic constants, styling issues, and compiler warnings.
4. Add EAL private function to map DPDK socket ID to NUMA node number.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:29:39 +02:00
Dmitry Kozlyuk
b831cfd23a eal/windows: complete queue.h data structures
Limited version imported previously lacks at least SLIST macros.
Import a complete file from FreeBSD, since its license exception is
already approved by Technical Board.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:27:28 +02:00
Dmitry Kozlyuk
5690607879 eal/windows: add tracing stubs
EAL common code depends on tracepoint calls, but generic implementation
cannot be enabled on Windows due to missing standard library facilities.
Add stub functions to support tracepoint compilation, so that common
code does not have to conditionally include tracepoints until proper
support is added.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:27:09 +02:00
Dmitry Kozlyuk
262c4ee791 trace: add size_t field emitter
It is not guaranteed that sizeof(long) == sizeof(size_t). On Windows,
sizeof(long) == 4 and sizeof(size_t) == 8 for 64-bit programs.
Tracepoints using "long" field emitter are therefore invalid there.
Add dedicated field emitter for size_t and use it to store size_t values
in all existing tracepoints.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:27:00 +02:00
Dmitry Kozlyuk
694161b7e0 mem: extract common dynamic memory allocation
Code in Linux EAL that supports dynamic memory allocation (as opposed to
static allocation used by FreeBSD) is not OS-dependent and can be reused
by Windows EAL. Move such code to a file compiled only for the OS that
require it. Keep Anatoly Burakov maintainer of extracted code.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:26:37 +02:00
Dmitry Kozlyuk
83713ef276 mem: extract common memseg list initialization
All supported OS create memory segment lists (MSL) and reserve VA space
for them in a nearly identical way. Move common code into EAL private
functions to reduce duplication.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:25:16 +02:00
Dmitry Kozlyuk
c4b89ecb64 eal: introduce memory management wrappers
Introduce OS-independent wrappers for memory management operations used
across DPDK and specifically in common code of EAL:

* rte_mem_map()
* rte_mem_unmap()
* rte_mem_page_size()
* rte_mem_lock()

Windows uses different APIs for memory mapping and reservation, while
Unices reserve memory by mapping it. Introduce EAL private functions to
support memory reservation in common code:

* eal_mem_reserve()
* eal_mem_free()
* eal_mem_set_dump()

Wrappers follow POSIX semantics limited to DPDK tasks, but their
signatures deliberately differ from POSIX ones to be more safe and
expressive. New symbols are internal. Being thin wrappers, they require
no special maintenance.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:25:05 +02:00
Dmitry Kozlyuk
176bb37ca6 eal: introduce internal wrappers for file operations
Introduce OS-independent wrappers in order to support common EAL code
on Unix and Windows:

* eal_file_open: open or create a file.
* eal_file_lock: lock or unlock an open file.
* eal_file_truncate: enforce a given size for an open file.

Implementation for Linux and FreeBSD is placed in "unix" subdirectory,
which is intended for common code between the two. These thin wrappers
require no special maintenance.

Common code supporting multi-process doesn't use the new wrappers,
because it is inherently Unix-specific and would impose excessive
requirements on the wrappers.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:24:37 +02:00
Dmitry Kozlyuk
67a661ed85 eal: replace page sizes enum with a set of constants
Clang on Windows follows MS ABI where enum values are limited to 2^31-1.
Enum rte_page_sizes has members valued above this limit, which get
wrapped to zero, resulting in compilation error (duplicate values in
enum). Using MS ABI is mandatory for Windows EAL to call Win32 APIs.

Remove rte_page_sizes and replace its values with #define's.
This enumeration is not used in public API, so there's no ABI breakage.
Announce API changes for 20.08 in documentation.

Suggested-by: Jerin Jacob <jerinjacobk@gmail.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-15 19:23:34 +02:00
David Marchand
b6f0621201 eal/windows: fix symbol export
rte_eal_get_configuration() has been made private in 19.11, remove
leftover in Windows export list.

Fixes: f58cef079b ("eal: make the global configuration private")

Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-06-15 11:58:26 +02:00
Pallavi Kadam
d87f964ce6 eal/windows: fix warnings
Fixed bunch of warnings when compiling using clang on Windows
such as the use of an unsafe string function (strerror),
[-Wunused-variable], [-Wunused-function] in eal_common_options.c
[-Wunused-const-variable] in getopt.c and [-Wunused-parameter]
in eal_common_thread.c.
Also fixed warnings generated using Mingw:
[-Werror=old-style-definition], [-Werror=cast-function-type] and
[-Werror=attributes]

Signed-off-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2020-06-15 11:35:58 +02:00
Tasnim Bashar
482bcf8404 eal/windows: support thread ID query
Add rte_sys_gettid function to use rte_gettid() on Windows.
rte_gettid() is required for recursive spin lock and recursive ticket lock.

Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-06-11 16:40:29 +02:00
Tal Shnaiderman
4887a7e234 mbuf: align layout in Windows
Using uint32_t type bit-fields in Windows will pads the
'L2/L3/L4 and tunnel information' union with additional bits.

This padding causes rte_mbuf size misalignment and the total size
increases to 3 cache-lines.

Changed packet_type bit-fields types from uint32_t to uint8_t
to allow unified 2 cache-line structure size.

Added the __extension__ attribute over the modified struct to avoid
the warning:

type of bit-field ... is a GCC extension [-pedantic]

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 16:26:33 +02:00
Alexander Kozyrev
d6eb247371 mbuf: fix external buffer pool boundaries
Memzones are created in testpmd in order to test external data
buffers functionality. Each memzone is 2Mb in size and divided among
the pool of external memory buffers.

Memzone may not always be fully utilized because mbufs size can vary
and some space can be left unused at the tail of a memzone. This is
not handled properly and mbuf can get the address of this leftover
space since this address is still valid (part of memzone), but there
is not enough space to fit the whole packet data. As a result packet
data may overflow and cause the memory corruption.

Take mbuf size into account when distributing memory addresses from
a memzone to external mbufs. Skip the remaining tail in case there
is not enough room for a packet and move to a next memzone instead.

Fixes: 6c8e50c2e5 ("mbuf: create pool with external memory buffers")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 09:51:46 +02:00
Xiaolong Ye
c67a423c53 mbuf: remove unused next member in dynamic flag/field
TAILQ_ENTRY next is not needed in struct mbuf_dynfield_elt and
mbuf_dynflag_elt, since they are actually chained by rte_tailq_entry's
next field when calling TAILQ_INSERT_TAIL(mbuf_dynfield/dynflag_list, te,
next).

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-06-11 09:32:43 +02:00
Thomas Monjalon
d1342ea419 mbuf: document guideline for new fields and flags
Since dynamic fields and flags were added in 19.11,
the idea was to use them for new features, not only PMD-specific.

The guideline is made more explicit in doxygen, in the mbuf guide,
and in the contribution design guidelines.

For more information about the original design, see the presentation
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf

This decision was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-June/169667.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-06-11 09:29:15 +02:00
Ciara Power
61d6c7a98b telemetry: fix init log printing
Initially, printf was used to indicate and error/warning resulting from
telemetry initialisation. This is now fixed to use EAL logs for
notices, and the unnecessary printf for an error is removed.

Fixes: eeb486f3ba ("eal: add telemetry as dependency")
Fixes: dd6275a424 ("telemetry: fix error log output")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-05-24 18:01:31 +02:00
Adam Dybkowski
e475fd853a cryptodev: fix SHA-1 digest enum comment
This patch fixes improper SHA-1 digest size in the enum comment
and also adds the note about HMAC-SHA-1-96.

Fixes: 1bd407fac8 ("cryptodev: extract symmetric operations")
Cc: stable@dpdk.org

Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-05-24 11:52:35 +02:00
Xuan Ding
22fa1bcbcb vhost: fix zero-copy server mode
This patch fixes the situation where vhost-user cannot start as server
with dequeue_zero_copy enabled.

Using flag instead of vsocket->is_server to determine whether vhost-user
is in client mode. Because vsocket->is_server is not ready at this time.

Fixes: 715070ea10 ("vhost: prevent zero-copy with incompatible client mode")
Cc: stable@dpdk.org

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
2020-05-19 17:12:17 +02:00
Ferruh Yigit
60197bda97 meter: provide experimental alias for matured API
On v20.02 some meter APIs have been matured and symbols moved from
EXPERIMENTAL to DPDK_20.0.1 block.

This can break the applications that were using these mentioned APIs on
v19.11. Although there is no modification on the APIs and the action is
positive and matures the APIs, the affect can be negative to
applications.

This patch provides aliasing by duplicating the existing and versioned
symbols as experimental.

Since symbols moved from DPDK_20.0.1 to DPDK_21 block in the v20.05, the
aliasing done between EXPERIMENTAL and DPDK_21.

With DPDK_21 ABI (DPDK v20.11) all aliasing will be removed and only
stable version of the APIs will remain.

Fixes: 30512af820 ("meter: remove experimental flag from RFC4115 trTCM API")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-05-19 16:25:09 +02:00
Jerin Jacob
8b9dae0cc3 doc: use globbing terminology
Glob is the terminology used in fnmatch man page.
Use glob terminology across DPDK for shell pattern.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-05-19 16:05:17 +02:00
Muhammad Bilal
5a448a55b4 fix same typo in multiple places
Removed the typing error in doc/guides/eventdevs/index.rst,
drivers/net/mlx5/mlx5.c and in lib/librte_vhost/rte_vhost.h

Bugzilla ID: 477
Fixes: 0857b94211 ("doc: add event device and software eventdev")
Fixes: 039253166a ("vhost: add device op when notification to guest is sent")
Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")
Cc: stable@dpdk.org

Signed-off-by: Muhammad Bilal <m.bilal@emumba.com>
2020-05-19 15:55:57 +02:00
Ciara Power
a0c21662b4 telemetry: fix buffer overrun if max bytes read
If 1024 bytes were received over the socket, this caused
buffer_recvf[bytes] to overrun the array. The size of the buffer - 1 is
now passed to the read function.

Coverity issue: 358442
Fixes: b80fe1805e ("telemetry: introduce backward compatibility")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Ciara Power
07580a734b telemetry: check socket creation failure
The return value from the socket function is now checked, as it can
return a negative value on error.

Coverity issue: 358443
Fixes: b80fe1805e ("telemetry: introduce backward compatibility")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Ciara Power
6aa1aa0ebc telemetry: close socket on connection failure
The socket fd is now being closed when the connection fails.

Coverity issue: 358444
Fixes: b80fe1805e ("telemetry: introduce backward compatibility")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Ciara Power
bd3c89cb1a telemetry: fix error checking for strchr function
The strchr function return was not being checked which could lead to
NULL deferencing later in the function.

Coverity issue: 358438, 358445
Fixes: b80fe1805e ("telemetry: introduce backward compatibility")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Ciara Power
febbebf7f2 telemetry: keep threads separate from data plane
The threads for listening on the telemetry sockets are control threads
and should be separated from those on the data plane. Since telemetry
cannot use the rte_ctrl_thread_create() API, as it does not depend on
EAL, we pass the ctrl thread cpu_set to telemetry init and use it
directly to ensure that telemetry cannot interfere with the data plane
threads.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-19 15:05:56 +02:00
Gaetan Rivet
e90b9c52f8 kvargs: fix strcmp helper documentation
Minor error, "unless" was used instead of "unlike".

Fixes: a3b85476c5 ("kvargs: add generic string matching callback")
Cc: stable@dpdk.org

Signed-off-by: Gaetan Rivet <grive@u256.net>
2020-05-19 15:05:56 +02:00
Hemant Agrawal
1168be0077 metrics: fix library cleanup
metrics_initialized shall be reset in deinit function.
This is currently causing issue in running metrics_autotest
multiple times.

Fixes: 07c1b6925b ("telemetry: invert dependency on metrics library")

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-05-19 14:05:21 +02:00
Gaetan Rivet
8354e681e3 pci: explain how empty strings are rejected in DBDF
Empty strings are forbidden as input to rte_pci_addr_parse().
It is explicitly enforced in BDF parsing as parsing the bus
field will immediately fail. The related check is commented.

It is implicitly enforced in DBDF parsing, as the domain would be
parsed to 0 without error, but the check `end[0] != ':'` afterward
will return -EINVAL.

Enforcing consistency between parsers by reading the code is not helped
by this property being implicit. Add a comment to explain.

Signed-off-by: Gaetan Rivet <grive@u256.net>
Acked-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-05-19 11:18:38 +02:00
Gaetan Rivet
21a61fae51 pci: reject negative values in PCI id
The function strtoul will not return ERANGE if the input is negative, as
one might expect.

   0000:-FFFFFFFFFFFFFFFB:00.0

is not a better way to write 0000:05:00.0.
To simplify checking for '-', forbid using spaces before the field value.

   0000: 00:   2c.0

Should not be accepted.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Gaetan Rivet <grive@u256.net>
Acked-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
2020-05-19 11:18:38 +02:00
Darek Stojaczyk
26cfc20fed pci: accept 32-bit domain numbers
The parsing code was bailing on domains greater than UINT16_MAX,
but domain numbers like that are still valid and present on some systems.
One example is Intel VMD (Volume Management Device), which acts somewhat
as a software-managed PCI switch and its upstream linux driver assigns
all downstream devices a PCI domain of 0x10000.

Parsing a BDF like 10000:01:00.0 was failing before. To fix it, increase
the upper limit of domain number to UINT32_MAX. This matches the size of
struct rte_pci_addr->domain (uint32).

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Darek Stojaczyk <dariusz.stojaczyk@intel.com>
Acked-by: Gaetan Rivet <grive@u256.net>
2020-05-19 10:59:19 +02:00
Sivaprasad Tummala
0fd5608ef9 vhost: handle mbuf allocation failure
vhost buffer allocation is successful for packets that fit
into a linear buffer. If it fails, vhost library is expected
to drop the current packet and skip to the next.

The patch fixes the error scenario by skipping to next packet.
Note: Drop counters are not currently supported.

Fixes: c3ff0ac70a ("vhost: improve performance by supporting large buffer")
Cc: stable@dpdk.org

Signed-off-by: Sivaprasad Tummala <sivaprasad.tummala@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-18 20:35:57 +02:00
Stephen Hemminger
3a2cd6fd06 eal: fix C++17 compilation
Compiling a C++ application that includes directly or indirectly
rte_common.h will cause a warning:

include/rte_common.h:350:37: warning: ISO C++17 does not allow
  ‘register’ storage class specifier [-Wregister]
 rte_combine32ms1b(register uint32_t x)

C++ is pickier than standard C and flags this antique usage.

The register keyword is an old K&R legacy and should be removed
everywhere in DPDK. For now, fix it where it hurts.

Fixes: 08f683174e ("eal: add functions for previous power of 2 alignment")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-05-18 20:46:24 +02:00
Ferruh Yigit
05a38d7c75 compat: provide experimental alias for matured ABI
On v20.02 some APIs matured and symbols moved from EXPERIMENTAL to
DPDK_20.0.1 block.

This had the affect of breaking the applications that were using these
APIs on v19.11. Although there is no modification of the APIs and the
action is positive and matures the APIs, the affect can be negative to
applications.

When a maintainer is promoting an API to become part of the next major
ABI version by removing the experimental tag. The maintainer may
choose to offer an alias to the experimental tag, to prevent these
breakages in future.

The following changes are made to enabling aliasing:

Updated to the ABI policy and ABI versioning documents.

Created VERSION_SYMBOL_EXPERIMENTAL helper macro.

Updated the 'check-symbols.sh' tool, which was complaining that the
symbol is in EXPERIMENTAL tag in .map file but it is not in the
.experimental section (__rte_experimental tag is missing).
Updated tool in a way it won't complain if the symbol in the
EXPERIMENTAL tag duplicated in some other block in .map file (versioned)

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
2020-05-18 19:46:25 +02:00
Xuan Ding
e7debf6026 vhost: fix potential fd leak
Vhost will create temporary file when receiving VHOST_USER_GET_INFLIGHT_FD
message. Malicious guest can send endless this message to drain out the
resource of host.

When receiving VHOST_USER_GET_INFLIGHT_FD message repeatedly, closing the
file created during the last handling of this message.

CVE-2020-10726
Fixes: d87f1a1cb7 ("vhost: support inflight info sharing")
Cc: stable@dpdk.org

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-18 15:22:42 +02:00
Xiaolong Ye
549de54c4f vhost: fix potential memory space leak
A malicious container which has direct access to the vhost-user socket
can keep sending VHOST_USER_GET_INFLIGHT_FD messages which may cause
leaking resources until resulting a DOS. Fix it by unmapping the
dev->inflight_info->addr before assigning new mapped addr to it.

CVE-2020-10726
Fixes: d87f1a1cb7 ("vhost: support inflight info sharing")
Cc: stable@dpdk.org

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-18 15:22:42 +02:00
Marvin Liu
97ecc1c85c vhost: fix translated address not checked
Malicious guest can construct desc with invalid address and zero buffer
length. That will request vhost to check both translated address and
translated data length. This patch will add missed address check.

CVE-2020-10725
Fixes: 75ed516978 ("vhost: add packed ring batch dequeue")
Fixes: ef861692c3 ("vhost: add packed ring batch enqueue")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-18 15:22:42 +02:00
Maxime Coquelin
acd4c92fa6 vhost/crypto: validate keys lengths
transform_cipher_param() and transform_chain_param() handle
the payload data for the VHOST_USER_CRYPTO_CREATE_SESS
message. These payloads have to be validated, since it
could come from untrusted sources.

Two buffers and their lengths are defined in this payload,
one the the auth key and one for the cipher key. But above
functions do not validate the key length inputs, which could
lead to read out of bounds, as buffers have static sizes of
64 bytes for the cipher key and 512 bytes for the auth key.

This patch adds necessary checks on the key length field
before being used.

CVE-2020-10724
Fixes: e80a987081 ("vhost/crypto: add session message handler")
Cc: stable@dpdk.org

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
2020-05-18 15:22:34 +02:00
Maxime Coquelin
c78d94189d vhost: fix vring index check
vhost_user_check_and_alloc_queue_pair() is used to extract
a vring index from a payload. This function validates the
index and is called early on in when performing message
handling. Most message handlers depend on it correctly
validating the vring index.

Depending on the message type the vring index is in
different parts of the payload. The function contains a
switch/case for each type and copies the index. This is
stored in a uint16. This index is then validated. Depending
on the message, the source index is an unsigned int. If
integer truncation occurs (uint->uint16) the top 16 bits
of the index are never validated.

When they are used later on  (e.g. in
vhost_user_set_vring_num() or vhost_user_set_vring_addr())
it can lead to out of bound indexing. The out of bound
indexed data gets written to, and hence this can cause
memory corruption.

This patch fixes this vulnerability by declaring vring
index as an unsigned int in
vhost_user_check_and_alloc_queue_pair().

CVE-2020-10723
Fixes: 160cbc815b ("vhost: remove a hack on queue allocation")
Cc: stable@dpdk.org

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
2020-05-18 15:18:58 +02:00
Maxime Coquelin
3ae4beb079 vhost: check log mmap offset and size overflow
vhost_user_set_log_base() is a message handler that is
called to handle the VHOST_USER_SET_LOG_BASE message.
Its payload contains a 64 bit size and offset. Both are
added up and used as a size when calling mmap().

There is no integer overflow check. If an integer overflow
occurs a smaller memory map would be created than
requested. Since the returned mapping is mapped as writable
and used for logging, a memory corruption could occur.

CVE-2020-10722
Fixes: fbc4d248b1 ("vhost: fix offset while mmaping log base address")
Cc: stable@dpdk.org

Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
2020-05-18 15:18:58 +02:00
Kevin Traynor
572f2a9089 hash: fix gcc 10 maybe-uninitialized warning
gcc 10.1.1 reports a warning for the ext_bkt_id variable:

../lib/librte_hash/rte_cuckoo_hash.c:
In function ‘__rte_hash_add_key_with_hash’:
../lib/librte_hash/rte_cuckoo_hash.c:1104:29:
warning: ‘ext_bkt_id’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
 1104 |  (h->buckets_ext[ext_bkt_id - 1]).sig_current[0] = short_sig;
      |                  ~~~~~~~~~~~^~~

The return value of rte_ring_sc_dequeue_elem() is already checked,
but also initialize ext_bkt_id to zero (invalid value) and check
that it also overwritten.

Fixes: fbfe568103 ("hash: use 32-bit elements rings to save memory")
Cc: stable@dpdk.org

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
2020-05-18 13:54:36 +02:00
Nithin Dabilpuram
a8b8a86317 node: fix arm64 build with old gcc
Older GCC(~4) complains about uninitialized 'dip'
var though all the lanes of the vec register are set.
Hence this patch explicitly initializes vec register
to fix the issue.

In file included from ip4_lookup.c:34:0:
ip4_lookup_neon.h: n function ‘ip4_lookup_node_process’: \
ip4_lookup_neon.h:25:12: error: ‘dip’ may be used uninitialized in \
	this function [-Werror=maybe-uninitialized]
  int32x4_t dip;
            ^

Fixes: 16df6a2c66 ("node: add IPv4 lookup for arm64")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2020-05-13 15:38:50 +02:00
Dekel Peled
6b30428820 doc: refine ethernet and VLAN flow rule items
Specified pattern may be translated in different manner.
For example the pattern "eth / ipv4" can be translated to match
untagged packets only, since the pattern doesn't specify a VLAN item.
It can also be translated to match both tagged and untagged packets,
for the same reason.
This patch updates the rte_flow documentation to clearly specify the
required pattern to use.
For example:
To match tagged ipv4 packets, the pattern "eth / vlan / ipv4 / end"
should be used.
To match untagged ipv4 packets, the pattern "eth / ipv4 / end"
should be used.
To match all IPV4 packets, both tagged and untagged, need to apply
two rules with the patterns above.
To match both tagged and untagged packets of any type, the pattern
"eth / end" should be used.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ori Kam <orika@mellanox.com>
2020-05-11 22:27:39 +02:00
Asaf Penso
f6eb393849 ethdev: add 200G link speed
There is no way to report back a link speed of 200Gbps.

Adding 200G link speed.

Signed-off-by: Asaf Penso <asafp@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-05-11 22:27:39 +02:00
Arek Kusztal
a0f0de06d4 cryptodev: fix ABI compatibility for ChaCha20-Poly1305
This patch adds versioned function rte_cryptodev_info_get()
to prevent some issues with ABI policy.
Node v21 works in same way as before, returning driver capabilities
directly to the API caller. These capabilities may include new elements
not part of the v20 ABI.
Node v20 function maintains compatibility with v20 ABI releases
by stripping out elements not supported in v20 ABI. Because
rte_cryptodev_info_get is called by other API functions,
rte_cryptodev_sym_capability_get function is versioned the same way.

Fixes: b922dbd38c ("cryptodev: add ChaCha20-Poly1305 AEAD algorithm")

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-05-11 13:17:43 +02:00
Arek Kusztal
b922dbd38c cryptodev: add ChaCha20-Poly1305 AEAD algorithm
This patch adds Chacha20-Poly1305 AEAD algorithm to Cryptodev.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-05-11 13:17:43 +02:00
Vladimir Medvedkin
e62893f5ec ipsec: check SAD lookup error
Explicitly check return value in add_specific()
CID 357760 (#2 of 2): Negative array index write (NEGATIVE_RETURNS)
8. negative_returns: Using variable ret as an index to array sad->cnt_arr

Coverity issue: 357760
Fixes: b2ee269267 ("ipsec: add SAD add/delete/lookup implementation")
Cc: stable@dpdk.org

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-05-11 13:17:43 +02:00
Akhil Goyal
e11bdd3774 cryptodev: add feature flag for non-byte aligned data
Some wireless algos like SNOW, ZUC may support input
data in bits which are not byte aligned. However, not
all PMDs can support this requirement. Hence added a
new feature flag RTE_CRYPTODEV_FF_NON_BYTE_ALIGNED_DATA
to identify which all PMDs can support non-byte aligned
data.

Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
2020-05-11 13:17:43 +02:00
Phil Yang
1a805dee01 ipsec: optimize SA outbound sequence update
For SA outbound packets, rte_atomic64_add_return is used to generate
SQN atomically. Use C11 atomics with RELAXED ordering for outbound SQN
update instead of rte_atomic ops which enforce unnecessary barriers on
aarch64.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-05-11 13:17:43 +02:00
Nicolas Chautru
cc29fea1ca bbdev: fix doxygen comments
Several doxygen markup were incorrect in header files.

Fixes: 4935e1e9f7 ("bbdev: introduce wireless base band device lib")
Cc: stable@dpdk.org

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-05-11 13:17:43 +02:00
Ferruh Yigit
867b49d17a ring: fix build for gcc O1 optimization
Can be reproduced with "make EXTRA_CFLAGS='-O1'" command using
gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2)

Two build errors:
1)
In file included from .../build/include/rte_ring_elem.h:1093,
                 from .../lib/librte_rcu/rte_rcu_qsbr.c:21:
../lib/librte_rcu/rte_rcu_qsbr.c: In function ‘rte_rcu_qsbr_dq_reclaim’:
.../build/include/rte_ring_peek.h:282:22:
    error: ‘avail’ may be used uninitialized in this function
           [-Werror=maybe-uninitialized]
  282 |   *available = avail - n;
      |                ~~~~~~^~~
./build/include/rte_ring_peek.h:259:11: note: ‘avail’ was declared here
  259 |  uint32_t avail, head, next;
      |           ^~~~~

2)
In file included from .../build/include/rte_ring_elem.h:1093,
                 from .../build/include/rte_ring.h:405,
                 from .../app/test/test_ring_stress.h:13,
                 from .../app/test/test_ring_stress_impl.h:5,
                 from .../app/test/test_ring_peek_stress.c:5:
.../app/test/test_ring_peek_stress.c: In function ‘_st_ring_enqueue_bulk’:
.../build/include/rte_ring_peek.h:80:22:
    error: ‘free’ may be used uninitialized in this function
           [-Werror=maybe-uninitialized]
   80 |   *free_space = free - n;
      |                 ~~~~~^~~
.../build/include/rte_ring_peek.h:60:11: note: ‘free’ was declared here
   60 |  uint32_t free, head, next;
      |           ^~~~

The cases shouldn't be hit, and it looks like there is already logic
error if it has been hit, but assigning 'avail' & 'free' to '0' to fix
the build error.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-05-11 19:20:54 +02:00
David Marchand
dd6275a424 telemetry: fix error log output
Caught while running testpmd:
No telemetry legacy support- No legacy callbacks, legacy socket not createdInteractive-mode selected

Add missing \n.

Fixes: 6dd571fd07 ("telemetry: introduce new functionality")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-05-11 18:58:14 +02:00
David Marchand
a1d8f30925 telemetry: fix build for armv7
telemetry can not depend on EAL anymore but it still wants to get arch
headers.
We directly point at the right source directories by using the same logic
than EAL. However the special case of armv7 has been missed.

Fix this by defaulting ARCH_DIR to RTE_ARCH.

Caught on OBS:
[  162s]   SYMLINK-FILE include/rte_telemetry.h
[  162s]   CC telemetry.o
[  162s]   CC telemetry_data.o
[  162s]   CC telemetry_legacy.o
[  162s] .../lib/librte_telemetry/telemetry.c:15:10: fatal error:
 rte_spinlock.h: No such file or directory
[  162s]  #include <rte_spinlock.h>
[  162s]           ^~~~~~~~~~~~~~~~
[  162s] compilation terminated.

Fixes: 6dd571fd07 ("telemetry: introduce new functionality")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-05-11 17:44:13 +02:00
Bing Zhao
b341a09c1d mem: fix overflow on allocation
The size checking is done in the caller. The size parameter is an
unsigned (64b wide) right now, so the comparison with zero should be
enough in most cases. But it won't help in the following case.
If the allocating request input a huge number by mistake, e.g., some
overflow after the calculation (especially subtraction), the checking
in the caller will succeed since it is not zero. Indeed, there is not
enough space in the system to support such huge memory allocation.
Usually it will return failure in the following code. But if the
input size is just a little smaller than the UINT64_MAX, like -2 in
signed type.
The roundup will cause an overflow and then "reset" the size to 0,
and then only a header (128B now) with zero length will be returned.
The following will be the previous allocation header.
It should be OK in most cases if the application won't access the
memory body. Or else, some critical issue will be caused and not easy
to debug. So this issue should be prevented at the beginning, like
other big size failure, NULL pointer should be returned also.

Fixes: fdf20fa7be ("add prefix to cache line macros")
Cc: stable@dpdk.org

Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-05-11 17:44:13 +02:00
Phil Yang
205032bbfc service: relax barriers with C11 atomics
The runstate, comp_runstate and app_runstate are used as guard variables
in the service core lib. To guarantee the inter-threads visibility of
these guard variables, it uses rte_smp_r/wmb. This patch use c11 atomic
built-ins to relax these barriers.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 13:21:54 +02:00
Phil Yang
41e8227e20 service: optimize with C11 atomics
The num_mapped_cores is used as a statistics. Use c11 atomics with
RELAXED ordering for num_mapped_cores instead of rte_atomic ops which
enforce unnessary barriers on aarch64.

Replace execute_lock operations to spinlock_try_lock to avoid duplicate
code.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 13:21:54 +02:00
Phil Yang
6c8d14ffbb service: remove redundant code
The service id validation is duplicated, remove the redundant code
in the calling functions.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 13:21:54 +02:00
Phil Yang
7a0ad72f6e service: remove rte prefix from static functions
clean up rte prefix from static functions.
remove unused parameter for service_dump_one function.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 13:21:54 +02:00
Honnappa Nagarahalli
5c76111f06 service: fix identification of service running on other lcore
The logic to identify if the MT unsafe service is running on another
core can return -EBUSY spuriously. In such cases, running the service
becomes costlier than using atomic operations. Assume that the
application passes the right parameters and reduce the number of
instructions for all cases.

Cc: stable@dpdk.org
Fixes: 8d39d3e237 ("service: fix race in service on app lcore function")

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 13:17:05 +02:00
Honnappa Nagarahalli
18cae99cb9 service: fix race condition for MT unsafe service
A MT unsafe service might get configured to run on another core
while the service is running currently. This might result in the
MT unsafe service running on multiple cores simultaneously. Use
'execute_lock' always when the service is MT unsafe.

If the service is known to be mapped on a single lcore,
setting the service capability to MT safe will avoid taking
the lock and improve the performance.

Fixes: e9139a32f6 ("service: add function to run on app lcore")
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2020-05-11 09:33:45 +02:00
Bruce Richardson
293c53d8b2 eal: add telemetry callbacks
EAL now registers commands to provide some basic info from EAL.

Example:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.05.0-rc0", "pid": 72662, "max_output_len": 16384}
--> /
{"/": ["/", "/eal/app_params", "/eal/params", "/ethdev/link_status", \
    "/ethdev/list", "/ethdev/xstats", "/help", "/info", "/rawdev/list", \
    "/rawdev/xstats"]}
--> /eal/app_params
{"/eal/app_params": ["-i"]}
--> /eal/params
{"/eal/params": ["./app/dpdk-testpmd"]}

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
e122b0bff9 eal: remove option registration infrastructure
As Telemetry no longer uses rte_option, and was the only user of this
infrastructure, it can now be removed.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
eeb486f3ba eal: add telemetry as dependency
This patch moves telemetry further down the build, and adds it as a
dependency for EAL. Telemetry V2 is now configured to build by default,
and the legacy support is built when the telemetry config flag is set.

Telemetry now has EAL flags, shown below:
"--telemetry" = Enables telemetry (this is default if no flags given)
"--no-telemetry" = Disables telemetry

When telemetry is enabled, it will attempt to open the new socket
version, and also the legacy support socket (this will depend on Jansson
external dependency and telemetry config flag, as before).

Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
63e7cb1bf1 telemetry: remove redundant code
This patch removes the existing telemetry files, which are now redundant
as the new version of telemetry has backward compatibility for their
functionality.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
Ciara Power
b80fe1805e telemetry: introduce backward compatibility
The new telemetry will now open a socket using the old telemetry path,
to ensure backward compatibility. This is not yet initialised, as it
would clash with the existing telemetry, to be removed in a later patch.
This means that both old and new telemetry socket interfaces are
handled in a common way.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:15 +02:00
Ciara Power
b1ad0e1245 rawdev: add telemetry callbacks
The rawdev library now registers commands with telemetry, and
implements the corresponding callback functions. These allow a list of
rawdev devices and xstats for a rawdev port to be queried.

An example usage, with ioat rawdev driver instances, is shown below:

Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.05.0-rc0", "pid": 65777, "max_output_len": 16384}
--> /
{"/": ["/", "/ethdev/link_status", "/ethdev/list", "/ethdev/xstats", \
    "/help", "/info", "/rawdev/list", "/rawdev/xstats"]}
--> /rawdev/list
{"/rawdev/list": [0, 1, 2, 3, 4, 5]}
--> /rawdev/xstats,0
{"/rawdev/xstats": {"failed_enqueues": 0, "successful_enqueues": 0, \
    "copies_started": 0, "copies_completed": 0}}

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:09 +02:00
Bruce Richardson
c190daedb9 ethdev: add telemetry callbacks
The ethdev library now registers commands with telemetry, and
implements the callback functions. These commands allow the list of
ethdev ports and the xstats and link status for a port to be queried.

An example using ethdev commands is shown below:

Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.05.0-rc0", "pid": 64379, "max_output_len": 16384}
--> /
{"/": ["/", "/ethdev/link_status", "/ethdev/list", "/ethdev/xstats", \
    "/help", "/info"]}
--> /ethdev/list
{"/ethdev/list": [0, 1, 2, 3]}
--> /ethdev/link_status,0
{"/ethdev/link_status": {"status": "UP", "speed": 10000, "duplex": \
    "full-duplex"}}
--> /ethdev/xstats,0
{"/ethdev/xstats": {"rx_good_packets": 0, "tx_good_packets": 0, \
    <snip>
    "tx_priority7_xon_to_xoff_packets": 0}}

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:01 +02:00
Ciara Power
f38748736e telemetry: add default callback commands
The default commands are now added to provide the list of commands
available, help text for a specified command, and also information
about DPDK and telemetry.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:56:33 +02:00
Bruce Richardson
ed1bfad7d3 telemetry: add functions for returning callback data
The functions added in this patch will help applications build
up data in reply to a telemetry request.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:54:25 +02:00
Bruce Richardson
6dd571fd07 telemetry: introduce new functionality
This patch introduces a new telemetry connection socket and handling
functionality. Like the existing telemetry implementation (which is
unaffected by this change) it uses a unix socket, but unlike the
existing one it does not have a fixed list of commands - instead
libraries or applications can register telemetry commands and callbacks
to provide a full-extensible solution for all kinds of telemetry across
DPDK.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:53:57 +02:00
Bruce Richardson
52af6ccb2b telemetry: add utility functions for creating JSON
The functions added in this patch will make it easier for telemetry
to convert data to correct JSON responses to telemetry requests.
Tests are also  added for these json utility functions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:52:41 +02:00
Bruce Richardson
07c1b6925b telemetry: invert dependency on metrics library
Rather than having the telemetry library depend on the metrics
lib we invert the dependency so that metrics instead depends
on telemetry lib, and registers the needed functions with it
at init time. This prepares the way for a cleaner telemetry
architecture to be applied in later patches.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:52:00 +02:00
Ciara Power
bb8f5fc317 metrics: reduce telemetry code
The telemetry code that was moved into the metrics library can be
shortened, while still maintaining the same functionality.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:50:32 +02:00
Ciara Power
c5b7197f66 telemetry: move some functions to metrics library
This commit moves some of the telemetry library code to a new file in
the metrics library. No modifications are made to the moved code,
except what is needed to allow it to compile and run. The additional
code in metrics is built only when the Jansson library is  present.
Telemetry functions as normal, using the functions from the
metrics_telemetry file. This move will enable code be reused by the new
version of telemetry in a later commit, to support backward
compatibility with the existing telemetry usage.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:46:18 +02:00
Bruce Richardson
44dfb297af build: add arch-specific header path to global includes
The global include path, which is used by anything built before EAL,
points to the EAL header files so they utility macros etc. can be used
anywhere in DPDK. This path included the OS-specific EAL header files,
but not the architecture-specific ones. This patch moves the selection
of target architecture to the top-level meson.build file so that the
global include can reference that.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:45:02 +02:00
Kevin Laatz
dec44d4110 eal/x86: add more CPU flags
This patch adds CPU flags which will enable the detection of ISA
features available on more recent x86 based CPUs.

The CPUID leaf information can be found in
Table 1-2. "Information Returned by CPUID Instruction" of this document:
https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

The following CPU flags are added in this patch:
    - AVX-512 doubleword and quadword instructions.
    - AVX-512 integer fused multiply-add instructions.
    - AVX-512 conflict detection instructions.
    - AVX-512 byte and word instructions.
    - AVX-512 vector length instructions.
    - AVX-512 vector bit manipulation instructions.
    - AVX-512 vector bit manipulation 2 instructions.
    - Galois field new instructions.
    - Vector AES instructions.
    - Vector carry-less multiply instructions.
    - AVX-512 vector neural network instructions.
    - AVX-512 for bit algorithm instructions.
    - AVX-512 vector popcount instructions.
    - Cache line demote instructions.
    - Direct store instructions.
    - Direct store 64B instructions.
    - AVX-512 two register intersection instructions.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-05-07 14:51:06 +02:00
Pallavi Kadam
5ebf83784d eal/windows: support logging
Initialize logging on Windows to send log output
to the console.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Reviewed-by: Tasnim Bashar <tbashar@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2020-05-07 12:18:18 +02:00
Pallavi Kadam
98e792a35c eal/windows: add fnmatch implementation
Fnmatch implementation is required on Windows to support
log level arguments specified with a globbing pattern.
The source file is with BSD-3-Clause license.
https://github.com/lattera/freebsd/blob/master/usr.bin/csup/fnmatch.c

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Reviewed-by: Tasnim Bashar <tbashar@mellanox.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2020-05-07 12:18:17 +02:00
Pavan Nikhilesh
a5f30c925b eventdev: fix probe and remove for secondary process
When probing event device in secondary process skip reinitializing
the device data structure as it is already done in primary process.

When removing event device in secondary process skip closing the
event device as it should be done by primary process.

Fixes: 322d0345c2 ("eventdev: implement PMD registration functions")
Cc: stable@dpdk.org

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-05-02 12:31:57 +02:00
Joyce Kong
3fc1d87c2a virtio: use one way barrier for split vring avail index
In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
and backend are assumed to be implemented in software, that is they can
run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers like rte_smp_r/wmb, other than
rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1)
and yields better performance.
For the above case, this patch helps yielding even better performance
by replacing the two-way barriers with C11 one-way barriers for avail
index in split ring.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Joyce Kong
ea5207c158 virtio: use one way barrier for split vring used index
In case VIRTIO_F_ORDER_PLATFORM(36) is not negotiated, then the frontend
and backend are assumed to be implemented in software, that is they can
run on identical CPUs in an SMP configuration.
Thus a weak form of memory barriers like rte_smp_r/wmb, other than
rte_cio_r/wmb, is sufficient for this case(vq->hw->weak_barriers == 1)
and yields better performance.
For the above case, this patch helps yielding even better performance
by replacing the two-way barriers with C11 one-way barriers for used
index in split ring.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Marvin Liu
faa9867c4d vhost: use binary search in address conversion
If Tx zero copy enabled, gpa to hpa mapping table is updated one by
one. This will harm performance when guest memory backend using 2M
hugepages. Now utilize binary search to find the entry in mapping
table, meanwhile set the threshold to 256 entries for linear search.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Marvin Liu
20fd2f91cf vhost: utilize dynamic memory allocator
Replace dynamic memory allocator with dpdk memory allocator.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Xuan Ding
715070ea10 vhost: prevent zero-copy with incompatible client mode
In server mode, virtio-user inits under the assumption that vhost-user
supports a list of features. However, this could be problematic when
in_order feature is negotiated but not supported by vhost-user when
enables dequeue_zero_copy later.

Add handling when vhost-user enables dequeue_zero_copy as client.

Fixes: 64ab701c3d ("vhost: add vhost-user client mode")
Cc: stable@dpdk.org

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Phil Yang
7ffe400019 vhost: optimize broadcast RARP sync with C11 atomic
The rarp packet broadcast flag is synchronized with rte_atomic_XX APIs
which is a full barrier, DMB, on aarch64. This patch optimized it with
c11 atomic one-way barrier.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
Roland Qi
41f32b052c vhost: fix peer close check
In process_slave_message_reply(), there is a
possibility that receiving a peer close
message instead of a real message response.

This patch targeting to handle the peer close
scenario and report the correct error message.

Fixes: a277c71598 ("vhost: refactor code structure")
Cc: stable@dpdk.org

Signed-off-by: Roland Qi <roland.qi@ucloud.cn>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-05-05 15:54:26 +02:00
David Christensen
67889d1130 eal/ppc: fix build with gcc 9.3
Building DPDK on Ubuntu 20.04 with GCC 9.3.0 results in a "subscript is
outside array bounds" message in rte_memcpy function.  The build error
is caused by an interaction between __builtin_constant_p and
"-Werror=array-bounds" as described in this bugzilla:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90387

Modify the code to disable the array-bounds check for GCC versions 9.0
to 9.3.

Cc: stable@dpdk.org

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2020-05-06 18:12:57 +02:00
Olivier Matz
b2aa2c9723 kvargs: fix invalid token parsing on FreeBSD
The behavior of strtok_r() is not the same between GNU libc and FreeBSD
libc: in the first case, the context is set to "" when the last token is
returned, while in the second case it is set to NULL.

On FreeBSD, the current code crashes because we are dereferencing a NULL
pointer (ctx1). Fix it by first checking if it is NULL. This works with
both GNU and FreeBSD libc.

Fixes: ffcf831454 ("kvargs: fix buffer overflow when parsing list")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Zhimin Huang <zhiminx.huang@intel.com>
2020-05-06 15:22:19 +02:00
Phil Yang
b2f8a22e79 trace: fix build with gcc 10
Prevent from writing beyond the allocated memory.

GCC 10 compiling output:
eal_common_trace_utils.c: In function 'eal_trace_dir_args_save':
eal_common_trace_utils.c:290:24: error: '__builtin___sprintf_chk'   \
	may write a terminating nul past the end of the destination \
	[-Werror=format-overflow=]
  290 |  sprintf(dir_path, "%s/", optarg);
      |                        ^

Fixes: 8af866df8d ("trace: add trace directory configuration parameter")

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Lijian Zhang <lijian.zhang@arm.com>
Tested-by: Lijian Zhang <lijian.zhang@arm.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-05-06 15:07:18 +02:00
David Marchand
3df4282917 trace: remove string duplication
No need to duplicate an untouched string.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
2020-05-06 15:07:18 +02:00
David Marchand
970a407648 trace: remove limitation on patterns number
There is nothing performance sensitive in this list, use dynamic
allocations and remove the arbitrary limit on the number of trace
patterns a user can pass.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
2020-05-06 15:07:07 +02:00
David Marchand
d73b9f83cd trace: remove unneeded checks in internal API
The trace framework can be configured via 4 EAL options:
- --trace which calls eal_trace_args_save,
- --trace-dir which calls eal_trace_dir_args_save,
- --trace-bufsz which calls eal_trace_bufsz_args_save,
- --trace-mode which calls eal_trace_mode_args_save.

Those 4 internal callbacks are getting passed a non NULL value:
optarg won't be NULL since those options are declared with
required_argument (man getopt_long).

eal_trace_bufsz_args_save() already trusted passed value, align the other
3 internal callbacks.

Coverity issue: 357768
Fixes: 8c8066ea6a ("trace: add trace mode configuration parameter")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
2020-05-06 13:50:32 +02:00
David Marchand
b86aebcb6f trace: avoid confusion on optarg
Prefer a local name to optarg which is a global symbol from the C library.

Fixes: 8c8066ea6a ("trace: add trace mode configuration parameter")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
2020-05-06 13:50:32 +02:00
David Marchand
ebaee64097 trace: simplify trace point headers
Invert the current trace point headers logic by making
rte_trace_point_register.h include rte_trace_point.h.

There is no more need for a RTE_TRACE_POINT_REGISTER_SELECT special macro
since including rte_trace_point_register.h itself means we want to
register trace points.

The unexplained "provider" notion is removed from the documentation and
rte_trace_point_provider.h is merged into rte_trace_point.h.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-05-06 13:50:32 +02:00