Commit Graph

6660 Commits

Author SHA1 Message Date
Savinay Dharmappa
ac6fcb841b sched: update subport rate dynamically
Add support to update subport rate dynamically.

Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-15 02:13:08 +02:00
Savinay Dharmappa
5f757d8fcc sched: introduce subport profile add function
API to add new subport bandwidth profile.

Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-15 02:11:55 +02:00
Savinay Dharmappa
0ea4c6afca sched: add subport profile table
Add subport profile table to internal port data structure
and update the port config function.

Signed-off-by: Savinay Dharmappa <savinay.dharmappa@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-15 02:11:50 +02:00
Dmitry Kozlyuk
841dfdd06d cmdline: support Windows
Implement terminal handling, input polling, and vdprintf() for Windows.

Because Windows I/O model differs fundamentally from Unix and there is
no concept of character device, polling is simulated depending on the
underlying input device. Supporting non-terminal input is useful for
automated testing.

Windows emulation of VT100 uses "ESC [ E" for newline instead of
standard "ESC E", so add a workaround.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
f40a74cfcf eal/windows: improve compatibility networking headers
Extend compatibility header system to support librte_cmdline.

pthread.h has to include windows.h, which exposes struct in_addr, etc.
conflicting with compatibility headers. WIN32_LEAN_AND_MEAN macro
is required to disable this behavior. Use rte_windows.h to define
WIN32_LEAN_AND_MEAN for pthread library.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
b5741b5704 cmdline: add internal wrapper for vdprintf
Add internal wrapper for vdprintf(3) that is only available on Unix.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
9251cd97a6 cmdline: add internal wrappers for character input
poll(3) is a purely Unix facility, so it cannot be directly used by
common code. read(2) is limited in device support outside of Unix.
Create wrapper functions and implement them for Unix.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
7b5f4e1d30 cmdline: add internal wrappers for terminal handling
Add functions that set up, save, and restore terminal parameters.
Use existing code as Unix implementation.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
51fcb6a1fe cmdline: make implementation logically opaque
struct cmdline exposes platform-specific members it contains, most
notably struct termios that is only available on Unix. While ABI
considerations prevent from hinding the definition on already supported
platforms, struct cmdline is considered logically opaque from now on.
Add a deprecation notice targeted at 20.11.

* Remove tests checking struct cmdline content as meaningless.

* Fix missing cmdline_free() in unit test.

* Add cmdline_get_rdline() to access history buffer indirectly.
  The new function is currently used only in tests.

Suggested-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-15 00:39:10 +02:00
Dmitry Kozlyuk
f4cbdbc7fb eal/windows: implement alarm API
Implementation is based on waitable timers Win32 API. When timer is set,
a callback and its argument are supplied to the OS, while timer handle
is stored in EAL alarm list. When timer expires, OS wakes up the
interrupt thread and runs the callback. Upon completion it removes the
alarm.

Waitable timers must be set from the thread their callback will run in,
eal_intr_thread_schedule() provides a way to schedule asyncronuous code
execution in the interrupt thread. Alarm module builds synchronous timer
setup on top of it.

Windows alarms are not a type of DPDK interrupt handle and do not
interact with interrupt module beyond executing in the same thread.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2020-10-14 22:54:04 +02:00
Dmitry Kozlyuk
5c016fc020 eal/windows: add interrupt thread skeleton
Windows interrupt support is based on IO completion ports (IOCP).
Interrupt thread would send the devices requests to notify about
interrupts and then wait for any request completion. Add skeleton code
of this model without any hardware support.

Another way to wake up the interrupt thread is APC (asynchronous procedure
call), scheduled by any other thread via eal_intr_thread_schedule().
This internal API is intended for alarm implementation.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2020-10-14 22:48:38 +02:00
Ting Xu
99541c3028 table: fix hash for 32-bit
When create softnic hash table with 16 keys, it failed on 32-bit
environment, because the pointer field in structure rte_bucket_4_16
is only 32 bits. Add a padding field in 32-bit environment to keep
the structure to a multiple of 64 bytes. Apply this to 8-byte and
32-byte key hash function as well.

Fixes: 8aa327214c ("table: hash")
Cc: stable@dpdk.org

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-14 14:42:29 +02:00
Konstantin Ananyev
c5cf148d89 acl: deduplicate AVX512 code
Current rte_acl_classify_avx512x32() and rte_acl_classify_avx512x16()
code paths are very similar. The only differences are due to
256/512 register/instrincts naming conventions.
So to deduplicate the code:
  - Move common code into “acl_run_avx512_common.h”
  - Use macros to hide difference in naming conventions

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:37:51 +02:00
Konstantin Ananyev
6fba1c8ba0 acl: optimize AVX512 classify with 4 bytes loads
With current ACL implementation first field in the rule definition
has always to be one byte long. Though for optimising classify
implementation it might be useful to do 4B reads
(as we do for rest of the fields).
So at build phase, check user provided field definitions to determine
is it safe to do 4B loads for first ACL field.
Then at run-time this information can be used to choose classify
behavior.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:01 +02:00
Konstantin Ananyev
45da22e42e acl: add 512-bit AVX512 classify method
Introduce classify implementation that uses AVX512 specific ISA.
rte_acl_classify_avx512x32() is able to process up to 32 flows in parallel.
It uses 512-bit width registers/instructions and provides higher
performance then rte_acl_classify_avx512x16(), but can cause
frequency level change.
Note that for now only 64-bit version is supported.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:01 +02:00
Konstantin Ananyev
867d0d3649 acl: select 256-bit AVX512 classify method by default
On supported platforms, set RTE_ACL_CLASSIFY_AVX512X16 as
default ACL classify algorithm.
Note that AVX512X16 implementation uses 256-bit registers/instincts only
to avoid possibility of frequency drop.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:01 +02:00
Konstantin Ananyev
b64c2295f7 acl: add 256-bit AVX512 classify method
Introduce classify implementation that uses AVX512 specific ISA.
rte_acl_classify_avx512x16() is able to process up to 16 flows in parallel.
It uses 256-bit width registers/instructions only
(to avoid frequency level change).
Note that for now only 64-bit version is supported.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
7c6cca6b60 acl: add infrastructure for AVX512 classify methods
Add necessary changes to support new AVX512 specific ACL classify
algorithm:
 - changes in meson.build to check that build tools
   (compiler, assembler, etc.) do properly support AVX512.
 - run-time checks to make sure target platform does support AVX512.
 - dummy rte_acl_classify_avx512() for targets where AVX512
   implementation couldn't be properly supported.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
0cea36d689 acl: rework classify method selection
Right now ACL library determines best possible (default) classify method
on a given platform with special constructor function rte_acl_init().
This patch makes the following changes:
 - Move selection of default classify method into a separate private
   function and call it for each ACL context creation (rte_acl_create()).
 - Remove library constructor function
 - Make rte_acl_set_ctx_classify() to check that requested algorithm
   is supported on given platform.

The purpose of these changes to improve and simplify algorithm selection
process and prepare ACL library to be integrated with the
max SIMD bitwidth series in discussion.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
ad20877a30 acl: remove classify methods count enum
Removal of unused enum value (RTE_ACL_CLASSIFY_NUM).
This enum value is not used inside DPDK, while it prevents
to add new classify algorithms without causing an ABI breakage.

Note that this change introduce a formal ABI incompatibility
with previous versions of ACL library.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-10-14 14:23:00 +02:00
Konstantin Ananyev
85348c3e7d acl: fix x86 build for compiler without AVX2
Right now we define dummy version of rte_acl_classify_avx2()
when both X86 and AVX2 are not detected, though it should be
for non-AVX2 case only.

Fixes: e53ce4e413 ("acl: remove use of weak functions")
Cc: stable@dpdk.org

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-10-14 14:23:00 +02:00
Vladimir Medvedkin
afd9edb0d3 eal/x86: introduce type for AVX 512-bit
New data type to manipulate 512 bit AVX values.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-10-14 14:23:00 +02:00
Tal Shnaiderman
bbdab351ee eal/windows: export all built functions for clang
export for clang build all the functions currently built
on Windows and listed in rte_eal_version.map by adding
them to rte_eal_exports.def.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-10-14 11:49:33 +02:00
Mairtin o Loingsigh
17a937baed net: add CRC AVX512 implementation
This patch enables the optimized calculation of CRC32-Ethernet and
CRC16-CCITT using the AVX512 and VPCLMULQDQ instruction sets. This CRC
implementation is built if the compiler supports the required instruction
sets. It is selected at run-time if the host CPU, again, supports the
required instruction sets.

Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com>
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
2020-10-13 19:26:15 +02:00
Mairtin o Loingsigh
ef94569cf9 net: add CRC implementation runtime selection
This patch adds support for run-time selection of the optimal
architecture-specific CRC path, based on the supported instruction set(s)
of the CPU.

The compiler option checks have been moved from the C files to the meson
script. The rte_cpu_get_flag_enabled function is called automatically by
the library at process initialization time to determine which
instructions the CPU supports, with the most optimal supported CRC path
ultimately selected.

Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com>
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-13 19:26:03 +02:00
Wei Hu (Xavier)
b15a936d75 eal/arm64: update CPU flags
ARM64 Linux kernel updated the CPU flags using the HWCAP scheme.
The related marco definition can be found in linux kernel:
  arch/arm64/include/uapi/asm/hwcap.h

This patch incorporates those changes to the EAL library.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-10-13 17:52:11 +02:00
Ruifeng Wang
e9b9739264 config: remap flags used for Arm platforms
RTE_ARCH_xx flags are used to distinguish platform architectures.
These flags can be used to pick different code paths for different
architectures at compile time.
For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM,
RTE_ARCH_ARMv7 and RTE_ARCH_ARM64.
RTE_ARCH_ARM64 is for 64-bit aarch64 platforms,
and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms.
RTE_ARCH_ARMv7 is for ARMv7 platforms as its name suggested.

The issue is meaning of RTE_ARCH_ARM is not clear enough.
Because no info about platform word length is included in the name.
To make the flag names more clear, a naming scheme is proposed.

RTE_ARCH_ARM (all Arm platforms)
    |
    +----RTE_ARCH_32 (New. 32-bit platforms of all architectures)
    |        |
    |        +----RTE_ARCH_ARMv7 (ARMv7 platforms)
    |        |
    |        +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine)
    |
    +----RTE_ARCH_64 (64-bit platforms of all architectures)
             |
             +----RTE_ARCH_ARM64 (64-bit Arm platforms)

RTE_ARCH_32 will be explicitly defined for 32-bit platforms.

To fit into the new naming scheme, current usage of RTE_ARCH_ARM in
project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32).

Matching flags for other architectures are:
RTE_ARCH_X86
    |
    +----RTE_ARCH_32
    |        |
    |        +----RTE_ARCH_I686
    |        |
    |        +----RTE_ARCH_X86_X32
    |
    +----RTE_ARCH_64
             |
             +----RTE_ARCH_X86_64

RTE_ARCH_PPC_64 ---- RTE_ARCH_64

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
2020-10-13 16:35:48 +02:00
Radu Nicolau
8a00dfc738 eal: add write combining store
Add rte_write32_wc and rte_write32_wc_relaxed functions
that implement 32bit stores using write combining memory protocol.
Provided generic stubs and x86 implementation.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-13 14:11:16 +02:00
Nick Connolly
9d42642e86 mem: fix allocation failure on non-NUMA kernel
Running dpdk-helloworld on Linux with lib numa present, but no kernel
support for NUMA (CONFIG_NUMA=n) causes rte_service_init() to fail with
EAL: error allocating rte services array.

alloc_seg() calls get_mempolicy to verify that the allocation
has happened on the correct socket, but receives ENOSYS from
the kernel and fails the allocation.

The allocated socket should only be verified if check_numa() is true.

Fixes: 2a96c88be8 ("mem: ease init in a docker container")
Cc: stable@dpdk.org

Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-10-13 14:02:18 +02:00
Chas Williams
d98b0fc1af net: check segment pointer in raw checksum processing
If the overall pkt_len and segment lengths are out of agreement,
it is possible for the seg to be NULL after the loop. Add assert
to check this condition in debug builds. Otherwise, return failure.

Fixes: c442fed81b ("net: add function to calculate checksum in mbuf")
Cc: stable@dpdk.org

Signed-off-by: Chas Williams <3chas3@gmail.com>
2020-10-12 23:09:52 +02:00
David Marchand
1a11380bf4 eal: fix doxygen for EAL cleanup
Align rte_eal_cleanup return codes description to the rest of dpdk.

Fixes: aec9c13c52 ("eal: add function to release internal resources")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-12 14:19:05 +02:00
Min Hu (Connor)
b7ccfb09da ethdev: introduce FEC API
This patch adds Forward error correction(FEC) support for ethdev.
Introduce APIs which support query and config FEC information in
hardware.

Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Chengchang Tang <tangchengchang@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2020-10-09 13:17:43 +02:00
Huisong Li
9f6dc8592d ethdev: fix data type in TC queues
Currently, base and nb_queue in the tc_rxq and tc_txq information
of queue and TC mapping on both TX and RX paths are uint8_t.
However, these data will be truncated when queue number under a TC
is greater than 256. So it is necessary for base and nb_queue to
change from uint8_t to uint16_t.

Fixes: 89d6728c78 ("ethdev: get DCB information")
Cc: stable@dpdk.org

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-08 19:58:11 +02:00
Dekel Peled
c7870bfe09 ethdev: move RSS expansion code to mlx5 driver
Patch [1] added support for RSS flow expansion.
It was added in ethdev for public use, but until now it is used only
by MLX5 PMD.
To allow local changes in this code, this patch removes it from ethdev
and moves it to MLX5 PMD file.

[1] commit 4ed05fcd44 ("ethdev: add flow API to expand RSS flows")

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-10-08 19:58:11 +02:00
Dekel Peled
7f6a3168ed ethdev: fix RSS flow expansion in case of mismatch
Function rte_flow_expand_rss() is used to expand a flow rule with
partial pattern into several rules, to ensure all relevant packets
are matched.
It uses utility function rte_flow_expand_rss_item_complete(), to check
if the last valid item in the flow rule pattern needs to be completed.
For example the pattern "eth / ipv4 proto is 17 / end" will be completed
with a "udp" item.
This function returns "void" item in two cases:
1) The last item has empty spec, for example "eth / ipv4 / end".
2) The last itme has spec that can't be expanded for RSS.
   For example the pattern "eth / ipv4 proto is 47 / end" ends with IPv4
   item that has next protocol GRE.

In both cases the flow rule may be expanded, but in the second case such
expansion may create rules with invalid pattern.
For example "eth / ipv4 proto is 47 / udp / end".
In such a case the flow rule should not be expanded.

This patch updates function rte_flow_expand_rss_item_complete().
Return value RTE_FLOW_ITEM_TYPE_END is used to indicate the flow rule
should not be expanded.
In such a case, rte_flow_expand_rss() will return with the original flow
rule only, without any expansion.

Fixes: fc2dd8dd49 ("ethdev: fix expand RSS flows")
Cc: stable@dpdk.org

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2020-10-08 19:58:11 +02:00
Ferruh Yigit
7ae5c75f37 ethdev: check if queues are allocated before getting info
A crash is detected when '--txpkts=#' parameter provided to the testpmd,
this is because queue information is requested before queues have been
allocated.

Adding check to queue info APIs
('rte_eth_rx_queue_info_get()' & 'rte_eth_tx_queue_info_get')
to protect against similar cases.

Fixes: ba2fb4f022 ("ethdev: check if queue setup when getting queue info")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2020-10-08 19:58:11 +02:00
Stephen Hemminger
bfa63c4d7b ethdev: use mbuf bulk free API
The mbuf library now has routine to free multiple buffers.
Loop is no longer needed.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-10-08 19:58:11 +02:00
David Marchand
0e995cbcfc eal: fix experimental block for 20.11
In EAL, we try to sort the experimental symbols per the release they
were introduced in.

Fixes: 8929de043e ("service: retrieve lcore active state")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-10-08 15:20:51 +02:00
Cristian Dumitrescu
f63ba2005e pipeline: fix instruction config free
Coverity issue: 362901
Fixes: a1711f948d ("pipeline: add SWX Rx and extract instructions")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:09:28 +02:00
Cristian Dumitrescu
941717ffe1 pipeline: fix unused variable
Coverity issue: 362855
Fixes: 75634474ca ("pipeline: add SWX instruction verifier")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:09:28 +02:00
Cristian Dumitrescu
0ebe8c38a3 pipeline: fix memory free
Coverity issue: 362796, 362804, 362819, 362836, 362858, 362865, 362869
Fixes: 3ca60ceed7 ("pipeline: add SWX pipeline specification file")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:09:25 +02:00
Cristian Dumitrescu
c0c9dcef88 pipeline: fix argument check
Coverity issue: 362789
Fixes: 3ca60ceed7 ("pipeline: add SWX pipeline specification file")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:04:55 +02:00
Cristian Dumitrescu
faa4536684 pipeline: fix resource leak
Coverity issue: 362812
Fixes: b32c0a2c5e ("pipeline: add SWX table update high level API")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:01:07 +02:00
Cristian Dumitrescu
bacfdd908d pipeline: fix memory leak
Coverity issue: 362741
Fixes: b32c0a2c5e ("pipeline: add SWX table update high level API")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-08 15:00:42 +02:00
Bruce Richardson
979e29ddbb raw/ioat: rename functions to be operation-agnostic
Since the hardware supported by the ioat driver is capable of operations
other than just copies, we can rename the doorbell and completion-return
functions to not have "copies" in their names. These functions are not
copy-specific, and so would apply for other operations which may be added
later to the driver.

Also add a suitable warning using deprecation attribute for any code using
the old functions names.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
2020-10-08 14:33:20 +02:00
Erik Gabriel Carrillo
0875ec4dd5 timer: add limitation note for sync stop and reset
If a timer's callback function calls rte_timer_reset_sync() or
rte_timer_stop_sync() on another timer that is in the RUNNING state and
owned by the current lcore, the *_sync() calls will loop indefinitely.

Relatedly, if a timer's callback function calls *_sync() on another
timer that is in the RUNNING state and is owned by a different lcore,
but a timer callback function runs on that different lcore and calls
*_sync() on a timer that is in the RUNNING state and owned by the
current lcore, the two lcores will loop indefinitely.

Add a note in the rte_timer_stop_sync and rte_timer_reset_sync
documentation that indicates that these APIs should not be used inside
timer callback functions in order to avoid the hangs described above,
and suggests an alternative.

Bugzilla ID: 491
Cc: stable@dpdk.org

Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-10-08 09:43:57 +02:00
Timothy McDaniel
7fde94b2df trace: increase event CTF description buffer size
The current buffer size is not big enough to register trace points for
new additions in the eventdev subsystem.
Increase TRACE_CTF_FIELD_SIZE by 64 bytes for now.

Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-10-08 09:03:01 +02:00
Reshma Pattan
a3f9cca718 power: fix current frequency index
During power initialization the pstate cpufreq api is
not setting the initial curr_idx of pstate_power_info
to corresponding current frequency index.

Without this the idx is always 0, which is causing the
below check to pass and returns without setting the initial
min/max frequency to system max frequency and this leads to
incorrect frequency settings when power_pstate_cpufreq_set_freq()
is called in the apps.

set_freq_internal(struct pstate_power_info *pi, uint32_t idx)
{
...

 /* Check if it is the same as current */
        if (idx == pi->curr_idx)
                return 0;
...
}

scenario 1:
If system has starting scaling min/max: 1000/1000, and want to
set this to 2200/2200, the max frequency gets updated but not min.

scenario 2:
If system has starting scaling min/max: 2200/1000, and want to set
to 2200/2200, the max, min frequency was not updated. Since no change
in max that should be ok, but min was also ignored, which will be fixed
now with the new changes.

Fixes: e6c6dc0f ("power: add p-state driver compatibility")
Cc: stable@dpdk.org

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
2020-10-07 14:51:52 +02:00
Pavan Nikhilesh
ca32fa67a7 trace: add size_t as generic trace point
Add size_t as a generic trace point. Also, update
test_generic_trace_point() to validate size_t emitter.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
2020-10-07 14:44:03 +02:00
Pavan Nikhilesh
2114521cff trace: fix size_t field emitter
Add size_t CTF format metadata, this is needed by CTF analyzers to
parse the emitted CTF trace.

Fixes: 262c4ee791 ("trace: add size_t field emitter")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
2020-10-07 14:40:31 +02:00
Hemant Agrawal
b24b892895 mempool: dump handler index and name
Enhance the dump function to also print the ops index
and associated mempool ops name

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-10-06 23:44:15 +02:00
Fan Zhang
9ef2627cf6 port: remove useless assignment
This patch fixes an unused value in pcap source port by
removing the setting to the value.

Coverity issue: 362020
Fixes: d4b42133d8 ("port: add pcap file source")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
2020-10-06 23:39:34 +02:00
Ciara Power
083b0b310b ethdev: add common stats for telemetry
The ethdev library now registers a telemetry command for common ethdev
statistics.

An example usage is shown below:

Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 20.08.0-rc1", "pid": 14119, "max_output_len": 16384}
--> /ethdev/stats,0
{"/ethdev/stats": {"ipackets": 0, "opackets": 0, "ibytes": 0, "obytes": \
    0, "imissed": 0, "ierrors": 0, "oerrors": 0, "rx_nombuf": 0, \
    "q_ipackets": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \
    "q_opackets": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \
    "q_ibytes": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \
    "q_obytes": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], \
    "q_errors": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]}}

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-06 22:55:41 +02:00
Ciara Power
c933bb5177 telemetry: support array values in data object
Arrays of type uint64_t/int/string can now be included within an array
or dict. One level of embedded containers is supported. This is
necessary to allow for instances such as the ethdev queue stats to be
reported as a list of uint64_t values, rather than having multiple dict
entries with one uint64_t value for each queue stat.

The memory management APIs provided by telemetry simplify the memory
allocation/free aspect of the embedded container. The rte_tel_data_alloc
function is called in the library/app callback to return a pointer to a
container that has been allocated memory. When adding this container
to an array/dict, a parameter is passed to indicate if the memory
should be freed by telemetry after use. This will allow reuse of the
allocated memory if the library/app wishes to do so.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-06 22:55:00 +02:00
Ciara Power
5514319e7b telemetry: fix passing full params string to command
Telemetry only passed the first param to the command handler if multiple
were entered by the user, separated by commas. Telemetry is required to
pass the full params string to the command, by splitting by a comma
delimiter only once to remove the command part of the string. This will
enable future commands to take multiple param values.

Fixes: b1ad0e1245 ("rawdev: add telemetry callbacks")
Fixes: c190daedb9 ("ethdev: add telemetry callbacks")
Fixes: 6dd571fd07 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-06 22:54:58 +02:00
Yi Yang
e2d8110636 gro: support VXLAN UDP/IPv4
VXLAN UDP/IPv4 GRO can help improve VM-to-VM UDP
performance when UFO or GSO is enabled in VM, GRO
must be supported if UFO or GSO is enabled,
otherwise, performance can't get big improvement
if only GSO is there.

With this enabled in DPDK, OVS DPDK can leverage it
to improve VM-to-VM UDP performance, it will reassemble
VXLAN UDP/IPv4 fragments immediate after they are
received from a physical NIC. It is very helpful in
OVS DPDK VXLAN use case.

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
2020-10-06 21:51:03 +02:00
Yi Yang
1ca5e67408 gro: support UDP/IPv4
UDP/IPv4 GRO can help improve VM-to-VM UDP performance
when UFO or GSO is enabled in VM, GRO must be supported
if UFO or GSO is enabled, otherwise, performance can't
get big improvement if only GSO is there.

With this enabled in DPDK, OVS DPDK can leverage it
to improve VM-to-VM UDP performance, it will reassemble
UDP fragments immediate after they are received from
a physical NIC. It is very helpful in OVS DPDK VLAN use
case.

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
2020-10-06 21:51:03 +02:00
Michael Pfeiffer
ac0ad5eff8 net: calculate checksum of packet with IPv4 options
Currently, rte_ipv4_cksum() and rte_ipv4_udptcp_cksum() assume all IPv4
headers have sizeof(struct rte_ipv4_hdr) bytes. This is not true for
those (rare) packets with IPv4 options. Thus, both IPv4 and TCP/UDP
checksums are calculated wrong.

This patch fixes the issue by using the actual IPv4 header length from
the packet's IHL field.

Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-10-06 17:10:02 +02:00
Olivier Matz
e4233e1d0a mempool: promote some experimental functions as stable
Move symbols introduced in version <= 19.11 in the stable ABI.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-10-06 12:19:21 +02:00
Olivier Matz
9b427d5229 mempool: remove v20 ABI compatibility
Remove the deprecated v20 ABI of rte_mempool_populate_iova() and
rte_mempool_populate_virt().

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-10-06 12:19:21 +02:00
Joyce Kong
6ee0c53fda rcu: promote library as stable
RCU library supporting quiescent state was introduced
in 19.05 release and has been around 4 releases, it
should be mature enough to remove the experimental tag.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2020-10-06 10:31:13 +02:00
Joyce Kong
ebfe34c501 mcslock: promote as stable
Since rte_mcslock APIs were introduced in 19.08 release,
it is now possible to remove the experimental tag from:
rte_mcslock_lock()
rte_mcslock_unlock()
rte_mcslock_trylock()
rte_mcslock_is_locked()

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Phil Yang <phil.yang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2020-10-06 10:31:13 +02:00
Joyce Kong
d3a76807fc ticketlock: promote as stable
As rte_ticketlock was introduced in 19.05 release
and there were no changes in its public API since
19.11 release, it should be mature enough to remove
the experimental tag.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2020-10-06 10:31:13 +02:00
Joyce Kong
62867b7749 eal: promote wait until equal API as stable
rte_wait_until_equal_xx APIs were introduced in 19.11 release
and there were no changes in the public APIs since then, it
should be mature enough to remove the experimental tag.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2020-10-06 10:31:13 +02:00
David Marchand
aa48ddf4f0 mem: fix allocation in container with SELinux
This is something we encountered while working in an OpenShift
environment with SELinux enabled.
In this environment, a DPDK application could create/write to hugepage
files but removing them was refused.
This resulted in dirty files being reused when starting a new DPDK
application and triggered random crashes / erratic behavior.

Getting a SELinux setup can be a challenge, and even more if you add
containers to the picture :-).
So here is a reproducer for the interested testers:

  # cat >wrap.c <<EOF
  #define _GNU_SOURCE
  #include <dlfcn.h>
  #include <errno.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/stat.h>
  #include <sys/types.h>
  #include <unistd.h>

  int unlink(const char *pathname)
  {
  	static int (*orig)(const char *pathname) = NULL;
  	struct stat st;

  	if (orig == NULL)
  		orig = dlsym(RTLD_NEXT, "unlink");
  	if (strstr(pathname, "rtemap_") != NULL &&
			stat(pathname, &st) == 0) {
  		fprintf(stderr, "### refused unlink for %s\n",
  			pathname);
  		errno = EACCES;
  		return -1;
  	}
  	fprintf(stderr, "### called unlink for %s\n", pathname);
  	return orig(pathname);
  }

  int unlinkat(int dirfd, const char *pathname, int flags)
  {
  	static int (*orig)(int dirfd, const char *pathname, int flags) =
  		NULL;
  	struct stat st;

  	if (orig == NULL)
  		orig = dlsym(RTLD_NEXT, "unlinkat");
  	if (strstr(pathname, "rtemap_") != NULL &&
  			fstatat(dirfd, pathname, &st, flags) == 0) {
  		fprintf(stderr, "### refused unlinkat for %s\n",
  			pathname);
  		errno = EACCES;
  		return -1;
  	}
  	fprintf(stderr, "### called unlinkat for %s\n", pathname);
  	return orig(dirfd, pathname, flags);
  }
  EOF

  # gcc -fPIC -shared  -o libwrap.so wrap.c -ldl
  # \rm /dev/hugepages/rtemap*

  # # First run is fine
  # LD_PRELOAD=libwrap.so dpdk-testpmd -w 0000:01:00.0 -- -i
  [...]
  Configuring Port 0 (socket 0)
  Port 0: 24:6E:96:3C:52:D8
  Checking link statuses...
  Done
  testpmd>

  # # Second run we have dirty memory
  # LD_PRELOAD=libwrap.so dpdk-testpmd -w 0000:01:00.0 -- -i
  [...]
  ### refused unlinkat for rtemap_0
  [...]
  Port 0 is now not stopped
  Please stop the ports first
  Done
  testpmd>

Removing hugepage files is done in multiple places and the memory
allocation code is complex.
This fix tries to do the minimum and avoids touching other paths.

If trying to remove the hugepage file before allocating a page fails,
the error is reported to the caller and the user will see a memory
allocation error log.

Fixes: 582bed1e1d ("mem: support mapping hugepages at runtime")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-10-06 01:11:45 +02:00
Sachin Saxena
0621033073 mempool: dump socket attribute
Enhance the dump function to also print socket_id attribute
passed at creation time.

Signed-off-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-10-06 00:57:16 +02:00
Dmitry Kozlyuk
a6c824360a rcu: avoid literal suffix warning in C++ mode
Sequences like "value = %"PRIu64 (no space before PRIu64) are parsed as
a single preprocessor token, user-defined-string-literal, in C++11
onwards. While modern compilers are smart enough to parse this properly,
GCC 9.3.0 generates warnings like:

    rte_rcu_qsbr.h:555:26: warning: invalid suffix on literal; C++11
    requires a space between literal and string macro [-Wliteral-suffix]

Add spaces around format specifier macros to make public headers
compatible with C++ without causing warnings. Make similar changes in C
source for style consistency within the library.

Fixes: 64994b56c ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-10-06 00:45:37 +02:00
Gage Eads
100d9b8066 stack: promote library as stable
The stack library was first released in 19.05, and its interfaces have been
stable since their initial introduction. This commit promotes the full
interface to stable, starting with the 20.11 major version.

Signed-off-by: Gage Eads <gage.eads@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-10-05 11:56:17 +02:00
Erik Gabriel Carrillo
99cdd4b5ef timer: promote some experimental functions as stable
Some new APIs were added to the timer library in the 19.05 release, and
there have been no changes to their interfaces since then. These
functions can be considered stable enough to remove their 'experimental'
tag.

Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-10-05 11:12:52 +02:00
Ferruh Yigit
61e436e1f3 meter: remove experimental alias
Remove ABI versioning for APIs:
'rte_meter_trtcm_rfc4115_profile_config()'
'rte_meter_trtcm_rfc4115_config()'

The alias was introduced in
commit 60197bda97 ("meter: provide experimental alias for matured API")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-10-05 11:11:59 +02:00
Yunjian Wang
1f16fa99aa vfio: fix group descriptor check
The issue is that a file descriptor at 0 is a valid one. Currently
the file not found, the return value will be set to 0. As a result,
it is impossible to distinguish between a correct descriptor and a
failed return value. Fix it to return -ENOENT instead of 0.

Fixes: b758423bc4 ("vfio: fix race condition with sysfs")
Fixes: ff0b67d1c8 ("vfio: DMA mapping")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-10-05 10:08:57 +02:00
Ophir Munk
2d8eb20ad4 hash: build on Windows
Build the lib for Windows.
Export the needed function from eal.

Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-10-05 09:49:55 +02:00
Dmitry Kozlyuk
ff5db45d47 eal/windows: use bundled getopt with MinGW
Clang builds use getopt.c in librte_eal while MinGW provides
implementation as part of the toolchain. Statically linking librte_eal
to an application that depends on getopt results in undefined reference
errors with MinGW. There are no such errors with Clang, because with
Clang librte_eal actually defines getopt functions.

Use getopt.c in EAL with Clang and MinGW to get identical behavior.
Adjust code for MinGW. Incidentally, this removes a bug when free() is
called on uninitialized memory.

Fixes: 5e373e456e ("eal/windows: add getopt implementation")
Cc: stable@dpdk.org

Reported-by: Khoa To <khot@microsoft.com>
Reported-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Khoa To <khot@microsoft.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-10-05 09:12:24 +02:00
David Marchand
2a8be2ff75 pipeline: fix build with glibc < 2.26
reallocarray has been introduced in glibc 2.26 but we still support
glibc >= 2.7.
Simply replace with realloc, as the considered sizes are unlikely to
overflow.

"""
The reallocarray() function changes the size of the memory block
pointed to by ptr to be large enough for an array of nmemb elements,
each of which is size bytes.  It is equivalent to the call

       realloc(ptr, nmemb * size);

However, unlike that realloc() call, reallocarray() fails safely in
the case where the multiplication would overflow.  If such an over‐
flow occurs, reallocarray() returns NULL, sets errno to ENOMEM, and
leaves the original block of memory unchanged.
"""

Fixes: 3ca60ceed7 ("pipeline: add SWX pipeline specification file")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-02 13:49:16 +02:00
Maxime Coquelin
cacf8267cc vhost: remove dequeue zero-copy support
Dequeue zero-copy removal was announced in DPDK v20.08.
This feature brings constraints which makes the maintenance
of the Vhost library difficult. Its limitations makes it also
difficult to use by the applications (Tx vring starvation).

Removing it makes it easier to add new features, and also remove
some code in the hot path, which should bring a performance
improvement for the standard path.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-30 23:16:56 +02:00
Maxime Coquelin
eca9a0d6c8 vhost: promote vDPA API as stable
As announced in v20.08, this patch makes the vDPA
and related Vhost API stable.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-30 23:16:55 +02:00
Thomas Monjalon
fbd1913561 ethdev: remove old close behaviour
The temporary flag RTE_ETH_DEV_CLOSE_REMOVE is removed.
It was introduced in DPDK 18.11 in order to give time for PMDs to migrate.

The old behaviour was to free only queues when closing a port.
The new behaviour is calling rte_eth_dev_release_port() which does
three more tasks:
	- trigger event callback
	- reset state and few pointers
	- free all generic port resources

The private port resources must be released in the .dev_close callback.

The .remove callback should:
	- call .dev_close callback
	- call rte_eth_dev_release_port()
	- free multi-port device shared resources

Despite waiting two years, some drivers have not migrated,
so they may hit issues with the incompatible new behaviour.
After sending emails, adding logs, and announcing the deprecation,
the only last solution is to declare these drivers as unmaintained:
	ionic, liquidio, nfp
Below is a summary of what to implement in those drivers.

* The freeing of private port resources must be moved
from the ".remove(device)" function to the ".dev_close(port)" function.

* If a generic resource (.mac_addrs or .hash_mac_addrs) cannot be freed,
it must be set to NULL in ".dev_close" function to protect from
subsequent rte_eth_dev_release_port() freeing.

* Note 1:
The generic resources are freed in rte_eth_dev_release_port(),
after ".dev_close" is called in rte_eth_dev_close(), but not when
calling ".dev_close" directly from the ".remove" PMD function.
That's why rte_eth_dev_release_port() must still be called explicitly
from ".remove(device)" after calling the ".dev_close" PMD function.

* Note 2:
If a device can have multiple ports, the common resources must be freed
only in the ".remove(device)" function.

* Note 3:
The port is supposed to be in a stopped state when it is closed.
If it is not the case, it is free to the PMD implementation
how to react when trying to close a non-stopped port:
either try to stop it automatically or just return an error.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:14 +02:00
Thomas Monjalon
b142387b07 ethdev: allow drivers to return error on close
The device operation .dev_close was returning void.
This driver interface is changed to return an int.

Note that the API rte_eth_dev_close() is still returning void,
although a deprecation notice is pending to change it as well.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Sachin Saxena <sachin.saxena@oss.nxp.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:13 +02:00
Thomas Monjalon
2c65898b48 ethdev: reset device and interrupt pointers on release
The pointers .device and .intr_handle were already reset by the helper
rte_eth_dev_pci_generic_remove().
It is now made part of rte_eth_dev_release_port().

It makes rte_eth_dev_pci_release() meaningless,
so it is replaced with a call to rte_eth_dev_release_port().

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-30 19:19:13 +02:00
Manish Chopra
92c6786e85 net/qede: define PCI config space specific osals
This patch defines various PCI config space access APIs
in order to read and find IOV specific PCI capabilities.

With these definitions implemented, it enables the base
driver to do SR-IOV specific initialization and HW specific
configuration required from PF-PMD driver instance.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
2020-09-30 19:19:11 +02:00
Manish Chopra
e00d2b4cea bus/pci: query PCI extended capabilities
By adding generic API, this patch removes individual
functions/defines implemented by drivers to find extended
PCI capabilities.

Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2020-09-30 19:19:11 +02:00
Cristian Dumitrescu
d0a0096661 table: add exact match SWX table
Add the exact match table type for the SWX pipeline. Used under the
hood by the SWX pipeline table instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:10 +02:00
Cristian Dumitrescu
394813eb60 port: add source and sink SWX ports
Add the PCAP file-based source (input) and sink (output) port types
for the SWX pipeline. The sink port is typically used to implement the
packet drop pipeline action. Used under the hood by the pipeline rx
and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:10 +02:00
Cristian Dumitrescu
ecc5f43396 port: add ethernet device SWX port
Add the Ethernet device input/output port type for the SWX pipeline.
Used under the hood by the pipeline rx and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
3ca60ceed7 pipeline: add SWX pipeline specification file
Add support for building the SWX pipeline based on specification file
with syntax aligned to the P4 language. The specification file may be
generated by the P4C compiler in the future.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
b32c0a2c5e pipeline: add SWX table update high level API
High-level transaction-oriented API for SWX pipeline table updates. It
supports multi-table atomic updates, i.e. multiple tables can be
updated in a single step with only the before and after table set
visible to the packets. Uses the lower-level table update mechanisms.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
e87cf5d85e pipeline: add SWX pipeline flush
Flush the packets currently buffered by the SWX pipeline output ports.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
393b96e2aa pipeline: add SWX pipeline query API
Query API to be used by the control plane to detect the configuration
and state of the SWX pipeline and its internal objects.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
31035e87b2 pipeline: add SWX instruction optimizer
Instruction optimizer. Detects frequent patterns and replaces them
with some more powerful vector-like pipeline instructions without any
user effort. Executes at instruction translation, not at run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
75634474ca pipeline: add SWX instruction verifier
Instruction verifier. Executes at instruction translation time during
SWX pipeline build, i.e. at initialization instead of run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
11a86e2268 pipeline: add SWX instruction description
Added SWX instruction set reference table.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
b3947e25be pipeline: introduce SWX jump and return instructions
The jump instructions are either unconditional (jmp) or conditional on
positive/negative tests such as header validity (jmpv/jmpnv), table
lookup hit/miss (jmph/jmpnh), executed action (jmpa/jmpna), equality
(jmpeq/jmpneq), comparison result (jmplt/jmpgt). The return
instruction resumes the pipeline execution after action subroutine.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
c6b752cdf2 pipeline: introduce SWX extern instruction
The extern instruction calls one of the member functions of a given
extern object or it calls the given extern function. The function
arguments must be written in advance to the mailbox. The results
are available in the same place after execution.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
6077ae611d pipeline: introduce SWX table instruction
The table instruction looks up the input key into the table and then
it triggers the execution of the action found in the table entry. On
lookup miss, the default table action is executed.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
e0f51638b7 pipeline: introduce SWX SHR instruction
The shr (i.e. shift right) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
b09ba6d0a3 pipeline: introduce SWX SHL instruction
The shl (i.e. shift left) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
b4e607f9fd pipeline: introduce SWX XOR instruction
The xor (i.e. bitwise exclusive or) instruction source can be header
field (H), meta-data field (M), extern object (E) or function (F)
mailbox field, table entry action data field (T) or immediate value
(I). The destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
8f796198dc pipeline: introduce SWX or instruction
The or (i.e. bitwise or) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
650195cf96 pipeline: introduce SWX and instruction
The and (i.e. bitwise and) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
1e6bf5997c pipeline: introduce SWX cksub instruction
The cksub (i.e. checksum subtract) instruction is used to update the
1's complement sum commonly used by protocols such as IPv4, TCP or
UDP.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
7aa4569822 pipeline: introduce SWX ckadd instruction
The ckadd (i.e. checksum add) instruction is used to either compute,
verify or update the 1's complement sum commonly used by protocols
such as IPv4, TCP or UDP.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:08 +02:00
Cristian Dumitrescu
c88c629438 pipeline: introduce SWX subtract instruction
The sub (i.e. subtract) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
baf7999303 pipeline: introduce SWX add instruction
The add instruction source can be header field (H), meta-data field
(M), extern object (E) or function (F) mailbox field, table entry
action data field (T) or immediate value (I). The destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
8319c47c5f pipeline: add SWX DMA instruction
The DMA instruction handles the bulk read transfer of one header from
the table entry action data. Typically used to generate headers, i.e.
headers that are not extracted from the input packet.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
7210349d5b pipeline: add SWX move instruction
The mov (i.e. move) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
27e48ec92f pipeline: add header validate and invalidate SWX instructions
Add instructions to flag a header as valid or invalid. This flag can
be tested by the jmpv (jump if header valid) and jmpnv (jump if header
not valid) instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
99abed441d pipeline: add SWX Tx and emit instructions
Add header emit and packet transmission instructions. Emit adds to the
output packet a header that is either generated (e.g. read from table
entry by action) or extracted from the input packet. Tx ends the
pipeline processing; discard is implemented by tx to special port.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
a1711f948d pipeline: add SWX Rx and extract instructions
Add packet reception and header extraction instructions. The Rx must
be the first pipeline instruction. Each extracted header is logically
removed from the packet, then it can be read/written by instructions,
emitted into the outgoing packet or discarded.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
7d666cfa96 pipeline: add SWX pipeline instructions
The SWX pipeline instructions represent the main program that defines
the life of the packet. As packets go through tables that trigger
action subroutines, the headers and meta-data get transformed along
the way.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
e9d870dd93 pipeline: add SWX pipeline tables
Add tables to the SWX pipeline. The match fields are flexibly selected
from the headers and meta-data. The set of table actions is flexibly
selected for each table from the set of pipeline actions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
85af61ba62 pipeline: add SWX pipeline action
Add SWX actions that are dynamically-defined through instructions as
opposed to pre-defined. The actions are subroutines of the pipeline
program that triggered by table lookup. The input arguments are the
action data from the table entry (format defined by struct), the
headers and meta-data are in/out.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
1e4c88caea pipeline: add SWX extern objects and funcs
Add extern objects and functions to plug into the SWX pipeline any
functionality that cannot be efficiently implemented with existing
instructions, e.g. special checksum/ECC, crypto, meters, stats arrays,
heuristics, etc. In/out arguments are passed through mailbox with
format defined by struct.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
e719fe9e95 pipeline: add SWX headers and meta-data
Add support for dynamically-defined packet headers and meta-data to
the SWX pipeline. The header and meta-data format are defined by the
struct type they instantiate.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
99a0ba8f4c pipeline: add SWX pipeline output port
Add output ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The TX interface is single packet, with packet
batching internally for performance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
6e0ca01c93 pipeline: add SWX pipeline input port
Add input ports to the newly introduced SWX pipeline type. Each port
instantiates a port type that defines the port operations, e.g. ethdev
port, PCAP port, etc. The RX interface is single packet, with packet
batching internally for performance.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:07 +02:00
Cristian Dumitrescu
56492fd536 pipeline: add new SWX pipeline type
Add new improved Software Switch (SWX) pipeline type that supports
dynamically-defined packet headers, meta-data, actions and pipelines.
Actions and pipelines are defined through instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-10-01 18:43:06 +02:00
Yunjian Wang
d1ca6cfd3c stack: fix uninitialized variable
This patch fixes an issue that uninitialized 'success'
is used to be compared with '0'.

Coverity issue: 337676
Fixes: 3340202f59 ("stack: add lock-free implementation")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2020-09-30 21:08:39 +02:00
Steven Lariau
4ee8dc7f66 stack: relax pop CAS ordering
Replace the store-release by relaxed for the CAS success at the end of
pop. Release isn't needed, because there is not write to data that need
to be synchronized.
The only preceding write is when the length is decreased, but the length
CAS loop already ensures the right synchronization.
The situation to avoid is when a thread sees the old length but the new
list, that doesn't have enough items for pop to success.
But the CAS success on length before the pop loop ensures any core reads
and updates the latest length, preventing this situation.

The store-release is also used to make sure that the items are read
before the head is updated, in order to prevent a core in pop to read an
incorrect value because another core rewrites it with push.
But this isn't needed, because items are read only when removed from the
used list. Right after this, they are pushed to the free list, and the
store-release in push makes sure the items are read before they are
visible in the free list.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-09-30 21:08:39 +02:00
Steven Lariau
18effad9cf stack: reload head when pop fails
List head must be loaded right before continue (when failed to
find the new head).
Without this, one thread might keep trying and failing to pop items
without ever loading the new correct head.

Fixes: 7e6e609939 ("stack: add C11 atomic implementation")
Cc: gage.eads@intel.com
Cc: stable@dpdk.org

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-09-30 21:08:39 +02:00
Steven Lariau
2cdfe4e577 stack: remove redundant orderings on pop
The load-acquire of list->len on pop function is redundant.
Only the CAS success needs to be load-acquire.
It synchronizes with the store release in push, to ensure that the
updated head is visible when the new length is visible.
Without this, one thread in pop could see the increased length but the
old list, which doesn't have enough items yet for pop to succeed.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-09-30 21:08:39 +02:00
Steven Lariau
9e90bc9c71 stack: remove acquire fence on push
An acquire fence is used to make sure loads after the fence can observe
all store operations before a specific store-release.
But push doesn't read any data, except for the head which is part of a
CAS operation (the items on the list are not read).
So there is no need for the acquire barrier.

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-09-30 21:08:39 +02:00
Steven Lariau
0df8e2d2c9 stack: fix inconsistent weak/strong CAS
Fix cmpexchange usage of weak / strong.
The generated code is the same on x86 and ARM (there is no weak
cmpexchange), but the old usage was inconsistent.
For push and pop update size, weak is used because cmpexchange is inside
a loop.
For pop update root, strong is used even though cmpexchange is inside a
loop, because there may be a lot of operations to do in a loop iteration
(locate the new head).

Signed-off-by: Steven Lariau <steven.lariau@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Gage Eads <gage.eads@intel.com>
2020-09-30 21:08:39 +02:00
David Marchand
dc3cdcd69d ethdev: fix link speed helper documentation
When generating the documentation, a new warning can be seen:

.../dpdk/lib/librte_ethdev/rte_ethdev.h:2441:
  warning: argument 'link_speed' of command @param is not found in the
  argument list of rte_eth_link_speed_to_str(uint32_t speed_link)
.../dpdk/lib/librte_ethdev/rte_ethdev.h:2455: warning: The following
  parameters of rte_eth_link_speed_to_str(uint32_t speed_link) are not
  documented: parameter 'speed_link'

Align the function prototype to its doxygen description.

Fixes: fbf931c9c3 ("ethdev: format link status text")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-29 17:43:32 +02:00
Fan Zhang
2d962bb736 vhost/crypto: fix possible TOCTOU attack
This patch fixes the possible time-of-check to time-of-use (TOCTOU)
attack problem by copying request data and descriptor index to local
variable prior to process.

Also the original sequential read of descriptors may lead to TOCTOU
attack. This patch fixes the problem by loading all descriptors of a
request to local buffer before processing.

CVE-2020-14375
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:19:13 +02:00
Fan Zhang
e15b7c0112 vhost/crypto: fix data length check
This patch fixes the incorrect data length check to vhost crypto.
Instead of blindly accepting the descriptor length as data length, the
change compare the request provided data length and descriptor length
first. The security issue CVE-2020-14374 is not fixed alone by this
patch, part of the fix is done through:
"vhost/crypto: fix missed request check for copy mode".

CVE-2020-14374
Fixes: 3c79609fda ("vhost/crypto: handle virtually non-contiguous buffers")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:19:13 +02:00
Fan Zhang
409c47c7c5 vhost/crypto: fix write back source
This patch fixes vhost crypto library for the incorrect source and
destination buffer calculation in the copy mode.

Fixes: cd1e8f03ab ("vhost/crypto: fix packet copy in chaining mode")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:19:02 +02:00
Fan Zhang
b2866f4733 vhost/crypto: fix missed request check for copy mode
This patch fixes the missed request check to vhost crypto
copy mode.

CVE-2020-14376
CVE-2020-14377
Fixes: 3bb595ecd6 ("vhost/crypto: add request handler")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:17:08 +02:00
Fan Zhang
5677e68c05 vhost/crypto: fix descriptor deduction
This patch fixes the incorrect descriptor deduction for vhost crypto.

CVE-2020-14378
Fixes: 16d2e718b8 ("vhost/crypto: fix possible out of bound access")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:16:37 +02:00
Fan Zhang
57680e3449 vhost/crypto: fix pool allocation
This patch fixes the missing iv space allocation in crypto
operation mempool.

Fixes: 709521f4c2 ("examples/vhost_crypto: support multi-core")
Cc: stable@dpdk.org

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:16:37 +02:00
Maxime Coquelin
09424c3f74 vhost: fix external backends readiness
Commit d0fcc38f5f ("vhost: improve device readiness notifications")
makes the assumption that every Virtio devices are considered
ready for preocessing as soon as first queue pair is configured
and enabled.

While this is true for Virtio-net, it isn't for Virtio-scsi
and Virtio-blk.

This patch fixes this by only making this assumption for
the builtin Virtio-net backend, and restores back to previous
behaviour for other backends.

Fixes: d0fcc38f5f ("vhost: improve device readiness notifications")

Reported-by: Changpeng Liu <changpeng.liu@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-09-28 13:16:37 +02:00
Phil Yang
f5d6738c2e ethdev: use C11 atomics for link status
Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for link status update.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-25 15:42:34 +02:00
Phil Yang
e623c943eb power: use C11 atomics for power state
Since rte_atomicXX APIs are not allowed to be used, use C11 atomic
builtins for power in use state update.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Hunt <david.hunt@intel.com>
2020-09-25 15:42:29 +02:00
Phil Yang
b5bafde26c bbdev: use C11 atomics for device processing counter
Since rte_atomicXX APIs are not allowed to be used, use C11 atomic builtins
for device processing counter.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>
2020-09-25 15:37:55 +02:00
Phil Yang
c5d6c47257 eal: use C11 atomics for initialization check
Since rte_atomicXX APIs are not allowed to be used, use C11 builtins to
check if EAL is already initialized.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-09-25 15:36:17 +02:00
Radu Nicolau
84fb33fec1 build: remove deprecated cpuflag macros
Replace use of RTE_MACHINE_CPUFLAG macros with regular compiler
macros, which are more complete than those provided by DPDK, and as such
it allows new instruction sets to be leveraged without having to do
extra work to set them up in DPDK.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-25 11:13:57 +02:00
Phil Yang
f0f5d844d1 eal: remove deprecated coherent IO memory barriers
Since the 20.08 release deprecated rte_cio_*mb APIs because these APIs
provide the same functionality as rte_io_*mb APIs on all platforms, so
remove them and use rte_io_*mb instead.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-23 13:40:26 +02:00
Chengchang Tang
61efaf5b62 ethdev: support getting Rx buffer size in Rx queue info
Add a field named rx_buf_size in rte_eth_rxq_info to indicate the buffer
size used in receiving packets for HW.

In this way, upper-layer users can get this information by calling
rte_eth_rx_queue_info_get.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:38 +02:00
Ivan Dyukov
fbf931c9c3 ethdev: format link status text
There is new link_speed value introduced. It's INT_MAX value which
means that speed is unknown. To simplify processing of the value
in application, new function is added which convert link_speed to
string. Also dpdk examples have many duplicated code which format
entire link status structure to text.

This commit adds two functions:
  * rte_eth_link_speed_to_str - format link_speed to string
  * rte_eth_link_to_str - convert link status structure to string

Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-21 18:05:37 +02:00
Eugenio Pérez
46d3f57537 vhost: fix IOTLB mempool single-consumer flag
Control thread (which handles iotlb msg) and forwarding thread
both use iotlb to translate address. The former may modify the
same entry of mempool and may cause a loop in iotlb_pending_entries
list.

Bugzilla ID: 523
Fixes: d012d1f293 ("vhost: add IOTLB helper functions")
Cc: stable@dpdk.org

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Chenbo Xia
671cc679a5 vhost: add device reset status
vhost lib now does not have definition of reset status. This patch
adds the reset status definition and changes related log.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-09-18 18:55:12 +02:00
Kiran Kumar K
333a38bb2e ethdev: support encapsulation level for RSS offload
This patch reserves 2 bits as input selection to select inner and outer
encapsulation level for RSS computation. It is combined with existing
ETH_RSS_* to choose inner or outer layers.
This functionality already exists in rte_flow through level parameter in
RSS action configuration rte_flow_action_rss.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2020-09-18 18:55:12 +02:00
Haiyue Wang
976277f9ce net: adjust header length parse size
Enlarge the L3 and tunnel header length from 8-bit to 16-bit to handle
the bigger headers. And reorder the fields to avoid creating a structure
hole.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-09-18 18:55:12 +02:00
Yi Yang
b9b75d9b5c gso: fix payload unit size for UDP
Fragment offset of IPv4 header is measured in units of
8 bytes. Fragment offset of UDP fragments will be wrong
after GSO if pyld_unit_size isn't multiple of 8. Say
pyld_unit_size is 1500, fragment offset of the second
UDP fragment will be 187 (i.e. 1500 / 8), which means 1496,
and it will result in 4-byte data loss (1500 - 1496 = 4).
So UDP GRO will reassemble out a wrong packet.

Fixes: b166d4f30b ("gso: support UDP/IPv4 fragmentation")
Cc: stable@dpdk.org

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
2020-09-18 18:55:12 +02:00
Andrew Rybchenko
bf3785fbd8 net: check first segment length on SW VLAN insertion
SW VLAN insertion relies on Ethernet addresses location in contiguous
memory (do not split across mbuf segments). There is no any formal
requirements on data location and mbuf structure which guarantee it.
So, check it explicitly to avoid corrupted packets if the condition
is violated. Typically software VLAN insertion is done on Tx prepare
stage and application will get indication that the packet is invalid
and cannot be transmitted.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:10 +02:00
Nithin Dabilpuram
bf4e0faae9 ethdev: support TM for shaper config in packet mode
Some NIC hardware support shaper to work in packet mode i.e
shaping or ratelimiting traffic is in packets per second (PPS) as
opposed to default bytes per second (BPS). Hence this patch
adds support to configure shared or private shaper in packet mode,
provide rate in PPS and add related tm capabilities in port/level/node
capability structures.

This patch also updates tm port/level/node capability structures with
exiting features of scheduler wfq packet mode, scheduler wfq byte mode
and private/shared shaper byte mode.

SoftNIC PMD is also updated with new capabilities.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-09-18 18:55:10 +02:00
Ferruh Yigit
5723fbed4f ethdev: remove underscore prefix from internal API
'_rte_eth_dev_callback_process()' & '_rte_eth_dev_reset()' internal APIs
has unconventional underscore ('_') prefix.
Although this is not documented most probably this is to mark them as
internal. Since we have '__rte_internal' flag to mark this, removing '_'
from API names.

For '_rte_eth_dev_reset()', there is already a public API named
'rte_eth_dev_reset()', so renaming '_rte_eth_dev_reset()' to
'rte_eth_dev_internal_reset'.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
8682e492ed ethdev: use hairpin helper functions
Hairpin helper functions were not used by drivers, but it was used only
local to ethdev. They are:
'rte_eth_dev_is_rx_hairpin_queue()'
'rte_eth_dev_is_tx_hairpin_queue()'

Exposing them as internal APIs and update mlx5 driver (only user of
hairpin) to use them.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
cb4115cb84 ethdev: mark internal functions
Some ethdev functions are for drivers only, not for applications.

Since we have '__rte_internal' tag available now, marking internal
functions with it and moving functions to INTERNAL section in linker
script.
This is also good for documenting the internal functions.

Some internal APIs seems marked as experimental, but it doesn't make
sense to have internals APIs as experimental, updating their tag and
doxygen comments.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
47909357a0 ethdev: make device operations struct private
Hiding the 'struct eth_dev_ops' from applications.

Removing relevant deprecation notice.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
cbfc6111b5 ethdev: move inline device operations
This patch is a preparation to hide the 'struct eth_dev_ops' from
applications by moving some device operations from 'struct eth_dev_ops'
to 'struct rte_eth_dev'.

Mentioned ethdev APIs are in the data path and implemented as inline
because of performance reasons.

Exposing 'struct eth_dev_ops' to applications is bad because it is a
contract between ethdev and PMDs, not really needs to be known by
applications, also changes in the struct causing ABI breakages which
shouldn't.

To be able to both keep APIs inline and hide the 'struct eth_dev_ops',
moving device operations used in ethdev inline APIs to 'struct
rte_eth_dev' to the same level with Rx/Tx burst functions.

The list of dev_ops moved:
eth_rx_queue_count_t       rx_queue_count;
eth_rx_descriptor_done_t   rx_descriptor_done;
eth_rx_descriptor_status_t rx_descriptor_status;
eth_tx_descriptor_status_t tx_descriptor_status;

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Sachin Saxena <sachin.saxena@nxp.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
fa1f5fe4d8 ethdev: deprecate descriptor status check API
Marking 'rte_eth_rx_descriptor_done()' API as deprecated.
``rte_eth_rx_descriptor_status`` and ``rte_eth_tx_descriptor_status``
APIs can be used as replacement.

Plan is to remove the API on 21.11 release.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-18 18:55:08 +02:00
Ferruh Yigit
b895a521f9 ethdev: remove redundant license text
Redundant BSD-3 license text removed, the licensing already documented
by "SPDX-License-Identifier: BSD-3-Clause" SPDX tag.

Cc: stable@dpdk.org

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-18 18:55:08 +02:00
Nithin Dabilpuram
5dfcb68984 ethdev: mark all traffic manager API as experimental
This patch marks all traffic manager API as experimental as
per deprecation notice[1] and discussion[2] mentioned in following
threads.

[1] https://mails.dpdk.org/archives/dev/2020-May/166221.html
[2] https://mails.dpdk.org/archives/dev/2020-April/165364.html

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:08 +02:00
Thomas Monjalon
810b17d116 ethdev: allow unknown link speed
When querying the link information, the link status is
a mandatory major information.
Other boolean values are supposed to be accurate:
	- duplex mode (half/full)
	- negotiation (auto/fixed)

This API update is making explicit that the link speed information
is optional.
The value ETH_SPEED_NUM_NONE (0) was already part of the API.
The value ETH_SPEED_NUM_UNKNOWN (infinite) is added to cover
two different cases:
	- speed is not known by the driver
	- device is virtual

Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Suggested-by: Benoit Ganne <bganne@cisco.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:07 +02:00
Morten Brørup
a6f67a3b64 net: optimize ethernet address functions
* rte_is_broadcast_ether_addr():
Use binary logic instead of comparisons and boolean logic, thus reducing
the number of branches.
It now resembles rte_is_zero_ether_addr().

* rte_ether_addr_copy():
The source code modifications were discussed on the mailing list:
http://mails.dpdk.org/archives/dev/2020-June/171584.html
Remove obsolete ICC-specific code and related comment.
Restrict pointer aliasing (suggested by Jerin Jacob).
Remove superfluous "Fast" from function description headline; all DPDK
data plane functions are supposed to be fast.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-09-18 18:55:06 +02:00
Wei Hu (Xavier)
ba2fb4f022 ethdev: check if queue setup when getting queue info
This patch adds checking whether the related Tx or Rx queue has been
setup in the rte_eth_rx_queue_info_get and rte_eth_tx_queue_info_get
API function to avoid illegal address access.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-09-18 18:55:06 +02:00
Harry van Haaren
8929de043e service: retrieve lcore active state
This commit adds a new experimental API which allows the user
to retrieve the active state of an lcore. Knowing when the service
lcore is completed its polling loop can be useful to applications
to avoid race conditions when e.g. finalizing statistics.

The service thread itself now has a variable to indicate if its
thread is active. When zero the service thread has completed its
service, and has returned from the service_runner_func() function.

Suggested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-09-21 16:37:11 +02:00
David Marchand
22a2f54f25 eal: hide internal device event structure
This structure is not used in the public API.

Fixes: a753e53d51 ("eal: add device event monitor framework")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-21 10:12:10 +02:00
David Marchand
e200535c1c mem: drop mapping API workaround
Now that the pci_map_resource API is private to the PCI bus, we can drop
the compatibility workaround we had implemented in 20.08.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-21 10:12:10 +02:00
David Marchand
e1ece60956 pci: move resource mapping to the PCI bus
As reported during 20.08 work for Windows, the pci_map_resource API was
built with the assumption that its flags would be passed to mmap().

This introduced a regression when adding the rte_mem_map API as reported
in the workaround commit 9d2b245937 ("pci: keep API compatibility with
mmap values").

This API was only used in the PCI bus code, so move it there.

There is no code change happening during the move.
The only change is in the pci_map_resource description where the
additional flags are now documented as rte_mem_map API flags:
- *      The additional flags for the mapping range.
+ *      The additional rte_mem_map() flags for the mapping range.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-21 10:12:10 +02:00
David Marchand
7c0d798aab bus/pci: switch to private kernel driver enum
The rte_kernel_driver enum actually only pointed at PCI drivers and is
only used in the PCI subsystem.
Remove it from the generic device API and use a private enum in the PCI
code.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-21 10:11:44 +02:00
David Marchand
dbd3809261 ethdev: remove unused kernel driver field
This field was not generic as it was filled with PCI kernel drivers only.
It has no known in-tree user (and I could not find opensource projects
using it).

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-21 09:30:36 +02:00
Thomas Monjalon
4be717272e mbuf: remove physical address alias
Remove the deprecated buf_physaddr union field from rte_mbuf.
It is replaced with buf_iova which is at the same offset.

The single field buf_physaddr in rte_kni_mbuf is also renamed.

This concludes a 3-year process of semantic change.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:37 +02:00
Thomas Monjalon
ce627d633b mbuf: remove deprecated function and macro aliases
Remove the deprecated functions
	- rte_mbuf_data_dma_addr
	- rte_mbuf_data_dma_addr_default
which aliased the more recent functions
	- rte_mbuf_data_iova
	- rte_mbuf_data_iova_default

Remove the deprecated macros
	- rte_pktmbuf_mtophys
	- rte_pktmbuf_mtophys_offset
which aliased the more recent macros
	- rte_pktmbuf_iova
	- rte_pktmbuf_iova_offset

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:37 +02:00
Thomas Monjalon
28e3c8b286 mempool: remove physical address aliases
Remove the deprecated unioned fields physaddr and phys_addr
from the structures rte_mempool_objhdr and rte_mempool_memhdr.
They are replaced with the fields iova which are at the same offsets.

Remove the deprecated macro MEMPOOL_F_NO_PHYS_CONTIG
which is an alias of the more recent MEMPOOL_F_NO_IOVA_CONTIG.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:37 +02:00
Thomas Monjalon
72f82c4324 mem: remove physical address aliases
Remove the deprecated unioned fields phys_addr
from the structures rte_memseg and rte_memzone.
They are replaced with the fields iova which are at the same offsets.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-09-19 00:25:35 +02:00
Pawel Wodkowski
c23432a9c2 trace: fix C++ compilation
trace_mem is declared as 'void *' which triggers following error:
'...invalid conversion from ‘void*’ to ‘__rte_trace_header*’
[-fpermissive]...'

Fix this by adding proper typecast to 'struct __rte_trace_header *'.

Fixes: ebaee64097 ("trace: simplify trace point headers")
Cc: stable@dpdk.org

Signed-off-by: Pawel Wodkowski <pawelwod@gmail.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>
2020-09-17 11:13:44 +02:00
Conor Walsh
dc18be1d8b bpf: promote library as stable
The BPF lib was introduced in 18.05.
There were no changes in its public API since 19.11.
It should be mature enough to remove its 'experimental' tag.
RTE_BPF_XTYPE_NUM is also being dropped from rte_bpf_xtype to
avoid possible ABI problems in the future.

Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-09-16 18:52:55 +02:00
Stephen Hemminger
f484baeeb6 log: hide internal variable
As announced in earlier releases, rte_logs can now be made
internal to EAL.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-09-16 18:37:11 +02:00
Stephen Hemminger
2adc121ef9 log: promote rte_log_get_stream as stable
Applications will need to use this API now to get internal
state of rte_log.

Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2020-09-16 18:37:11 +02:00
Stephen Hemminger
64201f1059 eal/linux: change udev debug message
The debug message was poorly worded and did not include the
part that would be useful. I.e it never said what was being ignored.
Change it to print the message so that if udev changes format or
other subsystems need to be added then the necessary information
will be in the debug log.

Fixes: 0d0f478d04 ("eal/linux: add uevent parse and process")

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Jeff Guo <jia.guo@intel.com>
2020-09-16 18:14:35 +02:00
Phil Yang
e41d27a68d mbuf: remove atomic reference counters
Remove the deprecated refcnt_atomic union fields in
rte_mbuf and rte_mbuf_ext_shared_info structures.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-09-15 11:29:23 +02:00
Bruce Richardson
8c658761de rawdev: mark start and stop functions optional
Not all rawdevs will require a device start/stop function, so rather than
requiring such drivers to provide dummy functions, just set the
started/stopped rawdev flag from the rawdev layer and return success.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:51:16 +02:00
Bruce Richardson
13f8e4a27e rawdev: allow queue config query to return error
The driver APIs for returning the queue default config can fail if the
parameters are invalid, or other reasons, so allow them to return error
codes to the rawdev layer and from hence to the app.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:51:15 +02:00
Bruce Richardson
f574ed8116 rawdev: add private data size to queue config inputs
The queue setup and queue defaults query functions take a void * parameter
as configuration data, preventing any compile-time checking of the
parameters and limiting runtime checks. Adding in the length of the
expected structure provides a measure of typechecking, and can also be used
for ABI compatibility in future, since ABI changes involving structs almost
always involve a change in size.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:51:03 +02:00
Bruce Richardson
8db9dce72d rawdev: add private data size to config inputs
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer in the structure passed to configure
API is of the correct type - it's just checked that it is non-NULL. Adding
in the length of the expected structure provides a measure of typechecking,
and can also be used for ABI compatibility in future, since ABI changes
involving structs almost always involve a change in size.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:50:55 +02:00
Bruce Richardson
f150dd8839 rawdev: allow drivers to return error from info query
Since we now allow some parameter checking inside the driver info_get()
functions, it makes sense to allow error return from those functions to the
caller. Therefore we change the driver callback return type from void to
int.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:50:54 +02:00
Bruce Richardson
10b71caecb rawdev: add private data size to info query
Currently with the rawdev API there is no way to check that the structure
passed in via the dev_private pointer in the dev_info structure is of the
correct type - it's just checked that it is non-NULL. Adding in the length
of the expected structure provides a measure of typechecking, and can also
be used for ABI compatibility in future, since ABI changes involving
structs almost always involve a change in size.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
2020-09-11 11:50:53 +02:00
Fady Bader
790defbdb6 ethdev: build on Windows
Add ethdev and a missing dependency (meter) to the list
of libraries built on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2020-09-11 01:55:39 +02:00
Fady Bader
0e9e9e7548 ethdev: remove structs from export map
Some ethdev structs were present in .map export list.
There structs are removed from the .map file.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-11 01:55:39 +02:00
Fady Bader
5ae0d39d4a telemetry: build stubs on Windows
Telemetry didn't compile under Windows.
Empty stubs are arranged, waiting for a proper implementation.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-11 01:55:35 +02:00
Fady Bader
fc5ae478f8 eal/windows: update symbols export
The .def file is a reduced copy of the .map file.
In order to ease comparison, some lines are moved in the .def file
to be in the same order as in the .map file.

rte_eal_get_configuration is removed because it has been removed
from the .map file in DPDK 19.11.
Note: it had been removed and re-added by mistake in 20.08 .def file.

Few functions are added in the .def file to allow ethdev on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-11 01:40:36 +02:00
Fady Bader
f5192f9162 eal/windows: add stub for Rx interrupt control
Interrupts are not implemented for Windows.
In order to compile ethdev on Windows,
an empty interrupt control function stub has to be added for Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-11 01:38:26 +02:00
Fady Bader
16f0d03098 net: build on Windows
librte_net was not compiling under Windows.
To solve this, needed header files are added.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-10 21:53:48 +02:00
Fady Bader
64fb21d86b net: replace htons with constant endian swap
htons is not defined in Windows with the MinGW compiler.
htons is replaced with RTE_BE16 in order to compile under Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-09-10 21:52:28 +02:00
Fady Bader
507e1ca07b net: fix redefinition in Windows
In Windows, s_addr is defined in winsock2.h which is included by windows.h.
It is undefined in order to be defined as part of rte_ether_hdr.

Signed-off-by: Fady Bader <fady@mellanox.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-10 21:52:04 +02:00
Tal Shnaiderman
a4235b781f eal/windows: probe vdev
Add needed function calls in rte_eal_init to detect vdev PMD.

eal_option_device_parse()
rte_service_init()
rte_bus_probe()

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-09-09 14:40:41 +02:00
Tal Shnaiderman
ec6d514636 bus/vdev: build on Windows
current support will build vdev with empty MP functions
currently unsupported for Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-09-09 14:39:37 +02:00
Ciara Power
ec260aa3ad config: remove default configs used with make
Make is not supported for compiling DPDK, the config files are no
longer needed.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:11:30 +02:00
Ciara Power
3cc6ecfdfe build: remove makefiles
A decision was made [1] to no longer support Make in DPDK, this patch
removes all Makefiles that do not make use of pkg-config, along with
the mk directory previously used by make.

[1] https://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-09-08 00:09:50 +02:00
Thomas Monjalon
4f86c0ba19 version: 20.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 21.0.
The ABI major is back to normal, having only one number (21 vs 20.0).
The map files are updated to the new ABI major number (21).
The ABI exceptions are dropped.
Travis ABI check is disabled because compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2020-08-12 11:32:16 +02:00
Stephen Hemminger
156055da95 ethdev: improve API comment for MAC address addition
The comment used the term whitelist and was awkardly written.
Replace it with simpler direct description of adding a new address.
No code or API changes for this.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: John McNamara <john.mcnamara@intel.com>
2020-08-07 13:02:10 +02:00
Stephen Hemminger
95a2e18dfb kni: fix reference to master/slave process
In DPDK, the correct terms for process are primary/secondary.
This is bugfix, not a change in terms for new release.

Fixes: f2e7592c47 ("kni: fix multi-process support")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
2020-08-07 13:01:54 +02:00
Thomas Monjalon
2fca871ce7 ethdev: remove device-specific comments from VLAN API
Some confusing comments were still present from old days,
when most drivers were from Intel.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-08-05 20:01:49 +02:00
Patrick Fu
6563cf9238 vhost: fix async copy on multi-page buffers
Async copy fails when single ring buffer vector is split on multiple
physical pages. This happens because current hpa address translation
function doesn't handle multi-page buffers. A new gpa to hpa address
conversion function, which returns the hpa on the first hitting host
pages, is implemented in this patch. Async data path recursively calls
this new function to construct a multi-segments async copy descriptor
for ring buffers crossing physical page boundaries.

Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:24 +02:00
Maxime Coquelin
b53a497294 vhost: fix guest notification setting
If rte_vhost_enable_guest_notification is called before
the virtqueue is ready, the configuration is lost.

This patch fixes this by saving the guest notification
enablement value requested by the application, and apply
it before the virtqueue is made ready to the application.

Fixes: 604052ae53 ("net/vhost: support queue update")

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2020-07-30 00:41:24 +02:00
Yuying Zhang
4a67e71816 net: fix IPv6 checksum with TSO
The ol_flags check lacks of flag for IPv6 which causes checksum
flag configuration error while IPv6/TCP TSO packet is sent.
This patch fixes the issue by adding PKT_TX_TCP_SEG flag.

The rte_net_intel_cksum_flags_prepare() function prepares the
pseudo header checksum in packet data when doing checksum or TSO
offload.

Fixes: 520059a41a ("net: check fragmented headers in non-debug as well")

Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Tested-by: Xi Zhang <xix.zhang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2020-07-30 00:41:24 +02:00
Patrick Fu
819a716858 vhost: fix async callback return type
The async copy device callbacks are used by async APIs to transfer data
and check completion status. Async APIs return the number of packets
successfully processed to the caller applications and no error
(negative) value is allowed for API return value. Thus, negative return
values from async device callbacks don't have meaningful usage, while
adding overhead in checking the return value validity. This patch change
the callback return values from "int" to "uint32_t" to get aligned with
async API definition.

Fixes: 78639d5456 ("vhost: introduce async enqueue registration API")

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-07-30 00:41:23 +02:00
Parav Pandit
21587b4921 eal: introduce macro for bit definition
There are several drivers which duplicate bit generation macro.
Introduce a generic bit macros so that such drivers avoid redefining
same in multiple drivers.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
2020-07-28 18:27:46 +02:00
Yunjian Wang
a5f803c804 hash: fix out-of-memory handling in hash creation
The function rte_zmalloc_socket() could return NULL, the return
value need to be checked.

Fixes: 5915699153 ("hash: fix scaling by reducing contention")
Cc: stable@dpdk.org

Reported-by: Bin Huang <brian.huangbin@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
2020-07-27 12:53:40 +02:00