Commit Graph

7130 Commits

Author SHA1 Message Date
Dmitry Kozlyuk
89813a522e net: provide IP-related API on any OS
Users of <rte_ip.h> relied on it to provide IP-related defines,
like IPPROTO_* constants, but still had to include POSIX headers
for inet_pton() and other standard IP-related facilities.

Extend <rte_ip.h> so that it is a single header to gain access
to IP-related facilities on any OS. Use it to replace POSIX includes
in components enabled on Windows. Move missing constants from Windows
networking shim to OS shim header and include it where needed.

Remove Windows networking shim that is no longer needed.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:43 +02:00
Dmitry Kozlyuk
6c068dbd9f net: work around s_addr macro on Windows
Windows Sockets headers contain `#define s_addr S_un.S_addr`, which
conflicts with definition of `s_addr` field of `struct rte_ether_hdr`.
Prieviously `s_addr` was undefined in <rte_ether.h>, which had been
breaking access to `s_addr` field of `struct in_addr`, so some DPDK
and Windows headers could not be included in one file.

Renaming of `struct rte_ether_hdr` is planned:
https://mails.dpdk.org/archives/dev/2021-March/201444.html

Temporarily disable `s_addr` macro around `struct rte_ether_hdr`
definition to avoid conflict. Place source MAC address in both `s_addr`
and `S_un.S_addr` fields, so that access works either directly or
through the macro as defined in Windows headers.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-04-15 01:56:40 +02:00
Dmitry Kozlyuk
45d62067c2 eal: make OS shims internal
DPDK code often relies on functions and macros that are not standard C,
but are found on all platforms, even if by slightly different names.
Windows <rte_os.h> provided macros or inline definitions for such symbols.
However, when placed in public header, these symbols were unnecessarily
exposed, breaking consumer POSIX compatibility code.

Move most of the shims to <rte_os_shim.h>, a header to be used instead
of <rte_os.h> by internal code. Include it in libraries and PMDs that
previously imported shims from <rte_os.h>. Directly replace shims that
were only used inside EAL:
* index -> strchr, rindex -> strrchr
* sleep -> rte_delay_us_sleep
* strerror_r -> strerror_s

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:20 +02:00
Dmitry Kozlyuk
9ec521006d eal/windows: hide asprintf shim
Make asprintf(3) implementation for Windows private to EAL, so that it's
hidden from external consumers. It is not exposed to internal consumers
either, because they don't need asprintf() and also because callers from
other modules would have no reliable way to free allocated memory.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Khoa To <khot@microsoft.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:06 +02:00
Xueming Li
3ab385063c kvargs: add get by key
Adds a new function to get value of a specific key from kvargs list.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:22 +02:00
Xueming Li
e132ee8690 devargs: fix memory leak on parsing failure
This patch fixes memory leak in parsing error handling.

Fixes: 338327d731 ("devargs: add function to parse device layers")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:15 +02:00
Xueming Li
64051bb1f1 devargs: unify scratch buffer storage
In current design, legacy parser rte_devargs_parse() saved scratch
buffer to devargs.args while new parser rte_devargs_layers_parse() saved
to devargs.data. Code using devargs had to know the difference and
cleaned up memory accordingly - error prone.

This patch unifies scratch buffer to data field, introduces
rte_devargs_reset() function to wrap the memory clean up logic.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:08 +02:00
Stephen Hemminger
9667d97c25 pflock: add phase-fair reader writer locks
This is a new type of reader-writer lock that provides better fairness
guarantees which better suited for typical DPDK applications.
A pflock has two ticket pools, one for readers and one
for writers.

Phase-fair reader writer locks ensure that neither reader nor writer will
be starved.
Neither reader or writer are preferred, they execute in alternating
phases.
All operations of the same type (reader or writer) that acquire the lock
are handled in FIFO order.
Write operations are exclusive, and multiple read operations can be run
together (until a write arrives).

A similar implementation is in Concurrency Kit package in FreeBSD.
For more information see:
   "Reader-Writer Synchronization for Shared-Memory Multiprocessor
    Real-Time Systems",
    http://www.cs.unc.edu/~anderson/papers/ecrts09b.pdf

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-04-14 21:59:47 +02:00
Pavan Nikhilesh
206c562e4b eventdev: fix build on RHEL 7
Since queue identifier is passed as signed integer, a compilation error
is generated:
rte_event_eth_rx_adapter.c:1810:57: error: signed and unsigned type
in conditional expression [-Werror=sign-compare]
Make queue identifier as unsigned when adding it to vector data.

Bugzilla ID: 672
Fixes: d7c428e557 ("eventdev: support Rx adapter event vector")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-04-14 10:02:05 +02:00
Tyler Retzlaff
7498c2d74e eal: do not redefine asm keyword in C++
C++ forbids redefining a keyword as a macro.
The keyword asm is conditionally-supported and implementation defined,
but it seems our best guess.

In C, if asm does not exist, it is defined as __asm__
which is a GNU extension.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-13 15:32:48 +02:00
Pavan Nikhilesh
bc7d4b0346 eventdev: support Tx adapter event vector
Add event vector support for event eth Tx adapter, the implementation
receives events from the single linked queue and based on
rte_event_vector::attr_valid transmits the vector of mbufs to a given
port, queue pair.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
d7c428e557 eventdev: support Rx adapter event vector
Add event vector support for event eth Rx adapter, the implementation
creates vector flows based on port and queue identifier of the received
mbufs.
The flow id for SW Rx event vectorization will use 12-bits of queue
identifier and 8-bits port identifier when custom flow id is not set
for simplicity.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
3da4060a30 eventdev: introduce event vector Tx capability
Introduce event vector transmit capability for event eth
tx adapter.

The capability indicates that the Tx adapter is capable of
transmitting event vectors.
When rte_event_vector::union_valid is set, the Tx adapter should
transmit all the packets to the rte_event_vector::port using the
rte_event_vector::queue.
If rte_event_vector::union_valid is not set then the Tx adapter
should peek into each mbuf to get the destination port and queue
pair.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
3c838062b9 eventdev: introduce event vector Rx capability
Introduce event ethernet Rx adapter event vector capability.

If an event eth Rx adapter has the capability of
RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR then a given Rx queue
can be configured to enable event vectorization by passing the
flag RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR to
rte_event_eth_rx_adapter_queue_conf::rx_queue_flags while configuring
Rx adapter through rte_event_eth_rx_adapter_queue_add().

The max vector size, vector timeout define the vector size and
mempool used for allocating vector event are configured through
rte_event_eth_rx_adapter_queue_add. The element size of the element
in the vector pool should be equal to
    sizeof(struct rte_event_vector) + (vector_sz * sizeof(uintptr_t))

Application can use `rte_event_vector_pool_create` to create the
vector mempool used for
rte_event_eth_rx_adapter_queue_conf::vector_mp.

The Rx adapter would be responsible for vectorizing the mbufs
based on the flow, the vector limits configured by the application
and add the vector event of mbufs to the event queue set via
rte_event_eth_rx_adapter_queue_conf::ev::queue_id.
It should also mark rte_event_vector::union_valid and fill
rte_event_vector::port, rte_event_vector::queue.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
1cc44d4092 eventdev: introduce event vector capability
Introduce rte_event_vector datastructure which is capable of holding
multiple uintptr_t of the same flow thereby allowing applications
to vectorize their pipeline and reducing the complexity of pipelining
the events across multiple stages.
This approach also reduces the scheduling overhead on a event device.

Add a event vector mempool create handler to create mempools based on
the best mempool ops available on a given platform.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Shijith Thotton
a10d79a60b eventdev: introduce adapter flags for periodic mode
A timer adapter in periodic mode can be used to arm periodic timers.
This patch adds flags used to advertise capability and configure timer
adapter in periodic mode. Capability flag should be set for adapters
which support periodic mode.

Below is a programming sequence on the usage:
	/* check for periodic mode support by reading capability. */
	rte_event_timer_adapter_caps_get(...);

	/* create adapter in periodic mode by setting periodic flag
	   (RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) and resolution. */
	rte_event_timer_adapter_create_ext(...);

	/* arm periodic timer of configured resolution */
	rte_event_timer_arm_burst(...);

	/* timer event will be periodically generated at configured
	   resolution till cancel is called. */
	while (running) { rte_event_dequeue_burst(...); }

	/* cancel periodic timer which stops generating events */
	rte_event_timer_cancel_burst(...);

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-04-12 09:23:34 +02:00
Tal Shnaiderman
13e06abc9d eal/windows: fix return codes of pthread shim layer
The macro definitions of the following pthread functions
return incorrect values from the inner function return code.

While pthread_barrier_init(), pthread_barrier_destroy() and
pthread_cancel() return 0 in a case of success and non-zero (errno) value
otherwise the shimming functions InitializeSynchronizationBarrier,
DeleteSynchronizationBarrier and TerminateThread return FALSE (0)
in a case of failure and TRUE(1) in a case of success.

This issue was undetected as none of the functions return codes were
checked until such check was added in
commit 34cc55cce6 ("eal: fix race in control thread creation")
exposing the issue by failing pthread_barrier_init()
and rte_eal_init() on Windows as a result.

The fix aligned the return value of the 3 function with the expected
pthread API return values.

Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
2021-04-12 22:35:31 +02:00
Chengchang Tang
18239da4ac ethdev: validate input in EEPROM info
This patch adds validity check of input pointer in EEPROM dump API.

Fixes: 7a3f27cbf5 ("ethdev: add access to specific device info")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Chengchang Tang
94af45f400 ethdev: validate input in register info
This patch adds validity check of input pointer in regs dump API.

Fixes: 7a3f27cbf5 ("ethdev: add access to specific device info")
Fixes: 936eda25e8 ("net/hns3: support dump register")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Chengchang Tang
e2bd08d569 ethdev: validate input in module EEPROM dump
The validity verification of input parameters should be performed at
API layer, not in the PMD.

Fixes: 3a18c44b45 ("ethdev: add access to EEPROM")
Fixes: 40ff8b305a ("net/e1000: add module EEPROM callbacks for e1000")
Fixes: f2088e785c ("net/i40e: fix dereference before check when getting EEPROM")
Fixes: b74d0cd43e ("net/ixgbe: add module EEPROM callbacks for ixgbe")
Fixes: 8a6a09f853 ("net/mlx5: support reading module EEPROM data")
Fixes: 58f6f93c34 ("net/octeontx2: add module EEPROM dump")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Junjie Wan
968bbc7e2e vhost: avoid IOTLB mempool allocation while IOMMU disabled
If vhost device's IOMMU feature is disabled, IOTLB mempool allocation
is unnecessary.

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Junjie Wan <wanjunjie@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-04-07 08:54:05 +02:00
Marvin Liu
98da5545be vhost: fix initialization of async temporary header
This patch fixes coverity issue in async enqueue function by adding
initialization step before using temporary virtio header.

Coverity issue: 366123
Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-04-07 08:41:30 +02:00
Marvin Liu
5b784a2d80 vhost: fix initialization of temporary header
This patch fixs coverity issue by adding initialization step before
using temporary virtio header.

Coverity issue: 366181
Fixes: fb3815cc61 ("vhost: handle virtually non-contiguous buffers in Rx-mrg")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-04-07 08:41:30 +02:00
Qi Zhang
bb6270dab3 ethdev: refine debug build option
PMDs use RTE_LIBRTE_<PMD_NAME>_DEBUG_RX|TX as build option to wrap
data path debug code. As .config has been removed since the meson build,
It is not friendly for new DPDK users to notice those debug options.

The patch introduces below build options for data path debug, so PMD
can choose to reuse them to avoid maintain their own.

- RTE_ETHDEV_DEBUG_RX
- RTE_ETHDEV_DEBUG_TX

All the build options are documented at programming guide
"3.1 Driver Option", so users can easily find them.

The original undocumented RTE_LIBRTE_ETHDEV_DEBUG will alias to
both RTE_ETHDEV_DEBUG_RX and RTE_ETHDEV_DEBUG_TX for backward
compatibility.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-01 16:10:20 +02:00
Bruce Richardson
720dfda455 build: limit symbol checks to developer mode
The checking of symbols within each library and driver is only of
interest to developers, so limit to developer mode only.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-09 19:07:25 +02:00
Bruce Richardson
d317f7ebd4 build: hide debug messages in non-developer mode
The messages about what components have what dependency names, and
information about function versioning not being supported on windows are
only of interest to developers, so hide them when building in
non-developer mode.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-09 19:07:25 +02:00
Luc Pelletier
af68c1d699 eal: fix hang in control thread creation
The affinity of a control thread is set after it has been launched. If
setting the affinity fails, pthread_cancel is called followed by a call
to pthread_join, which can hang forever if the thread's start routine
doesn't call a pthread cancellation point.

This patch modifies the logic so that the control thread exits
gracefully if the affinity cannot be set successfully and removes the
call to pthread_cancel.

Fixes: 6383d2642b ("eal: set name when creating a control thread")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-04-09 16:37:55 +02:00
Luc Pelletier
34cc55cce6 eal: fix race in control thread creation
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.

Fixes: 3a0d465d4c ("eal: fix use-after-free on control thread creation")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-04-09 16:36:17 +02:00
Thomas Monjalon
76d409ce6e vfio: reformat logs
The log messages had various issues:
- split on 2 lines, making search (grep) difficult
- long lines (can be split after the string)
- indented for no good reason (parent message may have higher log level)
- inconsistent use of __func__, not meaningful context for user
- lack of context (general message not mentioning VFIO)
- log level too high (more below)

Message having its level decreased from WARNING to NOTICE:
	"not managed by VFIO driver, skipping"
Message having its level decreased from INFO to DEBUG:
	"Probing VFIO support..."

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-04-09 15:38:26 +02:00
David Marchand
3e08081637 eal: fix evaluation of log level option
--log-level option is handled early, no need to reevaluate it later in
EAL init.

Before:
$ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug \
  --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \
  |& grep -i log.level

EAL: lib.eal log level changed from info to debug
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.telemetry log level changed from disabled to warning

After:
$ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug \
  --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \
  |& grep -i log.level

EAL: lib.eal log level changed from info to debug
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.telemetry log level changed from disabled to warning

Fixes: 6c7216eefd ("eal: fix log level of early messages")
Fixes: 1c806ae5c3 ("eal/windows: support command line options parsing")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
2021-04-09 14:20:23 +02:00
David Marchand
eba022b926 log: track log level changes
Add a log message when registering log types and changing log levels.

__rte_log_register previous handled both legacy and dynamic logtypes.
To simplify the code, __rte_log_register is reworked to only handle
dynamic logtypes and takes a log level.

Example:
$ DPDK_TEST=logs_autotest ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug
...
RTE>>logs_autotest
== dynamic log types
EAL: logtype1 log level changed from disabled to info
EAL: logtype2 log level changed from disabled to info
EAL: logtype1 log level changed from info to error
EAL: logtype3 log level changed from error to emergency
EAL: logtype2 log level changed from info to emergency
EAL: logtype3 log level changed from emergency to debug
EAL: logtype1 log level changed from error to debug
EAL: logtype2 log level changed from emergency to debug
error message
critical message
critical message
error message
== static log types
TESTAPP1: error message
TESTAPP1: critical message
TESTAPP2: critical message
TESTAPP1: error message
Test OK

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-09 14:19:53 +02:00
Thomas Monjalon
3c20e6fe72 log: add option argument help
The option --log-level was not completely described in the usage text,
and it was difficult to guess the names of the log types and levels.

A new value "help" is accepted after --log-level to give more details
about the syntax and listing the log types and levels.

The array "levels" used for level name parsing is replaced with
a (modified) existing function which was used in rte_log_dump().

The new function rte_log_list_types() is exported in the API
for allowing an application to give this info to the user
if not exposing the EAL option --log-level.
The list of log types cannot include all drivers if not linked in the
application (shared object plugin case).

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:58 +02:00
Thomas Monjalon
c2bd208a90 log: catch invalid level option number
The parsing check for invalid log level was not trying to catch
irrelevant numeric values.
A log level 0 becomes a failure in parsing so it can be caught early.
A log level higher than the max (8) is accepted with a warning message.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Thomas Monjalon
806d888a80 log: introduce macro for maximum level
RTE_DIM(...) and RTE_LOG_DEBUG were used to get the highest log level.
For better clarity a new constant RTE_LOG_MAX is introduced
and mapped to RTE_LOG_DEBUG.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Thomas Monjalon
867bb637ae log: move private functions
Some private log functions had a wrong "rte_" prefix.

All private log functions are moved from eal_private.h
to the new file eal_log.h:
	rte_eal_log_init -> eal_log_init
	rte_log_save_regexp -> eal_log_save_regexp
	rte_log_save_pattern -> eal_log_save_pattern
	eal_log_set_default

The static functions in the file eal_common_log.c are renamed:
	rte_log_save_level -> log_save_level
	rte_log_lookup -> log_lookup
	rte_log_init -> log_init
	__rte_log_register -> log_register

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Stanislaw Kardach
5516e4760e timer: clarify error if subsystem already initialized
rte_timer_subsystem_init() may return -EALREADY if it has been already
initialized. Therefore put explicitly into doxygen that this is not a
failure for the application.

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
2021-04-08 23:12:55 +02:00
David Marchand
551b29a714 eal: fix telemetry log type on registration failure
rte_log_register_type_and_pick_level() returns an int.
Casting to a uin32_t will make us miss the -1 passed in case of failure.
Fallback to EAL log type like RTE_LOG_REGISTER.

Fixes: 37b881a961 ("telemetry: use log function from pointer")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-08 18:31:58 +02:00
Thomas Monjalon
bd057ae47d log: choose EAL log type on registration failure
In the unlikely case where something goes wrong
while registering a log type,
the fallback is to use the EAL log type.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-04-08 18:31:58 +02:00
David Marchand
56ea803e87 build: remove Windows export symbol list
Rather than have two files that keeps getting out of sync, let's
annotate the version.map to generate the Windows export file.

Some mlx5 symbols (haswell_broadwell_cpu, mlx5_glue, mlx5_os_*) were
only exported for Windows.
All of them are available and used by Linux too, so this patch adds
them in version.map.

Note: Existing version.map annotation achieved with:
$ for dir in lib/librte_eal drivers/common/mlx5; do
    ./buildtools/map-list-symbol.sh $dir/*.map |
    while read file version sym; do
      ! git grep -qw $sym $dir/*.def || continue;
      sed -i -e "s/$sym;/$sym; # WINDOWS_NO_EXPORT/" $dir/*.map;
    done;
  done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-08 17:57:33 +02:00
David Marchand
60e0e75b61 service: clean references to removed symbol
rte_service_get_id() was removed in v17.11 but the API description
still referenced it and a version node was still present in EAL map.

Fixes: 8edc9aaaf2 ("service: use id in get by name function")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-08 17:47:18 +02:00
Renata Saiakhova
2e761ce184 eal: add synchronous interrupt unregister
Avoid race with unregister interrupt handler if interrupt
source has some active callbacks at the moment, use wrapper
around rte_intr_callback_unregister() to check for -EAGAIN
return value and to loop until rte_intr_callback_unregister()
succeeds.

Signed-off-by: Renata Saiakhova <renata.saiakhova@ekinops.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
2021-04-07 11:16:11 +02:00
Roy Shterman
edf20bd8a5 mem: fix freeing segments in --huge-unlink mode
When using huge_unlink we unlink the segment right
after allocation. Although we unlink the file we keep
the fd in fd_list so file still exist just the path deleted.
When freeing the hugepage we need to close the fd and assign
it with (-1) in fd_list for the page to be released.

The current flow fails rte_malloc in the following flow when working
with --huge-unlink option:
1. alloc_seg() for segment A -
    We allocate a segment, unlink the path to the segment
    and keep the file descriptor in fd_list.
2. free_seg() for segment A -
    We clear the segment metadata and return - without closing fd
    or assigning (-1) in fd list.
3. alloc_seg() for segment A again -
    We find segment A as available, try to allocate it,
    find the old fd in fd_list try to unlink it
    as part of alloc_seg() but failed because path doesn't exist.

The impact of such error is falsely failing rte_malloc()
although we have hugepages available.

Fixes: d435aad37d ("mem: support --huge-unlink mode")
Cc: stable@dpdk.org

Signed-off-by: Roy Shterman <roy.shterman@vastdata.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-04-07 11:13:45 +02:00
Thomas Monjalon
4d509afa7b pci: rename catch-all ID
The name of the constant PCI_ANY_ID was missing RTE_ prefix.
It is renamed, and the old name becomes a deprecated alias.

While renaming, the duplicate definitions in rte_bus_pci.h
are removed to keep only those in rte_pci.h.
Note: rte_pci.h is included in rte_bus_pci.h

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-06 14:52:49 +02:00
Anatoly Burakov
190f38773a power: do not skip saving original P-state governor
Currently, when we set the pstate governor to "performance", we check if
it is already set to this value, and if it is, we skip setting it.

However, we never save this value anywhere, so that next time we come
back and request the governor to be set to its original value, the
original value is empty.

Fix it by saving the original pstate governor first. While we're at it,
replace `strlcpy` with `rte_strscpy`.

Fixes: e6c6dc0f96 ("power: add p-state driver compatibility")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
2021-04-06 10:36:49 +02:00
Anatoly Burakov
8a5febaac4 power: fix P-state base frequency handling
Previous fix for base frequency handling in pstate mode introduced a
couple of issues:

- When base_frequency file does not exist, it simply bails out because
  of what appears to be accidental addition of FOPEN_OR_ERR_RET. This is
  incorrect, as absence of this file is not fatal and is in fact
  expected on kernel versions earlier than 5.3
- When base_frequency file does exist, it gets opened, but never gets
  closed, resulting in a resource leak

Both issues also manifest themselves as Coverity defects (dead code, and
a resource leak), so this fix addresses both.

Coverity issue: 369693, 369694
Bugzilla ID: 668
Fixes: 4db9587bbf ("power: check sysfs base frequency")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
2021-04-06 10:36:42 +02:00
Marvin Liu
af584d21bf vhost: fix batch dequeue potential buffer overflow
Similar as single dequeue, the multiple accesses of descriptor length
will lead to potential risk. One-time access of descriptor length can
eliminate this risk.

Fixes: 75ed516978 ("vhost: add packed ring batch dequeue")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-03-31 09:34:17 +02:00
Marvin Liu
93ed2f49de vhost: fix packed ring potential buffer overflow
Similar as split ring, the multiple accesses of descriptor length will
lead to potential risk. One-time access of descriptor length can
eliminate this risk.

Fixes: 2f3225a7d6 ("vhost: add vector filling support for packed ring")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-03-31 09:34:17 +02:00
Marvin Liu
134228ca39 vhost: fix split ring potential buffer overflow
In vhost datapath, descriptor's length are mostly used in two coherent
operations. First step is used for address translation, second step is
used for memory transaction from guest to host. But the interval between
two steps will give a window for malicious guest, in which can change
descriptor length after vhost calculated buffer size. Thus may lead to
buffer overflow in vhost side. This potential risk can be eliminated by
accessing the descriptor length once.

Fixes: 1be4ebb1c4 ("vhost: support indirect descriptor in mergeable Rx")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-03-31 09:34:17 +02:00
Keiichi Watanabe
790b1c3171 vhost: get negotiated protocol features
Add rte_vhost_get_negotiated_protocol_features, which returns a set of
enabled protocol features.

Signed-off-by: Keiichi Watanabe <keiichiw@chromium.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-03-31 08:15:14 +02:00
Maxime Coquelin
af4844503e vhost: optimize virtqueue structure
This patch moves vhost_virtqueue struct fields in order
to both optimize packing and move hot fields on the first
cachelines.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Balazs Nemeth <bnemeth@redhat.com>
2021-03-31 07:48:32 +02:00
Maxime Coquelin
1818a63147 vhost: move dirty logging cache out of virtqueue
This patch moves the per-virtqueue's dirty logging cache
out of the virtqueue struct, by allocating it dynamically
only when live-migration is enabled.

It saves 8 cachelines in vhost_virtqueue struct.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Balazs Nemeth <bnemeth@redhat.com>
2021-03-31 07:48:32 +02:00
Maxime Coquelin
2453bbf7e1 vhost: remove unused virtqueue field
This patch removes the "backend" field of the
vhost_virtqueue struct, which is not used by the
library.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Balazs Nemeth <bnemeth@redhat.com>
2021-03-31 07:48:32 +02:00
Tyler Retzlaff
19cc526d6c ethdev: install driver headers
Introduce a meson option 'enable_driver_sdk', when true installs internal
driver headers for ethdev. This allows drivers that do not depend on
stable api/abi to be built external to the dpdk source tree.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-30 14:46:33 +02:00
Thomas Monjalon
fb7ad441d4 ethdev: replace callback getting filter operations
Since rte_flow is the only API for filtering operations,
the legacy driver interface filter_ctrl was too much complicated
for the simple task of getting the struct rte_flow_ops.

The filter type RTE_ETH_FILTER_GENERIC and
the filter operarion RTE_ETH_FILTER_GET are removed.
The new driver callback flow_ops_get replaces filter_ctrl.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-26 18:37:13 +01:00
Ivan Malov
43af98e687 ethdev: reuse VXLAN header definition in flow item
One ought to reuse existing header structs in flow items.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-22 17:19:16 +01:00
Ivan Malov
694d6ad392 net: clarify endianness of 32-bit fields in VXLAN headers
These fields have network byte order. Highlight it using dedicated type.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-22 17:19:16 +01:00
Ivan Malov
a56a262e34 ethdev: reuse VLAN header definition in flow item
One ought to reuse existing header structs in flow items.
This particular item contains non-header fields, so it's
important to keep the header fields in a separate struct.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-22 17:19:16 +01:00
Ivan Malov
6f2168b69a ethdev: reuse ethernet header definition in flow item
One ought to reuse existing header structs in flow items.
This particular item contains non-header fields, so it's
important to keep the header fields in a separate struct.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-22 17:19:16 +01:00
David Marchand
6545d6c52b telemetry: cleanup internal header
The experimental banner can be removed.
Every in-tree file is compiled with _GNU_SOURCE, so RTE_HAS_CPUSET is
unneeded for an internal header.

Fixes: 0e64ae618e ("telemetry: move init function to internal header")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-03-26 17:17:45 +01:00
Dmitry Kozlyuk
b2f24588b5 mem: fix cleanup when multi-process is disabled
rte_eal_memory_detach() did not account for cases where multi-process
mode is disabled: --in-memory and --no-shconf. This resulted
in unmapping memory that had not been mapped, which caused errors:

    EAL: Could not unmap memory: No error   (Windows)
    EAL: Cannot munmap(0x1d47f40, 0x7000): Invalid argument  (Linux)

Confusing "No error" was caused by using errno instead of rte_errno
set by rte_mem_unmap().

Skip detaching memory altogether when --in-memory is specified.
Skip unmapping configuration when it's not shared.
Fix and add error handling to produce proper log messages.

Fixes: dfbc61a2f9 ("mem: detach memsegs on cleanup")

Reported-by: Jie Zhou <jizh@microsoft.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-03-26 17:17:45 +01:00
Tal Shnaiderman
1325a1ffd9 eal: rename thread TLS API
Rename the key opaque pointer from rte_tls_key to
rte_thread_key to avoid confusion with transport layer security.

Also rename and remove the "_tls" term from the following
functions to avoid redundancy:

rte_thread_tls_key_create
rte_thread_tls_key_delete
rte_thread_tls_value_set
rte_thread_tls_value_get

Suggested-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Suggested-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
2021-03-26 09:22:39 +01:00
Tal Shnaiderman
3d2913c67c eal: add error numbers in thread TLS API
Add error number reporting to rte_errno in all
functions in the rte_thread_tls_* API.

Suggested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-03-26 09:21:05 +01:00
Bruce Richardson
0e64ae618e telemetry: move init function to internal header
The rte_telemetry_init() function is for EAL use only, so can be moved to
the internal header rather than being in the public one.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-03-25 17:35:10 +01:00
Bruce Richardson
d700251c7a telemetry: rename internal-only header file
The header file containing the legacy telemetry function prototypes was all
internal-only, so we rename the file to be an internal-only one to make it
clearer it's not for installation.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-03-25 17:35:10 +01:00
Bruce Richardson
e34e2f55a7 telemetry: make the legacy registration function internal
The function for registration of callbacks for legacy telemetry was
documented as internal-only in the API documents, but marked as
experimental in the version.map file. Since this is an internal-only
function, for consistency we update the version mapping to have it as
internal.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-03-25 17:35:10 +01:00
Bruce Richardson
37b881a961 telemetry: use log function from pointer
Rather than passing back an error string to the caller, take as input the
rte_log function to use, and just use regular logging.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-03-25 17:35:10 +01:00
David Hunt
4db9587bbf power: check sysfs base frequency
Some kernels may show in incorrect value for base frequency in
sysfs (e.g. 15 GHz). This throws off the SST-BF algorithm for
high and low priority cores. So if base_frequency is greater
than max turbo frequency, ignore, and handle it as a normal
core.

Known Kernel version with issue: Linux 5.8.7

Signed-off-by: David Hunt <david.hunt@intel.com>
2021-03-24 22:04:48 +01:00
Cristian Dumitrescu
f38913b7fb pipeline: add meter array to SWX
Meter arrays are stateful objects that are updated by the data plane
and configured & monitored by the control plane. The meters implement
the RFC 2698 Two Rate Three Color Marker (trTCM) algorithm.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-24 19:18:45 +01:00
Cristian Dumitrescu
64cfcebd68 pipeline: add register array to SWX
Register arrays are stateful objects that can be read & modified by
both the data plane and the control plane, as opposed to tables, which
are read-only for data plane. One key use-case is the implementation
of stats counters.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-24 19:18:40 +01:00
Cristian Dumitrescu
d366a48f46 pipeline: fix instruction translation
The SWX pipeline instructions work with operands of different types:
header fields (h.header.field), packet meta-data (m.field), extern
object mailbox field (e.obj.field), extern function (f.field), action
data read from table entries (t.field), or immediate values; hence the
HMEFTI acronym.

For some pipeline instructions (add/sub, srl/shr, jmplt/jmpgt), only
the H, M and I cases were handled, while the E, F and T cases were
disregarded. This is what we fix here.

Fixes: baf7999303 ("pipeline: introduce SWX add instruction")
Fixes: c88c629438 ("pipeline: introduce SWX subtract instruction")
Fixes: b09ba6d0a3 ("pipeline: introduce SWX SHL instruction")
Fixes: e0f51638b7 ("pipeline: introduce SWX SHR instruction")
Fixes: b3947e25be ("pipeline: introduce SWX jump and return instructions")
Cc: stable@dpdk.org

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-24 18:54:04 +01:00
Venkata Suresh Kumar P
e2b8dc5256 port: add file descriptor SWX port
Add the file descriptor input/output port type for the SWX pipeline.
File descriptor port type provides interface with the kernel network
stack. Example file descriptor port is TAP device.

Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 19:50:44 +01:00
Cristian Dumitrescu
66440b7b22 table: add wildcard match table type
Add the widlcard match/ACL table type for the SWX pipeline, which is
used under the hood by the table instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
2021-03-23 19:47:20 +01:00
Cristian Dumitrescu
ddcce12249 table: add table entry priority
Add support for table entry priority, which is required for the
wildcard match/ACL table type.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 18:49:41 +01:00
Cristian Dumitrescu
924d3ba019 pipeline: support non-incremental table updates
Some table types (e.g. exact match/hash) allow for incremental table
updates, while others (e.g. wildcard match/ACL) do not. The former is
already supported, the latter is enabled by this patch.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 18:45:44 +01:00
Cristian Dumitrescu
cff9a7178e pipeline: improve table entry parsing
Improve the table entry parsing: better code structure, enable parsing
for the key field masks, allow comments and empty lines in the table
entry files.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
2021-03-23 18:41:47 +01:00
Cristian Dumitrescu
d526f6e2d0 pipeline: improve table entry helpers
Improve the internal table entry helper routines for key comparison,
entry duplication and checks.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 18:38:58 +01:00
Cristian Dumitrescu
75a09af1b4 table: fix actions with different data size
The table layer provisions an action_id and action_data_size data
bytes for each table key. This action_data_size is a maximal amount,
as some actions (depending on action_id) can require zero or less data
bytes than the maximal action_data_size. This fix allows for actions
with different data sizes to co-exist within the same table.

Fixes: d0a0096661 ("table: add exact match SWX table")
Cc: stable@dpdk.org

Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 17:24:52 +01:00
Cristian Dumitrescu
77a413017c port: add ring SWX port
Add the ring input/output port type for the SWX pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-03-23 17:22:47 +01:00
Thomas Monjalon
e0473c6d5b eal: fix build with musl
In musl libc, cpu_set_t is defined only if _GNU_SOURCE is defined.
In case _GNU_SOURCE is undefined, as in eal_common_errno.c,
it was not possible to include rte_os.h which uses cpu_set_t.

This limitation is removed: if CPU_SETSIZE is not defined,
cpu_set_t related definitions and functions are skipped.
Note: such definitions are unneeded in eal_common_errno.c.

Applications which do not define _GNU_SOURCE may miss cpu_set_t related
features on musl. Such case is detected by RTE_HAS_CPUSET being undefined,
so functions which depend on rte_cpuset_t will be unavailable.

A missing include of fcntl.h is also added.

Bugzilla ID: 35
Fixes: 11b57c6980 ("eal: fix error string function")
Fixes: 176bb37ca6 ("eal: introduce internal wrappers for file operations")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-23 08:41:05 +01:00
Thomas Monjalon
bfb42c3777 eal: fix comment of OS-specific header files
The same comment is on top of each rte_os.h file.
It is reworded to remove the mention of "future releases".

Fixes: 428eb983f5 ("eal: add OS specific header file")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-23 08:25:16 +01:00
Xueming Li
df7547a6a2 ethdev: add helper function to get representor ID
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).

SR-IOV and SubFunction are created on top of PF. PF index is introduced
because there might be multiple PFs in the bonding configuration and
only bonding device is probed.

In eth representor comparator callback, ethdev representor ID was
compared with devarg. Since controller index and PF index not compared,
callback returned representor from other PF or controller.

This patch adds new API to get representor ID from controller, pf and
vf/sf index. Representor comparer callback get representor ID then
compare with device representor ID.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-17 19:12:09 +01:00
Xueming Li
85e1588ca7 ethdev: add API to get representor info
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).

This patch introduces a new API rte_eth_representor_info_get() to
retrieve representor corresponding info mapping:
 - caller controller index and pf index.
 - supported representor ID ranges.
 - type, controller, pf and start vf/sf ID of each range.
The API is useful to calculate representor from devargs to representor
ID.

New ethdev callback representor_info_get() is added to retrieve info
from PMD driver, optional for PMD that doesn't support new devargs
representor syntax.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-17 19:11:56 +01:00
Xueming Li
66e0ea2c98 ethdev: support multi-host in representor
The NIC can have multiple PCIe links and can be attached to the multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. To provide the unambiguous identification
of the PCIe function the controller index is added. The full representor
identifier consists of three indices - controller index, PF index, and
VF or SF index (if any).

This patch introduces controller index to ethdev representor syntax,
examples:

[[c#]pf#]vf#: VF port representor/s, example: pf0vf1
[[c#]pf#]sf#: SF port representor/s, example: c1pf1sf[0-3]

c# is controller(host) ID/range in case of multi-host, optional.

For user application (e.g. OVS), PMD is responsible to interpret and
locate representor device based on controller ID, PF ID and VF/SF ID in
representor syntax.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
da97592635 ethdev: support PF index in representor
With Kernel bonding, multiple underlying PFs are bonded, VFs come
from different PF, need to identify representor of VFs unambiguously by
adding PF index.

This patch introduces optional 'pf' section to representor devargs
syntax, examples:
 representor=pf0vf0             - single VF representor
 representor=pf[0-1]sf[0-1023]  - SF representors from 2 PFs

PF type representor is supported by using standalone 'pf' section:
 representor=pf1                - PF representor

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
9be46b4308 kvargs: support multiple lists
This patch updates kvargs parser to support value of multiple lists or
ranges:
  k1=v[1,2]v[3-5]

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-03-16 20:15:29 +01:00
Xueming Li
fa4f3fecb9 ethdev: support sub-function representor
SubFunction is a portion of the PCI device, created on demand, a SF
netdev has its own dedicated queues(txq, rxq). A SF netdev supports
eswitch representation offload similar to existing PF and VF
representors.

To support SF representor, this patch introduces new devargs syntax,
examples:
 representor=sf0               - single SubFunction representor
 representor=sf[1,3,5]         - single list
 representor=sf[0-3],          - single range
 representor=sf[0,2-6,8,10-12] - list with singles and ranges

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
cebf7f1715 ethdev: support new VF representor syntax
Current VF representor syntax:
 representor=2          - single representor
 representor=[0-3]      - single range

To prepare for more representor types, this patch adds compatible VF
representor devargs syntax:

vf#:
 representor=vf2          - single representor
 representor=vf[1,3,5]    - single list
 representor=vf[0-3]      - single range
 representor=vf[0,1,4-7]  - list with singles and range

For backwards compatibility, representor "#" is interpreted as "vf#".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
83a675177f ethdev: refactor representor port list parsing
To the extended representor syntax which need to reuse the value parsing
function for controller and PF section, this patch refactors the port
list parsing.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
d654167641 ethdev: introduce representor type
To support more representor type, this patch introduces representor type
enum. The enum is subject to be extended to support new representor in
patches upcoming.

For each devarg structure, only one type supported.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-03-16 20:15:29 +01:00
Ivan Malov
24f8b2d896 net: fix comment in IPv6 header
The comment got it wrong. The payload length field
does not include the fixed IPv6 header size.

Fixes: 7eca7f7fd0 ("net: add missing endianness annotations")
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-12 14:32:48 +01:00
Thomas Monjalon
be2e6d7895 eal: mark version parts API as experimental
Some functions were introduced in DPDK 21.05 to query the version parts
(prefix, year, month, minor, suffix, release) at runtime.
Per guidelines, these new public functions must be marked with
__rte_experimental and ABI versioned as EXPERIMENTAL.

Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-03-19 16:20:30 +01:00
Thomas Monjalon
6437522079 eal: fix version macro
The macro RTE_VERSION was broken since updated with function calls.
It is a build-time version number, and must be built with macros.
For a run-time version number, there is the function rte_version().

Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-17 16:37:57 +01:00
Tal Shnaiderman
16afcbfa30 eal/windows: fix default thread priority
The hard-coded thread priority for Windows threads in EAL
is REALTIME_PRIORITY_CLASS/THREAD_PRIORITY_TIME_CRITICAL.

This results in issues with DPDK threads causing OS thread starvation
and eventually a bugcheck.

The fix reduce the thread priority to
NORMAL_PRIORITY_CLASS/THREAD_PRIORITY_NORMAL.

Bugzilla ID: 600
Fixes: 53ffd9f080 ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org

Reported-by: Odi Assli <odia@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-03-16 12:40:35 +01:00
Dmitry Kozlyuk
e863fe3a13 eal/windows: add missing SPDX license tag
Fixes: c08bd191b1 ("eal/windows: initialize hugepage info")
Cc: stable@dpdk.org

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-03-16 11:28:33 +01:00
Jie Zhou
88f4450ab2 metrics: export telemetry stubs if no libjansson
This patch allows the same set of rte_metrics_tel_* functions to be
exported no matter JANSSON is available or not, by doing following:
1.	Leverage dpdk_conf to set configuration flag RTE_HAS_JANSSON
when Jansson dependency is found.
2.	In rte_metrics_telemetry.c, leverage RTE_HAS_JANSSON to handle the
case when JANSSON is not available by adding stubs for all the instances.
3.	In meson.build, per dpdk/doc/guides/rel_notes/release_20_05.rst,
it is claimed that "Telemetry library is no longer dependent on the
external Jansson library, which allows Telemetry be enabled by default.",
thus make the deps and includes of Telemetry as not conditional anymore.

Signed-off-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-03-16 10:08:06 +01:00
Ferruh Yigit
5988725d0e log/linux: make default output stderr
In Linux by default DPDK log goes to stdout, as well as syslog.

It is possible for an application to change the library output stream
via 'rte_openlog_stream()' API, to set it to stderr, it can be used as:
rte_openlog_stream(stderr);

But still updating the default log output to 'stderr'.

Bugzilla ID: 8
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-16 00:01:44 +01:00
Bruce Richardson
5b637a8481 eal: fix querying DPDK version at runtime
For using a DPDK application, such as OVS, which is dynamically linked, the
DPDK version in use should always report the actual version, not the
version used at build time. This incorrect behaviour can be seen by
building OVS against one version of DPDK and running it against a later
one. Using "ovs-vsctl list Open_vSwitch" to query basic info, the
dpdk_version returned will be the build version not the currently running
one - which can be verified using the DPDK telemetry library client.

  $ sudo ovs-vsctl list Open_vSwitch | grep dpdk_version
  dpdk_version        : "DPDK 20.11.0-rc4"

  $ echo quit | sudo dpdk-telemetry.py
  Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
  {"version": "DPDK 21.02.0-rc2", "pid": 405659, "max_output_len": 16384}
  -->

To fix this, we need to convert the rte_version() function, and any other
necessary parts of the rte_version.h, to be actual functions in EAL, not
just inlines/macros. The only complication in doing so is that telemetry
library cannot call rte_version() directly, and instead needs the version
string passed in on init.

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-03-15 23:22:14 +01:00
Alexander Kozyrev
cbc78be94c ethdev: document generic modify flow action
Field IDs for the MODIFY_FIELD action lack doxygen comments
and not visible in online DPDK documentation because of that.
Provide a meaningful description for every Field ID for the
rte_flow_field_id enumeration.

Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-09 14:30:38 +01:00
Lance Richardson
e8a419d6de mbuf: rename outer IP checksum macro
Rename PKT_RX_EIP_CKSUM_BAD to PKT_RX_OUTER_IP_CKSUM_BAD and
deprecate the original name. The new name is better aligned
with existing PKT_RX_OUTER_* flags, which should help reduce
confusion about its use.

Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-03-02 10:57:28 +01:00
Anatoly Burakov
2a2ebeab9e fbarray: fix log message on truncation error
When file truncation fails, the log message attempts to print a path of
file we failed to truncate, but this path was never set to anything and,
what's worse, was uninitialized. Fix it by passing path from the caller.

Coverity issue: 366122
Fixes: c44d09811b ("eal: add shared indexed file-backed array")
Cc: stable@dpdk.org

Reported-by: Andrew Boyer <aboyer@pensando.io>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-03-04 11:37:05 +01:00
David Christensen
44db5a5cf2 eal/ppc: provide arch-specific TSC frequency
Return a PPC specific value for get_tsc_freq_arch() rather than
depending on the EAL framework to estimate the frequency.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-03 10:05:23 +01:00
Anatoly Burakov
dfbc61a2f9 mem: detach memsegs on cleanup
Currently, we don't detach the shared memory on EAL cleanup, which
leaves the page table descriptors still holding on to the file
descriptors as well as memory space occupied by them. Fix it by adding
another detach stage that closes the internal memory allocator resource
references, detaches shared fbarrays and unmaps the shared mem config.

Bugzilla ID: 380
Bugzilla ID: 381

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2021-03-03 10:05:23 +01:00
Yunjian Wang
8d63961fc7 vfio: fix API description
Fix few comments and add detailed comments for return value.

Fixes: 279b581c89 ("vfio: expose functions")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-03-03 10:05:23 +01:00
Ferruh Yigit
e79b0efd98 power: remove duplicated symbols from map file
This is causing build error, like:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/482121104

Also '@internal' marker removed from doxygen comment, since public API
should not be internal.
Experimental tag removed from 'rte_power_guest_channel_send_msg()'

Fixes: 4d3892dcd7 ("power: make channel message functions public")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-03-02 13:43:38 +01:00
Nithin Dabilpuram
c13ca4e81c vfio: fix DMA mapping granularity for IOVA as VA
Partial unmapping is not supported for VFIO IOMMU type1
by kernel. Though kernel gives return as zero, the unmapped size
returned will not be same as expected. So check for
returned unmap size and return error.

For IOVA as PA, DMA mapping is already at memseg size
granularity. Do the same even for IOVA as VA mode as
DMA map/unmap triggered by heap allocations,
maintain granularity of memseg page size so that heap
expansion and contraction does not have this issue.

For user requested DMA map/unmap disallow partial unmapping
for VFIO type1.

Fixes: 73a6390859 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-01 11:58:28 +01:00
Nithin Dabilpuram
016763c219 vfio: do not merge contiguous areas
In order to save DMA entries limited by kernel both for external
memory and hugepage memory, an attempt was made to map physically
contiguous memory in one go. This cannot be done as VFIO IOMMU type1
does not support partially unmapping a previously mapped memory
region while Heap can request for multi page mapping and
partial unmapping.
Hence for going back to old method of mapping/unmapping at
memseg granularity, this commit reverts
commit d1c7c0cdf7 ("vfio: map contiguous areas in one go")

Also add documentation on what module parameter needs to be used
to increase the per-container dma map limit for VFIO.

Fixes: d1c7c0cdf7 ("vfio: map contiguous areas in one go")
Cc: stable@dpdk.org

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-01 11:58:24 +01:00
Marvin Liu
894028ace2 vhost: fix packed ring dequeue offloading
When vhost is doing dequeue offloading, it parses ethernet and L3/L4
headers of the packet. Then vhost will set corresponding value in mbuf
attributes. It means offloading action should be after packet data copy.

Fixes: 75ed516978 ("vhost: add packed ring batch dequeue")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-02-10 22:17:47 +01:00
Qi Zhang
8a8c4760a1 ethdev: refine doxygen comment of UDP tunnel API
Clarify what is the scope and impact of the UDP port tunnel API.

There are still missing infos to be improved in future:
	- no capability flag
	- dependency between ports of the same device
	- required privilege

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-02-10 21:48:59 +01:00
Bruce Richardson
9ff791eff6 eal: fix automatic loading of drivers as shared libs
When checking the loading of EAL shared lib to see if we have a shared
DPDK build, we only want to include part of the ABI version in the check
rather than the whole thing. For example, with ABI version 21.1 for DPDK
release 21.02, the linker links the binary against librte_eal.so.21,
without the ".1".

To avoid any further brittleness in this area, we can check for multiple
versions when doing the check, since just about any version of EAL implies
a shared build. Therefore we check for presence of librte_eal.so with full
ABI_VERSION extension, and then repeatedly remove the end part of the
filename after the last dot, checking each time. For example (debug log
output for static build):

  EAL: Checking presence of .so 'librte_eal.so.21.1'
  EAL: Checking presence of .so 'librte_eal.so.21'
  EAL: Checking presence of .so 'librte_eal.so'
  EAL: Detected static linkage of DPDK

Fixes: 7781950f4d ("eal: fix shared lib mode detection")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
2021-02-10 10:01:48 +01:00
Bruce Richardson
0d32fd0945 telemetry: mark init function as internal-only
The "rte_telemetry_init()" function is for use by "rte_eal_init()" and
should not be part of the public API. Mark it as internal only.

Fixes: 6dd571fd07 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-02-09 13:36:45 +01:00
David Marchand
c48d8c3164 mbuf: remove unneeded atomic generic header include
There is no need for the direct inclusion of the generic/ header [1]
now that we don't use the rte_atomic API anymore.

It was the last case of direct inclusion of the generic/ headers,
so the flag -Wno-unused-function can be dropped.

1: https://git.dpdk.org/dpdk/commit/?id=3eb860b08eb7

Fixes: e41d27a68d ("mbuf: remove atomic reference counters")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-02-05 19:49:32 +01:00
Olivier Matz
daeb7c7f41 mempool: fix panic on dump or audit
When doing a mempool dump or an audit, the application can panic because
the length of the cache is greater than the flush threshold, which is
seen as a fatal error. But this can temporarily happen when the mempool
is in use.

Fix the panic condition to abort only when the cache length is greater
than the array.

Fixes: ea5dd2744b ("mempool: cache optimisations")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-02-05 17:40:23 +01:00
Harry van Haaren
b07b80fe1c eventdev: fix a return value comment
The PMD info get API has a void return type. Remove the
@return 0 Success doxygen comment as it doesn't make sense here.

Fixes: 5223a1f3b8 ("eventdev: define southbound driver interface")
Cc: stable@dpdk.org

Reported-by: Fredrik A Lindgren <fredrik.a.lindgren@tietoevry.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2021-02-04 13:51:45 +01:00
Fei Chen
9944bddf80 vhost: fix vid allocation race
vhost_new_device might be called in different threads at
the same time.

thread 1(config thread)
            rte_vhost_driver_start
               ->vhost_user_start_client
                   ->vhost_user_add_connection
                     -> vhost_new_device

thread 2(vhost-events)
	vhost_user_read_cb
           ->vhost_user_msg_handler (return value < 0)
             -> vhost_user_start_client
                 -> vhost_new_device

So there could be a case that a same vid has been allocated
twice, or some vid might be lost in DPDK lib however still
held by the upper applications.

Another place where race would happen is at the func
*vhost_destroy_device*, but after a detailed investigation,
the race does not exist as long as no two devices have the
same vid: Calling vhost_destroy_devices in different
threads with different vids is actually safe.

Fixes: a277c71598 ("vhost: refactor code structure")
Cc: stable@dpdk.org

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-02-04 18:19:36 +01:00
Anatoly Burakov
7e54f18326 mem: fix deadlock on secondary allocation
Previous fix used `rte_malloc_heap_socket_is_external()` to check if the
heap was an external heap. However, that API is thread-safe, and when
we're inside the allocation process, we're already write-locked, so
calling `rte_malloc_heap_socket_is_external()` will result in a
deadlock followed by a timeout.

Fix it by replacing the API call with a check against maximum number of
NUMA nodes, because external heaps always have higher socket ID's.

Fixes: 7ac31e82bc ("mem: improve parameter checking on memory hotplug")

Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-30 00:26:49 +01:00
Hemant Agrawal
d810252857 ethdev: add MPLS RSS offload type
This patch defines new RSS offload types for MPLS. The distribution
will on the basis of MPLS tag.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Alexander Kozyrev
8d00e4698e ethdev: add IPv6 DSCP option for modify field action
IPv6 DSCP field ID is missing from the original list of Field IDs
for MODIFY_FIELD action. Add it to support IPv6 header fully.
Add ipv6_dscp option for the corresponding header field in testpmd.

Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Thomas Monjalon
a6f34f9100 ethdev: fix close failure handling
If a failure happens when closing a port,
it was unnecessarily failing again in the function eth_err(),
because of a check against HW removal cause.
Indeed there is a big chance the port is released at this point.
Given the port is in the middle (or at the end) of a close process,
checking the error cause by accessing the port is a non-sense.
The error check is replaced by a simple return in the close function.

Bugzilla ID: 624
Fixes: 8a5a0aad5d ("ethdev: allow close function to return an error")
Cc: stable@dpdk.org

Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 18:16:08 +01:00
Viacheslav Galaktionov
061abae299 ethdev: clarify what is included in generic byte statistics
Different hardware gathers statistics differently, so some general
rules need to be established.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Bruce Richardson
05050ac4ce build: add header includes check
To verify that all DPDK headers are ok for inclusion directly in a C file,
and are not missing any other pre-requisite headers, we can auto-generate
for each header an empty C file that includes that header. Compiling these
files will throw errors if any header has unmet dependencies.

For some libraries, there may be some header files which are not for direct
inclusion, but rather are to be included via other header files. To allow
later checking of these files for missing includes, we separate out the
indirect include files from the direct ones.

To ensure ongoing compliance, we enable this build test as part of the
default x86 build in "test-meson-builds.sh".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:37 +01:00
Bruce Richardson
2518704288 eventdev: make driver-only headers private
The rte_eventdev_pmd*.h files are for drivers only and should be private
to DPDK, and not installed for app use.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
df96fd0d73 ethdev: make driver-only headers private
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be a private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
5d1a53130a rib: fix missing header include
The rte_rib6 header was using RTE_MIN macro from rte_common.h but not
including the header file.

Fixes: f7e861e21c ("rib: support IPv6")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
deb6ea1d2d power: fix missing header includes
The rte_power_guest_channel.h file did not include its dependent
headers, so add them.

Fixes: 5f443cc0f9 ("power: create guest channel public header file")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
4ab63cd60c eal: fix internal ABI tag with clang
Clang does not have an "error" attribute for functions, so for marking
internal functions we need to check for the error attribute, and provide
a fallback if it is not present. For clang, we can use "diagnose_if"
attribute, similarly checking for its presence before use.

Fixes: fba5af82ad ("eal: add internal ABI tag definition")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
3c2cca6a0d eal: fix MCS lock header include
Include 'rte_branch_prediction.h' to get the likely/unlikely macro
definitions.

Fixes: 2173f3333b ("mcslock: add MCS queued lock implementation")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-01-29 20:59:09 +01:00
Thomas Monjalon
45eb6a1dfe lib: fix doxygen for parameters of function pointers
Some parameters of typedef'ed function pointers were not properly listed
in the doxygen comments.
The error is seen with doxygen 1.9 which added this specific check:
	https://github.com/doxygen/doxygen/commit/d34236ba4037

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-29 15:58:06 +01:00
Liang Ma
682a645438 power: add ethdev power management
Add a simple on/off switch that will enable saving power when no
packets are arriving. It is based on counting the number of empty
polls and, when the number reaches a certain threshold, entering an
architecture-defined optimized power state that will either wait
until a TSC timestamp expires, or when packets arrive.

This API mandates a core-to-single-queue mapping (that is, multiple
queued per device are supported, but they have to be polled on different
cores).

This design is using PMD RX callbacks.

1. UMWAIT/UMONITOR:

   When a certain threshold of empty polls is reached, the core will go
   into a power optimized sleep while waiting on an address of next RX
   descriptor to be written to.

2. TPAUSE/Pause instruction

   This method uses the pause (or TPAUSE, if available) instruction to
   avoid busy polling.

3. Frequency scaling
   Reuse existing DPDK power library to scale up/down core frequency
   depending on traffic volume.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2021-01-29 15:29:48 +01:00
Anatoly Burakov
abc0cade20 eal: improve power monitor API comments
Currently, the API documentation is ambiguous as to what happens when
certain conditions are met. Document the behavior explicitly, as well as
fix some typos and outdated comments.

Fixes: 6a17919b0e ("eal: change power intrinsics API")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-29 15:29:48 +01:00
Anatoly Burakov
f400ea0b4c eal: rename power monitor condition member
The `data_sz` name is fine, but it looks out of place because nothing
else has "data" prefix in that structure. Rename it to "size", as well
as add more clarity to the comments around each struct member.

Fixes: 6a17919b0e ("eal: change power intrinsics API")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-29 15:29:48 +01:00
Feifei Wang
1fc73390bc ring: refactor exported headers
For legacy modes, rename ring_generic/c11 to ring_generic/c11_pvt.
Furthermore, add new file ring_elem_pvt.h which includes ring_do_eq/deq
and ring element copy/delete APIs.

The update_tail internal helper has been prefixed with the library prefix.

For other modes, rename xx_c11_mem to xx_elem_pvt. Move all private APIs
into these new header files.

Finally, the external APIs and internal APIs will be separated from each
other. This can remind users not to use internal APIs and make ring
library easier to maintain.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-29 11:37:14 +01:00
Bruce Richardson
825fddf651 power: clean up includes
re-organise the including of the new public header file and
remove un-needed includes

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 11:25:40 +01:00
Bruce Richardson
d74b159e8c power: export guest channel header file
Adjust meson.build so that 'ninja install' copies the new header
file into the installation directory.

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 11:25:40 +01:00
Bruce Richardson
38d232b9b8 power: rename constants
Rename the #defines to have an RTE_POWER_ prefix

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 11:25:40 +01:00
Bruce Richardson
bd5b6720fe power: rename public structs
Rename the public structs to have an rte_power_ prefix.

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 11:25:40 +01:00
Bruce Richardson
4d3892dcd7 power: make channel message functions public
Move the 2 public functions into rte_power_guest_channel.h

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 11:25:40 +01:00
Bruce Richardson
5f443cc0f9 power: create guest channel public header file
In preparation for making the header file public, we first rename
channel_commands.h as rte_power_guest_channel.h.

Fixes: 210c383e24 ("power: packet format for vm power management")
Fixes: cd0d5547e8 ("power: vm communication channels in guest")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
2021-01-29 11:25:40 +01:00
Anatoly Burakov
7ac31e82bc mem: improve parameter checking on memory hotplug
Currently, we don't check anything that comes in through memory hotplug
subsystem using the IPC, because we always assume the data is correct.
This is okay as anyone having access to the IPC socket would also have
rights to crash the DPDK process through other means, but it's still a
good practice to do parameter checking, so fix the code to do that.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-27 14:24:05 +01:00
Joyce Kong
36d406c513 eal/arm: fix debug build with gcc for 128-bit atomics
Compiling with "meson build -Dbuildtype=debug --cross-file
config/arm/arm64_thunderx2_linux_gcc" shows the warnings
"function returns an aggregate [-Waggregate-return]":
../../dpdk/lib/librte_eal/arm/include/rte_atomic_64.h: In
function ‘__cas_128_relaxed’:
../../dpdk/lib/librte_eal/arm/include/rte_atomic_64.h:81:20:
error: function returns an aggregate [-Werror=aggregate-return]
 __ATOMIC128_CAS_OP(__cas_128_relaxed, "casp")
                    ^~~~~~~~~~~~~~~~~

Fix the compiling issue by defining __ATOMIC128_CAS_OP as a void
function and passing the address pointer into it.

Fixes: 7e2c3e17fe ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-01-27 11:21:21 +01:00
Bruce Richardson
7be7dc6dea build: force pkg-config for dependency detection
Meson can use cmake as a fallback for detecting packages, and this can
lead to picking up 64-libs for 32-bit builds. To work around this, force
the use of pkg-config only for detecting libcrypto, zlib, jansson and
other package dependencies.

Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: Liron Himi <lironh@marvell.com>
Tested-by: Lee Daly <lee.daly@intel.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Martin Spinler <spinler@cesnet.cz>
2021-01-26 00:43:59 +01:00
Bruce Richardson
b9a396b0fd node: fix missing header include
The rte_compat header file is needed for the '__rte_experimental' macro.

Fixes: f00708c2aa ("node: add IPv4 rewrite and lookup control")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
7012833094 metrics: fix variable declaration in header
The global variable "tel_met_data" was declared in a header file, rather
than in a C file, leading to duplicate definitions if more than one C
file included the header.

Fixes: c5b7197f66 ("telemetry: move some functions to metrics library")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
cd93ec8961 pipeline: fix missing header includes
The stdio.h header needs to be included to get the definition of the
FILE type.

Fixes: b32c0a2c5e ("pipeline: add SWX table update high level API")
Fixes: 3ca60ceed7 ("pipeline: add SWX pipeline specification file")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
3b4d434e53 table: fix missing header include
The rte_lru_x86.h header, included from the main rte_lru.h header, uses
the RTE_CC_IS_GNU macro from rte_common.h but fails to include that
header file.

Fixes: 0c9a5735a9 ("eal: fix compiler detection in public headers")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
f5bf8249d5 fib: fix missing header includes
Add stdint.h to get definitions of standard integer types

Fixes: 39e9272484 ("fib: add FIB library")
Fixes: 40d41a8a7b ("fib: support IPv6")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
071b9b78c4 ipsec: fix missing header include
The rte_ipsec_sad.h header used the standard uintXX_t types, but did not
include stdint.h header for them.

Fixes: 401633d9c1 ("ipsec: add inbound SAD API")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
820778fad6 vhost: fix missing header includes
The vhost header files were missing definitions from headers to allow
them to be compiled up individually.

Fixes: d7280c9fff ("vhost: support selective datapath")
Fixes: a49f758d11 ("vhost: split vDPA header file")
Fixes: 939066d965 ("vhost/crypto: add public function implementation")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
585617c320 rib: fix missing header includes
The standard integer types, and the size_t types are missing their
required header includes in the rib header file.

Fixes: 5a5793a5ff ("rib: add RIB library")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:47 +01:00
Bruce Richardson
29a3ec7f0e bitrate: fix missing header include
The rte_compat.h header file must be included to get the definition of
__rte_experimental.

Fixes: f030bff72f ("bitrate: add free function")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:27:44 +01:00
Bruce Richardson
b7c0591fd6 mbuf: fix missing header include
The rte_mbuf_dyn.h header file uses a number of types and macros without
including the required header files to get the definitions of those
macros/types.  Similarly, the rte_mbuf_core.h file was missing an
include for rte_byteorder.h header.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Fixes: 3eb860b08e ("mbuf: move definitions into a separate file")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
a6bf1fd817 net: fix missing header include
The Geneve protocol header file is missing the rte_byteorder.h header.

Fixes: ea0e711b8a ("app/testpmd: add GENEVE parsing")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ophir Munk <ophirmu@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
ebbd4d33d7 ethdev: fix missing header include
The define for RTE_ETH_FLOW_MAX is defined in rte_ethdev.h, so that
header should be included in rte_eth_ctrl.h to allow it to be compiled
independently.

Fixes: 7fa96d696f ("ethdev: unification of flow types")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
354e43b1aa telemetry: fix missing header include
The telemetry header file uses the rte_cpuset_t type, but does not
include any header providing that type. Include rte_os.h to provide the
necessary type.

Fixes: febbebf7f2 ("telemetry: keep threads separate from data plane")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
8fc617099e eal: fix reciprocal header include
The rte_reciprocal header file used standard __rte_always_inline from
rte_common.h but does not include that header file, leading to compiler
errors when the reciprocal header is included alone.

Fixes: 6d45659eac ("eal: add u64-bit variant for reciprocal divide")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
2b403e027b eal: fix thread header include
rte_thread.h was missing the compat header to get the __rte_experimental
macro definition.

Fixes: b1fd151267 ("eal: add generic thread-local-storage functions")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-21 10:21:40 +01:00
Bruce Richardson
762bfccc8a config: remove compatibility build defines
As announced in the deprecation note, remove all compatibility build
defines from previous make/meson versions and use only the standardized
ones - RTE_LIB_<name> for libraries, and RTE_<CLASS>_<NAME> for drivers.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-20 01:43:25 +01:00
Alexander Kozyrev
73b68f4c54 ethdev: introduce generic modify flow action
Implement the generic modify flow API to allow manipulations on
an arbitrary header field (as well as mark, metadata or tag) using
data from another field or a user-specified value.
This generic modify mechanism removes the necessity to implement
a separate RTE Flow action every time we need to modify a new packet
field in the future.

Supported operation are:
- set: copy data from source to destination.
- add: integer addition, stores the result in destination.
- sub: integer subtraction, stores the result in destination.

The field ID is used to specify the desired source/destination packet
field in order to simplify the API for various encapsulation models.
Specifying the packet field ID with the needed encapsulation level
is able to quickly get a packet field for any inner packet header.

Alternatively, the special ID (ITEM_START) can be used to point to
the very beginning of a packet. This ID in conjunction with the
offset parameter provides great flexibility to copy/modify any part of
a packet as needed.

The number of bits to use from a source as well as the offset can be
be specified to allow a partial copy or dividing a big packet field
into multiple small fields (e.g. copying 128 bits of IPv6 to 4 tags).

An immediate value (or a pointer to it) can be specified instead of the
level and the offset for the special FIELD_VALUE ID (or FIELD_POINTER).
Can be used as a source only.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-01-19 03:30:32 +01:00
Bruce Richardson
26fe208ad8 ethdev: avoid blocking telemetry for link status
When querying the link status via telemetry interface, we don't want the
client to have to wait for multiple seconds for a reply. Therefore use
"rte_eth_link_get_nowait()" rather than "rte_eth_link_get()" in the
telemetry callback.

Fixes: c190daedb9 ("ethdev: add telemetry callbacks")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-19 03:30:32 +01:00
Shiri Kuzin
2b4c72b4d1 ethdev: introduce GENEVE header TLV option item
The Geneve tunneling protocol is designed to allow the
user to specify some data context on the packet.
The GENEVE TLV (Type-Length-Variable) Option
is the mean intended to present the user data.

In order to support GENEVE TLV Option the new rte_flow
item "rte_flow_item_geneve_opt" is added.
The new item contains the values and masks for the
following fields:
-option class
-option type
-length
-data

New item will be added to testpmd to support match and
raw encap/decap actions.

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-19 03:30:15 +01:00
Steve Yang
bf0f90d92d ethdev: fix max Rx packet length check
Ethdev is using default Ethernet overhead to decide if provided
'max_rx_pkt_len' value is bigger than max (non jumbo) MTU value,
and limits it to MAX if it is.

Since the application/driver used Ethernet overhead is different than
the ethdev one, check result is wrong.

If the driver is using Ethernet overhead bigger than the default one,
the provided 'max_rx_pkt_len' is trimmed down, and in the driver when
correct Ethernet overhead is used to convert back, the resulting MTU is
less than the intended one, causing some packets to be dropped.

Like,
app     -> max_rx_pkt_len = 1500/*mtu*/ + 22/*overhead*/ = 1522
ethdev  -> 1522 > 1518/*MAX*/; max_rx_pkt_len = 1518
driver  -> MTU = 1518 - 22 = 1496
Packets with size 1497-1500 are dropped although intention is to be able
to send/receive them.

The fix is to make ethdev use the correct Ethernet overhead for port,
instead of default one.

Fixes: 59d0ecdbf0 ("ethdev: MTU accessors")
Cc: stable@dpdk.org

Signed-off-by: Steve Yang <stevex.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-19 03:30:14 +01:00
Jeff Guo
759364892a ethdev: add eCPRI tunnel type
Add type of RTE_TUNNEL_TYPE_ECPRI into the enum of ethdev tunnel type.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-19 03:30:13 +01:00
Viacheslav Ovsiienko
1d94d8042a ethdev: update GTP headers
This patch introduces the GTP header individual flag bit fields
and the header optional word with N-PDU number, Sequence Number
and Next Extension Header type.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-19 03:30:13 +01:00
Abhinandan Gujjar
1c3ffb9559 cryptodev: add enqueue and dequeue callbacks
This patch adds APIs to add/remove callback functions on crypto
enqueue/dequeue burst. The callback function will be called for
each burst of crypto ops received/sent on a given crypto device
queue pair.

Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2021-01-19 18:05:44 +01:00
Stephen Hemminger
ce7b11ad99 pdump: cleanup logs and variables
Checkpatch prefers 'unsigned int' over bare 'unsigned'.
Reword the error messages for brevity and clarity
so they don't have to be split across multiple lines.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-01-19 15:24:46 +01:00
Stephen Hemminger
2d30d42c0f pdump: replace constant for device name size
The device string has an existing size in rte_dev.h
use that instead of defining our own.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-01-19 15:24:46 +01:00
Stephen Hemminger
f82067c4d5 pdump: free mbuf in bulk
Use rte_pktmbuf_free_bulk instead of loop when freeing
packets.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
2021-01-19 15:24:46 +01:00
Anatoly Burakov
27ff8384de fbarray: fix overlap check
When we're attaching fbarrays in secondary processes, we check for
whether the intended memory address for the fbarray is already in use by
some other, local fbarray. However, the check for end-overlap (i.e. to
see if our memory area's end overlaps with some other fbarray) is
incorrectly counting end offset as part of the overlap. Fix the check.

Fixes: 5b61c62cfd ("fbarray: add internal tailq for mapped areas")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Zhihong Peng <zhihongx.peng@intel.com>
2021-01-19 13:12:59 +01:00
Anatoly Burakov
c691506798 eal/linux: improve no hugepages logging
When no hugepages are found, we log a message about it, but we never
specify on which node. We also implicitly declare the page size based
on the directory name, but that's not very user friendly.

Fix both by changing the text of the message to note the NUMA node (if
applicable) and explicitly mention page size in kilobytes.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-01-19 13:12:59 +01:00
Liang Ma
1fe3eef5e9 ethdev: add simple power management API
Add a simple API to allow getting the monitor conditions for
power-optimized monitoring of the Rx queues from the PMD, as well as
release notes information.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-01-19 00:00:04 +01:00
Anatoly Burakov
55243435f5 eal: add monitor wakeup function
Now that we have everything in a C file, we can store the information
about our sleep, and have a native mechanism to wake up the sleeping
core. This mechanism would however only wake up a core that's sleeping
while monitoring - waking up from `rte_power_pause` won't work.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-18 23:59:42 +01:00
Anatoly Burakov
1b36fc280a eal: remove sync version of power monitor
Currently, the "sync" version of power monitor intrinsic is supposed to
be used for purposes of waking up a sleeping core. However, there are
better ways to achieve the same result, so remove the unneeded function.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-18 23:59:30 +01:00
Anatoly Burakov
6a17919b0e eal: change power intrinsics API
Instead of passing around pointers and integers, collect everything
into struct. This makes API design around these intrinsics much easier.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-18 23:59:12 +01:00
Anatoly Burakov
68fbbb8369 eal: avoid invalid power intrinsics API usage
Currently, the API documentation mandates that if the user wants to use
the power management intrinsics, they need to call the
`rte_cpu_get_intrinsics_support` API and check support for specific
intrinsics.

However, if the user does not do that, it is possible to get illegal
instruction error because we're using raw instruction opcodes, which may
or may not be supported at runtime.

Now that we have everything in a C file, we can check for support at
startup and prevent the user from possibly encountering illegal
instruction errors.

We also add return values to the API's as well, because why not.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-18 23:58:24 +01:00
Anatoly Burakov
56833cbd35 eal: uninline power intrinsics
Currently, power intrinsics are inline functions. Make them part of the
ABI so that we can have various internal data associated with them
without exposing said data to the outside world.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-18 23:57:38 +01:00
Narcisa Vasile
e03d2c3c70 cfgfile: build on Windows
The librte_cfgfile lib is functional on Windows.
Enable compilation of this lib for Windows.

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
2021-01-17 23:21:14 +01:00
Nithin Dabilpuram
d70e87907a bitmap: support 128-byte cacheline in empty check
Currently bitmap line not empty check API assumes cache line
of 64B and only checks 8 slabs. Since in 128B cacheline, we
have 16 slabs per cacheline, rte_bitmap_clear() will mark
complete line as empty as soon as 8 slabs are empty thereby
breaking bitmap scan functionality. Fix it by defining new
__rte_bitmap_line_not_empty() for 128B cacheline platform.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-01-17 22:40:15 +01:00
Long Li
1fef6ced07 eal/linux: allow multiple starts of event monitor
In some cases, a device or infrastructure may want to enable hotplug
but application may also try and start hotplug as well. Therefore
change the monitor_started from a boolean into a reference count.

Signed-off-by: Long Li <longli@microsoft.com>
2021-01-17 22:37:28 +01:00
Tyler Retzlaff
56446c913f eal/windows: fix C++ compatibility
Explicitly cast void * to type * so that EAL headers may be compiled
as C or C++.

Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2021-01-17 18:27:48 +01:00
Olivier Matz
de6aede17b service: propagate init error in EAL
Currently, when rte_service_init() fails at initialization, the
application always gets a ENOEXEC error code. For example, with testpmd,
this is displayed as:

  Cannot init EAL: Exec format error

This error code does not describe the real issue. Instead, use the error
code returned by the function.

Fixes: e398245008 ("service: initialize with EAL")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2021-01-15 16:32:19 +01:00
Tyler Retzlaff
aab8be4497 eal/windows: build reciprocal division functions
Build rte_reciprocal.c and export the following functions on windows:
  * rte_reciprocal_value
  * rte_reciprocal_value_u64

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-01-15 15:04:03 +01:00
Yicai Lu
324242fb51 ip_frag: remove padding length of fragment
In some situations, we would get several ip fragments, which total
data length is less than min_ip_len(64) and padding with zeros.
We simulated intermediate fragments by modifying the MTU.
To illustrate the problem, we simplify the packet format and
ignore the impact of the packet header.In namespace2,
a packet whose data length is 1520 is sent.
When the packet passes tap2, the packet is divided into two
fragments: fragment A and B, similar to (1520 = 1510 + 10).
When the packet passes tap3, the larger fragment packet A is
divided into two fragments A1 and A2, similar to (1510 = 1500 + 10).
Finally, the bond interface receives three fragments:
A1, A2, and B (1520 = 1500 + 10 + 10).
One fragmented packet A2 is smaller than the minimum Ethernet
frame length, so it needs to be padded.

|---------------------------------------------------|
|                      HOST                         |
| |--------------|   |----------------------------| |
| |      ns2     |   |      |--------------|      | |
| |  |--------|  |   |  |--------|    |--------|  | |
| |  |  tap1  |  |   |  |  tap2  | ns1|  tap3  |  | |
| |  |mtu=1510|  |   |  |mtu=1510|    |mtu=1500|  | |
| |--|1.1.1.1 |--|   |--|1.1.1.2 |----|2.1.1.1 |--| |
|    |--------|         |--------|    |--------|    |
|         |                 |              |        |
|         |-----------------|              |        |
|                                          |        |
|                                      |--------|   |
|                                      |  bond  |   |
|--------------------------------------|mtu=1500|---|
                                       |--------|

When processing the preceding packets above,
DPDK would aggregate fragmented packets A2 and B.
And error packets are generated, which padding(zero)
is displayed in the middle of the packet.

A2 + B:
0000   fa 16 3e 9f fb 82 fa 47 b2 57 dc 20 08 00 45 00
0010   00 33 b4 66 00 ba 3f 01 c1 a5 01 01 01 01 02 01
0020   01 02 c0 c1 c2 c3 c4 c5 c6 c7 00 00 00 00 00 00
0030   00 00 00 00 00 00 00 00 00 00 00 00 c8 c9 ca cb
0040   cc cd ce cf d0 d1 d2 d3 d4 d5 d6 d7 d8 d9 da db
0050   dc dd de df e0 e1 e2 e3 e4 e5 e6

So, we would calculate the length of padding, and remove
the padding in pkt_len and data_len before aggregation.
And also we have the fix for both ipv4 and ipv6.

Fixes: 7f0983ee33 ("ip_frag: check fragment length of incoming packet")
Cc: stable@dpdk.org

Signed-off-by: Yicai Lu <luyicai@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-01-15 11:31:28 +01:00
Yi Yang
76f093948f gso: support VXLAN UDP/IPv4
As most NICs do not support segmentation for VXLAN-encapsulated
UDP/IPv4 packets, this patch adds VXLAN UDP/IPv4 GSO support.

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
2021-01-15 11:31:28 +01:00
Pallavi Kadam
edd66d57d5 eal/windows: add random function
The file rte_random.c is required to build i40e PMD on Windows.
Add rte_rand variable to export file.

Redefine _m_prefetchw for Clang toolchain due to following error
with respect to conflicting types:

FAILED: lib/76b5a35@@rte_eal@sta/librte_eal_common_rte_random.c.obj
clang @lib/76b5a35@@rte_eal@sta/librte_eal_common_rte_random.c.obj.rsp
In file included from ../lib/librte_eal/common/rte_random.c:13:
In file included from ..\lib/librte_eal/include\rte_eal.h:20:
In file included from ..\lib/librte_eal/include\rte_per_lcore.h:25:
In file included from ..\lib/librte_eal/windows/include\pthread.h:21:
In file included from ..\lib/librte_eal/windows/include\rte_windows.h:27:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\um\windows.h:171:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\shared\windef.h:24:
In file included from C:\Program Files (x86)\Windows Kits\10\include\
10.0.18362.0\shared\minwindef.h:182:
C:\Program Files (x86)\Windows Kits\10\include\10.0.18362.0\um\
winnt.h:3324:1: error: conflicting types for '_m_prefetchw'
_m_prefetchw (
^
C:\Program Files\LLVM\lib\clang\10.0.0\include\prfchwintrin.h:50:1:
note: previous definition is here
_m_prefetchw(void *__P)
^
1 error generated.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
2021-01-14 23:21:40 +01:00
Jiayu Hu
1b7b24389c vhost: enhance async enqueue for small packets
Async enqueue offloads large copies to DMA devices, and small copies
are still performed by the CPU. However, it requires users to get
enqueue completed packets by rte_vhost_poll_enqueue_completed(), even
if they are completed by the CPU when rte_vhost_submit_enqueue_burst()
returns. This design incurs extra overheads of tracking completed
pktmbufs and function calls, thus degrading performance on small packets.

This patch enhances async enqueue for small packets by enabling
rte_vhost_submit_enqueue_burst() to return completed packets.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-13 18:51:58 +01:00
Jiayu Hu
9ab57ef196 vhost: cleanup async enqueue
This patch removes unnecessary check and function calls, and it changes
appropriate types for internal variables and fixes typos.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-13 18:51:58 +01:00
Ruifeng Wang
67b68824a8 lpm/arm: support SVE
Added new path to do lpm4 lookup by using scalable vector extension.
The SVE path will be selected if compiler has flag SVE set.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-01-14 16:42:25 +01:00
Ruifeng Wang
5702b7bf1c lpm: fix vector IPv4 lookup
rte_lpm_lookupx4 could return wrong next hop when more than 256 tbl8
groups are created. This is caused by incorrect type casting of tbl8
group index that been stored in tbl24 entry. The casting caused group
index truncation and hence wrong tbl8 group been searched.

Issue fixed by applying proper mask to tbl24 entry to get tbl8 group index.

Fixes: dc81ebbaca ("lpm: extend IPv4 next hop field")
Fixes: cbc2f1dccf ("lpm/arm: support NEON")
Fixes: d2cc795934 ("lpm: add AltiVec for ppc64")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-01-14 14:19:57 +01:00
Vladimir Medvedkin
6e4d4a6381 fib6: improve AVX512 lookup performance
Improved performance for AVX512 FIB6 lookup by doubling the number
of flows being processed

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-01-13 22:13:37 +01:00
Ori Kam
1922db13bf regexdev: add resource limit reached flag
When scanning a buffer it is possible that the scan will abort
due to some internal resource limit.

This commit adds such response flag, so application can handle such cases.

Signed-off-by: Francis Kelly <fkelly@nvidia.com>
Signed-off-by: Ori Kam <orika@nvidia.com>
2021-01-12 23:31:39 +01:00
Tal Shnaiderman
b1fd151267 eal: add generic thread-local-storage functions
Add support for TLS functionality in EAL.

The following functions are added:
rte_thread_tls_key_create - create a TLS data key.
rte_thread_tls_key_delete - delete a TLS data key.
rte_thread_tls_value_set - set value bound to the TLS key
rte_thread_tls_value_get - get value bound to the TLS key

TLS key is defined by the new type rte_tls_key.

The API allocates the thread local storage (TLS) key.
Any thread of the process can subsequently use this key
to store and retrieve values that are local to the thread.

Those functions are added in addition to TLS capability
in rte_per_lcore.h to allow abstraction of the pthread
layer for all operating systems.

Windows implementation is under librte_eal/windows and
implemented using WIN32 API for Windows only.

Unix implementation is under librte_eal/unix and
implemented using pthread for UNIX compilation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-11 23:28:12 +01:00
Tal Shnaiderman
d136fae560 eal: move thread affinity functions to new file
Move the definition of the functions
rte_thread_set_affinity and rte_thread_get_affinity
to new file, rte_thread.h

The file will implement generic threading functionality
and will only host threading functions which do not reference
pthread API.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-11 23:27:39 +01:00
Maxime Coquelin
be1525c6b4 vhost: refactor memory regions mapping
This patch moves memory region mmaping and related
preparation in a dedicated function in order to simplify
VHOST_USER_SET_MEM_TABLE request handling function.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-01-08 18:07:56 +01:00
Maxime Coquelin
761ea501ce vhost: refactor postcopy registration
This patch moves the registration of postcopy to a
dedicated function, with the goal of simplifying
VHOST_USER_SET_MEM_TABLE request handling function.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-01-08 18:07:56 +01:00
Maxime Coquelin
fc2225dbc5 vhost: refactor postcopy region registration
This patch moves the registration of memory regions to
userfaultfd to a dedicated function, with the goal of
simplifying VHOST_USER_SET_MEM_TABLE request handling
function.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-01-08 18:07:56 +01:00
Joyce Kong
a33c3584f3 vhost: replace SMP with thread fence for control path
Simply replace the smp barriers with atomic thread fence for vhost control
path, if there are no synchronization points.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:56 +01:00
Joyce Kong
5faf0a9c54 vhost: replace SMP with thread fence for packed vring
Simply replace smp barriers with atomic thread fence for
virtio packed vring.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Joyce Kong
10b8c36af0 vhost: relax full barriers for used idx
Used idx can be synchronized by one-way barrier instead of full
write barrier for split vring.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Joyce Kong
9253c34cfb vhost: relax full barriers for desc flags
Relax the full read barrier to one-way barrier for desc flags in
packed vring.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Joyce Kong
2d031675b2 vhost: remove unnecessary SMP barrier for avail idx
The ordering between avail index and desc reads has been enforced
by load-acquire for split vring, so smp_rmb barrier is not needed
behind it.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Joyce Kong
8fc9eaaac7 vhost: remove unnecessary SMP barrier for desc flags
As function desc_is_avail performs a load-acquire barrier to
enforce the ordering between desc flags and desc content, it is
unnecessary to add a rte_smp_rmb barrier around the trace which
follows desc_is_avail.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-01-08 18:07:55 +01:00
Joyce Kong
4aae2397ad rcu: use EAL memory barrier API
Use rte_atomic_thread_fence wrapper which has been provided for
__atomic_thread_fence builtins to support optimized code for
__ATOMIC_SEQ_CST memory order on x86 platforms.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-01-11 15:34:21 +01:00
Ashish Sadanandan
397fb6a8d9 mbuf: add C++ include guard for dynamic fields header
The header was missing the extern "C" directive which causes name
mangling of functions by C++ compilers, leading to linker errors
complaining of undefined references to these functions.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Ashish Sadanandan <ashish.sadanandan@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-01-11 15:34:21 +01:00
Yunjian Wang
e3e9c87c0f eal/linux: fix handling of error events from epoll
The "rev->epdata.event" assigned to "events.epdata.event" directly, which
was wrong in case of epoll events. It should be set to the "evs.events".

Fixes: 9efe9c6cdc ("eal/linux: add epoll wrappers")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
2021-01-11 15:34:21 +01:00
Pallavi Kadam
aacc29dacd eal/windows: add interrupt functions stub
Add some missing interrupt implementations on Windows.
Also add respective functions to export file.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-01-05 22:45:00 +01:00
Vladimir Medvedkin
e682b02084 rib: fix insertion in some cases
According to GCC documentation for __builtin_clz:
Returns the number of leading 0-bits in x,
starting at the most significant bit position.
If x is 0, the result is undefined.
__builtin_clz will be called with 0 if the existing
prefix address matches the one we want to insert.

Fixes: 5a5793a5ff ("rib: add RIB library")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2020-12-15 10:08:39 +01:00
Nick Connolly
498ec3c6da eal/windows: fix vfprintf warning with clang
When building with clang (11.0,--buildtype=debug), eal_lcore.c
produces a -Wformat-nonliteral warning from the vfprintf call
in log_early.

Add __rte_format_printf annotation.

Fixes: b8a36b0866 ("eal/windows: improve CPU and NUMA node detection")
Cc: stable@dpdk.org

Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2020-12-07 21:24:57 +01:00
Nick Connolly
a7288328a9 eal/windows: fix debug build with MinGW
Compiling with MinGW in --buildtype=debug produces a redefinition
error for strncasecmp.

The root cause is that rte_os.h shouldn't be injecting POSIX definitions
into the environment.  It is the applications responsibility to decide
how to handle missing functionality.

Resolving this properly will require further work, but in the meantime
wrap all such definitions with #ifndef/#endif.  This resolves the specific
issue with strncasecmp and handles similar issues that applications may
encounter.

Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
2020-12-07 20:46:33 +01:00
Dmitry Kozlyuk
93bf432dd5 eal/windows: fix build with MinGW-w64 8
MinGW-w64 above 8.0.0 exposes VirtualAlloc2() API in headers, but lacks
it in import libraries. Hence, availability of this API at compile-time
can't be used to choose between locating VirtualAlloc2() manually or
relying on the dynamic linker.

Fix redefinition compile-time errors.
Always link VirtualAlloc2() when using GCC.

Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org

Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Thomas Monjalon <thomas@monjalon.net>
2020-12-07 14:00:22 +01:00
Andrew Rybchenko
8ca9bf26f5 ethdev: deprecate shared counters using action attribute
A new generic shared actions API may be used to create shared
counter. There is no point to keep duplicate COUNT action specific
capability to create shared counters.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-11-27 19:16:45 +01:00
Ruifeng Wang
d123fd111a eal/arm: fix build with gcc optimization level 0
GCC build with '-O0' on platforms with RTE_ARM_FEATURE_ATOMICS set
failed for:
 ../lib/librte_efd/rte_efd.c
 Assembler messages:
3866: Error: selected processor does not support `crc32cb w0,w0,w1'
3890: Error: selected processor does not support `crc32ch w0,w0,w1'
3914: Error: selected processor does not support `crc32cw w0,w0,w1'
3938: Error: selected processor does not support `crc32cx w0,w0,x1'

This was caused by an architecture specifier added for Clang.
Unlike Clang, GCC considers each inline assembly block to be dependent
and therefore, the architecture specifier impacts assemble of some
blocks require certain extension support.

Removed the architecture for GCC to fix the issue.

Fixes: 8fce34cd0a ("eal/arm: fix clang build of native target")
Cc: stable@dpdk.org

Reported-by: Feifei Wang <feifei.wang2@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-11-27 16:51:46 +01:00
Olivier Matz
44d00a1d12 doc: add missing network layers in API index
Add missing files in doxy-api-index.md and add a short description
for files that hadn't one.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-27 01:51:27 +01:00
Viacheslav Ovsiienko
0d6ce665ee net: fix eCPRI header generic data field
There was a typo in eCPRI header definition.

Fixes: d164c609e7 ("ethdev: add eCPRI key fields to flow API")
Cc: stable@dpdk.org

Reported-by: Rani Sharoni <ranish@nvidia.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Reviewed-by: Bing Zhao <bingz@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-26 01:14:11 +01:00
Timothy Redaelli
c57f6e5c60 eal: fix plugin loading
Commit 49b536fc30 ("eal: load only shared libs from driver plugin directories")
introduced a check that any shared library must ends with .so, but it can't
work, at least, on Fedora/CentOS/RHEL since .so symlinks are not installed
when you install dpdk package, but only when you install dpdk-devel package.

This commit adds also a check for .so.ABI_VERSION to check for shared lib.

See Fedora Packaging Guidelines for more information:
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages

Fixes: 49b536fc30 ("eal: load only shared libs from driver plugin directories")
Cc: stable@dpdk.org

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2020-11-26 00:00:06 +01:00
Timothy Redaelli
7781950f4d eal: fix shared lib mode detection
Commit 06c7871dde ("eal: restrict default plugin path to shared lib mode")
introduced a check that enabled shared lib mode when librte_eal.so can
be loaded, but it can't work, at least, on Fedora/CentOS/RHEL since .so
symlinks are not installed when you install dpdk package, but only when
you install dpdk-devel package.

This commit uses librte_eal.so.ABI_VERSION to check for shared lib,
since it exists on any linux distributions.

See Fedora Packaging Guidelines for more information:
https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages

Fixes: 06c7871dde ("eal: restrict default plugin path to shared lib mode")
Cc: stable@dpdk.org

Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2020-11-25 23:54:12 +01:00
Diogo Behrens
021b698eb5 mcslock: fix hang in weak memory model
The initialization me->locked=1 in lock() must happen before
next->locked=0 in unlock(), otherwise a thread may hang forever,
waiting me->locked become 0. On weak memory systems (such as ARMv8),
the current implementation allows me->locked=1 to be reordered with
announcing the node (pred->next=me) and, consequently, to be
reordered with next->locked=0 in unlock().

This fix adds a release barrier to pred->next=me, forcing
me->locked=1 to happen before this operation.

Fixes: 2173f3333b ("mcslock: add MCS queued lock implementation")
Cc: stable@dpdk.org

Signed-off-by: Diogo Behrens <diogo.behrens@huawei.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-11-25 17:30:04 +01:00
Nick Connolly
141492be99 eal/windows: fix linkage with MinGW
Linking with the 'pci' driver when building with MinGW on
Windows fails with undefined symbol 'GUID_DEVCLASS_NET'.
This occurs because devguid.h is included in rte_windows.h
before INITGUID is defined.

Move the include of devguid.h after the definition of INITGUID.

Fixes: b762221ac2 ("bus/pci: support Windows with bifurcated drivers")
Cc: stable@dpdk.org

Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
2020-11-22 18:55:01 +01:00
Yunjian Wang
a99d2521a3 malloc: fix style in free list index computation
Cleanup code style issue reported by kernel checkpatch. As follows:
  * ERROR:CODE_INDENT: code indent should use tabs where possible
  * ERROR:SPACING: spaces required around that '?' (ctx:VxE)
  * WARNING:INDENTED_LABEL: labels should not be indented

Fixes: b0489e7bca ("malloc: fix linear complexity")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-11-22 18:52:07 +01:00
Thomas Monjalon
dc328d1c55 ethdev: rename a flow shared action error code
In the experimental function rte_flow_shared_action_destroy()
introduced in DPDK 20.11, the errno ETOOMANYREFS was used.
This errno is not always available on Windows,
so it is preferred using EBUSY instead.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-20 21:10:05 +01:00
Simei Su
46914aa1c7 ethdev: add eCPRI RSS offload type
This patch defines new RSS offload types for eCPRI. For eCPRI with
Message Type 0, the hash field is physical channel ID.

Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-20 21:10:05 +01:00
Thomas Monjalon
135155a836 build: align wording of non-support reasons
Reasons for building not supported generally start with lowercase
because printed as the second part of a line.

Other changes:
	- "linux" should be "Linux" with a capital letter.
	- ARCH_X86_64 may be simply x86_64.
	- aarch64 is preferred over arm64.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-11-20 16:05:35 +01:00
Tal Shnaiderman
ba5b133e33 eal/windows: remove definition of ETOOMANYREFS
The definition of ETOOMANYREFS is reverted as it breaks build of
external applications already defining it.

Fixes: c917b54b0c ("eal/windows: add definition of ETOOMANYREFS")

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Reviewed-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-11-20 15:59:02 +01:00
Tal Shnaiderman
c917b54b0c eal/windows: add definition of ETOOMANYREFS
The ETOOMANYREFS errno was missing from the Windows build.
It is used in initialization of flow error structures.

It is defined with the same error code used by WSAETOOMANYREFS.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2020-11-16 00:14:07 +01:00
Stephen Hemminger
db27370b57 eal: replace blacklist/whitelist options
Replace -w / --pci-whitelist with -a / --allow options
and --pci-blacklist with --block.
The -b short option remains unchanged.

Allow the old options for now, but print a nag
warning since old options are deprecated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-11-16 00:11:22 +01:00
Stephen Hemminger
a65a34a85e eal: replace usage of blacklist/whitelist in enums
Rename the enum values in the EAL include files.
As a backward compatible temporary migration tool, define
a replacement mapping for old values.

The old names relating to blacklist and whitelist are replaced
by block list and allow list, but applications may be using the
older compatibility macros, marked as deprecated.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Gaetan Rivet <grive@u256.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-11-16 00:11:22 +01:00
Cristian Dumitrescu
27cc077922 pipeline: fix multiple SWX emit pattern detection
Fix the detection of instruction pattern with multiple emits followed
by TX. Once detected, this is one of the instruction patterns that is
internally replaced with a single optimized instruction, as long as
none of the instructions to be replaced is referenced by a jump
instruction. The fix enforces this check for the TX instruction of
the pattern.

Fixes: 31035e87b2 ("pipeline: add SWX instruction optimizer")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-11-15 16:46:37 +01:00
Maxime Coquelin
bc900f86aa vhost: fix fd leak in kick setup
This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_vring_kick().

Fixes: 4796ad63ba ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
2020-11-13 19:43:27 +01:00
Maxime Coquelin
6dc3f119ce vhost: fix fd leak in dirty logging setup
This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_log_base().

Fixes: 4796ad63ba ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org

Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
2020-11-13 19:43:26 +01:00
Maxime Coquelin
726a14eb83 vhost: fix error path when setting memory tables
If an error is encountered before the memory regions are
parsed, the file descriptors for these shared buffers are
leaked.

This patch fixes this by closing the message file descriptors
on error, taking care of avoiding double closing of the file
descriptors. guest_pages is also freed, even though it was not
leaked as its pointer was not overridden on subsequent function
calls.

Fixes: 8f972312b8 ("vhost: support vhost-user")
Cc: stable@dpdk.org

Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
2020-11-13 19:43:26 +01:00
Maxime Coquelin
7804bbd13a vhost: fix virtqueue initialization
This patches fixes virtqueue initialization issue causing
segfault or file descriptor being closed unexpectedly.

The wrong index was passed to init_vring_queue() by
alloc_vring_queue() when a hole in the virtqueue array was
met.

Fixes: 8acd7c2133 ("vhost: fix virtqueues metadata allocation")
Cc: stable@dpdk.org

Reported-by: Yu Jiang <yux.jiang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>
2020-11-13 19:43:25 +01:00
Patrick Fu
45ba914134 vhost: fix async inflight packet counter
Async inflight packet counter should take failed packets into account.
Failed packets will be deducted in the error handling logic.

Fixes: 6b3c81db8b ("vhost: simplify async copy completion")
Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2020-11-13 19:43:25 +01:00
Yi Yang
b605df71be gro: fix packet type detection with IPv6 tunnel
For VxLAN packets, GRO will mistakenly reassemble them
if inner L3 is IPv6, inner L4 is TCP or UDP, and outer L3
is IPv4 because the value of IS_IPV4_VXLAN_TCP4/UDP4_PKT
is true for them.

This fix makes sure IS_IPV4_TCP_PKT, IS_IPV4_UDP_PKT,
IS_IPV4_VXLAN_TCP4_PKT and IS_IPV4_VXLAN_UDP4_PKT can make
decision precisely.

Fixes: e2d8110636 ("gro: support VXLAN UDP/IPv4")
Fixes: 1ca5e67408 ("gro: support UDP/IPv4")
Fixes: 9e0b9d2ec0 ("gro: support VxLAN GRO")
Fixes: 0d2cbe59b7 ("lib/gro: support TCP/IPv4")
Cc: stable@dpdk.org

Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
2020-11-14 10:56:30 +01:00
Nick Connolly
52a7cb0ad0 build: fix MS linker flag with meson 0.54
Meson versions >= 0.54.0 include support for handling /implib
with msvc link. Specifying it explicitly causes failures when
linking against the dll. Tested using Link 14.27.29112.0 and
Clang 11.0.0.

There were a number of changes to the way that import libraries
are handled between 0.47.1 and 0.54.0. Only make the change
for >= 0.54.0, leaving the behaviour unchanged for earlier
versions.

Fixes: 77cca7ccec ("build: fix drivers library path on Windows")
Cc: stable@dpdk.org

Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Khoa To <khot@microsoft.com>
2020-11-13 15:13:16 +01:00
Cristian Dumitrescu
5725870c8c table: fix exact match SWX table lookup
Fix for the exact match lookup function.

Fixes: d0a0096661 ("table: add exact match SWX table")

Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2020-11-13 13:55:07 +01:00
Ruifeng Wang
8fce34cd0a eal/arm: fix clang build of native target
When doing Clang build with '-mcpu=native' on N1 platform, build failed
with:
../lib/librte_eal/arm/include/rte_atomic_64.h:76:39:
	error: instruction requires: lse
__ATOMIC128_CAS_OP(__cas_128_release, "caspl")

This is because native detection for Neoverse N1 was added in Clang-11.
Prior version of Clang's assembler doesn't know LSE support on hardware.
Fixed this for Clang earlier than version 11 by specifying architecture
for assembler.
Referred to [1] for this fix.

Fixes: 7e2c3e17fe ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e0d5896bd356cd577f9710a02d7a474cdf58426b

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-11-13 10:30:48 +01:00
David Christensen
2d0a972cd6 vfio: use static window sizing for sPAPR IOMMU
The SPAPR IOMMU requires that a DMA window size be defined before memory
can be mapped for DMA. Current code dynamically modifies the DMA window
size in response to every new memory allocation which is potentially
dangerous because all existing mappings need to be unmapped/remapped in
order to resize the DMA window, leaving hardware holding IOVA addresses
that are temporarily unmapped.  The new SPAPR code statically assigns
the DMA window size on first use, using the largest physical memory
memory address when IOVA=PA and the highest existing memseg virtual
address when IOVA=VA.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2020-11-13 09:35:18 +01:00
Thomas Monjalon
4630290af4 mbuf: move pool pointer in first half
According to the Technical Board decision
(http://mails.dpdk.org/archives/dev/2020-November/191859.html),
the mempool pointer in the mbuf struct is moved
from the second to the first half.
It may increase performance in some cases
on systems having 64-byte cache line, i.e. mbuf split in two cache lines.

Due to this change, all fields after "pool" are moved up.
Hopefully no vector data path is impacted.

Moving this field gives more space to dynfield1
while dropping the temporary dynfield0.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
 0    void *                            buf_addr;         /*   0 +  8 */
 1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
 2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
 3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
 4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
 5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
 5.5  uint64_t             union        hash;             /*  44 +  8 */
 6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
 7    struct rte_mempool *              pool;             /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
 8    struct rte_mbuf *                 next;             /*  64 +  8 */
 9    uint64_t             union        tx_offload;       /*  72 +  8 */
10    struct rte_mbuf_ext_shared_info * shinfo;           /*  80 +  8 */
11    uint16_t                          priv_size;        /*  88 +  2 */
      uint16_t                          timesync;         /*  90 +  2 */
11.5  uint32_t                          dynfield1[9];     /*  92 + 36 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2020-11-12 16:39:10 +01:00
Morten Brørup
3a35c1c0f6 mbuf: clean up comments and prefix
The mbuf header files had some commenting style errors that affected the
API documentation.
Also, the RTE_ prefix was missing on a macro and a definition.

Note: This patch does not touch the offload and attachment flags that are
also missing the RTE_ prefix.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-05 17:53:15 +01:00
Dmitry Kozlyuk
c2341bb671 cmdline: avoid name clash with Windows system types
cmdline_numtype member names clash with Windows system identifiers.
Add RTE_ prefix to cmdline constants to avoid this and possible
future conflicts.

Suggested-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Jie Zhou <jizh@microsoft.com>
Tested-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-05 17:49:00 +01:00
Olivier Matz
f8e90bcc1d eal: fix MCS lock and ticketlock headers install
Add missing arch-specific headers in meson.build.

Fixes: 2173f3333b ("mcslock: add MCS queued lock implementation")
Fixes: ca49b92079 ("ticketlock: enable generic ticketlock on all arch")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-11-05 12:08:19 +01:00
Stephen Hemminger
0429a2e1a4 mbuf: fix dynamic fields and flags with multiprocess
The dynamic flag management is broken if rte_mbuf_dynflag_lookup()
is done in a secondary process because the local pointer to
the memzone is not ever initialized.

Fix it by using the same checks as dynfield_register().
I.e if shared memory zone has not been looked up already,
then discover it.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-11-04 20:47:14 +01:00
Yunjian Wang
4bb02a6f5b ethdev: fix data type for port id
The ethdev port id is 16 bits now. This patch fixes the data type
of the variable for 'pid', which changing from uint32_t to uint16_t.

RTE_MAX_ETHPORTS is the maximum number of ports, which customized by
the user. To avoid 16-bit unsigned integer overflow, the valid value
of RTE_MAX_ETHPORTS should be set from 0 to UINT16_MAX, and it is
safer to cut one more port from space.

So we use RTE_BUILD_BUG_ON() to ensure that RTE_MAX_ETHPORTS is less
to UINT16_MAX.

Fixes: 5b7ba31148 ("ethdev: add port ownership")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-04 14:48:45 +01:00
Yunjian Wang
b9b67d0674 ethdev: fix using Rx split config before null check
Coverity flags that 'rx_conf' variable is used before
it's checked for NULL. This patch fixes this issue.

Coverity issue: 363570
Fixes: 4ff702b5df ("ethdev: introduce Rx buffer split")

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-04 12:21:06 +01:00
Ivan Malov
9ff91c0d81 ethdev: introduce transfer attribute to shared action conf
In a flow rule, attribute "transfer" means operation level
at which both traffic is matched and actions are conducted.

Add the very same attribute to shared action configuration.
If a driver needs to prepare HW resources in two different
ways, depending on the operation level, in order to set up
an action, then this new attribute will indicate the level.
Also, when handling a flow rule insertion, the driver will
be able to turn down a shared action if its level is unfit.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrey Vesnovaty <andreyv@nvidia.com>
2020-11-03 23:35:07 +01:00
Morten Brørup
4def9a8281 ethdev: document Rx packet number requirement for vector Rx
Updated description of rte_eth_rx_burst() to reflect what drivers,
when using vector instructions, expect from nb_pkts.

Also discussed on the mailing list here:
http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35C61257@smartserver.smartshare.dk/

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-11-03 23:35:06 +01:00
Andrew Rybchenko
8fb8efd4e4 ethdev: move L2 tunnel config structure to ixgbe driver
net/ixgbe driver is the only user of the struct rte_eth_l2_tunnel_conf.
Move it to the driver and use ixgbe_ prefix instead of rte_eth_.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:06 +01:00
Andrew Rybchenko
cf47acc0f9 ethdev: remove L2 tunnel offload control API
Remove rte_eth_dev_l2_tunnel_offload_set() and corresponding
ethdev driver operation.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:06 +01:00
Andrew Rybchenko
99a1b6895f ethdev: remove API to config L2 tunnel EtherType
Remove rte_eth_dev_l2_tunnel_eth_type_conf() and corresponding
ethdev driver operation.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:06 +01:00
Andrew Rybchenko
0b46e9b411 ethdev: remove legacy filter API functions
The legacy filter API, including rte_eth_dev_filter_supported() and
rte_eth_dev_filter_ctrl() is removed. Flow API should be used.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:05 +01:00
Andrew Rybchenko
1be514fbce ethdev: remove legacy FDIR filter type support
Instead of FDIR filters RTE flow API should be used.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:05 +01:00
Andrew Rybchenko
0cd2ff37c3 ethdev: remove legacy global filter configuration support
Global filter configuration request was supported by net/i40e
driver only to configure GRE key length.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-11-03 23:35:05 +01:00