6937 Commits

Author SHA1 Message Date
Matan Azrad
07b0b75370 cryptodev: formalize key wrap method in API
The Key Wrap approach is used by applications in order to protect keys
located in untrusted storage or transmitted over untrusted
communications networks. The constructions are typically built from
standard primitives such as block ciphers and cryptographic hash
functions.

The Key Wrap method and its parameters are a secret between the keys
provider and the device, means that the device is preconfigured for
this method using very secured way.

The key wrap method may change the key length and layout.

Add a description for the cipher transformation key to allow wrapped key
to be forwarded by the same API.

Add a new feature flag RTE_CRYPTODEV_FF_CIPHER_WRAPPED_KEY to be enabled
by PMDs support wrapped key in cipher trasformation.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-04-16 12:43:33 +02:00
Fan Zhang
c21574edc5 cryptodev: add dequeue count parameter in raw API
This patch changes the experimental raw data path dequeue burst API.
Originally the API enforces the user to provide callback function
to get maximum dequeue count. This change gives the user one more
option to pass directly the expected dequeue count.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-04-16 12:43:33 +02:00
Matan Azrad
d014dddb2d cryptodev: support multiple cipher data-units
In cryptography, a block cipher is a deterministic algorithm operating
on fixed-length groups of bits, called blocks.

A block cipher consists of two paired algorithms, one for encryption
and the other for decryption. Both algorithms accept two inputs:
an input block of size n bits and a key of size k bits; and both yield
an n-bit output block. The decryption algorithm is defined to be the
inverse function of the encryption.

For AES standard the block size is 16 bytes.
For AES in XTS mode, the data to be encrypted\decrypted does not have to
be multiple of 16B size, the unit of data is called data-unit.
The data-unit size can be any size in range [16B, 2^24B], so, in this
case, a data stream is divided into N amount of equal data-units and
must be encrypted\decrypted in the same data-unit resolution.

For ABI compatibility reason, the size is limited to 64K (16-bit field).
The new field dataunit_len is inserted in a struct padding hole,
which is only 2 bytes long in 32-bit build.
It could be moved and extended later during an ABI-breakage window.

The current cryptodev API doesn't allow the user to select a specific
data-unit length supported by the devices.
In addition, there is no definition how the IV is detected per data-unit
when single operation includes more than one data-unit.

That causes applications to use single operation per data-unit even though
all the data is continuous in memory what reduces datapath performance.

Add a new feature flag to support multiple data-unit sizes, called
RTE_CRYPTODEV_FF_CIPHER_MULTIPLE_DATA_UNITS.
Add a new field in cipher capability, called dataunit_set,
where the devices can report the range of the supported data-unit sizes.
Add a new cipher transformation field, called dataunit_len, where the user
can select the data-unit length for all the operations.

All the new fields do not change the size of their structures,
by filling some struct padding holes.
They are added as exceptions in the ABI check file libabigail.abignore.

Using a bitmap to report the supported data-unit sizes capability allows
the devices to report a range simply as same as the user to read it
simply. also, thus sizes are usually common and probably will be shared
among different devices.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-04-16 12:43:33 +02:00
Nicolas Chautru
48fc315f02 bbdev: add explicit enum for code block mode
Using explicit enum instead of ambiguous integer value

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-04-16 12:43:33 +02:00
Anatoly Burakov
2414ce9b7b power: fix closing frequency file
Currently, we open the system base frequency file, but never close it,
which results in a memory leak.

Coverity issue: 369693
Fixes: 8a5febaac4f7 ("power: fix P-state base frequency handling")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
2021-04-15 23:53:39 +02:00
Anatoly Burakov
64f22b91c6 power: remove redundant close of frequency file
Previous fix has addressed the incorrect handling of `base_frequency`
file, but has added a use-after-free error due to the fact that all
further code paths will lead to an `fclose()` call at the end, so the
additional `fclose()` call right after processing the file was
unnecessary.

Coverity issue: 369901
Fixes: 8a5febaac4f7 ("power: fix P-state base frequency handling")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Liang Ma <liangma@liangbit.com>
Acked-by: David Hunt <david.hunt@intel.com>
2021-04-15 23:53:33 +02:00
Tyler Retzlaff
a6f88f7eb9 eal: add C++ include guard for reciprocal header
Add missing extern "C" linkage for rte_reciprocal.h consistent with
other eal headers.

Fixes: ffe3ec811ef5 ("sched: introduce reciprocal divide")
Cc: stable@dpdk.org

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-04-15 16:44:18 +02:00
Dmitry Kozlyuk
89813a522e net: provide IP-related API on any OS
Users of <rte_ip.h> relied on it to provide IP-related defines,
like IPPROTO_* constants, but still had to include POSIX headers
for inet_pton() and other standard IP-related facilities.

Extend <rte_ip.h> so that it is a single header to gain access
to IP-related facilities on any OS. Use it to replace POSIX includes
in components enabled on Windows. Move missing constants from Windows
networking shim to OS shim header and include it where needed.

Remove Windows networking shim that is no longer needed.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:43 +02:00
Dmitry Kozlyuk
6c068dbd9f net: work around s_addr macro on Windows
Windows Sockets headers contain `#define s_addr S_un.S_addr`, which
conflicts with definition of `s_addr` field of `struct rte_ether_hdr`.
Prieviously `s_addr` was undefined in <rte_ether.h>, which had been
breaking access to `s_addr` field of `struct in_addr`, so some DPDK
and Windows headers could not be included in one file.

Renaming of `struct rte_ether_hdr` is planned:
https://mails.dpdk.org/archives/dev/2021-March/201444.html

Temporarily disable `s_addr` macro around `struct rte_ether_hdr`
definition to avoid conflict. Place source MAC address in both `s_addr`
and `S_un.S_addr` fields, so that access works either directly or
through the macro as defined in Windows headers.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-04-15 01:56:40 +02:00
Dmitry Kozlyuk
45d62067c2 eal: make OS shims internal
DPDK code often relies on functions and macros that are not standard C,
but are found on all platforms, even if by slightly different names.
Windows <rte_os.h> provided macros or inline definitions for such symbols.
However, when placed in public header, these symbols were unnecessarily
exposed, breaking consumer POSIX compatibility code.

Move most of the shims to <rte_os_shim.h>, a header to be used instead
of <rte_os.h> by internal code. Include it in libraries and PMDs that
previously imported shims from <rte_os.h>. Directly replace shims that
were only used inside EAL:
* index -> strchr, rindex -> strrchr
* sleep -> rte_delay_us_sleep
* strerror_r -> strerror_s

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:20 +02:00
Dmitry Kozlyuk
9ec521006d eal/windows: hide asprintf shim
Make asprintf(3) implementation for Windows private to EAL, so that it's
hidden from external consumers. It is not exposed to internal consumers
either, because they don't need asprintf() and also because callers from
other modules would have no reliable way to free allocated memory.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Khoa To <khot@microsoft.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-04-15 01:56:06 +02:00
Xueming Li
3ab385063c kvargs: add get by key
Adds a new function to get value of a specific key from kvargs list.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:22 +02:00
Xueming Li
e132ee8690 devargs: fix memory leak on parsing failure
This patch fixes memory leak in parsing error handling.

Fixes: 338327d731e6 ("devargs: add function to parse device layers")
Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:15 +02:00
Xueming Li
64051bb1f1 devargs: unify scratch buffer storage
In current design, legacy parser rte_devargs_parse() saved scratch
buffer to devargs.args while new parser rte_devargs_layers_parse() saved
to devargs.data. Code using devargs had to know the difference and
cleaned up memory accordingly - error prone.

This patch unifies scratch buffer to data field, introduces
rte_devargs_reset() function to wrap the memory clean up logic.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-04-14 22:25:08 +02:00
Stephen Hemminger
9667d97c25 pflock: add phase-fair reader writer locks
This is a new type of reader-writer lock that provides better fairness
guarantees which better suited for typical DPDK applications.
A pflock has two ticket pools, one for readers and one
for writers.

Phase-fair reader writer locks ensure that neither reader nor writer will
be starved.
Neither reader or writer are preferred, they execute in alternating
phases.
All operations of the same type (reader or writer) that acquire the lock
are handled in FIFO order.
Write operations are exclusive, and multiple read operations can be run
together (until a write arrives).

A similar implementation is in Concurrency Kit package in FreeBSD.
For more information see:
   "Reader-Writer Synchronization for Shared-Memory Multiprocessor
    Real-Time Systems",
    http://www.cs.unc.edu/~anderson/papers/ecrts09b.pdf

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-04-14 21:59:47 +02:00
Pavan Nikhilesh
206c562e4b eventdev: fix build on RHEL 7
Since queue identifier is passed as signed integer, a compilation error
is generated:
rte_event_eth_rx_adapter.c:1810:57: error: signed and unsigned type
in conditional expression [-Werror=sign-compare]
Make queue identifier as unsigned when adding it to vector data.

Bugzilla ID: 672
Fixes: d7c428e557ba ("eventdev: support Rx adapter event vector")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-04-14 10:02:05 +02:00
Tyler Retzlaff
7498c2d74e eal: do not redefine asm keyword in C++
C++ forbids redefining a keyword as a macro.
The keyword asm is conditionally-supported and implementation defined,
but it seems our best guess.

In C, if asm does not exist, it is defined as __asm__
which is a GNU extension.

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-13 15:32:48 +02:00
Pavan Nikhilesh
bc7d4b0346 eventdev: support Tx adapter event vector
Add event vector support for event eth Tx adapter, the implementation
receives events from the single linked queue and based on
rte_event_vector::attr_valid transmits the vector of mbufs to a given
port, queue pair.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
d7c428e557 eventdev: support Rx adapter event vector
Add event vector support for event eth Rx adapter, the implementation
creates vector flows based on port and queue identifier of the received
mbufs.
The flow id for SW Rx event vectorization will use 12-bits of queue
identifier and 8-bits port identifier when custom flow id is not set
for simplicity.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
3da4060a30 eventdev: introduce event vector Tx capability
Introduce event vector transmit capability for event eth
tx adapter.

The capability indicates that the Tx adapter is capable of
transmitting event vectors.
When rte_event_vector::union_valid is set, the Tx adapter should
transmit all the packets to the rte_event_vector::port using the
rte_event_vector::queue.
If rte_event_vector::union_valid is not set then the Tx adapter
should peek into each mbuf to get the destination port and queue
pair.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
3c838062b9 eventdev: introduce event vector Rx capability
Introduce event ethernet Rx adapter event vector capability.

If an event eth Rx adapter has the capability of
RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR then a given Rx queue
can be configured to enable event vectorization by passing the
flag RTE_EVENT_ETH_RX_ADAPTER_QUEUE_EVENT_VECTOR to
rte_event_eth_rx_adapter_queue_conf::rx_queue_flags while configuring
Rx adapter through rte_event_eth_rx_adapter_queue_add().

The max vector size, vector timeout define the vector size and
mempool used for allocating vector event are configured through
rte_event_eth_rx_adapter_queue_add. The element size of the element
in the vector pool should be equal to
    sizeof(struct rte_event_vector) + (vector_sz * sizeof(uintptr_t))

Application can use `rte_event_vector_pool_create` to create the
vector mempool used for
rte_event_eth_rx_adapter_queue_conf::vector_mp.

The Rx adapter would be responsible for vectorizing the mbufs
based on the flow, the vector limits configured by the application
and add the vector event of mbufs to the event queue set via
rte_event_eth_rx_adapter_queue_conf::ev::queue_id.
It should also mark rte_event_vector::union_valid and fill
rte_event_vector::port, rte_event_vector::queue.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Pavan Nikhilesh
1cc44d4092 eventdev: introduce event vector capability
Introduce rte_event_vector datastructure which is capable of holding
multiple uintptr_t of the same flow thereby allowing applications
to vectorize their pipeline and reducing the complexity of pipelining
the events across multiple stages.
This approach also reduces the scheduling overhead on a event device.

Add a event vector mempool create handler to create mempools based on
the best mempool ops available on a given platform.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-04-12 09:23:34 +02:00
Shijith Thotton
a10d79a60b eventdev: introduce adapter flags for periodic mode
A timer adapter in periodic mode can be used to arm periodic timers.
This patch adds flags used to advertise capability and configure timer
adapter in periodic mode. Capability flag should be set for adapters
which support periodic mode.

Below is a programming sequence on the usage:
	/* check for periodic mode support by reading capability. */
	rte_event_timer_adapter_caps_get(...);

	/* create adapter in periodic mode by setting periodic flag
	   (RTE_EVENT_TIMER_ADAPTER_F_PERIODIC) and resolution. */
	rte_event_timer_adapter_create_ext(...);

	/* arm periodic timer of configured resolution */
	rte_event_timer_arm_burst(...);

	/* timer event will be periodically generated at configured
	   resolution till cancel is called. */
	while (running) { rte_event_dequeue_burst(...); }

	/* cancel periodic timer which stops generating events */
	rte_event_timer_cancel_burst(...);

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-04-12 09:23:34 +02:00
Tal Shnaiderman
13e06abc9d eal/windows: fix return codes of pthread shim layer
The macro definitions of the following pthread functions
return incorrect values from the inner function return code.

While pthread_barrier_init(), pthread_barrier_destroy() and
pthread_cancel() return 0 in a case of success and non-zero (errno) value
otherwise the shimming functions InitializeSynchronizationBarrier,
DeleteSynchronizationBarrier and TerminateThread return FALSE (0)
in a case of failure and TRUE(1) in a case of success.

This issue was undetected as none of the functions return codes were
checked until such check was added in
commit 34cc55cce6b1 ("eal: fix race in control thread creation")
exposing the issue by failing pthread_barrier_init()
and rte_eal_init() on Windows as a result.

The fix aligned the return value of the 3 function with the expected
pthread API return values.

Fixes: e8428a9d89f1 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
2021-04-12 22:35:31 +02:00
Chengchang Tang
18239da4ac ethdev: validate input in EEPROM info
This patch adds validity check of input pointer in EEPROM dump API.

Fixes: 7a3f27cbf59b ("ethdev: add access to specific device info")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Chengchang Tang
94af45f400 ethdev: validate input in register info
This patch adds validity check of input pointer in regs dump API.

Fixes: 7a3f27cbf59b ("ethdev: add access to specific device info")
Fixes: 936eda25e8da ("net/hns3: support dump register")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Chengchang Tang
e2bd08d569 ethdev: validate input in module EEPROM dump
The validity verification of input parameters should be performed at
API layer, not in the PMD.

Fixes: 3a18c44b45df ("ethdev: add access to EEPROM")
Fixes: 40ff8b305ab8 ("net/e1000: add module EEPROM callbacks for e1000")
Fixes: f2088e785cca ("net/i40e: fix dereference before check when getting EEPROM")
Fixes: b74d0cd43e37 ("net/ixgbe: add module EEPROM callbacks for ixgbe")
Fixes: 8a6a09f853a0 ("net/mlx5: support reading module EEPROM data")
Fixes: 58f6f93c34c1 ("net/octeontx2: add module EEPROM dump")
Cc: stable@dpdk.org

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-08 00:26:39 +02:00
Junjie Wan
968bbc7e2e vhost: avoid IOTLB mempool allocation while IOMMU disabled
If vhost device's IOMMU feature is disabled, IOTLB mempool allocation
is unnecessary.

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Junjie Wan <wanjunjie@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-04-07 08:54:05 +02:00
Marvin Liu
98da5545be vhost: fix initialization of async temporary header
This patch fixes coverity issue in async enqueue function by adding
initialization step before using temporary virtio header.

Coverity issue: 366123
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-04-07 08:41:30 +02:00
Marvin Liu
5b784a2d80 vhost: fix initialization of temporary header
This patch fixs coverity issue by adding initialization step before
using temporary virtio header.

Coverity issue: 366181
Fixes: fb3815cc614d ("vhost: handle virtually non-contiguous buffers in Rx-mrg")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-04-07 08:41:30 +02:00
Qi Zhang
bb6270dab3 ethdev: refine debug build option
PMDs use RTE_LIBRTE_<PMD_NAME>_DEBUG_RX|TX as build option to wrap
data path debug code. As .config has been removed since the meson build,
It is not friendly for new DPDK users to notice those debug options.

The patch introduces below build options for data path debug, so PMD
can choose to reuse them to avoid maintain their own.

- RTE_ETHDEV_DEBUG_RX
- RTE_ETHDEV_DEBUG_TX

All the build options are documented at programming guide
"3.1 Driver Option", so users can easily find them.

The original undocumented RTE_LIBRTE_ETHDEV_DEBUG will alias to
both RTE_ETHDEV_DEBUG_RX and RTE_ETHDEV_DEBUG_TX for backward
compatibility.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-04-01 16:10:20 +02:00
Bruce Richardson
720dfda455 build: limit symbol checks to developer mode
The checking of symbols within each library and driver is only of
interest to developers, so limit to developer mode only.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-09 19:07:25 +02:00
Bruce Richardson
d317f7ebd4 build: hide debug messages in non-developer mode
The messages about what components have what dependency names, and
information about function versioning not being supported on windows are
only of interest to developers, so hide them when building in
non-developer mode.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-09 19:07:25 +02:00
Luc Pelletier
af68c1d699 eal: fix hang in control thread creation
The affinity of a control thread is set after it has been launched. If
setting the affinity fails, pthread_cancel is called followed by a call
to pthread_join, which can hang forever if the thread's start routine
doesn't call a pthread cancellation point.

This patch modifies the logic so that the control thread exits
gracefully if the affinity cannot be set successfully and removes the
call to pthread_cancel.

Fixes: 6383d2642b62 ("eal: set name when creating a control thread")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-04-09 16:37:55 +02:00
Luc Pelletier
34cc55cce6 eal: fix race in control thread creation
The creation of control threads uses a pthread barrier for
synchronization. This patch fixes a race condition where the pthread
barrier could get destroyed while one of the threads has not yet
returned from the pthread_barrier_wait function, which could result in
undefined behaviour.

Fixes: 3a0d465d4c53 ("eal: fix use-after-free on control thread creation")
Cc: stable@dpdk.org

Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-04-09 16:36:17 +02:00
Thomas Monjalon
76d409ce6e vfio: reformat logs
The log messages had various issues:
- split on 2 lines, making search (grep) difficult
- long lines (can be split after the string)
- indented for no good reason (parent message may have higher log level)
- inconsistent use of __func__, not meaningful context for user
- lack of context (general message not mentioning VFIO)
- log level too high (more below)

Message having its level decreased from WARNING to NOTICE:
	"not managed by VFIO driver, skipping"
Message having its level decreased from INFO to DEBUG:
	"Probing VFIO support..."

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-04-09 15:38:26 +02:00
David Marchand
3e08081637 eal: fix evaluation of log level option
--log-level option is handled early, no need to reevaluate it later in
EAL init.

Before:
$ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug \
  --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \
  |& grep -i log.level

EAL: lib.eal log level changed from info to debug
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.telemetry log level changed from disabled to warning

After:
$ echo quit | ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug \
  --log-level=lib.ethdev:debug --log-level=lib.ethdev:info \
  |& grep -i log.level

EAL: lib.eal log level changed from info to debug
EAL: lib.ethdev log level changed from info to debug
EAL: lib.ethdev log level changed from debug to info
EAL: lib.telemetry log level changed from disabled to warning

Fixes: 6c7216eefd63 ("eal: fix log level of early messages")
Fixes: 1c806ae5c3ac ("eal/windows: support command line options parsing")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
2021-04-09 14:20:23 +02:00
David Marchand
eba022b926 log: track log level changes
Add a log message when registering log types and changing log levels.

__rte_log_register previous handled both legacy and dynamic logtypes.
To simplify the code, __rte_log_register is reworked to only handle
dynamic logtypes and takes a log level.

Example:
$ DPDK_TEST=logs_autotest ./build/app/test/dpdk-test --no-huge -m 512 \
  --log-level=lib.eal:debug
...
RTE>>logs_autotest
== dynamic log types
EAL: logtype1 log level changed from disabled to info
EAL: logtype2 log level changed from disabled to info
EAL: logtype1 log level changed from info to error
EAL: logtype3 log level changed from error to emergency
EAL: logtype2 log level changed from info to emergency
EAL: logtype3 log level changed from emergency to debug
EAL: logtype1 log level changed from error to debug
EAL: logtype2 log level changed from emergency to debug
error message
critical message
critical message
error message
== static log types
TESTAPP1: error message
TESTAPP1: critical message
TESTAPP2: critical message
TESTAPP1: error message
Test OK

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-09 14:19:53 +02:00
Thomas Monjalon
3c20e6fe72 log: add option argument help
The option --log-level was not completely described in the usage text,
and it was difficult to guess the names of the log types and levels.

A new value "help" is accepted after --log-level to give more details
about the syntax and listing the log types and levels.

The array "levels" used for level name parsing is replaced with
a (modified) existing function which was used in rte_log_dump().

The new function rte_log_list_types() is exported in the API
for allowing an application to give this info to the user
if not exposing the EAL option --log-level.
The list of log types cannot include all drivers if not linked in the
application (shared object plugin case).

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:58 +02:00
Thomas Monjalon
c2bd208a90 log: catch invalid level option number
The parsing check for invalid log level was not trying to catch
irrelevant numeric values.
A log level 0 becomes a failure in parsing so it can be caught early.
A log level higher than the max (8) is accepted with a warning message.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Thomas Monjalon
806d888a80 log: introduce macro for maximum level
RTE_DIM(...) and RTE_LOG_DEBUG were used to get the highest log level.
For better clarity a new constant RTE_LOG_MAX is introduced
and mapped to RTE_LOG_DEBUG.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Thomas Monjalon
867bb637ae log: move private functions
Some private log functions had a wrong "rte_" prefix.

All private log functions are moved from eal_private.h
to the new file eal_log.h:
	rte_eal_log_init -> eal_log_init
	rte_log_save_regexp -> eal_log_save_regexp
	rte_log_save_pattern -> eal_log_save_pattern
	eal_log_set_default

The static functions in the file eal_common_log.c are renamed:
	rte_log_save_level -> log_save_level
	rte_log_lookup -> log_lookup
	rte_log_init -> log_init
	__rte_log_register -> log_register

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-09 12:56:09 +02:00
Stanislaw Kardach
5516e4760e timer: clarify error if subsystem already initialized
rte_timer_subsystem_init() may return -EALREADY if it has been already
initialized. Therefore put explicitly into doxygen that this is not a
failure for the application.

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
2021-04-08 23:12:55 +02:00
David Marchand
551b29a714 eal: fix telemetry log type on registration failure
rte_log_register_type_and_pick_level() returns an int.
Casting to a uin32_t will make us miss the -1 passed in case of failure.
Fallback to EAL log type like RTE_LOG_REGISTER.

Fixes: 37b881a96194 ("telemetry: use log function from pointer")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-04-08 18:31:58 +02:00
Thomas Monjalon
bd057ae47d log: choose EAL log type on registration failure
In the unlikely case where something goes wrong
while registering a log type,
the fallback is to use the EAL log type.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-04-08 18:31:58 +02:00
David Marchand
56ea803e87 build: remove Windows export symbol list
Rather than have two files that keeps getting out of sync, let's
annotate the version.map to generate the Windows export file.

Some mlx5 symbols (haswell_broadwell_cpu, mlx5_glue, mlx5_os_*) were
only exported for Windows.
All of them are available and used by Linux too, so this patch adds
them in version.map.

Note: Existing version.map annotation achieved with:
$ for dir in lib/librte_eal drivers/common/mlx5; do
    ./buildtools/map-list-symbol.sh $dir/*.map |
    while read file version sym; do
      ! git grep -qw $sym $dir/*.def || continue;
      sed -i -e "s/$sym;/$sym; # WINDOWS_NO_EXPORT/" $dir/*.map;
    done;
  done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-08 17:57:33 +02:00
David Marchand
60e0e75b61 service: clean references to removed symbol
rte_service_get_id() was removed in v17.11 but the API description
still referenced it and a version node was still present in EAL map.

Fixes: 8edc9aaaf217 ("service: use id in get by name function")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-04-08 17:47:18 +02:00
Renata Saiakhova
2e761ce184 eal: add synchronous interrupt unregister
Avoid race with unregister interrupt handler if interrupt
source has some active callbacks at the moment, use wrapper
around rte_intr_callback_unregister() to check for -EAGAIN
return value and to loop until rte_intr_callback_unregister()
succeeds.

Signed-off-by: Renata Saiakhova <renata.saiakhova@ekinops.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
2021-04-07 11:16:11 +02:00
Roy Shterman
edf20bd8a5 mem: fix freeing segments in --huge-unlink mode
When using huge_unlink we unlink the segment right
after allocation. Although we unlink the file we keep
the fd in fd_list so file still exist just the path deleted.
When freeing the hugepage we need to close the fd and assign
it with (-1) in fd_list for the page to be released.

The current flow fails rte_malloc in the following flow when working
with --huge-unlink option:
1. alloc_seg() for segment A -
    We allocate a segment, unlink the path to the segment
    and keep the file descriptor in fd_list.
2. free_seg() for segment A -
    We clear the segment metadata and return - without closing fd
    or assigning (-1) in fd list.
3. alloc_seg() for segment A again -
    We find segment A as available, try to allocate it,
    find the old fd in fd_list try to unlink it
    as part of alloc_seg() but failed because path doesn't exist.

The impact of such error is falsely failing rte_malloc()
although we have hugepages available.

Fixes: d435aad37da7 ("mem: support --huge-unlink mode")
Cc: stable@dpdk.org

Signed-off-by: Roy Shterman <roy.shterman@vastdata.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-04-07 11:13:45 +02:00
Thomas Monjalon
4d509afa7b pci: rename catch-all ID
The name of the constant PCI_ANY_ID was missing RTE_ prefix.
It is renamed, and the old name becomes a deprecated alias.

While renaming, the duplicate definitions in rte_bus_pci.h
are removed to keep only those in rte_pci.h.
Note: rte_pci.h is included in rte_bus_pci.h

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-04-06 14:52:49 +02:00