6852 Commits

Author SHA1 Message Date
Thomas Monjalon
e0473c6d5b eal: fix build with musl
In musl libc, cpu_set_t is defined only if _GNU_SOURCE is defined.
In case _GNU_SOURCE is undefined, as in eal_common_errno.c,
it was not possible to include rte_os.h which uses cpu_set_t.

This limitation is removed: if CPU_SETSIZE is not defined,
cpu_set_t related definitions and functions are skipped.
Note: such definitions are unneeded in eal_common_errno.c.

Applications which do not define _GNU_SOURCE may miss cpu_set_t related
features on musl. Such case is detected by RTE_HAS_CPUSET being undefined,
so functions which depend on rte_cpuset_t will be unavailable.

A missing include of fcntl.h is also added.

Bugzilla ID: 35
Fixes: 11b57c698005 ("eal: fix error string function")
Fixes: 176bb37ca6f3 ("eal: introduce internal wrappers for file operations")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-23 08:41:05 +01:00
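The CPU_SETSIZE guard described in the musl build fix above can be sketched roughly as follows. This is only an illustration, not the actual rte_os.h content: EX_HAS_CPUSET, ex_cpuset_t and ex_cpuset_demo() are hypothetical names standing in for RTE_HAS_CPUSET and rte_cpuset_t.

```c
#include <assert.h>
#include <sched.h>

/*
 * Expose cpu_set_t helpers only when the libc made CPU_SETSIZE visible.
 * On musl this requires _GNU_SOURCE to be defined before <sched.h>.
 */
#ifdef CPU_SETSIZE
#define EX_HAS_CPUSET 1
typedef cpu_set_t ex_cpuset_t;
#endif

/* Return 1 when the cpuset helpers are usable, -1 when they were skipped. */
int
ex_cpuset_demo(void)
{
#ifdef EX_HAS_CPUSET
	ex_cpuset_t set;

	CPU_ZERO(&set);
	CPU_SET(0, &set);
	assert(CPU_ISSET(0, &set));
	return CPU_COUNT(&set);   /* exactly one CPU was set above */
#else
	return -1;                /* cpu_set_t related code is compiled out */
#endif
}
```

Code that depends on the cpuset type would test EX_HAS_CPUSET in the same way before using it.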
Thomas Monjalon
bfb42c3777 eal: fix comment of OS-specific header files
The same comment is on top of each rte_os.h file.
It is reworded to remove the mention of "future releases".

Fixes: 428eb983f5f7 ("eal: add OS specific header file")
Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-23 08:25:16 +01:00
Xueming Li
df7547a6a2 ethdev: add helper function to get representor ID
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).

SR-IOV and SubFunction are created on top of PF. PF index is introduced
because there might be multiple PFs in the bonding configuration and
only bonding device is probed.

In the eth representor comparator callback, the ethdev representor ID was
compared with the devarg. Since the controller index and PF index were not
compared, the callback could return a representor from another PF or
controller.

This patch adds a new API to get the representor ID from the controller, PF
and VF/SF index. The representor comparator callback gets the representor ID
and then compares it with the device representor ID.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-17 19:12:09 +01:00
Xueming Li
85e1588ca7 ethdev: add API to get representor info
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).

This patch introduces a new API rte_eth_representor_info_get() to
retrieve the representor info mapping:
 - caller controller index and PF index.
 - supported representor ID ranges.
 - type, controller, PF and start VF/SF ID of each range.
The API is useful for translating a representor given in devargs into a
representor ID.

A new ethdev callback representor_info_get() is added to retrieve the info
from the PMD. It is optional for PMDs that do not support the new devargs
representor syntax.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-17 19:11:56 +01:00
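The range-based mapping described above (type, controller, PF and start VF/SF ID of each range) can be illustrated with a small lookup. This is a sketch under assumptions: struct ex_repr_range and ex_repr_id_get() are hypothetical and do not reproduce the real ethdev structures or function signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ex_repr_type { EX_REPR_VF, EX_REPR_SF, EX_REPR_PF };

/* Hypothetical descriptor: one contiguous block of representor IDs. */
struct ex_repr_range {
	enum ex_repr_type type;
	uint16_t controller;
	uint16_t pf;
	uint16_t func_first;   /* first VF/SF index covered by the range */
	uint16_t func_last;    /* last VF/SF index covered by the range */
	uint16_t id_base;      /* representor ID assigned to func_first */
};

/* Translate (type, controller, pf, func) to a representor ID; 0 on success. */
int
ex_repr_id_get(const struct ex_repr_range *ranges, size_t n,
	       enum ex_repr_type type, uint16_t controller, uint16_t pf,
	       uint16_t func, uint16_t *id)
{
	assert(ranges != NULL && id != NULL);
	for (size_t i = 0; i < n; i++) {
		const struct ex_repr_range *r = &ranges[i];

		/* All three indices must match, not only the VF/SF one. */
		if (r->type != type || r->controller != controller ||
		    r->pf != pf)
			continue;
		if (func < r->func_first || func > r->func_last)
			continue;
		*id = (uint16_t)(r->id_base + (func - r->func_first));
		return 0;
	}
	return -1;
}
```

With two VF ranges of four functions each (PF 0 starting at ID 0, PF 1 starting at ID 4), VF 2 on PF 1 would map to ID 6.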
Xueming Li
66e0ea2c98 ethdev: support multi-host in representor
The NIC can have multiple PCIe links and can be attached to the multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. To provide the unambiguous identification
of the PCIe function the controller index is added. The full representor
identifier consists of three indices - controller index, PF index, and
VF or SF index (if any).

This patch introduces controller index to ethdev representor syntax,
examples:

[[c#]pf#]vf#: VF port representor/s, example: pf0vf1
[[c#]pf#]sf#: SF port representor/s, example: c1pf1sf[0-3]

c# is the controller (host) ID or range in the multi-host case; it is optional.

For user applications (e.g. OVS), the PMD is responsible for interpreting the
representor syntax and locating the representor device based on controller ID,
PF ID and VF/SF ID.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
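To make the [[c#]pf#]vf# / [[c#]pf#]sf# grammar concrete, here is a minimal parser sketch. It is deliberately simplified: it accepts single indices only (no lists or ranges), and ex_repr, ex_take() and ex_repr_parse() are hypothetical names, not the ethdev implementation. A standalone pf# is treated as a PF representor and a bare number as vf#, as the commit messages in this series describe.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

enum ex_type { EX_VF, EX_SF, EX_PF };

struct ex_repr {
	enum ex_type type;
	int controller;   /* -1 when the optional c# section is absent */
	int pf;           /* -1 when the optional pf# section is absent */
	int func;         /* VF or SF index, -1 for a PF representor */
};

/* Consume "<prefix><digits>" at *s; return the number, or -1 if absent. */
static int
ex_take(const char **s, const char *prefix)
{
	size_t len = strlen(prefix);
	char *end;
	long v;

	if (strncmp(*s, prefix, len) != 0 ||
	    !isdigit((unsigned char)(*s)[len]))
		return -1;
	v = strtol(*s + len, &end, 10);
	*s = end;
	return (int)v;
}

/* Parse e.g. "c1pf1sf3", "pf0vf1", "pf1" or legacy "2"; 0 on success. */
int
ex_repr_parse(const char *s, struct ex_repr *out)
{
	assert(s != NULL && out != NULL);
	out->controller = ex_take(&s, "c");
	out->pf = ex_take(&s, "pf");
	if (out->controller >= 0 && out->pf < 0)
		return -1;              /* c# is only valid before pf# */
	if ((out->func = ex_take(&s, "vf")) >= 0)
		out->type = EX_VF;
	else if ((out->func = ex_take(&s, "sf")) >= 0)
		out->type = EX_SF;
	else if ((out->func = ex_take(&s, "")) >= 0 && out->pf < 0)
		out->type = EX_VF;      /* legacy "#" means "vf#" */
	else if (out->pf >= 0 && out->func < 0)
		out->type = EX_PF;      /* standalone pf# section */
	else
		return -1;
	return *s == '\0' ? 0 : -1;     /* reject trailing characters */
}
```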
Xueming Li
da97592635 ethdev: support PF index in representor
With kernel bonding, multiple underlying PFs are bonded and VFs come
from different PFs, so the representors of VFs need to be identified
unambiguously by adding a PF index.

This patch introduces optional 'pf' section to representor devargs
syntax, examples:
 representor=pf0vf0             - single VF representor
 representor=pf[0-1]sf[0-1023]  - SF representors from 2 PFs

PF type representor is supported by using standalone 'pf' section:
 representor=pf1                - PF representor

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
9be46b4308 kvargs: support multiple lists
This patch updates the kvargs parser to support values made of multiple
lists or ranges:
  k1=v[1,2]v[3-5]

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-03-16 20:15:29 +01:00
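A value such as k1=v[1,2]v[3-5] contains commas that must not be treated as key/value separators. One way to tokenise such a string is to split only at commas outside square brackets. The sketch below shows that idea; ex_split_pairs() is a hypothetical helper and is not claimed to match the kvargs implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split "k1=v[1,2]v[3-5],k2=x" in place at commas that are not enclosed
 * in square brackets. Returns the number of tokens stored in out[].
 */
size_t
ex_split_pairs(char *s, char **out, size_t max)
{
	size_t n = 0;
	int depth = 0;   /* current bracket nesting level */

	assert(s != NULL && out != NULL);
	if (*s == '\0' || max == 0)
		return 0;
	out[n++] = s;
	for (; *s != '\0'; s++) {
		if (*s == '[') {
			depth++;
		} else if (*s == ']' && depth > 0) {
			depth--;
		} else if (*s == ',' && depth == 0) {
			*s = '\0';          /* terminate the current token */
			if (n == max)
				break;
			out[n++] = s + 1;   /* next token starts after the comma */
		}
	}
	return n;
}
```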
Xueming Li
fa4f3fecb9 ethdev: support sub-function representor
A SubFunction (SF) is a portion of the PCI device, created on demand. An SF
netdev has its own dedicated queues (txq, rxq) and supports
eswitch representation offload similar to existing PF and VF
representors.

To support SF representor, this patch introduces new devargs syntax,
examples:
 representor=sf0               - single SubFunction representor
 representor=sf[1,3,5]         - single list
 representor=sf[0-3]           - single range
 representor=sf[0,2-6,8,10-12] - list with singles and ranges

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
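The list forms above (single value, list, range, and mixed list) can all be expanded by one small routine. The following is a sketch only: ex_expand_list() is a hypothetical helper that takes the part after the "sf" or "vf" prefix, and it does not reproduce the ethdev parsing code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Expand "[0,2-6,8,10-12]" (or a bare "7") into out[].
 * Returns the number of values written, or -1 on syntax error or overflow.
 */
int
ex_expand_list(const char *s, uint16_t *out, size_t max)
{
	size_t n = 0;
	int bracket;

	assert(s != NULL && out != NULL);
	bracket = (*s == '[');
	if (bracket)
		s++;
	for (;;) {
		char *end;
		unsigned long lo, hi;

		lo = hi = strtoul(s, &end, 10);
		if (end == s)
			return -1;          /* no digits where a number is due */
		s = end;
		if (bracket && *s == '-') { /* "lo-hi" range inside brackets */
			hi = strtoul(s + 1, &end, 10);
			if (end == s + 1 || hi < lo)
				return -1;
			s = end;
		}
		if (hi > UINT16_MAX)
			return -1;
		for (; lo <= hi; lo++) {
			if (n == max)
				return -1;
			out[n++] = (uint16_t)lo;
		}
		if (bracket && *s == ',') {
			s++;
			continue;
		}
		break;
	}
	if (bracket && *s++ != ']')
		return -1;                  /* missing closing bracket */
	return *s == '\0' ? (int)n : -1;
}
```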
Xueming Li
cebf7f1715 ethdev: support new VF representor syntax
Current VF representor syntax:
 representor=2          - single representor
 representor=[0-3]      - single range

To prepare for more representor types, this patch adds compatible VF
representor devargs syntax:

vf#:
 representor=vf2          - single representor
 representor=vf[1,3,5]    - single list
 representor=vf[0-3]      - single range
 representor=vf[0,1,4-7]  - list with singles and range

For backwards compatibility, representor "#" is interpreted as "vf#".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
83a675177f ethdev: refactor representor port list parsing
To support the extended representor syntax, which needs to reuse the value
parsing function for the controller and PF sections, this patch refactors
the port list parsing.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-03-16 20:15:29 +01:00
Xueming Li
d654167641 ethdev: introduce representor type
To support more representor types, this patch introduces a representor type
enum. The enum is expected to be extended to support new representor types in
upcoming patches.

For each devarg structure, only one type is supported.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-03-16 20:15:29 +01:00
Ivan Malov
24f8b2d896 net: fix comment in IPv6 header
The comment got it wrong. The payload length field
does not include the fixed IPv6 header size.

Fixes: 7eca7f7fd09d ("net: add missing endianness annotations")
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-12 14:32:48 +01:00
Thomas Monjalon
be2e6d7895 eal: mark version parts API as experimental
Some functions were introduced in DPDK 21.05 to query the version parts
(prefix, year, month, minor, suffix, release) at runtime.
Per guidelines, these new public functions must be marked with
__rte_experimental and ABI versioned as EXPERIMENTAL.

Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-03-19 16:20:30 +01:00
Thomas Monjalon
6437522079 eal: fix version macro
The macro RTE_VERSION has been broken since it was updated to use function calls.
It is a build-time version number, and must be built with macros.
For a run-time version number, there is the function rte_version().

Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-03-17 16:37:57 +01:00
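The reason a build-time version number must be built from macros is that it has to be usable in preprocessor conditionals, where a function call is not allowed. The sketch below illustrates this with hypothetical EX_* names and made-up version parts; it is not a copy of rte_version.h.

```c
#include <assert.h>

/* Hypothetical version parts, fixed at build time. */
#define EX_VER_YEAR    21
#define EX_VER_MONTH   5
#define EX_VER_MINOR   0
#define EX_VER_RELEASE 99

/* Pack four 8-bit parts into one integer constant expression. */
#define EX_VERSION_NUM(a, b, c, d) ((a) << 24 | (b) << 16 | (c) << 8 | (d))
#define EX_VERSION \
	EX_VERSION_NUM(EX_VER_YEAR, EX_VER_MONTH, EX_VER_MINOR, EX_VER_RELEASE)

/* This comparison only works because EX_VERSION expands to constants. */
#if EX_VERSION >= EX_VERSION_NUM(21, 2, 0, 0)
#define EX_HAS_NEW_API 1
#else
#define EX_HAS_NEW_API 0
#endif

int
ex_has_new_api(void)
{
	return EX_HAS_NEW_API;
}
```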
Tal Shnaiderman
16afcbfa30 eal/windows: fix default thread priority
The hard-coded thread priority for Windows threads in EAL
is REALTIME_PRIORITY_CLASS/THREAD_PRIORITY_TIME_CRITICAL.

This results in issues with DPDK threads causing OS thread starvation
and eventually a bugcheck.

The fix reduces the thread priority to
NORMAL_PRIORITY_CLASS/THREAD_PRIORITY_NORMAL.

Bugzilla ID: 600
Fixes: 53ffd9f080f ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org

Reported-by: Odi Assli <odia@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-03-16 12:40:35 +01:00
Dmitry Kozlyuk
e863fe3a13 eal/windows: add missing SPDX license tag
Fixes: c08bd191b13d ("eal/windows: initialize hugepage info")
Cc: stable@dpdk.org

Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-03-16 11:28:33 +01:00
Jie Zhou
88f4450ab2 metrics: export telemetry stubs if no libjansson
This patch allows the same set of rte_metrics_tel_* functions to be
exported whether or not Jansson is available, by doing the following:
1. Leverage dpdk_conf to set the configuration flag RTE_HAS_JANSSON
   when the Jansson dependency is found.
2. In rte_metrics_telemetry.c, leverage RTE_HAS_JANSSON to handle the
   case when Jansson is not available by adding stubs for all the instances.
3. In meson.build, make the deps and includes of Telemetry unconditional,
   since dpdk/doc/guides/rel_notes/release_20_05.rst states that
   "Telemetry library is no longer dependent on the external Jansson
   library, which allows Telemetry be enabled by default."

Signed-off-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-03-16 10:08:06 +01:00
Ferruh Yigit
5988725d0e log/linux: make default output stderr
On Linux, the DPDK log goes to stdout by default, as well as to syslog.

An application can change the library output stream via the
'rte_openlog_stream()' API. To set it to stderr, it can be used as:
rte_openlog_stream(stderr);

Even so, this patch changes the default log output to 'stderr'.

Bugzilla ID: 8
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-16 00:01:44 +01:00
Bruce Richardson
5b637a8481 eal: fix querying DPDK version at runtime
When using a dynamically linked DPDK application, such as OVS, the reported
DPDK version should always be the version actually in use, not the
version used at build time. This incorrect behaviour can be seen by
building OVS against one version of DPDK and running it against a later
one. Using "ovs-vsctl list Open_vSwitch" to query basic info, the
dpdk_version returned will be the build version not the currently running
one - which can be verified using the DPDK telemetry library client.

  $ sudo ovs-vsctl list Open_vSwitch | grep dpdk_version
  dpdk_version        : "DPDK 20.11.0-rc4"

  $ echo quit | sudo dpdk-telemetry.py
  Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
  {"version": "DPDK 21.02.0-rc2", "pid": 405659, "max_output_len": 16384}
  -->

To fix this, we need to convert the rte_version() function, and any other
necessary parts of the rte_version.h, to be actual functions in EAL, not
just inlines/macros. The only complication in doing so is that telemetry
library cannot call rte_version() directly, and instead needs the version
string passed in on init.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-03-15 23:22:14 +01:00
Alexander Kozyrev
cbc78be94c ethdev: document generic modify flow action
Field IDs for the MODIFY_FIELD action lack doxygen comments
and are not visible in the online DPDK documentation because of that.
Provide a meaningful description for every Field ID for the
rte_flow_field_id enumeration.

Fixes: 73b68f4c54a0 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-03-09 14:30:38 +01:00
Lance Richardson
e8a419d6de mbuf: rename outer IP checksum macro
Rename PKT_RX_EIP_CKSUM_BAD to PKT_RX_OUTER_IP_CKSUM_BAD and
deprecate the original name. The new name is better aligned
with existing PKT_RX_OUTER_* flags, which should help reduce
confusion about its use.

Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-03-02 10:57:28 +01:00
Anatoly Burakov
2a2ebeab9e fbarray: fix log message on truncation error
When file truncation fails, the log message attempts to print a path of
file we failed to truncate, but this path was never set to anything and,
what's worse, was uninitialized. Fix it by passing path from the caller.

Coverity issue: 366122
Fixes: c44d09811b40 ("eal: add shared indexed file-backed array")
Cc: stable@dpdk.org

Reported-by: Andrew Boyer <aboyer@pensando.io>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-03-04 11:37:05 +01:00
David Christensen
44db5a5cf2 eal/ppc: provide arch-specific TSC frequency
Return a PPC specific value for get_tsc_freq_arch() rather than
depending on the EAL framework to estimate the frequency.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-03 10:05:23 +01:00
Anatoly Burakov
dfbc61a2f9 mem: detach memsegs on cleanup
Currently, we don't detach the shared memory on EAL cleanup, which
leaves the page table descriptors still holding on to the file
descriptors as well as memory space occupied by them. Fix it by adding
another detach stage that closes the internal memory allocator resource
references, detaches shared fbarrays and unmaps the shared mem config.

Bugzilla ID: 380
Bugzilla ID: 381

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2021-03-03 10:05:23 +01:00
Yunjian Wang
8d63961fc7 vfio: fix API description
Fix a few comments and add detailed comments for the return values.

Fixes: 279b581c897d ("vfio: expose functions")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-03-03 10:05:23 +01:00
Ferruh Yigit
e79b0efd98 power: remove duplicated symbols from map file
This is causing a build error, like:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/482121104

Also, the '@internal' marker is removed from the doxygen comment, since a
public API should not be internal.
The experimental tag is removed from 'rte_power_guest_channel_send_msg()'.

Fixes: 4d3892dcd77b ("power: make channel message functions public")
Cc: stable@dpdk.org

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-03-02 13:43:38 +01:00
Nithin Dabilpuram
c13ca4e81c vfio: fix DMA mapping granularity for IOVA as VA
Partial unmapping is not supported for VFIO IOMMU type1
by the kernel. Though the kernel returns zero, the unmapped size
returned will not be the same as expected. So check the
returned unmap size and return an error on mismatch.

For IOVA as PA, DMA mapping is already at memseg size
granularity. Do the same for IOVA as VA mode: since
DMA map/unmap is triggered by heap allocations,
maintain the granularity of memseg page size so that heap
expansion and contraction do not have this issue.

For user-requested DMA map/unmap, disallow partial unmapping
for VFIO type1.

Fixes: 73a639085938 ("vfio: allow to map other memory regions")
Cc: stable@dpdk.org

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-01 11:58:28 +01:00
Nithin Dabilpuram
016763c219 vfio: do not merge contiguous areas
In order to save DMA entries, which are limited by the kernel both for
external memory and hugepage memory, an attempt was made to map physically
contiguous memory in one go. This cannot be done, as VFIO IOMMU type1
does not support partially unmapping a previously mapped memory
region, while the heap can request multi-page mapping and
partial unmapping.
Hence, to go back to the old method of mapping/unmapping at
memseg granularity, this commit reverts
commit d1c7c0cdf7ba ("vfio: map contiguous areas in one go").

Also add documentation on which module parameter needs to be used
to increase the per-container DMA map limit for VFIO.

Fixes: d1c7c0cdf7ba ("vfio: map contiguous areas in one go")
Cc: stable@dpdk.org

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2021-03-01 11:58:24 +01:00
Marvin Liu
894028ace2 vhost: fix packed ring dequeue offloading
When vhost is doing dequeue offloading, it parses the Ethernet and L3/L4
headers of the packet. Then vhost sets the corresponding values in the mbuf
attributes. This means the offloading action should happen after the packet
data copy.

Fixes: 75ed51697820 ("vhost: add packed ring batch dequeue")
Cc: stable@dpdk.org

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-02-10 22:17:47 +01:00
Qi Zhang
8a8c4760a1 ethdev: refine doxygen comment of UDP tunnel API
Clarify what is the scope and impact of the UDP port tunnel API.

Some information is still missing and should be added in the future:
	- no capability flag
	- dependency between ports of the same device
	- required privilege

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-02-10 21:48:59 +01:00
Bruce Richardson
9ff791eff6 eal: fix automatic loading of drivers as shared libs
When checking the loading of EAL shared lib to see if we have a shared
DPDK build, we only want to include part of the ABI version in the check
rather than the whole thing. For example, with ABI version 21.1 for DPDK
release 21.02, the linker links the binary against librte_eal.so.21,
without the ".1".

To avoid any further brittleness in this area, we can check for multiple
versions when doing the check, since just about any version of EAL implies
a shared build. Therefore we check for presence of librte_eal.so with full
ABI_VERSION extension, and then repeatedly remove the end part of the
filename after the last dot, checking each time. For example (debug log
output for static build):

  EAL: Checking presence of .so 'librte_eal.so.21.1'
  EAL: Checking presence of .so 'librte_eal.so.21'
  EAL: Checking presence of .so 'librte_eal.so'
  EAL: Detected static linkage of DPDK

Fixes: 7781950f4d38 ("eal: fix shared lib mode detection")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
2021-02-10 10:01:48 +01:00
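The "repeatedly remove the end part of the filename after the last dot" step can be sketched as a name generator. This is an illustration only: ex_so_candidates() is a hypothetical helper, and the real check additionally asks the dynamic loader about each name, which is omitted here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Fill out[] with the library names to probe, from most to least specific,
 * e.g. librte_eal.so.21.1, librte_eal.so.21, librte_eal.so.
 * Returns the number of candidates written.
 */
size_t
ex_so_candidates(const char *base, const char *abi, char out[][64], size_t max)
{
	char name[64];
	size_t n = 0;

	assert(base != NULL && abi != NULL);
	if (snprintf(name, sizeof(name), "%s.%s", base, abi) >=
	    (int)sizeof(name))
		return 0;                   /* name would be truncated */
	while (n < max && strstr(name, ".so") != NULL) {
		strcpy(out[n++], name);
		*strrchr(name, '.') = '\0'; /* drop the part after the last dot */
	}
	return n;
}
```

The loop stops once the ".so" suffix itself has been stripped, so the bare library stem is never probed.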
Bruce Richardson
0d32fd0945 telemetry: mark init function as internal-only
The "rte_telemetry_init()" function is for use by "rte_eal_init()" and
should not be part of the public API. Mark it as internal only.

Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-02-09 13:36:45 +01:00
David Marchand
c48d8c3164 mbuf: remove unneeded atomic generic header include
There is no need for the direct inclusion of the generic/ header [1]
now that we don't use the rte_atomic API anymore.

It was the last case of direct inclusion of the generic/ headers,
so the flag -Wno-unused-function can be dropped.

1: https://git.dpdk.org/dpdk/commit/?id=3eb860b08eb7

Fixes: e41d27a68df6 ("mbuf: remove atomic reference counters")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-02-05 19:49:32 +01:00
Olivier Matz
daeb7c7f41 mempool: fix panic on dump or audit
When doing a mempool dump or an audit, the application can panic because
the length of the cache is greater than the flush threshold, which is
seen as a fatal error. But this can temporarily happen when the mempool
is in use.

Fix the panic condition to abort only when the cache length is greater
than the size of the cache array.

Fixes: ea5dd2744b90 ("mempool: cache optimisations")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-02-05 17:40:23 +01:00
Harry van Haaren
b07b80fe1c eventdev: fix a return value comment
The PMD info get API has a void return type. Remove the
@return 0 Success doxygen comment as it doesn't make sense here.

Fixes: 5223a1f3b8de ("eventdev: define southbound driver interface")
Cc: stable@dpdk.org

Reported-by: Fredrik A Lindgren <fredrik.a.lindgren@tietoevry.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2021-02-04 13:51:45 +01:00
Fei Chen
9944bddf80 vhost: fix vid allocation race
vhost_new_device might be called in different threads at
the same time.

thread 1 (config thread)
    rte_vhost_driver_start
      -> vhost_user_start_client
        -> vhost_user_add_connection
          -> vhost_new_device

thread 2 (vhost-events)
    vhost_user_read_cb
      -> vhost_user_msg_handler (return value < 0)
        -> vhost_user_start_client
          -> vhost_new_device

So there could be a case where the same vid is allocated
twice, or some vid might be lost in the DPDK lib while still
held by the upper applications.

Another place where a race could happen is the function
*vhost_destroy_device*, but after a detailed investigation,
the race does not exist as long as no two devices have the
same vid: calling vhost_destroy_device in different
threads with different vids is actually safe.

Fixes: a277c7159876 ("vhost: refactor code structure")
Cc: stable@dpdk.org

Reported-by: Peng He <hepeng.0320@bytedance.com>
Signed-off-by: Fei Chen <chenwei.0515@bytedance.com>
Reviewed-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-02-04 18:19:36 +01:00
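The race described above comes from searching for a free slot and claiming it as two separate steps. A common remedy, shown here as a sketch and not as the actual vhost change, is to do both under one lock. ex_devices, ex_devices_lock and ex_new_device() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define EX_MAX_DEVICES 8

static void *ex_devices[EX_MAX_DEVICES];
static pthread_mutex_t ex_devices_lock = PTHREAD_MUTEX_INITIALIZER;

/* Claim the first free slot for dev; return its index (the vid) or -1. */
int
ex_new_device(void *dev)
{
	int vid = -1;

	assert(dev != NULL);
	pthread_mutex_lock(&ex_devices_lock);
	for (int i = 0; i < EX_MAX_DEVICES; i++) {
		if (ex_devices[i] == NULL) {
			/* Search and claim happen under the same lock, so
			 * two threads can never pick the same index. */
			ex_devices[i] = dev;
			vid = i;
			break;
		}
	}
	pthread_mutex_unlock(&ex_devices_lock);
	return vid;
}
```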
Anatoly Burakov
7e54f18326 mem: fix deadlock on secondary allocation
Previous fix used `rte_malloc_heap_socket_is_external()` to check if the
heap was an external heap. However, that API is thread-safe, and when
we're inside the allocation process, we're already write-locked, so
calling `rte_malloc_heap_socket_is_external()` will result in a
deadlock followed by a timeout.

Fix it by replacing the API call with a check against maximum number of
NUMA nodes, because external heaps always have higher socket IDs.

Fixes: 7ac31e82bc8f ("mem: improve parameter checking on memory hotplug")

Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-30 00:26:49 +01:00
Hemant Agrawal
d810252857 ethdev: add MPLS RSS offload type
This patch defines new RSS offload types for MPLS. The distribution
will be on the basis of the MPLS tag.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Alexander Kozyrev
8d00e4698e ethdev: add IPv6 DSCP option for modify field action
IPv6 DSCP field ID is missing from the original list of Field IDs
for MODIFY_FIELD action. Add it to support IPv6 header fully.
Add ipv6_dscp option for the corresponding header field in testpmd.

Fixes: 73b68f4c54a0 ("ethdev: introduce generic modify flow action")

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Thomas Monjalon
a6f34f9100 ethdev: fix close failure handling
If a failure happens when closing a port,
it was unnecessarily failing again in the function eth_err(),
because of a check against HW removal cause.
Indeed there is a big chance the port is released at this point.
Given the port is in the middle (or at the end) of a close process,
checking the error cause by accessing the port makes no sense.
The error check is replaced by a simple return in the close function.

Bugzilla ID: 624
Fixes: 8a5a0aad5d3e ("ethdev: allow close function to return an error")
Cc: stable@dpdk.org

Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-01-29 18:16:08 +01:00
Viacheslav Galaktionov
061abae299 ethdev: clarify what is included in generic byte statistics
Different hardware gathers statistics differently, so some general
rules need to be established.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-01-29 18:16:08 +01:00
Bruce Richardson
05050ac4ce build: add header includes check
To verify that all DPDK headers are ok for inclusion directly in a C file,
and are not missing any other pre-requisite headers, we can auto-generate
for each header an empty C file that includes that header. Compiling these
files will throw errors if any header has unmet dependencies.

For some libraries, there may be some header files which are not for direct
inclusion, but rather are to be included via other header files. To allow
later checking of these files for missing includes, we separate out the
indirect include files from the direct ones.

To ensure ongoing compliance, we enable this build test as part of the
default x86 build in "test-meson-builds.sh".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:37 +01:00
Bruce Richardson
2518704288 eventdev: make driver-only headers private
The rte_eventdev_pmd*.h files are for drivers only and should be private
to DPDK, and not installed for app use.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
df96fd0d73 ethdev: make driver-only headers private
The rte_ethdev_driver.h, rte_ethdev_vdev.h and rte_ethdev_pci.h files are
for drivers only and should be private to DPDK and not installed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Steven Webster <steven.webster@windriver.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
5d1a53130a rib: fix missing header include
The rte_rib6 header was using RTE_MIN macro from rte_common.h but not
including the header file.

Fixes: f7e861e21c46 ("rib: support IPv6")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
deb6ea1d2d power: fix missing header includes
The rte_power_guest_channel.h file did not include its dependent
headers, so add them.

Fixes: 5f443cc0f905 ("power: create guest channel public header file")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
Bruce Richardson
4ab63cd60c eal: fix internal ABI tag with clang
Clang does not have an "error" attribute for functions, so for marking
internal functions we need to check for the error attribute, and provide
a fallback if it is not present. For clang, we can use "diagnose_if"
attribute, similarly checking for its presence before use.

Fixes: fba5af82adc8 ("eal: add internal ABI tag definition")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-01-29 20:59:09 +01:00
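The attribute fallback described above can be sketched with __has_attribute checks. This is an illustration under assumptions: ex_internal, EX_ALLOW_INTERNAL_API and ex_private_helper() are hypothetical names, and the macro below is not the actual __rte_internal definition.

```c
#include <assert.h>

#ifndef __has_attribute
#define __has_attribute(x) 0   /* compilers without the feature test */
#endif

/* Defined here so the example builds; remove it to see the diagnostic. */
#define EX_ALLOW_INTERNAL_API

#ifdef EX_ALLOW_INTERNAL_API
#define ex_internal
#elif __has_attribute(error)
/* GCC: any call to the symbol becomes a compile-time error. */
#define ex_internal __attribute__((error("Symbol is not public ABI")))
#elif __has_attribute(diagnose_if)
/* Clang: no "error" attribute, so use diagnose_if instead. */
#define ex_internal \
	__attribute__((diagnose_if(1, "Symbol is not public ABI", "error")))
#else
#define ex_internal
#endif

ex_internal int ex_private_helper(int x);

int
ex_private_helper(int x)
{
	return x + 1;
}
```

Code inside the project would define the allow macro before including the header, while external code would hit the diagnostic when calling the tagged function.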
Bruce Richardson
3c2cca6a0d eal: fix MCS lock header include
Include 'rte_branch_prediction.h' to get the likely/unlikely macro
definitions.

Fixes: 2173f3333b61 ("mcslock: add MCS queued lock implementation")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-01-29 20:59:09 +01:00
Thomas Monjalon
45eb6a1dfe lib: fix doxygen for parameters of function pointers
Some parameters of typedef'ed function pointers were not properly listed
in the doxygen comments.
The error is seen with doxygen 1.9 which added this specific check:
	https://github.com/doxygen/doxygen/commit/d34236ba4037

Cc: stable@dpdk.org

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-01-29 15:58:06 +01:00
Liang Ma
682a645438 power: add ethdev power management
Add a simple on/off switch that will enable saving power when no
packets are arriving. It is based on counting the number of empty
polls and, when the number reaches a certain threshold, entering an
architecture-defined optimized power state that will wait
until either a TSC timestamp expires or packets arrive.

This API mandates a core-to-single-queue mapping (that is, multiple
queues per device are supported, but they have to be polled on different
cores).

This design is using PMD RX callbacks.

1. UMWAIT/UMONITOR:

   When a certain threshold of empty polls is reached, the core will go
   into a power optimized sleep while waiting on an address of next RX
   descriptor to be written to.

2. TPAUSE/Pause instruction

   This method uses the pause (or TPAUSE, if available) instruction to
   avoid busy polling.

3. Frequency scaling

   Reuse the existing DPDK power library to scale core frequency up/down
   depending on traffic volume.

Signed-off-by: Liang Ma <liang.j.ma@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
2021-01-29 15:29:48 +01:00