Commit Graph

7518 Commits

Author SHA1 Message Date
Ferruh Yigit
f7e04f57ad ethdev: move MTU set check to library
Move requested MTU value check to the API to prevent the duplicated
code.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
dd4e429c95 ethdev: move jumbo frame offload check to library
Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.

When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.

Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.

This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
1bb4a528c4 ethdev: fix max Rx packet length
There is a confusion on setting max Rx packet length, this patch aims to
clarify it.

'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.

Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
stored into '(struct rte_eth_dev)->data->mtu'.

These two APIs are related but they work in a disconnected way, they
store the set values in different variables which makes hard to figure
out which one to use, also having two different method for a related
functionality is confusing for the users.

Other issues causing confusion is:
* maximum transmission unit (MTU) is payload of the Ethernet frame. And
  'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
  Ethernet frame overhead, and this overhead may be different from
  device to device based on what device supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when application requested jumbo frame,
  which adds additional confusion and some APIs and PMDs already
  discards this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
  field, this adds configuration complexity for application.

As solution, both APIs gets MTU as parameter, and both saves the result
in same variable '(struct rte_eth_dev)->data->mtu'. For this
'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
from jumbo frame.

For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
request and it should be used only within configure function and result
should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
both application and PMD uses MTU from this variable.

When application doesn't provide an MTU during 'rte_eth_dev_configure()'
default 'RTE_ETHER_MTU' value is used.

Additional clarification done on scattered Rx configuration, in
relation to MTU and Rx buffer size.
MTU is used to configure the device for physical Rx/Tx size limitation,
Rx buffer is where to store Rx packets, many PMDs use mbuf data buffer
size as Rx buffer size.
PMDs compare MTU against Rx buffer size to decide enabling scattered Rx
or not. If scattered Rx is not supported by device, MTU bigger than Rx
buffer size should fail.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-10-18 19:20:20 +02:00
Georg Sauthoff
24f1955d1e net: fix aliasing in checksum computation
That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers.

Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.

Fixes: 6006818cfb ("net: new checksum functions")
Fixes: e079655c41 ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org

Signed-off-by: Georg Sauthoff <mail@gms.tf>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-18 18:15:58 +02:00
Jie Wang
632be32735 ethdev: add API to get device configuration
The driver may change offloads info into dev->data->dev_conf
in dev_configure which may cause apps use outdated values.

Add a new API to get actual device configuration.

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-15 13:27:05 +02:00
Gowrishankar Muthukrishnan
58b43c1ddf ethdev: add telemetry endpoint for device info
Add telemetry endpoint /ethdev/info for device info.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-14 23:44:53 +02:00
Gregory Etelson
63f2bbfa82 net: introduce IPv4 IHL and version fields
RTE IPv4 header definition combines the `version' and `ihl'  fields
into a single structure member.
This patch introduces dedicated structure members for both `version'
and `ihl' IPv4 fields. Separated header fields definitions allow to
create simplified code to match on the IHL value in a flow rule.
The original `version_ihl' structure member is kept for backward
compatibility.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-10-14 23:00:45 +02:00
Viacheslav Ovsiienko
50cd0391a4 ethdev: add experimental comment for modify field action
EXPERIMENTAL tag was missed in rte_flow_action_modify_data
structure description.

Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-14 14:34:31 +02:00
Viacheslav Ovsiienko
14fc81aed7 ethdev: update modify field flow action
The generic modify field flow action introduced in [1] has
some issues related to the immediate source operand:

  - immediate source can be presented either as an unsigned
    64-bit integer or pointer to data pattern in memory.
    There was no explicit pointer field defined in the union.

  - the byte ordering for 64-bit integer was not specified.
    Many fields have shorter lengths and byte ordering
    is crucial.

  - how the bit offset is applied to the immediate source
    field was not defined and documented.

  - 64-bit integer size is not enough to provide IPv6
    addresses.

In order to cover the issues and exclude any ambiguities
the following is done:

  - introduce the explicit pointer field
    in rte_flow_action_modify_data structure

  - replace the 64-bit unsigned integer with 16-byte array

  - update the modify field flow action documentation

Appropriate deprecation notice has been removed.

[1] commit 73b68f4c54 ("ethdev: introduce generic modify flow action")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-14 14:34:31 +02:00
Ivan Malov
1179f05cc9 ethdev: query proxy port to manage transfer flows
Not all DPDK ports in a given switching domain may have the
privilege to manage "transfer" flows. Add an API to find a
port with sufficient privileges by any port in the domain.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
2021-10-14 13:42:59 +02:00
Ivan Malov
9d2a349b38 ethdev: deprecate direction attributes in transfer flows
Attributes "ingress" and "egress" can only apply unambiguosly
to non-"transfer" flows. In "transfer" flows, the standpoint
is effectively shifted to the embedded switch. There can be
many different endpoints connected to the switch, so the
use of "ingress" / "egress" does not shed light on which
endpoints precisely can be considered as traffic sources.

Add relevant deprecation notices and suggest the use of precise
traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT).

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-10-13 22:59:26 +02:00
Ivan Malov
5da44faa80 ethdev: deprecate hard-to-use or ambiguous items and actions
PF, VF and PHY_PORT require that applications have extra
knowledge of the underlying NIC and thus are hard to use.
Also, the corresponding items depend on the direction
attribute (ingress / egress), which complicates their
use in applications and interpretation in PMDs.

The concept of PORT_ID is ambiguous as it doesn't say whether
the port in question is an ethdev or the represented entity.

Items and actions PORT_REPRESENTOR, REPRESENTED_PORT
should be used instead.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
88caad251c ethdev: add represented port action to flow API
For use in "transfer" flows. Supposed to send matching traffic to the
entity represented by the given ethdev, at embedded switch level.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
8edb6bc026 ethdev: add port representor action to flow API
For use in "transfer" flows. Supposed to send matching traffic to
the given ethdev (to the application), at embedded switch level.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
49863ae2bf ethdev: add represented port item to flow API
For use in "transfer" flows. Supposed to match traffic entering the
embedded switch from the entity represented by the given ethdev.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.

Must not be combined with direction attributes.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
081e42dab1 ethdev: add port representor item to flow API
For use in "transfer" flows. Supposed to match traffic
entering the embedded switch from the given ethdev.

Must not be combined with direction attributes.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:25 +02:00
Konstantin Ananyev
f9bdee267a ethdev: hide internal structures
Move rte_eth_dev, rte_eth_dev_data, rte_eth_rxtx_callback and related
data into private header (ethdev_driver.h).
Few minor changes to keep DPDK building after that.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:59 +02:00
Konstantin Ananyev
27a300e6af ethdev: add API to retrieve multiple MAC addresses
Introduce rte_eth_macaddrs_get() to allow user to retrieve all ethernet
addresses assigned to given port.
Change testpmd to use this new function and avoid referencing directly
rte_eth_devices[].

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:59 +02:00
Konstantin Ananyev
7a0935239b ethdev: make fast-path functions to use new flat array
Rework fast-path ethdev functions to use rte_eth_fp_ops[].
While it is an API/ABI breakage, this change is intended to be
transparent for both users (no changes in user app is required) and
PMD developers (no changes in PMD is required).
One extra thing to note - RX/TX callback invocation will cause extra
function call with these changes. That might cause some insignificant
slowdown for code-path where RX/TX callbacks are heavily involved.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
c87d435a4d ethdev: copy fast-path API into separate structure
Copy public function pointers (rx_pkt_burst(), etc.) and related
pointers to internal data from rte_eth_dev structure into a
separate flat array. That array will remain in a public header.
The intention here is to make rte_eth_dev and related structures internal.
That should allow future possible changes to core eth_dev structures
to be transparent to the user and help to avoid ABI/API breakages.
The plan is to keep minimal part of data from rte_eth_dev public,
so we still can use inline functions for fast-path calls
(like rte_eth_rx_burst(), etc.) to avoid/minimize slowdown.
The whole idea beyond this new schema:
1. PMDs keep to setup fast-path function pointers and related data
   inside rte_eth_dev struct in the same way they did it before.
2. Inside rte_eth_dev_start() and inside rte_eth_dev_probing_finish()
   (for secondary process) we call eth_dev_fp_ops_setup, which
   copies these function and data pointers into rte_eth_fp_ops[port_id].
3. Inside rte_eth_dev_stop() and inside rte_eth_dev_release_port()
   we call eth_dev_fp_ops_reset(), which resets rte_eth_fp_ops[port_id]
   into some dummy values.
4. fast-path ethdev API (rte_eth_rx_burst(), etc.) will use that new
   flat array to call PMD specific functions.
That approach should allow us to make rte_eth_devices[] private
without introducing regression and help to avoid changes in drivers code.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
8d7d4fcdca ethdev: change input parameters for Rx queue count
Currently majority of fast-path ethdev ops take pointers to internal
queue data structures as an input parameter.
While eth_rx_queue_count() takes a pointer to rte_eth_dev and queue
index.
For future work to hide rte_eth_devices[] and friends it would be
plausible to unify parameters list of all fast-path ethdev ops.
This patch changes eth_rx_queue_count() to accept pointer to internal
queue data as input parameter.
While this change is transparent to user, it still counts as an ABI change,
as eth_rx_queue_count_t is used by ethdev public inline function
rte_eth_rx_queue_count().

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
c024496ae8 ethdev: allocate max space for internal queue array
At queue configure stage always allocate space for maximum possible
number (RTE_MAX_QUEUES_PER_PORT) of queue pointers.
That will allow 'fast' inline functions (eth_rx_burst, etc.) to refer
pointer to internal queue data without extra checking of current number
of configured queues.
That would help in future to hide rte_eth_dev and related structures.
It means that from now on, each ethdev port will always consume:
((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT)
bytes of memory for its queue pointers.
With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Ivan Malov
f6d8a6d3fa ethdev: negotiate delivery of packet metadata from HW to PMD
Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
flow API responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
2021-10-13 00:47:42 +02:00
Andrew Rybchenko
92ef4b8f16 ethdev: remove deprecated shared counter attribute
Indirect actions should be used to do shared counters.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-10-12 19:20:57 +02:00
Viacheslav Galaktionov
ff4e52efb3 ethdev: fix representor port ID search by name
The patch is required for all PMDs which do not provide representors
info on the representor itself.

The function, rte_eth_representor_id_get(), is used in
eth_representor_cmp() which is required in ethdev class iterator to
search ethdev port ID by name (representor case). Before the patch
the function is called on the representor itself and tries to get
representors info to match.

Search of port ID by name is used after hotplug to find out port ID
of the just plugged device.

Getting a list of representors from a representor does not make sense.
Instead, a backer device should be used.

To this end, extend the rte_eth_dev_data structure to include the port ID
of the backing device for representors.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-10-12 16:54:20 +02:00
Andrew Rybchenko
6c31a8c20a ethdev: remove legacy Rx descriptor done API
rte_eth_rx_descriptor_status() should be used as a replacement.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-10-11 16:44:57 +02:00
Akhil Goyal
fb545457ed security: add reserved bit fields
In struct rte_security_ipsec_sa_options, for every new option
added, there is an ABI breakage, to avoid, a reserved_opts
bitfield is added to for the remaining bits available in the
structure.
Now for every new sa option, these reserved_opts can be reduced
and new option can be added.

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-18 20:12:19 +02:00
Akhil Goyal
3867ed0280 security: hide internal API
rte_security_dynfield_register() is an internal
API to be used by the driver, hence moving it to internal.

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-18 20:12:19 +02:00
Nicolas Chautru
ab4e19097b bbdev: add device info for data endianness
Added device information to capture explicitly the assumption
of the input/output data byte endianness being processed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-18 20:11:16 +02:00
Gagandeep Singh
8edcb68fd0 cryptodev: fix multi-segment raw vector processing
If no next segment available the “for” loop will fail and it still
returns i+1 i.e. 2, which is wrong as it has filled only 1 buffer.

Fixes: 7adf992fb9 ("cryptodev: introduce CPU crypto API")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 19:32:13 +02:00
Hemant Agrawal
68f5d3d320 cryptodev: add field for out-of-place in raw vector
The structure rte_crypto_sym_vec is updated to
add dest_sgl to support out of place processing.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:32:01 +02:00
Gagandeep Singh
6afd461f9f cryptodev: add total raw buffer length
The current crypto raw data vectors is extended to support
rte_security usecases, where we need total data length to know
how much additional memory space is available in buffer other
than data length so that driver/HW can write expanded size
data after encryption.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:32:01 +02:00
Hemant Agrawal
10488d59ae cryptodev: rename field in vector struct
This patch renames the sgl to src_sgl in struct rte_crypto_sym_vec
to help differentiating between source and destination sgl.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:31:15 +02:00
Radu Nicolau
2ed40da848 ipsec: support setting initial ESN value
Update IPsec library to support initial ESN value.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:11:59 +02:00
Radu Nicolau
68977baa75 ipsec: support SA telemetry
Add telemetry support for ipsec SAs.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:08:03 +02:00
Radu Nicolau
64df4712ce mbuf: add IPsec ESP tunnel type
Add ESP tunnel type to the tunnel types list that can be specified
for TSO or checksum on the inner part of tunnel packets.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-17 14:07:03 +02:00
Radu Nicolau
01eef5907f ipsec: support NAT-T
Add support for the IPsec NAT-Traversal use case for Tunnel mode
packets.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:06:24 +02:00
Radu Nicolau
17344c0278 security: add UDP parameters for IPsec NAT-T
Add support for specifying UDP port params for UDP encapsulation option.
RFC3948 section-2.1 does not enforce using specific the UDP ports for
UDP-Encapsulated ESP Header

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:03:43 +02:00
Radu Nicolau
c99d26197c ipsec: support more AEAD algorithms
Added support for AES_CCM, CHACHA20_POLY1305 and AES_GMAC.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:03:13 +02:00
Radu Nicolau
199fcba1bd security: add ESN field to IPsec xform
Update ipsec_xform definition to include ESN field.
This allows the application to control the ESN starting value.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 13:08:35 +02:00
Matan Azrad
cab0c8f3c0 cryptodev: extend data-unit length field
As described in [1] and as announced in [2], The field ``dataunit_len``
of the ``struct rte_crypto_cipher_xform`` moved to the end of the
structure and extended to ``uint32_t``.

In this way, sizes bigger than 64K bytes can be supported for data-unit
lengths.

[1] commit d014dddb2d ("cryptodev: support multiple cipher
data-units")
[2] commit 9a5c09211b ("doc: announce extension of crypto data-unit
length")

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-16 16:24:43 +02:00
David Marchand
afdaa60795 mempool: accept user flags only
As reported by Dmitry, RTE_MEMPOOL_F_POOL_CREATED is a flag only
manipulated internally.
This flag is not supposed to be requested from an application and would
probably result in an incorrect behavior if an application did pass it.

At least one other internal flag has been added recently and more may be
introduced later.

Rework the check and export a mask of valid user flags for use in the
unit test.

Fixes: b240af8b10 ("mempool: enforce valid flags at creation")

Reported-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-20 10:03:55 +02:00
Andrew Rybchenko
fb11ae8816 mempool: deprecate unused physical page defines
MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX are not used.

Fixes: fd943c764a ("mempool: deprecate xmem functions")

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:03:41 +02:00
Andrew Rybchenko
cb77b060eb mempool: add namespace to driver register macro
Add RTE_ prefix to macro used to register mempool driver.
The old one is still available but deprecated.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
d720366184 mempool: make header size calculation internal
Add RTE_ prefix to helper macro to calculate mempool header size and
make it internal. Old macro is still available, but deprecated.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
ad276d5c7e mempool: add namespace to internal helpers
Add RTE_ prefix to internal API defined in public header.
Use the prefix instead of double underscore.
Use uppercase for macros in the case of name conflict.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
c47d7b90a1 mempool: add namespace to flags
Fix the mempool flags namespace by adding an RTE_ prefix to the name.
The old flags remain usable, to be deprecated in the future.

Flag MEMPOOL_F_NON_IO added in the release is just renamed to have RTE_
prefix.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:16 +02:00
Andrew Rybchenko
925a83a5bf mempool: enhance flags documentation readability
Move documentation into a separate line just before define.
Prepare to have a bit longer flag name because of namespace prefix.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 09:58:39 +02:00
Feifei Wang
c4629b02c5 mcslock: use WFE in lock for aarch64
Instead of polling for previous lock holder unlocking, use
wait_until_equal API.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-10-20 08:22:41 +02:00
Feifei Wang
a6e24bf417 mem: use WFE for init sync on aarch64
Instead of polling for mcfg->magic to be updated, use wait_until_equal
API.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-10-20 08:22:18 +02:00
Joyce Kong
4da0136096 stack: remove unneeded atomic header include
In stack module, remove the header file rte_atomic.h
as it is not being used.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-19 17:15:10 +02:00
Dmitry Kozlyuk
11541c5c81 mempool: add non-IO flag
Mempool is a generic allocator that is not necessarily used
for device IO operations and its memory for DMA.
Add MEMPOOL_F_NON_IO flag to mark such mempools automatically
a) if their objects are not contiguous;
b) if IOVA is not available for any object.
Other components can inspect this flag
in order to optimize their memory management.

Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-19 16:35:16 +02:00
Dmitry Kozlyuk
da2b9cb25e mempool: add event callbacks
Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often mbuf memory comes from mempools that can come and
go. PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after the forwarding
starts but before an mbuf from the new mempool is sent to the device.

Add an API to register callback for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
The new API is internal, because it is primarily demanded by PMDs that
may need to deal with any mempools and do not control their creation,
while an application, on the other hand, knows which mempools it creates
and doesn't care about internal mempools PMDs might create.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-19 16:35:16 +02:00
Bruce Richardson
2e348d8fe3 dmadev: add flag for error handling support
Due to HW or driver limitations, not all dmadevs may support full error
handling e.g. safely managing and reporting an invalid address to a copy
operation. The skeleton dmadev, for example, being pure software will
always seg-fault if passed an invalid address. To indicate the
availability of safe error handling by a device, we add a capability
flag for it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:19:27 +02:00
Bruce Richardson
190f7e84c3 dmadev: add device iterator
Add a function and wrapper macro to iterate over all DMA devices.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:17:32 +02:00
Kevin Laatz
ea8cf0f853 dmadev: add burst capacity API
Add a burst capacity check API to the dmadev library. This API is useful to
applications which need to how many descriptors can be enqueued in the
current batch. For example, it could be used to determine whether all
segments of a multi-segment packet can be enqueued in the same batch or not
(to avoid half-offload of the packet).

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-18 11:17:30 +02:00
Bruce Richardson
5e0f859127 dmadev: add channel status check for testing use
Add in a function to check if a device or vchan has completed all jobs
assigned to it, without gathering in the results. This is primarily for
use in testing, to allow the hardware to be in a known-state prior to
gathering completions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:17:21 +02:00
Chengwen Feng
2ece65f00f dmadev: support multi-process
This patch add multi-process support for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
91e581e5c9 dmadev: add data plane API
This patch add data plane API for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
e0180db144 dmadev: add control plane API
This patch add control plane API for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
b36970f2e1 dmadev: introduce DMA device library
The 'dmadev' is a generic type of DMA device.

This patch introduce the 'dmadev' device allocation functions.

The infrastructure is prepared to welcome drivers in drivers/dma/

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:57 +02:00
David Marchand
e9123c467d mbuf: enforce no option for dynamic fields and flags
As stated in the API, dynamic field and flags should be created with no
additional flag (simply in the API for future changes).

Fix the dynamic flag register helper which was not enforcing it and add
unit tests.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-15 10:29:41 +02:00
David Marchand
bc1a35fb3f memzone: enforce valid flags when reserving
If we do not enforce valid flags are passed by an application, this
application might face issues in the future when we add more flags.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-15 10:29:21 +02:00
David Marchand
b240af8b10 mempool: enforce valid flags at creation
If we do not enforce valid flags are passed by an application, this
application might face issues in the future when we add more flags.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-15 10:24:43 +02:00
Bruce Richardson
85e21b77d8 telemetry: fix socket path conflicts for in-memory mode
When running using in-memory mode, multiple processes can use the same
runtime dir, leading to conflicts with the telemetry sockets in that
directory. We can resolve this by appending a suffix to each socket
beyond the first, with the suffix being an increasing counter value.
Each process uses the first unused socket counter value.

Fixes: 6dd571fd07 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
2021-10-14 20:31:10 +02:00
Bruce Richardson
e89463a366 eal: limit telemetry to primary processes
Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
2021-10-14 20:31:10 +02:00
David Christensen
b698651b91 eal/ppc: use compiler builtins for atomics
Replace existing PPC assembly code for rte_atomicXX ops with compiler
atomic builtins as previously adopted by DPDK (see [1] and [2]).  This
has the additional benefit of resolving a POWER10 build failure due to an
outstanding gcc issue which fails on the existing PPC assembly code [3].

[1] https://www.dpdk.org/blog/2021/03/26/dpdk-adopts-the-c11-memory-model/
[2] https://doc.dpdk.org/guides/rel_notes/deprecation.html
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2021-10-14 16:51:25 +02:00
Huichao Cai
567473433b ip_frag: fix fragmenting IPv4 fragment
Current implementation of rte_ipv4_fragment_packet() doesn’t take
into account offset and flag values of the given packet, but blindly
assumes they are always zero (original packet is not fragmented).
According to RFC791, fragment and flag values for new fragment
should take into account values provided in the original IPv4 packet.

Fixes: 4c38e5532a ("ip_frag: refactor IPv4 fragmentation into a proper library")
Cc: stable@dpdk.org

Signed-off-by: Huichao Cai <chcchc88@163.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-14 08:52:34 +02:00
Andrew Rybchenko
74a74bf98c mbuf: remove deprecated flag for bad outer IPv4 checksum
Removed offload flag PKT_RX_EIP_CKSUM_BAD. PKT_RX_OUTER_IP_CKSUM_BAD
should be used as a replacement.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-13 23:03:47 +02:00
Andrew Rybchenko
a87a0c0d1a mempool: fix name size in mempool structure
Use correct define as a name array size.

The change breaks ABI and therefore cannot be backported to
stable branches.

Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-10-13 22:54:10 +02:00
Stephen Hemminger
d75eed0fbe mbuf: fix typo in comment
Misspelling of 'copied'

Fixes: c3a90c381d ("mbuf: add a copy routine")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-13 19:02:18 +02:00
Gowrishankar Muthukrishnan
b76731683b telemetry: fix JSON output buffer length
Earlier, JSON message length was limited to 1024 which would not
allow data more than this size. Removed this limitation by creating
output buffer based on requested data length.

Fixes: 52af6ccb2b ("telemetry: add utility functions for creating JSON")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-10-13 18:17:24 +02:00
Bruce Richardson
0faa4cfc50 eal/freebsd: ignore in-memory option
The in-memory option is not supported on FreeBSD so print a warning and
ignore the flag when it is specified for BSD apps. The lack of support
is due to the different way in which memory is managed on FreeBSD using
the contigmem driver rather than via a hugetlbfs filesystem.

Fixes: 14de8734c4 ("eal: add --in-memory option")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-13 17:11:26 +02:00
Olivier Matz
8e506da755 net: promote IPv6 external headers skip API as stable
This function is public since commit 8f0e4d6a78 ("net: export IPv6
header extensions skip function") (2018), and is used by vmxnet3 driver.
Promote it as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-13 12:57:12 +02:00
David Marchand
2f3758751b eal/x86: sort CPU extended features definitions
Sort the definitions for extended features (leaf 0) to enhance
readability.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-12 21:07:53 +02:00
David Marchand
aae3037ab1 eal/x86: fix some CPU extended features definitions
Caught while checking CPUID related stuff in OVS.

According to [1], for Structured Extended Feature Flags Enumeration Leaf
(EAX = 0x07H, ECX = 0):

- BMI1 is associated to EBX, bit 3 (was incorrectly 2),
- SMEP is associated to EBX, bit 7 (was incorrectly 6),
- BMI2 is associated to EBX, bit 8 (was incorrectly 7),
- ERMS is associated to EBX, bit 9 (was incorrectly 8),

1: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-12 21:07:50 +02:00
John Levon
24d5a1ce6b eal/linux: allow hugetlbfs sub-directories
get_hugepage_dir() was implemented in such a way that a --huge-dir
option had to exactly match the mountpoint, but there's no reason for
this restriction: DPDK might not be the only user of hugepages, and
shouldn't assume it owns an entire mountpoint. For example, if I have
/dev/hugepages/myapp, and /dev/hugepages/dpdk, I should be able to
specify:

--huge-dir=/dev/hugepages/dpdk/

and have DPDK only use that sub-directory.

Fix the implementation to allow a sub-directory within a suitable
hugetlbfs mountpoint to be specified, preferring the closest match.

Signed-off-by: John Levon <john.levon@nutanix.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-10-12 21:07:46 +02:00
Archana Muniganti
03ab51eafd security: add SA config option for inner checksum
Add inner packet IPv4 hdr and L4 checksum enable options
in conf. These will be used in case of protocol offload.
Per SA, application could specify whether the
checksum(compute/verify) can be offloaded to security device.

Signed-off-by: Archana Muniganti <marchana@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-08 21:39:39 +02:00
Shijith Thotton
dd451ad152 doc: remove event crypto metadata deprecation note
Proposed change to event crypto metadata is not done as per deprecation
note. Instead, comments are updated in spec to improve readability.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
5f13f4c03d bbdev: reduce log level of a failure message
Queue setup may genuinely fail when adding incremental queues
for a given priority level. In that case application would
attempt to configure a queue at a different priority level.
Not an actual error.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
10ea15e35f bbdev: add capability for 4G CB CRC drop
Adding option to drop CRC24B to align with existing
feature for 5G

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
cc360fd3f2 bbdev: add capability for CRC16 check
Adding a missing operation when CRC16
is being used for TB CRC check.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Tejasree Kondoj
f7e3aa693d security: add option to configure UDP ports verification
Add option to indicate whether UDP encapsulation ports
verification need to be done as part of inbound
IPsec processing.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-08 21:31:07 +02:00
Dmitry Kozlyuk
d47dd94162 eal/windows: do not install virt2phys header
The header was not intended to be a public one.
DPDK users should use `rte_mem_virt2iova()` to translate addresses.
Other virt2phys users should use the header from the driver instead.

Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-11 21:17:12 +02:00
Narcisa Vasile
694c81721e eal/windows: fix CPU cores counting
On Windows, -l/--lcores EAL option was unable to process CPU sets
containing CPUs other than 0 and 1, because CPU_COUNT() macro
only checked these CPUs in the set. Fix CPU_COUNT() by enumerating
all possible CPU indices.

Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2021-10-11 18:52:56 +02:00
Lance Richardson
dc954ae73a net: fix checksum API documentation
Minor corrections and improvements to documentation
for checksum APIs.

Fixes: 6006818cfb ("net: new checksum functions")
Fixes: 45a08ef55e ("net: introduce functions to verify L4 checksums")
Cc: stable@dpdk.org

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-07 14:42:45 +02:00
Andrew Rybchenko
b225783dda ethdev: remove legacy mirroring API
A more fine-grain flow API action RTE_FLOW_ACTION_TYPE_SAMPLE should
be used instead of it.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-07 13:02:26 +02:00
Xueming Li
7483341ae5 ethdev: change queue release callback
Currently, most ethdev callback API use queue ID as parameter, but Rx
and Tx queue release callback use queue object which is used by Rx and
Tx burst data plane callback.

To align with other eth device queue configuration callbacks:
- queue release callbacks are changed to use queue ID
- all drivers are adapted

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 19:16:03 +02:00
Xueming Li
49ed322469 ethdev: make queue release callback optional
Some drivers don't need Rx and Tx queue release callback, make them
optional. Clean up empty queue release callbacks for some drivers.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2021-10-06 19:16:03 +02:00
Andrew Rybchenko
8c9f976f05 ethdev: improve xstats names by IDs get prototype
Adjust parameters order to eth_xstats_get_by_id_t prototype.
Make ids the second parameter similar to eth_xstats_get_by_id_t.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Ivan Ilchenko
71b5e430a6 ethdev: update xstats by ID driver callbacks documentation
Update xstats by IDs callbacks documentation in accordance with
ethdev usage of these callbacks. Document valid combinations of
input arguments to make driver implementation simpler.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Andrew Rybchenko
113778be13 ethdev: do not use get xstats names by IDs to obtain count
Relax requirements on get xstats names by IDs. After the patch
corresponding the driver operation is called with non-NULL ids
and xstats_names parameters only.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Ivan Ilchenko
bc5112ca59 ethdev: fix xstats by ID API documentation
Document valid combinations of input arguments in accordance with
current implementation in ethdev.

Fixes: 79c913a42f ("ethdev: retrieve xstats by ID")
Cc: stable@dpdk.org

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Dmitry Kozlyuk
04d43857ea net: rename Ethernet header fields
Definition of `rte_ether_addr` structure used a workaround allowing DPDK
and Windows SDK headers to be used in the same file, because Windows SDK
defines `s_addr` as a macro. Rename `s_addr` to `src_addr` and `d_addr`
to `dst_addr` to avoid the conflict and remove the workaround.
Deprecation notice:
https://mails.dpdk.org/archives/dev/2021-July/215270.html

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-08 14:58:11 +02:00
Tal Shnaiderman
56d2c1aa0b security: build on Windows
Build the security library on Windows.

Remove unneeded export of inline functions from version file.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
Tal Shnaiderman
cb7b6898c8 cryptodev: build on Windows
Build the cryptography device library on Windows OS
by removing unneeded include and exports of inline functions
blocking the compilation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
Tal Shnaiderman
f9b2a75ed4 security: use net library to include IP structs
Remove the netinet includes and replaces them
with rte_ip.h to support the in_addr/in6_addr structs
on all operating systems.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
David Marchand
ddfc59f4fb sort symbol maps
Fixed with ./devtools/update-abi.sh $(cat ABI_VERSION)

Fixes: e73a7ab224 ("net/softnic: promote manage API")
Fixes: 8f532a34c4 ("fib: promote API to stable")
Fixes: 4aeb92396b ("rib: promote API to stable")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-05 17:03:37 +02:00
Xuan Ding
07ee2d7505 vhost: normalize return type and function name
In some function definitions, adjust return type and function name on
a separate line to be consistent with DPDK coding style.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 21:23:00 +02:00
David Marchand
5ac9d766b5 vhost: rework RARP packet injection
Caught by code review, this copy is unnecessary.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 21:21:07 +02:00
Eugenio Pérez
e7cb7fdf54 vhost: clean IOTLB cache on vring stop
Old IOVA cache entries are left when there is a change on virtio driver
in VM. In case that all these old entries have iova addresses lesser
than new iova entries, vhost code will need to iterate all the cache to
find the new ones. In case of just a new iova entry needed for the new
translations, this condition will last forever.

This has been observed in virtio-net to testpmd's vfio-pci driver
transition, reducing the performance from more than 10Mpps to less than
0.07Mpps if the hugepage address was higher than the networking
buffers. Since all new buffers are contained in this new gigantic page,
vhost needs to scan IOTLB_CACHE_SIZE - 1 for each translation at worst.

Fixes: 69c90e98f4 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reported-by: Pei Zhang <pezhang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 17:26:44 +02:00
Stephen Hemminger
8680efa04c mbuf: promote Tx offload helper to stable
This function should be made stable now.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-05 11:04:03 +02:00
Stephen Hemminger
a767a3f43e mbuf: promote check helper to stable
This one has been in for required time period.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 11:04:03 +02:00
Stephen Hemminger
7ddade3555 mbuf: promote dynamic fields to stable
These functions to register dynamic fields were added in 19.11
and should be promoted to stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 11:03:58 +02:00
Stephen Hemminger
bf384709c7 mbuf: promote more helpers to stable
These two functions were added in 19.11 as experimental.
Time to promote the to stable status.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 10:59:35 +02:00
David Marchand
8d2a436d69 mbuf: promote some helpers to stable
Those accessors have been introduced more than two years ago
(rte_mbuf_to_priv in v18.08, rte_mbuf_*_addr* in v19.02).
Time to mark them stable.

rte_mbuf_to_baddr() could be removed, but since we lack a deprecation
notice, keep it as a simple wrapper.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-05 10:59:31 +02:00
Sean Morrissey
f01eff0d23 ring: promote new sync modes and peek to stable
These methods were introduced in 20.05.
There has been no changes in their public API since then.
They seem mature enough to remove the experimental tag.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-05 10:09:15 +02:00
Bruce Richardson
47a4f2650c eal/freebsd: lock memory device to prevent conflicts
Only a single DPDK process on the system can be using the /dev/contigmem
mappings at a time, but this was never explicitly enforced, e.g. when
using --in-memory flag on two processes. To prevent possible conflict
issues, we lock the dev node when it's in use, preventing other DPDK
processes from starting up and causing problems for us.

Fixes: 764bf26873 ("add FreeBSD support")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-10-02 16:30:16 +02:00
Vladimir Medvedkin
8f532a34c4 fib: promote API to stable
The fib and fib6 API's have been in since 19.11 and
should be marked as stable.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2021-10-02 11:37:25 +02:00
Stephen Hemminger
4aeb92396b rib: promote API to stable
The rib and rib6 API's have been in since 19.11 and
should be marked as stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-10-02 11:37:25 +02:00
Stephen Hemminger
2cea5168c1 net: promote string to ethernet to stable
This function has been in since 19.11.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-02 11:35:17 +02:00
Xiao Wang
20ab35d032 net: promote make rarp packet function to stable
rte_net_make_rarp_packet was introduced in version v18.02, there was no
change in this public API since then, and it's still being used by vhost
lib and virtio driver, so promote it as stable ABI.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-02 11:12:32 +02:00
Ivan Malov
8cfad59e29 log: promote some function to stable
This one might be quite mature to be attested as stable.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-02 11:12:32 +02:00
Mattias Rönnblom
15a1e00a65 eal: promote random generator with upper bound to stable
Remove experimental tag from rte_rand_max().

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-02 11:12:19 +02:00
Bruce Richardson
b8a0fbab98 telemetry: promote API to stable
The telemetry APIs have been present and unchanged for >1 year now,
so remove experimental tag from them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-01 17:17:28 +02:00
Jie Zhou
09e4eceacb mempool/stack: build on Windows
Enable build of mempool/stack on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-01 16:46:05 +02:00
Pablo de Lara
8751a7e983 efd: allow more CPU sockets in table creation
rte_efd_create() function was using uint8_t for a socket bitmask,
for one of its parameters.
This limits the maximum of NUMA sockets to be 8.
Changing to uint64_t increases it to 64, which should be
more future-proof.

Coverity issue: 366390
Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
2021-10-01 16:33:20 +02:00
Kevin Traynor
4ad8807cfc bitrate: promote free function to stable
rte_stats_bitrate_free() has been in DPDK since 20.11.

Its signature is very basic as it just frees an opaque
data struct allocated in rte_stats_bitrate_create()
and returns void.

It's unlikely that such a basic signature would need to change
so might as well promote it to stable for the next major ABI.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:31:47 +02:00
Kevin Traynor
bdd478eede bitrate: fix calculation to match API description
rte_stats_bitrate_calc() API states it returns 'Negative value on error'.

However, the implementation will return the error code from
rte_eth_stats_get() which may be non-zero on error.

Change the implementation of rte_stats_bitrate_calc() to match
the API description by always returning a negative value on error.

Fixes: 2ad7ba9a65 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:31:06 +02:00
Kevin Traynor
06ae9f0f92 bitrate: fix registration to match API description
rte_stats_bitrate_reg() API states it returns 'Zero on success'.

However, the implementation directly returns the return of
rte_metrics_reg_names() which may be zero or positive on success,
with a positive value also indicating the index.

The user of rte_stats_bitrate_reg() should not care about the
index as it is stored in the opaque rte_stats_bitrates struct.

Change the implementation of rte_stats_bitrate_reg() to match
the API description by always returning zero on success.

Fixes: 2ad7ba9a65 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:29:21 +02:00
Stephen Hemminger
71ecc415c5 telemetry: detach threads
There are a number telemetry threads which are created and
there is nothing that does pthread_join() to wait for them.
Mark these threads as detached, so that the pthread library
can cleanup state when the thread exits.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-10-01 14:50:16 +02:00
Cian Ferriter
0203a14c72 ring: fix Doxygen comment of internal function
Change "enqueue" to "dequeue" because the __rte_ring_move_cons_head()
function is updating the consumer head for dequeue.

Fixes: 0dfc98c507 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org

Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-10-01 14:39:12 +02:00
William Tu
f1f6ebc0ea eal: remove sys/queue.h from public headers
Currently there are some public headers that include 'sys/queue.h', which
is not POSIX, but usually provided by the Linux/BSD system library.
(Not in POSIX.1, POSIX.1-2001, or POSIX.1-2008. Present on the BSDs.)
The file is missing on Windows. During the Windows build, DPDK uses a
bundled copy, so building a DPDK library works fine.  But when OVS or other
applications use DPDK as a library, because some DPDK public headers
include 'sys/queue.h', on Windows, it triggers an error due to no such
file.

One solution is to install the 'lib/eal/windows/include/sys/queue.h' into
Windows environment, such as [1]. However, this means DPDK exports the
functionalities of 'sys/queue.h' into the environment, which might cause
symbols, macros, headers clashing with other applications.

The patch fixes it by removing the "#include <sys/queue.h>" from
DPDK public headers, so programs including DPDK headers don't depend
on the system to provide 'sys/queue.h'. When these public headers use
macros such as TAILQ_xxx, we replace it by the ones with RTE_ prefix.
For Windows, we copy the definitions from <sys/queue.h> to rte_os.h
in Windows EAL. Note that these RTE_ macros are compatible with
<sys/queue.h>, both at the level of API (to use with <sys/queue.h>
macros in C files) and ABI (to avoid breaking it).

Additionally, the TAILQ_FOREACH_SAFE is not part of <sys/queue.h>,
the patch replaces it with RTE_TAILQ_FOREACH_SAFE.

[1] http://mails.dpdk.org/archives/dev/2021-August/216304.html

Suggested-by: Nick Connolly <nick.connolly@mayadata.io>
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2021-10-01 13:09:43 +02:00
Dmitry Kozlyuk
6787d0af94 lib: remove sched.h from public headers
Public headers including POSIX-specific <sched.h> were unusable
on Windows. These includes were superfluous, remove them.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-01 08:35:05 +02:00
Dmitry Kozlyuk
b7c3eb57bb eal/windows: fix export list
* Version and randomness API were not added to .def file by mistake,
  which is why they were later excluded from the export list.
* Device API stubs were added to EAL but not exported.

Fixes: edd66d57d5 ("eal/windows: add random function")
Fixes: 3d2fcb0e0a ("eal/windows: add device event stubs")
Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-09-30 22:47:43 +02:00
Dmitry Kozlyuk
cf665406b1 eal: remove Windows-specific list of common files
The majority of common EAL sources that are built for all platforms were
listed separately for Windows and for other OS. It seems that developers
adding modules to EAL perceived this as if Windows supported
only a limited subset of modules and only added new ones into another.
Factor the truly common modules into a shared list,
then extend it with modules supported by different platforms.

When the two lists were created, UUID API implementation was removed
from Windows build (apparently by mistake), then excluded from the
export list for no reason other than not being built. Restore it.

Fixes: df3ff6be2b ("eal: simplify meson build of common directory")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-09-30 22:47:28 +02:00
William Tu
fe81e52a91 eal/windows: export version function
When OVS inits, it calls rte_version to get the DPDK's version.
The patch fixes the error below by exposing rte_version symbol.
libopenvswitch.a(dpdk.c.obj) : error LNK2019: unresolved external symbol
rte_version referenced in function dpdk_init

Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-09-30 22:47:23 +02:00
Pallavi Kadam
876d40fe6d net: enable random address on Windows
IAVF PMD needs to generate a random MAC address if it is not configured
by host.
'random' is now supported on Windows.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Shivanshu Shukla <shivanshu.shukla@intel.com>
2021-09-30 20:51:11 +02:00
Olivier Matz
f0e18cb4a8 kvargs: fix comments style
A '*' is missing at 2 places, add them.

Fixes: e1a00536c8 ("kvargs: add a new library to parse key/value arguments")
Fixes: 3ab385063c ("kvargs: add get by key")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 17:38:13 +02:00
Olivier Matz
6aebb94290 kvargs: add function to get from key and value
A quite common scenario with kvargs is to lookup for a <key>=<value> in
a kvlist. For instance, check if name=foo is present in
name=toto,name=foo,name=bar. This is currently done in drivers/bus with
rte_kvargs_process() + the rte_kvargs_strcmp() handler.

This approach is not straightforward, and can be replaced by this new
function.

rte_kvargs_strcmp() is then removed.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 17:38:02 +02:00
Olivier Matz
b36d02ab39 kvargs: promote get from key as stable
The function rte_kvargs_get() is used by eal and pci bus driver since
its introduction in commit 3ab385063c ("kvargs: add get by key") and
commit d2a66ad794 ("bus: add device arguments name parsing"), in
dpdk 21.05.

Let's promote it as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 15:31:01 +02:00
Olivier Matz
4f5520d910 kvargs: promote delimited parsing as stable
This function is used by EAL to parse key/value strings separated with
specified delimiters.

It was introduced in 2018 by commit 5d6af85ab0 ("kvargs: introduce a
more flexible parsing function"), and can be promoted as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 15:30:31 +02:00
Raslan Darawsheh
16b8e92d49 ethdev: use extension header for GTP PSC item
This updates the gtp_psc flow item to use the net header
definition of the gtp_psc to be based on RFC 38415-g30

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-28 12:34:58 +02:00
Raslan Darawsheh
e8ca1479cd net: add extension header for GTP PSC
Define new rte header for GTP PDU session container
based on RFC 38415-g30

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-28 12:34:58 +02:00
Thomas Monjalon
0ce56b057b ethdev: group constant definitions in Doxygen
A lot of flags are parts of a group but are documented alone.
The Doxygen syntax @{ and @} for grouping is used
to make flags appear together and have a common description.

Some Rx/Tx offload flags and RSS definitions are not grouped
because they need to be all properly documented first.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2021-09-27 13:34:45 +02:00
Alvin Zhang
81b0fbb85b ethdev: add IPv4 and L4 checksum RSS offload types
This patch defines new RSS offload types for IPv4 and
L4(TCP/UDP/SCTP) checksum, which are required when users want
to distribute packets based on the IPv4 or L4 checksum field.

For example "flow create 0 ingress pattern eth / ipv4 / end
actions rss types ipv4-chksum end queues end / end", this flow
causes all matching packets to be distributed to queues on
basis of IPv4 checksum.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Aman Deep Singh <aman.deep.singh@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-21 10:25:42 +02:00
Anatoly Burakov
de4ffd50c9 mem: promote some shared memory config API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
27e7e2509c mem: promote DMA mask API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
acddc33b3e mem: promote external memory API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
b893775065 mem: promote memseg API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
437cb6e826 malloc: promote some experimental API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
c335ffdbf7 fbarray: promote experimental API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
1611654bd6 ipc: promote experimental API to stable
As per ABI policy, move the formerly experimental API's to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Tejasree Kondoj
f0b538a5f8 security: add option to configure tunnel header verification
Add option to indicate whether outer header verification
need to be done as part of inbound IPsec processing.

With inline IPsec processing, SA lookup would be happening
in the Rx path of rte_ethdev. When rte_flow is configured to
support more than one SA, SPI would be used to lookup SA.
In such cases, additional verification would be required to
ensure duplicate SPIs are not getting processed in the inline path.

For lookaside cases, the same option can be used by application
to offload tunnel verification to the PMD.

These verifications would help in averting possible DoS attacks.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 17:40:52 +02:00
Anoob Joseph
ad7515a39f security: add SA lifetime configuration
Add SA lifetime configuration to register soft and hard expiry limits.
Expiry can be in units of number of packets or bytes. Crypto op
status is also updated to include new field, aux_flags, which can be
used to indicate cases such as soft expiry in case of lookaside
protocol operations.

In case of soft expiry, the packets are successfully IPsec processed but
the soft expiry would indicate that SA needs to be reconfigured. For
inline protocol capable ethdev, this would result in an eth event while
for lookaside protocol capable cryptodev, this can be communicated via
`rte_crypto_op.aux_flags` field.

In case of hard expiry, the packets will not be IPsec processed and
would result in error.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 14:11:29 +02:00
Anoob Joseph
63992166ba security: support user-specified IV
Enabled user to provide IV to be used per security
operation. This would be used with lookaside protocol
offload for comparing against known vectors.

By default, PMD would internally generate random IV.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-09-28 13:35:32 +02:00
Nithin Dabilpuram
d08dcd28c3 security: add option for faster user/meta data access
Currently rte_security_set_pkt_metadata() and rte_security_get_userdata()
methods to set pkt metadata on Inline outbound and get userdata
after Inline inbound processing is always driver specific callbacks.

For drivers that do not have much to do in the callbacks but just
to update metadata in rte_security dynamic field and get userdata
from rte_security dynamic field, having to just to PMD specific
callback is costly per packet operation. This patch provides
a mechanism to do the same in inline function and avoid function
pointer jump if a driver supports the same.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 08:43:47 +02:00
Nithin Dabilpuram
6d1f8c1319 mbuf: enforce semantics for Tx inline IPsec processing
Not all net PMD's/HW can parse packet and identify L2 header and
L3 header locations on Tx. This is inline with other Tx offloads
requirements such as L3 checksum, L4 checksum offload, etc,
where mbuf.l2_len, mbuf.l3_len etc, needs to be set for HW to be
able to generate checksum. Since Inline IPsec is also such a Tx
offload, some PMD's at least need mbuf.l2_len to be valid to
find L3 header and perform Outbound IPSec processing.

Hence, this patch updates documentation to enforce setting
mbuf.l2_len while setting PKT_TX_SEC_OFFLOAD in mbuf.ol_flags
for Inline IPsec Crypto / Protocol offload processing to
work on Tx.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-27 09:55:41 +02:00
Shijith Thotton
a0a388a897 eal: add macro to swap two variables
Add a macro to swap two variables
and updat common autotest for the same.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-09-27 18:33:45 +02:00
Julien Meunier
6ded44bce4 stack: fix reload head when pop fails
The previous commit 18effad9cf ("stack: reload head when pop fails")
only changed C11 implementation, not generic implementation.

List head must be loaded right before continue (when failed to find the
new head). Without this, one thread might keep trying and failing to pop
items without ever loading the new correct head.

Fixes: 3340202f59 ("stack: add lock-free implementation")
Cc: stable@dpdk.org

Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-09-27 17:28:55 +02:00
Xueming Li
eb5636e879 sched: get 64-bit greatest common divisor
This patch adds new function that compute the greatest common
divisor of 64 bits, also changes the original 32 bits function
to call this new 64-bit version.

Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2021-09-27 17:24:16 +02:00
Cristian Dumitrescu
175d213bf8 pipeline: improve handling of learner action arguments
The arguments of actions that are learned are now specified as part of
the learn instruction as opposed to being statically specified as part
of the learner table configuration.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:18:49 +02:00
Cristian Dumitrescu
1c6571c837 pipeline: enable pipeline compilation
Commit the pipeline changes when the compilation process is
successful: change the table lookup instructions to execute the action
function for each action, replace the regular pipeline instructions
with the custom instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:10:26 +02:00
Cristian Dumitrescu
f898a475c3 pipeline: build shared object for pipeline
Build the generated C file into a shared object library.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
2021-09-27 12:10:20 +02:00
Cristian Dumitrescu
724f3ef422 pipeline: generate custom instruction functions
Generate a C function for each custom instruction, which essentially
consolidate multiple regular instructions into a single function call.
The pipeline program is split into groups of instructions, and a
custom instruction is generated for each group that has more than one
instruction. Special care is taken the instructions that can do thread
yield (RX, extern) and for those that can change the instruction
pointer (TX, near/far jump).

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:54 +02:00
Cristian Dumitrescu
d025528d74 pipeline: generate action functions
Generate a C function for each action. For most instructions, the
associated inline function is called directly. Special care is taken
for TX, jump and return instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:45 +02:00
Cristian Dumitrescu
216bc906d0 pipeline: export pipeline instructions to file
Export the array of translated instructions to a C file. There is one
such array per action and one for the pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:26 +02:00
Cristian Dumitrescu
fc64098a1a pipeline: introduce pipeline compilation
Lay the foundation to generate C code for the pipeline: C functions
for actions and custom instructions are generated, built as shared
object library and loaded into the pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:15 +02:00
Cristian Dumitrescu
dfa9491a18 pipeline: introduce custom instructions
For better performance, the option to create custom instructions when
the program is translated and add them on-the-fly to the pipeline is
now provided. Multiple regular instructions can now be consolidated
into a single C function optimized by the C compiler directly.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:13 +02:00
Cristian Dumitrescu
5dc6a5f2e7 pipeline: introduce action functions
For better performance, the option to run a single function per action
is now provided, which requires a single function call per action that
can be better optimized by the C compiler, as opposed to one function
call per instruction. Special table lookup instructions are added to
to support this feature.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:11 +02:00
Cristian Dumitrescu
4bd025dc98 pipeline: enable persistent instruction meta-data
Save the instruction meta-data for later use instead of freeing it up
once the instruction translation is completed.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:23 +02:00
Cristian Dumitrescu
40baf712ef pipeline: create inline functions for instruction operands
Create inline functions to get the instruction operands.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:20 +02:00
Cristian Dumitrescu
0d5910ddcf pipeline: create inline functions for meter instructions
Create inline functions for the meter instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:18 +02:00
Cristian Dumitrescu
c5d03ffda7 pipeline: create inline functions for register instructions
Create inline functions for the register instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:16 +02:00
Cristian Dumitrescu
ed7567c9d7 pipeline: create inline functions for ALU instructions
Create inline functions for the ALU instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:15 +02:00
Cristian Dumitrescu
fae7b2baa3 pipeline: create inline functions for DMA instruction
Create inline functions for the DMA instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:13 +02:00
Cristian Dumitrescu
b82733ab25 pipeline: create inline functions for move instruction
Create inline functions for the move instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:12 +02:00
Cristian Dumitrescu
4884264b17 pipeline: create inline functions for extern instruction
Create inline functions for the extern instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:10 +02:00
Cristian Dumitrescu
d1a58ada1a pipeline: create inline functions for learn instruction
Create inline functions for the learn and forget instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:09 +02:00
Cristian Dumitrescu
4565d7db70 pipeline: create inline functions for validate instruction
Create inline functions for the validate and invalidate instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:07 +02:00
Cristian Dumitrescu
d60dbdc88a pipeline: create inline functions for emit instruction
Create inline functions for the emit instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:03:05 +02:00
Cristian Dumitrescu
2574fd607e pipeline: create inline functions for extract instruction
Create inline functions for the extract instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 11:59:47 +02:00
Cristian Dumitrescu
fcb03ae09e pipeline: create inline functions for Tx instruction
Create inline functions for the Tx instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 11:59:37 +02:00
Cristian Dumitrescu
101d7f09bf pipeline: create inline functions for Rx instruction
Create inline functions for the Rx instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 11:59:28 +02:00
Cristian Dumitrescu
c693add3bf pipeline: move thread inline functions to header file
Move the thread inline functions to the internal header file.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 11:59:07 +02:00
Cristian Dumitrescu
97b8278ad9 pipeline: move data structures to internal header file
Start to consolidate the data structures and inline functions required
by the pipeline instructions into an internal header file.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 11:43:36 +02:00
Cristian Dumitrescu
4f59d37261 pipeline: support learner tables
Add pipeline level support for learner tables.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:52:04 +02:00
Cristian Dumitrescu
0c06fa3bfa table: support learner tables
A learner table is typically used for learning or connection tracking,
where it allows for the implementation of the "add on miss" scenario:
whenever the lookup key is not found in the table (lookup miss), the
data plane can decide to add this key to the table with a given action
with no control plane intervention. Likewise, the table keys expire
based on a configurable timeout and are automatically deleted from the
table with no control plane intervention.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:30:41 +02:00
Cristian Dumitrescu
220d419b86 pipeline: add header look-ahead instruction
Added look-ahead instruction to read a header from the input packet
without advancing the extraction pointer. This is typically used in
correlation with the special extract instruction to extract variable
size headers from the input packet: the first few header fields are
read without advancing the extraction pointer, just enough to detect
the actual length of the header (e.g. IPv4 IHL field); then the full
header is extracted.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:15:07 +02:00
Cristian Dumitrescu
5972f82936 pipeline: add variable size headers extract instruction
Added a mechanism to extract variable size headers through a special
flavor of the extract instruction. The length of the last struct field
which has variable size is passed as argument to the instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:14:56 +02:00
Cristian Dumitrescu
cef3896928 pipeline: support variable size headers
Added support for variable size headers. The last field of a struct
type can now have a variable size between 0 and N bytes. Useful to
accommodate IPv4 packets with options, etc.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:14:45 +02:00
Cristian Dumitrescu
5f3e610422 pipeline: prepare for variable size headers
The emit instruction that is responsible for pushing headers into the
output packet is now reading the header length from internal run-time
structures as opposed to constant value from the instruction opcode.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 09:14:41 +02:00
Harman Kalra
3f156ec284 metrics: promote deinitialize API
Remove experimental flag from rte_metrics_deinit().
This API was introduced in 19.11 release.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-23 19:21:47 +02:00
Tal Shnaiderman
826476bc70 eal/windows: fix debug build
When building DPDK on Windows in debug mode the following
warning appear:

warning: token pasting of ',' and __VA_ARGS__ is a GNU extension
[-Wgnu-zero-variadic-macro-arguments] #define open(path, flags, ...)
_open(path, flags, ##__VA_ARGS__)

Modify the 'open' macro to avoid it.

Fixes: 45d62067c2 ("eal: make OS shims internal")
Cc: stable@dpdk.org

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-09-23 19:16:27 +02:00
Radu Nicolau
d2671e642a telemetry: support dict of dicts
Add support for dicts of dicts to telemetry library.
Increase the max string size to 128.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-09-23 14:15:29 +02:00
Thomas Monjalon
ca0c25bbc9 eal: reword logs for CPU and NUMA counts
Some logs about cores and nodes were using hypotetic plural (s) form.
A fixed plural form with value at the end is preferred.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-09-23 08:55:20 +02:00
Thomas Monjalon
7a33572057 lib: remove C++ include guard from private headers
The private headers are compiled internally with a C compiler.
Thus extern "C" declaration is useless in such files.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-09-22 22:00:17 +02:00
Chenbo Xia
945ef8a040 vhost: promote some APIs to stable
As reported by symbol bot, APIs listed in this patch have been
experimental for more than two years. This patch promotes these
18 APIs to stable.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-09-14 13:21:57 +02:00
Gaoxiang Liu
e53084d84b vhost: log socket path on adding connection
Add log print of socket path in vhost_user_add_connection.
It's useful when adding a mass of socket connections,
because the information of every connection is clearer.

Fixes: 8f972312b8 ("vhost: support vhost-user")
Cc: stable@dpdk.org

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-14 13:21:57 +02:00
Gaoxiang Liu
451dc0fad8 vhost: fix crash on port deletion
The rte_vhost_driver_unregister() and vhost_user_read_cb()
can be called at the same time by 2 threads.
when memory of vsocket is freed in rte_vhost_driver_unregister(),
the invalid memory of vsocket is accessed in vhost_user_read_cb().
It's a bug of both mode for vhost as server or client.

E.g., vhostuser port is created as server.
Thread1 calls rte_vhost_driver_unregister().
Before the listen fd is deleted from poll waiting fds,
"vhost-events" thread then calls vhost_user_server_new_connection(),
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

E.g., vhostuser port is created as client.
Thread1 calls rte_vhost_driver_unregister().
Before vsocket of reconn is deleted from reconn list,
"vhost_reconn" thread then calls vhost_user_add_connection()
then a new conn fd is added in fdset when trying to reconnect.
"vhost-events" thread then calls vhost_user_read_cb() and
accesses invalid memory of socket while thread1 frees the memory of
vsocket.

The fix is to move the "fdset_try_del" in front of free memory of conn,
then avoid the race condition.

The core trace is:
Program terminated with signal 11, Segmentation fault.

Fixes: 52d874dc67 ("vhost: fix crash on closing in client mode")
Cc: stable@dpdk.org

Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-14 13:21:57 +02:00
Jiayu Hu
abeb865255 vhost: remove copy threshold for async path
Copy threshold has been introduced in async vhost data
path to select the appropriate copy engine to do copies
for higher efficiency.

However, it may cause packets ordering issues and also
introduces performance unpredictability.

Therefore, this patch removes copy threshold support in
async vhost data path.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-09-14 13:21:55 +02:00
Mohsin Kazmi
818ce1132a net: fix checksum offload for outer IPv4
Preparation of the headers for the hardware offload
misses the outer IPv4 checksum offload.
It results in bad checksum computed by hardware NIC.

This patch fixes the issue by setting the outer IPv4
checksum field to 0.

Fixes: 4fb7e803eb ("ethdev: add Tx preparation")
Cc: stable@dpdk.org

Signed-off-by: Mohsin Kazmi <mohsin.kazmi14@gmail.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-09-15 12:51:49 +02:00
David Marchand
b37ed6def3 ethdev: promote sibling iterators to stable
This API saw no update since its introduction and will help applications
like OVS ([1] and [2]) that currently look at rte_eth_devices[] to
achieve the same.

1: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1285
2: https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L1476

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-15 11:48:54 +02:00
Haiyue Wang
9fecac6c3a ethdev: promote burst mode API
The DPDK Symbol Bot reports:
Please note the symbols listed below have expired. In line with the
DPDK ABI policy, they should be scheduled for removal, in the next
DPDK release.

Symbol
rte_eth_rx_burst_mode_get
rte_eth_tx_burst_mode_get

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-09-15 10:46:00 +02:00
Joyce Kong
97d7b92772 ethdev: fix typo in Rx queue setup API comment
Fix a typo that mb_pool was misspelt as mp_pool.

Fixes: 4ff702b5df ("ethdev: introduce Rx buffer split")
Cc: stable@dpdk.org

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-15 10:38:43 +02:00
Pavan Nikhilesh
2def522abc ethdev: promote API to set packet types
Remove experimental tag from rte_eth_dev_set_ptypes().

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-15 09:54:04 +02:00
Xiaoyun Li
dbd34bee98 ethdev: promote API to get interrupt FD per queue
Remove the experimental tag for rte_eth_dev_rx_intr_ctl_q_get_fd API
that was introduced in 18.11 and have been around for 11 releases.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-09-14 18:18:06 +02:00
Conor Walsh
4777674c44 eal: fix memory leak when saving arguments
This patch fixes a memleak which was reported in Bugzilla within the
eal_save_args function. This was caused by the function mistakenly
adding -- to the eal args instead of breaking beforehand.

Bugzilla ID: 722
Fixes: 293c53d8b2 ("eal: add telemetry callbacks")

Reported-by: Zhihong Peng <zhihongx.peng@intel.com>
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Signed-off-by: Conor Fogarty <conor.fogarty@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-09-16 22:24:54 +02:00
Stephen Hemminger
7df485eb3d eal: remove deprecated noninclusive API
New API for these were added in 20.11 and the old API was retained
but marked deprecated. Since 21.11 is the next LTS, it is time
to remove the deprecated ones.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-09-16 17:21:22 +02:00
Dmitry Kozlyuk
5ffa86a2b4 build: propagate Windows system dependencies to pkg-config
Windows EAL depends on some system libraries. They were linked using
add_project_link_arguments('-l<LIB>'), which prevented meson from adding
them to Libs.private of pkg-config file. As a result, applications using
pkg-config to find DPDK hit link errors, for example:

    librte_eal.a(eal_windows_eal_debug.c.obj) : error LNK2019: unresolved
    external symbol __imp_SymInitialize referenced in function
    rte_dump_stack

Reference required libraries in EAL using ext_deps meson variable.
bus/pci and net/pcap depend on lib/eal and will pull them automatically.
Drop advapi32 dependency, as MinGW locates VirtualAlloc2() dynamically.

Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Fixes: c91717eb75 ("eal/windows: support exit and panic")
Cc: stable@dpdk.org

Reported-by: William Tu <u9012063@gmail.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: William Tu <u9012063@gmail.com>
2021-09-14 15:58:34 +02:00
Thomas Monjalon
557610a8ff cryptodev: fix indent in Meson file
Fixes: af668035f7 ("cryptodev: expose driver interface as internal")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-09-14 15:58:32 +02:00
Aman Deep Singh
a7db3afce7 net: add macro to extract MAC address bytes
Added macros to simplify print of MAC address.
The six bytes of a MAC address are extracted in
a macro here, to improve code readablity.

Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-07 19:08:05 +02:00
Aman Deep Singh
c2c4f87b12 net: add macro for MAC address print
Added macro to print six bytes of MAC address.
The MAC addresses will be printed in upper case
hexadecimal format.
In case there is a specific check for lower case
MAC address, the user may need to make a change in
such test case after this patch.

Signed-off-by: Aman Deep Singh <aman.deep.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-07 19:07:46 +02:00
Hemant Agrawal
864c1a40d7 security: support PDCP short MAC-I
This patch add support to handle PDCP short MAC-I domain
along with standard control and data domains as it has to
be treaty as special case with PDCP protocol offload support.

ShortMAC-I is the 16 least significant bits of calculated MAC-I. Usually
when a RRC message is exchanged between UE and eNodeB it is integrity &
ciphered protected.

MAC-I = f(key, varShortMAC-I, count, bearer, direction).
Here varShortMAC-I is prepared by using (current cellId, pci of source cell
and C-RNTI of old cell). Other parameters like count, bearer and
direction set to all 1.

crypto-perf app is updated to take short MAC as input mode.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-08 16:54:37 +02:00
Akhil Goyal
af668035f7 cryptodev: expose driver interface as internal
The rte_cryptodev_pmd.* files are for drivers only and should be
private to DPDK, and not installed for app use.

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2021-09-08 09:35:12 +02:00
Akhil Goyal
e74abd4843 cryptodev: rename function to check device validity
The API rte_cryptodev_pmd_is_valid_dev, can be used
by the application as well as PMD to check whether
the device is valid or not. Hence, _pmd is removed
from the API.
The applications and drivers which use this API are
also updated.

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2021-09-08 09:21:10 +02:00
David Christensen
c13e617739 eal/ppc: ignore GCC 10 stringop-overflow warnings
Suppress gcc warning "warning: writing 16 bytes into a region of
size 0" for users of the POWER rte_memcpy() function.  Existing
rte_memcpy() code takes different code paths based on the actual
size of the move so the warning is already addressed. See also
commit b5b3ea803e ("eal/x86: ignore gcc 10 stringop-overflow warnings")

Cc: stable@dpdk.org

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2021-09-13 09:18:06 +02:00
Xueming Li
b344eb5d94 devargs: parse global device syntax
When parsing a devargs, try to parse using the global device syntax
first. Fallback on legacy syntax on error.

Example of new global device syntax:
 -a bus=pci,addr=82:00.0/class=eth/driver=mlx5,dv_flow_en=1

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-09-02 17:02:27 +02:00
Xueming Li
d2a66ad794 bus: add device arguments name parsing
For device probe and iterator, devargs name was key information,
parsed by rte_devargs_parse. In legacy parser, devargs name was
extracted after bus name:
  bus:name,kv_arguments,,,
Example:
  pci:83:00.0,arguments,...
  vdev:pcap0,...

To be compatible with legacy parser, this patch introduces new
bus driver API devargs_parse to parse devargs and update devargs name.
If devargs_parse not implemented by bus driver, the new syntax parser
rte_devargs_layers_parse default will resolve devargs name from bus's
"name" argument.

Different bus driver might choose different keys from arguments with
unified format. The PCI bus implementation fills the devargs name with
the "addr" argument, example:
 -a bus=pci,addr=83:00.0/class=eth/driver=mlx5,...
    name: 0000:03:00.0
 -a bus=vdev,name=pcap0/class=eth/driver=pcap,...
    name:pcap0

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-09-02 16:58:19 +02:00
Thomas Monjalon
fdab8f2e17 version: 21.11-rc0
Start a new release cycle with empty release notes.

The ABI version becomes 22.0.
The map files are updated to the new ABI major number (22).
The ABI exceptions are dropped and CI ABI checks are disabled because
compatibility is not preserved.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2021-08-17 08:37:52 +02:00
Churchill Khangar
69fa4a61ae pipeline: fix table statistics
This patch fixes the memcpy function call which was incorrect and led
to memory corruption for tables with more that just a few actions.

Fixes: 742b0a57f5 ("pipeline: add table statistics to SWX")
Cc: stable@dpdk.org

Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-08-04 14:45:13 +02:00
Ciara Power
eeaeca82b8 cryptodev: fix freeing after device release
The PMD destroy function was calling the release function, which frees
cryptodev->data, and then tries to free cryptodev->data->dev_private,
which causes the heap use after free issue.

A temporary pointer is set before the free of cryptodev->data,
which can then be used afterwards to free dev_private.
The free cannot be moved to before the release function is called,
as dev_private is used in the PMD close function while being released.

Fixes: 9e6edea418 ("cryptodev: add APIs to assist PMD initialisation")
Cc: stable@dpdk.org

Reported-by: Zhihong Peng <zhihongx.peng@intel.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-07-30 21:08:12 +02:00
Dmitry Kozlyuk
23ce9e0a19 eal/windows: cleanup virt2phys handle
eal_mem_virt2phys_init() opens a handle for use by rte_mem_virt2phy().
Close this handle on EAL cleanup.

Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-07-30 18:56:01 +02:00
Naga Harish K S V
6922655cad eventdev: fix event port setup in Tx adapter
The event port config set by application in
rte_event_eth_tx_adapter_create API is modified in
default configuration callback function. This patch removes
this hardcode to use application provided event port
config value.

Fixes: a3bbf2e097 ("eventdev: add eth Tx adapter implementation")
Cc: stable@dpdk.org

Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
2021-07-30 12:55:19 +02:00
Maxime Coquelin
3c929a0bb3 vhost: fix crash on reconnect
When the vhost-user frontend like Virtio-user tries to
reconnect to the restarted Vhost backend, the Vhost backend
segfaults when multiqueue is enabled.

This is caused by VHOST_USER_GET_VRING_BASE being called for
a virtqueue that has not been created before, causing a NULL
pointer dereferencing.

This patch adds the VHOST_USER_GET_VRING_BASE requests to
the list of requests that trigger queue pair allocations.

Fixes: 160cbc815b ("vhost: remove a hack on queue allocation")
Cc: stable@dpdk.org

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-07-28 08:27:53 +02:00
Huisong Li
c6ccd1e392 sched: rework configuration failure handling
Currently, rte_sched_free_memory() is called multiple times by the
exception handling code in rte_sched_subport_config() and
rte_sched_pipe_config().

This patch optimizes them into a unified outlet to free memory.

Fixes: ac6fcb841b ("sched: update subport rate dynamically")
Fixes: 34a90f8665 ("sched: modify pipe functions for config flexibility")
Fixes: ce7c4fd7c2 ("sched: add pipe config to subport level")
Cc: stable@dpdk.org

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-24 10:58:58 +02:00
Huisong Li
a042481ecd sched: fix profile allocation failure handling
This patch fixes return value judgment when allocate memory to store the
subport profile, and releases memory of 'rte_sched_port' if code fails to
apply for this memory.

Fixes: 0ea4c6afca ("sched: add subport profile table")
Cc: stable@dpdk.org

Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-24 10:58:39 +02:00
Richael Zhuang
d37462e56c power: check frequencies count before filling array
The freqs array size is RTE_MAX_LCORE_FREQS. Before filling the
array with num_freqs elements, restrict the total num to
RTE_MAX_LCORE_FREQS. This fix aims to fix the coverity scan issue
like:
Overrunning array "pi->freqs" of 256 bytes by passing it to a
function which accesses it at byte offset 464.

Coverity issue: 371913
Fixes: ef1cc88f18 ("power: support cppc_cpufreq driver")
Cc: stable@dpdk.org

Signed-off-by: Richael Zhuang <richael.zhuang@arm.com>
Acked-by: David Hunt <david.hunt@intel.com>
2021-07-24 10:09:58 +02:00
Stephen Hemminger
128c22b998 eal: fix argument in 32-bit safe BSF function
The first argument to rte_bsf32_safe was incorrectly declared as
a 64 bit value. The code only works on 32 bit values and the underlying
function rte_bsf32 only accepts 32 bit values. This was a mistake
introduced when the safe version was added and probably cause
by copy/paste from the 64 bit version.

The bug passed silently under the radar until some other code was
built with -Wall and -Wextra in C++ and C++ complains about the
missing cast.

Yes, this is a API signature change, but the original code was wrong.
It is an inline so not an ABI change.

Fixes: 4e261f5519 ("eal: add 64-bit bsf and 32-bit safe bsf functions")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2021-07-24 09:51:30 +02:00
Jiayu Hu
259caa21d7 vhost: handle memory hotplug for async vhost
When the guest memory is hotplugged, the vhost application which
enables DMA acceleration must stop DMA transfers before the vhost
re-maps the guest memory.

This patch is to notify the vhost application of stopping DMA
transfers.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-23 10:58:53 +02:00
Cheng Jiang
b737fd6139 vhost: add unsafe async API to clear packets
Applications need to stop DMA transfers and finish all the inflight
packets when in VM memory hot-plug case and async vhost is used. This
patch is to provide an unsafe API to clear inflight packets which
are submitted to DMA engine in vhost async data path. Update the
program guide and release notes for virtqueue inflight packets clear
API in vhost lib.

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-23 10:58:53 +02:00
Cheng Jiang
3f63c19b2b vhost: fix async callbacks return type
The async vhost callback ops should return negative value when there
are something wrong in the callback, so the return type should be
changed into int32_t. The issue in vhost example is also fixed.

Fixes: cd6760da10 ("vhost: introduce async enqueue for split ring")
Fixes: 819a716858 ("vhost: fix async callback return type")
Fixes: 6b3c81db8b ("vhost: simplify async copy completion")
Fixes: abec60e711 ("examples/vhost: support vhost async data path")
Fixes: 6e9a9d2a02 ("examples/vhost: fix ioat dependency")
Fixes: 873e8dad6f ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-23 10:58:53 +02:00
Jie Zhou
b8617fcc51 eal/windows: check callback parameter of alarm functions
EAL functions rte_eal_alarm_set() and rte_eal_alarm_cancel()
did not for invalid parameters in Windows implementation,
which is caught by the unit test alarm_autotest.

Enforce parameter check to fail fast for invalid parameters.

Fixes: f4cbdbc7fb ("eal/windows: implement alarm API")
Cc: stable@dpdk.org

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-07-22 22:06:27 +02:00
Anatoly Burakov
565d01226e power: fix multi-queue scale mode
Currently in scale mode, multi-queue initialization will attempt to
initialize and de-initialize the per-lcore power library structures
multiple times. Fix it to only do this whenever we either enabling
first queue or disabling last queue.

Fixes: 5dff9a72b0 ("power: support callbacks for multiple Rx queues")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2021-07-22 21:36:30 +02:00
Jiayu Hu
fa51f1aa08 vhost: add thread-unsafe async registration
This patch adds thread unsafe version for async register and
unregister functions.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Jiayu Hu
acbc38887b vhost: rework async configuration structure
This patch reworks the async configuration structure to improve code
readability. In addition, add preserved padding fields on the structure
for future usage.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Jiayu Hu
7f31d4ea05 vhost: fix lock on device readiness notification
The vhost notifies the application of device readiness via
vhost_user_notify_queue_state(), but calling this function
is not protected by the lock. This patch is to make this
function call lock protected.

Fixes: d0fcc38f5f ("vhost: improve device readiness notifications")
Cc: stable@dpdk.org

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Maxime Coquelin
92ed77dce6 vhost: fix packed ring index wrapping
Unlike split ring, packed ring does not mandate the ring size
to be a power of 2. So we have to use a modulo operation when
wrapping ring index.

Fixes: 873e8dad6f ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-07-21 07:56:13 +02:00
Jiayu Hu
0c0935c5f7 vhost: allow to check in-flight packets for async vhost
This patch allows to check the amount of in-flight packets
for the vhost queue using async acceleration.

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Cheng Jiang
2e3f1ab0d8 vhost: fix async packed ring batch datapath
We assume that in the sync path, if there is no buffer wrap in the
avail descriptors fetched in a batch, there is no buffer wrap in the
used descriptors which need to be written back in this batch, but
this assumption is wrong in the async path since there are inflight
descriptors which are processed by the DMA device.

This patch refactors the batch copy code and adds used ring buffer
wrap check as a batch copy condition to fix this issue.

Fixes: 873e8dad6f ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Cheng Jiang
8d2c1260af vhost: fix index overflow for packed ring in async vhost
We introduced some new indexes in packed ring of async vhost. They
will eventually overflow and lead to errors if the ring size is not
a power of 2. This patch is to check and keep these indexes within a
reasonable range.

Fixes: 873e8dad6f ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-07-21 07:56:13 +02:00
Xiao Wang
706ba48665 vhost: check header for legacy dequeue offload
When parsing the virtio net header and packet header for dequeue offload,
we need to perform sanity check on the packet header to ensure:
  - No out-of-boundary memory access.
  - The packet header and virtio_net header are valid and aligned.

Fixes: d0cf91303d ("vhost: add Tx offload capabilities")
Cc: stable@dpdk.org

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2021-07-21 07:56:13 +02:00
Cristian Dumitrescu
40d42de563 pipeline: fix selector freeing
Due to a typo, the selector_free() function incorrectly takes an early
return when the selectors array is non-NULL, as opposed to the other
way around.

Coverity issue: 371912
Fixes: cdaa937d3e ("pipeline: support selector table")

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-21 13:51:17 +02:00
Anatoly Burakov
87fb608356 power: fix crash on error for intel_pstate
Currently, the error paths can lead to attempts at dereferencing NULL
pointers. Add the check to avoid attempts at dereferencing NULL
pointers.

Coverity issue: 371895
Coverity issue: 371889
Fixes: 06cffd468f ("power: refactor ACPI and intel_pstate support")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-07-20 17:24:00 +02:00
David Hunt
de8606bf73 distributor: fix 128-bit write alignment
When the distributor sample app is built as a 32-bit app,
the data buffer passed to find_match_vec can be unaligned,
causing a segmentation fault due to writing a 128-bit value
using _mm_store_si128().  128-bit align the data being
passed in so this does not happen.

Fixes: 775003ad2f ("distributor: add new burst-capable library")
Cc: stable@dpdk.org

Signed-off-by: David Hunt <david.hunt@intel.com>
2021-07-20 14:32:08 +02:00
Viacheslav Galaktionov
10eaf41d70 ethdev: keep count of representor ranges in API
In its current state, the API can overflow the user-passed buffer if a new
representor range appears between function calls.

In order to solve this problem, augment the representor info structure with
the numbers of allocated and initialized ranges. This way the users of this
structure can be sure they will not overrun the buffer.

Fixes: 85e1588ca7 ("ethdev: add API to get representor info")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
2021-07-10 11:29:11 +02:00
Changpeng Liu
7bc7bc3516 eal: suppress error log on multi-process hotplug
This is a normal case that the primary process already
owned one device while the secondary process try to
attach it, so suppress the error log here to exclude
this case.

Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
2021-07-10 10:07:07 +02:00
Cristian Dumitrescu
a3ac0a4836 pipeline: support LPM lookup
Add support for the Longest Prefix Match (LPM) lookup to the SWX
pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
2021-07-10 08:30:59 +02:00
Cristian Dumitrescu
cdaa937d3e pipeline: support selector table
Add pipeline-level support for selector tables.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-10 08:26:12 +02:00
Cristian Dumitrescu
f7598a62d1 table: support selector table
A selector table is made up of groups of weighted members, with a
given member potentially part of several groups. The select operation
returns a member ID by first selecting a group based on an input group
ID and then selecting a member within that group based on hashing one
or several input header/meta-data fields. It is very useful for
implementing an ECMP/WCMP-enabled FIB or a load balancer. It is part
of the action selector described by the P4 Portable Switch
Architecture (PSA) specification.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-09 23:31:54 +02:00
Cristian Dumitrescu
a57d92d73d pipeline: fix table entry read
The rte_swx_pipeline_table_entry_read() function is used to read from
a character string a table entry that is to be added to the table,
deleted from the table or set as the default entry of the table.
Addition needs both the match and the part of the entry, deletion
ignores the action part, while the default set ignores the match part,
hence the need to make both the match and the action part optional.
The logic for skipping the match or the action part was broken, hence
the current fix.

Fixes: b32c0a2c5e ("pipeline: add SWX table update high level API")
Cc: stable@dpdk.org

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Venkata Suresh Kumar P <venkata.suresh.kumar.p@intel.com>
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
2021-07-09 22:52:19 +02:00
Thierry Herbelot
3fc2ddffde table: fix bucket empty check
Due to a typo, only 3 out of 4 keys in the bucket of the exact match
table were considered, which can result in valid keys being
incorrectly dropped from the table.

Fixes: d0a0096661 ("table: add exact match SWX table")
Cc: stable@dpdk.org

Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-07-09 22:42:24 +02:00
Chengwen Feng
5aa9189d74 config/arm: fix SVE build with GCC 8.3
If the target machine has SVE feature (e.g. "-march=armv8.2-a+sve'),
and the compiler is gcc-8.3, it will produce this error:
	In file included from lib/eal/common/eal_common_options.c:38:
	lib/eal/arm/include/rte_vect.h:13:10: fatal	error:
	arm_sve.h: No such file or directory
	#include <arm_sve.h>
	       ^~~~~~~~~~~

The root cause is that gcc-8.3 supports SVE (the macro
__ARM_FEATURE_SVE was 1), but it doesn't support SVE ACLE [1].

The solution:
a) Detect compiler whether support SVE ACLE, if support then define
RTE_HAS_SVE_ACLE macro.
b) Use the RTE_HAS_SVE_ACLE macro to include SVE header file.

[1] ACLE:  Arm C Language Extensions, the SVE ACLE header file is
<arm_sve.h>, user should include it when writing ACLE SVE code.

Fixes: 67b68824a8 ("lpm/arm: support SVE")
Cc: stable@dpdk.org

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2021-07-09 22:25:24 +02:00
Ruifeng Wang
cac2a49b4a ring: use WFE to wait for tail update on aarch64
Instead of polling for tail to be updated, use WFE instruction.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-09 21:33:01 +02:00
Gavin Hu
fa6b488998 spinlock: use WFE to reduce contention on aarch64
In acquiring a spinlock, cores repeatedly poll the lock variable.
This is replaced by rte_wait_until_equal API.

Running micro benchmarking and testpmd and l3fwd traffic tests
on ThunderX2, Ampere eMAG80 and Arm N1SDP, everything went well
and no notable performance gain nor degradation was measured.

Signed-off-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-07-09 21:33:01 +02:00
Anatoly Burakov
f53fe635c1 power: support monitoring multiple Rx queues
Use the new multi-monitor intrinsic to allow monitoring multiple ethdev
Rx queues while entering the energy efficient power state. The multi
version will be used unconditionally if supported, and the UMWAIT one
will only be used when multi-monitor is not supported by the hardware.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2021-07-09 21:13:13 +02:00
Anatoly Burakov
5dff9a72b0 power: support callbacks for multiple Rx queues
Currently, there is a hard limitation on the PMD power management
support that only allows it to support a single queue per lcore. This is
not ideal as most DPDK use cases will poll multiple queues per core.

The PMD power management mechanism relies on ethdev Rx callbacks, so it
is very difficult to implement such support because callbacks are
effectively stateless and have no visibility into what the other ethdev
devices are doing. This places limitations on what we can do within the
framework of Rx callbacks, but the basics of this implementation are as
follows:

- Replace per-queue structures with per-lcore ones, so that any device
  polled from the same lcore can share data
- Any queue that is going to be polled from a specific lcore has to be
  added to the list of queues to poll, so that the callback is aware of
  other queues being polled by the same lcore
- Both the empty poll counter and the actual power saving mechanism is
  shared between all queues polled on a particular lcore, and is only
  activated when all queues in the list were polled and were determined
  to have no traffic.
- The limitation on UMWAIT-based polling is not removed because UMWAIT
  is incapable of monitoring more than one address.

Also, while we're at it, update and improve the docs.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2021-07-09 21:13:13 +02:00
Anatoly Burakov
209fd58545 power: make ethdev power management thread unsafe
Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like an RCU for this use case, but absent
of a pressing need for thread safety we'll go the easy way and just
mandate that the API's are to be called when all affected ports are
stopped, and document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2021-07-09 21:13:13 +02:00
Anatoly Burakov
66834f2974 eal: add power monitor for multiple events
Use RTM and WAITPKG instructions to perform a wait-for-writes similar to
what UMWAIT does, but without the limitation of having to listen for
just one event. This works because the optimized power state used by the
TPAUSE instruction will cause a wake up on RTM transaction abort, so if
we add the addresses we're interested in to the read-set, any write to
those addresses will wake us up.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
2021-07-09 21:13:13 +02:00
Anatoly Burakov
6afc4baf4f eal: use callbacks for power monitoring comparison
Previously, the semantics of power monitor were such that we were
checking current value against the expected value, and if they matched,
then the sleep was aborted. This is somewhat inflexible, because it only
allowed us to check for a specific value in a specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define
their own comparison semantics and decision making on how to detect the
need to abort the entering of power optimized state.

Existing implementations are adjusted to follow the new semantics.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
2021-07-09 21:13:13 +02:00