Commit Graph

7508 Commits

Author SHA1 Message Date
Stephen Hemminger
2eccf6afbe bpf: add function to convert classic BPF to DPDK BPF
The pcap library emits classic BPF (32 bit) and is useful for
creating filter programs.  The DPDK BPF library only implements
extended BPF (eBPF).  Add a function to convert from old to
new.

The rte_bpf_convert function uses rte_malloc to put the resulting
program in hugepage shared memory so it can be passed from a
secondary process to a primary process.

The code to convert was originally done as part of the Linux
kernel implementation then converted to a userspace program.
See https://github.com/tklauser/filter2xdp

Both authors have agreed that it is allowable to create a modified
version of this code and license it with BSD license used by DPDK.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-22 17:19:13 +02:00
Stephen Hemminger
80da61198b bpf: allow self-xor operation
Some BPF programs may use XOR of a register with itself
as a way to zero a register in one instruction.
The BPF filter converter generates this in the prolog
to the generated code.

The BPF validator would not allow this because the value of
the register was undefined. But after this operation it is always zero.
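
The rule described above can be sketched with a toy register tracker.
This is only an illustration in plain C: the reg_state structure and
check_xor() are hypothetical names and do not reflect the actual DPDK
validator code. The point it shows is that "r ^= r" defines r as zero
even when r was never written before.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 11 /* eBPF has registers r0..r10 */

/* Hypothetical per-register state kept by a toy validator. */
struct reg_state {
	bool defined;   /* has the register been written? */
	bool known;     /* is the value a known constant? */
	uint64_t value; /* meaningful only when known is true */
};

/*
 * Check "dst ^= src". Returns 0 if accepted, -1 if rejected.
 * Special case: when dst == src the result is always zero, so the
 * register becomes defined even if it was undefined before.
 */
static int
check_xor(struct reg_state *regs, unsigned int dst, unsigned int src)
{
	assert(regs != NULL);
	if (dst >= NUM_REGS || src >= NUM_REGS)
		return -1;
	if (dst == src) {
		regs[dst] = (struct reg_state){
			.defined = true, .known = true, .value = 0 };
		return 0;
	}
	if (!regs[dst].defined || !regs[src].defined)
		return -1; /* would read an undefined register */
	if (regs[dst].known && regs[src].known)
		regs[dst].value ^= regs[src].value;
	else
		regs[dst].known = false;
	return 0;
}
```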

Fixes: 8021917293 ("bpf: add extra validation for input BPF program")
Cc: stable@dpdk.org

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2021-10-22 17:19:13 +02:00
Stephen Hemminger
8d23ce8f5e pcapng: add new library for writing pcapng files
This is a utility library for writing pcapng format files
used by the Wireshark family of utilities. Older tcpdump
also knows how to read (but not write) this format.

See
  https://github.com/pcapng/pcapng/

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-22 17:19:07 +02:00
Stephen Hemminger
09644b58a1 pdump: disable on Windows
The current version of the pdump library was building on
Windows, but it was useless since the pdump utility was not being
built and Windows does not have multi-process support.

The new version of pdump with filtering now has dependency
on bpf. But bpf library is not available on Windows.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-22 15:46:19 +02:00
Chengwen Feng
a188277d53 dmadev: fix debug build
This patch fixes a compile error when RTE_DMADEV_DEBUG is enabled.

Fixes: ea8cf0f853 ("dmadev: add burst capacity API")

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2021-10-21 22:10:22 +02:00
David Marchand
c61c8282ef dmadev: hide devices array
No need to expose rte_dma_devices out of the dmadev library.
Existing helpers should be enough, and inlines make use of
rte_dma_fp_objs.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-21 22:01:37 +02:00
Harry van Haaren
976329581d eventdev: add usage hints to port configure API
This commit introduces 3 flags to the port configuration flags.
These flags allow the application to indicate what type of work
is expected to be performed by an eventdev port.

The three new flags are
- RTE_EVENT_PORT_CFG_HINT_PRODUCER (mostly RTE_EVENT_OP_NEW events)
- RTE_EVENT_PORT_CFG_HINT_CONSUMER (mostly RTE_EVENT_OP_RELEASE events)
- RTE_EVENT_PORT_CFG_HINT_WORKER   (mostly RTE_EVENT_OP_FORWARD events)

These flags are only hints, and the PMDs must operate under the
assumption that any port can enqueue an event with any type of op.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:16:00 +02:00
Naga Harish K S V
81da8a5ff4 eventdev/eth_rx: fix WRR buffer overrun
When a poll queue is removed from a rx_adapter instance, the WRR poll
array is recomputed. The wrr array length is reduced in this case. The
next wrr position to poll is stored in wrr_pos variable of rx_adapter
instance. This wrr_pos can become invalid in some cases after wrr is
recomputed. Using this variable to get the next queue and device pair
may lead to wrr buffer overruns.

Resetting the wrr_pos to zero after recomputation of wrr array fixes
the buffer overrun issue.
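
A minimal sketch of the failure mode and the fix, assuming a heavily
simplified adapter state. The wrr_state structure and both functions
are hypothetical and are not the real rx_adapter layout; only the reset
of the position on recompute mirrors the change described above.

```c
#include <assert.h>
#include <stddef.h>

#define WRR_CAP 8 /* illustrative capacity */

/* Hypothetical, simplified stand-in for the adapter state. */
struct wrr_state {
	unsigned int sched[WRR_CAP]; /* WRR schedule of queue indices */
	size_t len;                  /* number of valid entries */
	size_t pos;                  /* next entry to poll */
};

/* Rebuild the schedule, possibly with a shorter length. */
static void
wrr_recompute(struct wrr_state *w, const unsigned int *entries, size_t n)
{
	size_t i;

	if (n > WRR_CAP)
		n = WRR_CAP;
	for (i = 0; i < n; i++)
		w->sched[i] = entries[i];
	w->len = n;
	w->pos = 0; /* the fix: a stale pos could point past the new end */
}

/* Return the next queue to poll and advance the position. */
static unsigned int
wrr_next(struct wrr_state *w)
{
	unsigned int q;

	assert(w->len > 0 && w->pos < w->len);
	q = w->sched[w->pos];
	w->pos = (w->pos + 1) % w->len;
	return q;
}
```

Without the reset, a position of 3 left over from a 4-entry schedule
would index beyond a recomputed 2-entry schedule.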

Fixes: 9c38b704d2 ("eventdev: add eth Rx adapter implementation")
Cc: stable@dpdk.org

Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-10-21 10:16:00 +02:00
Pavan Nikhilesh
fcf782051c eventdev: mark trace variables as internal
Mark rte_trace global variables as internal i.e. remove them
from experimental section of version map.
Some of them are used in inline APIs, mark those as global.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-21 10:16:00 +02:00
Pavan Nikhilesh
f26f2ca657 eventdev: make trace API internal
Slowpath trace APIs are only used in rte_eventdev.c so make them
internal.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2021-10-21 10:16:00 +02:00
Pavan Nikhilesh
68e9668a09 eventdev: promote event vector API to stable
Promote event vector configuration APIs to stable.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-21 10:16:00 +02:00
Pavan Nikhilesh
f3f3a91788 eventdev/timer: move adapters memory to hugepage
Move memory used by timer adapters to hugepage.
Allocate memory on the first adapter create or lookup to address
both primary and secondary process usecases.
This will prevent TLB misses if any and aligns to memory structure
of other subsystems.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2021-10-21 10:16:00 +02:00
Pavan Nikhilesh
1dcd67ba1e eventdev/timer: rearrange struct fields
Rearrange fields in rte_event_timer data structure to remove holes.
Also, remove use of volatile from rte_event_timer.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
a256a743cf eventdev: remove rte prefix for internal structs
Remove rte_ prefix from rte_eth_event_enqueue_buffer,
rte_event_eth_rx_adapter and rte_event_crypto_adapter
as they are only used in rte_event_eth_rx_adapter.c and
rte_event_crypto_adapter.c

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
53548ad300 eventdev: hide timer adapter PMD file
Hide rte_event_timer_adapter_pmd.h file as it is an internal file.
Remove rte_ prefix from rte_event_timer_adapter_ops structure.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
295c053f90 eventdev: hide event device related structures
Move rte_eventdev, rte_eventdev_data structures to eventdev_pmd.h.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Harman Kalra <hkalra@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
052e25d912 eventdev: use new API for inline functions
Use new driver interface for the fastpath enqueue/dequeue inline
functions.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
d35e61322d eventdev: move inline APIs into separate structure
Move fastpath inline function pointers from rte_eventdev into a
separate structure accessed via a flat array.
The intention is to make rte_eventdev and related structures private
to avoid future API/ABI breakages.
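
The general pattern can be sketched as follows. All names here (fp_ops,
fp_ops_tbl, dev_enqueue, toy_enqueue) are hypothetical and the layout
is an assumption for illustration, not the structure added by this
commit. The inline wrapper only touches a small per-device table of
function pointers, so the full device structure can stay private.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_DEVS 4 /* illustrative limit */

typedef uint16_t (*enqueue_fn)(void *port, const uint32_t *ev, uint16_t n);

/* Small fastpath-only view of a device. */
struct fp_ops {
	enqueue_fn enqueue; /* driver callback */
	void **ports;       /* per-port private data */
};

/* Flat array indexed by device id. */
static struct fp_ops fp_ops_tbl[MAX_DEVS];

/* Inline wrapper an application would call. */
static inline uint16_t
dev_enqueue(uint8_t dev_id, uint8_t port_id, const uint32_t *ev, uint16_t n)
{
	const struct fp_ops *ops;

	assert(dev_id < MAX_DEVS);
	ops = &fp_ops_tbl[dev_id];
	return ops->enqueue(ops->ports[port_id], ev, n);
}

/* Toy driver callback: sums the events into the port's counter. */
static uint16_t
toy_enqueue(void *port, const uint32_t *ev, uint16_t n)
{
	uint32_t *total = port;
	uint16_t i;

	for (i = 0; i < n; i++)
		*total += ev[i];
	return n;
}
```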

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
9c67fcbfd6 eventdev: allocate max space for internal arrays
Allocate max space for internal port, port config, queue config and
link map arrays.
Introduce new macro RTE_EVENT_MAX_PORTS_PER_DEV and set it to max
possible value.
This simplifies the port and queue reconfigure scenarios and will
also allow inline functions to refer pointer to internal port data
without extra checking of current number of configured queues.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
26f14535ed eventdev: separate internal structures
Create rte_eventdev_core.h and move all the internal data structures
to this file. These structures are mostly used by drivers, but they
need to be in the public header file as they are accessed by datapath
inline functions for performance reasons.
The accessibility of these data structures is not changed.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
23d06e3766 eventdev: make driver interface as internal
Mark all the driver specific functions as internal, remove
`rte` prefix from `struct rte_eventdev_ops`.
Remove experimental tag from internal functions.
Remove `eventdev_pmd.h` from non-internal header files.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2021-10-21 10:14:50 +02:00
Ganapati Kundapura
814d017093 eventdev/eth_rx: support telemetry
Added telemetry callbacks to get Rx adapter stats, reset stats and
to get Rx queue config information.

Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:50 +02:00
Naga Harish K S V
b06bca69b7 eventdev/eth_rx: add per-queue event buffer
Added per queue buffer. To configure per queue event buffer size,
application sets rte_event_eth_rx_adapter_params::use_queue_event_buf
flag as true while using rte_event_eth_rx_adapter_create_with_params().

The per queue event buffer size is populated in
rte_event_eth_rx_adapter_queue_conf::event_buf_size and passed
to rte_event_eth_rx_adapter_queue_add().

Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
2021-10-21 10:14:50 +02:00
Naga Harish K S V
bc0df25c83 eventdev/eth_rx: add event buffer size configurability
Currently the event buffer is a static array with a default size defined
internally.

To configure event buffer size from application,
rte_event_eth_rx_adapter_create_with_params() API is added which
takes struct rte_event_eth_rx_adapter_params to configure event
buffer size in addition to other params. The event buffer size is
rounded up for better buffer utilization and performance. In case
of NULL params argument, default event buffer size is used.

Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com>
Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:50 +02:00
Ganapati Kundapura
da781e6488 eventdev/eth_rx: support Rx queue config get
Added rte_event_eth_rx_adapter_queue_conf_get() API to get rx queue
information - event queue identifier, flags for handling received packets,
scheduler type, event priority, polling frequency of the receive queue
and flow identifier in rte_event_eth_rx_adapter_queue_conf structure

Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:50 +02:00
Ganapati Kundapura
83ab470d12 eventdev/eth_rx: use timestamp as dynamic mbuf field
Add support to register timestamp dynamic field in mbuf.

Update the timestamp in mbuf for each packet before enqueuing
to event device if the timestamp is not already set.

Adding the timestamp in Rx adapter avoids additional latency
due to the event device.

Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:50 +02:00
Pavan Nikhilesh
929ebdd543 eventdev/eth_rx: simplify event vector config
Include vector configuration into the structure
``rte_event_eth_rx_adapter_queue_conf`` that is used to configure
Rx adapter ethernet device Rx queue parameters.
This simplifies event vector configuration as it avoids splitting
configuration per Rx queue.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:50 +02:00
Shijith Thotton
e3f128dbee eventdev/crypto: add cryptodev start in adapter spec
Event crypto adapter spec does not mention cryptodev start and
stop. Cryptodev attached to the adapter should be started before calling
crypto adapter start. Added the same in spec and test application.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-21 10:14:50 +02:00
Ganapati Kundapura
8113fd15e2 eventdev/eth_rx: make enqueue buffer circular
Rx adapter uses memmove() to move unprocessed events to the beginning of
the packet enqueue buffer. The use of memmove() was found to consume a good
amount of CPU cycles (about 20%).

This patch removes the use of memmove() by implementing a circular
buffer to avoid copying of data. With this change RX adapter is able
to fill the buffer of 16384 events.
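
The idea of replacing the memmove() with a circular buffer can be
sketched as below. This is a generic ring of 32-bit values under
assumed names (circ_buf, circ_put, circ_get); the real adapter buffers
events and its fields differ. Consuming entries only advances the head
index, so nothing is copied to the front of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_CAP 8 /* illustrative capacity */

struct circ_buf {
	uint32_t ev[BUF_CAP];
	size_t head;  /* index of the oldest unprocessed entry */
	size_t count; /* number of stored entries */
};

/* Append one entry; returns -1 when the buffer is full. */
static int
circ_put(struct circ_buf *b, uint32_t v)
{
	assert(b != NULL);
	if (b->count == BUF_CAP)
		return -1;
	b->ev[(b->head + b->count) % BUF_CAP] = v;
	b->count++;
	return 0;
}

/* Consume up to n entries into out; returns the number consumed. */
static size_t
circ_get(struct circ_buf *b, uint32_t *out, size_t n)
{
	size_t i;

	assert(b != NULL && out != NULL);
	if (n > b->count)
		n = b->count;
	for (i = 0; i < n; i++)
		out[i] = b->ev[(b->head + i) % BUF_CAP];
	b->head = (b->head + n) % BUF_CAP; /* no data is moved */
	b->count -= n;
	return n;
}
```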

Signed-off-by: Ganapati Kundapura <ganapati.kundapura@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-10-21 10:14:49 +02:00
Xueming Li
5adef306da devargs: make bus optional
Global devargs syntax is used as a device iteration filter, like
"class=vdpa", so a devargs without bus args is valid from a parsing
perspective.

This patch makes bus args optional.

Fixes: d2a66ad794 ("bus: add device arguments name parsing")

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-10-21 11:32:44 +02:00
Xueming Li
9a1a9e4a2d devargs: support path value with global device syntax
Slash is used to split global device arguments.

To support a path value which contains slashes, this patch parses devargs by
locating both the slash and the layer name key:
  bus=a,name=/some/path/class=b,k1=v1/driver=c,k2=v2
"/class=" and "/driver=" are valid starts of a layer.

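A minimal sketch of this kind of layer splitting, assuming that a layer
starts only at a slash immediately followed by "class=" or "driver=".
The next_layer() helper is hypothetical and is not the parser used in
EAL; it only shows why slashes inside a path value are skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the '/' that starts the next layer in s,
 * or NULL if there is none. Slashes that are not followed by a
 * layer key are treated as part of a value.
 */
static const char *
next_layer(const char *s)
{
	static const char *const keys[] = { "/class=", "/driver=" };
	const char *best = NULL;
	size_t i;

	assert(s != NULL);
	for (i = 0; i < sizeof(keys) / sizeof(keys[0]); i++) {
		const char *p = strstr(s, keys[i]);

		if (p != NULL && (best == NULL || p < best))
			best = p;
	}
	return best;
}
```

Applied to the example string above, the first layer ends after
"name=/some/path" and the path value keeps its slashes.
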
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Gaetan Rivet <grive@u256.net>
2021-10-21 11:32:06 +02:00
Olivier Matz
efc6f9104c mbuf: fix reset on mbuf free
m->nb_seg must be reset on mbuf free whatever the value of m->next,
because it can happen that m->nb_seg is != 1. For instance in this
case:

  m1 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m1, 500);
  m2 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m2, 500);
  rte_pktmbuf_chain(m1, m2);
  m0 = rte_pktmbuf_alloc(mp);
  rte_pktmbuf_append(m0, 500);
  rte_pktmbuf_chain(m0, m1);

As rte_pktmbuf_chain() does not reset nb_seg in the initial m1
segment (this is not required), after this code the mbuf chain
has 3 segments:
  - m0: next=m1, nb_seg=3
  - m1: next=m2, nb_seg=2
  - m2: next=NULL, nb_seg=1

Then, splitting this chain between m1 and m2 would result in 2 packets:
  - first packet
    - m0: next=m1, nb_seg=2
    - m1: next=NULL, nb_seg=2
  - second packet
    - m2: next=NULL, nb_seg=1

Freeing the first packet will not restore nb_seg=1 in the second
segment. This is an issue because it is expected that mbufs stored
in pool have their nb_seg field set to 1.
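
The invariant can be illustrated with a toy structure that carries only
the two fields discussed. The toy_mbuf type and both helpers are
hypothetical stand-ins, not the rte_mbuf implementation; the sketch
shows a free path that resets the segment count for every segment,
whatever the value of next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an mbuf with only the fields discussed. */
struct toy_mbuf {
	struct toy_mbuf *next;
	uint16_t nb_segs;
};

/* Restore the in-pool invariants unconditionally. */
static void
toy_reset_on_free(struct toy_mbuf *m)
{
	assert(m != NULL);
	m->next = NULL;
	m->nb_segs = 1;
}

/* Free a whole chain, resetting every segment on the way. */
static void
toy_free_chain(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *n = m->next;

		toy_reset_on_free(m);
		m = n;
	}
}
```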

Fixes: 8f094a9ac5 ("mbuf: set mbuf fields while in pool")
Cc: stable@dpdk.org

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Ali Alnubani <alialnu@nvidia.com>
2021-10-21 11:18:54 +02:00
Honnappa Nagarahalli
f4acb429d0 hash: promote some functions to stable
Promote APIs to stable.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
2021-10-21 09:46:47 +02:00
Honnappa Nagarahalli
0ff26704b4 ring: fix name size in ring structure
Use correct define for the name array size. The change breaks ABI and
hence cannot be backported to stable branches.

Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-21 09:32:04 +02:00
Thomas Monjalon
e1823e0842 ethdev: replace bit shifts with macros
The macros RTE_BIT32 and RTE_BIT64 are used to replace bit shifts.
The macro UINT64_C is also used to replace remaining occurrences of ULL.

The bit shifts of ETH_RSS_LEVEL_* are kept for aesthetic reason.

The API of rte_mtr and rte_tm is using enums for 64-bit variables.
As they are enums, unsigned bit cannot be used.
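
As a rough illustration of why such macros help, the sketch below uses
local stand-ins (BIT32, BIT64) built on the standard UINT32_C and
UINT64_C macros. The flag names are hypothetical. A plain "1 << 40"
does not fit in an int, while the macro form has a well-defined
unsigned 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins following the intent of RTE_BIT32 and RTE_BIT64. */
#define BIT32(nr) (UINT32_C(1) << (nr))
#define BIT64(nr) (UINT64_C(1) << (nr))

/* Hypothetical 64-bit flags. */
#define OFFLOAD_A BIT64(0)
#define OFFLOAD_B BIT64(40)

/* Runtime variant that rejects out-of-range bit numbers. */
static uint64_t
checked_bit64(unsigned int nr)
{
	assert(nr < 64);
	return BIT64(nr);
}
```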

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-20 19:34:24 +02:00
Andrew Rybchenko
febc855b35 ethdev: forbid closing started device
Ethernet device must be stopped first before close in accordance
with the documentation.

Fixes: 980995f8cc ("ethdev: improve API comments of close and detach functions")
Cc: stable@dpdk.org

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-10-20 19:24:22 +02:00
Viacheslav Ovsiienko
dc4d860e8a ethdev: introduce configurable flexible item
1. Introduction and Retrospective

Nowadays the networks are evolving fast and wide, the network
structures are getting more and more complicated, the new
application areas are emerging. To address these challenges
the new network protocols are continuously being developed,
considered by technical communities, adopted by industry and,
eventually implemented in hardware and software. The DPDK
framework follows the common trends and if we bother
to glance at the RTE Flow API header we see the multiple
new items were introduced during the last years since
the initial release.

The new protocol adoption and implementation process is
not straightforward and takes time, the new protocol passes
development, consideration, adoption, and implementation
phases. The industry tries to mitigate and address the
forthcoming network protocols, for example, many hardware
vendors are implementing flexible and configurable network
protocol parsers. As DPDK developers, could we anticipate
the near future in the same fashion and introduce the similar
flexibility in RTE Flow API?

Let's check what we already have merged in our project, and
we see the nice raw item (rte_flow_item_raw). At the first
glance, it looks superior and we can try to implement a flow
matching on the header of some relatively new tunnel protocol,
say on the GENEVE header with variable length options. And,
under further consideration, we run into the raw item
limitations:

- only fixed size network header can be represented
- the entire network header pattern of fixed format
  (header field offsets are fixed) must be provided
- the search for patterns is not robust (the wrong matches
  might be triggered), and actually is not supported
  by existing PMDs
- no explicitly specified relations with preceding
  and following items
- no tunnel hint support

As a result, implementing the support for tunnel protocols
like the aforementioned GENEVE with variable extra protocol options
with the flow raw item becomes very complicated and would require
multiple flows and multiple raw items chained in the same
flow (by the way, there is no support found for chained raw
items in implemented drivers).

This RFC introduces the dedicated flex item (rte_flow_item_flex)
to handle matches with existing and new network protocol headers
in a unified fashion.

2. Flex Item Life Cycle

Let's assume there are the requirements to support the new
network protocol with RTE Flows. What is given within protocol
specification:

  - header format
  - header length, (can be variable, depending on options)
  - potential presence of extra options following or included
    in the header
  - the relations with preceding protocols. For example,
    the GENEVE follows UDP, eCPRI can follow either UDP
    or L2 header
  - the relations with following protocols. For example,
    the next layer after tunnel header can be L2 or L3
  - whether the new protocol is a tunnel and the header
    is a splitting point between outer and inner layers

The supposed way to operate with flex item:

  - application defines the header structures according to
    protocol specification

  - application calls rte_flow_flex_item_create() with desired
    configuration according to the protocol specification, it
    creates the flex item object over specified ethernet device
    and prepares PMD and underlying hardware to handle flex
    item. On the item creation call, the PMD backing the specified
    ethernet device returns an opaque handle identifying
    the object that has been created

  - application uses the rte_flow_item_flex with obtained handle
    in the flows, the values/masks to match with fields in the
    header are specified in the flex item per flow as for regular
    items (except that pattern buffer combines all fields)

  - flows with flex items match with packets in a regular fashion,
    the values and masks for the new protocol header match are
    taken from the flex items in the flows

  - application destroys flows with flex items

  - application calls rte_flow_flex_item_release() as part of
    ethernet device API and destroys the flex item object in
    PMD and releases the engaged hardware resources

3. Flex Item Structure

The flex item structure is intended to be used as part of the flow
pattern like regular RTE flow items and provides the mask and
value to match with fields of the protocol the item was configured
for.

  struct rte_flow_item_flex {
    void *handle;
    uint32_t length;
    const uint8_t* pattern;
  };

The handle is some opaque object maintained on per device basis
by underlying driver.

The protocol header fields are considered as bit fields, all
offsets and widths are expressed in bits. The pattern is the
buffer containing the bit concatenation of all the fields
presented at item configuration time, in the same order and
same amount. If byte boundary alignment is needed an application
can use a dummy type field, this is just some kind of gap filler.

The length field specifies the pattern buffer length in bytes
and is needed to allow rte_flow_copy() operations. The approach
of multiple pattern pointers and lengths (per field) was
considered and found clumsy - it seems to be much more suitable for
the application to maintain the single structure within the
single pattern buffer.

4. Flex Item Configuration

The flex item configuration consists of the following parts:

  - header field descriptors:
    - next header
    - next protocol
    - sample to match
  - input link descriptors
  - output link descriptors

The field descriptors tell the driver and hardware what data should
be extracted from the packet and then control the packet handling
in the flow engine. Besides this, sample fields can be presented
to match with patterns in the flows. Each field is a bit pattern.
It has width, offset from the header beginning, mode of offset
calculation, and offset related parameters.

The next header field is special, no data are actually taken
from the packet, but its offset is used as a pointer to the next
header in the packet, in other words the next header offset
specifies the size of the header being parsed by flex item.

There is one more special field - next protocol, it specifies
where the next protocol identifier is contained and packet data
sampled from this field will be used to determine the next
protocol header type to continue packet parsing. The next
protocol field is like eth_type field in MAC2, or proto field
in IPv4/v6 headers.

The sample fields are used to represent the data to be sampled
from the packet and then matched with established flows.

Several methods are provided to calculate the field offset
at runtime, depending on configuration and packet content:

  - FIELD_MODE_FIXED - fixed offset. The bit offset from
    header beginning is permanent and defined by field_base
    configuration parameter.

  - FIELD_MODE_OFFSET - the field bit offset is extracted
    from other header field (indirect offset field). The
    resulting field offset to match is calculated as:

  field_base + ((*offset_base & offset_mask) << offset_shift)

    This mode is useful to sample some extra options following
    the main header with field containing main header length.
    Also, this mode can be used to calculate offset to the
    next protocol header, for example - IPv4 header contains
    the 4-bit field with IPv4 header length expressed in dwords.
    One more example - this mode would allow us to skip GENEVE
    header variable length options.

  - FIELD_MODE_BITMASK - the field bit offset is extracted
    from other header field (indirect offset field), the latter
    is considered as bitmask containing some number of one bits,
    the resulting field offset to match is calculated as:

  field_base + (bitcount(*offset_base & offset_mask) << offset_shift)

    This mode would be useful to skip the GTP header and its
    extra options with specified flags.

  - FIELD_MODE_DUMMY - dummy field, optionally used for byte
    boundary alignment in pattern. Pattern mask and data are
    ignored in the match. All configuration parameters besides
    field size and offset are ignored.

  Note:  "*" - means the indirect field offset is calculated
  and actual data are extracted from the packet by this
  offset (like data are fetched by pointer *p from memory).
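
The offset modes above can be sketched in plain C as follows. The enum,
the field_cfg structure and the helper names are hypothetical and only
follow the parameter names used in the text. The value passed as raw
stands for the data already sampled at offset_base, and the shift is
applied before field_base is added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode names and parameters, following the text. */
enum field_mode { MODE_FIXED, MODE_OFFSET, MODE_BITMASK };

struct field_cfg {
	enum field_mode mode;
	uint32_t field_base;
	uint32_t offset_mask;
	uint32_t offset_shift;
};

/* Count set bits without relying on compiler builtins. */
static uint32_t
bitcount32(uint32_t v)
{
	uint32_t n = 0;

	for (; v != 0; v &= v - 1)
		n++;
	return n;
}

/* Compute the field offset; raw is the value read at offset_base. */
static uint32_t
field_offset(const struct field_cfg *c, uint32_t raw)
{
	assert(c != NULL && c->offset_shift < 32);
	switch (c->mode) {
	case MODE_OFFSET:
		return c->field_base +
			((raw & c->offset_mask) << c->offset_shift);
	case MODE_BITMASK:
		return c->field_base +
			(bitcount32(raw & c->offset_mask) << c->offset_shift);
	case MODE_FIXED:
	default:
		return c->field_base;
	}
}
```

For instance, with an IPv4-like first byte 0x45, a mask of 0x0F and a
shift of 5, the 5-dword header length becomes 160 bits, to which
field_base is then added.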

The offset mode list can be extended by vendors according to
hardware supported options.

The input link configuration section tells the driver after
what protocols and at what conditions the flex item can follow.
The input link specifies the preceding header pattern, for example
for GENEVE it can be UDP item specifying match on destination
port with value 6081. The flex item can follow multiple header
types and multiple input links should be specified. At flow
creation time the item with one of the input link types should
precede the flex item and driver will select the correct flex
item settings, depending on the actual flow pattern.

The output link configuration section tells the driver how
to continue packet parsing after the flex item protocol.
If multiple protocols can follow the flex item header the
flex item should contain the field with the next protocol
identifier and the parsing will be continued depending
on the data contained in this field in the actual packet.

The flex item fields can participate in RSS hash calculation,
the dedicated flag is present in the field description to specify
what fields should be provided for hashing.

5. Flex Item Chaining

If there are multiple protocols supposed to be supported with
flex items in chained fashion - two or more flex items within
the same flow and these ones might be neighbors in the pattern,
it means the flex items are mutually referencing.  In this case,
the item that occurred first should be created with empty
output link list or with the list including existing items,
and then the second flex item should be created referencing
the first flex item as input arc, drivers should adjust
the item configuration.

Also, the hardware resources used by flex items to handle
the packet can be limited. If there are multiple flex items
that are supposed to be used within the same flow it would
be nice to provide some hint for the driver that these two
or more flex items are intended for simultaneous usage.
The fields of items should be assigned with hint indices
and these indices from two or more flex items supposed
to be provided within the same flow should be the same
as well. In other words, the field hint index specifies
the group of fields that can be matched simultaneously
within a single flow. If hint indices are specified,
the driver will try to engage not overlapping hardware
resources and provide independent handling of the field
groups with unique indices. If the hint index is zero
the driver assigns resources on its own.

6. Example of New Protocol Handling

Let's suppose we have the requirements to handle the new tunnel
protocol that follows UDP header with destination port 0xFADE
and is followed by MAC header. Let the new protocol header format
be like this:

  struct new_protocol_header {
    rte_be32_t header_length; /* length in dwords, including options */
    rte_be32_t specific0;     /* some protocol data, no intention */
    rte_be32_t specific1;     /* to match in flows on these fields */
    rte_be32_t crucial;       /* data of interest, match is needed */
    rte_be32_t options[0];    /* optional protocol data, variable length */
  };

The supposed flex item configuration:

  struct rte_flow_item_flex_field field0 = {
    .field_mode = FIELD_MODE_DUMMY,  /* Affects match pattern only */
    .field_size = 96,                /* three dwords from the beginning */
  };
  struct rte_flow_item_flex_field field1 = {
    .field_mode = FIELD_MODE_FIXED,
    .field_size = 32,       /* Field size is one dword */
    .field_base = 96,       /* Skip three dwords from the beginning */
  };
  struct rte_flow_item_udp spec0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFADE),
    }
  };
  struct rte_flow_item_udp mask0 = {
    .hdr = {
      .dst_port = RTE_BE16(0xFFFF),
    }
  };
  struct rte_flow_item_flex_link link0 = {
    .item = {
       .type = RTE_FLOW_ITEM_TYPE_UDP,
       .spec = &spec0,
       .mask = &mask0,
    },
  };

  struct rte_flow_item_flex_conf conf = {
    .next_header = {
      .tunnel = FLEX_TUNNEL_MODE_SINGLE,
      .field_mode = FIELD_MODE_OFFSET,
      .field_base = 0,
      .offset_base = 0,
      .offset_mask = 0xFFFFFFFF,
      .offset_shift = 2    /* Expressed in dwords, shift left by 2 */
    },
    .sample = {
       &field0,
       &field1,
    },
    .nb_samples = 2,
    .input_link[0] = &link0,
    .nb_inputs = 1
  };

Let's suppose we have created the flex item successfully, and PMD
returned the handle 0x123456789A. We can use the following item
pattern to match the crucial field in the packet with value 0x00112233:

  struct new_protocol_header spec_pattern =
  {
    .crucial = RTE_BE32(0x00112233),
  };
  struct new_protocol_header mask_pattern =
  {
    .crucial = RTE_BE32(0xFFFFFFFF),
  };
  struct rte_flow_item_flex spec_flex = {
    .handle = (void *)0x123456789A,
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&spec_pattern,
  };
  struct rte_flow_item_flex mask_flex = {
    .length = sizeof(struct new_protocol_header),
    .pattern = (const uint8_t *)&mask_pattern,
  };
  struct rte_flow_item item_to_match = {
    .type = RTE_FLOW_ITEM_TYPE_FLEX,
    .spec = &spec_flex,
    .mask = &mask_flex,
  };

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-10-20 18:58:54 +02:00
Gregory Etelson
6cf7204733 ethdev: support flow elements with variable length
Flow API provides RAW item type for packet patterns of variable
length. The RAW item structure has fixed size members that describe the
variable pattern length and methods to process it.

A new flow item with variable length, the flex item, is being
introduced. To handle this item (and potentially other new ones
with variable pattern length) in the flow copy and conversion routines,
a helper function is introduced.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
2021-10-20 18:53:46 +02:00
Ferruh Yigit
990912e676 ethdev: unify MTU checks
Both 'rte_eth_dev_configure()' and 'rte_eth_dev_set_mtu()' set the MTU
but apply slightly different checks. For example, one checks the minimum
MTU against RTE_ETHER_MIN_MTU and the other against RTE_ETHER_MIN_LEN.

The checks are moved into a common function to unify them. This also
has the benefit of producing common error logs.

The default 'dev_info->min_mtu' (the one set by ethdev if the driver
doesn't provide one) is changed to ('RTE_ETHER_MIN_LEN' - overhead).
Previously it was 'RTE_ETHER_MIN_MTU', which is the minimum MTU for IPv4
packets. Since the intention is to provide the minimum MTU corresponding
to the minimum frame size, the new default value suits better.
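
The arithmetic behind the new default can be sketched as follows. This
is only an illustration: the macros are local stand-ins that mirror the
rte_ether.h values, and default_min_mtu() is a hypothetical helper, not
the function added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the rte_ether.h constants named in the message. */
#define ETHER_HDR_LEN 14u /* destination MAC, source MAC, EtherType */
#define ETHER_CRC_LEN  4u /* frame check sequence */
#define ETHER_MIN_LEN 64u /* minimum Ethernet frame size, CRC included */
#define ETHER_MIN_MTU 68u /* minimum MTU for IPv4 */

/* Hypothetical helper: default min MTU for a given per-frame overhead. */
static inline uint16_t
default_min_mtu(uint32_t overhead)
{
	return (uint16_t)(ETHER_MIN_LEN - overhead);
}

/* Untagged frame: the overhead is header plus CRC, i.e. 18 bytes. */
void
example_default_min_mtu(void)
{
	uint32_t overhead = ETHER_HDR_LEN + ETHER_CRC_LEN;

	assert(default_min_mtu(overhead) == 46);
	/* The new default is lower than the old IPv4-based one. */
	assert(default_min_mtu(overhead) < ETHER_MIN_MTU);
}
```

A device that also accounts for VLAN tags in its overhead would end up
with a correspondingly smaller default.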

Suggested-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
b563c14212 ethdev: remove jumbo offload flag
Remove the 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.

Instead of drivers announcing this capability, the application can
deduce it by checking the reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.

And instead of the application setting this flag explicitly to enable
jumbo frames, the driver can deduce this by comparing the requested
'mtu' to 'RTE_ETHER_MTU'.

This additional configuration is removed for simplification.
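
Both deductions reduce to a comparison against the standard MTU. A
minimal sketch, assuming hypothetical helper names and a local stand-in
for 'RTE_ETHER_MTU':

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_MTU 1500u /* local stand-in for RTE_ETHER_MTU */

/* Driver side: a requested MTU above the standard one implies jumbo. */
static inline bool
mtu_implies_jumbo(uint16_t mtu)
{
	return mtu > ETHER_MTU;
}

/* Application side: the port can do jumbo if its max MTU allows it. */
static inline bool
port_supports_jumbo(uint16_t max_mtu)
{
	return max_mtu > ETHER_MTU;
}

void
example_jumbo(void)
{
	assert(!mtu_implies_jumbo(1500));
	assert(mtu_implies_jumbo(9000));
	assert(port_supports_jumbo(9600));
}
```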

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
f7e04f57ad ethdev: move MTU set check to library
Move requested MTU value check to the API to prevent the duplicated
code.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
dd4e429c95 ethdev: move jumbo frame offload check to library
Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.

When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.

Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.

This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
2021-10-18 19:20:21 +02:00
Ferruh Yigit
1bb4a528c4 ethdev: fix max Rx packet length
There is confusion about setting the max Rx packet length; this patch
aims to clarify it.

'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.

Also, the 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and
the result is stored in '(struct rte_eth_dev)->data->mtu'.

These two APIs are related but work in a disconnected way: they store
the set values in different variables, which makes it hard to figure
out which one to use. Having two different methods for related
functionality is also confusing for users.

Other issues causing confusion are:
* The maximum transmission unit (MTU) is the payload of the Ethernet
  frame, while 'max_rx_pkt_len' is the size of the Ethernet frame. The
  difference is the Ethernet frame overhead, which may differ from
  device to device depending on what the device supports, like VLAN and
  QinQ.
* 'max_rx_pkt_len' is only valid when the application requested jumbo
  frames, which adds confusion, and some APIs and PMDs already
  disregard this documented behavior.
* For the jumbo-frame-enabled case, 'max_rx_pkt_len' is a mandatory
  field, which adds configuration complexity for the application.

As a solution, both APIs get the MTU as a parameter, and both save the
result in the same variable '(struct rte_eth_dev)->data->mtu'. For this,
'max_rx_pkt_len' is renamed to 'mtu', and it is always valid,
independent of jumbo frame.

For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is the
user request; it should be used only within the configure function, and
the result should be stored in '(struct rte_eth_dev)->data->mtu'. After
that point both the application and the PMD use the MTU from this
variable.

When the application doesn't provide an MTU during
'rte_eth_dev_configure()', the default 'RTE_ETHER_MTU' value is used.

Additional clarification is made on scattered Rx configuration in
relation to MTU and Rx buffer size.
The MTU is used to configure the device for the physical Rx/Tx size
limitation; the Rx buffer is where Rx packets are stored, and many PMDs
use the mbuf data buffer size as the Rx buffer size.
PMDs compare the MTU against the Rx buffer size to decide whether to
enable scattered Rx. If scattered Rx is not supported by the device, an
MTU bigger than the Rx buffer size should fail.
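
The relation between MTU, frame length and the scattered Rx decision
can be sketched as below. All names are hypothetical, the constants are
local stand-ins, and the sketch assumes the comparison is made on the
full frame length (MTU plus overhead); an individual PMD may do this
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14u
#define ETHER_CRC_LEN  4u
#define VLAN_TAG_LEN   4u

/* Per-frame overhead for a device that accepts up to nb_tags VLAN tags. */
static inline uint32_t
frame_overhead(uint32_t nb_tags)
{
	return ETHER_HDR_LEN + ETHER_CRC_LEN + nb_tags * VLAN_TAG_LEN;
}

/* The MTU is the payload; the frame length adds the device overhead. */
static inline uint32_t
mtu_to_frame_len(uint16_t mtu, uint32_t overhead)
{
	return (uint32_t)mtu + overhead;
}

/*
 * Returns 0 and sets *scattered when the configuration is usable,
 * -1 when the frame does not fit and scattered Rx is unavailable.
 */
static inline int
rx_size_check(uint16_t mtu, uint32_t overhead, uint32_t rx_buf_size,
	      bool scatter_supported, bool *scattered)
{
	*scattered = mtu_to_frame_len(mtu, overhead) > rx_buf_size;
	if (*scattered && !scatter_supported)
		return -1;
	return 0;
}

void
example_rx_size_check(void)
{
	bool scattered;
	uint32_t overhead = frame_overhead(2); /* QinQ: 14 + 4 + 8 */

	assert(overhead == 26);
	assert(mtu_to_frame_len(1500, overhead) == 1526);
	assert(rx_size_check(1500, overhead, 2048, false, &scattered) == 0);
	assert(!scattered);
	assert(rx_size_check(9000, overhead, 2048, true, &scattered) == 0);
	assert(scattered);
	assert(rx_size_check(9000, overhead, 2048, false, &scattered) == -1);
}
```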

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
2021-10-18 19:20:20 +02:00
Georg Sauthoff
24f1955d1e net: fix aliasing in checksum computation
That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that an unsigned char
pointer may alias, while it doesn't include such a requirement
for uint8_t pointers.

Also simplified the loop since a modern C compiler can speed up (i.e.
auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.
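
A sketch of an aliasing-safe 16-bit sum in this spirit is shown below.
It is not the code from the patch: the function names are hypothetical,
and it reads each word with memcpy() through an unsigned char pointer,
which compilers typically turn into a plain load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum 16-bit words in host order; meant for packet-sized buffers. */
static inline uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf; /* unsigned char may alias anything */
	size_t i;

	for (i = 0; i + sizeof(uint16_t) <= len; i += sizeof(uint16_t)) {
		uint16_t v;

		memcpy(&v, p + i, sizeof(v)); /* no uint16_t pointer cast */
		sum += v;
	}
	if (i < len) {
		uint16_t last = 0;

		memcpy(&last, p + i, 1); /* odd trailing byte, zero padded */
		sum += last;
	}
	return sum;
}

/* Fold the 32-bit accumulator back into 16 bits. */
static inline uint16_t
cksum_fold(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

void
example_raw_cksum(void)
{
	static const unsigned char pkt[] = {
		0x00, 0xff, 0xff, 0xff, 0xff, 0x00
	};

	/* Start at pkt + 1 to show that unaligned input is fine. */
	assert(raw_cksum_sketch(pkt + 1, 4, 0) == 0x1fffeu);
	assert(cksum_fold(0x1fffeu) == 0xffff);
	/* Odd length: one full word plus a zero trailing byte. */
	assert(raw_cksum_sketch(pkt + 3, 3, 0) == 0xffffu);
}
```

The test values are chosen so that the result does not depend on host
byte order.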

Fixes: 6006818cfb ("net: new checksum functions")
Fixes: e079655c41 ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org

Signed-off-by: Georg Sauthoff <mail@gms.tf>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-18 18:15:58 +02:00
Jie Wang
632be32735 ethdev: add API to get device configuration
The driver may change the offloads info in dev->data->dev_conf
during dev_configure, which may cause apps to use outdated values.

Add a new API to get actual device configuration.

Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-15 13:27:05 +02:00
Gowrishankar Muthukrishnan
58b43c1ddf ethdev: add telemetry endpoint for device info
Add telemetry endpoint /ethdev/info for device info.

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-14 23:44:53 +02:00
Gregory Etelson
63f2bbfa82 net: introduce IPv4 IHL and version fields
The RTE IPv4 header definition combines the `version' and `ihl' fields
into a single structure member.
This patch introduces dedicated structure members for both the `version'
and `ihl' IPv4 fields. Separate header field definitions allow
simpler code to match on the IHL value in a flow rule.
The original `version_ihl' structure member is kept for backward
compatibility.
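
One way to keep the combined byte while exposing the two nibbles is an
anonymous union, sketched below. The struct name is hypothetical and
only the first byte of the header is modelled; the layout relies on
compiler-specific bit-field ordering and on the GCC/Clang
__BYTE_ORDER__ macros, so treat it as an assumption rather than the
exact definition from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the first byte of the IPv4 header. */
struct ipv4_first_byte {
	union {
		uint8_t version_ihl; /* legacy combined member */
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;     /* header length, low nibble */
			uint8_t version:4; /* IP version, high nibble */
#else
			uint8_t version:4;
			uint8_t ihl:4;
#endif
		};
	};
};

void
example_ipv4_first_byte(void)
{
	struct ipv4_first_byte h = { .version_ihl = 0x45 };

	/* Both views refer to the same byte. */
	assert(h.version == 4);
	assert(h.ihl == 5);
	h.ihl = 6;
	assert(h.version_ihl == 0x46);
}
```

With dedicated members, a flow rule mask can name `ihl' directly
instead of masking the low nibble of `version_ihl' by hand.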

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
2021-10-14 23:00:45 +02:00
Viacheslav Ovsiienko
50cd0391a4 ethdev: add experimental comment for modify field action
The EXPERIMENTAL tag was missing from the rte_flow_action_modify_data
structure description.

Fixes: 73b68f4c54 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-14 14:34:31 +02:00
Viacheslav Ovsiienko
14fc81aed7 ethdev: update modify field flow action
The generic modify field flow action introduced in [1] has
some issues related to the immediate source operand:

  - immediate source can be presented either as an unsigned
    64-bit integer or pointer to data pattern in memory.
    There was no explicit pointer field defined in the union.

  - the byte ordering for 64-bit integer was not specified.
    Many fields have shorter lengths and byte ordering
    is crucial.

  - how the bit offset is applied to the immediate source
    field was not defined and documented.

  - 64-bit integer size is not enough to provide IPv6
    addresses.

In order to cover the issues and exclude any ambiguities
the following is done:

  - introduce the explicit pointer field
    in rte_flow_action_modify_data structure

  - replace the 64-bit unsigned integer with 16-byte array

  - update the modify field flow action documentation

Appropriate deprecation notice has been removed.

[1] commit 73b68f4c54 ("ethdev: introduce generic modify flow action")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-14 14:34:31 +02:00
Ivan Malov
1179f05cc9 ethdev: query proxy port to manage transfer flows
Not all DPDK ports in a given switching domain may have the
privilege to manage "transfer" flows. Add an API to find a
port with sufficient privileges by any port in the domain.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
2021-10-14 13:42:59 +02:00
Ivan Malov
9d2a349b38 ethdev: deprecate direction attributes in transfer flows
Attributes "ingress" and "egress" can only apply unambiguously
to non-"transfer" flows. In "transfer" flows, the standpoint
is effectively shifted to the embedded switch. There can be
many different endpoints connected to the switch, so the
use of "ingress" / "egress" does not shed light on which
endpoints precisely can be considered as traffic sources.

Add relevant deprecation notices and suggest the use of precise
traffic source items (PORT_REPRESENTOR and REPRESENTED_PORT).

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-10-13 22:59:26 +02:00
Ivan Malov
5da44faa80 ethdev: deprecate hard-to-use or ambiguous items and actions
PF, VF and PHY_PORT require that applications have extra
knowledge of the underlying NIC and thus are hard to use.
Also, the corresponding items depend on the direction
attribute (ingress / egress), which complicates their
use in applications and interpretation in PMDs.

The concept of PORT_ID is ambiguous as it doesn't say whether
the port in question is an ethdev or the represented entity.

Items and actions PORT_REPRESENTOR, REPRESENTED_PORT
should be used instead.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
88caad251c ethdev: add represented port action to flow API
For use in "transfer" flows. Supposed to send matching traffic to the
entity represented by the given ethdev, at embedded switch level.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
8edb6bc026 ethdev: add port representor action to flow API
For use in "transfer" flows. Supposed to send matching traffic to
the given ethdev (to the application), at embedded switch level.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
49863ae2bf ethdev: add represented port item to flow API
For use in "transfer" flows. Supposed to match traffic entering the
embedded switch from the entity represented by the given ethdev.
Such an entity can be a network (via a network port), a guest
machine (via a VF) or another ethdev in the same application.

Must not be combined with direction attributes.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:26 +02:00
Ivan Malov
081e42dab1 ethdev: add port representor item to flow API
For use in "transfer" flows. Supposed to match traffic
entering the embedded switch from the given ethdev.

Must not be combined with direction attributes.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-13 22:59:25 +02:00
Konstantin Ananyev
f9bdee267a ethdev: hide internal structures
Move rte_eth_dev, rte_eth_dev_data, rte_eth_rxtx_callback and related
data into private header (ethdev_driver.h).
A few minor changes are made to keep DPDK building after that.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:59 +02:00
Konstantin Ananyev
27a300e6af ethdev: add API to retrieve multiple MAC addresses
Introduce rte_eth_macaddrs_get() to allow the user to retrieve all
Ethernet addresses assigned to a given port.
Change testpmd to use this new function and avoid referencing
rte_eth_devices[] directly.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:59 +02:00
Konstantin Ananyev
7a0935239b ethdev: make fast-path functions to use new flat array
Rework fast-path ethdev functions to use rte_eth_fp_ops[].
While it is an API/ABI breakage, this change is intended to be
transparent for both users (no changes in user apps are required) and
PMD developers (no changes in PMDs are required).
One extra thing to note: RX/TX callback invocation will cause an extra
function call with these changes. That might cause some insignificant
slowdown for code paths where RX/TX callbacks are heavily involved.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
c87d435a4d ethdev: copy fast-path API into separate structure
Copy public function pointers (rx_pkt_burst(), etc.) and related
pointers to internal data from rte_eth_dev structure into a
separate flat array. That array will remain in a public header.
The intention here is to make rte_eth_dev and related structures internal.
That should allow future possible changes to core eth_dev structures
to be transparent to the user and help to avoid ABI/API breakages.
The plan is to keep minimal part of data from rte_eth_dev public,
so we still can use inline functions for fast-path calls
(like rte_eth_rx_burst(), etc.) to avoid/minimize slowdown.
The whole idea behind this new schema:
1. PMDs keep setting up fast-path function pointers and related data
   inside the rte_eth_dev struct in the same way they did before.
2. Inside rte_eth_dev_start() and inside rte_eth_dev_probing_finish()
   (for secondary process) we call eth_dev_fp_ops_setup(), which
   copies these function and data pointers into rte_eth_fp_ops[port_id].
3. Inside rte_eth_dev_stop() and inside rte_eth_dev_release_port()
   we call eth_dev_fp_ops_reset(), which resets rte_eth_fp_ops[port_id]
   into some dummy values.
4. fast-path ethdev API (rte_eth_rx_burst(), etc.) will use that new
   flat array to call PMD specific functions.
That approach should allow us to make rte_eth_devices[] private
without introducing regression and help to avoid changes in drivers code.
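
The four steps above can be sketched with a toy model. Everything here
is a simplified stand-in: the struct layouts, MAX_PORTS, and the
function names are hypothetical, and only a single Rx burst pointer is
modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4

typedef uint16_t (*rx_burst_t)(void *rxq, void **pkts, uint16_t n);

/* Stand-in for the full device structure meant to become private. */
struct dev {
	rx_burst_t rx_pkt_burst;
	void *rxq; /* a single queue keeps the sketch short */
};

/* Stand-in for one entry of the public flat array. */
struct fp_ops {
	rx_burst_t rx_pkt_burst;
	void *rxq;
};

static struct fp_ops fp_ops[MAX_PORTS];

/* Dummy op installed while a port is stopped: receives nothing. */
static uint16_t
dummy_rx_burst(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts; (void)n;
	return 0;
}

/* Step 3: reset the entry to dummy values. */
static void
fp_ops_reset(uint16_t port_id)
{
	fp_ops[port_id].rx_pkt_burst = dummy_rx_burst;
	fp_ops[port_id].rxq = NULL;
}

/* Step 2: copy the pointers the PMD stored in the device structure. */
static void
fp_ops_setup(uint16_t port_id, const struct dev *dev)
{
	fp_ops[port_id].rx_pkt_burst = dev->rx_pkt_burst;
	fp_ops[port_id].rxq = dev->rxq;
}

/* Step 4: the inline fast-path call only touches the flat array. */
static inline uint16_t
rx_burst(uint16_t port_id, void **pkts, uint16_t n)
{
	const struct fp_ops *p = &fp_ops[port_id];

	return p->rx_pkt_burst(p->rxq, pkts, n);
}

/* Toy PMD receive function: pretends every requested packet arrived. */
static uint16_t
toy_pmd_rx_burst(void *rxq, void **pkts, uint16_t n)
{
	(void)rxq; (void)pkts;
	return n;
}

void
example_fp_ops(void)
{
	/* Step 1: the PMD fills the device structure as before. */
	struct dev dev = { .rx_pkt_burst = toy_pmd_rx_burst, .rxq = NULL };

	fp_ops_reset(0);       /* port not started yet */
	assert(rx_burst(0, NULL, 8) == 0);
	fp_ops_setup(0, &dev); /* as done on device start */
	assert(rx_burst(0, NULL, 8) == 8);
	fp_ops_reset(0);       /* as done on device stop */
	assert(rx_burst(0, NULL, 8) == 0);
}
```

Installing a dummy op on reset means the fast path needs no NULL check
before the indirect call.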

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
8d7d4fcdca ethdev: change input parameters for Rx queue count
Currently the majority of fast-path ethdev ops take pointers to internal
queue data structures as an input parameter, while
eth_rx_queue_count() takes a pointer to rte_eth_dev and a queue
index.
For future work to hide rte_eth_devices[] and friends, it would be
sensible to unify the parameter lists of all fast-path ethdev ops.
This patch changes eth_rx_queue_count() to accept pointer to internal
queue data as input parameter.
While this change is transparent to user, it still counts as an ABI change,
as eth_rx_queue_count_t is used by ethdev public inline function
rte_eth_rx_queue_count().

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Konstantin Ananyev
c024496ae8 ethdev: allocate max space for internal queue array
At the queue configure stage, always allocate space for the maximum
possible number (RTE_MAX_QUEUES_PER_PORT) of queue pointers.
That will allow 'fast' inline functions (eth_rx_burst, etc.) to refer to
the pointer to internal queue data without extra checking of the current
number of configured queues.
That would help in the future to hide rte_eth_dev and related structures.
It means that from now on, each ethdev port will always consume:
((2*sizeof(uintptr_t))* RTE_MAX_QUEUES_PER_PORT)
bytes of memory for its queue pointers.
With RTE_MAX_QUEUES_PER_PORT==1024 (default value) it is 16KB per port.
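
The figure follows directly from the formula. A small check, assuming a
local stand-in for the default RTE_MAX_QUEUES_PER_PORT and a
hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUES_PER_PORT 1024u /* default value named above */

/* One array of Rx queue pointers plus one of Tx queue pointers. */
static inline size_t
queue_ptr_bytes_per_port(void)
{
	return 2 * sizeof(uintptr_t) * MAX_QUEUES_PER_PORT;
}

void
example_queue_ptr_bytes(void)
{
	/* 16KB with 8-byte pointers, half of that with 4-byte pointers. */
	if (sizeof(uintptr_t) == 8)
		assert(queue_ptr_bytes_per_port() == 16 * 1024);
	else if (sizeof(uintptr_t) == 4)
		assert(queue_ptr_bytes_per_port() == 8 * 1024);
}
```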

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
2021-10-13 22:14:58 +02:00
Ivan Malov
f6d8a6d3fa ethdev: negotiate delivery of packet metadata from HW to PMD
Provide an API to let the application control the NIC's ability
to deliver specific kinds of per-packet metadata to the PMD.

Checks for the NIC's ability to set these kinds of metadata
in the first place (support for the flow actions) belong in
flow API responsibility domain (flow validate mechanism).
This topic is out of scope of the new API in question.

The PMD's ability to deliver received metadata to the user
by virtue of mbuf fields should be covered by mbuf library.
It is also out of scope of the new API in question.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
2021-10-13 00:47:42 +02:00
Andrew Rybchenko
92ef4b8f16 ethdev: remove deprecated shared counter attribute
Indirect actions should be used to do shared counters.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2021-10-12 19:20:57 +02:00
Viacheslav Galaktionov
ff4e52efb3 ethdev: fix representor port ID search by name
The patch is required for all PMDs which do not provide representors
info on the representor itself.

The function rte_eth_representor_id_get() is used in
eth_representor_cmp(), which is required in the ethdev class iterator to
search for an ethdev port ID by name (representor case). Before the patch,
the function was called on the representor itself and tried to get
representors info to match.

Search of port ID by name is used after hotplug to find out port ID
of the just plugged device.

Getting a list of representors from a representor does not make sense.
Instead, a backer device should be used.

To this end, extend the rte_eth_dev_data structure to include the port ID
of the backing device for representors.

Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2021-10-12 16:54:20 +02:00
Andrew Rybchenko
6c31a8c20a ethdev: remove legacy Rx descriptor done API
rte_eth_rx_descriptor_status() should be used as a replacement.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2021-10-11 16:44:57 +02:00
Akhil Goyal
fb545457ed security: add reserved bit fields
In struct rte_security_ipsec_sa_options, every new option added
causes an ABI breakage. To avoid this, a reserved_opts
bitfield is added for the remaining bits available in the
structure.
Now, for every new SA option, reserved_opts can be reduced
and the new option can be added.
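
The idea can be sketched with two versions of a toy options structure.
The option names (opt_a, opt_b, opt_c) and the 32-bit width are
hypothetical; only reserved_opts is taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Options structure before a new option is added. */
struct sa_options_v1 {
	uint32_t opt_a : 1;
	uint32_t opt_b : 1;
	uint32_t reserved_opts : 30; /* remaining bits, kept zero */
};

/* Same structure after carving one bit out of the reserved field. */
struct sa_options_v2 {
	uint32_t opt_a : 1;
	uint32_t opt_b : 1;
	uint32_t opt_c : 1;          /* new option */
	uint32_t reserved_opts : 29;
};

/* Adding an option must not change the structure size. */
_Static_assert(sizeof(struct sa_options_v1) ==
	       sizeof(struct sa_options_v2), "size changed");

void
example_reserved_opts(void)
{
	/* Data written by old code, read through the new layout. */
	union {
		struct sa_options_v1 v1;
		struct sa_options_v2 v2;
	} u = { .v1 = { .opt_a = 1, .opt_b = 0 } };

	assert(u.v2.opt_a == 1);
	assert(u.v2.opt_b == 0);
	assert(u.v2.opt_c == 0); /* reserved bits were zero */
}
```

Old callers that leave the reserved bits at zero see the new option as
disabled, which is what keeps the extension compatible.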

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-18 20:12:19 +02:00
Akhil Goyal
3867ed0280 security: hide internal API
rte_security_dynfield_register() is an internal
API to be used by the driver, hence moving it to internal.

Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-18 20:12:19 +02:00
Nicolas Chautru
ab4e19097b bbdev: add device info for data endianness
Added device information to capture explicitly the assumption
of the input/output data byte endianness being processed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-18 20:11:16 +02:00
Gagandeep Singh
8edcb68fd0 cryptodev: fix multi-segment raw vector processing
If no next segment is available, the “for” loop exits early but the
function still returns i+1, i.e. 2, which is wrong as only 1 buffer has
been filled.

Fixes: 7adf992fb9 ("cryptodev: introduce CPU crypto API")
Cc: stable@dpdk.org

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 19:32:13 +02:00
Hemant Agrawal
68f5d3d320 cryptodev: add field for out-of-place in raw vector
The structure rte_crypto_sym_vec is updated to
add dest_sgl to support out of place processing.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:32:01 +02:00
Gagandeep Singh
6afd461f9f cryptodev: add total raw buffer length
The current crypto raw data vectors are extended to support
rte_security use cases, where the total data length is needed to know
how much additional memory space is available in the buffer beyond
the data length, so that the driver/HW can write the expanded
data after encryption.

Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:32:01 +02:00
Hemant Agrawal
10488d59ae cryptodev: rename field in vector struct
This patch renames sgl to src_sgl in struct rte_crypto_sym_vec
to help differentiate between the source and destination sgl.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-17 19:31:15 +02:00
Radu Nicolau
2ed40da848 ipsec: support setting initial ESN value
Update IPsec library to support initial ESN value.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:11:59 +02:00
Radu Nicolau
68977baa75 ipsec: support SA telemetry
Add telemetry support for ipsec SAs.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:08:03 +02:00
Radu Nicolau
64df4712ce mbuf: add IPsec ESP tunnel type
Add ESP tunnel type to the tunnel types list that can be specified
for TSO or checksum on the inner part of tunnel packets.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-17 14:07:03 +02:00
Radu Nicolau
01eef5907f ipsec: support NAT-T
Add support for the IPsec NAT-Traversal use case for Tunnel mode
packets.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:06:24 +02:00
Radu Nicolau
17344c0278 security: add UDP parameters for IPsec NAT-T
Add support for specifying UDP port params for the UDP encapsulation
option. RFC3948 section-2.1 does not enforce using specific UDP ports
for the UDP-Encapsulated ESP Header.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:03:43 +02:00
Radu Nicolau
c99d26197c ipsec: support more AEAD algorithms
Added support for AES_CCM, CHACHA20_POLY1305 and AES_GMAC.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 14:03:13 +02:00
Radu Nicolau
199fcba1bd security: add ESN field to IPsec xform
Update ipsec_xform definition to include ESN field.
This allows the application to control the ESN starting value.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Daniel Martin Buckley <daniel.m.buckley@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-17 13:08:35 +02:00
Matan Azrad
cab0c8f3c0 cryptodev: extend data-unit length field
As described in [1] and as announced in [2], the field ``dataunit_len``
of the ``struct rte_crypto_cipher_xform`` is moved to the end of the
structure and extended to ``uint32_t``.

In this way, sizes bigger than 64K bytes can be supported for data-unit
lengths.

[1] commit d014dddb2d ("cryptodev: support multiple cipher
data-units")
[2] commit 9a5c09211b ("doc: announce extension of crypto data-unit
length")

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-16 16:24:43 +02:00
David Marchand
afdaa60795 mempool: accept user flags only
As reported by Dmitry, RTE_MEMPOOL_F_POOL_CREATED is a flag only
manipulated internally.
This flag is not supposed to be requested from an application and would
probably result in an incorrect behavior if an application did pass it.

At least one other internal flag has been added recently and more may be
introduced later.

Rework the check and export a mask of valid user flags for use in the
unit test.
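
A mask-based check of this kind can be sketched as below. The flag
names and values are hypothetical and chosen for illustration only;
the real mask and flags are defined by the mempool library.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define POOL_F_NO_SPREAD      0x0001u /* user flag */
#define POOL_F_SP_PUT         0x0004u /* user flag */
#define POOL_F_POOL_CREATED   0x0010u /* internal, set by the library */
#define POOL_F_INTERNAL_OTHER 0x0040u /* another internal flag */

/* Only flags listed here may be requested by an application. */
#define POOL_VALID_USER_FLAGS (POOL_F_NO_SPREAD | POOL_F_SP_PUT)

/* Reject any bit outside the user mask, internal flags included. */
static inline int
pool_check_user_flags(uint32_t flags)
{
	if ((flags & ~POOL_VALID_USER_FLAGS) != 0)
		return -EINVAL;
	return 0;
}

void
example_pool_flags(void)
{
	assert(pool_check_user_flags(0) == 0);
	assert(pool_check_user_flags(POOL_F_SP_PUT) == 0);
	assert(pool_check_user_flags(POOL_F_POOL_CREATED) == -EINVAL);
	assert(pool_check_user_flags(POOL_F_SP_PUT |
				     POOL_F_INTERNAL_OTHER) == -EINVAL);
}
```

Checking against a mask of allowed flags, rather than a list of
forbidden ones, means internal flags added later are rejected without
touching the check.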

Fixes: b240af8b10 ("mempool: enforce valid flags at creation")

Reported-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-20 10:03:55 +02:00
Andrew Rybchenko
fb11ae8816 mempool: deprecate unused physical page defines
MEMPOOL_PG_NUM_DEFAULT and MEMPOOL_PG_SHIFT_MAX are not used.

Fixes: fd943c764a ("mempool: deprecate xmem functions")

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:03:41 +02:00
Andrew Rybchenko
cb77b060eb mempool: add namespace to driver register macro
Add RTE_ prefix to macro used to register mempool driver.
The old one is still available but deprecated.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
d720366184 mempool: make header size calculation internal
Add RTE_ prefix to helper macro to calculate mempool header size and
make it internal. Old macro is still available, but deprecated.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
ad276d5c7e mempool: add namespace to internal helpers
Add RTE_ prefix to internal API defined in public header.
Use the prefix instead of double underscore.
Use uppercase for macros in the case of name conflict.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:18 +02:00
Andrew Rybchenko
c47d7b90a1 mempool: add namespace to flags
Fix the mempool flags namespace by adding an RTE_ prefix to the name.
The old flags remain usable, to be deprecated in the future.

Flag MEMPOOL_F_NON_IO added in the release is just renamed to have RTE_
prefix.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 10:00:16 +02:00
Andrew Rybchenko
925a83a5bf mempool: enhance flags documentation readability
Move documentation into a separate line just before define.
Prepare to have a bit longer flag name because of namespace prefix.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-20 09:58:39 +02:00
Feifei Wang
c4629b02c5 mcslock: use WFE in lock for aarch64
Instead of polling for previous lock holder unlocking, use
wait_until_equal API.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-10-20 08:22:41 +02:00
Feifei Wang
a6e24bf417 mem: use WFE for init sync on aarch64
Instead of polling for mcfg->magic to be updated, use wait_until_equal
API.

Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2021-10-20 08:22:18 +02:00
Joyce Kong
4da0136096 stack: remove unneeded atomic header include
In stack module, remove the header file rte_atomic.h
as it is not being used.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-19 17:15:10 +02:00
Dmitry Kozlyuk
11541c5c81 mempool: add non-IO flag
Mempool is a generic allocator that is not necessarily used
for device IO operations and its memory for DMA.
Add MEMPOOL_F_NON_IO flag to mark such mempools automatically
a) if their objects are not contiguous;
b) if IOVA is not available for any object.
Other components can inspect this flag
in order to optimize their memory management.

Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-19 16:35:16 +02:00
Dmitry Kozlyuk
da2b9cb25e mempool: add event callbacks
Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often mbuf memory comes from mempools that can come and
go. PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after the forwarding
starts but before an mbuf from the new mempool is sent to the device.

Add an API to register callback for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
The new API is internal, because it is primarily demanded by PMDs that
may need to deal with any mempools and do not control their creation,
while an application, on the other hand, knows which mempools it creates
and doesn't care about internal mempools PMDs might create.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-19 16:35:16 +02:00
Bruce Richardson
2e348d8fe3 dmadev: add flag for error handling support
Due to HW or driver limitations, not all dmadevs may support full error
handling e.g. safely managing and reporting an invalid address to a copy
operation. The skeleton dmadev, for example, being pure software will
always seg-fault if passed an invalid address. To indicate the
availability of safe error handling by a device, we add a capability
flag for it.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:19:27 +02:00
Bruce Richardson
190f7e84c3 dmadev: add device iterator
Add a function and wrapper macro to iterate over all DMA devices.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:17:32 +02:00
Kevin Laatz
ea8cf0f853 dmadev: add burst capacity API
Add a burst capacity check API to the dmadev library. This API is useful to
applications which need to know how many descriptors can be enqueued in the
current batch. For example, it could be used to determine whether all
segments of a multi-segment packet can be enqueued in the same batch or not
(to avoid half-offload of the packet).

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-18 11:17:30 +02:00
Bruce Richardson
5e0f859127 dmadev: add channel status check for testing use
Add in a function to check if a device or vchan has completed all jobs
assigned to it, without gathering in the results. This is primarily for
use in testing, to allow the hardware to be in a known-state prior to
gathering completions.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2021-10-18 11:17:21 +02:00
Chengwen Feng
2ece65f00f dmadev: support multi-process
This patch adds multi-process support for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
91e581e5c9 dmadev: add data plane API
This patch adds data plane API for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
e0180db144 dmadev: add control plane API
This patch adds control plane API for dmadev.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:58 +02:00
Chengwen Feng
b36970f2e1 dmadev: introduce DMA device library
The 'dmadev' is a generic type of DMA device.

This patch introduces the 'dmadev' device allocation functions.

The infrastructure is prepared to welcome drivers in drivers/dma/

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
2021-10-17 20:49:57 +02:00
David Marchand
e9123c467d mbuf: enforce no option for dynamic fields and flags
As stated in the API, dynamic fields and flags should be created with no
additional flag (the parameter only exists in the API for future changes).

Fix the dynamic flag register helper which was not enforcing it and add
unit tests.

Fixes: 4958ca3a44 ("mbuf: support dynamic fields and flags")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-15 10:29:41 +02:00
David Marchand
bc1a35fb3f memzone: enforce valid flags when reserving
If we do not enforce that valid flags are passed by an application, this
application might face issues in the future when we add more flags.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-15 10:29:21 +02:00
David Marchand
b240af8b10 mempool: enforce valid flags at creation
If we do not enforce that valid flags are passed by an application, this
application might face issues in the future when we add more flags.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-15 10:24:43 +02:00
Bruce Richardson
85e21b77d8 telemetry: fix socket path conflicts for in-memory mode
When running using in-memory mode, multiple processes can use the same
runtime dir, leading to conflicts with the telemetry sockets in that
directory. We can resolve this by appending a suffix to each socket
beyond the first, with the suffix being an increasing counter value.
Each process uses the first unused socket counter value.
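
The selection rule above can be sketched as follows. This is only an illustration: the `ex_` helpers are hypothetical, the `taken` array stands in for whatever probe decides that a socket is already in use, and the `:<n>` suffix format is an assumption rather than something stated in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Return the lowest counter whose socket is not taken, or -1 if all are. */
static int
ex_first_free_counter(const bool *taken, int n)
{
    for (int i = 0; i < n; i++) {
        if (!taken[i])
            return i;
    }
    return -1;
}

/*
 * Counter 0 keeps the plain path; later counters get a ":<n>" suffix
 * (assumed format). Returns 0 on success, -1 if buf is too small.
 */
static int
ex_format_socket_path(const char *base, int counter, char *buf, size_t len)
{
    int n;

    if (counter == 0)
        n = snprintf(buf, len, "%s", base);
    else
        n = snprintf(buf, len, "%s:%d", base, counter);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```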

Fixes: 6dd571fd07 ("telemetry: introduce new functionality")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
2021-10-14 20:31:10 +02:00
Bruce Richardson
e89463a366 eal: limit telemetry to primary processes
Telemetry interface should be exposed for primary processes only, since
secondary processes will conflict on socket creation, and since all
data in secondary process is generally available to primary. For
example, all device stats for ethdevs, cryptodevs, etc. will all be
common across processes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>
2021-10-14 20:31:10 +02:00
David Christensen
b698651b91 eal/ppc: use compiler builtins for atomics
Replace existing PPC assembly code for rte_atomicXX ops with compiler
atomic builtins as previously adopted by DPDK (see [1] and [2]).  This
has the additional benefit of resolving a POWER10 build failure due to an
outstanding gcc issue which fails on the existing PPC assembly code [3].

[1] https://www.dpdk.org/blog/2021/03/26/dpdk-adopts-the-c11-memory-model/
[2] https://doc.dpdk.org/guides/rel_notes/deprecation.html
[3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98519

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
2021-10-14 16:51:25 +02:00
Huichao Cai
567473433b ip_frag: fix fragmenting IPv4 fragment
Current implementation of rte_ipv4_fragment_packet() doesn’t take
into account offset and flag values of the given packet, but blindly
assumes they are always zero (original packet is not fragmented).
According to RFC 791, offset and flag values for a new fragment
should take into account values provided in the original IPv4 packet.
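
The rule can be sketched on the host-order flags/offset field alone. This is a minimal illustration under stated assumptions: `ex_frag_field` is a hypothetical helper, not the code of rte_ipv4_fragment_packet(), and the DF bit is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_IPV4_MF_FLAG     0x2000u /* "more fragments" bit (RFC 791) */
#define EX_IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/*
 * Build the host-order flags/offset field of a new fragment.
 * orig_field: flags/offset field of the packet being fragmented.
 * data_off:   byte offset of this fragment's payload inside that
 *             packet (a multiple of 8).
 * last:       true if this is the last fragment cut from that packet.
 */
static uint16_t
ex_frag_field(uint16_t orig_field, uint16_t data_off, bool last)
{
    /* The new offset is relative to the original datagram, not the input. */
    uint16_t offset = (uint16_t)((orig_field & EX_IPV4_OFFSET_MASK) +
            (data_off >> 3));
    /* The last piece keeps MF if the input itself was not a final fragment. */
    bool more = !last || (orig_field & EX_IPV4_MF_FLAG) != 0;

    return (uint16_t)((offset & EX_IPV4_OFFSET_MASK) |
            (more ? EX_IPV4_MF_FLAG : 0u));
}
```

For an input that is itself a fragment at offset 185 with MF set, the last piece cut at byte 736 gets offset 185 + 92 = 277 and keeps MF.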

Fixes: 4c38e5532a ("ip_frag: refactor IPv4 fragmentation into a proper library")
Cc: stable@dpdk.org

Signed-off-by: Huichao Cai <chcchc88@163.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-14 08:52:34 +02:00
Andrew Rybchenko
74a74bf98c mbuf: remove deprecated flag for bad outer IPv4 checksum
Removed offload flag PKT_RX_EIP_CKSUM_BAD. PKT_RX_OUTER_IP_CKSUM_BAD
should be used as a replacement.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-13 23:03:47 +02:00
Andrew Rybchenko
a87a0c0d1a mempool: fix name size in mempool structure
Use correct define as a name array size.

The change breaks ABI and therefore cannot be backported to
stable branches.

Fixes: 38c9817ee1 ("mempool: adjust name size in related data types")

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-10-13 22:54:10 +02:00
Stephen Hemminger
d75eed0fbe mbuf: fix typo in comment
Misspelling of 'copied'

Fixes: c3a90c381d ("mbuf: add a copy routine")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-13 19:02:18 +02:00
Gowrishankar Muthukrishnan
b76731683b telemetry: fix JSON output buffer length
Earlier, JSON message length was limited to 1024 which would not
allow data more than this size. Removed this limitation by creating
output buffer based on requested data length.

Fixes: 52af6ccb2b ("telemetry: add utility functions for creating JSON")
Cc: stable@dpdk.org

Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-10-13 18:17:24 +02:00
Bruce Richardson
0faa4cfc50 eal/freebsd: ignore in-memory option
The in-memory option is not supported on FreeBSD so print a warning and
ignore the flag when it is specified for BSD apps. The lack of support
is due to the different way in which memory is managed on FreeBSD using
the contigmem driver rather than via a hugetlbfs filesystem.

Fixes: 14de8734c4 ("eal: add --in-memory option")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-13 17:11:26 +02:00
Olivier Matz
8e506da755 net: promote IPv6 external headers skip API as stable
This function is public since commit 8f0e4d6a78 ("net: export IPv6
header extensions skip function") (2018), and is used by vmxnet3 driver.
Promote it as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-13 12:57:12 +02:00
David Marchand
2f3758751b eal/x86: sort CPU extended features definitions
Sort the definitions for extended features (leaf 0) to enhance
readability.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-12 21:07:53 +02:00
David Marchand
aae3037ab1 eal/x86: fix some CPU extended features definitions
Caught while checking CPUID related stuff in OVS.

According to [1], for Structured Extended Feature Flags Enumeration Leaf
(EAX = 07H, ECX = 0):

- BMI1 is associated to EBX, bit 3 (was incorrectly 2),
- SMEP is associated to EBX, bit 7 (was incorrectly 6),
- BMI2 is associated to EBX, bit 8 (was incorrectly 7),
- ERMS is associated to EBX, bit 9 (was incorrectly 8),

1: https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
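
The corrected bit positions can be sketched as a bit test on the EBX value returned for that leaf. The `ex_` names are hypothetical and are not the identifiers used in the EAL feature table.

```c
#include <stdbool.h>
#include <stdint.h>

/* EBX bit positions for CPUID leaf 7, sub-leaf 0, as listed above. */
enum ex_leaf7_ebx_bit {
    EX_BMI1 = 3,
    EX_SMEP = 7,
    EX_BMI2 = 8,
    EX_ERMS = 9,
};

/* Return true if the given feature bit is set in the EBX value. */
static bool
ex_leaf7_has(uint32_t ebx, enum ex_leaf7_ebx_bit bit)
{
    return ((ebx >> bit) & 1u) != 0;
}
```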

Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2021-10-12 21:07:50 +02:00
John Levon
24d5a1ce6b eal/linux: allow hugetlbfs sub-directories
get_hugepage_dir() was implemented in such a way that a --huge-dir
option had to exactly match the mountpoint, but there's no reason for
this restriction: DPDK might not be the only user of hugepages, and
shouldn't assume it owns an entire mountpoint. For example, if I have
/dev/hugepages/myapp, and /dev/hugepages/dpdk, I should be able to
specify:

--huge-dir=/dev/hugepages/dpdk/

and have DPDK only use that sub-directory.

Fix the implementation to allow a sub-directory within a suitable
hugetlbfs mountpoint to be specified, preferring the closest match.
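
"Closest match" can be read as the longest mountpoint that is a path prefix of the requested directory. The sketch below illustrates that reading only; `ex_closest_mount` is a hypothetical helper and not the code of get_hugepage_dir().

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the mountpoint that is the longest path prefix of
 * dir, or -1 if none matches. A prefix only counts when it ends on a
 * path component boundary, so "/dev/huge" does not match "/dev/hugepages".
 */
static int
ex_closest_mount(const char *dir, const char *const mounts[], size_t n)
{
    int best = -1;
    size_t best_len = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(mounts[i]);

        if (len == 0 || strncmp(dir, mounts[i], len) != 0)
            continue;
        /* The next character must end the path or start a new component. */
        if (mounts[i][len - 1] != '/' && dir[len] != '\0' && dir[len] != '/')
            continue;
        if (len > best_len) {
            best = (int)i;
            best_len = len;
        }
    }
    return best;
}
```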

Signed-off-by: John Levon <john.levon@nutanix.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2021-10-12 21:07:46 +02:00
Archana Muniganti
03ab51eafd security: add SA config option for inner checksum
Add inner packet IPv4 hdr and L4 checksum enable options
in conf. These will be used in case of protocol offload.
Per SA, application could specify whether the
checksum(compute/verify) can be offloaded to security device.

Signed-off-by: Archana Muniganti <marchana@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-08 21:39:39 +02:00
Shijith Thotton
dd451ad152 doc: remove event crypto metadata deprecation note
Proposed change to event crypto metadata is not done as per deprecation
note. Instead, comments are updated in spec to improve readability.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
5f13f4c03d bbdev: reduce log level of a failure message
Queue setup may genuinely fail when adding incremental queues
for a given priority level. In that case application would
attempt to configure a queue at a different priority level.
Not an actual error.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
10ea15e35f bbdev: add capability for 4G CB CRC drop
Adding option to drop CRC24B to align with existing
feature for 5G

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Nicolas Chautru
cc360fd3f2 bbdev: add capability for CRC16 check
Adding a missing operation when CRC16
is being used for TB CRC check.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
2021-10-08 21:31:07 +02:00
Tejasree Kondoj
f7e3aa693d security: add option to configure UDP ports verification
Add option to indicate whether UDP encapsulation ports
verification needs to be done as part of inbound
IPsec processing.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-10-08 21:31:07 +02:00
Dmitry Kozlyuk
d47dd94162 eal/windows: do not install virt2phys header
The header was not intended to be a public one.
DPDK users should use `rte_mem_virt2iova()` to translate addresses.
Other virt2phys users should use the header from the driver instead.

Fixes: 2a5d547a4a ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-11 21:17:12 +02:00
Narcisa Vasile
694c81721e eal/windows: fix CPU cores counting
On Windows, -l/--lcores EAL option was unable to process CPU sets
containing CPUs other than 0 and 1, because CPU_COUNT() macro
only checked these CPUs in the set. Fix CPU_COUNT() by enumerating
all possible CPU indices.
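
Counting by enumeration can be sketched with a small stand-in set type. The `ex_` names, the set size and the word layout are assumptions for illustration, not the Windows EAL definitions.

```c
#include <stdint.h>

#define EX_SETSIZE 128u /* hypothetical number of CPUs in a set */
#define EX_WORDS   (EX_SETSIZE / 64u)

struct ex_cpuset {
    uint64_t bits[EX_WORDS];
};

/* Mark one CPU index as present in the set. */
static void
ex_cpu_set(unsigned int cpu, struct ex_cpuset *s)
{
    s->bits[cpu / 64u] |= UINT64_C(1) << (cpu % 64u);
}

/* Return 1 if the CPU index is present in the set, 0 otherwise. */
static int
ex_cpu_isset(unsigned int cpu, const struct ex_cpuset *s)
{
    return (int)((s->bits[cpu / 64u] >> (cpu % 64u)) & 1u);
}

/* Count set CPUs by testing every possible index, not a fixed few. */
static unsigned int
ex_cpu_count(const struct ex_cpuset *s)
{
    unsigned int count = 0;

    for (unsigned int cpu = 0; cpu < EX_SETSIZE; cpu++)
        count += (unsigned int)ex_cpu_isset(cpu, s);
    return count;
}
```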

Fixes: e8428a9d89 ("eal/windows: add some basic functions and macros")
Cc: stable@dpdk.org

Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
2021-10-11 18:52:56 +02:00
Lance Richardson
dc954ae73a net: fix checksum API documentation
Minor corrections and improvements to documentation
for checksum APIs.

Fixes: 6006818cfb ("net: new checksum functions")
Fixes: 45a08ef55e ("net: introduce functions to verify L4 checksums")
Cc: stable@dpdk.org

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-07 14:42:45 +02:00
Andrew Rybchenko
b225783dda ethdev: remove legacy mirroring API
A more fine-grain flow API action RTE_FLOW_ACTION_TYPE_SAMPLE should
be used instead of it.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-07 13:02:26 +02:00
Xueming Li
7483341ae5 ethdev: change queue release callback
Currently, most ethdev callback API use queue ID as parameter, but Rx
and Tx queue release callback use queue object which is used by Rx and
Tx burst data plane callback.

To align with other eth device queue configuration callbacks:
- queue release callbacks are changed to use queue ID
- all drivers are adapted

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 19:16:03 +02:00
Xueming Li
49ed322469 ethdev: make queue release callback optional
Some drivers don't need Rx and Tx queue release callbacks, so make them
optional. Clean up empty queue release callbacks for some drivers.

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2021-10-06 19:16:03 +02:00
Andrew Rybchenko
8c9f976f05 ethdev: improve xstats names by IDs get prototype
Adjust parameters order to eth_xstats_get_by_id_t prototype.
Make ids the second parameter similar to eth_xstats_get_by_id_t.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Ivan Ilchenko
71b5e430a6 ethdev: update xstats by ID driver callbacks documentation
Update xstats by IDs callbacks documentation in accordance with
ethdev usage of these callbacks. Document valid combinations of
input arguments to make driver implementation simpler.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Andrew Rybchenko
113778be13 ethdev: do not use get xstats names by IDs to obtain count
Relax requirements on get xstats names by IDs. After the patch, the
corresponding driver operation is called with non-NULL ids
and xstats_names parameters only.

Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Ivan Ilchenko
bc5112ca59 ethdev: fix xstats by ID API documentation
Document valid combinations of input arguments in accordance with
current implementation in ethdev.

Fixes: 79c913a42f ("ethdev: retrieve xstats by ID")
Cc: stable@dpdk.org

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-10-06 13:07:11 +02:00
Dmitry Kozlyuk
04d43857ea net: rename Ethernet header fields
Definition of `rte_ether_addr` structure used a workaround allowing DPDK
and Windows SDK headers to be used in the same file, because Windows SDK
defines `s_addr` as a macro. Rename `s_addr` to `src_addr` and `d_addr`
to `dst_addr` to avoid the conflict and remove the workaround.
Deprecation notice:
https://mails.dpdk.org/archives/dev/2021-July/215270.html

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-08 14:58:11 +02:00
Tal Shnaiderman
56d2c1aa0b security: build on Windows
Build the security library on Windows.

Remove unneeded export of inline functions from version file.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
Tal Shnaiderman
cb7b6898c8 cryptodev: build on Windows
Build the cryptography device library on Windows OS
by removing unneeded include and exports of inline functions
blocking the compilation.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
Tal Shnaiderman
f9b2a75ed4 security: use net library to include IP structs
Remove the netinet includes and replace them
with rte_ip.h to support the in_addr/in6_addr structs
on all operating systems.

Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-07 14:47:35 +02:00
David Marchand
ddfc59f4fb sort symbol maps
Fixed with ./devtools/update-abi.sh $(cat ABI_VERSION)

Fixes: e73a7ab224 ("net/softnic: promote manage API")
Fixes: 8f532a34c4 ("fib: promote API to stable")
Fixes: 4aeb92396b ("rib: promote API to stable")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-05 17:03:37 +02:00
Xuan Ding
07ee2d7505 vhost: normalize return type and function name
In some function definitions, adjust return type and function name on
a separate line to be consistent with DPDK coding style.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 21:23:00 +02:00
David Marchand
5ac9d766b5 vhost: rework RARP packet injection
Caught by code review, this copy is unnecessary.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 21:21:07 +02:00
Eugenio Pérez
e7cb7fdf54 vhost: clean IOTLB cache on vring stop
Old IOVA cache entries are left when there is a change on virtio driver
in VM. In case that all these old entries have iova addresses lower
than new iova entries, vhost code will need to iterate all the cache to
find the new ones. In case of just a new iova entry needed for the new
translations, this condition will last forever.

This has been observed in virtio-net to testpmd's vfio-pci driver
transition, reducing the performance from more than 10Mpps to less than
0.07Mpps if the hugepage address was higher than the networking
buffers. Since all new buffers are contained in this new gigantic page,
vhost needs to scan IOTLB_CACHE_SIZE - 1 for each translation at worst.

Fixes: 69c90e98f4 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reported-by: Pei Zhang <pezhang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2021-09-28 17:26:44 +02:00
Stephen Hemminger
8680efa04c mbuf: promote Tx offload helper to stable
This function should be made stable now.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-05 11:04:03 +02:00
Stephen Hemminger
a767a3f43e mbuf: promote check helper to stable
This one has been in for required time period.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 11:04:03 +02:00
Stephen Hemminger
7ddade3555 mbuf: promote dynamic fields to stable
These functions to register dynamic fields were added in 19.11
and should be promoted to stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 11:03:58 +02:00
Stephen Hemminger
bf384709c7 mbuf: promote more helpers to stable
These two functions were added in 19.11 as experimental.
Time to promote them to stable status.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-05 10:59:35 +02:00
David Marchand
8d2a436d69 mbuf: promote some helpers to stable
Those accessors were introduced more than two years ago
(rte_mbuf_to_priv in v18.08, rte_mbuf_*_addr* in v19.02).
Time to mark them stable.

rte_mbuf_to_baddr() could be removed, but since we lack a deprecation
notice, keep it as a simple wrapper.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2021-10-05 10:59:31 +02:00
Sean Morrissey
f01eff0d23 ring: promote new sync modes and peek to stable
These methods were introduced in 20.05.
There have been no changes in their public API since then.
They seem mature enough to remove the experimental tag.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-10-05 10:09:15 +02:00
Bruce Richardson
47a4f2650c eal/freebsd: lock memory device to prevent conflicts
Only a single DPDK process on the system can be using the /dev/contigmem
mappings at a time, but this was never explicitly enforced, e.g. when
using --in-memory flag on two processes. To prevent possible conflict
issues, we lock the dev node when it's in use, preventing other DPDK
processes from starting up and causing problems for us.

Fixes: 764bf26873 ("add FreeBSD support")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2021-10-02 16:30:16 +02:00
Vladimir Medvedkin
8f532a34c4 fib: promote API to stable
The fib and fib6 APIs have been in since 19.11 and
should be marked as stable.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2021-10-02 11:37:25 +02:00
Stephen Hemminger
4aeb92396b rib: promote API to stable
The rib and rib6 APIs have been in since 19.11 and
should be marked as stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2021-10-02 11:37:25 +02:00
Stephen Hemminger
2cea5168c1 net: promote string to ethernet to stable
This function has been in since 19.11.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-02 11:35:17 +02:00
Xiao Wang
20ab35d032 net: promote make rarp packet function to stable
rte_net_make_rarp_packet was introduced in version v18.02, there was no
change in this public API since then, and it's still being used by vhost
lib and virtio driver, so promote it as stable ABI.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-10-02 11:12:32 +02:00
Ivan Malov
8cfad59e29 log: promote some function to stable
This one might be quite mature to be attested as stable.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-02 11:12:32 +02:00
Mattias Rönnblom
15a1e00a65 eal: promote random generator with upper bound to stable
Remove experimental tag from rte_rand_max().

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-02 11:12:19 +02:00
Bruce Richardson
b8a0fbab98 telemetry: promote API to stable
The telemetry APIs have been present and unchanged for >1 year now,
so remove experimental tag from them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-10-01 17:17:28 +02:00
Jie Zhou
09e4eceacb mempool/stack: build on Windows
Enable build of mempool/stack on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-10-01 16:46:05 +02:00
Pablo de Lara
8751a7e983 efd: allow more CPU sockets in table creation
rte_efd_create() function was using uint8_t for a socket bitmask,
for one of its parameters.
This limits the maximum of NUMA sockets to be 8.
Changing to uint64_t increases it to 64, which should be
more future-proof.
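
As a rough illustration of why the mask width caps the socket count, the sketch below tests a socket bit against an 8-bit and a 64-bit mask. The helper names are hypothetical and are not part of the EFD API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bit for a socket ID, or 0 if it cannot be represented. */
static uint64_t socket_bit(unsigned int socket_id)
{
    return socket_id < 64 ? UINT64_C(1) << socket_id : 0;
}

/* With an 8-bit mask only socket IDs 0..7 can ever be selected. */
static bool socket_in_mask8(uint8_t mask, unsigned int socket_id)
{
    return socket_id < 8 && (mask & (1u << socket_id)) != 0;
}

/* With a 64-bit mask socket IDs 0..63 can be selected. */
static bool socket_in_mask64(uint64_t mask, unsigned int socket_id)
{
    return socket_id < 64 && (mask & (UINT64_C(1) << socket_id)) != 0;
}
```

Truncating a mask that selects socket 9 to uint8_t silently drops that socket, which is the limitation described above.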

Coverity issue: 366390
Fixes: 56b6ef874f ("efd: new Elastic Flow Distributor library")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
2021-10-01 16:33:20 +02:00
Kevin Traynor
4ad8807cfc bitrate: promote free function to stable
rte_stats_bitrate_free() has been in DPDK since 20.11.

Its signature is very basic as it just frees an opaque
data struct allocated in rte_stats_bitrate_create()
and returns void.

It's unlikely that such a basic signature would need to change
so might as well promote it to stable for the next major ABI.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:31:47 +02:00
Kevin Traynor
bdd478eede bitrate: fix calculation to match API description
rte_stats_bitrate_calc() API states it returns 'Negative value on error'.

However, the implementation will return the error code from
rte_eth_stats_get() which may be non-zero on error.

Change the implementation of rte_stats_bitrate_calc() to match
the API description by always returning a negative value on error.
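
One way to guarantee a negative value on error is sketched below. The helper name is hypothetical and the actual patch may normalize the code differently.

```c
/* Hypothetical helper: map any non-zero error code to a negative value. */
static int to_negative_error(int ret)
{
    if (ret == 0)
        return 0;                 /* success is passed through */
    return ret < 0 ? ret : -ret;  /* positive error codes are negated */
}
```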

Fixes: 2ad7ba9a65 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:31:06 +02:00
Kevin Traynor
06ae9f0f92 bitrate: fix registration to match API description
rte_stats_bitrate_reg() API states it returns 'Zero on success'.

However, the implementation directly returns the return of
rte_metrics_reg_names() which may be zero or positive on success,
with a positive value also indicating the index.

The user of rte_stats_bitrate_reg() should not care about the
index as it is stored in the opaque rte_stats_bitrates struct.

Change the implementation of rte_stats_bitrate_reg() to match
the API description by always returning zero on success.
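
A minimal sketch of that mapping, assuming the lower-level call returns a negative value on error and a non-negative index on success (the helper name is hypothetical):

```c
/* Hypothetical helper: collapse "zero or positive index" into plain success. */
static int reg_result_to_status(int reg_ret)
{
    /* Negative values are errors and are propagated unchanged. */
    return reg_ret < 0 ? reg_ret : 0;
}
```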

Fixes: 2ad7ba9a65 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2021-10-01 15:29:21 +02:00
Stephen Hemminger
71ecc415c5 telemetry: detach threads
There are a number of telemetry threads which are created and
there is nothing that does pthread_join() to wait for them.
Mark these threads as detached, so that the pthread library
can clean up state when the thread exits.
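
For readers unfamiliar with detached threads, a minimal POSIX sketch follows. The function names are hypothetical and this is not the telemetry library code. It only shows a thread being created and then detached so that nobody has to join it.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker that returns immediately. */
static void *idle_worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Create a thread and detach it. Returns 0 or a pthread error number. */
static int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, fn, arg);

    if (rc != 0)
        return rc;
    /* After this call the thread's resources are released when it exits. */
    return pthread_detach(tid);
}
```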

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ciara Power <ciara.power@intel.com>
2021-10-01 14:50:16 +02:00
Cian Ferriter
0203a14c72 ring: fix Doxygen comment of internal function
Change "enqueue" to "dequeue" because the __rte_ring_move_cons_head()
function is updating the consumer head for dequeue.

Fixes: 0dfc98c507 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org

Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2021-10-01 14:39:12 +02:00
William Tu
f1f6ebc0ea eal: remove sys/queue.h from public headers
Currently there are some public headers that include 'sys/queue.h', which
is not POSIX, but usually provided by the Linux/BSD system library.
(Not in POSIX.1, POSIX.1-2001, or POSIX.1-2008. Present on the BSDs.)
The file is missing on Windows. During the Windows build, DPDK uses a
bundled copy, so building a DPDK library works fine.  But when OVS or other
applications use DPDK as a library, because some DPDK public headers
include 'sys/queue.h', on Windows, it triggers an error due to no such
file.

One solution is to install the 'lib/eal/windows/include/sys/queue.h' into
Windows environment, such as [1]. However, this means DPDK exports the
functionalities of 'sys/queue.h' into the environment, which might cause
symbols, macros, headers clashing with other applications.

The patch fixes it by removing the "#include <sys/queue.h>" from
DPDK public headers, so programs including DPDK headers don't depend
on the system to provide 'sys/queue.h'. When these public headers use
macros such as TAILQ_xxx, we replace it by the ones with RTE_ prefix.
For Windows, we copy the definitions from <sys/queue.h> to rte_os.h
in Windows EAL. Note that these RTE_ macros are compatible with
<sys/queue.h>, both at the level of API (to use with <sys/queue.h>
macros in C files) and ABI (to avoid breaking it).

Additionally, the TAILQ_FOREACH_SAFE is not part of <sys/queue.h>,
the patch replaces it with RTE_TAILQ_FOREACH_SAFE.
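
To show what a "safe" traversal macro provides, here is a sketch built on a Linux/BSD <sys/queue.h>. The macro and type names are hypothetical and this is not the DPDK definition. The next pointer is saved before the loop body runs, so the current element can be removed and freed.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical macro: remember the next element before the body executes. */
#define MY_TAILQ_FOREACH_SAFE(var, head, field, tvar) \
    for ((var) = TAILQ_FIRST(head); \
         (var) != NULL && ((tvar) = TAILQ_NEXT(var, field), 1); \
         (var) = (tvar))

struct node {
    int value;
    TAILQ_ENTRY(node) link;
};
TAILQ_HEAD(node_list, node);

/* Remove and free every element while iterating; return the sum of values. */
static int drain_and_sum(struct node_list *head)
{
    struct node *n, *tmp;
    int sum = 0;

    MY_TAILQ_FOREACH_SAFE(n, head, link, tmp) {
        sum += n->value;
        TAILQ_REMOVE(head, n, link);
        free(n);
    }
    return sum;
}

/* Build a list holding 1..count, drain it, and return the sum (-1 on failure). */
static int build_and_drain(int count)
{
    struct node_list head = TAILQ_HEAD_INITIALIZER(head);
    int sum;

    for (int i = 1; i <= count; i++) {
        struct node *n = malloc(sizeof(*n));
        if (n == NULL)
            break;
        n->value = i;
        TAILQ_INSERT_TAIL(&head, n, link);
    }
    sum = drain_and_sum(&head);
    return TAILQ_EMPTY(&head) ? sum : -1;
}
```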

[1] http://mails.dpdk.org/archives/dev/2021-August/216304.html

Suggested-by: Nick Connolly <nick.connolly@mayadata.io>
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2021-10-01 13:09:43 +02:00
Dmitry Kozlyuk
6787d0af94 lib: remove sched.h from public headers
Public headers including POSIX-specific <sched.h> were unusable
on Windows. These includes were superfluous, remove them.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2021-10-01 08:35:05 +02:00
Dmitry Kozlyuk
b7c3eb57bb eal/windows: fix export list
* Version and randomness API were not added to .def file by mistake,
  which is why they were later excluded from the export list.
* Device API stubs were added to EAL but not exported.

Fixes: edd66d57d5 ("eal/windows: add random function")
Fixes: 3d2fcb0e0a ("eal/windows: add device event stubs")
Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-09-30 22:47:43 +02:00
Dmitry Kozlyuk
cf665406b1 eal: remove Windows-specific list of common files
The majority of common EAL sources that are built for all platforms were
listed separately for Windows and for other OS. It seems that developers
adding modules to EAL perceived this as if Windows supported
only a limited subset of modules and only added new ones into another.
Factor the truly common modules into a shared list,
then extend it with modules supported by different platforms.

When the two lists were created, UUID API implementation was removed
from Windows build (apparently by mistake), then excluded from the
export list for no reason other than not being built. Restore it.

Fixes: df3ff6be2b ("eal: simplify meson build of common directory")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2021-09-30 22:47:28 +02:00
William Tu
fe81e52a91 eal/windows: export version function
When OVS inits, it calls rte_version to get the DPDK's version.
The patch fixes the error below by exposing rte_version symbol.
libopenvswitch.a(dpdk.c.obj) : error LNK2019: unresolved external symbol
rte_version referenced in function dpdk_init

Fixes: 5b637a8481 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org

Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2021-09-30 22:47:23 +02:00
Pallavi Kadam
876d40fe6d net: enable random address on Windows
IAVF PMD needs to generate a random MAC address if it is not configured
by host.
'random' is now supported on Windows.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Shivanshu Shukla <shivanshu.shukla@intel.com>
2021-09-30 20:51:11 +02:00
Olivier Matz
f0e18cb4a8 kvargs: fix comments style
A '*' is missing at 2 places, add them.

Fixes: e1a00536c8 ("kvargs: add a new library to parse key/value arguments")
Fixes: 3ab385063c ("kvargs: add get by key")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 17:38:13 +02:00
Olivier Matz
6aebb94290 kvargs: add function to get from key and value
A quite common scenario with kvargs is to look up a <key>=<value> in
a kvlist. For instance, check if name=foo is present in
name=toto,name=foo,name=bar. This is currently done in drivers/bus with
rte_kvargs_process() + the rte_kvargs_strcmp() handler.

This approach is not straightforward, and can be replaced by this new
function.

rte_kvargs_strcmp() is then removed.
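
The lookup described above can be sketched on a plain comma-separated string. The helper below is hypothetical and does not use the rte_kvargs API. It only illustrates matching an exact key=value pair.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if list contains the exact pair key=value. */
static bool kvlist_has_pair(const char *list, const char *key,
                            const char *value)
{
    size_t klen = strlen(key);
    size_t vlen = strlen(value);
    const char *p = list;

    while (*p != '\0') {
        const char *end = strchr(p, ',');
        size_t len = end != NULL ? (size_t)(end - p) : strlen(p);

        /* The item must be exactly: key, '=', value. */
        if (len == klen + 1 + vlen &&
            strncmp(p, key, klen) == 0 &&
            p[klen] == '=' &&
            strncmp(p + klen + 1, value, vlen) == 0)
            return true;

        if (end == NULL)
            break;
        p = end + 1;
    }
    return false;
}
```

With the list from the example, kvlist_has_pair("name=toto,name=foo,name=bar", "name", "foo") is true.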

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 17:38:02 +02:00
Olivier Matz
b36d02ab39 kvargs: promote get from key as stable
The function rte_kvargs_get() is used by eal and pci bus driver since
its introduction in commit 3ab385063c ("kvargs: add get by key") and
commit d2a66ad794 ("bus: add device arguments name parsing"), in
dpdk 21.05.

Let's promote it as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 15:31:01 +02:00
Olivier Matz
4f5520d910 kvargs: promote delimited parsing as stable
This function is used by EAL to parse key/value strings separated with
specified delimiters.

It was introduced in 2018 by commit 5d6af85ab0 ("kvargs: introduce a
more flexible parsing function"), and can be promoted as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-30 15:30:31 +02:00
Raslan Darawsheh
16b8e92d49 ethdev: use extension header for GTP PSC item
This updates the gtp_psc flow item to use the net header
definition of the gtp_psc to be based on RFC 38415-g30

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-28 12:34:58 +02:00
Raslan Darawsheh
e8ca1479cd net: add extension header for GTP PSC
Define new rte header for GTP PDU session container
based on RFC 38415-g30

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-28 12:34:58 +02:00
Thomas Monjalon
0ce56b057b ethdev: group constant definitions in Doxygen
A lot of flags are parts of a group but are documented alone.
The Doxygen syntax @{ and @} for grouping is used
to make flags appear together and have a common description.

Some Rx/Tx offload flags and RSS definitions are not grouped
because they need to be all properly documented first.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2021-09-27 13:34:45 +02:00
Alvin Zhang
81b0fbb85b ethdev: add IPv4 and L4 checksum RSS offload types
This patch defines new RSS offload types for IPv4 and
L4(TCP/UDP/SCTP) checksum, which are required when users want
to distribute packets based on the IPv4 or L4 checksum field.

For example "flow create 0 ingress pattern eth / ipv4 / end
actions rss types ipv4-chksum end queues end / end", this flow
causes all matching packets to be distributed to queues on
basis of IPv4 checksum.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Aman Deep Singh <aman.deep.singh@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2021-09-21 10:25:42 +02:00
Anatoly Burakov
de4ffd50c9 mem: promote some shared memory config API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
27e7e2509c mem: promote DMA mask API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
acddc33b3e mem: promote external memory API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
b893775065 mem: promote memseg API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
437cb6e826 malloc: promote some experimental API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
c335ffdbf7 fbarray: promote experimental API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Anatoly Burakov
1611654bd6 ipc: promote experimental API to stable
As per ABI policy, move the formerly experimental APIs to the stable
section.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2021-09-28 22:07:41 +02:00
Tejasree Kondoj
f0b538a5f8 security: add option to configure tunnel header verification
Add an option to indicate whether outer header verification
needs to be done as part of inbound IPsec processing.

With inline IPsec processing, SA lookup would be happening
in the Rx path of rte_ethdev. When rte_flow is configured to
support more than one SA, SPI would be used to lookup SA.
In such cases, additional verification would be required to
ensure duplicate SPIs are not getting processed in the inline path.

For lookaside cases, the same option can be used by application
to offload tunnel verification to the PMD.

These verifications would help in averting possible DoS attacks.

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 17:40:52 +02:00
Anoob Joseph
ad7515a39f security: add SA lifetime configuration
Add SA lifetime configuration to register soft and hard expiry limits.
Expiry can be in units of number of packets or bytes. Crypto op
status is also updated to include new field, aux_flags, which can be
used to indicate cases such as soft expiry in case of lookaside
protocol operations.

In case of soft expiry, the packets are successfully IPsec processed but
the soft expiry would indicate that SA needs to be reconfigured. For
inline protocol capable ethdev, this would result in an eth event while
for lookaside protocol capable cryptodev, this can be communicated via
`rte_crypto_op.aux_flags` field.

In case of hard expiry, the packets will not be IPsec processed and
would result in error.
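
The soft/hard distinction can be sketched as a simple classification on a packet counter. The type and field names below are hypothetical, and the sketch assumes that a limit of 0 means "no limit" and that a limit is reached when the counter equals it. The real semantics are defined by the security library and the PMD.

```c
#include <stdint.h>

enum sa_expiry {
    SA_LIFETIME_OK,
    SA_LIFETIME_SOFT_EXPIRED, /* packet still processed, SA should be renewed */
    SA_LIFETIME_HARD_EXPIRED  /* packet not processed, reported as an error */
};

/* Hypothetical limits; 0 is treated as "no limit" in this sketch. */
struct sa_lifetime {
    uint64_t packets_soft_limit;
    uint64_t packets_hard_limit;
};

static enum sa_expiry sa_check_packets(const struct sa_lifetime *lt,
                                       uint64_t packets_seen)
{
    /* The hard limit takes precedence over the soft limit. */
    if (lt->packets_hard_limit != 0 &&
        packets_seen >= lt->packets_hard_limit)
        return SA_LIFETIME_HARD_EXPIRED;
    if (lt->packets_soft_limit != 0 &&
        packets_seen >= lt->packets_soft_limit)
        return SA_LIFETIME_SOFT_EXPIRED;
    return SA_LIFETIME_OK;
}
```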

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 14:11:29 +02:00
Anoob Joseph
63992166ba security: support user-specified IV
Enable the user to provide the IV to be used per security
operation. This would be used with lookaside protocol
offload for comparing against known vectors.

By default, PMD would internally generate random IV.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2021-09-28 13:35:32 +02:00
Nithin Dabilpuram
d08dcd28c3 security: add option for faster user/meta data access
Currently the rte_security_set_pkt_metadata() and rte_security_get_userdata()
methods, which set packet metadata on inline outbound and get userdata
after inline inbound processing, always go through driver-specific callbacks.

For drivers that have little to do in the callbacks beyond
updating metadata in the rte_security dynamic field and getting userdata
from the rte_security dynamic field, having to jump to a PMD-specific
callback is a costly per-packet operation. This patch provides
a mechanism to do the same in an inline function and avoid the function
pointer jump if a driver supports it.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-28 08:43:47 +02:00
Nithin Dabilpuram
6d1f8c1319 mbuf: enforce semantics for Tx inline IPsec processing
Not all net PMDs/HW can parse a packet and identify L2 header and
L3 header locations on Tx. This is in line with other Tx offload
requirements such as L3 checksum, L4 checksum offload, etc.,
where mbuf.l2_len, mbuf.l3_len, etc. need to be set for HW to be
able to generate the checksum. Since inline IPsec is also such a Tx
offload, some PMDs need at least mbuf.l2_len to be valid to
find the L3 header and perform outbound IPsec processing.

Hence, this patch updates documentation to enforce setting
mbuf.l2_len while setting PKT_TX_SEC_OFFLOAD in mbuf.ol_flags
for Inline IPsec Crypto / Protocol offload processing to
work on Tx.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2021-09-27 09:55:41 +02:00
Shijith Thotton
a0a388a897 eal: add macro to swap two variables
Add a macro to swap two variables
and update the common autotest for the same.
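
A generic swap macro of this kind can be sketched as follows. The macro name is hypothetical and the sketch relies on the GCC/Clang __typeof__ extension. It is not necessarily how the EAL macro is written.

```c
/* Hypothetical macro: swap two lvalues of the same type. */
#define SWAP_VARS(a, b) \
    do { \
        __typeof__(a) swap_tmp_ = (a); \
        (a) = (b); \
        (b) = swap_tmp_; \
    } while (0)

/* Small helper used to exercise the macro: returns the first value after the swap. */
static int first_after_swap(int a, int b)
{
    SWAP_VARS(a, b);
    return a;
}
```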

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2021-09-27 18:33:45 +02:00
Julien Meunier
6ded44bce4 stack: fix reload head when pop fails
The previous commit 18effad9cf ("stack: reload head when pop fails")
only changed C11 implementation, not generic implementation.

List head must be loaded right before continue (when failed to find the
new head). Without this, one thread might keep trying and failing to pop
items without ever loading the new correct head.

Fixes: 3340202f59 ("stack: add lock-free implementation")
Cc: stable@dpdk.org

Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2021-09-27 17:28:55 +02:00
Xueming Li
eb5636e879 sched: get 64-bit greatest common divisor
This patch adds a new function that computes the greatest common
divisor of 64-bit values, and changes the original 32-bit function
to call this new 64-bit version.
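
The structure described here (a 64-bit routine with the 32-bit one as a thin wrapper) can be sketched with Euclid's algorithm. The function names are hypothetical and the library may use a different GCD algorithm internally.

```c
#include <stdint.h>

/* Hypothetical 64-bit GCD using Euclid's algorithm; gcd(a, 0) == a. */
static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t r = a % b;

        a = b;
        b = r;
    }
    return a;
}

/* The 32-bit variant delegates to the 64-bit one. The result never exceeds
 * the larger input, so the narrowing cast cannot lose information. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)gcd_u64(a, b);
}
```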

Cc: stable@dpdk.org

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2021-09-27 17:24:16 +02:00
Cristian Dumitrescu
175d213bf8 pipeline: improve handling of learner action arguments
The arguments of actions that are learned are now specified as part of
the learn instruction as opposed to being statically specified as part
of the learner table configuration.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:18:49 +02:00
Cristian Dumitrescu
1c6571c837 pipeline: enable pipeline compilation
Commit the pipeline changes when the compilation process is
successful: change the table lookup instructions to execute the action
function for each action, replace the regular pipeline instructions
with the custom instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:10:26 +02:00
Cristian Dumitrescu
f898a475c3 pipeline: build shared object for pipeline
Build the generated C file into a shared object library.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Signed-off-by: Cunming Liang <cunming.liang@intel.com>
2021-09-27 12:10:20 +02:00
Cristian Dumitrescu
724f3ef422 pipeline: generate custom instruction functions
Generate a C function for each custom instruction, which essentially
consolidates multiple regular instructions into a single function call.
The pipeline program is split into groups of instructions, and a
custom instruction is generated for each group that has more than one
instruction. Special care is taken for the instructions that can do thread
yield (RX, extern) and for those that can change the instruction
pointer (TX, near/far jump).

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:54 +02:00
Cristian Dumitrescu
d025528d74 pipeline: generate action functions
Generate a C function for each action. For most instructions, the
associated inline function is called directly. Special care is taken
for TX, jump and return instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:45 +02:00
Cristian Dumitrescu
216bc906d0 pipeline: export pipeline instructions to file
Export the array of translated instructions to a C file. There is one
such array per action and one for the pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:26 +02:00
Cristian Dumitrescu
fc64098a1a pipeline: introduce pipeline compilation
Lay the foundation to generate C code for the pipeline: C functions
for actions and custom instructions are generated, built as shared
object library and loaded into the pipeline.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:15 +02:00
Cristian Dumitrescu
dfa9491a18 pipeline: introduce custom instructions
For better performance, the option to create custom instructions when
the program is translated and add them on-the-fly to the pipeline is
now provided. Multiple regular instructions can now be consolidated
into a single C function optimized by the C compiler directly.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:13 +02:00
Cristian Dumitrescu
5dc6a5f2e7 pipeline: introduce action functions
For better performance, the option to run a single function per action
is now provided, which requires a single function call per action that
can be better optimized by the C compiler, as opposed to one function
call per instruction. Special table lookup instructions are added
to support this feature.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2021-09-27 12:09:11 +02:00