450 Commits

Author SHA1 Message Date
Thomas Monjalon
d1342ea419 mbuf: document guideline for new fields and flags
Since dynamic fields and flags were added in 19.11,
the idea was to use them for new features, not only PMD-specific.

The guideline is made more explicit in doxygen, in the mbuf guide,
and in the contribution design guidelines.

For more information about the original design, see the presentation
https://www.dpdk.org/wp-content/uploads/sites/35/2019/10/DynamicMbuf.pdf

This decision was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-June/169667.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-06-11 09:29:15 +02:00
David Marchand
3d4b2afb73 doc: prefer https when pointing to dpdk.org
for file in $(git grep -l http://.*dpdk.org doc/); do
  sed -i -e 's#http://\(.*dpdk.org\)#https://\1#g' $file;
done

Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2020-05-24 23:42:36 +02:00
Ciara Power
1474a341d5 doc: fix telemetry registration example
The example shown for registering telemetry commands was previously
missing the help text parameter.

Fixes: 24cd1b529f35 ("doc: update telemetry guides")

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
2020-05-24 18:50:58 +02:00
Dharmik Thakkar
975a75c057 doc: add aarch64 generic counter in profiling guide
Add a separate section for low-resolution generic counter
for ARM64 profiling methods.

Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-05-18 20:35:57 +02:00
Matteo Croce
9aef9b9fbc doc: fix LTO config option
The documentation says that CONFIG_ENABLE_LTO enables LTO during the
build, but the correct value actually is CONFIG_RTE_ENABLE_LTO.

Fixes: 098cc0fea3be ("build: add option to enable LTO")
Cc: stable@dpdk.org

Signed-off-by: Matteo Croce <mcroce@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrzej Ostruszka <aostruszka@marvell.com>
2020-05-18 18:14:21 +02:00
Dekel Peled
6b30428820 doc: refine ethernet and VLAN flow rule items
Specified pattern may be translated in different manner.
For example the pattern "eth / ipv4" can be translated to match
untagged packets only, since the pattern doesn't specify a VLAN item.
It can also be translated to match both tagged and untagged packets,
for the same reason.
This patch updates the rte_flow documentation to clearly specify the
required pattern to use.
For example:
To match tagged ipv4 packets, the pattern "eth / vlan / ipv4 / end"
should be used.
To match untagged ipv4 packets, the pattern "eth / ipv4 / end"
should be used.
To match all IPV4 packets, both tagged and untagged, need to apply
two rules with the patterns above.
To match both tagged and untagged packets of any type, the pattern
"eth / end" should be used.

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ori Kam <orika@mellanox.com>
2020-05-11 22:27:39 +02:00
Ciara Power
24cd1b529f doc: update telemetry guides
The existing documentation for Telemetry is updated, and further
documentation is added.

Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-11 00:37:16 +02:00
David Marchand
ebaee64097 trace: simplify trace point headers
Invert the current trace point headers logic by making
rte_trace_point_register.h include rte_trace_point.h.

There is no more need for a RTE_TRACE_POINT_REGISTER_SELECT special macro
since including rte_trace_point_register.h itself means we want to
register trace points.

The unexplained "provider" notion is removed from the documentation and
rte_trace_point_provider.h is merged into rte_trace_point.h.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-05-06 13:50:32 +02:00
Jerin Jacob
4dc6d8e63c doc: add graph library guide
Adding programmer's guide for Graph library and the inbuilt nodes.
This patch also updates the release note for the new libraries.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
2020-05-05 23:46:21 +02:00
Dong Zhou
44bf3c796b ethdev: support flow aging
One of the reasons to destroy a flow is the fact that no packet matches
the flow for "timeout" time.
For example, when TCP\UDP sessions are suddenly closed.

Currently, there is not any DPDK mechanism for flow aging and the
applications use their own ways to detect and destroy aged-out flows.

The flow aging implementation need include:
- A new rte_flow action: RTE_FLOW_ACTION_TYPE_AGE to set the timeout and
  the application flow context for each flow.
- A new ethdev event: RTE_ETH_EVENT_FLOW_AGED for the driver to report
  that there are new aged-out flows.
- A new rte_flow API: rte_flow_get_aged_flows to get the aged-out flows
  contexts from the port.
- Support input flow aging command line in Testpmd.

The new event type addition in the enum is flagged as an ABI breakage,
so an ignore rule is added for these reasons:
- It is not changing value of existing types (except MAX)
- The new value is not used by existing API if the event is not
  registered
In general, it is safe adding new ethdev event types at the end of the
enum, because of event callback registration mechanism.

Signed-off-by: Dong Zhou <dongz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2020-04-21 17:34:05 +02:00
Jerin Jacob
9f8e1810f6 doc: add trace library guide
Add programmer's guide for trace library support.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:40:12 +02:00
Jerin Jacob
27db82c709 trace: introduce new subsystem
Define the public API for trace support.
This patch also adds support for the build infrastructure and
update the MAINTAINERS file for the trace subsystem.

The 8 bytes tracepoint object is a global variable, and can be used in
fast path. Created a new __rte_trace_point section to store the
tracepoint objects as,
- It is a mostly read-only data and not to mix with other "write"
  global variables.
- Chances that the same subsystem fast path variables come in the same
  fast path cache line. i.e, it will enable a more predictable
  performance number from build to build.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-04-23 15:39:06 +02:00
Ruifeng Wang
1d301a8a25 doc: add RCU integration design details
Add a section to describe a design to integrate QSBR RCU library
with other libraries in DPDK.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-04-22 20:46:00 +02:00
Konstantin Ananyev
664ff4b172 ring: introduce peek style API
For rings with producer/consumer in RTE_RING_SYNC_ST, RTE_RING_SYNC_MT_HTS
mode, provide an ability to split enqueue/dequeue operation
into two phases:
      - enqueue/dequeue start
      - enqueue/dequeue finish
That allows user to inspect objects in the ring without removing
them from it (aka MT safe peek).

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-04-21 12:52:55 +02:00
Konstantin Ananyev
1cc363b8ce ring: introduce HTS ring mode
Introduce head/tail sync mode for MT ring synchronization.
In that mode enqueue/dequeue operation is fully serialized:
only one thread at a time is allowed to perform given op.
Suppose to reduce stall times in case when ring is used on
overcommitted cpus (multiple active threads on the same cpu).

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-04-21 12:52:55 +02:00
Konstantin Ananyev
e6ba4731c0 ring: introduce RTS ring mode
Introduce relaxed tail sync (RTS) mode for MT ring synchronization.
Aim to reduce stall times in case when ring is used on
overcommited cpus (multiple active threads on the same cpu).
The main difference from original MP/MC algorithm is that
tail value is increased not by every thread that finished enqueue/dequeue,
but only by the last one.
That allows threads to avoid spinning on ring tail value,
leaving actual tail value change to the last thread in the update queue.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-04-21 12:52:55 +02:00
Konstantin Ananyev
ebff988d0c ring: prepare ring to allow new sync schemes
To make these preparations two main things are done:
- Change from *single* to *sync_type* to allow different
  synchronisation schemes to be applied.
  Mark *single* as deprecated in comments.
  Add new functions to allow user to query ring sync types.
  Replace direct access to *single* with appropriate function call.
- Move actual rte_ring and related structures definitions into a
  separate file: <rte_ring_core.h>. It allows to refer contents
  of <rte_ring_elem.h> from <rte_ring.h> without introducing a
  circular dependency.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-04-21 11:34:09 +02:00
Thomas Monjalon
ef5baf3486 replace packed attributes
There is a common macro __rte_packed for packing structs,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2020-04-16 18:16:46 +02:00
Xiao Zhang
ecbc857013 ethdev: add PFCP header to flow API
This patch adds the new flow item RTE_FLOW_ITEM_TYPE_PFCP to flow API to
match a PFCP header.
Add sample PFCP rules for testpmd guide. Since Session Endpoint
Identifier (SEID) only will be present in PFCP Session header and PFCP
Session headers shall be identified when the S field is equal to 1, when
create rules for PFCP Session header with certain SEID the S field need
be set 1.

Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
2020-03-18 10:21:42 +01:00
Prateek Agarwal
fa00525b9a doc: fix multi-producer enqueue figure in ring guide
The producer head pointer in multi producer enqueue fig.6.10
points to incorrect object in the ring array.

Fixes: fc1f2750a3ec ("doc: programmers guide")
Cc: stable@dpdk.org

Signed-off-by: Prateek Agarwal <prateekag@cse.iitb.ac.in>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-02-21 18:41:46 +01:00
Prateek Agarwal
07d311cc35 doc: fix quiescent state description in RCU guide
The quiescent state description refers to an incorrect
thread.

Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org

Signed-off-by: Prateek Agarwal <prateekag@cse.iitb.ac.in>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-02-21 18:41:11 +01:00
Hari Kumar Vemula
adbeba3639 doc: add a guide to run unit tests with meson
Add a programmer's guide section for meson ut

Signed-off-by: Hari Kumar Vemula <hari.kumarx.vemula@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Michael Santana <msantana@redhat.com>
2020-02-16 11:30:30 +01:00
Honnappa Nagarahalli
e89680d024 doc: update ring guide
Changed the rte_ring chapter in programmer's guide to reflect
the addition of rte_ring_xxx_elem APIs. References to pointers
as ring elements is changed to generic term 'objects'.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
2020-02-15 17:48:57 +01:00
Marcin Smoczynski
957394f726 ipsec: support CPU crypto mode
Update library to handle CPU cypto security mode which utilizes
cryptodev's synchronous, CPU accelerated crypto operations.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-02-05 15:29:59 +01:00
Marcin Smoczynski
5d6d7e443d security: add CPU crypto action type
Introduce CPU crypto action type allowing to differentiate between
regular async 'none security' and synchronous, CPU crypto accelerated
sessions.

This mode is similar to ACTION_TYPE_NONE but crypto processing is
performed synchronously on a CPU.

Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-02-05 15:29:49 +01:00
Marcin Smoczynski
7adf992fb9 cryptodev: introduce CPU crypto API
Add new API allowing to process crypto operations in a synchronous
manner. Operations are performed on a set of SG arrays.

Cryptodevs which allows CPU crypto operation mode have to
use RTE_CRYPTODEV_FF_SYM_CPU_CRYPTO capability.

Add a helper method to easily convert mbufs to a SGL form.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Marcin Smoczynski <marcinx.smoczynski@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2020-02-05 15:29:14 +01:00
Jerin Jacob
3f2d6766e3 mempool: remove memory wastage on non-x86
The existing optimize_object_size() function address the memory object
alignment constraint on x86 for better performance.

Different (micro) architecture may have different memory alignment
constraint for better performance and it not the same as the existing
optimize_object_size().

Some use, XOR(kind of CRC) scheme to enable DRAM channel distribution
based on the address and some may have a different formula.

Introducing arch_mem_object_align() function to abstract
the difference between different (micro) architectures to avoid
wasting memory for mempool object alignment for the architecture
that it is not required to do so.

Details on the amount of memory saving:

Currently, arm64 based architectures use the default (nchan=4,
nrank=1). The worst case is for an object whose size (including mempool
header) is 2 cache lines, where it is optimized to 3 cache lines (+50%).

Examples for cache lines size = 64:
  orig     optimized
  64    -> 64           +0%
  128   -> 192          +50%
  192   -> 192          +0%
  256   -> 320          +25%
  320   -> 320          +0%
  384   -> 448          +16%
  ...
  2304  -> 2368         +2.7%  (~mbuf size)

Additional details:
https://www.mail-archive.com/dev@dpdk.org/msg149157.html

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2020-01-20 16:37:27 +01:00
Rory Sexton
65388f4c4c ethdev: add L2TPv3 over IP header to flow API
This patch adds the new flow item RTE_FLOW_ITEM_TYPE_L2TPV3OIP to
flow API to match a L2TPv3 over IP header. This patch supports only
L2TPv3 over IP header format which is different to L2TPv2/L2TPv3
over UDP. The difference in header formats between L2TPv3 over IP
and L2TP over UDP require a separate implementation for each.

Signed-off-by: Rory Sexton <rory.sexton@intel.com>
Signed-off-by: Dariusz Jagus <dariuszx.jagus@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
2020-01-17 19:46:26 +01:00
Suanming Mou
8482ffe4b6 ethdev: add IPv4/IPv6 DSCP rewrite action
For some overlay network, such as VXLAN, the DSCP field in the new outer
IP header after VXLAN decapsulation may need to be updated accordingly.

This commit introduce the DSCP modify action for IPv4 and IPv6.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ori Kam <orika@mellanox.com>
2020-01-17 19:46:02 +01:00
Ferruh Yigit
de480bbf13 kni: fix build with Linux 4.9.x
The 'get_user_pages_remote()' API is updated in kernel 4.10.0 [1],
but the check added as > 4.9.0,
this logic is broken for kernels 4.9.x, because they justify
> 4.9.0 check but have the old API.

Fixing the check as >= 4.10.0

[1]
commit 5b56d49fc31d ("mm: add locked parameter to get_user_pages_remote()")

Fixes: d965af9e8ae1 ("kni: increase kernel version requirement for VA")

Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-28 14:48:24 +01:00
Anatoly Burakov
ebf9c7b1db doc: fix a typo in EAL guide
The correct name for virt2memseg API is `rte_mem_virt2memseg`, not
`rte_virt2memseg`.

Fixes: 950e8fb4e194 ("mem: allow registering external memory areas")
Cc: stable@dpdk.org

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-11-26 17:44:09 +01:00
Jasvinder Singh
694fd2cb8d doc: update QoS scheduler guides
Updates documentation to reflect the changes in the QoS scheduler
library and example.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2019-11-26 16:13:14 +01:00
Stephen Hemminger
06710448c9 remove blank lines at end of file
Remove trailing blank lines. They serve no purpose and are just
editor leftovers.
These can cause git to complain about whitespace errors during merges.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2019-11-26 00:12:08 +01:00
Ferruh Yigit
d965af9e8a kni: increase kernel version requirement for VA
A build error reported related to the selected 'get_user_pages_remote()'
kernel API:

.../kernel/linux/kni/kni_dev.h:113:8:
  error: too few arguments to function ‘get_user_pages_remote’
  ret = get_user_pages_remote(tsk, tsk->mm, iova, 1
        ^~~~~~~~~~~~~~~~~~~~~

Currently there are three versions of the 'get_user_pages_remote()'
supported, based on kernel version < 4.9, = 4.9, > 4.9.

These version based checks are not working fine with the distro kernels
which is the cause of reported build error. The error reported by the
kernel version 4.8, but it is using API defined in > 4.9.

To be able to take control of this, and possible more, related build
error, increasing the minimum supported kernel version for iova=va with
KNI to kernel version 4.9.

This leaves us with single version of the kernel API and more manageable.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-11-21 00:18:02 +01:00
Vamsi Attunuru
a0dede62a5 eal/linux: remove KNI restriction on IOVA
Now that KNI supports VA (with kernel versions starting 4.6.0), we can
accept IOVA as VA, but KNI must be configured for this.
Pass iova_mode when creating KNI netdevs.

So far, IOVA detection policy forced IOVA as PA when KNI is loaded,
whatever the buses IOVA requirements were.

We can now use IOVA as VA, but this comes with a cost in KNI.
When no constraint is expressed by the buses, keep the current behavior
of choosing PA.

Note: this change supposes that dpdk is built on the same kernel than
the target system kernel; no objection has been expressed on this topic.

Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2019-11-18 16:00:51 +01:00
David Marchand
f43d3dbbd9 doc/guides: clean repeated words
Shoot repeated words in all our guides.

Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2019-11-15 11:36:27 +01:00
David Marchand
43628b3df2 doc: fix internal links for older releases
Using external explicit references to http://doc.dpdk.org makes older
releases documentation point to the current master documentation pages.
Switch to internal references.

Fixes: 59ad25fe2184 ("doc: add overview of qat guide")
Fixes: 30e7fbd62839 ("doc: add event timer adapter guide")
Fixes: b7f859c9a9a5 ("doc: add switch representation documentation")
Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2019-11-15 09:58:01 +01:00
Viacheslav Ovsiienko
9bf26e1318 ethdev: move egress metadata to dynamic field
The dynamic mbuf fields were introduced by [1]. The egress metadata is
good candidate to be moved from statically allocated field tx_metadata to
dynamic one. Because mbufs are used in half-duplex fashion only, it is
safe to share this dynamic field with ingress metadata.

The shared dynamic field contains either egress (if application going to
transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst)
metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set
along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior accessing the data.

The availability of dynamic mbuf metadata field can be checked with
rte_flow_dynf_metadata_avail() routine.

DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed.
The metadata support in PMDs is engaged on dynamic field registration.

Metadata feature is getting complex. We might have some set of actions
and items that might be supported by PMDs in multiple combinations,
the supported values and masks are the subjects to query by perfroming
trials (with rte_flow_validate).

[1] http://patches.dpdk.org/patch/62040/

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-11-08 23:15:05 +01:00
Viacheslav Ovsiienko
e02ecc1324 ethdev: extend flow metadata
Currently, metadata can be set on egress path via mbuf tx_metadata field
with PKT_TX_METADATA flag and RTE_FLOW_ITEM_TYPE_META matches metadata.

This patch extends the metadata feature usability.

1) RTE_FLOW_ACTION_TYPE_SET_META

When supporting multiple tables, Tx metadata can also be set by a rule and
matched by another rule. This new action allows metadata to be set as a
result of flow match.

2) Metadata on ingress

There's also need to support metadata on ingress. Metadata can be set by
SET_META action and matched by META item like Tx. The final value set by
the action will be delivered to application via metadata dynamic field of
mbuf which can be accessed by RTE_FLOW_DYNF_METADATA() macro or with
rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper
routines. PKT_RX_DYNF_METADATA flag will be set along with the data.

The mbuf dynamic field must be registered by calling
rte_flow_dynf_metadata_register() prior to use SET_META action.

The availability of dynamic mbuf metadata field can be checked
with rte_flow_dynf_metadata_avail() routine.

If application is going to engage the metadata feature it registers
the metadata  dynamic fields, then PMD checks the metadata field
availability and handles the appropriate fields in datapath.

For loopback/hairpin packet, metadata set on Rx/Tx may or may not be
propagated to the other path depending on hardware capability.

MARK and METADATA look similar and might operate in similar way,
but not interacting.

Initially, there were proposed two metadata related actions:

- RTE_FLOW_ACTION_TYPE_FLAG
- RTE_FLOW_ACTION_TYPE_MARK

These actions set the special flag in the packet metadata, MARK action
stores some specified value in the metadata storage, and, on the packet
receiving PMD puts the flag and value to the mbuf and applications can
see the packet was threated inside flow engine according to the appropriate
RTE flow(s). MARK and FLAG are like some kind of gateway to transfer some
per-packet information from the flow engine to the application via
receiving datapath. Also, there is the item of type RTE_FLOW_ITEM_TYPE_MARK
provided. It allows us to extend the flow match pattern with the capability
to match the metadata values set by MARK/FLAG actions on other flows.

From the datapath point of view, the MARK and FLAG are related to the
receiving side only. It would useful to have the same gateway on the
transmitting side and there was the feature of type RTE_FLOW_ITEM_TYPE_META
was proposed. The application can fill the field in mbuf and this value
will be transferred to some field in the packet metadata inside the flow
engine. It did not matter whether these metadata fields are shared because
of MARK and META items belonged to different domains (receiving and
transmitting) and could be vendor-specific.

So far, so good, DPDK proposes some entities to control metadata inside
the flow engine and gateways to exchange these values on a per-packet basis
via datapaths.

As we can see, the MARK and META means are not symmetric, there is absent
action which would allow us to set META value on the transmitting path.
So, the action of type:

- RTE_FLOW_ACTION_TYPE_SET_META was proposed.

The next, applications raise the new requirements for packet metadata.
The flow ngines are getting more complex, internal switches are introduced,
multiple ports might be supported within the same flow engine namespace.
From the DPDK points of view, it means the packets might be sent on one
eth_dev port and received on the other one, and the packet path inside
the flow engine entirely belongs to the same hardware device. The simplest
example is SR-IOV with PF, VFs and the representors. And there is a
brilliant opportunity to provide some out-of-band channel to transfer
some extra data from one port to another one, besides the packet data
itself. And applications would like to use this opportunity.

It is supposed for application to use trials (with rte_flow_validate)
to detect which metadata features (FLAG, MARK, META) actually supported
by PMD and underlying hardware. It might depend on PMD configuration,
system software, hardware settings, etc., and should be detected
in run time.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-11-08 23:15:04 +01:00
Viacheslav Ovsiienko
9a2f44c762 ethdev: add flow tag
A tag is a transient data which can be used during flow match. This can be
used to store match result from a previous table so that the same pattern
need not be matched again on the next table. Even if outer header is
decapsulated on the previous match, the match result can be kept.

Some device expose internal registers of its flow processing pipeline and
those registers are quite useful for stateful connection tracking as it
keeps status of flow matching. Multiple tags are supported by specifying
index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
2019-11-08 23:15:04 +01:00
Andrzej Ostruszka
098cc0fea3 build: add option to enable LTO
This patch adds an option to enable link time optimization.  In addition
to LTO option itself (-flto) fat-lto-objects are being used.  This is
because during the build pmdinfogen scans the generated ELF objects to
find this_pmd_name* symbol in symbol table.  Without fat-lto-objects gcc
produces ELF only with extra symbols for internal use during linking.

Signed-off-by: Andrzej Ostruszka <aostruszka@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2019-11-08 15:17:05 +01:00
Kiran Kumar K
01b3156d33 ethdev: add HIGIG2 key field to flow API
Add new rte_flow_item_higig2_hdr in order to match higig2 header.
It is a layer 2.5 protocol and used in Broadcom switches.
Header format is based on the following document.
http://read.pudn.com/downloads558/doc/comm/2301468/HiGig_protocol.pdf

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-23 16:43:10 +02:00
Flavio Leitner
c3ff0ac70a vhost: improve performance by supporting large buffer
The rte_vhost_dequeue_burst supports two ways of dequeuing data.
If the data fits into a buffer, then all data is copied and a
single linear buffer is returned. Otherwise it allocates
additional mbufs and chains them together to return a multiple
segments mbuf.

While that covers most use cases, it forces applications that
need to work with larger data sizes to support multiple segments
mbufs. The non-linear characteristic brings complexity and
performance implications to the application.

To resolve the issue, add support to attach external buffer
to a pktmbuf and let the host provide during registration if
attaching an external buffer to pktmbuf is supported and if
only linear buffer are supported.

Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-23 16:43:09 +02:00
Kiran Kumar K
67f8d7b620 ethdev: add AH key field to flow API
Add new rte_flow_item_ah in order to match the Authentication Header
based on RFC 2402.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-23 16:43:08 +02:00
Kiran Kumar K
30f9f9f451 ethdev: add IGMP key field to flow API
Add new rte_flow_item_igmp in order to match the Internet Group
Management Protocol based on RFC 2236.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-23 16:43:08 +02:00
Kiran Kumar K
86e1974a42 ethdev: add NSH key field to flow API
Add new rte_flow_item_nsh in order to match the network service header
based on RFC 8300.

Signed-off-by: Kiran Kumar K <kirankumark@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-23 16:43:08 +02:00
Vladimir Medvedkin
b2ee269267 ipsec: add SAD add/delete/lookup implementation
Replace rte_ipsec_sad_add(), rte_ipsec_sad_del() and
rte_ipsec_sad_lookup() stubs with actual implementation.

It uses three librte_hash tables each of which contains
an entries for a specific SA type (either it is addressed by SPI only
or SPI+DIP or SPI+DIP+SIP)

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Vladimir Medvedkin
3feb23609c ipsec: add SAD create/destroy implementation
Replace rte_ipsec_sad_create(), rte_ipsec_sad_destroy() and
rte_ipsec_sad_find_existing() API stubs with actual
implementation.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Vladimir Medvedkin
401633d9c1 ipsec: add inbound SAD API
According to RFC 4301 IPSec implementation needs an inbound SA database
(SAD).
For each incoming inbound IPSec-protected packet (ESP or AH) it has to
perform a lookup within it's SAD.
Lookup should be performed by:
Security Parameters Index (SPI) + destination IP (DIP) + source IP (SIP)
or SPI + DIP
or SPI only
and an implementation has to return the 'longest' existing match.
This patch extend DPDK IPsec library with inbound security association
database (SAD) API implementation that:
- conforms to the RFC requirements above
- can scale up to millions of entries
- supports fast lookups
- supports incremental updates

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Konstantin Ananyev
b53e6272cb doc: fix list of unsupported features in IPsec guide
List of unsupported features doesn't reflect latest changes.

Fixes: cd5b860c1851 ("ipsec: support header construction")
Fixes: 2c1887fad075 ("ipsec: fix transport mode for IPv6 with extensions")

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
2019-10-23 16:57:06 +02:00