Commit Graph

5604 Commits

Author SHA1 Message Date
Vladimir Medvedkin
b2ee269267 ipsec: add SAD add/delete/lookup implementation
Replace rte_ipsec_sad_add(), rte_ipsec_sad_del() and
rte_ipsec_sad_lookup() stubs with actual implementation.

It uses three librte_hash tables each of which contains
an entries for a specific SA type (either it is addressed by SPI only
or SPI+DIP or SPI+DIP+SIP)

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Vladimir Medvedkin
3feb23609c ipsec: add SAD create/destroy implementation
Replace rte_ipsec_sad_create(), rte_ipsec_sad_destroy() and
rte_ipsec_sad_find_existing() API stubs with actual
implementation.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Vladimir Medvedkin
401633d9c1 ipsec: add inbound SAD API
According to RFC 4301 IPSec implementation needs an inbound SA database
(SAD).
For each incoming inbound IPSec-protected packet (ESP or AH) it has to
perform a lookup within it's SAD.
Lookup should be performed by:
Security Parameters Index (SPI) + destination IP (DIP) + source IP (SIP)
or SPI + DIP
or SPI only
and an implementation has to return the 'longest' existing match.
This patch extend DPDK IPsec library with inbound security association
database (SAD) API implementation that:
- conforms to the RFC requirements above
- can scale up to millions of entries
- supports fast lookups
- supports incremental updates

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-23 16:57:06 +02:00
Julien Meunier
3dd4435cf4 cryptodev: fix checks related to device id
Each cryptodev are indexed with dev_id in the global rte_crypto_devices
variable. nb_devs is incremented / decremented each time a cryptodev is
created / deleted. The goal of nb_devs was to prevent the user to get an
invalid dev_id.

Let's imagine DPDK has configured N cryptodevs. If the cryptodev=1 is
removed at runtime, the latest cryptodev N cannot be accessible, because
nb_devs=N-1 with the current implementaion.

In order to prevent this kind of behavior, let's remove the check with
nb_devs and iterate in all the rte_crypto_devices elements: if data is
not NULL, that means a valid cryptodev is available.

Also, remove max_devs field and use RTE_CRYPTO_MAX_DEVS in order to
unify the code.

Fixes: d11b0f30df ("cryptodev: introduce API and framework for crypto devices")
Cc: stable@dpdk.org

Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-10-23 16:57:06 +02:00
Arek Kusztal
f2b2a44971 cryptodev: add asymmetric session-less
This commit adds asymmetric session-less option to
rte_crypto_asym_op. Feature flag for session-less is added
to rte_cryptodev.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
2019-10-23 16:57:06 +02:00
David Marchand
8e35792c53 eal: remove dead code on NUMA node detection
RTE_EAL_ALLOW_INV_SOCKET_ID had been introduced and documented as used
with xen dom0 support (dropped for some time now).

Closely looking at this, the code was changed later and ensures that the
socket id is in the [0..RTE_MAX_NUMA_NODES] range anyway.

Let's drop this dead code and the build option with it.

Fixes: 94ef296414 ("eal/linux: fix numa node detection")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-24 14:15:28 +02:00
David Christensen
ed5d3d5cdb eal/linux: restore specific hugepage ordering for ppc
An ifdef present in eal_memory.c references "RTE_ARCH_PPC64" when
it should actually use "RTE_ARCH_PPC_64".  Simple testing revealed
that both the PPC_64 and non-PPC_64 versions of the code involved
work, but the PPC_64 version of the code is retained to be
consistent with other instances in the same file where mmapped
memory is accessed in reverse order on Power platforms.

Fixes: 66cc45e293 ("mem: replace memseg with memseg lists")
Cc: stable@dpdk.org

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-24 14:15:10 +02:00
Morten Brørup
0f824df6f8 mbuf: add bulk free function
Add function for freeing a bulk of mbufs.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-24 02:45:40 +02:00
Bruce Richardson
47cce54ba8 build: allow stricter fallthrough warnings
DPDK currently compiles with implicit-fallthrough=2 warning level. With gcc
-Wextra flag, the default level is 3, so some minor changes are needed to
support this in DPDK.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
2019-10-24 01:02:30 +02:00
Bruce Richardson
7f8f7f4d0a build: process dependencies before main build check
If we want to add support for turning off components because of missing
dependencies, then we need to check for those dependencies before we
make a determination as to whether a component should be built or not,
assuming that the component says it should be built.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-24 01:02:28 +02:00
Bruce Richardson
ae783b42c4 build: print out dependency names for clarity
To help developers to get the correct dependency name e.g. when creating a
new example that depends on a specific component, print out the dependency
name for each lib/driver as it is processed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2019-10-23 16:41:06 +02:00
Nipun Gupta
b21302a107 eventdev: add Tx flag for packets with same destination
This patch introduces a `flag` in the Eth TX adapter enqueue API.
Some drivers may support burst functionality only with the packets
having same destination device and queue.

The flag `RTE_EVENT_ETH_TX_ADAPTER_ENQUEUE_SAME_DEST` can be used
to indicate this so the underlying driver, for drivers to utilize
burst functionality appropriately.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2019-10-18 10:03:08 +02:00
David Marchand
08be0e0b68 rcu: fix reference to offline function
Fixes: 64994b56cf ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2019-10-21 21:21:30 +02:00
Honnappa Nagarahalli
33466e0fe1 rcu: update QS only when there are updates from writer
When the writer is checking the quiescent state status, it is not
deleting any entries in the data structure. This means, the readers
do not need to update their quiescent state during that period.
Readers update the quiescent state only when there are updates
available from the writer.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
2019-10-21 17:54:41 +02:00
Honnappa Nagarahalli
1f90d32ce1 rcu: add least acknowledged token optimization
When the rte_rcu_qsbr_check API is called, it is possible to
calculate the least valued token acknowledged by all the readers.
When the API is called next time, the readers' token counters do
not need to be scanned if the value of the token being queried is
less than the last least token acknowledged. This avoids the
cache line bounces between readers and writer.

Fixes: 64994b56cf ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
2019-10-21 17:54:40 +02:00
David Marchand
384b0a33fe clean bare metal support traces
Bare metal support has been gone for quite some time but we still had
some checks on system includes.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2019-10-21 16:19:00 +02:00
Phil Yang
7911ba0473 stack: enable lock-free implementation for aarch64
Enable both C11 atomic and non C11 atomic lock-free stack for aarch64.

Introduced a new header to reduce the ifdef clutter across generic and C11
files. The rte_stack_lf_stubs.h contains stub implementations of
__rte_stack_lf_count, __rte_stack_lf_push_elems and
__rte_stack_lf_pop_elems.

Suggested-by: Gage Eads <gage.eads@intel.com>
Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2019-10-21 10:15:57 +02:00
Phil Yang
7e2c3e17fe eal/arm64: add 128-bit atomic compare exchange
This patch adds the implementation of the 128-bit atomic compare
exchange API on aarch64. Using 64-bit 'ldxp/stxp' instructions
can perform this operation. Moreover, on the LSE atomic extension
accelerated platforms, it is implemented by 'casp' instructions for
better performance.

Since the '__ARM_FEATURE_ATOMICS' flag only supports GCC-9, this
patch adds a new config flag 'RTE_ARM_FEATURE_ATOMICS' to enable
the 'cas' version on older version compilers.
For octeontx2, we make sure that the lse (and other) extensions are
enabled even if the compiler does not know of the octeontx2 target
cpu.

Since direct x0 register used in the code and cas_op_name() and
rte_atomic128_cmp_exchange() is inline function, based on parent
function load, it may corrupt x0 register aka break aarch64 ABI.
Define CAS operations as rte_noinline functions to avoid an ABI
break [1].

1: https://git.dpdk.org/dpdk/commit/?id=5b40ec6b9662

Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Tested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-10-21 10:06:13 +02:00
Jim Harris
b30b134f82 eal: calibrate TSC only in primary process
This ensures secondary processes never have to calculate the TSC rate
themselves, which can be noticeable in VMs that don't have access to
arch-specific detection mechanism (such as CPUID leaf 0x15 or MSR 0xCE
on x86).

Since rte_mem_config is now internal to the EAL library, we can add
tsc_hz without ABI breakage concerns.

Reduces rte_eal_init() execution time in a secondary process from 165ms
to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2019-10-18 13:23:10 +02:00
Ruifeng Wang
b36f587f01 rcu: fix spurious thread unregister
Thread unregister returns success while unregister not been performed.
This is due to incorrect thread registration status check.
Fix this issue by correcting bitmap check.

Fixes: 64994b56cf ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-10-18 06:13:36 +02:00
Nikhil Rao
e484ccddbe service: avoid false sharing on core state
For a valid service, the core mask of the service
is checked against the current core and the corresponding
entry in the active_on_lcore array is set or reset.

Upto 8 cores share the same cache line for their
service active_on_lcore array entries since each entry is a uint8_t.
Some number of these entries also share the cache line with
the internal_flags member of struct rte_service_spec_impl,
hence this false sharing also makes the service_valid() check
expensive.

Eliminate false sharing by moving the active_on_lcore array to
a per-core data structure. The array is now indexed by service id.

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: Gage Eads <gage.eads@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2019-10-18 06:09:24 +02:00
Jim Harris
c1077933d4 timer: remove useless check on x86 TSC reliability
This code was added 7+ years ago in
commit fb022b85ba ("timer: check TSC reliability")
presumably when variant TSCs were still somewhat common.

But this code doesn't do anything except print a warning,
and the warning doesn't give any kind of advice to the user,
so let's just remove it.

While the warning has no functional meaning, the /proc/cpuinfo
parsing consumes a non-trivial amount of time which is especially
noticeable in secondary processes.
On my test system, it consumes 21ms out of the 66ms total execution
time for rte_eal_init() in a secondary process.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2019-10-17 09:47:42 +02:00
Hemant Agrawal
ad4305d0d5 eal/ppc: add SPDX license tag
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
2019-10-17 06:59:15 +02:00
David Christensen
72e69d801b eal/ppc: fix 64-bit atomic exchange operation
The rte_atomic64_exchange operation for ppc_64 incorrectly linked
back to a 32 bit generic operation (__atomic_exchange_4) rather than
the 64 bit generic operation (__atomic_exchange_8).  As a result,
applications that used rte_eth_link_get_nowait() would only receive
the link speed, they would not receive the link state, link duplex,
or link autoneg properties.

Fixes: ff2863570f ("eal: introduce atomic exchange operation")
Cc: stable@dpdk.org

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-10-17 06:59:11 +02:00
Stephen Hemminger
c3a90c381d mbuf: add a copy routine
This is a commonly used operation that surprisingly the
DPDK has not supported. The new rte_pktmbuf_copy does a
deep copy of packet. This is a complete copy including
meta-data.

It handles the case where the source mbuf comes from a pool
with larger data area than the destination pool. The routine
also has options for skipping data, or truncating at a fixed
length.

This patch also introduces internal inline to copy the
metadata fields of mbuf.

Add a test for this new function, based of the clone tests.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-16 12:43:53 +02:00
Stephen Hemminger
1d2db47c9f mbuf: deinline clone function
Cloning mbufs requires allocations and iteration
and therefore should not be an inline.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-16 12:42:04 +02:00
Stephen Hemminger
6b1dd3be54 mbuf: deinline linearize function
This copy part of this function is too big to be put inline.
The places it is used are only in special exception paths
where a highly fragmented mbuf arrives at a device that can't handle it.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2019-10-16 12:42:04 +02:00
Olivier Matz
a2b5a8722f mempool: clarify default populate function
No functional change.
Clarify the populate function to make future changes easier to
understand.

Rename the variables:
- to avoid negation in the name
- to have more understandable names

Remove useless variable (no_pageshift is equivalent to pg_sz == 0).

Remove duplicate affectation of "external" variable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-10-16 10:41:21 +02:00
Xiaolong Ye
b34801d1aa kni: support allmulticast mode set
This patch adds support to allow users enable/disable allmulticast mode for
kni interface.

This requirement comes from bugzilla 312, more details can refer to:
https://bugs.dpdk.org/show_bug.cgi?id=312

Bugzilla ID: 312

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-15 21:16:32 +02:00
Stephen Hemminger
a8f8ae1cf9 service: use log for error messages
EAL should always use rte_log instead of putting errors to
stderr (which maybe redirected to /dev/null in a daemon).

Also checks for null before rte_free are unnecessary.
Minor code consistency improvements.

Fixes: 21698354c8 ("service: introduce service cores concept")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
2019-10-15 20:37:11 +02:00
Arnon Warshavsky
75dbb45f28 eal: fix mapping leak in secondary process
Have rte_eal_config_reattach clean up the mapped address which is a valid
address but not the one intended.

Coverity issue: 343439
Fixes: 4e8854ae89 ("eal: do not panic on shared memory init")
Fixes: b149a70642 ("eal/freebsd: add config reattach in secondary process")
Cc: stable@dpdk.org

Signed-off-by: Arnon Warshavsky <arnon@qwilt.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2019-10-15 20:37:11 +02:00
Jim Harris
773a860aef vfio: fix leak with multiprocess
The code checks both rte_mp_request_sync() return code and that the number
of messages in the reply equals 1.  If rte_mp_request_sync() succeeds but
there was more than one message, those messages would get leaked.

Found via code review by Anatoly Burakov of patches that used the vhost
code as a template for using rte_mp_request_sync().

Fixes: 83a73c5fef ("vfio: use generic multi-process channel")
Cc: stable@dpdk.org

Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-10-15 20:36:58 +02:00
Jerin Jacob
7fa2537226 bpf: hide internal program argument type
RTE_BPF_ARG_PTR_STACK is used as internal program
arg type. Rename to RTE_BPF_ARG_RESERVED to
avoid exposing internal program type.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:27:19 +02:00
Jerin Jacob
082482cef4 bpf/arm: add branch operation
Add branch and call operations.

jump_offset_* APIs used for finding the relative offset
to jump w.r.t current eBPF program PC.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:29 +02:00
Jerin Jacob
2acfae37f6 bpf/arm: add atomic-exchange-and-add operation
Implement XADD eBPF instruction using STADD arm64 instruction.
If the given platform does not have atomics support,
use LDXR and STXR pair for critical section instead of STADD.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:29 +02:00
Jerin Jacob
e00906bdc7 bpf/arm: add load and store operations
Add load and store operations.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:29 +02:00
Jerin Jacob
2b6d22fa9a bpf/arm: add byte swap operations
add le16, le32, le64, be16, be32 and be64 operations.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:29 +02:00
Jerin Jacob
9f4469d9e8 bpf/arm: add logical operations
Add OR, AND, NEG, XOR, shift operations for immediate
and source register variants.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:29 +02:00
Jerin Jacob
111e2a747a bpf/arm: add basic arithmetic operations
Add mov, add, sub, mul, div and mod arithmetic
operations for immediate and source register variants.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:28 +02:00
Jerin Jacob
f3e5167724 bpf/arm: add prologue and epilogue
Add prologue and epilogue as per arm64 procedure call standard.

As an optimization the generated instructions are
the function of whether eBPF program has stack and/or
CALL class.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:25 +02:00
Jerin Jacob
6861c01001 bpf/arm: add build infrastructure
Add build infrastructure and documentation
update for arm64 JIT support.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2019-10-12 14:20:21 +02:00
David Marchand
7dde68cf0e net: add missing rte prefix for ESP tail
This structure has been missed during the big rework.

Fixes: 5ef2546767 ("net: add rte prefix to ESP structure")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-08 12:14:31 +02:00
Simei Su
d172886440 ethdev: add symmetric Toeplitz hash
Currently, there are DEFAULT,TOEPLITZ and SIMPLE_XOR hash function.
To support symmetric hash by rte_flow RSS action, this patch adds
new hash function "Symmetric Toeplitz" which is supported by some hardware.

Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-10-07 15:00:58 +02:00
Ying A Wang
226c6e60c3 ethdev: add PPPoE to flow API
- RTE_FLOW_ITEM_TYPE_PPPOES: matches a PPPoE session header.

- RTE_FLOW_ITEM_TYPE_PPPOED: matches a PPPoE discovery header.

- RTE_FLOW_ITEM_TYPE_PPPOE_PROTO_ID: matches a PPPoE session
  protocol identifier.

Signed-off-by: Ying A Wang <ying.a.wang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 15:00:58 +02:00
Ying A Wang
346553db5b ethdev: add GTP extension header to flow API
- RTE_FLOW_ITEM_TYPE_GTP_PSC: matches a GTP
- RTE_FLOW_ITEM_TYPE_GTP_PSC: matches a GTP
  PDU extension header (PDU session container).

Signed-off-by: Ying A Wang <ying.a.wang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-10-07 15:00:58 +02:00
Adrian Moreno
5d9dc18e1b vhost: fix vring memory partially mapped
Only the mapping of the vring addresses is being ensured. This causes
errors when the vring size is larger than the IOTLB page size. E.g:
queue sizes > 256 for 4K IOTLB pages

Ensure the entire vring memory range gets mapped. Refactor duplicated
code for for IOTLB UPDATE and IOTLB INVALIDATE and add packed virtqueue
support.

Fixes: 09927b5249 ("vhost: translate ring addresses when IOMMU enabled")
Cc: stable@dpdk.org

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-07 15:00:57 +02:00
Tiwei Bie
4e0de8dac8 vhost: protect vring access done by application
Besides the enqueue/dequeue API, other APIs of the builtin net
backend should also be protected.

Fixes: a368804699 ("vhost: protect active rings from async ring changes")
Cc: stable@dpdk.org

Reported-by: Peng He <xnhp0320@icloud.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-07 15:00:57 +02:00
Tiwei Bie
72d002b3eb vhost: fix vring address handling during live migration
When live migration starts, QEMU will set ring addrs again for
each virtqueue. In this case, we should try to translate ring
addrs after we invalidating the ring, otherwise virtqueues can
be enabled with the addrs untranslated. Besides, also leverage
the access_ok flag in non-IOMMU case to prevent the data path
accessing invalidated virtqueues.

Fixes: 5a4933e56b ("vhost: postpone ring address translations at kick time only")
Cc: stable@dpdk.org

Reported-by: Yilong Lv <lvyilong.lyl@alibaba-inc.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-07 15:00:57 +02:00
Tiwei Bie
37f7c1b609 vhost: forbid reallocation when running
When the device has been started, don't do the reallocation anymore.
Otherwise the pointers used in application threads can be invalidated
without proper protection. Instead of introducing a global lock to
protect the change of device pointers which will hurt the performance,
let's just do the reallocation during setup.

Fixes: af295ad469 ("vhost: realloc device and queues to same numa node as vring desc")
Cc: stable@dpdk.org

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-07 15:00:57 +02:00
Jim Harris
61af1713d3 vhost: add missing experimental flag
This function is listed under EXPERIMENTAL in the
rte_vhost_version.map, so it needs to be marked
with __rte_experimental in the header file as well.

Found by check-experimental-syms.sh when trying to compile
DPDK with -finstrument-functions.  This script didn't
catch this in the normal case, since the function is
declared __rte_always_inline.

This also requires updating the vhost_scsi example to allow
use of this newly marked experimental API.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2019-10-07 15:00:57 +02:00