17 Commits

Author SHA1 Message Date
Omkar Maslekar
4ffc2276e2 eal: add cache line demotion API
rte_cldemote is similar to a prefetch hint - in reverse.
On x86, cldemote(addr) enables software to hint to hardware that line is
likely to be shared. This is quite useful in core-to-core communications
where cache-line is likely to be shared.
ARM and PPC implementation is provided with NOP and can be added if any
equivalent instructions could be used for implementation on those
architectures.

Signed-off-by: Omkar Maslekar <omkar.maslekar@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-10-16 14:11:45 +02:00
Eli Britstein
057d9a92f0 eal: fix build with conflicting libc variable memory_order
The cited commit introduced functions with 'int memory_order' argument.
The C11 standard section 7.17.1.4 defines 'memory_order' as the
"enumerated type whose enumerators identify memory ordering constraints".

A compilation error occurs:
error: declaration of 'memory_order' shadows a global declaration
    [-Werror=shadow]
     rte_atomic_thread_fence(int memory_order)

This issue was hit when trying to compile OVS with gcc 4.8.5. This
compiler version does not provide stdatomic.h, so enum memory_order is
redefined in OVS code.
In another case, if the compiler does provide stdatomic.h header,
passing -Wsystem-headers in the CFLAGS will also cause that failure.

Fix it by changing the argument name 'memory_order' to 'memorder'.

Fixes: 672a15056380 ("eal: add wrapper for C11 atomic thread fence")

Signed-off-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Asaf Penso <asafp@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2020-10-15 18:49:53 +02:00
David Marchand
0c0d0d9df7 eal: add experimental tags for write combining store
Only marking the doxygen declarations is not enough.
Arch specific implementations must be tagged as well since there is no
common declaration of those inlines.

Fixes: 8a00dfc738fe ("eal: add write combining store")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Radu Nicolau <radu.nicolau@intel.com>
2020-10-15 08:45:30 +02:00
Wei Hu (Xavier)
b15a936d75 eal/arm64: update CPU flags
ARM64 Linux kernel updated the CPU flags using the HWCAP scheme.
The related marco definition can be found in linux kernel:
  arch/arm64/include/uapi/asm/hwcap.h

This patch incorporates those changes to the EAL library.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-10-13 17:52:11 +02:00
Ruifeng Wang
e9b9739264 config: remap flags used for Arm platforms
RTE_ARCH_xx flags are used to distinguish platform architectures.
These flags can be used to pick different code paths for different
architectures at compile time.
For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM,
RTE_ARCH_ARMv7 and RTE_ARCH_ARM64.
RTE_ARCH_ARM64 is for 64-bit aarch64 platforms,
and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms.
RTE_ARCH_ARMv7 is for ARMv7 platforms as its name suggested.

The issue is meaning of RTE_ARCH_ARM is not clear enough.
Because no info about platform word length is included in the name.
To make the flag names more clear, a naming scheme is proposed.

RTE_ARCH_ARM (all Arm platforms)
    |
    +----RTE_ARCH_32 (New. 32-bit platforms of all architectures)
    |        |
    |        +----RTE_ARCH_ARMv7 (ARMv7 platforms)
    |        |
    |        +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine)
    |
    +----RTE_ARCH_64 (64-bit platforms of all architectures)
             |
             +----RTE_ARCH_ARM64 (64-bit Arm platforms)

RTE_ARCH_32 will be explicitly defined for 32-bit platforms.

To fit into the new naming scheme, current usage of RTE_ARCH_ARM in
project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32).

Matching flags for other architectures are:
RTE_ARCH_X86
    |
    +----RTE_ARCH_32
    |        |
    |        +----RTE_ARCH_I686
    |        |
    |        +----RTE_ARCH_X86_X32
    |
    +----RTE_ARCH_64
             |
             +----RTE_ARCH_X86_64

RTE_ARCH_PPC_64 ---- RTE_ARCH_64

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
2020-10-13 16:35:48 +02:00
Radu Nicolau
8a00dfc738 eal: add write combining store
Add rte_write32_wc and rte_write32_wc_relaxed functions
that implement 32bit stores using write combining memory protocol.
Provided generic stubs and x86 implementation.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2020-10-13 14:11:16 +02:00
Radu Nicolau
84fb33fec1 build: remove deprecated cpuflag macros
Replace use of RTE_MACHINE_CPUFLAG macros with regular compiler
macros, which are more complete than those provided by DPDK, and as such
it allows new instruction sets to be leveraged without having to do
extra work to set them up in DPDK.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-25 11:13:57 +02:00
Phil Yang
f0f5d844d1 eal: remove deprecated coherent IO memory barriers
Since the 20.08 release deprecated rte_cio_*mb APIs because these APIs
provide the same functionality as rte_io_*mb APIs on all platforms, so
remove them and use rte_io_*mb instead.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-09-23 13:40:26 +02:00
Phil Yang
672a150563 eal: add wrapper for C11 atomic thread fence
Provide a wrapper for __atomic_thread_fence builtins to support
optimized code for __ATOMIC_SEQ_CST memory order for x86 platforms.

Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2020-07-17 16:00:30 +02:00
Honnappa Nagarahalli
264f7f80e1 eal/arm: adjust memory barriers for IO on ARMv8
Change the barrier APIs for IO to reflect that Armv8-a is other-multi-copy
atomicity memory model.

Armv8-a memory model has been strengthened to require
other-multi-copy atomicity. This property requires memory accesses
from an observer to become visible to all other observers
simultaneously [3]. This means

a) A write arriving at an endpoint shared between multiple CPUs is
   visible to all CPUs
b) A write that is visible to all CPUs is also visible to all other
   observers in the shareability domain

This allows for using cheaper DMB instructions in the place of DSB
for devices that are visible to all CPUs (i.e. devices that DPDK
caters to).

Please refer to [1], [2] and [3] for more information.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=22ec71615d824f4f11d38d0e55a88d8956b7e45f
[2] https://www.youtube.com/watch?v=i6DayghhA8Q
[3] https://www.cl.cam.ac.uk/~pes20/armv8-mca/

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Tested-by: Ruifeng Wang <ruifeng.wang@arm.com>
2020-07-08 13:44:23 +02:00
Honnappa Nagarahalli
e1c9850f55 eal/armv8: force inlining of timer API
Change the inline functions to use __rte_always_inline to be
consistent with rest of the inline functions.

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-07-07 13:21:26 +02:00
Honnappa Nagarahalli
97c910139b eal/armv8: fix timer frequency calibration with PMU
get_tsc_freq uses 'nanosleep' system call to calculate the CPU
frequency. However, 'nanosleep' results in the process getting
un-scheduled. The kernel saves and restores the PMU state. This
ensures that the PMU cycles are not counted towards a sleeping
process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
in incorrect CPU frequency calculation. This logic is replaced
with generic counter based loop.

Bugzilla ID: 450
Fixes: f91bcbb2d9a6 ("eal/armv8: use high-resolution cycle counter")
Cc: stable@dpdk.org

Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2020-07-07 13:20:50 +02:00
Ruifeng Wang
4eb25acdb7 eal/arm: add vcopyq intrinsic for aarch32
vcopyq_laneq_u32 should be implemented for aarch32 which doesn't have
the intrinsic.
This fixes build of examples/l3fwd for armv7.

Fixes: 3c4b4024c225 ("arch/arm: add vcopyq_laneq_u32 for old gcc")
Cc: stable@dpdk.org

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2020-06-30 14:52:30 +02:00
Bruce Richardson
44dfb297af build: add arch-specific header path to global includes
The global include path, which is used by anything built before EAL,
points to the EAL header files so they utility macros etc. can be used
anywhere in DPDK. This path included the OS-specific EAL header files,
but not the architecture-specific ones. This patch moves the selection
of target architecture to the top-level meson.build file so that the
global include can reference that.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Keith Wiles <keith.wiles@intel.com>
2020-05-10 23:45:02 +02:00
Stephen Hemminger
a9aa14d9aa eal: fix comments spelling
Fix spelling errors in comments (found with codespell).

Note that "inbetween" is not correct in English and should
either be two words or better yet, the in can be dropped.
https://www.grammarly.com/blog/in-between-or-inbetween/

Fixes: 12f45fa7e29b ("eal/arm: read timer from PMU if enabled")
Fixes: 096ffd811fe2 ("eal/x86: use lock-prefixed instructions for SMP barrier")
Fixes: 1d406458db47 ("mem: make segment preallocation OS-specific")
Fixes: bb372060dad4 ("malloc: make heap a doubly-linked list")
Fixes: 7353ee7344b4 ("fbarray: add API to find biggest used or free chunks")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2020-04-24 19:29:06 +02:00
Thomas Monjalon
f35e5b3e07 replace alignment attributes
There is a common macro __rte_aligned for alignment,
which is now used where appropriate for consistency.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2020-04-16 18:16:18 +02:00
Thomas Monjalon
a1b6cda16a eal: move arch-specific header files
The arch-specific directories arm, ppc and x86 in common/include/arch/
are moved as include/ sub-directories of respective arch directories:
	- arm/include/
	- ppc/include/
	- x86/include/

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: David Marchand <david.marchand@redhat.com>
2020-03-31 13:08:55 +02:00