Windows interrupt support is based on IO completion ports (IOCP).
Interrupt thread would send the devices requests to notify about
interrupts and then wait for any request completion. Add skeleton code
of this model without any hardware support.
Another way to wake up the interrupt thread is APC (asynchronous procedure
call), scheduled by any other thread via eal_intr_thread_schedule().
This internal API is intended for alarm implementation.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
When create softnic hash table with 16 keys, it failed on 32-bit
environment, because the pointer field in structure rte_bucket_4_16
is only 32 bits. Add a padding field in 32-bit environment to keep
the structure to a multiple of 64 bytes. Apply this to 8-byte and
32-byte key hash function as well.
Fixes: 8aa327214c ("table: hash")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Current rte_acl_classify_avx512x32() and rte_acl_classify_avx512x16()
code paths are very similar. The only differences are due to
256/512 register/instrincts naming conventions.
So to deduplicate the code:
- Move common code into “acl_run_avx512_common.h”
- Use macros to hide difference in naming conventions
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
With current ACL implementation first field in the rule definition
has always to be one byte long. Though for optimising classify
implementation it might be useful to do 4B reads
(as we do for rest of the fields).
So at build phase, check user provided field definitions to determine
is it safe to do 4B loads for first ACL field.
Then at run-time this information can be used to choose classify
behavior.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce classify implementation that uses AVX512 specific ISA.
rte_acl_classify_avx512x32() is able to process up to 32 flows in parallel.
It uses 512-bit width registers/instructions and provides higher
performance then rte_acl_classify_avx512x16(), but can cause
frequency level change.
Note that for now only 64-bit version is supported.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
On supported platforms, set RTE_ACL_CLASSIFY_AVX512X16 as
default ACL classify algorithm.
Note that AVX512X16 implementation uses 256-bit registers/instincts only
to avoid possibility of frequency drop.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Introduce classify implementation that uses AVX512 specific ISA.
rte_acl_classify_avx512x16() is able to process up to 16 flows in parallel.
It uses 256-bit width registers/instructions only
(to avoid frequency level change).
Note that for now only 64-bit version is supported.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Add necessary changes to support new AVX512 specific ACL classify
algorithm:
- changes in meson.build to check that build tools
(compiler, assembler, etc.) do properly support AVX512.
- run-time checks to make sure target platform does support AVX512.
- dummy rte_acl_classify_avx512() for targets where AVX512
implementation couldn't be properly supported.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Right now ACL library determines best possible (default) classify method
on a given platform with special constructor function rte_acl_init().
This patch makes the following changes:
- Move selection of default classify method into a separate private
function and call it for each ACL context creation (rte_acl_create()).
- Remove library constructor function
- Make rte_acl_set_ctx_classify() to check that requested algorithm
is supported on given platform.
The purpose of these changes to improve and simplify algorithm selection
process and prepare ACL library to be integrated with the
max SIMD bitwidth series in discussion.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Removal of unused enum value (RTE_ACL_CLASSIFY_NUM).
This enum value is not used inside DPDK, while it prevents
to add new classify algorithms without causing an ABI breakage.
Note that this change introduce a formal ABI incompatibility
with previous versions of ACL library.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Right now we define dummy version of rte_acl_classify_avx2()
when both X86 and AVX2 are not detected, though it should be
for non-AVX2 case only.
Fixes: e53ce4e41379 ("acl: remove use of weak functions")
Cc: stable@dpdk.org
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
New data type to manipulate 512 bit AVX values.
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
export for clang build all the functions currently built
on Windows and listed in rte_eal_version.map by adding
them to rte_eal_exports.def.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
This patch enables the optimized calculation of CRC32-Ethernet and
CRC16-CCITT using the AVX512 and VPCLMULQDQ instruction sets. This CRC
implementation is built if the compiler supports the required instruction
sets. It is selected at run-time if the host CPU, again, supports the
required instruction sets.
Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com>
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch adds support for run-time selection of the optimal
architecture-specific CRC path, based on the supported instruction set(s)
of the CPU.
The compiler option checks have been moved from the C files to the meson
script. The rte_cpu_get_flag_enabled function is called automatically by
the library at process initialization time to determine which
instructions the CPU supports, with the most optimal supported CRC path
ultimately selected.
Signed-off-by: Mairtin o Loingsigh <mairtin.oloingsigh@intel.com>
Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
ARM64 Linux kernel updated the CPU flags using the HWCAP scheme.
The related marco definition can be found in linux kernel:
arch/arm64/include/uapi/asm/hwcap.h
This patch incorporates those changes to the EAL library.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
RTE_ARCH_xx flags are used to distinguish platform architectures.
These flags can be used to pick different code paths for different
architectures at compile time.
For Arm platforms, there are 3 flags in use: RTE_ARCH_ARM,
RTE_ARCH_ARMv7 and RTE_ARCH_ARM64.
RTE_ARCH_ARM64 is for 64-bit aarch64 platforms,
and RTE_ARCH_ARM & RTE_ARCH_ARMv7 are for 32-bit platforms.
RTE_ARCH_ARMv7 is for ARMv7 platforms as its name suggested.
The issue is meaning of RTE_ARCH_ARM is not clear enough.
Because no info about platform word length is included in the name.
To make the flag names more clear, a naming scheme is proposed.
RTE_ARCH_ARM (all Arm platforms)
|
+----RTE_ARCH_32 (New. 32-bit platforms of all architectures)
| |
| +----RTE_ARCH_ARMv7 (ARMv7 platforms)
| |
| +----RTE_ARCH_ARMv8_AARCH32 (aarch32 state on aarch64 machine)
|
+----RTE_ARCH_64 (64-bit platforms of all architectures)
|
+----RTE_ARCH_ARM64 (64-bit Arm platforms)
RTE_ARCH_32 will be explicitly defined for 32-bit platforms.
To fit into the new naming scheme, current usage of RTE_ARCH_ARM in
project is mapped to (RTE_ARCH_ARM && RTE_ARCH_32).
Matching flags for other architectures are:
RTE_ARCH_X86
|
+----RTE_ARCH_32
| |
| +----RTE_ARCH_I686
| |
| +----RTE_ARCH_X86_X32
|
+----RTE_ARCH_64
|
+----RTE_ARCH_X86_64
RTE_ARCH_PPC_64 ---- RTE_ARCH_64
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Add rte_write32_wc and rte_write32_wc_relaxed functions
that implement 32bit stores using write combining memory protocol.
Provided generic stubs and x86 implementation.
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Running dpdk-helloworld on Linux with lib numa present, but no kernel
support for NUMA (CONFIG_NUMA=n) causes rte_service_init() to fail with
EAL: error allocating rte services array.
alloc_seg() calls get_mempolicy to verify that the allocation
has happened on the correct socket, but receives ENOSYS from
the kernel and fails the allocation.
The allocated socket should only be verified if check_numa() is true.
Fixes: 2a96c88be83e ("mem: ease init in a docker container")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
If the overall pkt_len and segment lengths are out of agreement,
it is possible for the seg to be NULL after the loop. Add assert
to check this condition in debug builds. Otherwise, return failure.
Fixes: c442fed81bb9 ("net: add function to calculate checksum in mbuf")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <3chas3@gmail.com>
Align rte_eal_cleanup return codes description to the rest of dpdk.
Fixes: aec9c13c5257 ("eal: add function to release internal resources")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
This patch adds Forward error correction(FEC) support for ethdev.
Introduce APIs which support query and config FEC information in
hardware.
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Chengchang Tang <tangchengchang@huawei.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Currently, base and nb_queue in the tc_rxq and tc_txq information
of queue and TC mapping on both TX and RX paths are uint8_t.
However, these data will be truncated when queue number under a TC
is greater than 256. So it is necessary for base and nb_queue to
change from uint8_t to uint16_t.
Fixes: 89d6728c7837 ("ethdev: get DCB information")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Patch [1] added support for RSS flow expansion.
It was added in ethdev for public use, but until now it is used only
by MLX5 PMD.
To allow local changes in this code, this patch removes it from ethdev
and moves it to MLX5 PMD file.
[1] commit 4ed05fcd441b ("ethdev: add flow API to expand RSS flows")
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Function rte_flow_expand_rss() is used to expand a flow rule with
partial pattern into several rules, to ensure all relevant packets
are matched.
It uses utility function rte_flow_expand_rss_item_complete(), to check
if the last valid item in the flow rule pattern needs to be completed.
For example the pattern "eth / ipv4 proto is 17 / end" will be completed
with a "udp" item.
This function returns "void" item in two cases:
1) The last item has empty spec, for example "eth / ipv4 / end".
2) The last itme has spec that can't be expanded for RSS.
For example the pattern "eth / ipv4 proto is 47 / end" ends with IPv4
item that has next protocol GRE.
In both cases the flow rule may be expanded, but in the second case such
expansion may create rules with invalid pattern.
For example "eth / ipv4 proto is 47 / udp / end".
In such a case the flow rule should not be expanded.
This patch updates function rte_flow_expand_rss_item_complete().
Return value RTE_FLOW_ITEM_TYPE_END is used to indicate the flow rule
should not be expanded.
In such a case, rte_flow_expand_rss() will return with the original flow
rule only, without any expansion.
Fixes: fc2dd8dd492f ("ethdev: fix expand RSS flows")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
A crash is detected when '--txpkts=#' parameter provided to the testpmd,
this is because queue information is requested before queues have been
allocated.
Adding check to queue info APIs
('rte_eth_rx_queue_info_get()' & 'rte_eth_tx_queue_info_get')
to protect against similar cases.
Fixes: ba2fb4f022fc ("ethdev: check if queue setup when getting queue info")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The mbuf library now has routine to free multiple buffers.
Loop is no longer needed.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
In EAL, we try to sort the experimental symbols per the release they
were introduced in.
Fixes: 8929de043eb4 ("service: retrieve lcore active state")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Since the hardware supported by the ioat driver is capable of operations
other than just copies, we can rename the doorbell and completion-return
functions to not have "copies" in their names. These functions are not
copy-specific, and so would apply for other operations which may be added
later to the driver.
Also add a suitable warning using deprecation attribute for any code using
the old functions names.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
If a timer's callback function calls rte_timer_reset_sync() or
rte_timer_stop_sync() on another timer that is in the RUNNING state and
owned by the current lcore, the *_sync() calls will loop indefinitely.
Relatedly, if a timer's callback function calls *_sync() on another
timer that is in the RUNNING state and is owned by a different lcore,
but a timer callback function runs on that different lcore and calls
*_sync() on a timer that is in the RUNNING state and owned by the
current lcore, the two lcores will loop indefinitely.
Add a note in the rte_timer_stop_sync and rte_timer_reset_sync
documentation that indicates that these APIs should not be used inside
timer callback functions in order to avoid the hangs described above,
and suggests an alternative.
Bugzilla ID: 491
Cc: stable@dpdk.org
Signed-off-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
The current buffer size is not big enough to register trace points for
new additions in the eventdev subsystem.
Increase TRACE_CTF_FIELD_SIZE by 64 bytes for now.
Signed-off-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
During power initialization the pstate cpufreq api is
not setting the initial curr_idx of pstate_power_info
to corresponding current frequency index.
Without this the idx is always 0, which is causing the
below check to pass and returns without setting the initial
min/max frequency to system max frequency and this leads to
incorrect frequency settings when power_pstate_cpufreq_set_freq()
is called in the apps.
set_freq_internal(struct pstate_power_info *pi, uint32_t idx)
{
...
/* Check if it is the same as current */
if (idx == pi->curr_idx)
return 0;
...
}
scenario 1:
If system has starting scaling min/max: 1000/1000, and want to
set this to 2200/2200, the max frequency gets updated but not min.
scenario 2:
If system has starting scaling min/max: 2200/1000, and want to set
to 2200/2200, the max, min frequency was not updated. Since no change
in max that should be ok, but min was also ignored, which will be fixed
now with the new changes.
Fixes: e6c6dc0f ("power: add p-state driver compatibility")
Cc: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Liang Ma <liang.j.ma@intel.com>
Add size_t CTF format metadata, this is needed by CTF analyzers to
parse the emitted CTF trace.
Fixes: 262c4ee791c6 ("trace: add size_t field emitter")
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Sunil Kumar Kori <skori@mavell.com>
Enhance the dump function to also print the ops index
and associated mempool ops name
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch fixes an unused value in pcap source port by
removing the setting to the value.
Coverity issue: 362020
Fixes: d4b42133d85b ("port: add pcap file source")
Cc: stable@dpdk.org
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Arrays of type uint64_t/int/string can now be included within an array
or dict. One level of embedded containers is supported. This is
necessary to allow for instances such as the ethdev queue stats to be
reported as a list of uint64_t values, rather than having multiple dict
entries with one uint64_t value for each queue stat.
The memory management APIs provided by telemetry simplify the memory
allocation/free aspect of the embedded container. The rte_tel_data_alloc
function is called in the library/app callback to return a pointer to a
container that has been allocated memory. When adding this container
to an array/dict, a parameter is passed to indicate if the memory
should be freed by telemetry after use. This will allow reuse of the
allocated memory if the library/app wishes to do so.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Telemetry only passed the first param to the command handler if multiple
were entered by the user, separated by commas. Telemetry is required to
pass the full params string to the command, by splitting by a comma
delimiter only once to remove the command part of the string. This will
enable future commands to take multiple param values.
Fixes: b1ad0e124536 ("rawdev: add telemetry callbacks")
Fixes: c190daedb9b1 ("ethdev: add telemetry callbacks")
Fixes: 6dd571fd07c3 ("telemetry: introduce new functionality")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
VXLAN UDP/IPv4 GRO can help improve VM-to-VM UDP
performance when UFO or GSO is enabled in VM, GRO
must be supported if UFO or GSO is enabled,
otherwise, performance can't get big improvement
if only GSO is there.
With this enabled in DPDK, OVS DPDK can leverage it
to improve VM-to-VM UDP performance, it will reassemble
VXLAN UDP/IPv4 fragments immediate after they are
received from a physical NIC. It is very helpful in
OVS DPDK VXLAN use case.
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
UDP/IPv4 GRO can help improve VM-to-VM UDP performance
when UFO or GSO is enabled in VM, GRO must be supported
if UFO or GSO is enabled, otherwise, performance can't
get big improvement if only GSO is there.
With this enabled in DPDK, OVS DPDK can leverage it
to improve VM-to-VM UDP performance, it will reassemble
UDP fragments immediate after they are received from
a physical NIC. It is very helpful in OVS DPDK VLAN use
case.
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>
Currently, rte_ipv4_cksum() and rte_ipv4_udptcp_cksum() assume all IPv4
headers have sizeof(struct rte_ipv4_hdr) bytes. This is not true for
those (rare) packets with IPv4 options. Thus, both IPv4 and TCP/UDP
checksums are calculated wrong.
This patch fixes the issue by using the actual IPv4 header length from
the packet's IHL field.
Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Move symbols introduced in version <= 19.11 in the stable ABI.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Remove the deprecated v20 ABI of rte_mempool_populate_iova() and
rte_mempool_populate_virt().
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>