All PCI functionality should be hidden from apps via the PCI bus driver,
the EAL and individual device drivers. Therefore remove the inclusion of
rte_pci.h from sample apps.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
With -f-strict-aliasing enabled by default from -O2, gcc > 5.x gives
undefined behavior in port_groupx4 in ARM. 'pn' and 'pnum' are
two different pointers pointing to same chunk of memory and
with -f-strict-aliasing the pointers are assumed to be pointing to
different memory and compiler reorders instructions that depend on
pnum and pn. This breaks port grouping algorithm.
This patch eliminates the above problem by introducing a compiler
barrier between the instructions that depend on pnum, pn and lp.
Fixes: 569b290cdb36 ("examples/l3fwd: add NEON implementation")
Cc: stable@dpdk.org
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
To group consecutive packets with same destination port in bursts of 4
neon intrinsic data types dp1 and dp2 are calculated such that if
dst_port[]={a,b,c,d,e,f,g,h,i...} dp1 should contain: <a,b,c,d> and
dp2 should contain: <b,c,d,e> in the first iteration. dp1 should
be <e,f,g,h> and dp2 should be <f,g,h,i> in the next iteration.
Whereas the existing code incorrectly calculates dp1 as <d,e,f,g> from
second iteration.
This patch fixes the incorrect ARM NEON instructions on dp1.
Fixes: 569b290cdb36 ("examples/l3fwd: add NEON implementation")
Cc: stable@dpdk.org
Signed-off-by: Guduri Prathyusha <gprathyusha@caviumnetworks.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
The memzone header is often included without good reason.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
This patch adds altivec support for lpm packet processing in powerpc.
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
Extend port_id definition from uint8_t to uint16_t in lib and drivers
data structures, specifically rte_eth_dev_data. Modify the APIs,
drivers and app using port_id at the same time.
Fix some checkpatch issues from the original code and remove some
unnecessary cast operations.
release_17_11 and deprecation docs have been updated in this patch.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix a typo that cause IPv6 packet type not be parsed.
Fixes: 71a7e2424e07 ("examples/l3fwd: fix using packet type blindly")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu <jianbo.liu@linaro.org>
Since SSE4 is now part of the minimum requirements for DPDK, we don't need
to check for its presence any more.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Fixing typos across dpdk source code using codespell utility.
Skipped the ethdev driver's base code fixes to keep the base
code intact.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Different drivers use internal macros like force_inline for compiler
always inline feature.
Standardizing it through __rte_always_inline macro.
Verified the change by comparing the output binary file.
No difference found in the output binary file with this change.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Since VF can not disable/enable HW CRC strip for non-DPDK PF drivers,
and kernel driver almost default enable that feature, if disable it in
example app's rxmode, VF driver will report the VF launch failure. So
this patch default to enable HW CRC strip to let VF launch successful.
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch extend next_hop field from 8-bits to 21-bits in LPM library
for IPv6.
Added versioning symbols to functions and updated
library and applications that have a dependency on LPM library.
Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The variable optind should be reset to one not zero.
From the man page:
"The variable optind is the index of the next element to be processed in
argv. The system initializes this value to 1.
The caller can reset it to 1 to restart scanning of the same argv, or when
scanning a new argument vector.”
The problem I saw with my application was trying to parse the wrong
option, which can happen as DPDK parses the first part of the command line
and the application parses the second part. If you call getopt() multiple
times in the same execution, the behavior is not maintained when using
zero for optind.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Avoid the use of several strncpy() since getopt is able to
map a long option with an id, which can be matched in the
same switch/case than short options.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
if machine level CRC extension are available, offload the
hash to machine provide functions e.g. armv8-a CRC extensions
support it
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Update l3fwd example usage and documentation with missing options.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
If no SSE nor NEON are available the l3fwd should complain loudly
to quickly find out the reason.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
The rte_eth_dev_count() function will never return a value greater
than RTE_MAX_ETHPORTS, so that checking is useless.
Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
The function rte_hash_lookup_multi() was renamed rte_hash_lookup_bulk()
in DPDK 1.4 and was kept as an undocumented alias.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It seems that with gcc >5.x and -O2/-O3 optimization breaks packet
grouping algorithm.
When last packet pointer "lp" and "pnum->u64" buffer points the same
memory buffer, high optimization can cause unpredictable results.
It seems that assignment of precalculated group sizes may interfere
with initialization of new group size when lp points value inside
current group and didn't should be changed.
With gcc >5.x and optimization we cannot be sure which assignment will
be done first, so the group size can be counted incorrectly.
This patch eliminates intersection of assignment of initial group size
(lp[0] = 1) and precalculated group sizes when gptbl[v].idx < 4.
Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Define and use ETH_LINK_UP and ETH_LINK_DOWN where appropriate.
Signed-off-by: Marc Sune <marcdevel@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Originally l3fwd used 16-bit value to store dest_port value.
To accommodate 24-bit nexthop dest_port was increased to 32-bit,
though some further packet processing code remained unchanged and
still expects dest_port to be 16-bit.
That is not correct and can cause l3fwd invalid behaviour or even
process crash/hang on some input packet patterns.
For the fix, I choose the simplest approach and restored dest_port
as 16-bit value, plus necessary conversions from 32 to 16 bit values
after lpm_lookupx4.
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Not all tx ports was included in tx_port_id array, used to periodically
drain only available ports. This caused that some packets remain in buffer
when application stops to receiving packets.
Fixes: 52c97adc1f0f ("examples/l3fwd: fix exact match performance")
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
As a example to use ptype info, l3fwd needs firstly to use
rte_eth_dev_get_supported_ptypes() API to check if device and/or
its PMD driver will parse and fill the needed packet type; if not,
use the newly added option, --parse-ptype, to analyze it in the
callback softly.
As the mode of EXACT_MATCH uses the 5 tuples to caculate hash, so
we narrow down its scope to:
a. ip packets with no extensions, and
b. L4 payload should be either tcp or udp.
Note: this patch does not completely solve the issue, "cannot run
l3fwd on virtio or other devices", because hw_ip_checksum may be
not supported by the devices. Currently we can:
a. remove this requirements, or
b. wait for virtio front end (pmd) to support it.
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Added validation for queue id of config parameter tuple.
This validation enforces user to enter queue ids of a port
from 0 and in sequence.
This additional validation on queue ids avoids ixgbe crash caused
by null rxq pointer access inside ixgbe_dev_rx_init.
Reason for null rxq is, L3fwd application allocates memory only for
queues passed by user. But rte_eth_dev_start tries to initialize rx
queues in sequence from 0 to nb_rx_queues,
which is not true and coredump while accessing the unallocated queue .
Fixes: af75078fece3 ("first public release")
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The flag ENABLE_MULTI_BUFFER_OPTIMIZE has been removed so the
related comments are now useless.
Fixes: 268888b5b020 ("examples/l3fwd: modularize")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
It seems that for the most use cases, previous hash_multi_lookup provides
better performance, and more, sequential lookup can cause significant
performance drop.
This patch sets previously optional hash_multi_lookup method as default.
It also provides some minor optimizations such as queue drain only on used
tx ports.
Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")
Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
l3fwd does not compile with HASH_MULTI_LOOKUP.
2 issues:
* in 64d395 mask0 changed type from xmm_t to rte_xmm_t
-> use x field from rte_xmm_t
* in dc81eb dst_port parameter changed to uint32_t
-> change uint16_t dst_port to uin32_t dsp_port
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Enable NEON support in exact match mode.
l3fwd example did not compile on ARM due to SSE2 instrincics used
in generic part.
Some instrinsins were used to initialize data structures and those were
replaced by ordinary structure initalization.
All SSE2 intrinsics used in forwarding, i.e. masking the IP/TCP header
are moved to single inline function and made arch-specific.
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
A new rte_lpm_config structure is used so LPM library will allocate
exactly the amount of memory which is necessary to hold application’s
rules.
Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
This patch extend next_hop field from 8-bits to 24-bits in LPM library
for IPv4.
Added versioning symbols to functions and updated
library and applications that have a dependency on LPM library.
Signed-off-by: Michal Kobylinski <michalx.kobylinski@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
Current implementation of Exact-Match uses different execution path than
for LPM. Unifying them allows to reuse big part of LPM code and sightly
increase performance of Exact-Match.
Main changes:
-------------
* Packet classification stage is separated from the rest of path for both
LPM and EM.
* Packet processing, modifying and transmit part is the same for LPM and EM
and mostly based on the current LPM implementation.
* Shared code is moved to the common file "l3fwd_sse.h".
* While sequential packet classification in EM path, seems to be faster
than using multi hash lookup, used before, it is used by default. Old
implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled
with HASH_LOOKUP_MULTI global define in compilation time.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The main problem with l3fwd is that it is too monolithic with everything
being in one file, and the various options all controlled by compile time
flags. This means that it's hard to read and understand, and when making
any changes, you need to go to a lot of work to try and ensure you cover
all the code paths, since a compile of the app will not touch large parts
of the l3fwd codebase.
Following changes were done to fix the issues mentioned above
- Split out the various lpm and hash specific functionality into separate
files, so that l3fwd code has one file for common code e.g. args
processing, mempool creation, and then individual files for the various
forwarding approaches.
Following are new file lists
main.c (Common code for args processing, memppol creation, etc)
l3fwd_em.c (Hash/Exact match aka 'EM' functionality)
l3fwd_em_sse.h (SSE4_1 buffer optimizated 'EM' code)
l3fwd_lpm.c (Longest Prefix Match aka 'LPM' functionality)
l3fwd_lpm_sse.h (SSE4_1 buffer optimizated 'LPM' code)
l3fwd.h (Common include for 'EM' and 'LPM')
- The choosing of the lpm/hash path should be done at runtime, not
compile time, via a command-line argument. This will ensure that
both code paths get compiled in a single go
Following examples show runtime options provided
Select 'LPM' or 'EM' based on run time selection f.e.
> l3fwd -c 0x1 -n 1 -- -p 0x1 -E ... (EM)
> l3fwd -c 0x1 -n 1 -- -p 0x1 -L ... (LPM)
Options "E" and "L" are mutualy-exclusive.
If none selected, "L" is default.
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Tested-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
clang reports this error:
examples/l3fwd/main.c:550:1: error: unused function 'send_packetsx4'
The function is used only when ENABLE_MULTI_BUFFER_OPTIMIZE is 1.
Fixes: 96ff445371e0 ("examples/l3fwd: reorganise and optimize LPM code path")
Fixes: 6f1c1e28d98e ("examples/l3fwd: fix build with exact-match enabled")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
fix the error reported by checkpatch:
"ERROR: return is not a function, parentheses are not required"
remove parentheses in return like:
"return (logical expressions)"
remove parentheses in return a function like:
"return (rte_mempool_lookup(...))"
Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return")
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Handle SIGINT and SIGTERM in l3fwd.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
l3fwd app expects PMDs to return packets whose L2 header is
16-byte aligned due to usage of _mm_load_si128()/_mm_store_si128()
intrinsics in the app. However, most of the protocol stacks expects
packets such that its IP/L3 header be aligned on a 16-byte boundary.
Based on the recommendations received on dpdk-dev, we are changing
the l3fwd app to use _mm_loadu_si128()/_mm_loadu_si128() so that the
address need not be 16-byte aligned and thereby preventing crash.
We have tested that there is no performance impact due to this
change.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>