The buffer split feature is mentioned in the mlx5 PMD
documentation, the limitation is description is added
as well.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tunnel Offload API provides hardware independent, unified model
to offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers
during entire offload procedure;
- restore outer header of partially offloaded packet;
- model is implemented as a set of helper functions.
Implementation details:
* tunnel_offload PMD parameter must be set to 1 to enable the feature.
* application cannot use MARK and META flow actions with tunnel.
* offload JUMP action is restricted to steering tunnel rule only.
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Due to PRM requirement, the IPv6 header item 'proto' field, indicating
the next header protocol, should not be set as extension header.
This patch adds the relevant validation, and documents the limitation.
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Since the built driver filenames have changed in DPDK 20.11, we need to
update the driver doc to match.
Most drivers start their section with the driver filename highlighted in
bold, while a number were missing the highlight. When updating the names,
add the markers for bold text to any missing it, so as to have things more
consistent.
Fixes: a20b2c01a7 ("build: standardize component names and defines")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Previously, the Tx timestamp field and flag were registered in testpmd,
as described in mlx5 guide.
For consistency between Rx and Tx timestamps,
managing mbuf registrations inside the driver, as properly documented,
is a simpler expectation.
The only driver to support this feature (mlx5) is updated
as well as the testpmd application.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Make is no longer supported for compiling DPDK, references are now
removed in the documentation.
Signed-off-by: Ciara Power <ciara.power@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Add description about the sample flow limitation.
Sample Flow supports in NIC-Rx and E-Switch domains.
Due to Metadata register c0 is deleted while doing the loopback,
so that only support forward the sampling packet into
E-Switch manager port, no additional action support in sample flow.
Add the offloads minimum versions for new sampling feature.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
PRM expose fields "Icmp_header_data" in IPv4 ICMP.
Update ICMP mask parameter with ICMP identifier and sequence number
fields.
ICMP sequence number spec with mask, Icmp_header_data low 16 bits are
set.
ICMP identifier spec with mask, Icmp_header_data high 16 bits are set.
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
The page "Development Kit Build System" was about make,
so it has been removed. A better help is in the Linux guide
(note: mlx4/mlx5 are supported on Linux only for now).
Fixes: 3cc6ecfdfe ("build: remove makefiles")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Add description about Tx scheduling timestamp upper limit.
If timestamp exceeds the value, it is marked by PMD as being
into "too-distant-future" and not scheduled at all
(is being sent without any wait).
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
There are some limitations on some NICs (at least on ConnectX-6 Dx
and BlueField 2) with supporting FCS (frame checksum) scattering for
the tunnel decapsulated packets.
For the case only one of the features can be supported in the same time,
and the new devarg "decap_en" is introduced to provide the choice to the
users.
If FCS scattering feature is not supposed to be engaged by application,
this new devarg should be specified as "decap_en=0", forcing the FCS
feature enable and rejecting tunnel decap actions in the rte_flow engine.
If FCS scatter is not needed and application supposes to use tunnel
decapsulation in rte_flow, the devarg can be omitted or set to non-zero
value (this is default settings).
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
As scatter FCS might be not supported for decapsulated tunnel
packets in some NIC HW, a new capability bit which indicates
if scatter FCS works with decap is added.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Currently, for MLX5 PMD, once millions of flows created, the memory
consumption of the flows are also very huge. For the system with limited
memory, it means the system need to reserve most of the memory as huge
page memory to serve the flows in advance. And other normal applications
will have no chance to use this reserved memory any more. While most of
the time, the system will not have lots of flows, the reserved huge
page memory becomes a bit waste of memory at most of the time.
By the new sys_mem_en devarg, once set it to be true, it allows the PMD
allocate the memory from system by default with the new add mlx5 memory
management functions. Only once the MLX5_MEM_RTE flag is set, the memory
will be allocate from rte, otherwise, it allocates memory from system.
So in this case, the system with limited memory no need to reserve most
of the memory for hugepage. Only some needed memory for datapath objects
will be enough to allocated with explicitly flag. Other memory will be
allocated from system. For system with enough memory, no need to care
about the devarg, the memory will always be from rte hugepage.
One restriction is that for DPDK application with multiple PCI devices,
if the sys_mem_en devargs are different between the devices, the
sys_mem_en only gets the value from the first device devargs, and print
out a message to warn that.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Update the release notes of mlx5 PMD part by adding the
support of eCPRI.
Update the firmware configuration in the mlx5 NIC guide to support
the usage of eCPRI.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch introduces the new devargs:
tx_pp - enables accurate packet send scheduling on mbuf timestamps
in the PMD. On the device start if "rte_dynflag_timestamp"
dynamic flag is registered and this devarg non-zero value is
specified, the driver initializes all necessary internal
infrastructure to provide packet scheduling. The parameter
value specifies scheduling granularity in nanoseconds.
tx_skew - the parameter adjusts the send packet scheduling on
timestamps and represents the average delay between beginning
of the transmitting descriptor processing by the hardware and
appearance of actual packet data on the wire. The value should
be provided in nanoseconds and is valid only if tx_pp parameter
is specified. The default value is zero.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The new devarg will control the steering of the lacp traffic.
When setting dv_lacp_by_user = 0 the lacp traffic will be
steered to kernel and managed there.
When setting dv_lacp_by_user = 1 the lacp traffic will
not be steered and the user will need to manage it.
Signed-off-by: Shiri Kuzin <shirik@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Currently, when flow destroyed, some memory resources may still be kept
as cached to help next time create flow more efficiently.
Some system may need the resources to be more flexible with flow create
and destroy. After peak time, with millions of flows destroyed, the
system would prefer the resources to be reclaimed completely, no cache
is needed. Then the resources can be allocated and used by other
components. The system is not so sensitive about the flow insertion
rate, but more care about the resources.
Both DPDK mlx5 PMD driver and the low level component rdma-core have
provided the flow resources to be configured cached or not, but there is
no APIs or parameters exposed to user to configure the flow resources
cache mode. In this case, introduce a new PMD devarg to let user
configure the flow resources cache mode will be helpful.
This commit is to add a new "reclaim_mem_mode" to help user configure if
the destroyed flows' cache resources should be kept or not.
Their will be three mode can be chosen:
1. 0(none). It means the flow resources will be cached as usual. The
resources will be cached, helpful with flow insertion rate.
2. 1(light). It will only enable the DPDK PMD level resources reclaim.
3. 2(aggressive). Both DPDK PMD level and rdma-core low level will be
configured as reclaimed mode.
With these three mode, user can configure the resources cache mode with
different levels.
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Removing the current limitation for TSO over VM
due to the fact that mlx5 currently support it.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Asaf Penso <asafp@mellanox.com>
If running DPDK as non-root, some extra capabilities may be required.
The Mellanox devices, using a bifurcated model with Linux drivers,
have some specific requirements summarized in mlx5 PMD guide.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
This patch adds to MLX5 PMD the support of matching on
GTP header item v_pt_rsv_flags.
This item is contained in 1 byte of the format:
-------------------------------------------
| bit | 0 - 2 | 3 | 4 | 5 | 6 | 7 |
|-----------------------------------------|
| value | Version | PT | Res | E | S | PN |
-------------------------------------------
Matching is supported only for GTP flags E, S, PN.
Therefore values 0 to 7 are supported.
Mask must be set accordingly:
... gtp v_pt_rsv_flags is 1 v_pt_rsv_flags mask 0x07 ...
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In existing implementation, using wild card VLAN item is not allowed.
A VLAN item in flow pattern must include VLAN ID (vid) value.
This obligation contradict the flow API specification [1].
This patch updates the VLAN item validation and translation, to allow
wild card VLAN item, without VLAN ID value.
User guide and release notes are updated accordingly.
[1]
commit 40513808b165 ("doc: refine ethernet and VLAN flow rule items")
Fixes: 00f75a4057 ("net/mlx5: fix VLAN match for DV mode")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch updates the MLX5 PMD and release notes documentations.
Adding the notes of the behavior change that rte flows organization
is switched into non-cached mode for applications.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
This patch updates the MLX5 PMD and release notes documentations.
Adding the guideline for hairpin data buffer size configuration.
Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Define a device parameter to configure log 2 of a stride size for MPRQ
- mprq_log_stride_size. User is able to specify a stride size in a range
allowed by an underlying hardware. The default stride size is defined as
2048 bytes to encompass most commonly used packet sizes in the Internet
(MTU 1518 and less) and will be used in case a maximum configured packet
size cannot fit into the largest possible stride size. Otherwise a
stride size is set to a large enough value to encompass a whole packet.
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
MARK and META items are interrelated with datapath -
they might move from/to the applications in mbuf.
zero value for these items has the special meaning -
it means "no metadata are provided", not zero values
are treated by applications and PMD as valid ones.
Moreover in the flow engine domain the value zero is
acceptable to match and set, and we should allow to
specify zero values as rte_flow parameters for the
META and MARK items and actions. In the same time
zero mask has no meaning and should be rejected
on validation stage.
Fixes: fcc8d2f716 ("net/mlx5: extend flow metadata support")
Fixes: e554b672aa ("net/mlx5: support flow tag")
Fixes: 55deee1715 ("net/mlx5: extend flow mark support")
Cc: stable@dpdk.org
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The no-inline hint flag is described.
Fixes: cacb44a099 ("net/mlx5: add no-inline Tx flag")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The devices of the family ConnectX may have two letters as suffix.
Such suffix is preceded with a space and the second x is lowercase:
- ConnectX-4 Lx
- ConnectX-5 Ex
- ConnectX-6 Dx
Uppercase of the device family name BlueField is also fixed.
The lists of supported devices are fixed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
When adding GTP to the list of supported tunnels,
the old line was not removed.
Fixes: f31d7a0171 ("net/mlx5: support GTP")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The new rte_flow feature for DSCP field rewrite offload was
missing in the release notes.
The mlx5 requirements for DSCP field rewrite offload were missing.
Fixes: 8482ffe4b6 ("ethdev: add IPv4/IPv6 DSCP rewrite action")
Fixes: 6f26e604a9 ("net/mlx5: support IPv4/IPv6 DSCP rewrite action")
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
The libibverbs (and libmlx4/5) can be statically embedded
in the shared PMD library, or in the application with the static PMD.
It was supported with make build system in
commit 2c0dd7b69f ("config: add static linkage of mlx dependency").
The same feature is enabled with meson when using pkg-config
(i.e. only if the call to dependency() is successful).
The fallback method for searching library with cc.find_library()
is not supported because the dependencies of the found library
would not be linked (no such info in .a file unlike .so).
The main difference, in meson build system, is the generated .pc file
giving arguments to link DPDK with the application.
Unfortunately the .pc file will not keep memory of the static linkage
option for libibverbs.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Currently MLX5 PMD can't match on untagged packets specifically.
Tagged traffic still hits the flows intended for untagged packets.
If the flow has ETH, it will catch all matching packets, tagged
and untagged.
The solution is to use cvlan_tag bit.
If mask=1 and value=0 it matches on untagged traffic.
If mask=1 and value=1 it matches on tagged traffic.
This is the kernel implementation.
This patch updated MLX5 PMD to set cvlan_tag mask and value according
to flow rule contents.
This update is relevant when using DV flow engine (dv_flow_en=1).
See example at https://doc.dpdk.org/guides/nics/mlx5.html#limitations.
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Function of_set_vlan_vid is wrongly listed twice in table
"Supported hardware offloads".
This patch removes the listing of of_set_vlan_vid under
"Header rewrite", and leaves the listing of of_set_vlan_vid
under "VLAN".
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch adds to MLX5 PMD support of matching on GTP item,
fields msg_type and teid, according to RFC [1].
GTP item validation and translation functions are added and called.
GTP tunnel type is added to supported tunnels.
[1] http://mails.dpdk.org/archives/dev/2019-December/152799.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
This patch implements the feature described in RFC [1], adding
support of RSS action on L3 and/or L4 source or destination only.
[1] http://mails.dpdk.org/archives/dev/2019-December/152796.html
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Fix OFED and rdma-core versions for current offloads.
Add new offloads minimum versions.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Commit in fixes line sets the DV (Direct Verbs) flow engine as default.
Newer versions of DV flow engine use the DR (Direct Rules) features.
DR is supported from RDMA Core library version rdma-core-24.0.
This cause failure to start port when using older rdma-core version,
without DR support.
This patch selects DV flow engine if rdma-core version is v24.0 or
higher. Verbs flow engine is selected otherwise.
Fixes: cd4569d2bf ("net/mlx5: change default flow engine to DV")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
As the result of testing it was found that some hosts have
the performance penalty imposed by required write memory barrier
after doorbell writing. Before 19.08 release there was some
heuristics to decide whether write memory barrier should be
performed. For the bursts of recommended size (or multiple)
it was supposed there were some extra ongoing packets in the
next burst and write memory barrier may be skipped (supposed
to be performed in the next burst, at least after descriptor
writing).
This patch restores that behaviour, the devargs tx_db_nc=2
must be specified to engage this performance tuning feature.
Fixes: 8409a28573 ("net/mlx5: control transmit doorbell register mapping")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The default flow engine is Verbs flow engine, for legacy reasons.
This patch changes the default to DV flow engine (dv_flow_en = 1).
Documentation is updated accordingly.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch implements use of the API for LRO aggregated packet
max size.
Rx queue create is updated to use the relevant configuration.
Documentation is updated accordingly.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The rdma core library can map doorbell register in two ways,
depending on the environment variable "MLX5_SHUT_UP_BF":
- as regular cached memory, the variable is either missing or
set to zero. This type of mapping may cause the significant
doorbell register writing latency and requires explicit
memory write barrier to mitigate this issue and prevent
write combining.
- as non-cached memory, the variable is present and set to
not "0" value. This type of mapping may cause performance
impact under heavy loading conditions but the explicit write
memory barrier is not required and it may improve core
performance.
The new devarg is introduced "tx_db_nc", if this parameter is
set to zero, the doorbell register is forced to be mapped to
cached memory and requires explicit memory barrier after
writing to. If "tx_db_nc" is set to non-zero value the doorbell
will be mapped as non-cached memory, not requiring the memory
barrier. If "tx_db_nc" is missing the behaviour will be defined
by presence of "MLX5_SHUT_UP_BF" in environment. If variable
is missed the default value zero will be set for ARM64 hosts
and one for others.
In run time the code checks the mapping type and provides the
memory barrier after writing to tx doorbell register if it is
needed. The mapping type is extracted directly from the
uar_mmap_offset field in the queue properties.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The PMD parameter dv_xmeta_en is added to control extensive
metadata support. A nonzero value enables extensive flow
metadata support if device is capable and driver supports it.
This can enable extensive support of MARK and META item of
rte_flow. The newly introduced SET_TAG and SET_META actions
do not depend on dv_xmeta_en parameter, because there is
no compatibility issue for new entities. The dv_xmeta_en is
disabled by default.
There are some possible configurations, depending on parameter
value:
- 0, this is default value, defines the legacy mode, the MARK
and META related actions and items operate only within NIC Tx
and NIC Rx steering domains, no MARK and META information
crosses the domain boundaries. The MARK item is 24 bits wide,
the META item is 32 bits wide.
- 1, this engages extensive metadata mode, the MARK and META
related actions and items operate within all supported steering
domains, including FDB, MARK and META information may cross
the domain boundaries. The ``MARK`` item is 24 bits wide, the
META item width depends on kernel and firmware configurations
and might be 0, 16 or 32 bits. Within NIC Tx domain META data
width is 32 bits for compatibility, the actual width of data
transferred to the FDB domain depends on kernel configuration
and may be vary. The actual supported width can be retrieved
in runtime by series of rte_flow_validate() trials.
- 2, this engages extensive metadata mode, the MARK and META
related actions and items operate within all supported steering
domains, including FDB, MARK and META information may cross
the domain boundaries. The META item is 32 bits wide, the MARK
item width depends on kernel and firmware configurations and
might be 0, 16 or 24 bits. The actual supported width can be
retrieved in runtime by series of rte_flow_validate() trials.
If there is no E-Switch configuration the ``dv_xmeta_en`` parameter is
ignored and the device is configured to operate in legacy mode (0).
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This adds new device id to the list of Mellanox devices
that runs mlx5 PMD.
- ConnectX-6DX device ID
- ConnectX-6DX SRIOV device ID
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Rx queue for LRO is created using DevX. Flows created on this queue
must use the DV flow engine.
This patch adds check of dv_flow_en=1 when configuring LRO support
on device spawn.
Documentation is updated accordingly.
Fixes: 175f1c21d0 ("net/mlx5: check conditions to enable LRO")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for matching flows on Geneve headers.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The hardware may have limitations on maximal amount of
supported Tx descriptors building blocks (WQEBB). Application
requires the Tx queue must accept the specified amount of packets.
If inline data feature is engaged the packet may require more WQEBBs
and overall amount of blocks may exceed the hardware capabilities.
Application has to make a trade-off between Tx queue size and maximal
data inline size.
In case if the inline settings are not requested explicitly with
devarg keys the default values are used. This patch adjusts the
applied default values if large Tx queue size is requested and
default inline settings can not be satisfied due to hardware
limitations.
The explicitly requested inline setting may be aligned (enlarging
only) by configurations routines to provide better WQEBB filling,
this implicit alignment is the subject for adjustment either.
The warning message is emitted to the log if adjustment happens.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for modifying the VID of the outermost VLAN
header already present in the packet.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for modifying the VLAN ID (VID) field
in an about-to-be-pushed VLAN header.
This feature can only modify the VID field of a new VLAN header yet
to be pushed. It does not support modifying an existing or already
pushed VLAN headers.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for modifying the VLAN priority (PCP) field
in about-to-be-pushed VLAN header.
This feature can only modify the PCP field of a new VLAN header yet
to be pushed. It does not support modifying an existing or already
pushed VLAN headers.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for RTE_FLOW_ACTION_TYPE_OF_PUSH_VLAN using
direct verbs flow rules.
If present in the flow, The VLAN default values are taken from the
VLAN item configuration.
In this commit only the VLAN TPID value can be set since VLAN
modification actions are not supported yet.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit adds support for RTE_FLOW_ACTION_TYPE_OF_POP_VLAN via
direct verbs flow rules.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Introduces the possible limitations on maximal Tx queue
size in descriptors if Tx inline data are enabled.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch updates mlx5 documentation in parts:
- txq_inline_min parameter is described in more details,
values are fixed
- maximal amount of segments in multi-segment packets.
Fixes: 38b4b397a5 ("net/mlx5: add Tx configuration and setup")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
Add firmware config for MPLS and DevX (required by LRO and DR).
Add a table for queue offloads requirements.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
Some details about libibverbs were missing:
- automatic detection by meson
- main ways to access the device
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
It is not needed to use "console" syntax highlighting
for literal blocks.
The file is easier to read by removing the code-block lines
and simply having double colons in previous line.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
These are simple fixes of punctuation, anchor placement
or wording.
The table format is fixed to avoid having a long line
in the first column.
Fixes: 909be50a34 ("doc: update Mellanox guides and release notes")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
The command mlxconfig was not enough explained and too much verbose
at the same time.
The syntax is now explained in introduction before listing the options,
without repeating the commands.
Some options, which are explained elsewhere in the doc,
are added to this list.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
This patch fixes the default settings for packet size to inline
with Enhanced Multi-Packet Write feature, allowing 256B packets
to be inlined with Out-Of-the-Box settings.
Fixes: 50724e1bba ("net/mlx5: update Tx definitions")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Flow control was not documented as a supported feature
since the first fill of features matrix for mlx drivers.
Flow API and CRC offload flag support in mlx4 were missing in the
feature matrix when they were implemented (see below commits).
Fixes: 46d5736a70 ("net/mlx4: support basic flow items and actions")
Fixes: ce07b1514d ("net/mlx4: fix CRC stripping capability report")
Fixes: e86b85ca75 ("doc: fill nics features matrix for mlx")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
When LRO offload is configured in Rx queue, the HW may coalesce TCP
packets from same TCP connection into single packet.
In this case the SW should fix the relevant packet headers because
the HW doesn't update them according to the new created packet
characteristics but provides the update values in the CQE.
Add update header code to the regular Rx burst function to support LRO
feature.
Make sure the first mbuf has enough space to include each TCP header,
otherwise the header update may cross mbufs what complicates the
operation too match.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Patch [1] zeroes the mbuf headroom when the port is configured with LRO
because when working with more than one stride per packet the HW cannot
guaranty an headroom in the start stride of each packet.
Change the solution to support mbuf headroom by adding an empty buffer
as the first packet segment, scatter mode must be enabled to support it.
[1] http://patches.dpdk.org/patch/56912/
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
LRO packet may consume all the stride memory, hence the PMD cannot
guaranty head-room for the LRO mbuf.
The issue is lack in HW support to write the packet in offset from the
stride start.
A new striding RQ feature may be added in CX6 DX to allow head-room and
tail-room for the LRO strides.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Use DevX API to read device LRO capabilities.
Check if LRO is supported and can be enabled.
Check if MPRQ is supported and can be used.
Enable MPRQ for LRO use if not enabled by user.
Added note for mlx5_mprq_enabled(), to emphasize that LRO
enables MPRQ.
Disable CQE compression and CRC stripping if LRO is enabled.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add command-line argument to set LRO session timeout.
Add LRO settings struct in PMD configuration struct.
Add support of LRO offload in port configuration.
Add macros and function to check if LRO is supported and enabled.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch introduces new mlx5 PMD devarg options:
- txq_inline_min - specifies minimal amount of data to be inlined into
WQE during Tx operations. NICs may require this minimal data amount
to operate correctly. The exact value may depend on NIC operation
mode, requested offloads, etc.
- txq_inline_max - specifies the maximal packet length to be completely
inlined into WQE Ethernet Segment for ordinary SEND method. If packet
is larger the specified value, the packet data won't be copied by the
driver at all, data buffer is addressed with a pointer. If packet
length is less or equal all packet data will be copied into WQE.
- txq_inline_mpw - specifies the maximal packet length to be completely
inlined into WQE for Enhanced MPW method.
Driver documentation is also updated.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch removes the existing Tx datapath code
as preparation step before introducing the new
implementation. The following entities are being
removed:
- deprecated devargs support
- tx_burst() routines
- related PRM definitions
- SQ configuration code
- Tx routine selection code
- incompatible Tx completion code
The following devargs are deprecated and ignored:
- "txq_inline" is going to be converted to "txq_inline_max"
for compatibility issue
- "tx_vec_en"
- "txqs_max_vec"
- "txq_mpw_hdr_dseg_en"
- "txq_max_inline_len" is going to be converted
to "txq_inline_mpw" for compatibility issue
The deprecated devarg keys are recognized by PMD
and ignored/converted to the new ones in order not
to block device probing.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Enabled IP-in-IP tunnel type support on DV/DR flow engine.
This includes the following combination:
- IPv4 over IPv4
- IPv4 over IPv6
- IPv6 over IPv4
- IPv6 over IPv6
MLX5 NIC supports IP-in-IP tunnel via FLEX Parser so
need to make sure fw using FLEX Paser profile 0.
mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=0
The example testpmd commands would be:
- Match on IPv4 over IPv4 packets and do inner RSS:
testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x04 /
ipv4 / udp / end actions rss level 2 queues 0 1 2 3 end / end
- Match on IPv6 over IPv4 packets and do inner RSS:
testpmd> flow create 0 ingress pattern eth / ipv4 proto is 0x29 /
ipv6 / udp / end actions rss level 2 queues 0 1 2 3 end / end
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
On DV/DR flow engine, MLX5 can match on ICMP/ICMP6's code and type field
via FLEX Parser, which can be enabled by config FW using FLEX Parser
profile 2:
mlxconfig -d <mst device> -y set FLEX_PARSER_PROFILE_ENABLE=2
The testpmd commands could be:
testpmd> flow create 0 ingress pattern eth / ipv4 /
icmp type is 8 code is 0 / end
actions rss queues 0 1 end / end
testpmd> flow create 0 ingress pattern eth / ipv6 /
icmp6 type is 128 code is 0 / end
actions rss queues 0 1 end / end
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This commit removes the support of configuring the device E-switch
using TCF since it is now possible to configure it via DR (direct
verbs rules), and by that to also remove the PMD dependency in libmnl.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Add a global function in the PMD which dumps debug information to
specific file.
The data can be printed in hexadecimal format or as regular string.
The number of debug files per PMD entity should be limited by a new PMD
probe parameter called max_dump_files_num.
The files will be created in the /var/log directory or in the current
directory.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch adds some missing features to Mellanox drivers release notes.
It also updates the mlx5/mlx4 documentations.
Fixes: d85b204b5d ("doc: update release notes for Mellanox drivers")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.
If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.
To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Secondary process is not allowed to register MR due to a restriction of
library and kernel driver.
Fixes: 7e43a32ee0 ("net/mlx5: support externally allocated static memory")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Rather than using linuxapp and bsdapp everywhere, we can change things to
use the, more readable, terms "linux" and "freebsd" in our build configs.
Rather than renaming the configs we can just duplicate the existing ones
with the new names using symlinks, and use the new names exclusively
internally. ["make showconfigs" also only shows the new names to keep the
list short] The result is that backward compatibility is kept fully but any
new builds or development can be done using the newer names, i.e. both
"make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc"
work.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Some drivers (mlx, mvpp2, sfc) support the flow isolated mode,
but the feature was not advertised.
A reference to the feature description is added for each driver.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Mellanox EN package is supported along with Mellanox OFED. Mellanox EN is
distriubuted for Ethernet users.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Rx packet padding is supposed to be set by an environment variable -
MLX5_PMD_ENABLE_PADDING, but it has been missing for some time by mistake.
Rather than using such a variable, a PMD parameter (rxq_pkt_pad_en) is
added instead.
Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Reviewed-by: Erez Ferber <erezf@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The libraries provided by rdma-core may be statically linked
if enabling CONFIG_RTE_IBVERBS_LINK_STATIC in the make-based build.
If CONFIG_RTE_BUILD_SHARED_LIB is disabled, the applications
will embed the mlx PMDs with ibverbs and the mlx libraries.
If CONFIG_RTE_BUILD_SHARED_LIB is enabled,
the mlx PMDs will embed ibverbs and the mlx libraries.
Support with meson may be added later.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Rename options CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS and
CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS to a single option
CONFIG_RTE_IBVERBS_LINK_DLOPEN.
Rename meson option enable_driver_mlx_glue to ibverbs_link.
There was no good reason for setting a different link option
for mlx4 and mlx5. Having a single common option makes it
easier to understand and unify make and meson systems.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This commit includes the add of:
- ConnectX-6 device ID
- ConnectX-6 SRIOV device ID
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The imissed counters (number of packets dropped because the queues were
full) were actually reported through xstats as "rx_out_of_buffer"
but was not reported through stats.
Following a recent discussion on the ML, as there is no way to tell the
user if a counter is implemented or not, this should be considered a
bug. For example, user looking at imissed will think the packets are
lost before reaching the device.
Signed-off-by: Tom Barbette <barbette@kth.se>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch adds limitation notice for MLX5 PMD regarding
VXLAN tunnels support on E-Switch Flows.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
This patch adds limitation notice for MLX5 PMD.
IPv6 multicast messages are not received on VM when promiscuous
and allmulticast modes are off, due to netlink restriction.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Would be good to add also a code which disable the dv_flow_en
the user requested. However such support will need to use new netlink
command to query the switchdev mode from the underlying kernel.
Considering the current 18.11 release is close to RC3, only a
documentation is added.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Applications which add RSS rules must supply an RSS key and length.
If an application is only interested in default RSS operation it
should not care about the exact RSS key.
By setting the key to NULL - the PMD will use the default RSS key.
In addition if the application does not care about the RSS type it can
set it to 0 and the PMD will use the default type (ETH_RSS_IP).
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>