The RSS expand function for IP-in-IP tunnel type is missed,
which leads to create following flow failed:
flow create 0 ingress pattern eth / ipv4 proto is 4 /
ipv4 / udp / end actions rss queues 0 1 end level 2
types ip ipv4-other udp ipv4 ipv4-frag end /
mark id 221 / count / end
In order to make RSS expand function working correctly,
now the way to check whether a IP tunnel existing is to
check whether there is the second IPv4/IPv6 item and whether the
first IPv4/IPv6 item's next protocl is IPPROTO_IPIP/IPPROTO_IPV6.
For example:
... pattern eth / ipv4 proto is 4 / ipv4 / ....
Fixes: 5e33bebdd8 ("net/mlx5: support IP-in-IP tunnel")
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Function flow_dv_zero_encap_udp_csum() uses a while loop to iterate
over vlan items in flow rule.
Pointer next_hdr is incremented to the next item before it is used,
so the first item is skipped.
This patch moves the incrementing of next_hdr to the correct place.
Fixes: bf1d7d9a03 ("net/mlx5: zero out UDP checksum in encapsulation")
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
When the link is down, the link speed returned by ethtool is
UINT32_MAX and the link status is 0.
In this case, the DPDK ethdev link speed should be set to
ETH_SPEED_NUM_NONE.
Otherwise since link speed is non-zero but link status is zero, this
is an inconsistent situation and -EAGAIN is returned, which is not right.
Fixes: 1884087198 ("net/mlx5: fix support for newer link speeds")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
There is the limit on completion descriptor fetch to improve
latency. If burst size is large there might be not enough
resources freed in completion processing. This fix reiterates
sending loop and allows multiple completion descriptor fetch
and processing.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch fixes the default settings for packet size to inline
with Enhanced Multi-Packet Write feature, allowing 256B packets
to be inlined with Out-Of-the-Box settings.
Fixes: 50724e1bba ("net/mlx5: update Tx definitions")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
If the minimal inline data are required the data inline feature
must be engaged. There were the incorrect settings enabling the
entire small packet inline (in size up to 82B) which may result
in sending rate declining if there is no enough cores. The same
problem was raised if inline was enabled to support VLAN tag
insertion by software.
Fixes: 38b4b397a5 ("net/mlx5: add Tx configuration and setup")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The completion loop speed optimizations for error-free
operations are done - no CQE field fetch on each loop
iteration. Also, code size is oprimized - the flush
buffers routine is invoked once.
Fixes: 318ea4cfa1 ("net/mlx5: fix Tx completion descriptors fetching loop")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The debug assert wrongly triggers on the inline data 18B,
this should be passed successfully.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The patch [Fixes] sets the default value of required minimal
inline data to 0 bytes. On some configurations (depends
on switchdev/legacy settings and FW version/settings)
the ConnectX-4LX NIC requires minimal 18 bytes of
Tx descriptor inline data to operate correctly.
Wrongly set to 0 default value may prevent NIC from operating
with out-of-the-box settings, this patch reverts default
value for ConnectX-4LX back to 18 bytes (inline L2).
Fixes: 9f350504bb ("net/mlx5: fix ConnectX-4LX minimal inline data limit")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The rte_flow_item_vlan has the inner_type, which is missing on
DR/DV flow engine.
By adding this support, the example testpmd commands could be:
- matching all vlan traffic with id 2:
testpmd> flow create 0 ingress pattern eth / vlan vid is 2 / end
actions queue index 2 / end
- matching all ipv4 traffic in vlan with id 2:
testpmd> flow create 0 ingress pattern eth / vlan vid is 2
inner_type is 0x0800 / end actions queue index 2 / end
Fixes: fc2c498ccb ("net/mlx5: add Direct Verbs translate items")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Some flow rules were not configured.
Part of the code in function flow_dv_matcher_enable() is enclosed in
'#ifdef HAVE_MLX5DV_DR' preprocessor directive.
Using this directive is not needed here, and prevents compilation of
relevant code.
This patch removes the unnecessary preprocessor directive.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Function mlx5_flow_validate_item_vlan() validates the user setting
is supported by NIC, using a mask with TCI mask 0x0fff.
This check will reject a flow rule specifying a vlan pcp item.
This patch updates mlx5_flow_validate_item_vlan() to use mask 0xffff,
so flow rules with vlan pcp item are accepted.
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
mlx4_dev_info_get calls mlx4_get_ifname, but mlx4_get_ifname
uses priv->ctx which is not a valid pointer in a secondary
process. The fix is to cache the value in primary.
In the primary process, get and store the interface index of
the device so that secondary process can see it.
Bugzilla ID: 320
Fixes: 61cbdd4194 ("net/mlx4: separate device control functions")
Cc: stable@dpdk.org
Reported-by: Suyang Ju <sju@paloaltonetworks.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Matan Azrad <matan@mellanox.com>
The function mlx5_set_min_inline() includes a switch() that checks
various PCI device IDs in order to set the txq_inline_min value. No
value is set when the PCI device ID matches the ConnectX-5 adapters,
resulting in an assert() failure later in the function
mlx5_set_txlimit_params().
This error was encountered on an IBM Power 9 system running RHEL 7.6
w/o Mellanox OFED installed.
Fixes: 38b4b397a5 ("net/mlx5: add Tx configuration and setup")
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
MLX5 PMD limits the number of SW steering tables to 32.
This patch updates the limit to 65535, to allow wide range of values.
Fixes: e2b4925ef7 ("net/mlx5: support Direct Rules E-Switch")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
On some virtual setups (particularly on ESXi) when we have SR-IOV and
E-Switch enabled there is the problem to receive VLAN traffic on VF
interfaces. The NIC driver in ESXi hypervisor does not setup E-Switch
vport setting correctly and VLAN traffic targeted to VF is dropped.
The patch provides the temporary workaround - if the rule
containing the VLAN pattern is being installed for VF the VLAN
network interface over VF is created, like the command does:
ip link add link vf.if name mlx5.wa.1.100 type vlan id 100
The PMD in DPDK maintains the database of created VLAN interfaces
for each existing VF and requested VLAN tags. When all of the RTE
Flows using the given VLAN tag are removed the created VLAN interface
with this VLAN tag is deleted.
The name of created VLAN interface follows the format:
evmlx.d1.d2, where d1 is VF interface ifindex, d2 - VLAN ifindex
Implementation limitations:
- mask in rules is ignored, rule must specify VLAN tags exactly,
no wildcards (which are implemented by the masks) are allowed
- virtual environment is detected via rte_hypervisor() call,
and the type of hypervisor is checked. Currently we engage
the workaround for ESXi and unrecognized hypervisors (which
always happen on platforms other than x86 - it means workaround
applied for the Flow over PCI VF). There are no confirmed data
the other hypervisors (HyperV, Qemu) need this workaround,
we are trying to reduce the list of configurations on those
workaround should be applied.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch fixes (dereference after null check) coverity issue.
The address of first segmented packets was not set correctly during
reassembling packets which led to this issue.
Coverity issue: 343416
Fixes: fe65e1e1ce ("fm10k: add vector scatter Rx")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
This patch fixes (dereference after null check) coverity issue.
The address of first segmented packets was not set correctly during
reassembling packets which led to this issue.
Coverity issue: 343447
Fixes: 319c421f38 ("net/avf: enable SSE Rx Tx")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
This patch fixes (dereference after null check) coverity issue.
The address of first segmented packets was not set correctly during
reassembling packets which led to this issue.
Coverity issue: 343422, 343403
Fixes: ca74903b75 ("net/i40e: extract non-x86 specific code from vector driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
This patch fixes (dereference after null check) coverity issue.
The address of first segmented packets was not set correctly during
reassembling packets which led to this issue.
Coverity issue: 343452, 343407
Fixes: c68a52b8b3 ("net/ice: support vector SSE in Rx")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
This patch fixes (dereference after null check) coverity issue.
The address of first segmented packets was not set correctly during
reassembling packets which led to this issue.
Coverity issue: 13245
Fixes: 8a44c15aa5 ("net/ixgbe: extract non-x86 specific code from vector driver")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Add return value checking when reading configure information from PCI
register to avoid Coverity issue.
Fixes: 1fc97012 ("net/e1000: fix i219 hang on reset/close")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Given the fact that dev parameter is used in ice_dev_configure.
Fixes: 50370662b7 ("net/ice: support device and queue ops")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When working as a secondary process, it uses eth_memif_rx in PMD egress.
It should be eth_memif_tx.
Fixes: c41a04958b ("net/memif: support multi-process")
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Fixed a check in bnxt_alloc_hwrm_rx_ring() while initializing
the rx ring.
Driver should not change "deferred_start" status of rx/tx queues.
It should get the status in queue_setup_op() and use that value.
Fixes: 9b63c6fd70 ("net/bnxt: support Rx/Tx queue start/stop")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Octeontx2 PMD's mailbox client uses device memory to send messages
to mailbox server in the admin function Linux kernel driver.
The device memory used for the mailbox communication needs to
be qualified as volatile memory type to avoid unaligned device
memory accesses because of compiler's memory access coalescing.
This patch modifies the mailbox request and responses as volatile
type which were non-volatile earlier and accessed from unaligned
memory addresses which resulted in bus errors on Fedora 30 with
gcc 9.1.1.
Fixes: 2b71657c86 ("common/octeontx2: add mbox request and response definition")
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Since 18.11, it is suggested that driver should release all its private
resources at the dev_close routine. So all resources previously released
in remove routine are now released at the dev_close routine, and the
dev_close routine will be called in driver remove routine in order to
support removing a device without closing its ports.
Above behavior changes are supported by setting RTE_ETH_DEV_CLOSE_REMOVE
flag during probe stage.
Signed-off-by: Liron Himi <lironh@marvell.com>
Reviewed-by: Yuri Chipchev <yuric@marvell.com>
Fix the PCIe detach segfault by releasing eth_dev resources
by adding nicvf cleanup support on PCI detach.
Fixes: fdf91e0f2f ("drivers/net: do not use ethdev driver")
Cc: stable@dpdk.org
Signed-off-by: Amit Gupta <agupta3@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Update workaround changes for erratas that are fixed on 96xx A1.
This patch also enables cq drop for all the passes for
maintaining performance along with updating a default
Rx ring size in dev_info.
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Patch extends minimum supported max_sqb_count devarg value
such that it can limit the max sqb count to 8 buffers and
also defines NIX_DEF_SQB and uses it to compute the number
of sqe buffers required for the egress traffic.
NIX_DEF_SQB is defined as 16 which is optimal across multiple
octeontx2 platforms to scale up the performance proportional
to the corresponding port/queue to lcore mappings.
Fixes: fb0198b7dc ("net/octeontx2: add devargs parsing functions")
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
From B0 HW revision onwards, HW can drop the Rx and L2 error packets.
Enable this by default if the feature is available.
Since this bit field is used as reserved in old HW revisions,
No need to have additional HW version check.
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
During an if-condition evaluation, a 2-bit flag evaluates to 'true' for
'0x1', '0x2' and '0x3'. Thus, from this perspective these flags are
indistinguishable. To make them distinct, respective bits must be
extracted with a mask and then checked for strict equality.
Specifically here, even if `PKT_TX_UDP_CKSUM` (value '0x3') was set, the
expression `mbuf->ol_flags & PKT_TX_TCP` (the second flag of value
'0x1') is evaluated first and the result is 'true'. In consequence, for
UDP packets the execution flow enters an incorrect branch.
Fixes: 56b8b9b7e5 ("net/ena: convert to new Tx offloads API")
Cc: stable@dpdk.org
Reported-by: Eduard Serra <eserra@vmware.com>
Signed-off-by: Maciej Bielski <mba@semihalf.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
When using RTE_PKTMBUF_HEADROOM as 0, dpaa driver throws compilation error
error "Annotation requirement is more than RTE_PKTMBUF_HEADROOM"
This patch change it into run-time check.
Bugzilla ID: 335
Fixes: beb2a7865d ("bus/fslmc: define hardware annotation area size")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
When using RTE_PKTMBUF_HEADROOM as 0, dpaa driver throws compilation error
error "Annotation requirement is more than RTE_PKTMBUF_HEADROOM"
This patch change it into run-time check.
Bugzilla ID: 335
Fixes: ff9e112d78 ("net/dpaa: add NXP DPAA PMD driver skeleton")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
There should not be blank lines at end of files.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Fixed to return the checksum status of rx packets by setting
"ol_flags" correctly in vector mode receive.
These changes have been there for non vector mode receive.
In vector mode receive also indicate inner and outer checksum
errors individually in "ol_flag" to indicate L3 and L4 error.
Fixes: bc4a000f2f ("net/bnxt: implement SSE vector mode")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
There is a bug in context memory allocation because of which
it results in reusing the context memory allocated for the first
port while allocating memory for next ports.
Fix it by passing the port id in the name field while
allocating context memory.
Fixes: f8168ca0e6 ("net/bnxt: support thor controller")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Add extern to variable declaration to avoid some compiler treating it
as variable definition.
build error log:
lib/librte_pmd_virtio.a(vhost_kernel.o):(.rodata+0x110):
multiple definition of `vhost_msg_strings'
lib/librte_pmd_virtio.a(vhost_user.o):(.data.rel.ro.local+0x0):
first defined here
lib/librte_pmd_virtio.a(virtio_user_dev.o):(.rodata+0xe8):
multiple definition of `vhost_msg_strings'
lib/librte_pmd_virtio.a(vhost_user.o):(.data.rel.ro.local+0x0):
first defined here
Fixes: 33d24d65fe ("net/virtio-user: abstract backend operations")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In case of QINT interrupt occurrence, SW fails to clear the QINT
line resulting in recursive interrupts because currently interrupt
handler gets the cause of the interrupt by reading
NIX_LF_RQ[SQ/CQ/AURA/POOL]_OP_INT but does not write 1 to clear
RQ[SQ/CQ/ERR]_INT field in respective NIX_LF_RQ[SQ/CQ/AURA/POOL]_OP_INT
registers.
Fixes: dc47ba15f6 ("net/octeontx2: handle queue specific error interrupts")
Fixes: 50b95c3ea7 ("mempool/octeontx2: add NPA IRQ handler")
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
The packets transmitting in mlx5 is performed by building
Tx descriptors (WQEs) and sending last ones to the NIC.
The descriptor can contain the special flags, telling the NIC
to generate Tx completion notification (CQEs). At the beginning
of tx_burst() routine PMD checks whether there are some Tx
completions and frees the transmitted packet buffers.
The flags to request completion generation must be set once
per specified amount of packets to provide uniform stream
of completions and freeing the Tx queue in uniform fashion.
The previous implementation sets the completion request
generation once per burst, if burst size if big enough it may
latency in CQE generation and freeing large amount of buffers
in tx_burst routine on multiple completions which also
affects the latency and even causes the Tx queue overflow
and Tx drops.
This patches enforces the completion request will be set
in the exact Tx descriptor if specified amount of packets
is already sent.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Mellanox ConnectX-4LX NIC in configurations with disabled
E-Switch can operate without minimal required inline data
into Tx descriptor. There was the hardcoded limit set to
18B in PMD, fixed to be no limit (0B).
Fixes: 38b4b397a5 ("net/mlx5: add Tx configuration and setup")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
This patch limits the amount of fetched and processed
completion descriptors in one tx_burst routine call.
The completion processing involves the buffer freeing
which may be time consuming and introduce the significant
latency, so limiting the amount of processed completions
mitigates the latency issue.
Fixes: 18a1c20044 ("net/mlx5: implement Tx burst template")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Enabling LRO offload per queue makes sense because the user will
probably want to allocate different mempool for LRO queues - the LRO
mempool mbuf size may be bigger than non LRO mempool.
Change the LRO offload to be per queue instead of per port.
If one of the queues is with LRO enabled, all the queues will be
configured via DevX.
If RSS flows direct TCP packets to queues with different LRO enabling,
these flows will not be offloaded with LRO.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
When a user configures LRO in the port offloads, he probably wants each
TCP packet will have a chance to open an LRO session.
The PMD wasn't configure LRO in the flow TIR if the flow is not
explicitly configured TCP item despite the flow included TCP traffic.
For example, the next flows were not LRO offloaded:
pattern eth / end, pattern eth / ip / end, pattern eth / ipv6 / end.
Enable LRO configuration for all the TIRs if LRO is configured in the
port.
No performance impact for non-LRO traffic in these TIRs.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
When LRO offload is configured in Rx queue, the HW may coalesce TCP
packets from same TCP connection into single packet.
In this case the SW should fix the relevant packet headers because
the HW doesn't update them according to the new created packet
characteristics but provides the update values in the CQE.
Add update header code to the regular Rx burst function to support LRO
feature.
Make sure the first mbuf has enough space to include each TCP header,
otherwise the header update may cross mbufs what complicates the
operation too match.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
The alignment requested by the FW for WQ buffer allocation is 512.
Change it from cache line alignment to 512.
Fixes: dc9ceff73c ("net/mlx5: create advanced RxQ via DevX")
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>