When bond slave devices cannot transmit all packets in bufs array,
tx_burst callback shall merge the un-transmitted packets back to
bufs array. Recent merge logic introduced a bug which causes
invalid mbuf addresses being written to bufs array.
When caller frees the un-transmitted packets, due to invalid addresses,
application will crash.
The fix is avoid shifting mbufs, and directly write un-transmitted
packets back to bufs array.
Fixes: 09150784a776 ("net/bonding: burst mode hash calculation")
Cc: stable@dpdk.org
Signed-off-by: Jia Yu <jyu@vmware.com>
Acked-by: Chas Williams <chas3@att.com>
Some NFP firmwares support live changes to the MAC address, but
this is not always true and the firmware advertises it accordingly.
This patch checks if firmware does not support live changes and
sets RTE_ETH_DEV_NOLIVE_MAC_ADDR in that case.
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Mark internal variables static to avoid potential redefinition
errors later on.
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
Reviewed-by: Yelena Krivosheev <yelena@marvell.com>
Fix used_bpools array initialization by using range initializer.
This way all necessary variables are properly initialized regardless
of PP2_NUM_PKT_PROC value.
Fixes: 0ddc9b815b11 ("net/mrvl: add net PMD skeleton")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
On the code after the below commits, the criteria to select the IPV4 or
IPV6 hash functions was the existence of some ETH_RSS_IPV4 RSS types on
the flow rule.
The check is wrong. For example ETH_RSS_NONFRAG_IPV4_TCP will not select
the IPV4 hash which will cause the packet to be spread in a bad way.
Fix it by adding the corresponding types needed for each hash selection.
Fixes: 592f05b29a25 ("net/mlx5: add RSS flow action")
Fixes: fd0b70316bca ("net/mlx5: support inner RSS computation")
Cc: stable@dpdk.org
Reported-by: Yaroslav Brustinov <ybrustin@cisco.com>
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
On ConnectX-4 Lx the Multi Packet Send (MPW) feature is considered
un-secure, as on some cases were the application provides incorrect mbufs
on the Tx burst the host or NIC can get stuck.
Hence, disabling the feature by default for this specific NIC.
Users can still enable this feature and enjoy the performance gain
(mostly for low number of cores) by using the txq_mpw_en devarg.
This patch will impact the out of the box performance of some application
using ConnectX-4 Lx for the sack of security and robustness.
Since we need different defaults based on the underlying device the mpw
field in the configuration struct was extended to contain also the
MLX5_ARG_UNSET option.
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
LMTST does not guarantee packet content get synced with L2C when
HW access the packet for transmitting.
Adding coherent IO write barrier will make sure HW sees the correct
packet if its modified.
Fixes: 9e747589bd4c ("net/octeontx: add packet transmit burst function")
Cc: stable@dpdk.org
Reported-by: Vikas Aggarwal <vikas.aggarwal@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
The event buffer was changed to be a fixed size value,
had a couple of issues. The big one is that rte_free was still
being called for a pointer that was not setup with rte_malloc().
The event buffer was also too small to handle heavy receive
traffic; and running the event buffer out would crash
the application.
Fix by going back to a dynamically resized event buffer.
And grow it by 25% to avoid lots of realloc's.
Fixes: 530af95a7849 ("bus/vmbus: avoid signalling host on read")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Add tx_done_cleanup ethdev hook to allow application to
control if/when it wants completions to be handled.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Tune the vmbus connection so the host scans faster. This improves
transmit performance. The host default value is 100us but setting
to 50us reduces packet loss significantly.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Fix the error handling in setting up transmit buffer.
If setting up chimney buffer fails, then it is not connected so
no need to send disconnect.
Allow for some unused area if full area is not used.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
We assume VLAN ethtertype is 0x8100 in get_vlan_offset() function,
but it could be 0x88A8 if QinQ is supported.
Fixes: 06fe78b98ccd ("bond: add mode 6")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Chas Williams <chas3@att.com>
The size of Rx queue is determined by dividing the number of descriptors by
the number of strides. As device can't support single slot queue, if the
number of descriptors is same as the number of strides, MPRQ shouldn't be
enabled. Otherwise, this will cause HW fault. For example, if rxd is set to
512 with testpmd on ConnectX-4 Lx, PMD can't receive more than 512 packets
because the minimum number of strides for ConnectX-4 Lx is 512. Users have
to configure larger number of descriptors in this case.
Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
PMD did not pass down the intended queue id while
configuring the ntuple filter.
Fixes: 622075356e8f ("net/qede: support ntuple and flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
The rte_flow meaning of zero flow mask configuration is to match all
the range of the item value.
For example, the flow eth / ipv4 dst spec 1.2.3.4 dst mask 0.0.0.0
should much all the ipv4 traffic from the rte_flow API perspective.
>From some kernel perspectives the above rule means to ignore all the
ipv4 traffic (e.g. Ubuntu 16.04, 4.15.10).
Due to the fact that the tap PMD should provide the rte_flow meaning,
it is necessary to ignore the spec in case the mask is zero when it
forwards such like flows to the kernel.
So, the above rule should be translated to eth / ipv4 to get the
correct meaning.
Ignore spec configurations when the mask is zero.
Fixes: de96fe68ae95 ("net/tap: add basic flow API patterns and actions")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The host will notify netvsc device about failed UDP and TCP
checksum. Propagate that information into the mbuf.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
The Netvsc PMD was not filling in the packet type information.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
If MPRQ is enabled, a PMD-private mempool is allocated. For ConnectX-4 Lx,
the minimum number of strides is 512 which ConnectX-5 supports 8. This
results in quite small number of elements for the MPRQ mempool. For
example, if the size of Rx ring is configured as 512, only one MPRQ buffer
can cover the whole ring. If there's only one Rx queue is configured. In
the following code in mlx5_mprq_alloc_mp(), desc is 1 and obj_num will be
36 as a result.
desc *= 4;
obj_num = desc + MLX5_MPRQ_MP_CACHE_SZ * priv->rxqs_n;
However, rte_mempool_create_empty() has a sanity check to refuse large
per-lcore cache size compared to the number of elements. Cache flush
threshold should not exceed the number of elements of a mempool. For the
above example, the threshold is 32 * 1.5 = 48 which is larger than 36 and
it fails to create the mempool.
Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Multiple tunnel isn't allowed but MPLS over GRE should be accepted.
Fixes: a4a5cd21d20a ("net/mlx5: add flow MPLS item")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling allmulti
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as
rte_eth_allmulticast_enable/disable() fail to set the flag
(dev->data->all_multicast). The flag is used when starting traffic by
mlx5_traffic_enable(). When switching out of flow isolation mode, allmulti
mode will not be set even though it has been enabled.
Fixes: 0887aa7f27f3 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
mlx5_dev_ops_isolate doesn't have APIs for enabling/disabling promiscuous
mode as it can't be enabled in flow isolation mode. If the function
pointers are null, librte APIs such as rte_eth_promiscuous_enable/disable()
fail to set the flag (dev->data->promiscuous). The flag is used when
starting traffic by mlx5_traffic_enable(). When switching out of flow
isolation mode, promiscuous mode will not be set even though it has been
enabled.
Fixes: 0887aa7f27f3 ("net/mlx5: add new operations for isolated mode")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Don't signal host that receive ring has been read until all events
have been processed. This reduces the number of guest exits and
therefore improves performance.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
If the number of packets requested are already present in the
rx_ring then skip reading the ring buffer from the host.
If the ring between the poll and receive side is full, then don't
poll (let incoming packets stay on host).
If no more transmit descriptors are available, then still try and
flush any outstanding data.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Increase the size of the ring used to hold mbuf's received
but not processed. The default is now based off the size
of the receive mbuf pool not the number of sections from the host.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
The internal receive ring should be freed with rte_ring_free
not rte_free.
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Moved the link status validity check from the slave add to the slave
activation step. Otherwise slave add will fail for mode 4 if
the ports are all stopped but only one of them checked.
Removed activate slave call from slave add function.
Bugzilla ID: 52
Fixes: b77d21cc2364 ("ethdev: add link status get/set helper functions")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Race condition can appear in the bond_mode_8023ad_periodic_cb()
callback when bonding port is stopped, reconfigured and restarted.
Re-ordered calls in bond_ethdev_start() to have callback alarm set
after slave ports are reconfigured.
Fixes: 2efb58cbab6e ("bond: new link bonding library")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Chas Williams <chas3@att.com>
There are several tx/rx offload capabilities missing in AVF.
Add them in this patch since AVF supports them.
Fixes: 69dd4c3d0898 ("net/avf: enable queue and device")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This reverts commit 7546dc4a1331340ecb665af9af0a005bb8b657c8.
In i40e FDIR PMD code for checking programming status,
when the action of add FDIR filter is ok, i40e NIC will
not write back to programming status descriptor, so if
PMD code check DD is not done after period of time dealy,
it means the add or remove filter action is ok. It only write
back descriptor when fail. So, there is no issue for the original
code.
Fixes: 7546dc4a1331 ("net/i40e: fix check of flow director programming status")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This commit fixes an infinite loop bug that could occur
if the i40e AVX2 driver is used, and high traffic rates
cause the mempool from which the rxq pulls mbufs to become
empty.
The result would be an infinite loop of checking if we
should perform an rx rearm, calling the function and an
error return due the the mempool being emtpy.
The fix is to align the code in the AVX2 driver with the
SSE driver, where an if() is used instead of a while(),
allowing the thread to return from i40e rx function even
if the mempool is empty.
Fixes: dafadd73762e ("net/i40e: add AVX2 Rx function")
Cc: stable@dpdk.org
Reported-by: David Coyle <david.coyle@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Brendan Ryan <brendan.ryan@intel.com>
Tested-by: David Coyle <david.coyle@intel.com>
There's parsing error when using device argument
support-multi-driver or queue-num-per-vf or both.
Error log is "PMD: Error parsing device, invalid
key <support-multi-driver>" and "PMD: Error parsing
device, invalid key <queue-num-per-vf>". The root
cause is that device argument will be parsed in
different parsing functions with local valid key .
This patch fixes the issue by combine all supported
arguments's valid keys.
Fixes: cfdfca493cae ("net/i40e: fix multiple driver support")
Fixes: ee653bd80044 ("net/i40e: determine number of queues per VF at run time")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
TC message for VLAN POP is broken due to an unfinished nested attribute.
Fixes: 7ac6778d50e0 ("net/mlx5: add VLAN item and actions to switch flow rules")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
VLAN ID is not properly translated to TC due to swapped byte order.
Fixes: 7ac6778d50e0 ("net/mlx5: add VLAN item and actions to switch flow rules")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This commit fixes a segmentation fault observed when configuring
mlx5 with RSS flow rule containing invalid queues indices such as
negative numbers, queue numbers bigger than the number Rx queues the
PMD or has no queues at all.
Fixes: 592f05b29a25 ("net/mlx5: add RSS flow action")
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
When running testpmd with an mlx5 device and then executing at testpmd
prompt in a raw: "port start all" followed by "port stop all"
a new file named /var/tmp/net_mlx5_<socket num> is created as a result
of creating a new unix domain socket (used for communication between
the primary and secondary processes).
When the new unix socket file is created the old unix socket file should
have been removed. This commit fixes it by closing the old unix socket
just before creating the new one in function mlx5_socket_init()
Fixes: f8b9a3bad467 ("net/mlx5: install a socket to exchange a file descriptor")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
The below commit has added a graph based expansion logic for RSS rule to
satisfy Verbs requirements. With this logic, for example, the rule:
flow create 0 ingress pattern eth / end actions rss queues 0 1 end types
ipv4-tcp ipv6-tcp end / end
will be expanded into the rules:
flow create 0 ingress pattern eth / ipv4 / tcp / end actions rss queues 0 1
end types ipv4-tcp ipv6-tcp end / end
flow create 0 ingress pattern eth / ipv6 / tcp / end actions rss queues 0 1
end types ipv4-tcp ipv6-tcp end / end
flow create 0 ingress pattern eth / end actions queue index 0 / end
The below commit defined two graphs:
1. graph for the tunnel case which starts from the ETH item
2. graph for the non-tunnel case which starts from the ETH item
The graphs are ignoring the VLAN case. Hence rules with VLAN item will
fail to traverse the graph and it will result in flow rule creation error.
Adding the VLAN item to the existing graphs will not work as the flow
engine will reject any VLAN item without a specific vid.
To solve this case two new graphs were added (for the tunnel and
non-tunnel case) which contain the VLAN item and are being used only
when the VLAN item exists in the flow pattern.
Two cases left un-covered for the inner RSS:
1. The case were VLAN exists in the pattern as part of the inner headers
2. The case were VLAN exists in the pattern both in the outer and the
inner headers
Solving those cases will require to add two more graphs.
Holding a VLAN for the overlay network is not common, the subnets are
usually defined by the tunnel protocol, for example the VXLAN vni.
Hence adding those two graphs seems like an overkill at this point.
Based on needs one can add those to provide the full support.
Fixes: 592f05b29a25 ("net/mlx5: add RSS flow action")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
There are dedicated QP attributes, tunnel offload flag and mask, which
must be configured in order to allow part of the HW tunnel offloads.
So, if a QP is pointed by a tunnel flow, the above QP attributes
should be configured.
The mask configuration is wrongly only performed if an internal RSS was
configured by the user, while there is no reason to condition the
tunnel offloads in RSS configurations.
Consequently, some of the tunnel offloads was not performed by the HW
when a tunnel flow was configured, for example, the packet tunnel
types was not reported to the user.
Replace the internal RSS condition with the tunnel flow condition.
Fixes: df6afd377ace ("net/mlx5: remove useless arguments in hrxq API")
Signed-off-by: Matan Azrad <matan@mellanox.com>
Stripping binaries at build time is usually a bad thing since it makes
impossible to generate (split) debug symbols and this can lead to a more
difficult debugging.
Fixes: 59b91bec12c6 ("net/mlx5: spawn rdma-core dependency plug-in")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Stripping binaries at build time is usually a bad thing since it makes
impossible to generate (split) debug symbols and this can lead to a more
difficult debugging.
Fixes: 27cea11686ff ("net/mlx4: spawn rdma-core dependency plug-in")
Cc: stable@dpdk.org
Signed-off-by: Timothy Redaelli <tredaelli@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>