There new kernel/rdma_core [1] supports matching on metadata
register instead of vport field to provide operations over
VF LAG bonding configurations. The patch retrieves parameters
and information about the way is engaged to match vport on E-Switch.
[1] http://patchwork.ozlabs.org/cover/1122170/
"Mellanox, mlx5 vport metadata matching"
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
If bonding Infiniband device is found the unified E-Switch
is supposed and the extra rdma-core/kernel support is needed
to retrieve vport indices. The patch introduces this feature
defines, bonding support check is added to probe routine.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
If device is VF LAG bonding one the port name includes
the bonding Infiniband device name and looks like:
82:00.0_mlx5_bond_0 - for master device port PF0
82:00.1_mlx5_bond_0_representor_5 - for representor
VF5 over PF1
where bonding Infiniband device mlx5_bond_0 controls
the 82:00.0 as PF0 and 82:00.1 as PF1 PCI functions.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The Mellanox NICs starting from ConnectX-5 support LAG over
NIC ports internally, implemented by the NIC firmware and hardware.
The multiport NIC presents multiple physical PCI functions (PF),
with SR-IOV multiple virtual PCI functions (VFs) might be presented.
With switchdev mode the VF representors are engaged and PFs and their
VFs are connected by internal E-Switch feature. Each PF and related VFs
have dedicated E-Switch and belong to dedicated switch domain.
If NIC ports are combined to support NIC the kernel drivers introduce
the single unified Infiniband multiport devices, and all only one
unified E-Switch with single switch domain combines master PF
all all VFs. No extra DPDK bonding device is needed.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
At device probing the device list to spawn was allocated
as dynamic size local variable. It was no possible to have
one unified exit point from routine due to compiler warnings.
This patch allocates the spawn device list directly with
rte_zmalloc() and it is possible to goto to unified exit
label from anywhere of the routine.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The routine mlx5_ibv_device_to_pci_addr() takes Infiniband
device list object, takes the device sysfs path from there
and retrieves PCI address. The routine may be implemented
in more generic way by taking sysfs path directly as parameter
and can be used for getting PCI address of netdevs.
The generic routine is renamed to mlx5_dev_to_pci_addr()
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Now all devices created over the same multiport IB device
have shared context containing the backing PCI device field.
For the VF LAG configurations it becomes possible the
representors might be connected to VF created over different
PFs. In this case representors have the different backing
PCI devices and mentioned field should be moved to device
private area.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The legacy filter API "flow director" has been superseded by rte_flow
since 2017. Remove comments in the enic guide regarding the deprecated
feature.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Probe for the PCI ID of a new mode of VF which will be added to VIC
adapter firmware. When fully implemented, the new mode will operate
independent of the Cisco Virtual Machine Fabric Extender (VM-FEX) and
will not need to be provisioned through libvirt. The new mode is dubbed
"Standalone vNic" mode or "SN" mode.
Also, minor formatting changes.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Packet processing in secondary process cannot work because rx_pkt_burst
and tx_pkt_burst in eth_dev are not initialized while probing device.
This patch is to the initialization.
Fixes: ee27edbe0c ("drivers/net: share vdev data to secondary process")
Cc: stable@dpdk.org
Signed-off-by: Yasufumi Ogawa <yasufum.o@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch adds command line support for Symmetric Toeplitz
hash configuration.
Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Currently, there are DEFAULT,TOEPLITZ and SIMPLE_XOR hash function.
To support symmetric hash by rte_flow RSS action, this patch adds
new hash function "Symmetric Toeplitz" which is supported by some hardware.
Signed-off-by: Simei Su <simei.su@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Ori Kam <orika@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver must dynamic log types.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver must dynamic log types.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver must dynamic log types.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver must dynamic log types.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver must dynamic log types.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The generic RTE_LOGTYPE_PMD is a historical relic and should
not be used. Every driver should register the logtypes
for itself.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
slave aggregator_port_id is in [0, RTE_MAX_ETHPORTS-1] range.
If RTE_MAX_ETHPORTS is > 8, we can hit out of bound accesses on
agg_bandwidth[] and agg_count[] arrays.
Fixes: 6d72657ce3 ("net/bonding: add other aggregator modes")
Cc: stable@dpdk.org
Signed-off-by: Hui Zhao <zhaohui8@huawei.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Chas Williams <chas3@att.com>
mode_bond_id and mode_band_id are slave ids, stored on 16bits.
Fixes: f8244c6399 ("ethdev: increase port id range")
Cc: stable@dpdk.org
Signed-off-by: Hui Zhao <zhaohui8@huawei.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
During PF/VF link update, a default speed value of 100M will be set
if get_link_info has failed or speed is unknown.
Consequently if PF is put in no-carrier state, VFs will switch to
"in carrier" state due to a link up + a link speed set to 100M
(default value if no speed detected).
To be consistent with linux drivers on which PF and VFs are in
same carrier state, sets default speed to undefined (instead of 100M)
and updates a link status of VF only if link is up and speed is
different from undefined.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Laurent Hardy <laurent.hardy@6wind.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The speed capability of X553 1GbE should be ETH_LINK_SPEED_1G |
ETH_LINK_SPEED_100M | ETH_LINK_SPEED_10M rather than ETH_LINK_SPEED_1G |
ETH_LINK_SPEED_10G. Correct it to fix the issue.
Fixes: e274f57322 ("ethdev: add speed capabilities")
Cc: stable@dpdk.org
Signed-off-by: Xiao Zhang <xiao.zhang@intel.com>
Reviewed-by: Xiaolong Ye <xiaolong.ye@intel.com>
Add support for setting VLAN PCP field via rte_flow API. Hardware
overwrites the entire 16-bit VLAN TCI field. So, both VLAN VID and
PCP actions must be specified.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Add support for matching VLAN fields via rte_flow API.
When matching VLAN pattern, the ethertype field in hardware
filter specification must contain VLAN header's ethertype, and
not Ethernet header's ethertype. The hardware automatically
searches for ethertype 0x8100 in Ethernet header, when
parsing incoming packet against VLAN pattern.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Remove compile time option to control Tx coalescing Latency vs
Throughput behavior. Add tx_mode_latency devarg instead, to
dynamically control Tx coalescing behavior.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Remove compile time flags and use dynamic logging for debug prints.
Also remove rarely used debug logs in register access and datapath.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Link updates come in firmware event queue, which is only created
when device starts. So, don't poll for link status if firmware
event queue is not yet created.
This fixes NULL dereference when accessing non existent firmware
event queue.
Fixes: 265af08e75 ("net/cxgbe: add link up and down ops")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Set VLAN action mode to VLAN_REWRITE only if VLAN_INSERT has not been
set yet. Otherwise, the resulting VLAN packets will have their VLAN
header rewritten, instead of pushing a new outer VLAN header.
Also fix the VLAN ID extraction logic and endianness issues.
Fixes: 1decc62b1c ("net/cxgbe: add flow operations to offload VLAN actions")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
To avoid name collisions, add cxgbe_ prefix to some global functions.
Also, make some local functions static in cxgbe_filter.c.
Fixes: ee61f5113b ("net/cxgbe: parse and validate flows")
Fixes: 9eb2c9a480 ("net/cxgbe: implement flow create operation")
Fixes: 3a381a4116 ("net/cxgbe: query firmware for HASH filter resources")
Fixes: af44a57798 ("net/cxgbe: support to offload flows to HASH region")
Fixes: 41dc98b082 ("net/cxgbe: support to delete flows in HASH region")
Fixes: 23af667f15 ("net/cxgbe: add API to program hardware layer 2 table")
Cc: stable@dpdk.org
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
The PCI virtual function type was not recognized correctly
for ConnectX-6 VF.
Fixes: f0354d8423 ("net/mlx5: add ConnectX-6 device IDs")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The PCI virtual function type was not recognized correctly
for BlueField VF.
Fixes: f38c54571d ("net/mlx5: split PCI from generic probing")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
In the process of recovery from error CQE, when using vectorized Rx
burst, the initialization of data length in mbufs was not done.
As a result the wrong length was left written in mbuf, causing
memory overwrite or wrong error report.
This patch fixes the initialization of mbuf data length during
recovery from error CQE, when using vectorized Rx burst,
Fixes: 88c0733535 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
The txq_uar_init() routine uses the uninitialized uar_mmap_offset
field in 32-bit configurations due to this field is initialized
after txq_uar_init() call.
Fixes: 120dc4a7dc ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
TSO (TCP Segmentation Offload) is enabled by default on vhost.
Add the ability to disable TSO on vhost.
The user should also disable the feature on the virtual machine's xml.
Signed-off-by: Noa Ezra <noae@mellanox.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
Add Igor Chauskin from the Amazon as another maintainer of the driver.
Igor is another person from the Amazon team that is responsible for the
ENA DPDK driver.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Explicitly document that the MAC/VLAN filtering in virtio
is best effort to help users understand why unwanted packets
could still arrive.
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
If reserve virtio header room by function rte_pktmbuf_prepend, both
segment data length and packet length of mbuf will be increased.
Data length will be equal to descriptor length, while packet length
should be decreased as virtio-net header won't be taken into packet.
Thus will cause mismatch in mbuf structure. Fix this issue by access
mbuf data directly and increase descriptor length if it is needed.
Fixes: 58169a9c81 ("net/virtio: support Tx checksum offload")
Fixes: 892dc798fa ("net/virtio: implement Tx path for packed queues")
Fixes: 4905ed3a52 ("net/virtio: optimize Tx enqueue for packed ring")
Fixes: e5f456a98d ("net/virtio: support in-order Rx and Tx")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The loop to read packets does not take all packets as the number of
available packets (nb_used) is decremented in the loop.
Take all available packets provides a performance improvement of 3%.
Fixes: fc3d66212f ("virtio: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
With vectorized functions, only the rx stats for number of packets is
incremented.
Update also the other statistics.
Performance impact is about 2%
Fixes: fc3d66212f ("virtio: add vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Only the mapping of the vring addresses is being ensured. This causes
errors when the vring size is larger than the IOTLB page size. E.g:
queue sizes > 256 for 4K IOTLB pages
Ensure the entire vring memory range gets mapped. Refactor duplicated
code for for IOTLB UPDATE and IOTLB INVALIDATE and add packed virtqueue
support.
Fixes: 09927b5249 ("vhost: translate ring addresses when IOMMU enabled")
Cc: stable@dpdk.org
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Check whether space are enough before burst enqueue operation. If more
space is needed, will try to clean up used descriptors for space on
demand. It can give more chances to free used descriptors, thus will
help RFC2544 performance. Also deduct failed xmit packets from total
xmit number.
Fixes: e5f456a98d ("net/virtio: support in-order Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When doing xmit in-order enqueue, packets are buffered and then flushed
into avail ring. Buffered packets can be dropped due to insufficient
space. Moving stats update action just after successful avail ring
updates can guarantee correctness.
Fixes: e5f456a98d ("net/virtio: support in-order Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
In some situations, when a virtual machine is starting,
vring_state_changed can be called while there was no change in the
queue state. This fix makes sure that there was really a change in the
queue state before calling the callback for EVENT_QUEUE_STATE.
Fixes: ee584e9710 ("vhost: add driver on top of the library")
Cc: stable@dpdk.org
Signed-off-by: Noa Ezra <noae@mellanox.com>
Reviewed-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>