Adding a flow when the port is stopped ends in an inconsistent situation
where the queue can receive traffic when it should not.
Record new rules and apply them as soon as the port is started.
Fixes: 2097d0d1e2cc ("net/mlx5: support basic flow items and actions")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A configuration structure for the MARK action must always be specified.
Fixes: ea3bc3b1df94 ("net/mlx5: support mark flow action")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Default masks were introduced in the API after its implementation in this
PMD.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Querying the link status can end up being in an inconsistent state,
like the port is reporting speed although it is down.
For this case another query is scheduled.
A race condition can occur between the scheduled query and link
status interrupt handlers.
When the scheduled query by-pass interrupt handlers, the link status
will be stuck in an inconsistent state.
This patch addresses the race condition by not blocking link status
queries in case delayed query is used.
Fixes: 198a3c339a8f ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The chosen fake capability (10G) is consistent with the reported
link speed in virtio_dev_link_update():
link.link_speed = SPEED_10G;
The feature is not marked in doc/guides/nics/features/virtio.ini
because it is only a fake value.
Signed-off-by: Ido Barnea <ibarnea@cisco.com>
[Thomas: comments added]
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Rx and Tx queues share the common tap file descriptor, but save this
value separately.
Setting up Rx/Tx queue sets up both queues, release_queue close the
tap file but update file descriptor only for that queue.
This makes other queue's file descriptor invalid.
As a workaround, prevent release_queue callback to be called by default.
This is done by separating Rx/Tx setup functions, so that each only
setup its own queue, this prevents rte_eth_rx/tx_queue_setup() calling
release_queue before setup_queue.
Fixes: 02f96a0a82d1 ("net/tap: add TUN/TAP device PMD")
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Size of the mask is wrongly computed and make the validation process only
verify the first 4 bytes of the layer.
Fixes: 2097d0d1e2cc ("net/mlx5: support basic flow items and actions")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
TCI field is read from the wrong place due to an invalid cast. Moreover
there is no need to limit matching to VID since PCP and DEI bits can be
matched as well.
Fixes: 12475fb203ad ("net/mlx5: support VLAN flow item")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Fixes: a541407fe4bc ("net/i40e: set VF MAC anti-spoofing from PF")
Fixes: 4cbc41efcbb2 ("net/i40e: set VF VLAN anti-spoofing from PF")
Fixes: c0ec14757c5f ("net/i40e: set VF unicast promiscuous mode from PF")
Fixes: ae57070ca8b7 ("net/i40e: set VF multicast promiscuous mode from PF")
Fixes: 83bb95e3fefc ("net/i40e: set VF VLAN insertion from PF")
Fixes: 61fff9b4c68b ("net/i40e: set VF broadcast mode from PF")
Fixes: c33abbc1440e ("net/i40e: set VF VLAN tag from PF")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Change the order of releasing the VSI's.
Release the VMDq VSI's first, then release the main VSI.
Fixes: 3cb446b4aeb2 ("i40e: free vmdq vsi when closing")
Cc: stable@dpdk.org
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
The mac_addr_add callback function was simply replacing the primary MAC
address instead of adding new ones and the mac_addr_remove callback would
only remove the primary MAC form the adapter. Fix the functions to add or
remove new address. Allow up to 64 MAC addresses per port.
Fixes: fefed3d1e62c ("enic: new driver")
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Current ixgbevf driver get max_rx_pktlen = 15872, but in fact PF
supports 15872-byte jumbo frame and VF only supports 9728-byte jumbo
frame. If VF is running DPDK driver and set frame_size > 9728 ,PF
running kernel ixgbe driver will report an error and set VF failed.
This patch fixs DPDK ixgbevf driver to get correct jumbo frame size
of VF.
More datasheet references from Wei Dai:
In 82599 datasheet, there is an annotation in the chapter 1.3 Features
Summary (page 29)
The 82599 supports full-size 15.5 KB (15872-byte) jumbo packets while
in a basic mode of operation. When DCB mode is enabled,
or security engines enabled or virtualization is enabled, the 82599
supports 9.5 KB (9728-byte) jumbo packets.
In x540 datasheet, there is also an annotation in the chapter 1.3
Features Summary (page 13)
The X540 and 82599 support full-size 15.5 KB jumbo packets while in a
basic mode of operation. When DCB mode is enabled,
or security engines enabled, or virtualization is enabled, or OS2BMC is
enabled, then the X540 supports 9.5 KB jumbo packets.
Packets to/from MC longer than 2KB are filtered out.
In x550 datasheet, there is still also an annotation in the chapter 1.4
Feature Summary (page 23)
All the products support full-size 15.5 KB jumbo packets while in a
basic mode of operation. When DCB mode is enabled, or security
engines enabled, or virtualization is enabled, or OS2BMC is enabled,
then only 9.5 KB jumbo packets are supported. Packets to/
from the MC longer than 2 KB are filtered out.
Fixes: 2144f6630fca ("ixgbe: add redirection table size in device info")
Cc: stable@dpdk.org
Signed-off-by: Yi Zhang <zhang.yi75@zte.com.cn>
Acked-by: Wei Dai <wei.dai@intel.com>
When no error reported in Rx descriptor, we should set
CKSUM_GOOD flag before return.
Fixes: 9966a00a0688 ("net/i40e: enable bad checksum flags in vector Rx")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
data value could have been garbage if VPD access timed out for VPD read
request could not been issued.
Found with clang static analysis:
drivers/net/cxgbe/base/t4_hw.c:1577:22:
warning: The left operand of '&' is a garbage value
} while ((stats_reg & 0x1) && --max_poll);
~~~~~~~~~ ^
Fixes: fe0bd9ee5da3 ("net/cxgbe: support EEPROM access")
Cc: stable@dpdk.org
Signed-off-by: Emmanuel Roullit <emmanuel.roullit@gmail.com>
While ENA can handle checksum calculations in almost all cases,
it cannot do so when DF bit in IPv4 header is not set,
that is DF=0, and TSO is requested. For that situation pseudo
header must be prepared manually.
Signed-off-by: Jakub Palider <jpa@semihalf.com>
Add checks to rte_pmd_ixgbe_macsec_* APIs to ensure that the
port is an ixgbe port.
Fixes: b35d309710fe ("net/ixgbe: add MACsec offload")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Ignore the error=IXGBE_ERR_SFP_NOT_PRESENT when SFP is not present.
If it is not ignored, testpmd will fail during the NIC initialization
process.
Ixgbe kernel driver ignores this error and works well. So DPDK
does same thing.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Tested-by: Yuan Peng <yuan.peng@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Check if promisc mode was set when setting allmulti mode and vice-versa.
Introduced BNX2X_RX_MODE_ALLMULTI_PROMISC for the same. If check is
absent the filter configuration gets over written.
Fixes: 540a211084a7 ("bnx2x: driver core")
Fixes: 5dbc53d7e5a2 ("net/bnx2x: restrict Rx mask flags sent to the PF")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
filter_type is not set when removing all macvlan filters. It will
cause error when send AQ command to HW.
This patch fixes this issue.
Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
When VF sends request to remove MAC address, PF host will check
if it is a non-zero or unicast address. When VF remove a multicast
address, it will report error.
This patch fixes this issue.
Fixes: ec852c94af39 ("net/i40e: enhance sanity check of MAC")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
in case of an error argument list is not freed.
Fixes: e72dd09b614e ("net/mlx5: add support for configuration through kvargs")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
The number of Tx queues requested by the user must not be overridden;
instead, the limits imposed by TSO must be applied to the advertised
maximum
Fixes: fec33d5bb3eb ("net/sfc: support firmware-assisted TSO")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The previous version relied on the fact that DMA sync for device and
PIO write barrier in pair. Now each does its job.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Remove RTE_LIBRTE_SFC_EFX_TSO config option since it is not
required any more:
- unreasonable limit on number of Tx queues when TSO is not
actually required should be solved using per-device parameter
- performance difference with and without TSO compiled in is small
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Alarms are not supported on the FreeBSD.
Application must poll link status periodically itself using
rte_eth_link_get_nowait() to avoid management event queue overflow.
Fixes: 2de39f4e1310 ("net/sfc: periodic management EVQ polling using alarm")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
When the WQ is wrapped around, it wrongly checks the condition when
resetting the pointer. It should be compared against the end of the queue,
not the beginning of the queue. And this isn't even needed when the length
of the copying data crosses the boundary.
Fixes: fdcb0f53053b ("net/mlx5: use work queue buffer as a raw buffer")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The size of Rx RSS indirection table was limited by 256, but it is not
required anymore for all Mellanox NICs. However, the librte_ether still
limits the size by 512.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
On receiving a compressed session of Rx completion, prefetch every entries
to be invalidated. Also, invalidate consumed completions per every 8
mini-completions, not to wait until the last entry is consumed. This helps
to reduce jitter in rx_burst.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
When no error reported in Rx descriptor, we should set CKSUM_GOOD flag
before return.
Fixes: b704f9071b09 ("net/i40e: implement new Rx checksum flag")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
While handling link status change (LSC) interrupt, all interrupts are
blocked until delayed interrupt handler finishes.
The wait duration is at least one second and this may cause timeouts in
VF to PF mailbox.
Make sure only LSC interrupt is blocked while waiting for delayed
interrupt handler to finish.
Fixes: 0a45657a6794 ("pci: rework interrupt handling")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
efx_phy_adv_cap_set() sets all advertised phy capabilities including
pause capabilities which are also configured using efx_mac_fcntl_set().
If we set speed and autonegotiation capabilities only, we should
preserve already configured pause capabilities.
Fixes: d23f3a89ab54 ("net/sfc: support link speed and duplex settings")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Fixes: 886f8d8a05bf ("net/sfc: retrieve link info")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Fixes: 886f8d8a05bf ("net/sfc: retrieve link info")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
In fact efx_port_poll() always initializes it, but it isn't
explicitly documented feature of the API. Moreover, the API
annocation suggests that return code should be checked.
Fixes: 886f8d8a05bf ("net/sfc: retrieve link info")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
When any layout is used, the header is stored in the head room of mbuf.
mbuf is allocated and filled by user, means there is no gurateen the
header is all zero for non TSO case. Therefore, we have to do the reset
by ourself:
memest(hdr, 0, head_size);
The memset has two impacts on performance:
- memset could not be inlined, which is a bit costly.
- more importantly, it touches the mbuf, which could introduce severe
cache issues as described by former patch.
Similiary, we could do the same trick: reset just when necessary, when
the corresponding field is already 0, which is likely true for a simple
l2 forward case. It could boost the performance up to 20+% in micro
benchmarking.
Cc: stable@dpdk.org
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
TSO is now enabled, but it's not actually being used by default in a
simple L2 forward mode. In such case, we have to zero the virtio net
headers, to inform the vhost backend that no offload is being used:
hdr->csum_start = 0;
hdr->csum_offset = 0;
hdr->flags = 0;
hdr->gso_type = 0;
hdr->gso_size = 0;
hdr->hdr_len = 0;
Such writes could be very costly; it introduces severe cache issues:
The above operations introduce cache write for each packet, which
stalls the read operation from the vhost backend.
The fact that virtio net header is initiated to zero in PMD driver
init stage means that these costly writes are unnecessary and could
be avoided:
if (hdr->csum_start != 0)
hdr->csum_start = 0;
And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
performance drop introduced by TSO enabling is recovered: it could
be up to 20% in micro benchmarking.
Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload")
Fixes: 696573046e9e ("net/virtio: support TSO")
Cc: stable@dpdk.org
Cc: Olivier Matz <olivier.matz@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
The commit aed0b12930b3 ("net/vhost: fix socket file deleted on stop")
moves rte_vhost_driver_register and rte_vhost_driver_unregister from
dev_start() and dev_stop() into driver's probe() and remove().
Apps, like testpmd, using vhost pmd in server mode, usually calls
dev_stop() and dev_close() as quitting, instead of driver-specific
remove(). Then those unix socket files have no chance to get removed.
Semantically, device-specific things should be put into device-specific
APIs. Fix this issue by moving rte_vhost_driver_unregister, plus other
structure free into dev_close().
Fixes: aed0b12930b3 ("net/vhost: fix socket file deleted on stop")
Cc: stable@dpdk.org
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Value returned from malloc is not checked for errors before being used.
This patch fixes following coverity issue.
static struct vhost_memory_kernel *
prepare_vhost_memory_kernel(void)
{
...
vm = malloc(sizeof(struct vhost_memory_kernel) +
max_regions *
sizeof(struct vhost_memory_region));
...
>>> CID 140744: (NULL_RETURNS)
>>> Dereferencing a null pointer "vm".
mr = &vm->regions[k++];
Coverity issue: 140744
Fixes: e3b434818bbb ("net/virtio-user: support kernel vhost")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
The vtpci_ops assignment needs the 'hw->port_id' as an input parameter.
That said, we should set 'hw->port_id' firstly, then do the vtpci_ops
assignment, while the code does reversely. That would result to a crash
when more than one virtio devices are used, because we keep assigning
proper vtpci_ops to virtio_hw_internal[0]->vtpci_ops, leaving the pointer
for other ports being NULL.
Reverse the order fixes this issue.
Fixes: 9470427c88e1 ("net/virtio: do not store PCI device pointer at shared memory")
Cc: stable@dpdk.org
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>