Add more explanations about vring size changes and different
virtio_header size.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
There is no way to bypass IP checksum verification in Linux
kernel, no matter skb->ip_summed is assigned as CHECKSUM_UNNECESSARY
or CHECKSUM_PARTIAL.
So any packets with bad IP checksum will be dropped at VM IP layer.
To correct, we check this flag PKT_TX_IP_CKSUM to calculate IP csum.
Fixes: 859b480d5a ("vhost: add guest offload setting")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
As PKT_TX_TCP_SEG flag in mbuf->ol_flags implies PKT_TX_TCP_CKSUM,
applications, e.g., testpmd, don't set PKT_TX_TCP_CKSUM when TSO
is set.
This leads to that packets get dropped in VM tcp stack layer because
of bad TCP csum.
To fix this, we make sure TCP NEEDS_CSUM info is set into virtio net
header when PKT_TX_TCP_SEG is set, so that VM tcp stack will not
check the TCP csum.
Fixes: 859b480d5a ("vhost: add guest offload setting")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
vsocket->conn_mutex was allocated with pthread_mutex_init() but never
freed with pthread_mutex_destroy(). This is a potential memory leak,
depending on how pthread_mutex_t is implemented.
Signed-off-by: Daniel Verkamp <daniel.verkamp@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
To make vectorized burst routines enabled, it is required to run on x86_64
architecture. If all the conditions are met, the vectorized burst functions
are enabled automatically. The decision is made individually on RX and TX.
There's no PMD option to make a selection.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The callbacks are global to a device but the selection is made every queue
configuration, which is redundant.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
When searching LKEY, if search key is mempool pointer, the 2nd cacheline
has to be accessed and it even requires to check whether a buffer is
indirect per every search. Instead, using address for search key can reduce
cycles taken. And caching the last hit entry is beneficial as well.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
When processing Tx completion, it is more efficient to free buffers in bulk
using rte_mempool_put_bulk() if buffers are from a same mempool.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
For Tx SW ring (txq->elts[]), indexes are kept and used in
txq->elts_head/tail. Because of this, one entry must always be left unused
and it also makes code complex. Changed to store counters instead of
indexes in order to make the code simpler and to reduce a few calculations.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Currently, if requested MTU is bigger than mbuf size and scattered
receive is not enabled, setting MTU to that value fails.
This patch allows setting this special MTU when device is stopped,
because scattered_rx will be re-configured during next port start
and driver may enable scattered receive according new MTU value.
After this patch, driver may select different receive function
automatically after MTU set, according MTU values selected.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch removes restrictions in bonded device library which prevent a
bonded device to be added to another bonded device with the limitation
that 802.3ad mode is not supported if one or more slaves is also a
bonded device,
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
This patch fixes the management of link properties in the bonded device.
In all mode except mode 4 a bonded device link will default to reporting
the link as full duplex and auto-neg. The link speed for a bond port is
calculated on it's active slaves and the particular mode it is running
in. The bonding link speed is reported based on the transmit link as in
some modes link speed between egress/ingress is not symmetrical.
- round-robin, balance, 802.3ad, TLB and ALB modes all report the link
speed as the sum of the speed of each active slave.
- active backup link speed is reported as the speed of the current
primary slave
- broadcast is reported as the minimum of value of the active slaves
link speeds.
In mode 4 (link aggregation 802.3ad) the properties of the first slave
added to the bonded device are slave and subsequent slaves are verified
to have the same properties.
Finally in the bond_ethdev_lsc_event_callback function the link
properties of the device are updated after any change to the number of
active slaves.
Fixes: 2efb58cbab ("bond: new link bonding library")
Cc: stable@dpdk.org
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Enable IPv6 support with rte_flow API.
Only supports signature Match.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Enable signature match for rte_flow API.
RTE_FLOW_ITEM_TYPE_FUZZY specify a signature match.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Code cleanup done for 2 purposes.
1. "index" variable is no more needed.
2. inline function makes it easier and safer to nest into a loop.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
ring network driver is limited to 32 characters
the name of device was extended to net_ring_unit_test_mode4_slave_0
which is 33 characters long.
Fixes: 5e41ab250d ("app/test: unit tests for bonding mode 4")
Cc: stable@dpdk.org
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
This commit adds support for the flow API RSS action with the following
limitations:
- Only supported when isolated mode is enabled.
- The number of queues specified by the action (rte_flow_action_rss.num)
must be a power of two.
- Each queue index can be specified at most once in the configuration
(rte_flow_action_rss.queue[]).
- Because a queue can be associated with a single RSS context, it cannot
be targeted by multiple RSS actions simultaneously.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
A special "parent" queue must be allocated in addition to a group of
standard Rx queues for RSS to work. This is done automatically outside of
isolated mode by the PMD when applications request several Rx queues.
Since each configured flow rule with the RSS action may target a different
set of queues, the PMD must have the ability to dynamically allocate
several parent queues, one per RSS group.
If isolated mode was requested the default RSS parent queue isn't created
in this case.
Refactor RSS parent queue allocations (currently limited to a single
parent) in preparation for flow API RSS action support.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The user must request isolated mode before device configuration.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Refactor vmxnet3_post_rx_bufs() to call vmxnet3_renew_desc()
to update the newly allocated mbufs. While here, relocate the
relevant comments to vmxnet3_renew_desc().
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
In debug mode, all mbuf ol_flags are temporarily enabled while sitting
in the Rx queue to detect otherwise silent data corruption, however
some of them are special (indirect and control) and must be cleared
before returning mbufs to the pool to avoid crashing.
Fixes: 7fae69eeff ("mlx4: new poll mode driver")
CC: stable@dpdk.org
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Internal Rx packet bytes is over counted, CRC size is not excluded,
that cause incorrect Rx bytes or even negative value.
Fixes: 0bcdc44510 ("net/i40e: exclude internal packet's byte count")
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Add new command to support enable/disable of dedicated Tx/Rx queue on
each slave of a bond device for LACP control plane traffic.
set bonding lacp dedicated_queues <port_id> [enable|disable]
When enabled this option creates dedicated queues on each slave device
for LACP control plane traffic. This removes the need to filter control
plane packets in the data path.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Add support for hardware flow classification of LACP control plane
traffic to be redirect to a dedicated receive queue on each slave which
is not visible to application. Also enables a dedicate transmit queue
for LACP traffic which allows complete decoupling of control and data
paths.
This only applies to bonding devices running in mode 4
(link-aggregation-802.3ad).
Introduce two new APIs to support enable/disabled of dedicated
queues.
- rte_eth_bond_8023ad_dedicated_queues_enable
- rte_eth_bond_8023ad_dedicated_queues_disable
rte_eth_bond_8023ad_dedicated_queues_enable must be called before
bonding port is configured or started to reserved and configure the
dedicated queues.
When this option is enabled all slaves must support flow filtering
by Ethernet type and support one additional Tx and Rx queue on
each slave.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Use packet types flags in mbuf to provide hint for filtering of LACP
control plane traffic from the data path.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
This patch fixes the maximum number of Tx an Rx queues supported by a
bonding device return by the rte_eth_dev_info_get function.
The bonding device now calculates the maximum number of supported Tx
and Rx queues based on the slaves bound to the bonded device, with the
minimum values of Tx and Rx queues from the device slaves being the
bonded devices maximum, as each slave must be able to support the same
number of Tx and Rx queues.
Fixes: 2efb58cbab ("bond: new link bonding library")
Cc: stable@dpdk.org
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Uninitialized memory could cause memory corruption, by indicating
completion of the invalid mbuf.
Fixes: 3d3edc265f ("net/ena: make coherent memory allocation NUMA-aware")
Cc: stable@dpdk.org
Signed-off-by: Alexander Matushevsky <matua@amazon.com>
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
After cleanup of the mbuf on Tx path, queue was still pointing to this
mbuf and upon cleanup of the Tx buffers, it was being freed second time.
Moreover, to make sure that upon interrupt all mbufs will be freed,
whole ring is being cleaned up instead of part of it - originally only
range between head and tail was being cleaned up.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
VLAN stripping configuration is supported only for DPDK PF
previously. Since kernel PF supports VLAN stripping now, this
patch adds VLAN stripping support for both DPDK PF and kernel
PF.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Convert DP_NOTICE() to DP_ERR() as appropriate in PMD files.
Change DP_NOTICE() macro to make use of boolean flag to log
it as error message or informational message.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Since nb_max is a u16 it can store value up to 65535 only (not 64K),
but this value is not a power-of-2. So limit the ring sizes to 32K.
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Log HW errmsg on the stdout and do minimal handling to prevent HW
attentions from being reasserted.
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Implement the macro OSAL_DMA_FREE_COHERENT to release DMA memories.
Track all DMA memory allocations using an array of memzone pointers and
use that to free memory resources along with other resource deallocation.
With this change there is no need to alter the base code to additionally
pass an unique string needed for memzone creation.
Fixes: ec94dbc573 ("qede: add base driver")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
The bonded device does not have to be a PCI device. Use the rte_ethdev
functions instead.
Signed-off-by: Wen Chiu <wchiu@brocade.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
If the numa node is unknown, use the value from rte_socket_id() to avoid
an allocation failure.
Signed-off-by: Eric Kinzie <ekinzie@brocade.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Add support for inner/outer L3/L4 TX csum offload for MPLS-in-UDP packets.
The driver checks for PKT_TX_TUNNEL_MPLSINUDP in Tx ol_flags and updates TX
BD elements with appropriate offsets/length of tunnel headers.
The pseudo csum calculation is removed from qede_xmit_prep_pkts() since its
not needed.
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Some PMDs need to know the tunnel type in order to handle advance TX
features. This patch adds a new TX offload flag for MPLS-in-UDP packets.
Signed-off-by: Harish Patil <harish.patil@cavium.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Fix an error message. Probably a copy/paste error.
Fixes: 36735a932c ("net/bnxt: support set VF QOS and MAC anti spoof")
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
bnxt_hwrm_port_phy_cfg() was not setting enables appropriately.
This was causing undesired issues with link config.
This patch takes care of that.
Fixes: 7bc8e9a227 ("net/bnxt: support async link notification")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
We are not passing proper speed mask causing autoneg to not work
properly on 10GBase-T links. Use supported speeds from phy_qcfg
for auto mask to fix this.
Fixes: 7bc8e9a227 ("net/bnxt: support async link notification")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
This patch fixes the get link configuration code.
bnxt_get_hwrm_link_config was using wrong macros and wrongly
deriving link speed based on link status which was causing
incorrect link information to be displayed in few scenarios.
Fixes: 7bc8e9a227 ("net/bnxt: support async link notification")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Add code to avoid automatic clearing of VF stats on a function
reset or a stat context free.
Add rte_pmd_bnxt_set_vf_persist_stats() PMD specific API.
Fixes: dd46c6bbd5 ("net/bnxt: support get and clear VF specific stats")
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Previously, no VNICs were allocated as the default VNIC,
even if they were configured as one. This patch fixes that error.
Fixes: db678d5c2b ("net/bnxt: add HWRM VNIC configure")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
We used the default as a work-around for the 4-byte MTU VNICs,
but we filter those out now. So fix the calculation of VNICs
accordingly.
Fixes: ff63ebbb67 ("net/bnxt: determine the Rx status of VF")
Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Set the MTU of the device to MAX size possible.
Fixes: b7778e8a1c ("net/bnxt: refactor to properly allocate resources for PF/VF")
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
This patch adds support for Stratus VF devices.
Other than adding the VF device ID, we also are adding
support for short message format.
The new short firmware message of size 16 bytes points to a location
in host memory that contains the firmware message that the driver
wants the adapter to process. This indirection requires the
firmware to DMA the message into its own memory for processing.
When the firmware receives a Short Command Format, it will DMA the
firmware message from host memory into its internal memory and then
processes the message as it would have if it received the message
directly in the firmware communication channel.
With BCM57454, the number of VFs increase to 1K and so the requirement
for firmware communication channel memory on adapter becomes expensive.
The short firmware message saves the amount of memory required to support
1K VFs on adapter.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>