Mailbox is the communication mechanism between SW and HW. There exist
two approaches for SW to recognize mailbox message from HW. One way is
using match_id, the other is to compare the message code. The two
approaches are independent and used in different scenarios.
But for the second approach, "next_to_use" should be updated and written
to HW register. If it not done, HW do not know the position SW steps,
then, the communication between SW and HW will turn to be failed.
Fixes: dbbbad23e3 ("net/hns3: fix VF handling LSC event in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
octeontx_ep driver's dependency on octeontx2 common code is
removed as going forward ep driver will include files from
its own path.
Signed-off-by: Nalla Pradeep <pnalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Link update callback reports speed/duplex based on data
filled on device initialization. This is wrong in case of
VIRTIO_NET_F_SPEED_DUPLEX is negotiated since link could
be down at this time. Fix this function to actually
update the HW data in this case with respect to the fact
that specifying speed via devarg is a highest priority.
Fixes: 1357b4b362 ("net/virtio: support Virtio link speed feature")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
According to current semantics of power monitor, this commit adds a
callback function to decide whether aborts the sleep by checking
current value against the expected value and vhost_get_monitor_addr
to provide address to monitor. When no packet come in, the value of
address will not be changed and the running core will sleep. Once
packets arrive, the value of address will be changed and the running
core will wakeup.
Signed-off-by: Miao Li <miao.li@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
According to current semantics of power monitor, this commit adds a
callback function to decide whether aborts the sleep by checking
current value against the expected value and virtio_get_monitor_addr
to provide address to monitor. When no packet come in, the value of
address will not be changed and the running core will sleep. Once
packets arrive, the value of address will be changed and the running
core will wakeup.
Signed-off-by: Miao Li <miao.li@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>
This patch fixes RETA updating for entries above 64.
Without that, these entries are never updated as
calculated mask value will always be 0.
Fixes: 634efbc2c8 ("mlx5: support RETA query and update")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Provide the capability to update the hash key, hash types
and RETA table on the fly (without needing to stop/start
the device). However, the key length and the number of RETA
entries are fixed to 40B and 128 entries respectively. This
is done in order to simplify the design, but may be
revisited later as the Virtio spec provides this
flexibility.
Note that only VIRTIO_NET_F_RSS support is implemented,
VIRTIO_NET_F_HASH_REPORT, which would enable reporting the
packet RSS hash calculated by the device into mbuf.rss, is
not yet supported.
Regarding the default RSS configuration, it has been
chosen to use the default Intel ixgbe key as default key,
and default RETA is a simple modulo between the hash and
the number of Rx queues.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Remove a stale compile option from meson build file.
RTE_LIBRTE_BNXT_TF sneaked in incorrectly.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
After DCF is reset by PF, the VSI update service is unable to be
completed since the DCF resource is invalid.
This patch removes the call to service that updates VSI since it is
useless and output too many error messages.
Fixes: c7e1a1a3bf ("net/ice: refactor DCF VLAN handling")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Add watchdog to iAVF PMD which support monitoring the VFLR register. If
the device is not already in reset then if a VF reset in progress is
detected then notify user through callback and set into reset state.
If the device is already in reset then poll for completion of reset.
The watchdog is disabled by default, to enable it set
IAVF_DEV_WATCHDOG_PERIOD to a non zero value (microseconds)
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Add per queue counters for maintaining statistics for inline IPsec
crypto offload, which can be retrieved through the
rte_security_session_stats_get() with more detailed errors through the
rte_ethdev xstats.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Add support for inline crypto for IPsec, for ESP transport and
tunnel over IPv4 and IPv6, as well as supporting the offload for
ESP over UDP, and in conjunction with TSO for UDP and TCP flows.
Implement support for rte_security packet metadata
Add definition for IPsec descriptors, extend support for offload
in data and context descriptor to support
Add support to virtual channel mailbox for IPsec Crypto request
operations. IPsec Crypto requests receive an initial acknowledgment
from physical function driver of receipt of request and then an
asynchronous response with success/failure of request including any
response data.
Add enhanced descriptor debugging
Refactor of scalar tx burst function to support integration of offload
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Jingjing Wu <jingjing.wu@intel.com>
Rework the Tx path and Tx descriptor usage in order to
allow for better use of offload flags and to facilitate enabling of
inline crypto offload feature.
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Abhijit Sinha <abhijit.sinha@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rx ring for representors does not use aggregation rings for Rx.
Instead they use simple software buffers for handling Rx packets.
So there is no need to use the same cleanup routine as done by
the non-representor code path.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Minor fixes are needed in the RTE_FLOW RSS action parser.
1. Update the comment in the parser to indicate RSS level 1 implies RSS
on outer header.
2. RSS action will not be supported if level is > 1.
3. RSS action will not be supported if user or application specifies
MARK or COUNT action.
4. If RSS types is not specified i.e., is 0, the best effort RSS should
use IPv4 and IPv6 headers. Currently we are considering only IPv4.
Fixes: 239695f754 ("net/bnxt: enhance RSS action support")
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Move the Rx queue state update before bnxt_setup_one_vnic()
is called. For Thor, rxq->rx_started and eth_dev->data->rx_queue_state[]
needs to be set for all queues before bnxt_hwrm_vnic_cfg() or
bnxt_vnic_rss_configure() are called.
Fixes: 0105ea1296 ("net/bnxt: support runtime queue setup")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Integrity item validation and translation must verify that integrity
item bits match L3 and L4 items in flow rule pattern.
For cases when integrity item was positioned before L3 header, such
verification must be split into two stages.
The first stage detects integrity flow item and makes initializations
for the second stage.
The second stage is activated after PMD completes processing of all
flow items in rule pattern. PMD accumulates information about flow
items in flow pattern. When all pattern flow items were processed,
PMD can apply that data to complete integrity item validation
and translation.
Fixes: 79f8952783 ("net/mlx5: support integrity flow item")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
MLX5 PMD can match on integrity bits for inner and outer headers in
a single flow.
That means a single flow rule can reference both inner and outer
integrity bits. That is implemented by adding 2 flow integrity items
to a rule - one item for outer integrity bits and other for
inner integrity bits.
Integrity item `level` parameter specifies what part is being
targeted.
Current PMD treated integrity items for outer and inner headers as
the same.
The patch separates PMD verifications for inner and outer integrity
items.
Fixes: 79f8952783 ("net/mlx5: support integrity flow item")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Multiple rules could use the same encap_decap/modify_hdr/counter action.
The flow dump data could be duplicated.
To avoid redundancy, flow dump value is based on the actions' pointer
instead of previous rules' pointer.
For counter, the data is stored in cmng of priv->sh.
For encap_decap/modify_hdr, the data stored in encaps_decaps/modify_cmds.
Traverse the fields and get action's pointer and information.
Formats are same for information in the dump except "id" stands for
actions' pointer:
Counter: rec_type,id,hits,bytes
Modify_hdr: rec_type,id,actions_number,actions
Encap_decap: rec_type,id,buf
Signed-off-by: Haifei Luo <haifeil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
During the device spawn process, mlx5 PMD queried the available flow
priorities by calling mlx5_flow_discover_priorities, queried
if the DR drop action was supported on the root table by calling
the mlx5_flow_discover_dr_action_support routine, and queried the
availability of metadata register C by calling mlx5_flow_discover_mreg_c
These functions created the test flows to get the supported fields, and
at the end destroyed the test flows. The test flows in the first two
functions was created on the root table.
If the device was spawned with multiple representors, these test flows
were created and destroyed on each representor as well. The above
operations took a significant amount of init time during the device
spawn.
This patch optimizes the device discover functions, if there is
the device with multiple representors (VF/SF) being spawned,
the priority and drop action and metadata register support check can be
done only ones and check results can be shared for all representors.
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
In socket direct mode, it's possible to bind any two (maybe four
in future) PCIe devices with IDs like xxxx:xx:xx.x and
yyyy:yy:yy.y. Bonding member interfaces are unnecessary to have
the same PCIe domain/bus/device ID anymore,
Kernel driver uses "system_image_guid" to identify if devices can
be bound together or not. Sysfs "phys_switch_id" is used to get
"system_image_guid" of each network interface.
OFED 5.4+ is required to support "phys_switch_id".
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The shared pointer is initialized to a static local array defined in the
primary process and it shall not be accessed in the secondary process.
This patch copies the local data to shared data, to avoid data access
violation.
Fixes: 040b44551f ("net/iavf: unify Rx packet type table")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch uses the index value to call the function, instead of the
function pointer assignment to save the selection of Receive Flex
Descriptor profile ID.
Otherwise the secondary process will run with wrong function address
from primary process.
Fixes: 7a340b0b4e ("net/ice: refactor Rx FlexiMD handling")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
After DCF is reset by PF, the DCF device un-initialization cannot
function normally, ignore the failure does not help since the kernel
does not clean up resource.
The patch workaround the issue by triggering an additional DCF enable/
disable cycle when a passive reset is detected.
Fixes: 1a86f4dbdf ("net/ice: support DCF device reset")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Driver is using 'ETH_FRAME_LEN' Linux defined value as max frame length,
which doesn't include FCS (4 bytes CRC). But ethdev by default uses
frame size with FCS when application doesn't define any explicit value.
As a result device configuration fails because device is tried to be
configured with a frame size length that is bigger than what device
reported as supported. Device reports as max supported frame size is
1514 but configured value is 1518.
Instead use DPDK macro, 'RTE_ETHER_MAX_LEN', that includes FCS in the
driver to report the max supported frame size, this matches to the
initial intention.
Fixes: 1bb4a528c4 ("ethdev: fix max Rx packet length")
Reported-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Driver is using 'ETH_FRAME_LEN' Linux defined value as max frame length,
which doesn't include FCS (4 bytes CRC). But ethdev by default uses
frame size with FCS when application doesn't define any explicit value.
As a result device configuration fails because device is tried to be
configured with a frame size length that is bigger than what device
reported as supported. Device reports as max supported frame size is
1514 but configured value is 1518.
Instead use DPDK macro, 'RTE_ETHER_MAX_LEN', that includes FCS in the
driver to report the max supported frame size, this matches to the
initial intention.
Fixes: 1bb4a528c4 ("ethdev: fix max Rx packet length")
Reported-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Implement PIE based congestion management based on rfc8033.
The Proportional Integral Controller Enhanced (PIE) algorithm works
by proactively dropping packets randomly.
PIE is implemented as more advanced queue management is required to
address the bufferbloat problem and provide desirable quality of
service to users.
Tests for PIE code added to test application.
Added PIE related information to documentation.
Signed-off-by: Wojciech Liguzinski <wojciechx.liguzinski@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Removing direct access to interrupt handle structure fields,
rather use respective get set APIs for the same.
Making changes to all the drivers access the interrupt handle fields.
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Raslan Darawsheh <rasland@nvidia.com>
Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
name. The old flags remain usable, but a deprecation warning is issued
at compilation.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
The flags PKT_TX_VLAN_PKT and PKT_TX_QINQ_PKT are
marked as deprecated since commit 380a7aab1a ("mbuf: rename deprecated
VLAN flags") (2017). But they were not using the RTE_DEPRECATED
macro, because it did not exist at this time. Add it, and replace
usage of these flags.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Add 'RTE_ETH' namespace to all enums & macros in a backward compatible
way. The macros for backward compatibility can be removed in next LTS.
Also updated some struct names to have 'rte_eth' prefix.
All internal components switched to using new names.
Syntax fixed on lines that this patch touches.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
After DEV_RX_OFFLOAD_JUMBO_FRAME flag removed, drivers give jumbo frame
decisions based on MTU value checks, but some of the checks were wrong
by mistake, causing device initialization to fail, fixing them.
Fixes: b563c14212 ("ethdev: remove jumbo offload flag")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>
Commit 1bb4a528c4 ("ethdev: fix max Rx packet length") clarified the
expected usage of the max_rx_pktlen and max_mtu values and implemented
some extra checks on these values to ensure they are sane. After this,
the AF_XDP PMD fails to initialise. The value for max_rx_pktlen which
represents the max size of the Ethernet frame was set to ETH_FRAME_LEN
(1514) and the max_mtu which represents the size of the payload was set
to the max size of the Ethernet frame. This did not make sense, as
naturally the maximum frame size should be greater than the payload
size.
Fix this by setting the max_rx_pktlen equal to the max size of the
Ethernet frame as expected, and the max MTU equal to the max_rx_pktlen
less the overhead which is set to the size of an Ethernet header plus
CRC.
Fixes: 1bb4a528c4 ("ethdev: fix max Rx packet length")
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Current, the max waiting time for MBX response is 500ms, but in
some scenarios, it is not enough. Since it depends on the response
of the kernel mode driver, and its response time is related to the
scheduling of the system. In this special scenario, most of the
cores are isolated, and only a few cores are used for system
scheduling. When a large number of services are started, the
scheduling of the system will be very busy, and the reply of the
mbx message will time out, which will cause our PMD initialization
to fail.
This patch add a runtime config to set the max wait time. For the
above scenes, users can adjust the waiting time to a suitable value
by themselves.
Fixes: 463e748964 ("net/hns3: support mailbox")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
This patch adds support for rte flow action type port_id to
enable directing packets from an input port PF to an output
port which is a VF of the input port PF.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Vhost will update desc’s Buffer ID advance to next used descriptor when
VIRTIO_F_IN_ORDER feature negotiated. When virtio reuses the descriptor,
the Buffer ID should be restored even VIRTQ_DESC_F_INDIRECT
feature negotiated.
Fixes: b473061b0e ("net/virtio: fix indirect descriptors in packed datapaths")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Miao Li <miao.li@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
To improve performance in vhost Tx/Rx, merge vhost stats loop.
eth_vhost_tx has 2 loop of send num iteraion.
It can be merge into one.
eth_vhost_rx has the same issue as Tx.
Signed-off-by: Gaoxiang Liu <liugaoxiang@huawei.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add initialization for packed ring indirect descriptors
in reconnection path.
Fixes: 381f39ebb7 ("net/virtio: fix packed ring indirect descricptors setup")
Cc: stable@dpdk.org
Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tx prepare method calls rte_net_intel_cksum_prepare(), which
handles tunnel packets correctly, but Tx burst path does not
take tunnel presence into account when computing the offsets.
Fixes: 58169a9c81 ("net/virtio: support Tx checksum offload")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
If packed ring size is not power of two, it is possible that remained
number less than one batch and meanwhile batch operation can pass.
This will cause incorrect remained number calculation and then lead to
receiving oversized packets. The patch fixed the issue by added
remained number check before batch operation.
Fixes: 77d66da838 ("net/virtio: add vectorized packed ring Rx")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch fixes the wrong way to obtain virtqueue.
The end of virtqueue cannot be judged based on whether
the array is NULL.
Fixes: 4e8169eb0d ("net/virtio: fix Rx scatter offload")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Peng <zhihongx.peng@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
After DCF commits TM hierarchy configuration, the commit flag is set to
avoid duplicated commit. But the flag is not reset after device stop,
which prevents the update of hierarchy configuration unless close the
device. It is not reasonable. This patch fix to reset the commit flag
after device stop. Then users can delete and add nodes to commit a new
TM hierarchy configuration.
Fixes: 3a6bfc37ea ("net/ice: support QoS config VF bandwidth in DCF")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch enables building the e1000 driver for Windows.
I tested using two Windows VM on top of VMware Fusion,
creating two e1000 devices with device ID 0x10D3 (8274L),
verifying rx/tx works correctly using dpdk-testpmd.exe
rxonly and txonly mode.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>
Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
On a VMware ESXi 6.0 setup with an Intel 82599 NIC the ports don't
seem to initialize anymore, while running testpmd.
Configuring Port 0 (socket 0)
ixgbevf_dev_rx_init(): Set max packet length to 1518 failed.
ixgbevf_dev_start(): Unable to initialize RX hardware (-22)
Fail to start port 0: Invalid argument
Configuring Port 1 (socket 0)
ixgbevf_dev_rx_init(): Set max packet length to 1518 failed.
ixgbevf_dev_start(): Unable to initialize RX hardware (-22)
Fail to start port 1: Invalid argument
Please stop the ports first
If the call to ixgbevf_rlpml_set_vf fails and we return prematurely,
we will not be able to initialize the ports correctly.
The behavior seems to have changed since the following commit:
Fixes: c77866a169 ("net/ixgbe: detect failed VF MTU set")
Cc: stable@dpdk.org
We can make this particular use case work correctly if we don't
return an error, which seems to be consistent with the overall
kernel ixgbevf implementation.
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c?h=v5.14#n2015
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Previously, we set txq affinity to 0 and let firmware
to perform round-robin when bonding. Firmware uses a
global counter to assign txq affinity to different
physical ports accord to remainder after division.
There are three dis-advantages:
1. The global counter is shared between kernel and dpdk.
2. After restarting pmd or port, the previous counter value
is reused, so the new affinity is unpredictable.
3. There is no way to get what affinity is set by firmware.
In this update, we will create several TISs up to the
number of bonding ports and bind each TIS to one PF port.
For each port, it will start to pick up TIS using its port
index. Upper layer application can quickly calculate each txq's
affinity without querying.
At DPDK layer, when creating txq with 2 bonding ports, the
affinity is set like:
port 0: 1-->2-->1-->2
port 1: 2-->1-->2-->1
port 2: 1-->2-->1-->2
Note: Only applicable to DevX api.
This affinity subjects to HW hash.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
MLX5 PMD exposes a socket for external tools to dump port state.
Socket events are listened using an interrupt source of EXT type.
The socket was closed and the interrupt callback was unregistered
at program exit, which is incorrect because DPDK could be already
shut down at this point. Move actions performed at program exit
to the moment the last MLX5 port is closed. The socket will be opened
again if later a new MLX5 device is plugged in and probed.
Also fix comments that were decisively talking
about secondary processes instead of external tools.
Fixes: e6cdc54cc0 ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Reported-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
mlx5_rxq_start() allocates rxq_ctrl->obj and frees it on failure,
but did not set it to NULL. Later mlx5_rxq_release() could not recognize
this object is already freed and attempted to release its resources,
resulting in a crash:
Configuring Port 0 (socket 0)
mlx5_common: Failed to create RQ using DevX
mlx5_common: Can't create DevX RQ object.
mlx5_net: Port 0 Rx queue 0 RQ creation failure.
Segmentation fault
Set rxq_ctrl->obj to NULL after it is freed to skip resource release.
Fixes: 1260a87b28 ("net/mlx5: share Rx control code")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The RSS configuration in a policy action container was a pointer
inside a union, and the pointer area could be used as other fate
action. In the current implementation, the RSS of the green color
was prior to that of the yellow color. There was a high possibility
the pointer was considered as the RSS and result in a error flow
expansion when only the yellow color had the RSS action.
The check of the fate action type should also be done to get rid of
the misjudgment.
Fixes: b38a12272b ("net/mlx5: split meter color policy handling")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Verbs API doesn't support device port number larger than 255 by design.
To support more VF or SubFunction port representors, forces DevX API
check when max Verbs device link ports larger than 255.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Verbs API does not support Infiniband device port number larger 255 by
design. To support more representors on a single Infiniband device DevX
API should be engaged.
While creating Send Queue (SQ) object with Verbs API, the PMD assigned
IB device port attribute and kernel created the default miss flows in
FDB domain, to redirect egress traffic from the queue being created to
representor appropriate peer (wire, HPF, VF or SF).
With DevX API there is no IB-device port attribute (it is merely kernel
one, DevX operates in PRM terms) and PMD must create default miss flows
in FDB explicitly. PMD did not provide this and using DevX API for
E-Switch configurations was disabled.
The default miss FDB flow matches E-Switch manager vport (to make sure
the source is some representor) and SQn (Send Queue number - device
internal queue index). The root flow table managed by kernel/firmware
and it does not support vport redirect action, we have to split the
default miss flow into two ones:
- flow with lowest priority in the root table that matches E-Switch
manager vport ID and jump to group 1.
- flow in group 1 that matches E-Switch manager vport ID and SQn and
forwards packet to peer vport
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
When creating internal transfer flow on root table with lowest
priority, the flow was created with max UINT32_MAX priority. It is wrong
since the flow is created in kernel and max priority supported is 16.
This patch fixes this by adding internal flow check.
Fixes: 5f8ae44dd4 ("net/mlx5: enlarge maximal flow priority")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Extends txq flow pattern to support both hairpin and regular txq.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
For egress packet on representor, the vport ID in transport domain
is E-Switch manager vport ID since representor shares resources of
E-Switch manager. E-Switch manager vport ID and Tx queue internal device
index are used to match representor egress packet.
This patch adds flow item port ID match on E-Switch manager.
E-Switch manager vport ID is 0xfffe on BlueField, 0 otherwise.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
To detect number flow Verbs flow priorities, PMD try to create Verbs
flows in different priority. While Verbs is not designed to support
ports larger than 255.
When DevX supported by kernel driver, 16 Verbs priorities must be
supported, no need to create Verbs flows.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
IB spec doesn't allow 255 ports on a single HCA, port number of 256 was
cast to u8 value 0 which invalid to ibv_query_port()
This patch invokes Netlink API to query port state when port number
greater than 255.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Add support for PPP over L2TPv2 over UDP protocol RSS Hash based
on inner IP src/dst address and TCP/UDP src/dst port.
Patterns are listed below:
eth/ipv4(6)/udp/l2tpv2/ppp/ipv4(6)
eth/ipv4(6)/udp/l2tpv2/ppp/ipv4(6)/udp
eth/ipv4(6)/udp/l2tpv2/ppp/ipv4(6)/tcp
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Jie Wang <jie1x.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
CNF10KA does not differ it terms of RVU resources from
CN10KA platform hence add it to list of devices respective
drivers support.
Otherwise devices on CNF10KA are not probed even though
compatible drivers exist.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Expand the use of mempool registration to MR management for other
drivers.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Since MR management has moved to the common area, there is no longer a
need for the DMA map and unmap function for each driver.
This patch share those functions. For most drivers it supports these
operations for the first time.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add global shared MR cache as a field of common device structure.
Move MR management to use this global cache for all drivers.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add function for global shared MR cache structure initialization.
This function include:
- btree initialization.
- set callbacks for reg and dereg MR.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add function for MR control structure initialization.
This function include:
- btree initialization.
- dev_gen_ptr initialization.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This patch remove two redundant things from MR file:
1. mr_find_contig_memsegs_data structure which is moved to common file
before.
2. External memory mechanism - mlx5_tx_update_ext_mp function.
Since commit [1] which added support for DMA map and unmap, external
mem must be configured by the user using rte_mem_map function and no
need to handle this in pmd.
[1]
commit 989e999d93
("net/mlx5: support PCI device DMA map and unmap")
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add HCA attributes structure as a field of device config structure.
It query in common probing, and updates the timestamp format fields.
Each driver use HCA attributes from common device config structure,
instead of query it for itself.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Create shared Protection Domain in common area and add it and its PDN as
fields of common device structure.
Use this Protection Domain in all drivers and remove the PD and PDN
fields from their private structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Create shared context device in common area and add it as a field of
common device.
Use this context device in all drivers and remove the ctx field from
their private structure.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Device configure structure has flag named devx as same as SH structure
with the same meaning.
Remove the flag from the configuration structure and move all the
usages to the SH flag.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Add device configure structure and function to parse user device
arguments into it.
Move parsing and management of relevant device arguments to common.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Create MACRO definitions file in the common driver as preparation for MR
and basic probe sharing.
Move relevant definitions from the net driver to the above file.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Create common probing structure that includes, for now, basic probing
information detected by the common driver and share it with all the
internal drivers.
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
In device initialization, the driver registers to free hugepages events.
When hugepage is released, this callback frees all its related MRs.
In Windows initialization, this callback is not registered what may
cause to use invalid memory.
This patch adds memory event callback registration in Windows
initialization.
Fixes: 980826dc6f ("net/mlx5: probe on Windows")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Sideband queue need to be initialized when device is initialized.
Otherwise the calling to function "ice_init_ctrlq" may fail.
This patch fixes it.
Fixes: 97f4f78bbd ("net/ice/base: add functions for device clock control")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Meters are configured per flow using rte_flow_create API.
Patch adds support for destroy operation for meter action
applied on the flow.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Meters are configured per flow using rte_flow_create API.
Implement support for meter action applied on the flow.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to read and update stats corresponding to
given meter instance for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to update DSCP table for pre-coloring for
incoming packet per nixlf for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to enable or disable meter instance for
CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to delete meter instance for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to create meter instance for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to delete meter policy for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to add meter policy for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to validate meter policy for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to delete meter profile for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement API to add meter profile for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Implement ethdev operation to get meter capabilities for
CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
To enable support for ingress meter, supported operations
are exposed for CNXK platform.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Using fast metadata and userdata flags instead of
driver callbacks for set_pkt_metadata and
get_userdata in inline IPsec.
Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
The RSS expansion algorithm is using a graph to find the possible
expansion paths. The current implementation does not differentiate
between standard (L2) VXLAN and L3 VXLAN. As result the flow is expanded
with all possible paths.
For example:
testpmd> flow create... / vxlan / end actions rss level 2 / end
It is currently expanded to the following paths:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END
The fix is to adjust the expansion according to the outer UDP destination
port. In case flow pattern defines a match on the standard udp port, 4789,
or does not define a match on the destination port, which also implies
setting the standard one, the expansion for the above example will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN ETH IPV4 END
ETH IPV4 UDP VXLAN ETH IPV6 END
Otherwise, the expansion will be:
ETH IPV4 UDP VXLAN END
ETH IPV4 UDP VXLAN IPV4 END
ETH IPV4 UDP VXLAN IPV6 END
Fixes: f4f06e3615 ("net/mlx5: add flow VXLAN item")
Cc: stable@dpdk.org
Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.
In NEON vector PMD, vector load loads two contiguous 8B of
descriptor data into vector register. Given vector load ensures no
16B atomicity, read of the word that includes DD field could be
reordered after read of other words. In this case, some words could
contain invalid data.
Read barrier is added after read of qword1 that includes DD field.
And qword0 is reloaded to update vector register. This ensures
that the fetched data is correct.
Testpmd single core test on N1SDP/ThunderX2 showed no performance drop.
Fixes: ae0eb310f2 ("net/i40e: implement vector PMD for ARM")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
To keep flow format uniform with ice, this patch adds support for
this RSS rule:
flow create 0 ingress pattern eth / ipv6 / ipv6_frag_ext / end \
actions rss types ipv6-frag end queues end queues end / end
Fixes: ef4c16fd91 ("net/i40e: refactor RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
splitting AVX-specific code into the new ice_rxtx_common_avx.h header
file to avoid such bugs.
Bugzilla ID: 788
Fixes: a4e480de26 ("net/ice: optimize Tx by using AVX512")
Fixes: 20daa1c978 ("net/ice: fix crash in AVX512")
Cc: stable@dpdk.org
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The common header file for vectorization is included in multiple files,
and so must use macros for the current compilation unit, rather than the
compiler-capability flag set for the whole driver. With the current,
incorrect, macro, the AVX512 or AVX2 flags may be set when compiling up
SSE code, leading to compilation errors. Changing from "CC_AVX*_SUPPORT"
to the compiler-defined "__AVX*__" macros fixes this issue. In addition,
splitting AVX-specific code into the new i40e_rxtx_common_avx.h header
file to avoid such bugs.
Bugzilla ID: 788
Fixes: 0604b1f220 ("net/i40e: fix crash in AVX512")
Cc: stable@dpdk.org
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
After the meter policies are created, they are not freed on device
close.
This patch fixes it.
Fixes: 5f0d54f372 ("ethdev: add pre-defined meter policy API")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
This version update contains:
* Fix for verification of the offload capabilities (especially for
IPv6 packets).
* Support for Tx and Rx free threshold values.
* Fixes for per-queue offload capabilities.
* Announce support of the scattered Rx offload.
* NUMA aware allocations.
* Check for the missing Tx completions.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
In some cases Tx descriptors may be uncompleted by the HW and as a
result they will never be released.
This patch adds checking for the missing Tx completions to the ENA timer
service, so in order to use this feature, the application must call the
function rte_timer_manage().
Missing Tx completion reset threshold is determined dynamically, by
taking into consideration ring size and the default value.
Tx cleanup is associated with the Tx burst function. As DPDK
applications can call Tx burst function dynamically, time when last
cleanup was called must be traced to avoid false detection of the
missing Tx completion.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Only the IO rings memory was allocated with taking the socket ID into
the respect, while the other structures was allocated using the regular
rte_zmalloc() API.
Ring specific structures are now being allocated using the ring's
socket ID.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
ENA can't be forced to always pass single descriptor for the Rx packet.
Even if the passed buffer size is big enough to hold the data, we can't
make assumption that the HW won't use extra descriptor because of
internal optimizations. This assumption may be true, but only for some
of the FW revisions, which may differ depending on the used AWS instance
type.
As the scattered Rx support on the Rx path already exists, the driver
just needs to announce DEV_RX_OFFLOAD_SCATTER capability by turning on
the rte_eth_dev_data::scattered_rx option.
Fixes: 1173fca25a ("ena: add polling-mode driver")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
As ENA currently doesn't support offloads which could be configured
per-queue, only per-port flags should be set.
In addition, to make the code cleaner, parsing appropriate offload
flags is encapsulated into helper functions, in a similar matter it's
done by the other PMDs.
[1] https://doc.dpdk.org/guides/prog_guide/
poll_mode_drv.html?highlight=offloads#hardware-offload
Fixes: 7369f88f88 ("net/ena: convert to new Rx offloads API")
Fixes: 56b8b9b7e5 ("net/ena: convert to new Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
The caller can pass Tx or Rx free threshold value to the configuration
structure for each ring. It determines when the Tx/Rx function should
start cleaning up/refilling the descriptors. ENA was ignoring this value
and doing it's own calculations.
Now the user can configure ENA's behavior using this parameter and if
this variable won't be set, the ENA will continue with the old behavior
and will use it's own threshold value.
The default value is not provided by the ENA in the ena_infos_get(), as
it's being determined dynamically, depending on the requested ring size.
Note that NULL check for Tx conf was removed from the function
ena_tx_queue_setup(), as at this place the configuration will be
either provided by the user or the default config will be used and it's
handled by the upper (rte_ethdev) layer.
Tx threshold shouldn't be used for the Tx cleanup budget as it can be
inadequate to the used burst. Now the PMD tries to release mbufs for the
ring until it will be depleted.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
ENA PMD has multiple checksum offload flags, which are more discrete
than the DPDK offload capabilities flags.
As the driver wasn't storing it's internal checksum offload capabilities
and was relying only on the DPDK capabilities, not all scenarios could
be properly covered (like when to prepare pseudo header checksum and
when not).
Moreover, the user could request offload capability, which isn't
supported by the HW and the PMD would quietly ignore the issue.
This commit reworks eth_ena_prep_pkts() function to perform additional
checks and to properly reflect the HW requirements. With the
RTE_LIBRTE_ETHDEV_DEBUG enabled, the function will do even more
verifications, to help the user find any issues with the mbuf
configuration.
Fixes: b3fc5a1ae1 ("net/ena: add Tx preparation")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
In sfc, MAE admin serves as a transfer proxy. In order to track which
ethdev is privileged, augment every independent switch port structure
with information about its MAE privilege.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Register unprivileged ports in the switch domain registry in order to
allow redirecting traffic to them.
Differentiate between different levels of MAE support, update all MAE
status checks.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Removing 'DEV_RX_OFFLOAD_JUMBO_FRAME' offload flag.
Instead of drivers announce this capability, application can deduct the
capability by checking reported 'dev_info.max_mtu' or
'dev_info.max_rx_pktlen'.
And instead of application setting this flag explicitly to enable jumbo
frames, this can be deduced by driver by comparing requested 'mtu' to
'RTE_ETHER_MTU'.
Removing this additional configuration for simplification.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Move requested MTU value check to the API to prevent the duplicated
code.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Setting MTU bigger than RTE_ETHER_MTU requires the jumbo frame support,
and application should enable the jumbo frame offload support for it.
When jumbo frame offload is not enabled by application, but MTU bigger
than RTE_ETHER_MTU is requested there are two options, either fail or
enable jumbo frame offload implicitly.
Enabling jumbo frame offload implicitly is selected by many drivers
since setting a big MTU value already implies it, and this increases
usability.
This patch moves this logic from drivers to the library, both to reduce
the duplicated code in the drivers and to make behaviour more visible.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
There is a confusion on setting max Rx packet length, this patch aims to
clarify it.
'rte_eth_dev_configure()' API accepts max Rx packet size via
'uint32_t max_rx_pkt_len' field of the config struct 'struct
rte_eth_conf'.
Also 'rte_eth_dev_set_mtu()' API can be used to set the MTU, and result
stored into '(struct rte_eth_dev)->data->mtu'.
These two APIs are related but they work in a disconnected way, they
store the set values in different variables which makes hard to figure
out which one to use, also having two different method for a related
functionality is confusing for the users.
Other issues causing confusion is:
* maximum transmission unit (MTU) is payload of the Ethernet frame. And
'max_rx_pkt_len' is the size of the Ethernet frame. Difference is
Ethernet frame overhead, and this overhead may be different from
device to device based on what device supports, like VLAN and QinQ.
* 'max_rx_pkt_len' is only valid when application requested jumbo frame,
which adds additional confusion and some APIs and PMDs already
discards this documented behavior.
* For the jumbo frame enabled case, 'max_rx_pkt_len' is an mandatory
field, this adds configuration complexity for application.
As solution, both APIs gets MTU as parameter, and both saves the result
in same variable '(struct rte_eth_dev)->data->mtu'. For this
'max_rx_pkt_len' updated as 'mtu', and it is always valid independent
from jumbo frame.
For 'rte_eth_dev_configure()', 'dev->data->dev_conf.rxmode.mtu' is user
request and it should be used only within configure function and result
should be stored to '(struct rte_eth_dev)->data->mtu'. After that point
both application and PMD uses MTU from this variable.
When application doesn't provide an MTU during 'rte_eth_dev_configure()'
default 'RTE_ETHER_MTU' value is used.
Additional clarification done on scattered Rx configuration, in
relation to MTU and Rx buffer size.
MTU is used to configure the device for physical Rx/Tx size limitation,
Rx buffer is where to store Rx packets, many PMDs use mbuf data buffer
size as Rx buffer size.
PMDs compare MTU against Rx buffer size to decide enabling scattered Rx
or not. If scattered Rx is not supported by device, MTU bigger than Rx
buffer size should fail.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Huisong Li <lihuisong@huawei.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Build error:
../drivers/net/enic/enic_fm_flow.c: In function 'enic_fm_flow_parse':
../drivers/net/enic/enic_fm_flow.c:1467:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
struct rte_eth_dev *dev;
^~~
../drivers/net/enic/enic_fm_flow.c:1580:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
struct rte_eth_dev *dev;
^~~
../drivers/net/enic/enic_fm_flow.c:1599:24:
error: 'dev' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
struct rte_eth_dev *dev;
^~~
Build error looks like false positive, but to silence the compiler
initializing the pointer with NULL.
Bugzilla ID: 812
Fixes: 54bd4ebe8b ("net/enic: support meta flow actions to overrule destinations")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Since AARCH32 extension is not implemented on octeontx2 family, only
enable build for 64bit.
Due to Linux kernel AF(Admin Function) driver dependency, only enable
build for 64-bit Linux.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Since AARCH32 extension is not implemented on octeontx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for 64-bit Linux.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Since AARCH32 extension is not implemented on thunderx family, only
enable build for 64bit.
Due to Linux kernel AF(Admin function) driver dependency, only enable
build for Linux.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
When release Tx queue, Rx queue data got freed because wrong Tx queue
data located.
This patch fixes the wrong Tx queue data location.
Fixes: 7483341ae5 ("ethdev: change queue release callback")
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fate actions are different per domain.
When all the domains, ingress, egress and FDB (transfer),
can support all the policy actions, i.e. [SET_TAG],
the policy prepares resources for all the domains and
failure happens if one of the domains misses its fate action
in the policy action list.
Remove the domains missing their fate action
from the meter policy preparation.
Now, the policy will prepare a domain only when the domain supports
all the actions and when one of the domain fate actions is on the list.
Fixes: afb4aa4f12 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
This patch fixes coverity issue by avoiding use of null pointer
in taking false branch.
Coverity issue: 373360
Fixes: 437dbd2fd4 ("net/ice: support 1PPS")
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
In function ice_dcf_stop_queues(), RX queues and TX queues are actually
not freed, so their pointers shall not be set to NULL when queues are
stopped.
This patch adds function call to free queues on DCF device close,
which also set the RX and TX queues' pointers to NULL on freeing
queues, and avoids referring to the released resource when device is
started again.
Fixes: 1a86f4dbdf ("net/ice: support DCF device reset")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
If flow redirect failed, the spinlock will not be unlocked.
This patch fixes it.
Fixes: bc9201388d ("net/ice: support flow redirect")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The Timestamp Overlay feature is available only in 32B Flex Descriptors.
This patch adds compile option when in 16B Flex Descriptors.
Fixes: 953e74e6b7 ("net/ice: enable Rx timestamp on flex descriptor")
Fixes: 646dcbe6c7 ("net/ice: support IEEE 1588 PTP")
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Null-checking "p" suggests that it may be null, but it has already
been dereferenced on all paths leading to the check. Thus correct
the code lines and remove the redundant line.
Fixes: c84f8aa210 ("net/ice/base: add parser runtime skeleton")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Firmware 8.4+ will return I40E_AQ_RC_ENOENT when try to delete
non-existent MAC/VLAN addresses from the HW filtering, this should
not be considered as an Admin Queue error. But in i40e_asq_send_command,
it will return I40E_ERR_ADMIN_QUEUE_ERROR if the return value of Admin
Queue command processed by Firmware is not I40E_AQ_RC_OK or
I40E_AQ_RC_EBUSY.
Use i40e_aq_remove_macvlan_v2 instead so that we can get the
corresponding Admin Queue status, and not report as an error in DPDK
when Firmware return I40E_AQ_RC_ENOENT, and this also not break with an
old firmware.
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Adjust the code line order of the parser runtime reset, since the
struct rt->psr is used in function _rt_flag_set before assignment.
Fixes: c84f8aa210 ("net/ice/base: add parser runtime skeleton")
Cc: stable@dpdk.org
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Some drivers do not provide per-queue statistics. So, there is no point
to have these misleading zeros in xstats.
Fixes: f30e69b41f ("ethdev: add device flag to bypass auto-filled queue xstats")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Parse inner L2 length to set correct packet type, and ensure that
hardware can compute the checksum successfully.
Fixes: b950203be7 ("net/txgbe: support VXLAN-GPE")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Setting exact link speed makes sense if auto-negotiation is
disabled. Fixed flag is required to disable auto-negotiation.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
It's necessary to set 1 on TXGBE_PX_INTA register to get interrupts
normally, when legacy interrupt mode is used.
Fixes: 2fc745e6b6 ("net/txgbe: add interrupt operation")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Update immediate value/pointer source operand support
for modify field RTE Flow action:
- source operand data can be presented by byte buffer
(instead of former uint64_t) or by pointer
- no host byte ordering is assumed anymore for immediate
data buffer (not uint64_t anymore)
- no immediate value offset is expected (the source
subfield is located at the same offset as in destination)
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Setting 'RTE_LIBRTE_BNXT_TRUFLOW_DEBUG' macro cause build error,
removing it.
Also with meson build system compile time debug macros should be
documented in driver documentation, since there is no other way to
figure out their existence.
Fixes: ad9eed0248 ("net/bnxt: support flow template for Thor")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reported by "gcc (GCC) 12.0.0 20211003 (experimental)":
./drivers/net/softnic/rte_eth_softnic_cli.c:
In function ‘tmgr_hierarchy_default’:
./drivers/net/softnic/rte_eth_softnic_cli.c:634:73:
error: the comparison will always evaluate as ‘true’ for the
address of ‘tc_valid’ will never be NULL [-Werror=address]
634 | (¶ms->shared_shaper_id.tc_valid[0]) ? 1 : 0,
| ^
Fixing it by removing useless check.
Fixes: 1af2dc5111 ("net/softnic: add command for default tmgr hierarchy")
Fixes: 5eb676d74f ("net/softnic: add config flexibility to TM")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Add support for item PORT_REPRESENTOR which should
be used instead of ambiguous item PORT_ID.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Action PORT_ID implementation assumes ingress only. Its semantics
suggests that support for equal action PORT_REPRESENTOR be added.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Semantics of the existing support for action PORT_ID suggests
that support for equal action REPRESENTED_PORT be implemented.
Helper functions keep port_id suffix since action
MLX5_FLOW_ACTION_PORT_ID is still used internally.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for actions PORT_REPRESENTOR and REPRESENTED_PORT
based on the existing support for action PORT_ID.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Add support for actions PORT_REPRESENTOR and REPRESENTED_PORT
based on the existing support for action PORT_ID.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Add support for items PORT_REPRESENTOR and REPRESENTED_PORT
based on the existing support for item PORT_ID.
The use of item PORT_ID depends on the specified direction attribute.
Items PORT_REPRESENTOR and REPRESENTED_PORT, in turn, define traffic
direction themselves. The former matches traffic from the driver's
vNIC. The latter matches packets from either a v-port (network) or
a VF's vNIC (if the driver's port is a VF representor).
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Move rte_eth_dev, rte_eth_dev_data, rte_eth_rxtx_callback and related
data into private header (ethdev_driver.h).
Few minor changes to keep DPDK building after that.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Currently majority of fast-path ethdev ops take pointers to internal
queue data structures as an input parameter.
While eth_rx_queue_count() takes a pointer to rte_eth_dev and queue
index.
For future work to hide rte_eth_devices[] and friends it would be
plausible to unify parameters list of all fast-path ethdev ops.
This patch changes eth_rx_queue_count() to accept pointer to internal
queue data as input parameter.
While this change is transparent to user, it still counts as an ABI change,
as eth_rx_queue_count_t is used by ethdev public inline function
rte_eth_rx_queue_count().
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
By design, in a GROUP flow, outer match criteria go to "ENC" fields
of the action rule match specification. The current HW/FW hasn't
got support for these fields (except the VXLAN VNI) yet.
As a workaround, start parsing the pattern from the tunnel item.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Such a counter will only report the number of hits, which is actually
a sum of two contributions (the JUMP rule's own counter + indirect
increments issued by counters of the associated GROUP rules.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
By design, JUMP flows should be represented solely by the outer rules. But
the HW/FW hasn't got support for setting Rx mark from RECIRC_ID on outer
rule lookup yet. Neither does it support outer rule counters. As a
workaround, an action rule of lower priority is used to do the job.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
The current HW/FW doesn't allow to match on MAC addresses in outer rules.
One day this will change for sure, but right now a workaround is needed.
Match on VLAN presence in outer rules is also unsupported. Ignore it.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Support generic callbacks which callers will invoke to get
PMD-specific actions and items used to produce JUMP and
GROUP flows and to detect tunnel information.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
GROUP is an in-house term for so-called "tunnel_match" flows.
On parsing, they are detected by virtue of PMD-internal item
MARK. It associates a given flow with its tunnel context.
Such a flow is represented by a MAE action rule which is
chained with the corresponding JUMP rule's outer rule
by virtue of matching on its recirculation ID.
GROUP flows do narrower match than JUMP flows do and
decapsulate matching packets (full offload).
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
JUMP is an in-house term for so-called "tunnel_set" flows. On parsing,
they are identified by virtue of actions MARK (PMD-internal) and JUMP.
The action MARK associates a given flow with its tunnel context.
Such a flow is represented by a MAE outer rule (OR) which has its
recirculation ID set. This ID is also associated with the tunnel
context. The OR is supposed to set this ID in 8 high bits of
Rx mark in matching packets. It also counts the packets.
Packets that hit the OR but miss in action rule (AR) table,
should go to MAE admin PF (that is, to DPDK) by default.
Support for the use of action COUNT in JUMP
flows will be introduced by later patches.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Later patches add support for tunnel offload on Riverhead (EF100).
A board can host at most 254 tunnels. Partially offloaded (missed)
tunnel packets are identified by virtue of 8 high bits in Rx mark.
Add basic definitions of the upcoming tunnel offload support and
take care of the dedicated bits in Rx mark across the driver.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
vnic_dev_capable_filter_mode() currently fails when
CMD_CAPABILITY(CMD_ADD_FILTER) returns ERR_EPERM. In turn, this
failure causes the driver initialization to fail.
But, firmware may legitimately return ERR_EPERM. For example, VF vNIC
returns ERR_EPERM when it does not support filtering at all. So, treat
ERR_EPERM as "no filtering available" instead of an unexpected error.
Fixes: 322b355f21 ("net/enic/base: bring NIC interface functions up to date")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
The intr_handle->intr_vec is allocated by rte_zmalloc(), but freed by
free(), this patch fixes it.
Fixes: 02a7b55657 ("net/hns3: support Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Detect the flag in Rx prefix and pass it to users.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
MAE counter engine gets generation counts by virtue of the mark,
so the code to extract the field is already in place, but flow
action MARK doesn't benefit from it. Support this use case, too.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Initial support for the method. Later patches will extend it to
make FLAG and MARK delivery available on EF100 native datapath.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Enhance support for RSS action in the non-TruFlow path.
This will allow the user or application to update the RSS settings
using RTE_FLOW API.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Fix Rx queue state on device start.
The state of Rx queues could be incorrect in some cases
because instead of updating the state for all the Rx queues,
we are updating it for queues in a VNIC.
Fixes: 0105ea1296 ("net/bnxt: support runtime queue setup")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Aggregation rings are needed when PMD needs to support jumbo frames, LRO.
Currently we are creating the aggregation rings whether jumbo frames or
LRO has been enabled or disabled. This causes unnecessary allocation of
mbufs needing larger mbuf pool which is not used at all.
This patch modifies the code to create aggregation rings only when
needed.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Support of the keep-CRC offloading by checking
the relevant FW capability (scatter_fcs) for NIC support.
Supported offload:
DEV_RX_OFFLOAD_KEEP_CRC
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Support of the VLAN stripping offloading by checking
the relevant FW capability (vlan_cap) for NIC support.
Supported offload:
DEV_RX_OFFLOAD_VLAN_STRIP
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Support of the TSO offloading by checking
the relevant FW capability for NIC support.
Supported offloads:
DEV_TX_OFFLOAD_TCP_TSO
DEV_TX_OFFLOAD_VXLAN_TNL_TSO
DEV_TX_OFFLOAD_GRE_TNL_TSO
DEV_TX_OFFLOAD_GENEVE_TNL_TSO
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Query tunneling supported on the NIC.
Save the offloads values in a config parameter.
This is needed for the following TSO support:
DEV_TX_OFFLOAD_VXLAN_TNL_TSO
DEV_TX_OFFLOAD_GRE_TNL_TSO
DEV_TX_OFFLOAD_GENEVE_TNL_TSO
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Currently, the PMD decides if the tunneling offload
can enable VXLAN/GRE/GENEVE tunneled TSO support by checking
config->tunnel_en (single bit) and config->tso.
This is incorrect, the right way is to check the following
flags returned by the mlx5dv_query_device function:
MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_VXLAN - if supported the offload
DEV_TX_OFFLOAD_VXLAN_TNL_TSO can be enabled.
MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GRE - if supported the offload
DEV_TX_OFFLOAD_GRE_TNL_TSO can be enabled.
MLX5DV_RAW_PACKET_CAP_TUNNELED_OFFLOAD_GENEVE - if supported the offload
DEV_TX_OFFLOAD_GENEVE_TNL_TSO can be enabled.
The fix enables the offloads according to the correct
flags returned by the kernel.
Fixes: dbccb4cddc ("net/mlx5: convert to new Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Query software parsing supported on the NIC.
Save the offloads values in a config parameter.
This is needed for the outer IPv4 checksum and
IP and UDP tunneled packet TSO support.
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
Currently, the PMD decides if the software parsing
offload can enable outer IPv4 checksum and tunneled
TSO support by checking config->hw_csum and config->tso
respectively.
This is incorrect, the right way is to check the following
flags returned by the mlx5dv_query_device function:
MLX5DV_SW_PARSING - check general swp support.
MLX5DV_SW_PARSING_CSUM - check swp checksum support.
MLX5DV_SW_PARSING_LSO - check swp LSO/TSO support.
The fix enables the offloads according to the correct
flags returned by the kernel.
Fixes: e46821e9fc ("net/mlx5: separate generic tunnel TSO from the standard one")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>
When the iavf_adapter instance is not initialized completely in the
primary process, the secondary process accesses its "rte_eth_dev"
member, it causes secondary process crash.
This patch replaces eth_dev with eth_dev_data in iavf_adapter.
Fixes: f978c1c9b3 ("net/iavf: add RSS hash parsing in AVX path")
Fixes: 9c9aa00403 ("net/iavf: add offload path for Rx AVX512 flex descriptor")
Fixes: 63660ea3ee ("net/iavf: add RSS hash parsing in SSE path")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch adds some defines related to DDP Track ID.
Signed-off-by: Artur Tyminski <arturx.tyminski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Update FVL FW API version to 1.15
Signed-off-by: Maciej Paczkowski <maciej.paczkowski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add raw format for i40e_32byte_rx_desc, right now this only be used
by kernel driver, the commit is just to sync with kernel driver.
Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add macros and structures for MAC frequency calculation in case the link
is not present.
Remove duplicate definition in i40e_ethdev.c
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The variable checksum from i40e_calc_nvm_checksum is used before return
value is checked. Fix this logic.
Fixes: 8db9e2a1b2 ("i40e: base driver")
Fixes: 3ed6c3246f ("i40e/base: handle AQ timeout when releasing NVM")
Cc: stable@dpdk.org
Signed-off-by: Christopher Pau <christopher.pau@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The status of i40e_read_nvm_word is not checked, so variables set
from this function could be used uninitialized. In this case, preserve
the existing flow that does not block initialization by initializing
these values from the start.
Fixes: 8d6c51fcd2 ("i40e/base: get OEM version")
Fixes: 2db7057424 ("net/i40e/base: limit PF/VF specific code to that driver only")
Cc: stable@dpdk.org
Signed-off-by: Christopher Pau <christopher.pau@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Fix mismatched function name in comments.
Fixes: 8db9e2a1b2 ("i40e: base driver")
Fixes: 842ea19963 ("i40e/base: save link module type")
Fixes: fd72a2284a ("i40e/base: support LED blinking with new PHY")
Fixes: 788fc17b2d ("i40e/base: support proxy config for X722")
Cc: stable@dpdk.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add flags for outer VLAN and include set port parameters.
Add flags, which describe port and switch state for both double VLAN
functionality and outer VLAN processing.
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
For Active Optical Cable (AOC) the correct media type is "Fibre",
not "Direct Attach Copper".
Fixes: d749d4d899 ("i40e/base: add AOC PHY types")
Fixes: aa153cc89f ("net/i40e/base: add new PHY types for 25G AOC and ACC")
Cc: stable@dpdk.org
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The X722 card has 'Link Type' information elsewhere than the X710.
Previously, for all cards, the 'Link Type' information was retrieved by
opcode 0x0607 and this value was wrong for all X722 cards.
Now this information for X722 only is taken by opcode 0x0600
(function: i40e_aq_get_phy_capabilities) instead of an opcode
0x0607 (function: i40e_aq_get_link_info).
All other parameters read by opcode 0x0607 unchanged.
Fixes: e6691b428e ("i40e/base: fix PHY NVM interaction")
Fixes: 75c3de654e ("net/i40e/base: fix long link down notification time")
Cc: stable@dpdk.org
Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
PF has to delete all the filters during reset.
If it is fully loaded with filters then it is possible that it will take
more than 200 ms to finish the reset resulting in timeout during pf_reset
and PF reset failed, -15 error indication.
Increasing the timeout value for PF reset from 200 to 1000 to give PF
more time to finish reset if it is loaded with filters.
Fixes: 1e32378f07 ("i40e/base: increase PF reset max loop limit")
Cc: stable@dpdk.org
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Unlike other supported adapters, 2.5G and 5G use different PHY type
identifiers for reading/writing PHY settings and for reading link status.
This commit introduces separate PHY identifiers for these two operation
types.
Fixes: 988ed63c74 ("net/i40e/base: add support for Carlsville device")
Cc: stable@dpdk.org
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Add definitions for Shadow RAM pointers: 6th FPA (Free Provisioning Area)
module, 5th FPA module in X722 and Preservation Rules module.
These definitions are not using by DPDK now, the purpose of this commit
is to sync base code with kernel driver.
Signed-off-by: Stanislaw Grzeszczak <stanislaw.a.grzeszczak@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Some customers want to downgrade to an earlier FW security revision, this
already implemented by FW so that customers can have more control over
the security revisions they can use. FW also implemented a mechanism via
NVMupdate to allow the users to accept or not a baseline Min SRev version
that will limit the secure version rollback only down to that level.
This commit increments X722 API version and adds new minimal rollback
revision that related to the extended implementation of Security Revision
Opt-In for 4 more X722 modules.
These definitions are not using by DPDK now, the purpose of this commit
is sync with latest share code.
Signed-off-by: Stanislaw Grzeszczak <stanislaw.a.grzeszczak@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
ASQ(Admin Send Queue) send command functions are returning only i40e
status codes yet some calling functions also need Admin Queue status that
is stored in hw->aq.asq_last_status. Since hw object is stored on a heap
it introduces a possibility for a race condition in access to hw if
calling function is not fast enough to read hw->aq.asq_last_status before
next send ASQ command is executed.
Added new versions of send ASQ command functions that return Admin Queue
status on the stack to avoid race conditions in access to
hw->aq.asq_last_status.
Added new _v2 version of i40e_aq_remove_macvlan and i40e_aq_add_macvlan
that is using new _v2 versions of ASQ send command functions and returns
the Admin Queue status on the stack.
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
On the vector implementation, during the tear-down, the mbufs not
drained in the RxQ and TxQ are freed based on an algorithm which
supposed that the number of descriptors is a power of 2 (max_desc).
Based on this hypothesis, this algorithm uses a bitmask in order to
detect an index overflow during the iteration, and to restart the loop
from 0.
However, there is no such power of 2 requirement in the ixgbe for the
number of descriptors in the RxQ / TxQ. The only requirement is to have
a number correctly aligned.
If a user requested to configure a number of descriptors which is not a
power of 2, as a consequence, during the tear-down, it was possible to
be in an infinite loop, and to never reach the exit loop condition.
By removing the bitmask and changing the loop method, we can avoid this
issue, and allow the user to configure a RxQ / TxQ which is not a power
of 2.
Fixes: c95584dc2b ("ixgbe: new vectorized functions for Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Julien Meunier <julien.meunier@nokia.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Some packets are discarded by the NIC because they are larger than
the MTU, these packets should be counted as "RX error" instead of
"RX packet", for example:
pkt1 = Ether()/IP()/Raw('x' * 1400)
pkt2 = Ether()/IP()/Raw('x' * 1500)
---------------- Forward statistics for port 0 -----------------
RX-packets: 2 RX-dropped: 0 RX-total: 2
TX-packets: 1 TX-dropped: 0 TX-total: 1
----------------------------------------------------------------
Here the packet pkt2 has been discarded, but still was counted
by "RX-packets"
The register 'GL_RXERR1' can count above discarded packets.
This patch adds reading and calculation of the 'GL_RXERR1' counter
when reporting DPDK statistics.
Fixes: f4a91c38b4 ("i40e: add extended stats")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
If GTPU Extension header has no pdu_type setting, the parsed value of
gtp_psc_spec->hdr.type will be 0, which is same as IAVF_GTPU_EH_DWLINK.
Thus, for this case, we should check gtp_psc_mask->hdr.type instead,
to set QFI field bit of GTPU_EH first.
Fixes: cd212c4669 ("net/iavf: fix QFI fields of GTPU UL/DL for flow director")
Cc: stable@dpdk.org
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Indirect actions should be used to do shared counters.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
The representor support has been implemented to some extent, and the fact
that ethdev mport is equivalent to entity mport is by design.
Fixes: 1fb65e4dae ("net/sfc: support flow action port ID in transfer rules")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Let the driver provide the user with information about available
representors by implementing the representor_info_get operation.
Due to the lack of any structure to representor IDs, every ID range
describes exactly one representor.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Representor IDs must be unique for each representor. VFs, which are
currently used, are not unique as they may repeat in combination with
different PCI controllers and PFs. On the other hand, switch port IDs
are unique, so they are a better fit for this role.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Allow the user to specify representor entities using the structured
parameter values.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Make representor names unique on multi-host configurations.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
This information will be useful when representor info API is implemented.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Newer hardware may have arbitrarily complex controller configurations,
and for this reason the mapping has been made dynamic: it is represented
with a dynamic array that is indexed by controller numbers and each
element contains an EFX interface number. Since the number of controllers
is expected to be small, this approach should not hurt the performance.
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
If for some reason the hardware switch ID initialization function fails,
MAE lock is still held after the function finishes. This patch fixes that.
Fixes: 1e7fbdf0ba ("net/sfc: support concept of switch domains/ports")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>