This patch adds a few optimizations to Tx batching. Instead of
specifying 'cmpl_next' to bnxt_start_xmit() as a hint to enable
completion for a given packet, request for completion on the last
successfully enqueued TxBD. This takes care of both success and
error cases. It eliminates 'cmpl_next' arg and related per-packet
condition checks around it.
Fixes: 5735eb2419 ("net/bnxt: support Tx batching")
Cc: stable@dpdk.org
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
This patch addresses the following issues with Tx batching:
1. Tx stall observed in some conditions:
The batching code doesn't request for a completion when only a partial
chain of packets is transmitted due to mbuf allocation errors. Because
of this, Tx consumer index is not updated correctly and it eventually
leads to qfull condition. Fix this by requesting a completion for the
last packet in the partial chain that is transmitted successfully.
2. Tx stall seen with Jumbo frames:
With jumbo frames, number of TxBDs is > 1. While setting up these
additional BDs in bnxt_start_xmit(), the flags field is being set using
the OR-assignment operator. We end up using a stale value of the flags
field (from a previous use of that descriptor). This results in an
invalid completion and eventually leads to tx stall. Fix this to just
assign the flags field with the right value.
Fixes: 5735eb2419 ("net/bnxt: support Tx batching")
Cc: stable@dpdk.org
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
The HW can have limits on the minimum packet size it can support,
or the maximum number of segments it can support. Check for such
possibilities. Also check if we are going to have a 0 length buffer.
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
We wrongly update lflags in the Tx descriptor; avoid it.
Also, instead of calculating the last producer index to see if mbuf
segments are chained, check if the pointer is NULL to iterate through
the segment list.
Fixes: 6eb3cc2294 ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
In function mlx5_rxq_ibv_new(), pointer *tmpl allocation is attempted
at the start, but not validated or freed in case of error.
In function mlx5_txq_ibv_new(), pointer *txq_ibv allocation is
attempted at the start, but not freed in case of error.
This patch adds pointers initialization, validation and freeing.
Fixes: 09cb5b5817 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Fixes: faf2667fe8 ("net/mlx5: separate DPDK from verbs Tx queue objects")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
Patch [1] added, among other definitions, the macro MLX5_ST_SZ_DB.
Patch [2] added later the macro MLX5_ST_SZ_BYTES, which is exactly
the same macro with a different name.
Each of these macros was used in very few places.
This patch removes the definition of MLX5_ST_SZ_DB, and replaces it
with MLX5_ST_SZ_BYTES wherever it was used.
Macro MLX5_ST_SZ_BYTES was preffered since it is the same macro
name used in kernel code, see [3].
[1] http://patches.dpdk.org/patch/45254/
[2] http://patches.dpdk.org/patch/49403/
[3] https://lists.openwall.net/netdev/2014/10/02/75
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Add support to match all TCP control bits (flags)
except "NS (ECN-nonce)" via Direct Verbs (DV) or Direct Rule (DR)
engine.
Signed-off-by: Jack Min <jackmin@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
This driver should use dynamic log level not RTE_LOGTYPE_PMD.
Other drivers were converted back in 18.02.
This is really a bug, all other drivers use dynamic log levels
by now.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Multiple functions were declared in header file mlx5_rxtx.h,
implemented in mlx5_rxq.c, and called only in mlx5_rxq.c.
This patch moves all these functions declarations into mlx5_rxq.c,
as static functions.
Some functions implementation was copied higher in the file to
precede the functions calls.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Return value of function mlx5_rxq_releasable() was not described
correctly in function description.
This patch updates the description to correctly describe the optional
return values.
Fixes: a6d83b6a92 ("net/mlx5: standardize on negative errno values")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Function mlx5_rxq_ibv_release() is called in several places.
Before each call except one, the input parameter is validated to make
sure it is not null.
This patch adds the validation where it is missing.
It also changes a priv_ prefix, left in a comment, to mlx5_ prefix.
Fixes: af4f09f282 ("net/mlx5: prefix all functions with mlx5")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Functions implemented but never called:
mlx5_rxq_ibv_releasable()
mlx5_rxq_cleanup()
mlx5_txq_ibv_releasable()
Function declared but not implemented:
rxq_alloc_mprq_buf()
This patch removes these functions from code and header file.
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Currently, i40e Flow Director action parser only allows following nine
action combinations: (QUEUE, PASSTHRU, DROP, QUEUE + MARK, PASSTHRU +
MARK, DROP + MARK, QUEUE + FLAG, PASSTHRU + FLAG, DROP + FLAG)
Using the existing Cloud Filter profile on the NIC, it is possible to
add support for two more combinations as: (MARK + RSS, MARK + FLAG +
RSS)
Addition of these new combinations would allow more applications to
utilize DPDK rte_flow to implement hardware flow offloads with Intel
Ethernet 700 series network adapters, including but not limited to the
existing OVS DPDK partial hardware flow offload feature.
Signed-off-by: Mesut Ali Ergin <mesut.a.ergin@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to
librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Also rename arp_hrd, arp_pro, arp_hln, arp_pln and arp_op fields
to avoid conflict with the #defines in gnu libc.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Since ice can support 128, 512, 2K RSS RETA size value, change the
update API to set it to resize the RSS RETA table. And by default,
use 512 to sync with ETH_RSS_RETA_SIZE_x maximum value definition.
Also the flag ICE_FLAG_RSS_AQ_CAPABLE is missed to set.
Fixes: 690175ee51 ("net/ice: support getting device information")
Fixes: ff963bfa7c ("net/ice: support RSS")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Tx desc's DD status is not cleaned by NIC automatically after packets
have been transmitted until software refill a new packet during next
loop. So when tx_free_thresh + tx_rs_thresh > nb_desc, it is possible
that an outdated DD status be checked as tx_next_dd, then segment fault
happen due to free a NULL mbuf pointer.
Then patch fixes this issue by
1. try to adapt tx_rs_thresh to an aggressive tx_free_thresh.
2. queue setup fail when tx_free_thresh + tx_rs_thresh > nb_desc
Fixes: af75078fec ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Tx desc's DD status is not cleaned by NIC automatically after packets
have been transmitted until software refill a new packet during next
loop. So when tx_free_thresh + tx_rs_thresh > nb_desc, it is possible
that an outdated DD status be checked as tx_next_dd, then segment fault
happen due to free a NULL mbuf pointer.
Then patch fixes this issue by
1. try to adapt tx_rs_thresh to an aggressive tx_free_thresh.
2. queue setup fail when tx_free_thresh + tx_rs_thresh > nb_desc
Fixes: 50370662b7 ("net/ice: support device and queue ops")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Tx desc's DD status is not cleaned by NIC automatically after packets
have been transmitted until software refill a new packet during next
loop. So when tx_free_thresh + tx_rs_thresh > nb_desc, it is possible
that an outdated DD status be checked as tx_next_dd, then segment fault
happen due to free a NULL mbuf pointer.
Then patch fixes this issue by
1. try to adapt tx_rs_thresh to an aggressive tx_free_thresh.
2. queue setup fail when tx_free_thresh + tx_rs_thresh > nb_desc
Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
The nic's interrupt source has some active handler, which maybe call
ixgbe_dev_link_update() to set link handler. We should cancel the
link handler before remove dev to prevent executing the link handler.
It triggers segfault.
Fixes: 0408f47ba4 ("net/ixgbe: fix busy polling while fiber link update")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Wei Zhao <wei.zhao1@intel.com>
This commit sets the min and max supported MTU values for ice devices
via the ice_dev_info_get() function. Min MTU supported is set to
ETHER_MIN_MTU and max mtu is calculated as the max packet length
supported minus the transport overhead.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Refuse to create the second RSS flow rule as only one RSS key and
HASH register for each PF port. Users should delete the first rule
before setup the second rule.
Fixes: ecad87d223 ("net/i40e: move RSS to flow API")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Tested-by: Wenjie Li <wenjiex.a.li@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Some application, e.g. the l3fwd-power sample uses rte_eth_rx_queue_count()
API to get the get the number of used descriptors of a Rx queue. This patch
adds fm10k implementation for this API.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Support link up and down functions for ice, and when stop the ice,
makes the link down also.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
IBV_EVENT_DEVICE_FATAL event is generated by the driver once for
the entire multiport Infiniband device, not for each existing ports.
The port index is zero and it causes dropping the device removal
event. We should invoke the removal event processing routine
for each port we have installed handler for.
Fixes: 028b2a28c3 ("net/mlx5: update event handler for multiport IB devices")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When Direct Rules API is not supported we don't set the errno.
This results in failing the function but with errno equals to zero.
The result of this is that a function that failed, is considered as
a function that worked correctly.
This commit fixes this issue by setting the errno to ENOTSUP and
returning this error when error value should be returned.
Since RDMA-CORE are returning positive errno we are also returning
positive error values.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Function mlx5_rx_intr_disable() calls mlx5_rxq_ibv_get() and performs
some actions on the returned rxq_ibv.
It doesn't release the rxq_ibv when all is completed with success.
This patch adds call to mlx5_rxq_ibv_release() where it's missing.
Fixes: 09cb5b5817 ("net/mlx5: separate DPDK from verbs Rx queue objects")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Currently, the name of MPRQ mempool is set by
snprintf(name, sizeof(name), "%s-mprq", dev->device->name);
For port representor, the name is duplicate of its master and failed to
create such a mempool having the same name. Port ID is used in the name
instead.
Fixes: 7d6bf6b866 ("net/mlx5: add Multi-Packet Rx support")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Recent patch [1] added, at the end of mlx5_dev_configure(), a call to
mlx5_proc_priv_init(), initializing process_private data of eth_dev.
This call is not reached if PMD is started with zero Rx queues.
In this case mlx5_dev_configure() returns earlier due to the check:
if (rxqs_n == priv->rxqs_n)
return 0;
In such a scenario, later references to uninitialized process_private
data will result in segmentation fault.
For example see in function txq_uar_init().
This patch changes the check logic. The following code is executed
if (rxqs_n != priv->rxqs_n), and skipped otherwise.
Function mlx5_proc_priv_init() is always invoked, to ensure
process_private data is initialized.
[1] http://patches.dpdk.org/patch/52629/
Fixes: 120dc4a7dc ("net/mlx5: remove device register remap")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Enable advanced link speed mode (VIRTCHNL_VF_CAP_ADV_LINK_SPEED) so iavf
PMD can identify more link speed that reported by pf.
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wei Zhao <wei.zhao1@intel.com>
New PCI device ids are created to support X710/XXV710 of Intel FPGA
Programmable Acceleration card N3000, also called ipn3ke.
Signed-off-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The last word should not cross shadow RAM boundary.
Fixes: 68a1ab82ad ("net/ice: speed up to retrieve EEPROM")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
This patch is a cleanup on comment, variable modifier, coding style.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When dev_close is called, the netvsc driver will clean up all
queues including the primary ring buffer.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Redo the receive logic to set m->port on packets received on VF.
When using VF, still need to check for packets and completions
arriving on the VMBus path even if application is not doing bursting
(ie n_rx == 0).
Also, fix comment.
Fixes: dc7680e859 ("net/netvsc: support integrated VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
The driver supports VLAN filtering, but the option is not included in
the advertised list of capabilities. Thus any attempt to enable VLAN
filtering always fails. Update the list of offload capabilities to
include DEV_RX_OFFLOAD_VLAN_FILTER.
Fixes: 95e4a96ccb ("net/vmxnet3: convert to new Rx offload API")
Cc: stable@dpdk.org
Signed-off-by: Mark Gillott <mgillott@vyatta.att-mail.com>
Acked-by: Yong Wang <yongwang@vmware.com>
When secondary to primary process synchronization occurs
there is no check for number of fds which could cause buffer overrun.
Bugzilla ID: 252
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The RDMA-CORE Direct Rules API was changed in latest upstream code
This commit update the API accordingly.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
BlueField SmartNIC has 0xa2d2 as PCI device ID on both ARM and x86 host. On
ARM side, Tx inlining need not be used as PCI bandwidth is not bottleneck.
Vectorized Tx can still be used up to 16 queues. For other archs
(e.g., x86), keep using the default value.
Fixes: 09d8b41699 ("net/mlx5: make vectorized Tx threshold configurable")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
If Tx packet inlining is enabled, rdma-core library should allocate large
Tx WQ enough to support it. It is better for PMD to calculate the size of
WQ based on the parameters and return error with appropriate message if it
exceeds the device capability.
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When creating a flow rule without the port_id pattern item, always the
PF was selected.
This commit fixes this issue, if no port_id pattern item is available
then we use the port that the flow was created on as source port.
Fixes: 822fb31953 ("net/mlx5: add port id item to Direct Verbs")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
If disable LSC when launching testpmd with X722, shown
link speed is 20G but not 10G. It's caused by wrongly
parsed link speed.
Correct it according to X722 datasheet.
Fixes: eef2daf2e1 ("net/i40e: fix link update no wait")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Since 18.11, it is suggested that driver should release all its private
resources at the dev_close routine. So all resources previously released
in remove routine are now released at the dev_close routine, and the
dev_close routine will be called in driver remove routine in order to
support removing a device without closing its ports.
Above behavior changes are supported by setting RTE_ETH_DEV_CLOSE_REMOVE
flag during probe stage.
Also as af_xdp pmd doesn't allocate MAC addresses dynamically, it needs
to be set as NULL, so it won't be released by rte_eth_dev_release_port()
otherwise, there would be "EAL: Error: Invalid memory" error.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
ibv_destroy_flow_action() refers to QP. QP must not be freed until
corresponding action is destroyed.
Fixes: 3eb0044310 ("net/mlx5: fix release of jump to queue action")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Found by Coverity scan - dead code
Coverity issue: 337665
Fixes: 7943ba05f6 ("net/atlantic: add link status and interrupt management")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Found by Coverity scan, two pointer args should be not zero checked
Coverity issue: 337929
Fixes: 2f40244b39 ("net/atlantic: implement MACsec firmware interface")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Its better to use capability bits FW provides to detect whether
firmware has APIs for EEPROM access.
Before that explicit FW version comparison was used and that may
cause conflicts.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Driver should track negotiated PHY flow control settings during
link state changes and update MAC level flow control configuration.
Otherwise there could be unexpected pause frames generation which
could lockup the datapath.
Fixes: 4c1c8f7646 ("net/atlantic: add flow control configuration")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
EEPROM write accesses start using non zero configuration memory
accesses. This revealed an issue where firmware interface was actually
always did the uploads starting with zero offset
Fixes: 86d36773bd ("net/atlantic: implement firmware operations")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Clients may not filling magic field, thus causing garbage
to be passed as a device address.
Limit that to maximum SMbus address.
Fixes: ce4e8d4180 ("net/atlantic: implement EEPROM get/set")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Maximum size should be 256 bytes.
Move declaration to the top of the file
Fixes: ce4e8d4180 ("net/atlantic: implement EEPROM get/set")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
EEPROM get/set offset logic should take offset into account.
Data transfers to/from FW should also correctly use
dword based transfer interface, taking into account the
remainder.
We also check error code returned from FW.
Fixes: ce4e8d4180 ("net/atlantic: implement EEPROM get/set")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Make indentation correct. No functional impact.
Fixes: f73061d53b ("net/atlantic: use EEPROM magic as a device address")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Broadcast was not correctly and fully initialized
Fixes: 7906661eda ("net/atlantic: add b0 hardware layer")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Since this driver does not distinguish unicast/multicast addresses,
applications could always rely on the standard MAC add/remove/set
interface to configure both types.
As a result, the multicast address list interface never got implemented
(rte_eth_dev_set_mc_addr_list()) however PMD-agnostic applications still
rely on it for compatibility reasons; a wrapper is therefore required.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Mellanox mlx5 PMD implements the list of devices to process the memory
free events to reflect the actual memory state to Memory Regions.
Because this list contains the devices and devices may share the
same context the callback routine may be called multiple times
with the same parameter, that is not optimal. This patch modifies
the list to contain the device contexts instead of device objects
and shared context is included in the list only once.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The multiport Infiniband device support was introduced [1].
All active ports, belonging to the same Infiniband device use the single
shared Infiniband context of that device and share the resources:
- QPs are created within shared context
- Verbs flows are also created with specifying port index
- DV/DR resources
- Protection Domain
- Event Handlers
This patchset adds support for Memory Regions sharing between
ports, created on the base of multiport Infiniband device.
The datapath of mlx5 uses the layered cache subsystem for
allocating/releasing Memory Regions, only the lowest layer L3
is subject to share due to performance issues.
[1] http://patches.dpdk.org/cover/51800/
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
There are some physical link settings can be queried from
Ethernet devices: link status, link speed, speed capabilities,
duplex mode, etc. These setting do not make a lot of sense for
representors due to missing physical link. The new kernel drivers
dropped query for link settings for representors causing the
ioctl call to fail. This patch adds some kind of emulation
of link settings to PMD - representors inherit the link parameters
from the master device. The actual link status (up/down)
is retrieved from the representor device.
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
When creating the modify action using Direct Rules, we need to
add flags to mark, if the action will be done on root table or on
private table.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The RX stats will increase even no packets sent, this patch fix this issue
by modifying ipackets and ibytes statistics based on vsi instead of port
to avoid statistics error.
Fixes: a37bde5631 ("net/ice: support statistics")
Cc: stable@dpdk.org
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Identified by LGTM, the loop was incorrectly incrementing a different
variable and conditional on another.
Fixes: fe2b986ac6 ("net/dpaa2: support generic flow")
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
'rte_eth_dev_get_port_by_name()' return value is not checked caught by
coverity, adding return value check.
Coverity issue: 305853
Fixes: 96cb195211 ("net/ring: use EAL APIs in PMD specific API")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
The ring name is passed in as part of this drivers API, but it
doesn't check that the name is not truncated.
Fixes: 96cb195211 ("net/ring: use EAL APIs in PMD specific API")
Fixes: e1e4017751 ("ring: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Don't hard code string length in two place; use sizeof() instead.
Ring name should use RTE_RING_NAMESIZE.
Fixes: 96cb195211 ("net/ring: use EAL APIs in PMD specific API")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Whitespace fixes to bring inline with current DPDK style.
Fixes: e1e4017751 ("ring: add new driver")
Fixes: a74a99bbb8 ("ring: free rings when detaching device")
Fixes: 61934c0956 ("ring: convert to use of PMD_REGISTER_DRIVER and fix linking")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Fix the issues reported by Coverity check, "Null-checking
vsi suggests that it may be null, but it has already been
dereferenced on all paths leading to the check."
Coverity issue: 328509, 328519, 328523
Fixes: e0dcf94a0d ("net/ice: support VLAN ops")
Fixes: ff963bfa7c ("net/ice: support RSS")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Rx queue setup function checks configured MTU to make sure that
no oversized packets can be received. But a following call to
set MTU function might make this check irrelevant.
Add a function to check MTU size against Rx buffer size and
additional Rx queue info, including Rx scatter offload.
Fixes: e961cf425e ("net/sfc: support MTU change")
Cc: stable@dpdk.org
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If there is the support of DevX is exposed by rdma-core but
DevX is not supported by or disabled for the specific interface
the mlx5_devx_cmd_query_hca_attr() routine returns an error
preventing the device from successful probing. The routine
should be invoked only in case of enabled DevX.
Fixes: e2b4925ef7 ("net/mlx5: support Direct Rules E-Switch")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
In mlx5_rxq.c, in some comments, text includes "Tx" instead of "Rx".
In mlx5_txq.c, in some comments, text includes "Rx" instead of "Tx".
This patch fixes these typos.
Fixes: faf2667fe8 ("net/mlx5: separate DPDK from verbs Tx queue objects")
Fixes: a1366b1a2b ("net/mlx5: add reference counter on DPDK Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
All the library calls must be called via the glue layer.
Fixes: b2177648b8 ("net/mlx5: add Direct Rules flow data alloc/free routines")
Fixes: 79e35d0d59 ("net/mlx5: share Direct Rules/Verbs flow related structures")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Looking for an ethdev port is better (and more efficient)
with an ethdev API than an EAL one.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The flow group should be initialized.
For example selecting if the encapsulation is for root or private tables
is based on the flow->group value.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
stats_reset has been missed when support stats in iavf driver.
This patch add statistics reset function.
Fixes: f4a41a6953 ("net/avf: support stats")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
When sending synchronous IPC requests, the caller must free the response
buffer if the request was successful and reply is no longer needed.
Fix the code to correctly
use the IPC API.
Bugzilla ID: 228
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Compiling on Fedora 30, we get the following warning, causing build failure
when Werror flag is set:
../drivers/net/ixgbe/ixgbe_rxtx.c:2141:14: warning:
‘nmb’ may be used uninitialized in this function [-Wmaybe-uninitialized]
2141 | rxe->mbuf = nmb;
| ~~~~~~~~~~^~~~~
Initializing the value to "NULL" fixes the issue.
Fixes: 8eecb3295a ("ixgbe: add LRO support")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
As flagged by coverity, the "info" structure is being explicitly
dereferenced before being checked later for a NULL value.
Coverity issue: 277241
Fixes: 98e60c0d43 ("net/i40e: add module EEPROM callbacks for i40e")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Coverity flags that the txq variable is used before it's checked for NULL.
Also fix typo in error message.
Coverity issue: 195023
Fixes: 24853544c8 ("net/i40e: fix mbuf free in vector Tx")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
The function rte_eal_cleanup() was introduced more than one year ago,
in DPDK 18.02. It is no longer experimental, allowing
pdump, proc-info and hotplug_mp apps to not need any experimental API.
The function rte_ctrl_thread_create() was introduced one year ago
in DPDK 18.05. It is no longer experimental, allowing
KNI PMD and TEP example to not need any experimental API.
The functions rte_socket_count() and rte_socket_id_by_idx() were
introduced one year ago in DPDK 18.05. They are no longer experimental.
The function rte_dev_is_probed() was introduced half a year ago
in DPDK 18.11. It is no longer experimental.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
A successful call to rte_mp_request_sync does not guarantee that there
are any messages in the buffer, and this should be checked for before
accessing data in the message. Buffer can be empty if IPC is disabled or
if we decide to ignore replies.
Fixes: c9aa56edec ("net/tap: access primary process queues from secondary")
Cc: stable@dpdk.org
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This driver defines lots of functions (like init_rss) which are intended
to only be used in this device. But when doing static linking these
global functions cause link failures when similar function name is
used in application.
This patch prefixes all functions defined in cxgbe.h with cxgbe_
to avoid these kind of conflicts.
Fixes: bfcb257d30 ("net/cxgbe: enable RSS for VF")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
This patch introduces:
- VMxnet3 v4 negotiation and,
- entirely guest-driven UDP RSS support.
VMxnet3 v3 already has UDP RSS support, however it
depends on hypervisor provisioning on the VM through
ESX specific flags, which are not transparent or known
to the guest later on.
Vmxnet3 v4 introduces a new API transaction which allows
configuring RSS entirely from the guest. This API must be
invoked after device shared mem region is initialized.
IPv4 ESP RSS (SPI based) is also available, but currently
there are no ESP RSS definitions on rte_eth layer to
handle that.
Signed-off-by: Eduard Serra <eserra@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Add validation to pointer constructed from the IPv4 header length
in order to prevent malformed packets from generating a potential
out of bounds memory read.
Fixes: 09150784a7 ("net/bonding: burst mode hash calculation")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Chas Williams <chas3@att.com>
This commit adds support for drop action when creating E-Switch flow
using DV.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Actions like encap/decap, modify header require setting the flow table
type. Until now we supported only Nic RX and Nic TX, this commits adds
the support for FDB table type for those actions.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
In current implementation the DV steering supported only NIC steering.
This commit adds the transfer attribute in order to create a matcher
on the FDB tables.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This commit checks the for DR E-Switch support.
The support is based on both Device and Kernel.
This commit also enables the user to manually disable this this feature.
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The meson build was missing the define for Direct Rules.
Fixes: 4f84a19779 ("net/mlx5: add Direct Rules API")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Modify the translate vport function to match other translate items
naming conventions.
Fixes: 0fe3f18f78 ("net/mlx5: add source vport match to the ingress rules")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
According to RTE flow the action order should be the order that the
actions were given.
In the case of modify actions the position of the action was always
last.
This commit solves this issue by saving the position of the first modify
action, and then adds to this position the pointer to the modify action.
Fixes: 4bb14c83df ("net/mlx5: support modify header using Direct Verbs")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Currently the allocation of the jump to QP is done in flow apply,
this results in memory leak.
This patch fixes this issue by moving the allocation and release of the
jump to QP action to the responsibility of the hrxq.
Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
On BlueField platform we have the new entity - PF representor.
This one represents the PCI PF attached to external host on the
side of ARM. The traffic sent by the external host to the NIC
via PF will be seem by ARM on this PF representor.
This patch refactors port recognizing capability on the base of
physical port name. We have two groups of name formats. Legacy
name formats are supported by kernels before ver 5.0 (being
more precise - before the patch [1]) or before Mellanox OFED 4.6,
and new naming formats added by the patch [1].
Legacy naming formats are supported:
- missing physical port name (no sysfs/netlink key) at all,
master is assumed
- decimal digits (for example "12"), representor is assumed,
the value is the index of attached VF
New naming formats are supported:
- "p" followed by decimal digits, for example "p2", master
is assumed
- "pf" followed by PF index concatenated with "vf" followed by
VF index, for example "pf0vf1", representor is assumed.
If index of VF is "-1" it is a special case of host PF
representor, this representor must be indexed in devargs
as 65535, for example representor=[0-3,65535] will
allow representors for VF0, VF1, VF2, VF3 and for host PF.
Note: do not specify representor=[0-65535], it causes devargs
processing error, because number of ports (rte_eth_dev) is
limited.
Applications should distinguish representors and master devices
exclusively by device flag RTE_ETH_DEV_REPRESENTOR and do not
rely on switch port_id (mlx5 PMD deduces ones from representor_id)
values returned by dev_infos_get() API.
[1] https://www.spinics.net/lists/netdev/msg547007.html
Linux-tree: c12ecc23 (Or Gerlitz 2018-04-25 17:32 +0300)
"net/mlx5e: Move to use common phys port names for vport representors"
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
mlx4 driver has a global list of Memory Regions created by
device, and there is a ml4_mr_release() routine which makes
a memory cleanup at device closing. The head of device MR list
was fetched outside the rwlock protected section. Also some
noticed typos are fixed.
Fixes: 9797bfcce1 ("net/mlx4: add new memory region support")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
mlx5 driver has a global list of Memory Regions created by
device, and there is a ml5_mr_release() routine which makes
a memory cleanup at device closing. The head of device MR list
was fetched outside the rwlock protected section. Also some
noticed typos are fixed.
Fixes: 974f1e7ef1 ("net/mlx5: add new memory region support")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
When passed to the application, Rx packets retain the port ID value
originally set by slave devices. Unfortunately these IDs have no
meaning to applications, which are typically unaware of their existence.
This confuses those caring about the source port field in mbufs
(m->port) which experience issues ranging from traffic drop to crashes.
Fixes: a46f8d584e ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Some port iterations are manually checking against RTE_ETH_DEV_UNUSED
instead of using the iterators based on rte_eth_find_next().
A new macro RTE_ETH_FOREACH_VALID_DEV() is introduced, but kept private
because there should be no need of iterating over all devices in the
API. The public iterators have additional filters for ownership, parent
device or sibling ports.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
As stated in the deprecation notice from December 2016,
"the legacy filter API, including rte_eth_dev_filter_supported(),
rte_eth_dev_filter_ctrl() as well as filter types MACVLAN, ETHERTYPE,
FLEXIBLE, SYN, NTUPLE, TUNNEL, FDIR, HASH and L2_TUNNEL, is superseded
by the generic flow API (rte_flow)".
After a long wait of more than two years, the legacy filter API
is marked as deprecated, while still tested with testpmd and
the tep_termination example.
The next step will be to announce a deadline for complete removal.
As preparation of the removal of rte_eth_ctrl.h,
RTE_ETH_FLOW_*, RTE_TUNNEL_TYPE_* and RTE_ETH_HASH_FUNCTION_* definitions
are moved to rte_ethdev.h and rte_flow.h.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Version is synced with internal Aquantia's driver versioning
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
We add extra xstat fields to include MACsec counters and stats
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
MACSEC should be configured only after link up event, thus we use
link interrupt to file an alarm for configuration.
FW also uses link interrupt line to indicate incoming events from
MACSEC. These may include key expiration, packet counter wrap, etc.
We pass these events to the upper layers.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Implementation of firmware interface for MACsec configuration.
Structure with config data is written into FW memory, then we trigger
FW to execute the request and wait for result.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
These are driver MACsec configuration routines.
They fill in config structures and prepare these
to be send to FW. Actual configuration will happen in
link interrupt handler.
We declare MACsec offload bits in DPDK offload capabilities
and provide external experimental MACsec API wrappers.
Also update documentation with feature matrix for the
enabled feature.
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Here we define hardware and software configuration structures
for MACsec interface. MACSEC itself is implemented in Phy module,
but its configuration is done via firmware interface
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
When launch testpmd with VF and quit testpmd, there'll
be lots of messages "i40evf_handle_aq_msg(): Request 0
is not supported yet", which are triggered during VF
reset and VF needn't do anything. So cancel alarm handler
before VF reset to ignore the admin queue messages.
Fixes: e0e6a7f215 ("net/i40e: cancel alarm handler at the end of closure")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Enable CLEARPBA bit is required by ice NIC of A0/A1 version to
enable Tx and Rx queue interrupt.
Also enable CLEARPBA bit does no impact on IAVF behaviour when be
hosted by other devices, so we can make it as default.
Fixes: d6bde6b5ea ("net/avf: enable Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
We should use ETH_AF_XDP_RX_BATCH_SIZE in Rx function rather than
ETH_AF_XDP_TX_BATCH_SIZE.
Fixes: f1debd77ef ("net/af_xdp: introduce AF_XDP PMD")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
As David pointed out, if we reserve N slots for Tx, but only submit n
slots, we would end up with an incorrect opinion of the number of available
slots later, we also would get wrong idx when we call
xsk_ring_prod__reserve next time. It also applies to
xsk_ring_cons__peek()/xsk_ring_cons__release().
This patch ensures that both reserve/submit and peek/release are
consistent.
Fixes: f1debd77ef ("net/af_xdp: introduce AF_XDP PMD")
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Properly report mtu capability in port device info.
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When it fails to allocate enough slots in Tx queue for transmitting
packets, we need to return the dequeued addrs to buf ring.
Fixes: f1debd77ef ("net/af_xdp: introduce AF_XDP PMD")
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
When eth_virtio_dev_init() is cleaning up, it does not correctly set
the mac_addrs variable to NULL, which will lead to a double free.
Found during unit-test fixes.
Fixes: 43d18765c0 ("net/virtio: fix memory leak on failure")
Cc: stable@dpdk.org
Reported-by: Michael Santana <msantana@redhat.com>
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
The function rte_vlan_insert may allocate a new buffer for the
vlan header and return a different mbuf than originally passed.
In this case, the stored mbuf in txm[] array could point to wrong
buffer.
Fixes: dd856dfcb9 ("virtio: use any layout on Tx")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When device promiscuous mode has already been enabled,
should not return error if user re-enable it.
Fixes: c945e4bf90 ("net/ice: support promiscuous mode")
Cc: stable@dpdk.org
Signed-off-by: Ying A Wang <ying.a.wang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Variable "status" should be defined as "enum ice_status"
instead of "uint16_t". This patch fixes the issue.
Fixes: c945e4bf90 ("net/ice: support promiscuous mode")
Cc: stable@dpdk.org
Signed-off-by: Ying A Wang <ying.a.wang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
iavf driver reset the dev_info structure. This action will clear
the setting in core library. This patch fix that issue.
Fixes: 69dd4c3d08 ("net/avf: enable queue and device")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
The vring relay thread is created after HW datapath start and is
canceled before HW datapath stop, so we need to take care of the
ring update when the relay thread is not on duty.
Fixes: 4bb531e152 ("net/ifc: support SW assisted VDPA live migration")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
The mediated vring's index should be set as initial value before
enabling HW datapath.
Fixes: 4bb531e152 ("net/ifc: support SW assisted VDPA live migration")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Relay for Tx queue doesn't provide any benefit, since the packet buffer
in Tx queue is read-only, there's no dirty page logging for Tx buffer.
This change further reduces CPU usage and improves throughput.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
On xmit side, there should be a check whether BD ring
has free BDs available before transmit a packet to avoid
data corruption and buffer leak issue.
Fixes: 469c6111a7 ("net/enetc: enable Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Support for SXGMII port has been enabled. It will
depends on boot loader information passed through IERB.
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
There was an error at rte_constant_bswap64 while compiling
with big endian toolchain. so fixing it by adding type cast.
Also, rte_pktmbuf_alloc API should be used to allocate mbuf
instead of rte_pktmbuf_raw_alloc to avoid use of stale mbuf
information.
Fixes: 469c6111a7 ("net/enetc: enable Rx and Tx")
Cc: stable@dpdk.org
Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
First configure ring with BDs properly then enable
the ring.
Fixes: 469c6111a7 ("net/enetc: enable Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
The RSS validation function was missing the verifcation that
if RSS is requested on inner packet, the flow must have tunnel data.
Fixes: 23c1d42c71 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Enable the optic module vendor and part number verification from driver
by setting appropriate flags. Link handling module in conjunction with
FW performs optic module verification.
Without this fix the SFP is powered down and the link doesn't come up,
if module verification is enforced in nvram setting.
Fixes: 540a211084 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
The periodic callout function and the interrupt handler both modify the
periodic flags. There is a possible race condition when an application
is going through dev_stop()/dev_start() and an interrupt handler is
invoked. We also need to ensure that periodic polling is not invoked in
interrupt context.
This patch handles such case by using separate variable to check for
interrupt context. Also, atomically load and store the periodic flag
value.
Fixes: 0f6ebeee24 ("net/bnx2x: fix call to link handling periodic function")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rmody@marvell.com>
In some cases, DPDK application may send packets
while PMD is going through load or unload flow.
This causes firmware to access invalid/unallocated
memory to process transmit buffer. Which results in
error on PCI bus and chip further blocks access to host,
causing a DMAE timeout.
Fix this issue by installing dummy empty transmit and receive
handlers at the beginning of unload path (rte_eth_dev_stop())
and install actual transmit and receive handlers after successful
load of the PMD port (rte_eth_dev_start()). This way, application
won't be able to send packets while device is going through
load/unload flow.
Fixes: 540a211084 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
There is a race condition while processing RAMROD
completion in fast path queue through interrupt handler
and polling method.
Interrupt handler invoked from actual interrupt event
and from RAMROD processing polling flow may create a
situation where one flow will read and clear a fastpath
interrupt without actually processing the RAMROD completion.
Thus, causing a RAMROD timeout even though HW sent an
completion event.
Fix this by introducing an atomic variable which will be
set only when interrupt handler needs to process RAMROD
completion.
Fixes: 540a211084 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
We allocate DMA memory but never free after using it.
Add function to free DMA memory.
Fixes: b5bf771922 ("bnx2x: driver support routines")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
If MLNX_OFED is installed, there's no .pc file installed for libraries and
dependency() can't find libraries by pkg-config. By adding fallback of
using cc.find_library(), libraries are properly located.
Fixes: e30b4e566f ("build: improve dependency handling")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Define variables for "is_linux", "is_freebsd" and "is_windows"
to make the code shorter for comparisons and more readable.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
This patch stopped LLDP by default to avoid the statistics error.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
No need to add additional vlan tag size for max packet size, since
for i40e, the queue's Rx Max Frame Size (rxq->max_pkt_len) already
includes the vlan header size.
Fixes: a3c83a2527 ("net/i40e: enable runtime queue setup")
Fixes: 4861cde461 ("i40e: new poll mode driver")
Fixes: c1715402df ("i40evf: fix jumbo frame support")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The result of the endian conversion was not being used incorrectly.
Fixes: b11cceb83a ("net/bnxt: support timesync")
Cc: stable@dpdk.org
Reported-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
The vlan pattern handler currently assumes the NIC always strips vlan
header from the L2 buffer, regardless of the vlan strip setting. But,
with older VIC models, the vlan header is actually present in the L2
buffer if stripping is disabled. So in this case, the inner ether type
needs to be shifted by that much.
Fixes: 6ced137607 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Currently, the raw item is always preceded by a UDP header, and both
land in the L4 pattern buffer. So consider the UDP header size when
checking if the raw spec fits in the L4 buffer.
Coverity issue: 336796, 336850
Fixes: 477959e6ee ("net/enic: enable limited support for raw flow item")
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx4_proc_priv *)rte_eth_devices[port_id].process_private
The actual UAR table follows the data structure and the table is used
for both Tx and Rx.
For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX4_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.
For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private
The actual UAR table follows the data structure and the table is used
for both Tx and Rx.
For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.
For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Queue index is redundantly stored for both Rx and Tx structures.
E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single
storage - rxq->idx and txq->idx.
Also, rxq and txq are moved to the beginning of its control structure
(rxq_ctrl and txq_ctrl) for cacheline alignment.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Max frame size setup should consider double VLAN case.
Fixes: ae2bdd0219 ("net/ice: support MTU setting")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
If rdma-core library doesn't support remapping UAR registers, the register
shouldn't be unmapped on device stop.
Fixes: 0203d33a10 ("net/mlx4: support secondary process")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
In case of cross compilation on aarch64 we must add include for
stdlib in order to use the free function.
Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")
Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Call to CPP read (nfp_cpp_readl()) can fail, return 0 on fail.
If the call to _nfp6000_cppat_mu_locality fails, the function needs
to return with an error.
If the nfp_cpp_readl() call fails just returns 0.
Coverity issue: 277209, 277215, 277225
Fixes: c7e9729da6 ("net/nfp: support CPP")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Although it is rather unlikely getting 0 as the descriptor handle, better
to contemplate that possibility.
Coverity issue: 195018
Fixes: 896c265ef9 ("net/nfp: use new CPP interface")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
The boolean value was assigned to Tx flag twice, so it could cause bug
whenever Rx checksum will not be supported and Tx will be.
Coverity issue: 336831
Fixes: 117ba4a604 ("net/ena: get device info statically")
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
At the mlx5 device closing the shared IB context was destroyed
before cleanup routines completion. As it was found on some
setups (Netlink fails with old kernel drivers and we have to use
sysfs to retrieve interface index, this requires IB device name,
which is stored in shared context) the mlx5_nl_mac_addr_flush()
requires IB device name, and if shared context is removed it
causes the segmentation fault.
Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Retrieving network interface index via Netlink fails in
case of old ib_core kernel driver installed - mlx5_nl_ifindex()
routine fails due to RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not
supported by the old driver.
The patch allowing to retrieve the network interface index and
name via Netlink [1]. So, the problem depends on ib_core module
version - 4.16 supports getting ifindex via Netlink, 4.15 does not.
This error was ignored in previous versions of MLX5 PMD probing
routine. For single device ifindex was retrieved via sysfs
and link control was not lost, so problem just was not noticed.
In order to support MLX5 PMD functioning over old kernel driver
this patch adds ifindex retrieving via sysfs into probing routine.
It is worth to note this method works for master/standalone
device only.
[1] https://www.spinics.net/lists/linux-rdma/msg62948.html
Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)
Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
This patch implements the changes proposed in the deprecation
note[1]. Replace multiple color definitions in various places such as
rte_meter.h, rte_tm.h and rte_mtr.h with single rte_color defined
in rte_meter.h.
This is simple search and replace exercise without any implementation
change.
[1] https://mails.dpdk.org/archives/dev/2019-January/123861.html
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Rx refill in one bulk (which is just 8 descriptors) by default is too
aggressive and makes too many MMIO writes (Rx doorbells) if packet rate
is high. Setting default to 1/8 of Rx descriptors number shows good
performance results. Anyway it is a default value which may be
overridden by Rx configuration provided by application.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add new protocol type VXLAN-GPE support for UDP tunnel.
inner IP/TCP/UDP checksum and RSS configuration shared
the same implementation of VXLAN.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Currently, if requested MTU is bigger than mbuf size and scattered
receive is not enabled, setting MTU to that value fails.
This patch allows setting this special MTU when device is stopped,
because scattered_rx will be re-configured during next port start
and driver may enable scattered receive according new MTU value.
After this patch, driver may select different receive function
automatically after MTU set, according MTU values selected.
Signed-off-by: David Harton <dharton@cisco.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
The driver must send its version information to the firmware, so
the firmware knows the driver is up. Otherwise, it will cause unexpected
OS package downloading when multiple driver instances running on the
same device.
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
This patch enable 10Mb/s link for ixgbe X553.
This new device has own device id of 0x15E4 and 0x15E5, so
ixgbe PMD driver need to special check when setup link for
these two types of device.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>