By default, Verbs maps the doorbell register to write combining.
Working with write combining is useful for drivers which use blue flame
for the doorbell write.
Since the mlx5 PMD uses only doorbells, and a write combining mapping
requires an extra memory barrier to flush the doorbell after its write,
set the mapping to un-cached by default.
This change is expected to reduce the max and average round trip latency.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Alexander Solganik <solganik@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The reason for the requirement of a barrier between the txq writes
and the doorbell record writes is to avoid a case where the device
reads the doorbell record's new value before the txq writes are flushed
to memory.
The current use of rte_wmb is not necessary, and can be replaced by
rte_io_wmb which is more relaxed.
Replacing the rte_wmb is also expected to improve the throughput.
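A minimal sketch of the ordering requirement described above, using C11
atomics as a stand-in for the DPDK barrier macros. The queue layout and all
demo_* names are hypothetical and are not the mlx5 structures; the release
fence only models the rule that descriptor stores must be visible before the
doorbell record store.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_WQE_N 8

/* Hypothetical, simplified Tx queue (not the mlx5 layout). */
struct demo_txq {
    uint64_t wqes[DEMO_WQE_N]; /* descriptors written by the CPU */
    _Atomic uint32_t db_rec;   /* doorbell record read by the device */
    uint32_t wqe_ci;           /* producer index */
};

static void demo_txq_init(struct demo_txq *q)
{
    for (int i = 0; i < DEMO_WQE_N; i++)
        q->wqes[i] = 0;
    q->wqe_ci = 0;
    atomic_init(&q->db_rec, 0);
}

static void demo_post(struct demo_txq *q, uint64_t desc)
{
    /* 1. Write the descriptor into the queue. */
    q->wqes[q->wqe_ci % DEMO_WQE_N] = desc;
    q->wqe_ci++;
    /* 2. Order the descriptor store before the doorbell record store.
     *    This fence is an assumption standing in for the write barrier. */
    atomic_thread_fence(memory_order_release);
    /* 3. Publish the new producer index in the doorbell record. */
    atomic_store_explicit(&q->db_rec, q->wqe_ci, memory_order_relaxed);
}
```

Without step 2, a reader of db_rec could observe the new index while the
descriptor it points to is still stale.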
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Alexander Solganik <solganik@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In function qede_rss_reta_update(), the pointer params returned from
the call to rte_zmalloc() may be NULL and would then be dereferenced.
So, check whether params is NULL before using it.
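A sketch of the check, assuming a simplified stand-in for the RSS parameters.
The demo_* names are hypothetical, calloc() replaces rte_zmalloc(), and the
allocator is passed in only so that the failure path can be exercised.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_RETA_SIZE 128

/* Hypothetical parameter block, not the qede structure. */
struct demo_rss_params {
    uint16_t ind_table[DEMO_RETA_SIZE];
};

static int demo_reta_update(void *(*alloc_fn)(size_t, size_t),
                            const uint16_t *reta, size_t n)
{
    struct demo_rss_params *params = alloc_fn(1, sizeof(*params));

    /* The allocation may fail: bail out before any dereference. */
    if (params == NULL)
        return -ENOMEM;
    for (size_t i = 0; i < n && i < DEMO_RETA_SIZE; i++)
        params->ind_table[i] = reta[i];
    free(params);
    return 0;
}

/* Allocator that always fails, to simulate memory exhaustion. */
static void *demo_failing_alloc(size_t nmemb, size_t size)
{
    (void)nmemb;
    (void)size;
    return NULL;
}
```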
Fixes: 8b3ee85efe ("net/qede: fix RSS table entries for 100G adapter")
Cc: stable@dpdk.org
Signed-off-by: RongQiang Xie <xie.rongqiang@zte.com.cn>
Acked-by: Harish Patil <harish.patil@cavium.com>
The sub_device iterator macro should follow the general gist of the
tailq API for an easier understanding and safer use.
Once the loop has finished, the iterator should be set to NULL.
If no sub_device was iterated upon, the iterator should still be NULL.
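One way to get these semantics is sketched below. DEMO_FOREACH and
struct demo_sub are hypothetical and are not the fail-safe macro; the point
is only that the iterator is NULL once the loop ends, including when there
is nothing to iterate on.

```c
#include <stddef.h>

struct demo_sub {
    int id;
};

/* Iterate over arr[0..n-1]; `it` is NULL after the loop or when n == 0. */
#define DEMO_FOREACH(it, i, arr, n)                       \
    for ((i) = 0, (it) = ((n) > 0 ? &(arr)[0] : NULL);    \
         (it) != NULL;                                    \
         (it) = (++(i) < (n) ? &(arr)[(i)] : NULL))

/* Sums the ids and reports the final iterator value through last_it. */
static int demo_sum_ids(struct demo_sub *arr, size_t n,
                        struct demo_sub **last_it)
{
    struct demo_sub *it;
    size_t i;
    int sum = 0;

    DEMO_FOREACH(it, i, arr, n)
        sum += it->id;
    *last_it = it;
    return sum;
}
```

A caller can then test the iterator against NULL after the loop, as with
the tailq macros, instead of comparing an index with the element count.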
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
In the enic_alloc_consistent() function, if rte_malloc() for mze fails
(!mze is true), the memzone should be freed and the function should return
NULL.
Fixes: da5f560be9 ("net/enic: fix memory freeing")
Cc: stable@dpdk.org
Signed-off-by: RongQiang Xie <xie.rongqiang@zte.com.cn>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Extend debug logs verbosity by printing the full completion with error
along with the entire txq in case of error. For the Rx case no logs were
added since such errors are counted and recovered by the Rx data path.
Such prints are essential to understand the root cause for the error.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Currently, rte_eth_dev_mac_addr_add is used by a testpmd CLI
to add a MAC address for VF. But the parameter 'pool' of this
API means the VMDq pool, not VF.
So, it's wrong to use it to add the VF MAC address.
This patch provides a new API that can be used to
add VF MAC address on i40e.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The faulty code did not unlock the spinlock in the error flow of the
xstats get and reset functions.
Hence, if these errors happened, the device spinlock was
left locked and many mlx5 device functionalities were blocked.
The fix unlocks the spinlock in the missed places.
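The pattern can be sketched as follows, with a C11 atomic_flag standing in
for the device spinlock. The demo_* names and the counters_ok field are
hypothetical; they only simulate a counter read that may fail.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device, not the mlx5 private structure. */
struct demo_dev {
    atomic_flag lock;
    bool counters_ok;    /* simulates whether reading counters succeeds */
    unsigned long stats;
};

static void demo_lock(struct demo_dev *d)
{
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ; /* spin */
}

static void demo_unlock(struct demo_dev *d)
{
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

static int demo_xstats_get(struct demo_dev *d, unsigned long *out)
{
    int ret = 0;

    demo_lock(d);
    if (!d->counters_ok) {
        ret = -EIO;
        goto unlock; /* the error flow must release the lock too */
    }
    *out = d->stats;
unlock:
    demo_unlock(d);
    return ret;
}

/* Returns true if the lock was free, and leaves it free. */
static bool demo_lock_is_free(struct demo_dev *d)
{
    bool was_set = atomic_flag_test_and_set(&d->lock);

    if (!was_set)
        atomic_flag_clear(&d->lock);
    return !was_set;
}
```

Routing every exit through a single unlock label makes it harder to add a
new error return that leaves the lock held.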
Fixes: e62bc9e706 ("net/mlx5: fix extended statistics")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In the function ixgbe_flow_create(), ntuple_filter_ptr,
ethertype_filter_ptr, syn_filter_ptr, fdir_rule_ptr and l2_tn_filter_ptr
are allocated with rte_zmalloc().
The allocation may return NULL, so the return value must be checked
before it is used.
Signed-off-by: RongQiang Xie <xie.rongqiang@zte.com.cn>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This patch fixes the mapping of user priority to traffic class
in the Rx/Tx path of the DCB configuration. Each DCB traffic class
should include all user priorities mapped to it in both the Rx and
Tx paths.
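As an illustration of "each traffic class includes all user priorities
mapped to it", the sketch below builds one bitmask per traffic class from a
priority-to-class table. The names and the 8x8 sizing are assumptions for
the example and do not reflect the ixgbe register layout.

```c
#include <stdint.h>

#define DEMO_NUM_UP 8 /* user priorities 0..7 */
#define DEMO_MAX_TC 8 /* traffic classes 0..7 */

/*
 * up_to_tc[up] is the traffic class of user priority `up`.
 * On return, bit `up` of tc_mask[tc] is set for every priority mapped to tc.
 */
static void demo_build_tc_masks(const uint8_t up_to_tc[DEMO_NUM_UP],
                                uint8_t tc_mask[DEMO_MAX_TC])
{
    for (int tc = 0; tc < DEMO_MAX_TC; tc++)
        tc_mask[tc] = 0;
    for (int up = 0; up < DEMO_NUM_UP; up++) {
        uint8_t tc = up_to_tc[up];

        /* Accumulate with |= so that no earlier priority is lost. */
        if (tc < DEMO_MAX_TC)
            tc_mask[tc] |= (uint8_t)(1u << up);
    }
}
```

Assigning with = instead of |= would keep only the last priority of each
class, which is the kind of incomplete mapping the message describes.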
Fixes: 0807f80d35 ("ixgbe: DCB / flow control")
Cc: stable@dpdk.org
Signed-off-by: Wei Dai <wei.dai@intel.com>
This version of MLNX_OFED is no longer supported.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Since MLNX_OFED 4.1 this code is no longer useful.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
The secondary process support is a copy/paste from the mlx4 driver; it was
never tested and it even segfaults at secondary process start in
mlx5_pci_probe().
It makes more sense to wipe this non-working feature and re-write a
working and functional version.
Fixes: a48deada65 ("mlx5: allow operation in secondary processes")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Those are useless since DPDK headers have been cleaned up.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Those two if statements are useless as there is a verification on the drop
field of the flow to jump to the end of the function just above.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Vector PMD returns buffers to the application without setting the pointers
in the Rx queue to null or allocating them. When the PMD cleans up the
ring, it needs to take special care of those pointers so that it frees the
mbufs neither before the application has used them nor after the
application has already freed them.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
To use the vector code, four extra mbufs must be added to the PMD Rx mbuf
ring to avoid memory corruption. These additional mbufs are added in
dev_start() whereas all other mbufs are allocated on queue setup.
This patch brings this allocation back to the same place as the other mbuf
allocations.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
This patch prepares the merge of the fake mbuf allocation needed by the
vector code with rxq_alloc_elts(), where all mbufs of the queues should be
allocated.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
We need to support both how firmware metadata was handled until now and
the new API available since NFP NFD 3.0 firmware versions. The new metadata
API adds flexibility for working with different metadata types and, mainly,
allows adding metadata from different firmware components independently.
Although this patch supports just one type handled by the PMD, future uses
regarding firmware apps will extend this support.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
A DPDK app could, whatever the reason, send packets with size 0.
The PMD is not sending those packets, which does make sense,
but the problem is the mbuf is not released either. That leads
to mbufs not being available, because the app trusts the
PMD will do it.
Although this is a problem related to wrong app behavior, we
should harden the PMD in this regard. Not sending a packet with
size 0 could be problematic, needing special handling inside the
PMD xmit function. It could be a burst of those packets, which can
be easily handled, but it could also be a single packet in a burst,
which is harder to handle.
It would be simpler to just send such packets, which will
likely be dropped by the hw at some point. The main problem is how
the fw/hw handles the DMA, because a dma read to a hypothetical 0x0
address could trigger an IOMMU error. It turns out, it is safe to
send a descriptor with packet size 0 to the hardware: the DMA never
happens, from the PCIe point of view.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
This patch is to align with PF kernel driver version 5.1.3 to add the
number of queues to transmit VLAN packets in msg of queue info to VF.
If DCB is enabled, it is the number of DCB traffic classes.
If DCB is not enabled and default VLAN is enabled, it is 1.
For other cases, it is 0.
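The three cases above can be written as a small helper. The function and
parameter names are hypothetical and only restate the rule from this
message; they are not the ixgbe mailbox code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Number of queues used to transmit VLAN packets, as reported to the VF. */
static uint8_t demo_vlan_tx_queues(bool dcb_enabled, uint8_t num_tcs,
                                   bool default_vlan_enabled)
{
    if (dcb_enabled)
        return num_tcs;  /* one per DCB traffic class */
    if (default_vlan_enabled)
        return 1;
    return 0;
}
```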
Signed-off-by: Wei Dai <wei.dai@intel.com>
igb_uio and vfio-pci perform a PCI reset during open and release of the
device. So an FLR request to the LiquidIO PF driver during init and close
in the PMD is not required.
See commit b58eedfc7d ("igb_uio: issue FLR during open and release of
device file")
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
The faulty code could not recognize that all sub devices
were not ready for Tx traffic when the failsafe PMD was trying
to switch device, because it relied on an unreachable condition.
Hence, the current Tx sub device variable was not updated
correctly.
The fix removes the unreachable branch and adds a new one
in the right place, respecting the original intent.
Fixes: ebea83f899 ("net/failsafe: add plug-in support")
Fixes: 598fb8aec6 ("net/failsafe: support device removal")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
intr_vec was not properly configured. This is not a problem when
just one queue is supported but it fails with multiqueue.
Some minor refactoring is also done for hardware interrupt configuration.
Fixes: ea121b2831 ("net/nfp: add Rx interrupts")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Redirection table was not being updated properly.
There is also a problem when configuring RSS.
Fixes: 934e4c60fb ("nfp: add RSS")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
If no valid MAC address is present in the configuration BAR, the PMD
creates a random one. It needs to be passed to the NIC.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
The flow control watermark is not read out correctly, which may cause
an application that does not intend to change the watermark to
change it anyway with a rte_eth_dev_flow_ctrl_set
call right after rte_eth_dev_flow_ctrl_get.
The ideal fix would be to set the watermark to its default value during
init, so that reading the hw register during flow_ctl_get is unnecessary.
But I40E_GLRPB_GHW is shared by different ports on the same device, so
the value may be changed through another port while the local variable
is not synced. Hence we have to read the register on every
flow_ctl_get.
Fixes: f53577f069 ("i40e: support flow control")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
i40e_vsi_delete_mac is called without checking its return
value (the value is checked in 5 out of 6 calls elsewhere).
Coverity issue: 140735
Fixes: 43c89d5a4f ("net/i40e: set VF MAC from PF")
Signed-off-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add a mode type check for MAC VLAN mode: if fdir is
in this mode, the sanity check for x550 is not needed.
Fixes: dc0c16105d ("ixgbe: fix X550 flow director check")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Reset a NIC by calling dev_uninit() and then dev_init().
This follows the same path as NIC PCI remove, without releasing
the ethdev resources, and then the NIC PCI probe function, without
allocating the ethdev resources.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Reset a NIC by calling dev_uninit and then dev_init.
This follows the same path as NIC PCI remove, without releasing
the ethdev resources, and then the NIC PCI probe function, without
allocating the ethdev resources.
Signed-off-by: Wei Dai <wei.dai@intel.com>
The filenames of the linker map files for DPDK PMDs all follow a
standard format: rte_pmd_<libname>_version.map. The ring version, however,
had eth instead of pmd in the name, so it was non-standard. By changing
this, we no longer need to give the build system the name of the mapfile
explicitly, as it can determine it from the directory name.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
This reverts commit 035a8cf88f.
Not sending messages to an inactive VF causes the DPDK PF to fail
to send messages to a kernel VF.
This revert solves the issue.
Fixes: 035a8cf88f ("net/i40e: fix PF notify when VF is not up")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
If creating a flow fails and it is the first flow, the
mask_added flag should be reset. Otherwise it prevents a new flow
which requires a different mask from being created, since the mask
configuration remains in effect.
Fixes: 72c135a89f ("net/ixgbe: create consistent filter")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Vector code is very young and can present some issues for users. To spare
them from modifying the selection functions by commenting out code and
recompiling the PMD, new device parameters are added to deactivate the Tx
and/or Rx vector code.
By using such device parameters, the user will be able to fall back to
regular burst functions.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
If there's a Rx completion with error (e.g., MTU mismatch), it is handled
later, outside the main burst loop, as a slow path for performance reasons.
Statistics should be corrected by subtracting counters of errored packets.
Also, the last entry of mlx5_ptype_table[] must be RTE_PTYPE_ALL_MASK to
mark error in completion.
Fixes: ea16068c00 ("net/mlx5: fix L4 packet type support")
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The pinfo variable has wrong data. It must hold the merged data of two
fields from the Rx completion - pkt_info and hdr_type_etc.
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The data_off field of newly allocated mbufs is stale data. This shouldn't
be used in calculating Rx address for device when posting free buffers.
RTE_PKTMBUF_HEADROOM should be used instead and data_off of a mbuf will be
reset on packet reception anyway.
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Unlike mlx5_rx_burst(), mlx5_rx_burst_vec() doesn't replace completed
buffers one by one right after completion is processed but replenishes
multiple buffers later with rte_mempool_get_bulk(). Therefore, there could
be some buffer addresses left in the SW ring (rxq->elts[]) which have
already been delivered to the application. As the PMD doesn't own such
buffers, they must not be freed by the PMD. "Trimming" is needed before
cleanup.
A problem can be seen when quitting testpmd with
CONFIG_RTE_LIBRTE_MBUF_DEBUG=y and CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y.
Trimming should be as simple as possible: it shouldn't touch any indexes
and buffer allocation isn't necessary.
Fixes: 6cb559d67b ("net/mlx5: add vectorized Rx/Tx burst for x86")
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Changing the MTU is not related to changing the number of segments.
Activating or deactivating multi-segment support should be handled by the
application.
Fixes: 9964b965ad ("net/mlx5: re-add Rx scatter support")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The current mlx4 OFED version has a bug which returns an error from the
ibv destroy functions when the device was plugged out, even though
the resources were destroyed correctly.
Hence, the failsafe PMD was aborted, only in debug mode, when
it tried to remove the device in the plug-out process.
The workaround adds an option to replace all claim_zero
assertions with debugging messages. Note that this option also
affects non ibv destroy assertions.
The DPDK 18.02 release should work with Mellanox OFED-4.2, which will
include the verbs fix for this bug; this patch can then
be removed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Occasionally, the number of packets to free from the work queue ends
exactly on a boundary, leaving nb_free = 0 and pool = 0. This causes
a segfault as follows:
(gdb) bt
#0 rte_mempool_default_cache
#1 rte_mempool_put_bulk (n=0, obj_table=0x7f10deff2530, mp=0x0)
#2 enic_free_wq_bufs (wq=wq@entry=0x7efabffcd5b0,
completed_index=completed_index@entry=33)
#3 0x00007f11e9c86e17 in enic_cleanup_wq (enic=<optimized out>,
wq=wq@entry=0x7efabffcd5b0)
at /usr/src/debug/openvswitch-2.6.1/dpdk-16.11/drivers/net/enic/enic_rxtx.c:442
#4 0x00007f11e9c86e5f in enic_xmit_pkts (tx_queue=0x7efabffcd5b0,
tx_pkts=0x7f10deffb1a8, nb_pkts=<optimized out>)
at /usr/src/debug/openvswitch-2.6.1/dpdk-16.11/drivers/net/enic/enic_rxtx.c:470
#5 0x00007f11e9e147ad in rte_eth_tx_burst (nb_pkts=<optimized out>,
tx_pkts=0x7f10deffb1a8, queue_id=0, port_id=<optimized out>)
This commit makes the enic wq driver match other drivers that call the
bulk free, by checking that there are actual packets to free.
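A reduced sketch of the guard follows. The pool and buffer types and all
demo_* names are hypothetical stand-ins for the mempool and mbuf; a NULL
entry simulates a buffer that cannot be returned yet.

```c
#include <stddef.h>

#define DEMO_BATCH 32

/* Hypothetical pool: only counts how many objects came back. */
struct demo_pool {
    size_t returned;
};

struct demo_buf {
    struct demo_pool *pool;
};

/* Like a bulk put, this dereferences mp and would crash on a NULL pool. */
static void demo_put_bulk(struct demo_pool *mp, void **objs, size_t n)
{
    (void)objs;
    mp->returned += n;
}

static void demo_free_bufs(struct demo_buf **bufs, size_t count)
{
    void *batch[DEMO_BATCH];
    struct demo_pool *pool = NULL;
    size_t nb_free = 0;

    for (size_t i = 0; i < count; i++) {
        struct demo_buf *b = bufs[i];

        if (b == NULL)
            continue;
        /* Flush when the pool changes or the batch is full. */
        if (nb_free > 0 && (b->pool != pool || nb_free == DEMO_BATCH)) {
            demo_put_bulk(pool, batch, nb_free);
            nb_free = 0;
        }
        pool = b->pool;
        batch[nb_free++] = b;
    }
    /* Only call the bulk put when something is left to free. */
    if (nb_free > 0)
        demo_put_bulk(pool, batch, nb_free);
}
```

With the final check removed, a call where nothing remains in the batch
would reach demo_put_bulk() with pool still NULL, which mirrors the
backtrace above.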
Fixes: 36935afbc5 ("net/enic: refactor Tx mbuf recycling")
CC: stable@dpdk.org
Reported-by: Vincent S. Cojot <vcojot@redhat.com>
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1468631
Signed-off-by: Aaron Conole <aconole@redhat.com>
Reviewed-by: John Daley <johndale@cisco.com>
The buffer length configured for each data segment should not exceed
the requested value, or the device may write data beyond the boundary
of the memory that was reserved.
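One way to satisfy this constraint, when the hardware takes the buffer
length in fixed-size units, is to round down rather than up. The sketch
below assumes a 128-byte granularity (DEMO_DBUFF_SHIFT) and hypothetical
names; it is not taken from the i40e code.

```c
#include <stdint.h>

#define DEMO_DBUFF_SHIFT 7 /* assumed granularity: 1 << 7 = 128 bytes */

/* Largest programmable length that fits in the usable part of the buffer. */
static uint16_t demo_rx_buf_len(uint16_t data_room, uint16_t headroom)
{
    uint16_t avail;

    if (data_room < headroom)
        return 0;
    avail = (uint16_t)(data_room - headroom);
    /* Round down to the granularity so the result never exceeds avail. */
    return (uint16_t)(avail & ~((1u << DEMO_DBUFF_SHIFT) - 1));
}
```

For example, with 1872 usable bytes the helper returns 1792, whereas
rounding up would give 1920 and let the device write 48 bytes past the end
of the buffer.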
Fixes: 4861cde461 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Reviewed-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Tested-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
When the output of an exec() slave definition is only a single newline
character, the fail-safe currently fails to parse the device with the
value returned by the rte_devargs library.
This behavior is incorrect, because the fail-safe should differentiate
between the absence of a device and an erroneous device
declaration.
Fix the output sanitization in the case where no newline was at its end
and detect the special case of an absent device. The correct error code
is then returned.
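A sketch of such sanitization is given below. The function name is
hypothetical, and -ENODEV is an assumption chosen for the example, since
this message only refers to "the correct error code".

```c
#include <errno.h>
#include <string.h>

/*
 * Strip the end of line from a command output, in place.
 * Returns 0 if a device string remains, -ENODEV if the output was empty
 * or only a newline (absent device).
 */
static int demo_sanitize_output(char *out)
{
    /* Length up to the first newline, or the whole string if none. */
    size_t len = strcspn(out, "\n");

    out[len] = '\0';
    if (len == 0)
        return -ENODEV;
    return 0;
}
```

Using strcspn() avoids assuming that the last character is a newline, so
an output without a trailing newline is left intact.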
Fixes: a0194d8281 ("net/failsafe: add flexible device definition")
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
When there is no preferred device, failsafe will always
try to scan for preferred device. And if there is no device
found with the exec option, popen() will get an empty output.
In this case, it was forgotten to close the file descriptor.
It is fixed by closing the file descriptor even if the output is empty.
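The fixed flow can be sketched with plain POSIX popen()/pclose(). The
helper name and its return convention are hypothetical, and the buffer is
assumed to have a non-zero size.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/*
 * Run cmd and copy its first output line into buf (size must be > 0).
 * Returns the line length, 0 if the command printed nothing,
 * or -1 if the command could not be started or closed.
 */
static int demo_exec_first_line(const char *cmd, char *buf, size_t size)
{
    FILE *fp = popen(cmd, "r");
    int ret = 0;

    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        /* Empty output: nothing to parse, but fp is still open. */
        buf[0] = '\0';
    } else {
        buf[strcspn(buf, "\n")] = '\0';
        ret = (int)strlen(buf);
    }
    /* Single exit path: the stream is closed for empty output too. */
    if (pclose(fp) == -1)
        return -1;
    return ret;
}
```

Keeping one pclose() after both branches avoids the early return that
leaked the descriptor in the empty-output case.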
Coverity issue: 158633
Fixes: a0194d8281 ("net/failsafe: add flexible device definition")
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
We should only restore shadow_vfta when hw_vlan_filter is active.
Otherwise, we should restore the previous filtering behavior.
Fixes: f003fc3834 ("vmxnet3: enable vlan filtering")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Shrikrishna Khare <skhare@vmware.com>