EvQ info array is a problem on device reconfigure when number of
Rx and Tx queues may change. It is a step to get rid of it.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The index of an event queue index is computed from the index of the
corresponding transmit or receive queue, and depends on the total
number of receive queues. As a consequence, the index of an event
queue bound to a transmit queue changes if the total number of
receive queues is changed.
Fixes: 58294ee65a ("net/sfc: support event queue")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Make it clear that corresponding functions should be called on device
configure and close operations. No functional change.
Fixes: 06bc197796 ("net/sfc: interrupts support sufficient for event queue init")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If Tx datapath does not support TSO, TSO was dropped on device configure.
It is incorrect to change advertised offloads.
Fixes: 7a4d44a639 ("net/sfc: make TSO a datapath-dependent feature")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Datapath choice requires NIC capabilities knowledge and, therefore,
should be done after probe. Whereas NIC resources estimation needs
to know chosen datapath (e.g. if Tx datapath is going to use TSO).
Fixes: df1bfde4ff ("net/sfc: factor out libefx-based Rx datapath")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Put to mempool does not free chained segments.
Fixes: e0b063941e ("net/sfc: support scattered Rx DMA")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
MPLSoUDP & MPLSoGRE is not supported by tunnel
filter due to limited resource of HW, this patch
enables MPLS tunnel filter by replacing inner_mac
filter.
This configuration will be set when adding MPLSoUDP
and MPLSoGRE filter rules, and it will be invalid
only by NIC core reset.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
This patch add MPLS parsing function to support MPLS filtering.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
First segment size must be at least 18 bytes, packets not respecting this
are silently not sent by the NIC but counted as sent by the PMD. The only
way to figure out is compiling the PMD in debug mode.
Fixes: 6579c27c11 ("net/mlx5: remove gather loop on segments")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
The driver can handle dynamic MTU change without needing the port to be
stopped explicitly by the application. However, there is currently no
check to prevent I/Os from happening on a different thread while the
port is going thru' reset internally. This patch fixes this issue by
assigning RX/TX burst functions to a dummy function and also reconfigure
RX bufsize for each rx queue based on the new MTU value.
Fixes: 200645ac79 ("net/qede: set MTU")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
The newer SR-IOV PF drivers expects RX/TX queues to be created before
applying RSS configuration. This patch addresses this requirement by
deferring RSS configuration till the queues are created. Even though
this issue is only seen in SR-IOV context, the changes will be made
applicable to PF also to keep the behavior consistent between VF/PF.
Fixes: 7ab35bf6b9 ("net/qede: fix RSS")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Both UDP and TCP based RSS offload types are supported by the device.
This patch adds UDP protocol which got missed out in the original patch.
Fixes: 4c98f2768e ("net/qede: support RSS hash configuration")
Cc: stable@dpdk.org
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add i40e_tunnel_type enumeration type to refine consistent
tunnel filter, it will be esay to add new tunnel type for
i40e.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Previously, only tunnel filter to PF is supported.
This patch adds i40e_dev_consistent_tunnel_filter_set
function for consistent filter API to support tunnel
filter to VF.
Signed-off-by: Yong Liu <yong.liu@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Change tunnel filter function name to VXLAN filter
function, prepare for other tunnel filter function.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rework tunnel filter functions to align with the
new command buffer for add/remove cloud filter.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
vhost removes limit of Rx burst size(32 pkts) and supports to make
an best effort to receive pkts.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
vhost removes limit of Tx burst size(32 pkts) and supports to make
an best effort to transmit pkts.
Cc: yuanhan.liu@linux.intel.com
Cc: maxime.coquelin@redhat.com
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size and
implement the "make an best effort to transmit the pkts" policy.
The patch makes ixgbe vec function work in a consistent behavior
like ixgbe_xmit_pkts_simple and ixgbe_xmit_pkts do that.
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size. The patch
makes i40e vec function an best effort to transmit the pkts in the
consistent behavior like i40e_xmit_pkts_simple and i40e_xmit_pkts do that.
Cc: Helin Zhang <helin.zhang@intel.com>
Cc: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
To add a wrapper function to remove the limit of tx burst size.
The patch makes fm10k vec function an best effort to transmit
pkts in the consistent behavior like fm10k_xmit_pkts does that.
Cc: Jing Chen <jing.d.chen@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch includes slowpath configuration and fastpath changes
to support LRO and TSO. A bit of revamping is needed in order
to make use of existing packet classification schemes in Rx fastpath
and for SG element processing in Tx.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add limited support for ntuple filter and flow director configuration.
The filtering is based on 4-tuples viz src-ip, dst-ip, src-port,
dst-port. The mask fields, tcp_flags, flex masks, priority fields,
Rx queue drop etc are not supported.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Add base driver APIs to enable accelerated RFS[aRFS] mode and ramrod
to configure rfs and ntuple filter.
Signed-off-by: Harish Patil <harish.patil@qlogic.com>
Merge hw_stop and hw_reset into one function.
Prevent race condition between MFW attentions and pf stop command during
unload flow that causes an ASSERT.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
A step toward having multi-Txq support on same queue-zone for VFs.
This change takes care of:
- VFs assume a single CID per-queue, where queue X receives CID X.
Switch to a model similar to that of PF - I.e., Use different CIDs
for Rx/Tx, and use mapping to acquire/release those. Each VF
currently will have 32 CIDs available for it [for its possible 16
Rx & 16 Tx queues].
- To retain the same interface for PFs/VFs when initializing queues,
the base driver would have to retain a unique number per-each queue
that would be communicated in some extended TLV [current TLV
interface allows the PF to send only the queue-id]. The new TLV isn't
part of the current change but base driver would now start adding
such unique keys internally to queue_cids. This would also force
us to start having alloc/setup/free for L2 [we've refrained from
doing so until now]
The limit would be no-more than 64 queues per qzone [This could be
changed if needed, but hopefully no one needs so many queues]
- In IOV, Add infrastructure for up to 64 qids per-qzone, although
at the moment hard-code '0' for Rx and '1' for Tx [Since VF still
isn't communicating via new TLV which index to associate with a
given queue in its queue-zone].
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Add support for the new interface with the Management FW for setting
max values of "soft" resources.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Add new image types - RECOVERY and PK (Public Key) towards
the second phase of NVRAM security support.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Optimize cache-line access in ecore_chain -
re-arrange fields so that fields that are needed for fastpath
[mostly produce/consume and their derivatives] are in the first cache
line, and the rest are in the second.
This is true for both PBL and NEXT_PTR kind of chains.
Advancing a page in a SINGLE_PAGE chain would still require the 2nd
cacheline as well, but afaik only SPQ uses it and so it isn't
considered as 'fastpath'.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
L2 handler changes:
This is change to remove the queue-id/qzone difference for Tx queues.
It does that by mainly doing:
a. VFs queues are no longer determined by the SBs they're using.
Instead, the ecore-client needs to maintain those and choose the values
to be used by VF when initializing it.
b. Eliminate the HW-cid array in the hw-function.
To do that, have all the rx/tx functionality turn into 'handle' base -
when queue would be started the caller would get a (void*) handle,
which it would later use with ecore for configuring various
queue-related stop [update, close].
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Allow only trusted VFs to be promisc/multi-promisc. The reasonable
thing is to use the 'trusted' node instead of simply allowing VFs to
become promiscuous.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Remove attribute field from update_current_config() API, Management FW
need to know only the last entity who configured the device.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
This change is in preparation to work with new FW 8.18.9.0.
Rename the defines to use E4_ and structs to use e4_. This renaming
is to add support for future chipsets.
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
The device specific data is located at dev->data->dev_private.
Fixes: 162f32290a ("fm10k: move parameters initialization")
Fixes: 039991bc28 ("fm10k: add vector pre-condition check")
Fixes: 77a8ab47eb ("fm10k: select best Rx function")
Cc: stable@dpdk.org
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This commit introduces changes required to support VM live-migration. This
is done by registering and responding to interrupts coming from the host to
signal that the memory is about to be made invalid and replaced with a new
memory zone on the destination compute node.
Enabling and disabling of the interrupts are maintained outside of the
start/stop functions because they must be enabled for the lifetime of the
device. This is so that host interrupts are serviced and acked even in
cases where the app may have stopped the device.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds support for device start and stop functions. This allows an
application to control the administrative state of an AVP device. Stopping
the device will notify the host application to stop sending packets on that
device's receive queues.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds support for setting and clearing promiscuous mode on an AVP device.
When enabled the _mac_filter function will allow packets destined to any
MAC address to be processed by the receive functions.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds support for packet transmit functions so that an application can send
packets to the host application via an AVP device queue. Both the simple
and scattered functions are supported.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds function required for receiving packets from the host application via
AVP device queues. Both the simple and scattered functions are supported.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds queue management operations so that an application can setup and
release the transmit and receive queues.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds support for "dev_configure" operations to allow an application to
configure the device.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds support for initialization newly probed AVP PCI devices. Initial
queue translations are setup in preparation for device configuration.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds the initial framework for registering the driver against the support
PCI device identifiers.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds a header file with log macros for the AVP PMD
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Adds public/exported header files for the AVP PMD. The AVP device is a
shared memory based device. The structures and constants that define the
method of operation of the device must be visible by both the PMD and the
host DPDK application. They must not change without proper version
controls and updates to both the hypervisor DPDK application and the PMD.
The hypervisor DPDK application is a Wind River Systems proprietary
virtual switch.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
This commit introduces the AVP PMD file structure without adding any actual
driver functionality. Functional blocks will be added in later patches.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Signed-off-by: Matt Peters <matt.peters@windriver.com>
Acked-by: Vincent Jardin <vincent.jardin@6wind.com>
Let error messages in place, but return unambiguous values upon
probing errors.
Fixes: 66e1591687 ("mlx4: avoid init errors when kernel modules are not loaded")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Toggle Rx scatter mode based on the scatter_enable flag and the maximum
packet size only instead of deriving this information from the jumbo_frame
setting and the MTU configuration.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Most ConnectX-3 adapters expose two physical ports on a single PCI bus
address.
Add a new port parameter allowing the user to choose
either or both physical ports to be used by the application.
This parameter is used as follows:
Selecting only the second port:
-w 00:00.0,port=1
Selecting both ports:
-w 00:00.0,port=0,port=1
If no parameter is given, the default behavior is unchanged: all ports
are probed.
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Set some TCs to strict priority mode.
It's a global setting on a physical port.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Set a specific TC's max bandwidth on a VF.
User can call the API to set VF TC's max bandwidth
from PF.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Allocate all TCs' relative bandwidth (percentage) on
a specific VF.
It can be taken as relative min bandwidth setting.
User can call the API to set VF TC's min bandwidth
from PF.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Support setting VF max bandwidth from PF.
User can call the API on PF to set a specific VF's
max bandwidth.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Add APIs to setup and free Scatter-Gather list. SG list is used while
sending packets with multiple segments.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Add API to send control and data packets to device. Request list keeps
track of host buffers to be freed till it reaches device.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Add APIs to setup and process response list. Response list holds soft
commands waiting for response from device. Entries of this list are
processed to check for command response or timeout.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Get buffers from SC buffer pool and create soft command. Buffers are
freed to the pool once the command reaches device.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Soft command (SC) holds device control command and related information.
SC buffer pool holds buffers which are used during soft command
allocation.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Instruction queue (IQ) is used to send control and data packets to
device from host. IQ 0 is used to send device configuration commands
during initialization and later re-allocated as per application
requirement.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
VF sends Function Level Reset request to PF using mbox and PF does the
reset.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
Add debug options to config file. Define macros used for log and make
use of config file options to enable them.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Derek Chickles <derek.chickles@caviumnetworks.com>
Signed-off-by: Venkat Koppula <venkat.koppula@caviumnetworks.com>
Signed-off-by: Srisivasubramanian S <ssrinivasan@caviumnetworks.com>
Signed-off-by: Mallesham Jatharakonda <mjatharakonda@oneconvergence.com>
4 and 8 TCs are supported on ixgbe. By default there're
8 TCs. So when initializing the device, the bandwidth for
8 TCs is set.
When changing the TC number, it's only considered setting
the bandwidth for 4 TCs. If the user change the number
from 4 to 8, the TCs' bandwidth is not right.
Fixes: 0807f80d35 ("ixgbe: DCB / flow control")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
As tap is a virtual device, there's no physical way a link can be cut.
However, it has an associated kernel netdevice and possibly a remote
netdevice too. These netdevices link status may change outside of the
DPDK scope, through an external command such as:
ip link set dev tapX down
This commit implements link status notification through netlink.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Reflect device link status according to the state of the tap netdevice
and the remote netdevice (if any). If both are UP and RUNNING, then the
device link status is set to ETH_LINK_UP, otherwise ETH_LINK_DOWN.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
By default, a tap netdevice is of no use when not fed by a separate
process. The ability to automatically feed it from another netdevice
allows applications to capture any kind of traffic normally destined to
the kernel stack.
This patch implements this ability through a new optional "remote"
parameter.
Packets matching filtering rules created with the flow API are matched
on the remote device and redirected to the tap PMD, where the relevant
action will be performed.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Support for segmented packets (scatter/gather) is mandatory for most
purposes, regardless of the MTU size. Tx packets are often the result of
mbuf concatenation, and an mbuf is not necessarily large enough for Rx
packets to fit in a single one.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
- bgx_link_status mbox definition was changed in Linux
commit 1cc702591bae ("net: thunderx: Add ethtool support")
- NIC_MBOX_MSG_RES_BIT related changes were never part of Linux PF driver
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Supported flow rules are now mapped to TC rules on the tap netdevice.
The netlink message used for creating the TC rule is stored in struct
rte_flow. That way, by simply changing a metadata in it, we can require
for the rule deletion without further parsing.
Supported items:
- eth: src and dst (with variable masks), and eth_type (0xffff mask).
- vlan: vid, pcp, tpid, but not eid.
- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
- udp/tcp: src and dst port (0xffff) mask.
Supported actions:
- DROP
- QUEUE
- PASSTHRU
It is generally not possible to provide a "last" item. However, if the
"last" item, once masked, is identical to the masked spec, then it is
supported.
Only IPv4/6 and MAC addresses can use a variable mask. All other
items need a full mask (exact match).
Support for VLAN requires kernel headers >= 4.9, checked using
auto-config.sh.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Each kernel netdevice may have queueing disciplines set for it, which
determine how to handle the packet (mostly on egress). That's part of
the TC (Traffic Control) mechanism.
Through TC, it is possible to set filter rules that match specific
packets, and act according to what is in the rule. This is a perfect
candidate to implement the flow API for the tap PMD, as it has an
associated kernel netdevice automatically.
Each flow API rule will be translated into its TC counterpart.
To leverage TC, it is necessary to communicate with the kernel using
netlink. This patch introduces a library to help that communication.
Inside netlink.c, functions are generic for any netlink messaging.
Inside tcmsgs.c, functions are specific to deal with TC rules.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
The flow API provides the ability to classify packets received by a tap
netdevice.
This patch only implements skeleton functions for flow API support, no
patterns are supported yet.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
In the next patch, access to struct pmd_internals will be necessary in
tap_flow.c to store the flows.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
When configuring Rx/Tx queue, if queue already exists, it is reused. But if
the queue size is changed, it must be resized to not access/overwrite
invalid memory.
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
VLAN filter is not working on i40e because driver need to
disable the VLAN promiscuous mode and set the VLAN filter
table.
Fixes: 5f2b0e3f76 ("net/i40e: set VF VLAN filter from PF")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add new admin queue function and extended fields for cloud filter:
- Add admin queue function for Replace filter command (Opcode: 0x025F)
- Define big buffer for extended general fields in Add/Remove
Cloud filters command
Signed-off-by: Laura Stroe <laura.stroe@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
This patch adds:
- ENCAP offload negotiation flag. Use the existing ENCAP_CSUM offload
flag to negotiate GSO_UDP_TUNNEL_CSUM capability and create new ENCAP
flag for negotiating offloads for encapsulated packets
- RX_ENCAP_CSUM offload negotiation flag for VF to negotiate RX
checksum capability for tunnelled packet types.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
When sending an adminq command, we wait for the command to complete in
a loop. This loop waits for an entire millisecond, when in practice the
adminq command is processed often much faster.
Change the loop to use i40e_usec_delay instead, and wait for 50 usecs
each time instead. This appears to be about the minimum time required,
based on some manual observation and testing.
The primary benefit of this change is reducing latency of various
operations in the PF driver, especially when related to having a large
number of VFs enabled.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
This is fix for klocwork issue where dcbcfg->numapps could
be greater than size of array (i.e dcbcfg->app[I40E_DCBX_MAX_APPS]).
The fix makes sure the array is not accessed past size of array
(i.e. I40E_DCBX_MAX_APPS).
Fixes: 166dceeeea ("i40e/base: add parsing for CEE DCBX TLVs")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
The X722 doesn't support the AQ command to read/write the control
register so enable it to bypass the check and use the direct read/write
method.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
On X722, we can control whether or not the hardware performs ATR
eviction. Define the correct bit so we can twiddle it.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
When polling the bonded ports for RX packets the old driver would
always start with the first slave in the list. If the requested
number of packets is filled on the first port in a two port config
then the second port could be starved or have larger number of
missed packet errors.
The code attempts to start with a different slave each time RX poll
is done to help eliminate starvation of slave ports. The effect of
the previous code was much lower performance for two slaves in the
bond then just the one slave.
The performance drop was detected when the application can not poll
the rings of Rx packets fast enough and the packets per second for
two or more ports was at the threshold throughput of the application.
At this threshold the slaves would see very little or no drops in
the case of one slave. Then enable the second slave you would see
a large drop rate on the two slave bond and reduction in throughput.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Current code enables RX interrupts even if this it not
requested.
Fixes: ea121b2831 ("net/nfp: add Rx interrupts")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Chained mbufs hold data_len as the length of that particular mbuf
and pkt_len as the full packet length including all the chained
mbufs. It is not clear from the mbuf definition if pkt_len should
be set for all the mbufs in a chain, but code there for handling
mbufs suggests just the first mbuf requires to have pkt_len set.
NFP PMD was assuming pkt_len is set in all the chained mbufs and
unit tests for gather dma were building mbufs with pkt_len always
set. This patch gets rid of that assumption.
Fixes: b812daadad ("nfp: add Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
When LSO, not doing this can led to firmware disruption. It does
not show as error because TCP ends up sending data again later on.
Fixes: 9ba3d0ae20 ("net/nfp: add TSO support")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Typo in ecore_sriov.c; Ending line with , instead of ;
Fixes: 379cbb2c44 ("net/qede/base: semantic change")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
When a server doesn't support ARI, VF offsets begin at a much higher
number. As a result, ecore miscalculates the first_vf_in_pf and
initialization fails since base driver incorrectly learns there are
no SBs for its VF [as its VFs are out of range].
Fixes: 22d07d939c ("net/qede/base: update")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Remove the unneeded conversion to LE when writing to the 32-bit
XSDM_REG_OPERATION_GEN register
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
VFs are seeing the number of MACs available to them as '0',
and as a result configure themselves as PROMISC. This fix is to
prevent that.
Fixes: 86a2265e59 ("qede: add SRIOV support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Fix the logic for identifying which bit amongst the Multi-bit
attention sources is set.
Fixes: e6051bd6b0 ("qede: add interrupt handling support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
There are some constellations where Due to lack of resource allocation
in MFW, There would be an insufficient number of L2 queues for all the
VFs.
This introduces a new feature ECORE_VF_L2_QUE which correctly numbers
the number of VF queues. Notice it might be larger than the actual
number of VFs in configuration space, in which case its the ecore
client responsibility not to try activating that many.
As part of the fix, also correct the nubmering of the VF queues. As
their numbering is dependent on the SBs of the PF, which might only be
partially used by L2 [as half would be assigned for RDMA which doesn't
require L2 queues], we make the numbering consecutive with that of the
L2 queues only.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Prints in ecore_get_dev_info showed only chip revision,
and did that as number instead of letter.
I.e., BB A0 --> BB0, BB B0 --> BB1, AH A0 --> AH0, AH A1 --> AH0.
Correct the printing scheme into
{AH, BB} {A, B}{0, 1}
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Fix Timer(TM) block Internal Lookup Table(or ILT for logical to
physical address translation) initialization for SRIOV's coexistence
with other protocols.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Remove the forcing of the driver's default resource allocation.
Fixes: 77f7222124 ("net/qede: add PCI ids for new chip variant")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Set pointers to NULL after freeing the allocations. Change OSAL_FREE
macro to take care of this and cleanup relevant code.
Fixes: 26ae839d06 ("qede: add DCBX support")
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
dcbx-update-flag is incorrectly converted to boolean before assigining
it to ramrod data, fix this typecasting. Also, added more debug
messages in the dcbx code paths.
Fixes: 26ae839d06 ("qede: add DCBX support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Split control and datapath to make datapath substitutable and
possibly reusable with alternative control path.
libefx-based Tx datapath is bound to libefx control path, but
it should be possible to use other datapaths with alternative
control path(s).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If Rx refill threshold guarantees that refill happens for one or
more bulks, less checks may be done on refill.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Split control and datapath to make datapath substitutable and
possibly reusable with alternative control path.
libefx-based Rx datapath is bound to libefx control path, but
other datapaths should be possible to use with alternative
control path(s).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Rx queue flags should keep the information required on datapath.
It is a preparation to split control and data paths.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Use different sets of libefx EvQ callbacks for management,
transmit and receive event queue. It makes event handling
more robust against unexpected events.
Also it is required for alternative datapath support.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Adding support for the next items: eth, vlan, ipv4, udp, tcp and for the
next actions: queue, drop
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Make priv_lock/priv_unlock functions and some other structs/defines visible
from different source files by placing them into mlx4.h header.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
If LSC interrupts are enabled, the application expects the link_update
ops to be executed by the PMD itself.
No link status change event is received upon probing, therefore the link
status update must be forced.
Fixes: c4da6caa42 ("mlx4: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
SIMPLEQ_* operations are not available in FreeBSD. Replacing
with equivalent STAILQ_* operations.
Fixes: f2546f8e51 ("net/thunderx/base: add functions to store qsets")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
ConnectX-5 supports enhanced version of multi-packet send (MPS). An MPS Tx
descriptor can carry multiple packets either by including pointers of
packets or by inlining packets. Inlining packet data can be helpful to
better utilize PCIe bandwidth. In addition, Enhanced MPS supports hybrid
mode - mixing inlined packets and pointers in a descriptor. This feature is
enabled by default if supported by HW.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Some libefx-based drivers might need this functionality to
indicate DPCPU FW IDs as part of FW version info to assist
experienced users.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
When setting up the VSIs, MAC filter is used for
receiving MAC broadcast packets.
We should follow it to implement the broadcast
promiscuous mode setting.
Fixes: 61fff9b4c6 ("net/i40e: set VF broadcast mode from PF")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
If the user reconfigures the queue size, then the previously allocated
memzone may potentially be too small. Release the memzone when a queue
is released and allocate a new one each time a queue is setup.
While here convert to rte_eth_dma_zone_reserve() which does basically
the same things as the private function.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
A tap netdevice does not support flow control; ensure nothing but
RTE_FC_NONE mode can be set.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
A tap netdevice actually receives every packet, without any filtering
whatsoever. There is no need for any multicast address registration
to receive multicast packets.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The MTU is assigned to the tap netdevice according to the argument, but
packet transmission and reception just write/read on an fd with the
default limit being the socket buffer size.
As a new rte_eth_dev_data is allocated during tap device init, ensure it
is set again dev->data->mtu.
Once the actual netdevice is created via tun_alloc(), make sure to apply
the desired MTU to the netdevice.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
As soon as the netdevice is created, update pmd->mac_addr with its
actual MAC address.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Create a socket for ioctl at tap device creation instead of opening it
and closing it every call to tap_link_set_flags().
Use a common tap_ioctl() function that can be extended for various uses
(such as MTU change, MAC address change, ...).
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
There is no reason not to support ARP on a tap netdevice. Remove
IFF_NOARP flags.
Focus on IFF_UP when a link status change is required.
Fixes: f457b472b1 ("net/tap: add link up and down operations")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
The patch is to add support for MCDI proxy which comes in
useful, particularly, while running over VF: few commands
will normally fail with EPERM, but in some cases the host
driver (i.e. running over the corresponding PF, typically,
within a hypervisor) may set itself as a proxy to conduct
authorization for the commands coming from VFs; these are
forwarded to the corresponding access control application
which may decline or approve authorization by replying to
the requests; all in all, the guest driver has to process
the replies forwarded back by the firmware MC in order to
give up gracefully (by setting return code which could be
understood by 'libefx') or re-issue the original commands
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
If Rx mode is unacceptable, in particular, when promiscuous
or all-multicast filters are not allowed while running over
PCI function which is not a member of appropriate privilege
groups, the driver has to cope with the failures gracefully
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
If periodic DMA statistics feature is absent (particularly,
while running over VF), the PMD must provide an ability to
cope with it using explicit update requests which are kept
restrained according to 'stats_update_period_ms' parameter
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The patch is to make MAC statistics update interval tunable
by means of 'stats_update_period_ms' kvarg parameter making
it possible to use values different from 1000 ms in case of
SFN8xxx boards provided that firmware version is 6.2.1.1033
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
In function ena_com_set_hash_ctrl(), the return value is assigned to
"ret" variable, but it is not returned. Fix it by adding the return.
Fixes: 99ecfbf845 ("ena: import communication layer")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Jan Medala <jan@semihalf.com>
It's not queue identifier but a descriptor identifier.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit adds a signal-based trigger to the Rx burst function in order
to avoid unnecessary system calls while Rx queues are empty.
Triggered Rx bursts put less pressure on the kernel, free up CPU resources
for applications and result in a noticeable performance improvement when
sharing CPU threads with other PMDs.
Measuring the traffic forwarding rate between two physical devices in
testpmd (IO mode, single thread, 64B packets) before and after adding two
tap PMD instances (4 ports total) that do not process any traffic and
comparing results yields:
Without Rx trigger:
-15% (--burst=32)
-62% (--burst=1)
With Rx trigger:
-0.3% (--burst=32)
-6% (--burst=1)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Polling the Tx queue file descriptor before writing to it is not mandatory
since it is configured as non-blocking.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
nicvf HW expects the DMA address of the packet data to be
aligned with cache line size.
Packet data offset is a function of struct mbuf size,
mbuf private size and headroom. mbuf private size can
be changed from the application in pool creation, this
check detects HW alignment requirement constraint in pmd
start function.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
The hardware offload capabilities are not being advertised for the EM PMD.
Because of this, applications that only enable these features if the device
advertises them will never do so.
Normally this is not an issue since normal packet processing should work
even if hardware offload is not available. But, in older versions of
Virtual Box the e1000 device emulation (Intel PRO/1000 MT Desktop 82540EM)
assumes that it should enable VLAN stripping even if the driver does not
request it. This means that any ingress packets that have a VLAN tag will
be stripped. Since the application did not request to enable VLAN
stripping it is not expecting these packets so they are not processed as
VLAN packets.
Regardless of the Virtual Box issue, the driver should be advertising
supported capabilities as is done in other drivers.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>