The flow API provides the ability to classify packets received by a tap
netdevice.
This patch only implements skeleton functions for flow API support, no
patterns are supported yet.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
In the next patch, access to struct pmd_internals will be necessary in
tap_flow.c to store the flows.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Olga Shern <olgas@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
When configuring an Rx/Tx queue, if the queue already exists, it is reused.
But if the queue size has changed, it must be resized so that invalid memory
is not accessed or overwritten.
Fixes: 2e22920b85 ("mlx5: support non-scattered Tx and Rx")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The VLAN filter is not working on i40e because the driver needs to
disable VLAN promiscuous mode and set the VLAN filter
table.
Fixes: 5f2b0e3f76 ("net/i40e: set VF VLAN filter from PF")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add a new admin queue function and extended fields for cloud filters:
- Add admin queue function for Replace filter command (Opcode: 0x025F)
- Define big buffer for extended general fields in Add/Remove
Cloud filters command
Signed-off-by: Laura Stroe <laura.stroe@intel.com>
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
This patch adds:
- ENCAP offload negotiation flag. Use the existing ENCAP_CSUM offload
flag to negotiate GSO_UDP_TUNNEL_CSUM capability and create a new ENCAP
flag for negotiating offloads for encapsulated packets.
- RX_ENCAP_CSUM offload negotiation flag for the VF to negotiate RX
checksum capability for tunnelled packet types.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
When sending an adminq command, we wait for the command to complete in
a loop. This loop waits for an entire millisecond, when in practice the
adminq command is often processed much faster.
Change the loop to use i40e_usec_delay instead, waiting only 50 usecs
on each iteration. This appears to be about the minimum time required,
based on some manual observation and testing.
The primary benefit of this change is reducing latency of various
operations in the PF driver, especially when related to having a large
number of VFs enabled.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
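A minimal sketch of the polling pattern described above; the completion check
and timeout constant are assumptions, only i40e_usec_delay() is the real helper:

    u32 total_delay = 0;

    do {
            if (i40e_asq_done(hw))          /* hypothetical completion check */
                    break;
            i40e_usec_delay(50);            /* 50 usecs instead of a full ms */
            total_delay += 50;
    } while (total_delay < I40E_ASQ_CMD_TIMEOUT_US);    /* assumed constant */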
This is a fix for a Klocwork issue where dcbcfg->numapps could
be greater than the size of the array (i.e. dcbcfg->app[I40E_DCBX_MAX_APPS]).
The fix makes sure the array is not accessed past its size
(i.e. I40E_DCBX_MAX_APPS).
Fixes: 166dceeeea ("i40e/base: add parsing for CEE DCBX TLVs")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
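A minimal sketch of the bounds check this fix describes; the CEE APP TLV
parsing itself is omitted and the loop context is assumed:

    for (i = 0; i < dcbcfg->numapps; i++) {
            if (i >= I40E_DCBX_MAX_APPS)
                    break;  /* never index dcbcfg->app[] past its size */
            /* ... parse the next CEE APP TLV into dcbcfg->app[i] ... */
    }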
The X722 doesn't support the AQ command to read/write the control
register, so enable it to bypass the check and use the direct read/write
method.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
On X722, we can control whether or not the hardware performs ATR
eviction. Define the correct bit so we can twiddle it.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
When polling the bonded ports for RX packets the old driver would
always start with the first slave in the list. If the requested
number of packets is filled on the first port in a two-port config,
then the second port could be starved or see a larger number of
missed packet errors.
The code now attempts to start with a different slave each time an RX poll
is done, to help eliminate starvation of slave ports. The effect of
the previous code was much lower performance with two slaves in the
bond than with just one slave.
The performance drop was detected when the application could not poll
the rings of RX packets fast enough and the packets per second for
two or more ports was at the threshold throughput of the application.
At this threshold the slaves would see very few or no drops in
the one-slave case. Enabling the second slave would then produce
a large drop rate on the two-slave bond and a reduction in throughput.
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
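A minimal sketch of the rotation idea, with a hypothetical context struct
(bond_ctx) standing in for the real bonding PMD structures:

    static uint16_t
    bond_rx_burst(struct bond_ctx *ctx, struct rte_mbuf **pkts, uint16_t nb_pkts)
    {
            uint16_t nb_rx = 0;
            uint8_t i, slave;

            for (i = 0; i < ctx->nb_slaves && nb_rx < nb_pkts; i++) {
                    /* start from a different slave on every poll */
                    slave = (ctx->active_slave + i) % ctx->nb_slaves;
                    nb_rx += rte_eth_rx_burst(ctx->slaves[slave], ctx->queue_id,
                                              pkts + nb_rx, nb_pkts - nb_rx);
            }
            /* remember where to start on the next poll */
            ctx->active_slave = (ctx->active_slave + 1) % ctx->nb_slaves;
            return nb_rx;
    }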
Current code enables RX interrupts even if they are not
requested.
Fixes: ea121b2831 ("net/nfp: add Rx interrupts")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Chained mbufs hold data_len as the length of that particular mbuf
and pkt_len as the full packet length including all the chained
mbufs. It is not clear from the mbuf definition whether pkt_len should
be set in all the mbufs of a chain, but the code handling
mbufs suggests only the first mbuf needs pkt_len set.
The NFP PMD was assuming pkt_len is set in all the chained mbufs, and
unit tests for gather DMA were building mbufs with pkt_len always
set. This patch gets rid of that assumption.
Fixes: b812daadad ("nfp: add Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
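A minimal sketch of the safer pattern: trust pkt_len only on the first mbuf
and use data_len per segment (the descriptor programming is assumed):

    struct rte_mbuf *seg;
    uint32_t pkt_size = mbuf->pkt_len;      /* valid on the first mbuf only */
    uint16_t dma_len;

    for (seg = mbuf; seg != NULL; seg = seg->next) {
            dma_len = seg->data_len;        /* per-segment length */
            /* ... fill one Tx descriptor with seg's buffer and dma_len ... */
            pkt_size -= dma_len;
    }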
With LSO, not doing this can lead to firmware disruption. It does
not show up as an error because TCP ends up resending the data later on.
Fixes: 9ba3d0ae20 ("net/nfp: add TSO support")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Fix a typo in ecore_sriov.c: a line ending with ',' instead of ';'.
Fixes: 379cbb2c44 ("net/qede/base: semantic change")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
When a server doesn't support ARI, VF offsets begin at a much higher
number. As a result, ecore miscalculates first_vf_in_pf and
initialization fails since the base driver incorrectly learns there are
no SBs for its VFs [as its VFs are out of range].
Fixes: 22d07d939c ("net/qede/base: update")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Remove the unneeded conversion to LE when writing to the 32-bit
XSDM_REG_OPERATION_GEN register.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
VFs are seeing the number of MACs available to them as '0',
and as a result configure themselves as PROMISC. This fix is to
prevent that.
Fixes: 86a2265e59 ("qede: add SRIOV support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Fix the logic for identifying which bit amongst the Multi-bit
attention sources is set.
Fixes: e6051bd6b0 ("qede: add interrupt handling support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
There are some constellations where, due to lack of resource allocation
in MFW, there would be an insufficient number of L2 queues for all the
VFs.
This introduces a new feature ECORE_VF_L2_QUE which correctly sets
the number of VF queues. Notice it might be larger than the actual
number of VFs in configuration space, in which case it is the ecore
client's responsibility not to try activating that many.
As part of the fix, also correct the numbering of the VF queues. As
their numbering is dependent on the SBs of the PF, which might only be
partially used by L2 [as half would be assigned for RDMA which doesn't
require L2 queues], we make the numbering consecutive with that of the
L2 queues only.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Prints in ecore_get_dev_info showed only the chip revision,
and printed it as a number instead of a letter.
I.e., BB A0 --> BB0, BB B0 --> BB1, AH A0 --> AH0, AH A1 --> AH0.
Correct the printing scheme to
{AH, BB} {A, B}{0, 1}.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Fix Timer(TM) block Internal Lookup Table(or ILT for logical to
physical address translation) initialization for SRIOV's coexistence
with other protocols.
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Remove the forcing of the driver's default resource allocation.
Fixes: 77f7222124 ("net/qede: add PCI ids for new chip variant")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Set pointers to NULL after freeing the allocations. Change the OSAL_FREE
macro to take care of this and clean up the relevant code.
Fixes: 26ae839d06 ("qede: add DCBX support")
Fixes: ec94dbc573 ("qede: add base driver")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
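A sketch of the macro change described above, assuming rte_free() as the
underlying allocator call; the dev argument is kept but unused here:

    #define OSAL_FREE(dev, memory)          \
            do {                            \
                    rte_free(memory);       \
                    (memory) = NULL;        \
            } while (0)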
dcbx-update-flag is incorrectly converted to boolean before assigning
it to the ramrod data; fix this typecasting. Also, add more debug
messages in the dcbx code paths.
Fixes: 26ae839d06 ("qede: add DCBX support")
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
Split control and datapath to make datapath substitutable and
possibly reusable with alternative control path.
libefx-based Tx datapath is bound to libefx control path, but
it should be possible to use other datapaths with alternative
control path(s).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If the Rx refill threshold guarantees that refill happens for one or
more bulks, fewer checks may be done on refill.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Split control and datapath to make datapath substitutable and
possibly reusable with alternative control path.
libefx-based Rx datapath is bound to libefx control path, but
it should be possible to use other datapaths with alternative
control path(s).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Rx queue flags should keep the information required on the datapath.
It is a preparation to split control and data paths.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Use different sets of libefx EvQ callbacks for management,
transmit and receive event queues. This makes event handling
more robust against unexpected events.
It is also required for alternative datapath support.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add support for the following items: eth, vlan, ipv4, udp, tcp, and for the
following actions: queue, drop.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Make the priv_lock/priv_unlock functions and some other structs/defines visible
to other source files by placing them in the mlx4.h header.
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
If LSC interrupts are enabled, the application expects the link_update
ops to be executed by the PMD itself.
No link status change event is received upon probing, therefore the link
status update must be forced.
Fixes: c4da6caa42 ("mlx4: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
SIMPLEQ_* operations are not available in FreeBSD. Replace them
with equivalent STAILQ_* operations.
Fixes: f2546f8e51 ("net/thunderx/base: add functions to store qsets")
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
ConnectX-5 supports enhanced version of multi-packet send (MPS). An MPS Tx
descriptor can carry multiple packets either by including pointers of
packets or by inlining packets. Inlining packet data can be helpful to
better utilize PCIe bandwidth. In addition, Enhanced MPS supports hybrid
mode - mixing inlined packets and pointers in a descriptor. This feature is
enabled by default if supported by HW.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Some libefx-based drivers might need this functionality to
indicate DPCPU FW IDs as part of FW version info to assist
experienced users.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
When setting up the VSIs, a MAC filter is used for
receiving MAC broadcast packets.
We should follow the same approach to implement the broadcast
promiscuous mode setting.
Fixes: 61fff9b4c6 ("net/i40e: set VF broadcast mode from PF")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
If the user reconfigures the queue size, then the previously allocated
memzone may potentially be too small. Release the memzone when a queue
is released and allocate a new one each time a queue is set up.
While here, convert to rte_eth_dma_zone_reserve() which does basically
the same thing as the private function.
Fixes: dfaff37fc4 ("vmxnet3: import new vmxnet3 poll mode driver implementation")
Cc: stable@dpdk.org
Signed-off-by: Chas Williams <ciwillia@brocade.com>
Acked-by: Jan Blunck <jblunck@infradead.org>
Acked-by: Shrikrishna Khare <skhare@vmware.com>
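A sketch of the reallocation pattern described above; the local names
(queue, ring_size, etc.) are illustrative, not the exact vmxnet3 code:

    if (queue->memzone != NULL)
            rte_memzone_free(queue->memzone);       /* drop the old zone */

    queue->memzone = rte_eth_dma_zone_reserve(dev, "ring", queue_id,
                                              ring_size, RTE_CACHE_LINE_SIZE,
                                              socket_id);
    if (queue->memzone == NULL)
            return -ENOMEM;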
A tap netdevice does not support flow control; ensure nothing but
RTE_FC_NONE mode can be set.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
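A minimal sketch of such flow-control ops; the function names are
illustrative, but the op signatures follow the ethdev API:

    static int
    tap_flow_ctrl_get(struct rte_eth_dev *dev __rte_unused,
                      struct rte_eth_fc_conf *fc_conf)
    {
            fc_conf->mode = RTE_FC_NONE;    /* the only mode a tap can offer */
            return 0;
    }

    static int
    tap_flow_ctrl_set(struct rte_eth_dev *dev __rte_unused,
                      struct rte_eth_fc_conf *fc_conf)
    {
            if (fc_conf->mode != RTE_FC_NONE)
                    return -ENOTSUP;        /* no real flow control on tap */
            return 0;
    }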
A tap netdevice actually receives every packet, without any filtering
whatsoever. There is no need for any multicast address registration
to receive multicast packets.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
The MTU is assigned to the tap netdevice according to the argument, but
packet transmission and reception just write/read on an fd with the
default limit being the socket buffer size.
As a new rte_eth_dev_data is allocated during tap device init, ensure
dev->data->mtu is set again.
Once the actual netdevice is created via tun_alloc(), make sure to apply
the desired MTU to the netdevice.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
As soon as the netdevice is created, update pmd->mac_addr with its
actual MAC address.
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Create a socket for ioctl at tap device creation instead of opening it
and closing it on every call to tap_link_set_flags().
Use a common tap_ioctl() function that can be extended for various uses
(such as MTU change, MAC address change, ...).
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
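A sketch of a shared ioctl helper along these lines; the field and parameter
names are assumptions, not the exact tap PMD code:

    static int
    tap_ioctl(struct pmd_internals *pmd, unsigned long request,
              struct ifreq *ifr)
    {
            snprintf(ifr->ifr_name, IFNAMSIZ, "%s", pmd->name);
            /* reuse the socket opened once at device creation */
            if (ioctl(pmd->ioctl_sock, request, ifr) < 0)
                    return -errno;
            return 0;
    }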
There is no reason not to support ARP on a tap netdevice. Remove
the IFF_NOARP flag.
Focus on IFF_UP when a link status change is required.
Fixes: f457b472b1 ("net/tap: add link up and down operations")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
The patch is to add support for MCDI proxy which comes in
useful, particularly, while running over a VF: a few commands
will normally fail with EPERM, but in some cases the host
driver (i.e. running over the corresponding PF, typically
within a hypervisor) may set itself as a proxy to conduct
authorization for the commands coming from VFs; these are
forwarded to the corresponding access control application
which may decline or approve authorization by replying to
the requests; all in all, the guest driver has to process
the replies forwarded back by the firmware MC in order to
give up gracefully (by setting a return code which can be
understood by 'libefx') or re-issue the original commands.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
If the Rx mode is unacceptable, in particular when promiscuous
or all-multicast filters are not allowed while running over a
PCI function which is not a member of the appropriate privilege
groups, the driver has to cope with the failures gracefully.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
If the periodic DMA statistics feature is absent (particularly
while running over a VF), the PMD must provide the ability to
cope with it using explicit update requests, which are kept
restrained according to the 'stats_update_period_ms' parameter.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
The patch makes the MAC statistics update interval tunable
by means of the 'stats_update_period_ms' kvarg parameter, making
it possible to use values different from 1000 ms in the case of
SFN8xxx boards, provided that the firmware version is 6.2.1.1033.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
In function ena_com_set_hash_ctrl(), the return value is assigned to
"ret" variable, but it is not returned. Fix it by adding the return.
Fixes: 99ecfbf845 ("ena: import communication layer")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Acked-by: Jan Medala <jan@semihalf.com>
It's not a queue identifier but a descriptor identifier.
Fixes: 4861cde461 ("i40e: new poll mode driver")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
This commit adds a signal-based trigger to the Rx burst function in order
to avoid unnecessary system calls while Rx queues are empty.
Triggered Rx bursts put less pressure on the kernel, free up CPU resources
for applications and result in a noticeable performance improvement when
sharing CPU threads with other PMDs.
Measuring the traffic forwarding rate between two physical devices in
testpmd (IO mode, single thread, 64B packets) before and after adding two
tap PMD instances (4 ports total) that do not process any traffic and
comparing results yields:
Without Rx trigger:
-15% (--burst=32)
-62% (--burst=1)
With Rx trigger:
-0.3% (--burst=32)
-6% (--burst=1)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
Polling the Tx queue file descriptor before writing to it is not mandatory
since it is configured as non-blocking.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
nicvf HW expects the DMA address of the packet data to be
aligned to the cache line size.
The packet data offset is a function of the struct mbuf size,
mbuf private size and headroom. The mbuf private size can
be changed by the application at pool creation, so this
check detects the HW alignment constraint in the PMD
start function.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
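A minimal sketch of the start-time check, using the generic mbuf accessors
(the nicvf logging macro and surrounding context are assumed):

    uint32_t buf_offset = sizeof(struct rte_mbuf) +
                          rte_pktmbuf_priv_size(pool) + RTE_PKTMBUF_HEADROOM;

    if (buf_offset % RTE_CACHE_LINE_SIZE != 0) {
            PMD_INIT_LOG(ERR, "mbuf priv size breaks cache-line alignment");
            return -EINVAL;
    }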
The hardware offload capabilities are not being advertised for the EM PMD.
Because of this, applications that only enable these features if the device
advertises them will never do so.
Normally this is not an issue since normal packet processing should work
even if hardware offload is not available. But, in older versions of
Virtual Box the e1000 device emulation (Intel PRO/1000 MT Desktop 82540EM)
assumes that it should enable VLAN stripping even if the driver does not
request it. This means that any ingress packets that have a VLAN tag will
be stripped. Since the application did not request to enable VLAN
stripping it is not expecting these packets so they are not processed as
VLAN packets.
Regardless of the Virtual Box issue, the driver should be advertising
supported capabilities as is done in other drivers.
Signed-off-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
According to 'libefx' API requirements, one is allowed to
apply a multicast address list to the port in the started state
only; otherwise the new array should be copied to local
storage in order to be applied during the next port start.
Coverity issue: 141296
Fixes: 0fa0070e43 ("net/sfc: support multicast addresses list controls")
Fixes: e9ddf37a50 ("net/sfc: fix setting empty multicast list")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Support individual/group destination address match (unknown unicast
and all-multicast respectively, in firmware terms).
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Exact match of source and destination ports is supported by the parser.
IP protocol match is enforced to UDP.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Exact match of source and destination ports is supported by the parser.
IP protocol match is enforced to TCP.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Exact match of IP protocol, source and destination
addresses is supported by the parser.
EtherType match is enforced to the IPv6 EtherType.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Exact match of IP protocol, source and destination
addresses is supported by the parser.
EtherType match is enforced to the IPv4 EtherType.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Only exact match of the VLAN ID bits is supported, and it is required in the
VLAN item. A mask matching only the VLAN ID bits is required; the default mask
matching the entire TCI is not supported.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Only the pattern items VOID, ETH and the actions VOID, QUEUE are now
supported.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
The information is obtained from firmware on attach. It may
change after MC reboot (firmware version or variant change).
Cache should be refreshed after MC reboot when it is handled
properly (not yet).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
Just log an error for all filter types and return a no-support indication.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andrew Lee <alee@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Robert Stonehouse <rstonehouse@solarflare.com>
The previous API had various problems, including the length of the
caller provided buffer not being specified, no means being available
to discover how big the buffer needs to be, and a lack of clarity of
what the resulting list contains.
To improve it:
- add the buffer length as a parameter
- if the provided buffer is too short, fail with ENOSPC and return
the required length
- ensure that the list contents are valid and add comments describing it
It is safe to change this API as, unsurprisingly, it has no users.
Signed-off-by: Mark Spender <mspender@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
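A sketch of the call pattern the reworked API enables; efx_get_list() is a
hypothetical name standing in for the real function:

    uint8_t small[32];
    size_t need = 0;
    int rc;

    rc = efx_get_list(enp, small, sizeof(small), &need);
    if (rc == ENOSPC) {
            /* buffer too short: 'need' reports the required length */
            uint8_t *big = malloc(need);

            if (big != NULL)
                    rc = efx_get_list(enp, big, need, &need);
    }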
The flag EFX_FILTER_MATCH_LOC_MAC_IG to represent filtering on the
individual/group bit of the MAC address (with the two cases being
distinguished by the MAC address in the filter specification) was
introduced to mirror the Linux driver filtering code, but the
implementations are different enough anyway that it isn't of much
value.
Having separate flags for unknown unicast and multicast simplifies
the code and allows the set of flags to match those used by MCDI.
It will also make it easier to report whether these filters are
supported.
In the MCDI definitions, the unknown multicast and unicast flags
have the values 0x40000000 and 0x80000000 respectively, and so using
the same values for simplicity requires 32 bits in the filter
specification to store the flags. This means the structure is now
a little bigger than 64 bytes, but filters are not often used on
critical paths so this shouldn't have much impact - on Linux they
are also bigger than they used to be.
Signed-off-by: Mark Spender <mspender@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Without setting the id, calling xstats_get twice with the same array causes
memory corruption.
Also, if IXGBEVF_NB_XSTATS is ever different from 1 in the future,
this will cause issues.
Fixes: 156712ba40 ("ixgbevf: add extending stats")
Signed-off-by: Ido Barnea <ibarnea@cisco.com>
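A minimal sketch of the fix: write the id for every entry so a reused caller
array cannot keep stale values (the stat accessor here is hypothetical):

    for (i = 0; i < IXGBEVF_NB_XSTATS; i++) {
            xstats[i].id = i;                               /* the missing assignment */
            xstats[i].value = vf_stat_value(hw_stats, i);   /* stat value as before */
    }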
Since PKT_TX_TCP_SEG implies PKT_TX_TCP_CKSUM, the PMD must force this
flag.
The fix applies to both tunneled and non-tunneled packets.
Fixes: 3f13f8c23a ("net/mlx5: support hardware TSO")
Fixes: b247f34601 ("net/mlx5: support hardware TSO for VXLAN and GRE")
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
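A minimal sketch of the idea (not the exact mlx5 descriptor code): treat a
TSO request as implying TCP checksum offload as well:

    uint64_t ol_flags = buf->ol_flags;

    if (ol_flags & PKT_TX_TCP_SEG)
            ol_flags |= PKT_TX_TCP_CKSUM;   /* TSO implies TCP checksum */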
Multiple IES API resets can cause a race condition where the mailbox
interrupt request bits can be cleared before being handled. This can
leave certain mailbox messages from the PF untreated and the PF
will enter an inactive state. If this situation occurs, the IES API
will initiate a mailbox version reset which then triggers a mailbox
state change. Once this mailbox transition occurs (from OPEN to CONNECT
state), a request for reset will be returned.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Other shared code bases are planning on using
IS_MULTICAST_ETHER_ADDR and friends without leaving the driver
name in the macro.
Remove reference to FM10K here so that we can re-use the specific
compat flags from Linux.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Avoid potential FUM fault errors on a VF when updating MAC address
and VLAN information. Only use the register flow when the mailbox is
disconnected, by checking if the enqueue_tx returns
FM10K_MBX_ERR_NO_MBX. If the mailbox message can be sent, there is no
reason to bother with the register writes which are only intended to
be used during VF driver initialization.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Don't report FM10K_ERR_REQUESTS_PENDING when we fail to disable queues
within the timeout. This can occur due to a hardware Tx hang, or when
the switch ethernet fabric is resetting while we are transmitting
traffic. It can sometimes take up to 500ms before the Tx DMA engine
gives up. Instead, just skip the DMA engine check and perform
a data-path reset anyway. Add a statistic counter to keep track of the
number of resets occurring while we have pending DMA on the rings.
In order to prevent having to assign err = FM10K_SUCCESS, re-order the
last few items of the reset_hw_pf function so that we don't perform
"return err" at the end.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
If the fm10k interface is brought up, but the switch manager software is
not running, the driver will continuously request the lport map every
few seconds in the base driver watchdog routine. Eventually after
several minutes the switch mailbox Tx fifo will fill up and the mailbox
will timeout, resulting in a reset. This reset will appear as if for no
reason, and occurs regularly every few minutes until the switch manager
software is loaded.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The VF uses a multi-bit update request to clear unused VLANs whenever it
resets. However, an accident in a previous refactor broke multi-bit
updates for VFs, due to misreading a comment in fm10k_vf.c and
attempting to reduce code duplication. The problem occurs because
a multi-bit request has a non-zero length, and the PF would simply drop
any request with the upper 16 bits set. In addition, a multi-bit VLAN
update does not have a concept of "VLAN 0" as the single-bit update
does.
A previous revision of this patch resolved the issue by simply removing
the upper 16 bit check and the iov_select_vid checks. However, this would
remove the checks for default VID and for ensuring no other VLANs can be
enabled except pf_vid when it has been set. To resolve that issue, this
revision uses the iov_select_vid when we have a single-bit update, and
denies any multi-bit update when the VLAN was administratively set by
the PF. This should be ok since the PF properly updates VLAN_TABLE when
it assigns the PF vid. This ensures that requests to add or "remove" the
PF vid work as expected, but a rogue VF could not use the multi-bit
update as a loophole to attempt receiving traffic on other VLANs.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The original comment may be read incorrectly as referring to checking
that the *entire* length is zero. However, it merely checks the reserved
bits of both the length and reserved fields in a small amount of code.
Update the comment to indicate this is a clever trick and clearly spell
out that it only checks the reserved bits.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Use a new #define FM10K_VLAN_OVERRIDE even though we're using
the exact same bit. The reason for this is clarity in the code,
otherwise you can read FM10K_VLAN_CLEAR and think it should be
removed. Also add a comment explaining why the FM10K_VLAN_OVERRIDE
bit is set.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
The diagram represents the bit layout of the multi-bit VLAN update
message format. Re-draw the numbers using base 8, and mark the
bit values every 8 bits at the top. This should make the table
easier to grasp quickly.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Add FM10K_PF_ATTR_ID_ERR, since it is possible for the switch manager
to send out an error message indicating status of the LPORT_MAP due to
zero allocated bandwidth.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Clean up the logic in fm10k_tlv_attr_parse; we
should not rely on FM10K_NOT_IMPLEMENTED being
greater than zero, as this can easily cause confusion.
The patch also corrects a minor documentation error.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Deleting the lport when the multicast mode is configured to
FM10K_XCAST_MODE_ALLMULTI or FM10K_XCAST_MODE_PROMISC will
result in generating orphaned multicast-group entries in the
switch manager.
Before deleting the lport, reset multicast mode to
FM10K_XCAST_MODE_NONE to flush out these multicast-group
entries.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Store the device name in dev->data->name, to have symmetrical behavior
between rte_pmd_tap_probe(name) and rte_pmd_tap_remove(name).
The netdevice name (linux interface name) is stored in the name field of
struct pmd_internals.
snprintf(data->name) has been moved closer to rte_ethdev_allocate()
as it should use the same name.
Fixes: 02f96a0a82 ("net/tap: add TUN/TAP device PMD")
Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
EF10 supported by the PMD has no limitations on address boundary
crossing by Tx DMA descriptors.
Fixes: 428c7ddd2f ("net/sfc: send bursts of packets")
Fixes: fec33d5bb3 ("net/sfc: support firmware-assisted TSO")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Siena has a limitation on maximum byte count and 4k boundary crossing
(which is stricter than the maximum byte count limit).
EF10 has limitation on maximum byte count only.
Fixes: f7dc06bf35 ("net/sfc/base: import 5xxx/6xxx family support")
Fixes: e7cd430c86 ("net/sfc/base: import SFN7xxx family support")
Fixes: 94190e3543 ("net/sfc/base: import SFN8xxx family support")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
If a libefx-based driver needs some way to clear port statistics,
then an MCDI agnostic method is required.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
All kinds of filters need a hardware MAC type check
to make sure the hardware supports that type of filter.
If not, it may cause serious issues.
Fixes: 11777435c7 ("net/ixgbe: parse flow director filter")
Fixes: 672be56d76 ("net/ixgbe: parse n-tuple filter")
Fixes: eb3539fc85 ("net/ixgbe: parse ethertype filter")
Fixes: 429f6ebb42 ("net/ixgbe: parse TCP SYN filter")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Move two ixgbe MAC type check macros to ixgbe_ethdev.h in
order to be used by the filter parser functions in
ixgbe_flow.c.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Delete useless function declarations in ixgbe_flow.c and
adjust the function definition positions to avoid compile errors.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
With all vmxnet3 version 3 changes incorporated in the vmxnet3 driver,
the driver can configure emulation to run at vmxnet3 version 3, provided
the emulation advertises support for version 3.
This patch also updates release notes.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
In vmxnet3 version 3, the emulation added support for the vmxnet3 driver
to communicate information about the memory regions the driver will use
for rx/tx buffers. The driver can also indicate which rx/tx queue the
memory region is applicable for. If this information is communicated
to the emulation, the emulation will always keep these memory regions
mapped, thereby avoiding the mapping/unmapping overhead for every packet.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Signed-off-by: Guolin Yang <gyang@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
This command is reserved.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
vmxnet3 driver preallocates buffers for receiving packets and posts the
buffers to the emulation. In order to deliver a received packet to the
guest, the emulation must map buffer(s) and copy the packet into it.
To avoid this memory mapping overhead, this patch introduces the receive
data ring - a set of small sized buffers that are always mapped by
the emulation. If a packet fits into the receive data ring buffer, the
emulation delivers the packet via the receive data ring (which must be
copied by the guest driver), or else the usual receive path is used.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
The vmxnet3 driver supports a transmit data ring, viz. a set of fixed-size
buffers used by the driver to copy packet headers. Small packets that
fit these buffers are copied into them entirely.
Currently this buffer size is fixed at 128 bytes. This patch extends the
transmit data ring implementation to allow variable length transmit
data ring buffers. The length of the buffer is read from the emulation
during initialization.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
Shared memory is used to exchange information between the vmxnet3 driver
and the emulation. In order to request emulation to perform a task, the
driver first populates specific fields in this shared memory and then
issues the corresponding command by writing to the command register (CMD). The
layout of the shared memory was defined by vmxnet3 version 1 and cannot
be extended for every new command without breaking backward compatibility.
To address this problem, in vmxnet3 version 3, the emulation repurposed
a reserved field in the shared memory to represent command information
instead. For new commands, the driver first populates the command
information field in the shared memory and then issues the command. The
emulation interprets the data written to the command information
depending on the type of the command. This patch exposes this capability
to the driver.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
Cleanup some code in preparation of vmxnet3 version 3 changes.
Signed-off-by: Shrikrishna Khare <skhare@vmware.com>
Acked-by: Yong Wang <yongwang@vmware.com>
Acked-by: Jin Heo <heoj@vmware.com>
Enable DCB on SRIOV VFs, including:
- UP and TC mapping according to dcb_tc in struct rte_eth_dcb_rx_conf.
- TC and queue mapping: queues are divided equally for each TC.
- UP insertion when sending packets, according to the TC the Tx queue
belongs to.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
In the SRIOV case, ETH_MQ_RX_VMDQ_DCB and ETH_MQ_RX_DCB should be considered
to have the same meaning, because the multi-queue mapping is the same for
SRIOV and VMDq in ixgbe.
Fixes: 27b609cbd1 ("ethdev: move the multi-queue mode check to specific drivers")
Cc: stable@dpdk.org
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
DCB is split into RX and TX modes. All-queues-drop is set for TX mode.
That is not appropriate because all-queues-drop is an RX feature.
Move this setting from TX to RX.
Fixes: f3f9b17bb8 ("net/ixgbe: support multiqueue mode VMDq DCB with SRIOV")
Cc: stable@dpdk.org
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Checking whether the counter is an IB counter was performed with the
wrong index.
Fixes: 859081d3fb ("net/mlx5: add out of buffer counter to extended statistic")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
This commit adds support for hardware TSO for tunneled packets.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Prior to this commit Tx checksum offload was supported only for the
inner headers.
This commit adds support for the hardware to compute the checksum for the
outer headers as well.
The support is for tunneling protocols GRE and VXLAN.
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
The function is not defined anywhere, remove it.
Fixes: 0eb609239e ("ixgbe: enable Rx queue interrupts for PF and VF")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
In the IOV scenario, multiple Rx queues can be assigned to one VF.
If dropping is not enabled, then when no descriptors are available
for one queue, this queue can block the others.
Fixes: 00e30184da ("ixgbe: add PF support")
Cc: stable@dpdk.org
Suggested-by: Liang-Min Larry Wang <liang-min.wang@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
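A sketch of enabling per-queue drop; the register and bit macros follow the
ixgbe base code, while the queue loop bounds here are assumed:

    for (q = 0; q < nb_vf_queues; q++)
            IXGBE_WRITE_REG(hw, IXGBE_QDE,
                            IXGBE_QDE_WRITE | IXGBE_QDE_ENABLE |
                            (q << IXGBE_QDE_IDX_SHIFT));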
During PF initialization, the PF will generate an initial MAC address
for the VFs; the purpose is to help a VF keep a constant MAC address between
its startup/shutdown cycles. Now this is not necessary, since we already
provide an API to set a VF's MAC address from the PF side
(rte_pmd_i40e_set_vf_mac_addr).
An application can use this API to lock down a VF's MAC address (of course
this should happen before VF init).
Even without this patch, we can still use rte_pmd_i40e_set_vf_mac_addr
to overwrite the random one, but this patch aligns DPDK's default behavior
with the kernel PF driver's, and this helps to give an identical experience
when working with the kernel VF driver.
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Having a drop queue per drop flow consumes a lot of memory and reduces the
speed capabilities of the NIC to handle such cases.
To avoid this and reduce memory consumption, an RSS drop queue is created
for all drop flows.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Implement a basic flow RSS action. This commit doesn't handle the default
RSS queues already created by the control plane, as that last part is huge.
Any new RSS flow request will be added using a higher priority
than the default one, to be sure this rule will be the one used.
Default ones (those created by dev_start()) remain, but as they have a
lower priority they will not receive any new packets.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In the mlx5 PMD, handling a single queue or several destination queues ends up
creating the same Verbs attribute; the main difference resides in the
indirection table and the RSS hash key.
This helps to prepare the support for RSS queues by first handling the
queue action as being an RSS queue with a single queue. No RSS hash key
will be provided to the Verbs flow.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
ibv_attr should be freed in the function which allocates the memory.
Fixes: 2097d0d1e2 ("net/mlx5: support basic flow items and actions")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>