Add callback for updating information about link status/info.
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
Signed-off-by: Zyta Szpak <zr@semihalf.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add part of PMD for actual reception/transmission.
Signed-off-by: Yelena Krivosheev <yelena@marvell.com>
Signed-off-by: Dmitri Epshtein <dima@marvell.com>
Signed-off-by: Zyta Szpak <zr@semihalf.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Support counter action for 1400 series adapters.
The adapter API for allocating and freeing counters is independent of
the adapter match/action API. If the filter action is requested, a
counter is first allocated and then assigned to the filter, and when
the filter is deleted, the counter must also be deleted.
Counters are DMAd to pre-allocated consistent memory periodically,
controlled by the define VNIC_FLOW_COUNTER_UPDATE_MSECS. The default is
100 milliseconds.
Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Document MTR (metering) and TM (traffic management) usage plus
do some small updates here and there.
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
This patch introduces necessary changes required by MUSDK 18.09 library.
* As of MUSDK 18.09, pp2_cookie_t is no longer available. Now
RX descriptor cookie is defined as plain u64 so existing cast
is no longer valid.
* MUSDK 18.09 increased number of available bpools (buffer hw pools) by
introducing dma regions support. Update mvpp2 driver accordingly.
* replace MV_NET_IP4_F_TOS with MV_NET_IP4_F_DSCP
Before this patch, API allowed to configure a classification rule
according to IPv4 TOS, which was not supported in classifier. This patch
fixes this by using proper field.
* use 48 bit address mask
We cannot get pointers exceeding 48 bits thus using 48 bit
mask for extracting higher IOVA address bits is enough.
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
Signed-off-by: Yuval Caduri <cyuval@marvell.com>
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
Reviewed-by: Shlomi Gridish <sgridish@marvell.com>
Reviewed-by: Alan Winkowski <walan@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
Change QoS configuration file syntax for port's default policer
setup.
Since default policer configuration is performed before
any other policer configuration we can pick a default id.
This simplifies default policer configuration since user
no longer has to choose ids from range [0, PP2_CLS_PLCR_NUM].
Explicitly document values for rate_limit_enable field.
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
Signed-off-by: Natalie Samsonov <nsamsono@marvell.com>
Reviewed-by: Liron Himi <lironh@marvell.com>
set_mc_addr_list method is not implemented by the driver yet.
Fixes: a46f8d584e ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Evgeny Im <evgeny.im@oktetlabs.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
For IA, the AVX2 vector path is only recommended to be used on later
platforms (identified by AVX512 support, like SKL etc.) This is because
performance benchmark shows downgrade when running AVX2 vector path on
early platform (BDW/HSW) in some cases. But we still observe perf gain
with some real work loading.
So this patch introduced the new devarg use-latest-supported-vec to
force the driver always selecting the latest supported vec path. Then
apps are able to take AVX2 path on early platforms. And this logic can
be re-used if we will have AVX512 vec path in future.
This patch only affects IA platforms. The selected vec path would be
like the following:
Without devarg/devarg = 0:
Machine vPMD
AVX512F AVX2
AVX2 SSE4.2
SSE4.2 SSE4.2
<SSE4.2 Not Supported
With devarg = 1
Machine vPMD
AVX512F AVX2
AVX2 AVX2
SSE4.2 SSE4.2
<SSE4.2 Not Supported
Other platforms can also apply the same logic if necessary in future.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Integrate accelerated networking support into netvsc PMD.
This allows netvsc to manage VF without using failsafe or vdev_netvsc.
For the exception vswitch path some tests like transmit
get a 22% increase in packets/sec.
For the VF path, the code is slightly shorter but has no
real change in performance.
Pro:
* using netvsc is more like other DPDK NIC's
* the exception packet uses less CPU
* much smaller code size
* no locking required on VF transmit/receive path
* no legacy Linux network device to get mangled by userspace
* much simpler (1K vs 9K) LOC
* unified extended statistics
Con:
* using netvsc has more complex startup model
* no bifurcated driver support
* no flow support (since host does not have flow API).
* no tunnel offload support
* no receive interrupt support
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Implement callback functionality on link state changes.
This is not really driven off of interrupt file descriptor like most other
PMD's. Instead, it happens when a link state change message arrives
in the common ring buffer.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Removed DEV_RX_OFFLOAD_CRC_STRIP offload flag.
Without any specific Rx offload flag, default behavior by PMDs is to
strip CRC.
PMDs that support keeping CRC should advertise DEV_RX_OFFLOAD_KEEP_CRC
Rx offload capability.
Applications that require keeping CRC should check PMD capability first
and if it is supported can enable this feature by setting
DEV_RX_OFFLOAD_KEEP_CRC in Rx offload flag in rte_eth_dev_configure()
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Tomasz Duszynski <tdu@semihalf.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Remes <remes@netcope.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
On ConnectX-4 Lx the Multi Packet Send (MPW) feature is considered
un-secure, as on some cases were the application provides incorrect mbufs
on the Tx burst the host or NIC can get stuck.
Hence, disabling the feature by default for this specific NIC.
Users can still enable this feature and enjoy the performance gain
(mostly for low number of cores) by using the txq_mpw_en devarg.
This patch will impact the out of the box performance of some application
using ConnectX-4 Lx for the sack of security and robustness.
Since we need different defaults based on the underlying device the mpw
field in the configuration struct was extended to contain also the
MLX5_ARG_UNSET option.
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Add tx_done_cleanup ethdev hook to allow application to
control if/when it wants completions to be handled.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Since netvsc PMD does not support SR-IOV accelerated networking,
it is not recommended for use on Azure. Make this clear in the
guide.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Add suggested DPDK/kernel driver/firmware version matching list for i40e.
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
This reverts the patch that enabled mbuf fast free.
There are two main reasons.
First, enic_fast_free_wq_bufs is broken. When
DEV_TX_OFFLOAD_MBUF_FAST_FREE is enabled, the driver calls this
function to free transmitted mbufs. This function currently does not
reset next and nb_segs. This is simply wrong as the fast-free flag
does not imply anything about next and nb_segs.
We could fix enic_fast_free_wq_bufs by making it to call
rte_pktmbuf_prefree_seg to reset the required fields. But, it negates
most of cycle saving.
Second, there are customer applications that blindly enable all Tx
offloads supported by the device. Some of these applications do not
satisfy the requirements of mbuf fast free (i.e. a single pool per
queue and refcnt = 1), and end up crashing or behaving badly.
Fixes: bcaa54c1a1 ("net/enic: support mbuf fast free offload")
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
With mlx5, unlike normal flow rules implemented through Verbs for traffic
emitted and received by the application, those targeting different logical
ports of the device (VF representors for instance) are offloaded at the
switch level and must be configured through Netlink (TC interface).
This patch adds preliminary support to manage such flow rules through the
flow API (rte_flow).
Instead of rewriting tons of Netlink helpers and as previously suggested by
Stephen [1], this patch introduces a new dependency to libmnl [2]
(LGPL-2.1) when compiling mlx5.
[1] https://mails.dpdk.org/archives/dev/2018-March/092676.html
[2] https://netfilter.org/projects/libmnl/
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Make a few updates in preparation for 18.08.
- Use SPDX
- Add 1400 series VIC adapters to supported models
- Describe the VXLAN port number
- Expand the description for ig-vlan-rewrite
- Add inner RSS and checksum to the features
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
This patch adds support for building and running mlx5 PMD on
32bit systems such as i686.
The main issue to tackle was handling the 32bit access to the UAR
as quoted from the mlx5 PRM:
QP and CQ DoorBells require 64-bit writes. For best performance, it
is recommended to execute the QP/CQ DoorBell as a single 64-bit write
operation. For platforms that do not support 64 bit writes, it is
possible to issue the 64 bits DoorBells through two consecutive
writes,
each write 32 bits, as described below:
* The order of writing each of the Dwords is from lower to upper
addresses.
* No other DoorBell can be rung (or even start ringing) in the midst
of an on-going write of a DoorBell over a given UAR page.
The last rule implies that in a multi-threaded environment, the access
to a UAR page (which can be accessible by all threads in the process)
must be synchronized (for example, using a semaphore) unless an atomic
write of 64 bits in a single bus operation is guaranteed. Such a
synchronization is not required for when ringing DoorBells on different
UAR pages.
Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Prior to this patch, all port representors detected on a given device were
probed and Ethernet devices instantiated for each of them.
This patch adds support for the standard "representor" parameter, which
implies that port representors are not probed by default anymore, except
for the list provided through device arguments.
(Patch based on prior work from Yuanhan Liu)
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
Fixes: 3d12dceed2df ("ethdev: add new offload flag to keep CRC")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
In DPDK 17.11, the ethdev offloads API has changed:
commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload
For reminder, the main concepts in the new API were:
- All offloads are disabled by default
- Distinction between per port and per queue offloads.
The transition bits are now removed:
- Translation of the old API in ethdev
- rte_eth_conf.rxmode.ignore_offload_bitfield
- ETH_TXQ_FLAGS_IGNORE
The old API bits are now removed:
- Rx per-port rte_eth_conf.rxmode.[bit-fields]
- Tx per-queue rte_eth_txconf.txq_flags
- ETH_TXQ_FLAGS_NO*
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
Support rx of in direction packets only
Useful for apps that also tx to eth_pcap ports in order to not see them
echoed back in as rx when out direction is also captured
Example:
In case using rx_iface and sending *single* packet to eth1
it will loop forever as the when it is sent to tx_iface=eth1
it will be captured again on the rx_iface=eth1 and so on
$RTE_TARGET/app/testpmd l 0-3 -n 4 \
--vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1'
…
---------------------- Forward statistics for port 0 ------------
RX-packets: 758 RX-dropped: 0 RX-total: 758
TX-packets: 758 TX-dropped: 0 TX-total: 758
------------------------------------------------------------------
While if using rx_iface_in it will not be captured on the way out and
be forwarded only once
$RTE_TARGET/app/testpmd l 0-3 -n 4 \
--vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1'
…
---------------------- Forward statistics for port 0 ------------
RX-packets: 1 RX-dropped: 0 RX-total: 1
TX-packets: 1 TX-dropped: 0 TX-total: 1
------------------------------------------------------------------
Signed-off-by: Ido Goshen <ido@cgstowernetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add a new devarg "ig-vlan-rewrite" to allow the user to set
non-default rewrite mode. The UCS VIC may add/remove/modify the VLAN
header of an ingress packet depending on the ingress VLAN rewrite
mode.
By default, the driver sets the pass-through mode, which tells the NIC
"do not touch VLAN header and preserve it as is". This mode is usually
sufficient, but can complicate deployments for certain environments.
For example, OVS-DPDK in UCS blade environments may want to use "untag
default VLAN mode", which removes the VLAN header from an ingress
packet if it matches vNIC's default VLAN.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
rte_eth_rx_descritpr_status and rte_eth_tx_descriptor_status
are supported by fm10K.
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
For i40evf, internal rx interrupt and adminq interrupt share the same
source, that cause a lot cpu cycles be wasted on interrupt handler
on rx path. This is complained by customers which require low latency
(when set I40E_ITR_INTERVAL to small value), but have to be sufferred by
tremendous interrupts handling that eat significant CPU resources.
The patch disable pci interrupt and remove the interrupt handler,
replace it with a low frequency (50ms) interrupt polling daemon
which is implemented by registering a alarm callback periodly, this
save CPU time significently: On a typical x86 server with 2.1GHz CPU,
with low latency configure (32us) we saw CPU usage from top commmand
reduced from 20% to 0% on management core in testpmd).
Also with the new method we can remove compile option: I40E_ITR_INTERVAL
which is used to balance between low latency and low CPU usage previously.
Now we don't need it since we can reach both at same time.
Suggested-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
After IN_ORDER Rx/Tx paths added, need to update Rx/Tx path selection
logic.
Rx path select logic: If IN_ORDER and merge-able are enabled will select
IN_ORDER Rx path. If IN_ORDER is enabled, Rx offload and merge-able are
disabled will select simple Rx path. Otherwise will select normal Rx
path.
Tx path select logic: If IN_ORDER is enabled will select IN_ORDER Tx
path. Otherwise will select default Tx path.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Add parameters for configuring VIRTIO_NET_F_MRG_RXBUF and
VIRTIO_F_IN_ORDER feature bits. If feature is disabled, also update
corresponding unsupported feature bit.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Multi-Packet Receive Queue is to receive multiple packets on a single large
buffer. The number of consumed strides in CQE is accumulated to keep track
of the current stride index. However, it is safer to directly use stride
index in CQE to avoid out-of-order situation which can possibly be caused
by introducing LRO in the future.
If Rx CQE compression is enabled, HW can be configured to store the stride
index in a mini-CQE but this will need newer version of library/driver.
Therefore, since this change, MPRQ is only supported with the newer
library/driver and Rx hash result is not supported if MPRQ is enabled along
with Rx CQE compression.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>