Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: John W. Linville <linville@tuxdriver.com>
Remove rte_pcap_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Remove rte_ring_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Remove rte_null_pmd and pci_dev.
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Initialise the dev_flags, driver, kdrv, drv_name and numa_node fields
in eth_dev data for the following vdevs:
null
ring
pcap
af_packet
xenvirt
mpipe
bonding
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Use the new function rte_eth_copy_pci_info to copy device info for the
following pdevs:
bnx2x
cxgbe
e1000
enic
fm10k
i40e
ixgbe
mlx4
mlx5
virtio
vmxnet3
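As a rough sketch (not taken from any of the listed drivers; the pci_dev
field name matches the rte_eth_dev layout of this release), the call
typically sits near the top of each PMD's eth_dev init routine:

    #include <rte_ethdev.h>

    /* Hedged sketch: copy PCI info (dev_flags, kdrv, drv_name, numa_node, ...)
     * into the ethdev data early in the driver's eth_dev init callback. */
    static int
    example_eth_dev_init(struct rte_eth_dev *eth_dev)
    {
            struct rte_pci_device *pci_dev = eth_dev->pci_dev;

            rte_eth_copy_pci_info(eth_dev, pci_dev);
            /* driver-specific initialization continues here */
            return 0;
    }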
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Make the virtio PMD allocate the array of unicast MAC addresses with the
maximum number of entries (VIRTIO_MAX_MAC_ADDRS) that it exports.
Signed-off-by: Ivan Boule <ivan.boule@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
According to Table 7-38 (Valid Fields by Offload Option) of the
Intel® 82599 10 GbE Controller Datasheet, the L4LEN field is not needed
for L4 checksum (XSUM) computation by the hardware.
So remove l4_len from tx_offload_mask in the ixgbe_set_xmit_ctx function
used to build the context descriptor transmitted to the hardware.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
ConnectX-4 adapters do not have a constant indirection table size, which is
set at runtime from the number of RX queues. The maximum size is retrieved
using a hardware query and is normally 512.
Since the current RETA API cannot handle a variable size, any query/update
command causes it to be silently updated to RSS_INDIRECTION_TABLE_SIZE
entries regardless of the original size.
Also due to the underlying type of the configuration structure, the maximum
size is limited to RSS_INDIRECTION_TABLE_SIZE (currently 128, at most 256
entries).
A port stop/start must be done to apply the new RETA configuration.
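As an application-side illustration (a hedged sketch, not the PMD code;
the exact port_id width depends on the DPDK release), programming a RETA
and restarting the port so it takes effect could look like this:

    #include <string.h>
    #include <rte_ethdev.h>

    /* Spread reta_size entries evenly over nb_rxq queues, then restart the
     * port so the new table is actually applied. */
    static int
    apply_reta(uint8_t port_id, uint16_t reta_size, uint16_t nb_rxq)
    {
            struct rte_eth_rss_reta_entry64 reta_conf[8]; /* up to 512 entries */
            uint16_t i;
            int ret;

            memset(reta_conf, 0, sizeof(reta_conf));
            for (i = 0; i < reta_size; i++) {
                    reta_conf[i / RTE_RETA_GROUP_SIZE].mask |=
                            1ULL << (i % RTE_RETA_GROUP_SIZE);
                    reta_conf[i / RTE_RETA_GROUP_SIZE].reta[i % RTE_RETA_GROUP_SIZE] =
                            i % nb_rxq;
            }
            ret = rte_eth_dev_rss_reta_update(port_id, reta_conf, reta_size);
            if (ret != 0)
                    return ret;
            rte_eth_dev_stop(port_id);  /* new RETA takes effect across stop/start */
            return rte_eth_dev_start(port_id);
    }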
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Support both admin queue based and direct register based configuration of
the RSS hash key and lookup table, as the X722 supports AQ based
configuration.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Support both admin queue based and direct register based configuration of
the RSS hash key and lookup table, as the X722 supports AQ based
configuration.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
In order to provide users early access to the X722 and its A0 hardware,
new device IDs are added, and compilation of their support in the base
driver is enabled.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add a one-parameter function to take an existing rte_ring and wrap it as
an ethdev, returning the port id of the new ethdev instance.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
The ring ethdev creation function creates an ethdev, but does not
actually set it up for use. Even if it's just a single ring, the user
still needs to create a mempool, call rte_eth_dev_configure, then call
rx and tx setup functions before the ethdev can be used.
This patch changes things so that the ethdev is fully set up after the
call to create the ethdev. The above-mentioned functions can still be
called - as will be the case, for instance, if the NIC is created via
command-line parameters - but they are no longer essential.
The function now also sets rte_errno appropriately on error, so the
caller can get a better indication of why a call may have failed.
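A minimal usage sketch, assuming the one-parameter helper is
rte_eth_from_ring() and that failures are reported through rte_errno as
described above (ring name and sizes are arbitrary examples):

    #include <stdio.h>
    #include <rte_ring.h>
    #include <rte_eth_ring.h>
    #include <rte_errno.h>
    #include <rte_lcore.h>

    /* Wrap an existing rte_ring as a ready-to-use ethdev. */
    static int
    ring_port_create(void)
    {
            struct rte_ring *r = rte_ring_create("eth_ring0", 1024, rte_socket_id(),
                            RING_F_SP_ENQ | RING_F_SC_DEQ);
            int port;

            if (r == NULL)
                    return -1;
            port = rte_eth_from_ring(r);
            if (port < 0)
                    printf("rte_eth_from_ring: %s\n", rte_strerror(rte_errno));
            return port;
    }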
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch removes the MAC local fault count and MAC remote fault count
from the RX errors. The MAC fault count registers count faults, not
packets, and hence should not be added to packet counters.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Add xstats() functions and statistic strings to virtio PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats() functions and statistic strings.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add implementation of xstats() functions in i40evf PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats functions to the i40e PMD, allowing extended statistics
to be retrieved from the NIC and exposed to DPDK.
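An application-side sketch of reading the new statistics; the prototype
and struct below follow the ethdev xstats API of this release (they were
reworked in later DPDK versions), and the zero-sized query returning the
required count is an assumption stated in the comment:

    #include <stdio.h>
    #include <stdlib.h>
    #include <inttypes.h>
    #include <rte_ethdev.h>

    static void
    dump_xstats(uint8_t port_id)
    {
            /* A zero-sized query is assumed to return the required count. */
            int i, n = rte_eth_xstats_get(port_id, NULL, 0);
            struct rte_eth_xstats *xstats;

            if (n <= 0)
                    return;
            xstats = calloc(n, sizeof(*xstats));
            if (xstats == NULL)
                    return;
            n = rte_eth_xstats_get(port_id, xstats, n);
            for (i = 0; i < n; i++)
                    printf("%s: %" PRIu64 "\n", xstats[i].name, xstats[i].value);
            free(xstats);
    }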
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats() functions and stat strings as necessary to ixgbevf PMD.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add and update statistic strings used by xstats_get(), and expose
extended queue statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Add xstats_get() and xstats_reset() functions to igb
driver, and the necessary strings to expose these
NIC statistics.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
The simple rx/tx functions are chosen when mergeable RX is disabled and
the user specifies single-segment and no offload support.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Bulk free mbufs when cleaning the used ring.
The shift operation on idx could be saved if vq_free_cnt meant free slots
rather than free descriptors.
TODO: rearrange the vq data structure and pack the stats variables
together so that one vector instruction could update all of them.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
With the fixed avail ring, we don't need to get the descriptor index from
the avail ring; the virtio driver only has to deal with the descriptor
ring. This patch uses vector instructions to accelerate processing of the
descriptor ring.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Add a software RX ring in the virtqueue.
Add fake_mbuf in the virtqueue for wraparound processing.
Fill the avail ring with blank mbufs in virtio_dev_vring_start.
Add the virtio_rxtx.h header file for RTE_VIRTIO_PMD_MAX_BURST.
All rx/tx related declarations will be moved into this header file in the
future.
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
In a DPDK based switching environment, vhost mostly runs on a dedicated
core while virtio processing in guest VMs runs on different cores.
Take RX as an example. With the generic implementation, for each guest buffer:
a) the virtio driver allocates a descriptor from the free descriptor list
b) it modifies the avail ring entry to point to the allocated descriptor
c) after the packet is received, it frees the descriptor
When vhost fetches the avail ring, it needs to fetch the modified L1 cache
line from the virtio core, which is a heavy cost on current CPU
implementations.
The idea of this optimization is: allocate a fixed descriptor for each
entry of the avail ring, so the avail ring always stays the same during
the run.
This removes the L1 cache line transfer of the avail ring from the virtio
core to the vhost core.
(Note that the cache transfer for descriptors cannot be avoided.)
Besides, the descriptor allocation and free operations are eliminated.
This also makes vector processing possible, to further accelerate the
processing.
This is the layout of the avail ring (taking 256 ring entries as an
example), with each entry pointing to the descriptor with the same index.
avail
idx
+
|
+----+----+---+-------------+------+
| 0 | 1 | 2 | ... | 254 | 255 | avail ring
+-+--+-+--+-+-+---------+---+--+---+
| | | | | |
| | | | | |
v v v | v v
+-+--+-+--+-+-+---------+---+--+---+
| 0 | 1 | 2 | ... | 254 | 255 | desc ring
+----+----+---+-------------+------+
|
|
+----+----+---+-------------+------+
| 0 | 1 | 2 | | 254 | 255 | used ring
+----+----+---+-------------+------+
|
+
This is the ring layout for TX.
As we need one virtio header for each xmit packet, we have 128 slots available.
++
||
||
+-----+-----+-----+--------------+------+------+------+
| 0 | 1 | ... | 127 || 128 | 129 | ... | 255 | avail ring
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| | | || | | |
v v v || v v v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 128 | 129 | ... | 255 || 128 | 129 | ... | 255 | desc ring for virtio_net_hdr
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| | | || | | |
v v v || v v v
+--+--+--+--+-----+---+------+---+--+---+------+--+---+
| 0 | 1 | ... | 127 || 0 | 1 | ... | 127 | desc ring for tx data
+-----+-----+-----+--------------+------+------+------+
||
||
++
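The core of the idea can be sketched as follows (simplified, locally
defined types; an illustration only, not the actual patch):

    #include <stdint.h>

    /* With a fixed avail ring, entry i permanently points at descriptor i,
     * so the ring content never changes and the avail ring cache line is
     * not bounced between the virtio and vhost cores. */
    struct avail_ring {
            uint16_t flags;
            uint16_t idx;
            uint16_t ring[];
    };

    static void
    fix_avail_ring(struct avail_ring *avail, uint16_t vq_size)
    {
            uint16_t i;

            for (i = 0; i < vq_size; i++)
                    avail->ring[i] = i;     /* set once, never modified again */
    }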
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
The Vector RX function processes 4 packets at a time. When the RX ring
wraps to the tail and the number of remaining descriptors is not a
multiple of 4, SW will overwrite memory that does not belong to it and
cause a crash. The fix allocates 4 additional HW/SW ring entries at the
tail to avoid the overwrite.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add the function fm10k_set_tx_function to select the best TX function in
fm10k_dev_tx_init.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Vector TX uses a different way to manage the TX queue, so different
functions are needed to reset the TX queue and release the mbufs in it.
Introduce 2 function pointers to do these operations.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Since Vector RX uses different variables to track the RX HW ring, a
different function is needed to release the mbufs properly.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
This patch adds the functions below:
1. fm10k_rxq_rearm to re-allocate mbufs for used descriptors in the RX HW
ring.
2. Two functions that use SSE instructions to parse RX descriptors and
fill pkt_type and ol_flags in the mbuf.
3. fm10k_recv_raw_pkts_vec to parse raw packets, including possible
chained packets.
4. fm10k_recv_pkts_vec to receive single-mbuf packets.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add new function fm10k_params_init to initialize all fm10k related
variables.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add a condition check in the rx_queue_setup function. If the number of RX
descriptors cannot satisfy the vPMD requirement, record it in a variable;
otherwise call fm10k_rxq_vec_setup to initialize Vector RX.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add new file fm10k_rxtx_vec.c and add it into compiling.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
According to XL710 datasheet:
RX QLEN restrictions: When the PXE_MODE flag in the GLLAN_RCTL_0
register is cleared, the QLEN must be whole number of 32
descriptors.
TX QLEN restrictions: When the PXE_MODE flag in the GLLAN_RCTL_0
register is cleared, the QLEN must be whole number of 32
descriptors.
So make sure that for both RX and TX queues the number of HW descriptors
is a multiple of 32.
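A hedged sketch of the kind of check this implies (the macro name is
hypothetical, and the driver may equally round the value instead of
rejecting it):

    #include <stdint.h>
    #include <errno.h>

    #define ALIGN_RING_DESC 32  /* hypothetical name for the 32-descriptor granularity */

    static int
    check_ring_size(uint16_t nb_desc)
    {
            if (nb_desc % ALIGN_RING_DESC != 0)
                    return -EINVAL; /* QLEN must be a whole multiple of 32 */
            return 0;
    }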
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
This patch fills the bonding port's stats with the sum of the
corresponding values taken from the bonded slaves when stats are
requested for the bonding port.
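A simplified sketch of the idea (not the bonding PMD code itself):

    #include <string.h>
    #include <rte_ethdev.h>

    /* The bonding port's stats are the sum of the corresponding slave stats. */
    static void
    bond_sum_stats(const uint8_t *slave_ports, uint8_t nb_slaves,
                   struct rte_eth_stats *out)
    {
            struct rte_eth_stats s;
            uint8_t i;

            memset(out, 0, sizeof(*out));
            for (i = 0; i < nb_slaves; i++) {
                    rte_eth_stats_get(slave_ports[i], &s);
                    out->ipackets += s.ipackets;
                    out->opackets += s.opackets;
                    out->ibytes   += s.ibytes;
                    out->obytes   += s.obytes;
                    out->ierrors  += s.ierrors;
                    out->oerrors  += s.oerrors;
            }
    }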
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
This implementation allows setting and reading the RSS configuration of
the null device, and is used to validate correct propagation of values to
the slaves in the unit tests for dynamic RSS configuration of bonding.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
This patch adds the possibility to configure more than one queue on the
null device.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
This patch initializes the eth_dev->link_intr_cbs queue used when the
null PMD is added to a bonding device.
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tetsuya Mukawa <mukawa@igel.co.jp>
The bonding device implements independent management of RSS settings. It
stores its own copies of the settings, i.e. RETA, RSS hash function and
RSS key. This is required to ensure consistency.
1) The RSS hash function set for the bonding device is the maximal set of
RSS hash functions supported by all bonded devices. That means that, to
have RSS support for bonding, all slaves should be RSS-capable.
2) The RSS key is propagated to the slaves "as is".
3) The RETA for bonding is an internal table managed by the bonding API
and used as a pattern to set up the slaves. Its size is the GCD of all
RETA sizes, so it can easily be used as a pattern providing the expected
behavior even if the slaves' RETA sizes differ.
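For illustration, the bonding RETA size described in 3) can be computed
as below (helper names are ours, not the bonding code's):

    #include <stdint.h>

    static uint16_t
    gcd16(uint16_t a, uint16_t b)
    {
            while (b != 0) {
                    uint16_t t = b;

                    b = a % b;
                    a = t;
            }
            return a;
    }

    /* GCD of all slave RETA sizes: the resulting pattern divides evenly
     * into every slave's table. */
    static uint16_t
    bond_reta_size(const uint16_t *slave_reta_sizes, unsigned int nb_slaves)
    {
            uint16_t size = slave_reta_sizes[0];
            unsigned int i;

            for (i = 1; i < nb_slaves; i++)
                    size = gcd16(size, slave_reta_sizes[i]);
            return size;
    }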
Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Add separate functions to configure VMDQ and RSS.
Update dglort map and logic ports accordingly.
Reset MAC/VLAN filter after VMDQ config was changed.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
This patch does the following for the fm10k MAC/VLAN filter:
- Add separate functions for VMDQ and main VSI to change
MAC filter.
- Disable modification to VLAN filter in VMDQ mode.
- In device close phase, delete logic ports to remove all
MAC/VLAN filters belonging to those ports.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Add multi-queue checking in device configure function.
Currently, VMDQ and RSS are supported.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
This patch adds one new API to get DCB related info:
rte_eth_dev_get_dcb_info
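A hedged usage sketch; only the nb_tcs field is shown since the full
rte_eth_dcb_info layout is release-specific:

    #include <stdio.h>
    #include <string.h>
    #include <rte_ethdev.h>

    static void
    show_dcb_info(uint8_t port_id)
    {
            struct rte_eth_dcb_info dcb_info;

            memset(&dcb_info, 0, sizeof(dcb_info));
            if (rte_eth_dev_get_dcb_info(port_id, &dcb_info) == 0)
                    printf("port %u: %u traffic classes\n",
                           (unsigned)port_id, (unsigned)dcb_info.nb_tcs);
    }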
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch enables the DCB+RSS multi-queue mode, and also fixes some
coding style issues.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch enables DCB feature on Intel XL710/X710 NICs. It includes:
Receive queue classification based on traffic class
Round Robin ETS schedule (rx and tx)
Priority flow control
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Different NICs have their own specific constraints on the multi-queue
configuration, so move the checking from the ethdev lib to the drivers.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Add interrupts handler for port status notification.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Seen with GCC < 4.6:
error: unknown field ‘tcp_udp’ specified in initializer
error: extra brace group at end of initializer
Static initialization of anonymous structs/unions is a C11 feature
properly supported only since GCC 4.6.
Work around compilation errors with older versions by expanding
struct ibv_exp_flow_spec into struct hash_rxq_init.
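A self-contained illustration of the problem and the workaround (types
and field names are made up, not the mlx5 definitions):

    #include <stdio.h>

    /* With an anonymous union member, a designated initializer such as
     * ".flow.tcp_udp = { ... }" is a C11 construct that GCC < 4.6 rejects.
     * Expanding the union members directly into the outer struct avoids
     * the construct entirely. */
    struct spec_expanded {
            int type;
            int tcp_udp_port;       /* expanded instead of union { ... } flow; */
            int ipv4_addr;
    };

    static const struct spec_expanded init = {
            .type = 1,
            .tcp_udp_port = 80,     /* plain designated initializer, fine on old GCC */
    };

    int
    main(void)
    {
            printf("%d %d\n", init.type, init.tcp_udp_port);
            return 0;
    }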
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Normal flows do not currently provide IPv6 support.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Only a single flow per hash RX queue is needed in promiscuous mode.
Disable others to free up hardware resources.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
DPDK expects to have an RSS hash key per flow type (IPv4, IPv6, UDPv4,
etc.). To handle this, the PMD must keep a table of hash keys to be able
to reconfigure the queues at each start/stop call.
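A data-structure sketch of what such a table could look like (names and
the 40-byte key length are assumptions, not the mlx5 definitions):

    #include <stdint.h>

    enum hash_rxq_type {
            HASH_RXQ_TCPV4,
            HASH_RXQ_UDPV4,
            HASH_RXQ_IPV4,
            HASH_RXQ_MAX,
    };

    /* One RSS key kept per hash RX queue type so the queues can be
     * reconfigured identically at every start/stop cycle. */
    struct rss_conf_table {
            uint8_t key[HASH_RXQ_MAX][40];
            uint8_t key_len[HASH_RXQ_MAX];
    };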
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
First implementation of rss_hash_update and rss_hash_conf_get. These
functions are still limited in functionality but can be used to change
the RSS hash key. For now, the PMD does not handle an indirection table
for each kind of flow (IPv4, IPv6, etc.); the same RSS hash key is used
for all protocols. This explains why rss_hash_conf_get returns the RSS
hash key for all DPDK supported protocols and why the hash key is set for
all of them too.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Promiscuous and allmulticast modes were historically enabled by adding
specific flows with types IBV_FLOW_ATTR_ALL_DEFAULT or
IBV_EXP_FLOW_ATTR_MC_DEFAULT to each hash RX queue, but this method is
deprecated.
- Promiscuous mode is now enabled by omitting destination MAC addresses from
basic flow specifications.
- Allmulticast mode is now enabled by using flow specifications that match
the broadcast bit in destination MAC addresses.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
All hash RX QPs currently use the same flow steering rule (L2 MAC filtering)
regardless of their type (TCP, UDP, IPv4, IPv6), which prevents them from
being dispatched properly. This is fixed by adding flow information to the
hash RX queue initialization data and generating specific flow steering
rules for each of them.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Use the maximum size of the indirection table when the number of
requested RX queues is not a power of two; this helps improve RSS
balancing.
A message informs users that balancing is not optimal in such cases.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
The default hash RX queue handles packets that are not matched by more
specific types and requires its own indirection table of size 1 to work
properly.
This commit implements support for multiple indirection tables by grouping
their layout and properties in a static initialization table.
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
The new Verbs RSS API is lower-level than the previous one and much more
flexible but requires RX queues to use Work Queues (WQs) internally instead
of Queue Pairs (QPs), which are grouped in an indirection table used by a
new kind of hash RX QPs.
Hash RX QPs and the indirection table together replace the parent RSS QP
while WQs are mostly similar to child QPs.
RSS hash key is not configurable yet.
Summary of changes:
- Individual DPDK RX queues do not store flow properties anymore; this
  info is now part of the hash RX queues.
- All functions affecting the parent queue when RSS is enabled or the basic
queues otherwise are modified to affect hash RX queues instead.
- Hash RX queues are also used when a single DPDK RX queue is configured (no
RSS) to remove that special case.
- Hash RX queues and the indirection table are created/destroyed when the
  device is started/stopped, in addition to creating/destroying flows.
- Contrary to QPs, WQs are moved to the "ready" state before posting RX
buffers, otherwise they are ignored.
- Resource domain information is added to WQs for better performance.
- CQs are not resized anymore when switching between non-SG and SG modes as
it does not work correctly with WQs. Use the largest possible size
instead, since CQ size does not have to be the same as the number of
elements in the RX queue. This also applies to the maximum number of
outstanding WRs in a WQ (max_recv_wr).
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Removing this structure reduces the size of SG and non-SG RX queue elements
significantly to improve performance.
A nice side effect is that the mbuf pointer is now fully stored in
struct rxq_elt instead of relying on the WR ID data offset hack.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Olga Shern <olgas@mellanox.com>
Signed-off-by: Or Ami <ora@mellanox.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit updates mlx5_rx_burst_sp() to use the fast verbs interface for
posting RX buffers just like mlx5_rx_burst(). Doing so avoids a loop in
libmlx5 and an indirect function call through libibverbs.
Note: recv_sg_list() is not implemented in the QP burst API, this commit is
only to prepare transition to the WQ-based API.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
A port is considered inactive when the related netdevice is down. There
is no reason to warn about it (and confuse users) since it is
automatically brought up later by the PMD.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Prefetching initial bytes of mbuf structures earlier and in two cache lines
instead of one improves performance of mlx4_rx_burst(), which accesses the
mbuf->next field not present in the first 128 bytes.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This is the same implementation as mlx4.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
All MAC RX flows must be updated with VLAN information when configuring a
VLAN filter.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Like most other device control operations, those are handled by the related
kernel network device through syscalls.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Link information is retrieved using ethtool ioctls.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Depending on the MTU and whether jumbo frames are enabled, RX queues may
switch between SG and non-SG modes for better performance.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
A dedicated RX callback is added to handle scattered buffers. For better
performance, it is only used when jumbo frames are enabled and MTU is larger
than a single mbuf.
On the TX path, scattered buffers are also handled in a separate function.
When there are more than MLX5_PMD_SGE_WR_N segments in a given mbuf, the
remaining segments are linearized in the last SGE entry.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
This commit adds the remaining missing callbacks to make mlx5 usable.
Like mlx4, device start and stop are implemented on top of MAC RX flows.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Francesco Santoro <francesco.santoro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
This commit adds support for MAC flow steering rules mandatory for the RX
path as well as the related callbacks to add/remove MAC addresses.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
RSS implementation with parent/child QPs comes from mlx4 and is temporary.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
In its current state, this driver implements the bare minimum to initialize
itself and Mellanox ConnectX-4 adapters without doing anything else
(no RX/TX for instance). It is disabled by default since it is based on the
mlx4 driver and also depends on libibverbs.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Or Ami <ora@mellanox.com>
For every packet sent, a completion was being requested and the
posted_index register on the NIC was being updated. Instead, request a
completion and update the posted index only once per burst, after all
packets have been sent by the burst function.
Signed-off-by: John Daley <johndale@cisco.com>
Acked-by: Sujith Sankar <ssujith@cisco.com>
Let the FVL (i40e) PMD select which RX and TX functions should be used
according to the queue configuration.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Support multiple segments per packet for the case where the received
packet exceeds one buffer size.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Increase the performance of the vPMD TX by using a faster mbuf release
method than the scalar TX path.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
The vPMD RX function uses multi-buffer and SSE instructions to accelerate
RX, but packet type parsing is not supported by the vPMD RX for now
because it would decrease performance heavily.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Add Rx/Tx flow control support for i40e.
All Rx/Tx LFC enable/disable operations are done by the firmware, so the
PMD needs to use the Set PHY Config AQ command to trigger the PHY to do
the auto-negotiation; after the Tx/Rx pause ability is negotiated, the
firmware sets the related LFC enable/disable registers.
The PMD also needs to configure the related registers to control how
often the pause frame is sent and what value it carries.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
If DPDK is used on a VF while the host uses the Linux kernel driver as
the PF driver on an FVL NIC, VF Rx is reported only in batches of 4
packets. This is because the kernel driver assumes the VF driver works in
interrupt mode, while the DPDK VF works in polling mode.
This patch fixes the issue by using the V1.1 virtual channel with the
Linux i40e PF driver.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
This patch adds a workaround to drop flow control frames from being
transmitted from VSIs.
With this patch in place a malicious VF cannot send flow control or PFC
packets out on the wire.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Boulder Rapid is a new Intel NIC within the fm10k family.
This patch makes the DPDK driver support this new NIC.
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
This patch enables the fm10k TSO feature for both non-tunneling and
tunneling packets.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
When the datapath rxq interrupt is enabled, enable the related device
rxq. Remove the interrupt handler after the device is stopped.
e1000 only supports one interrupt cause type, so remove the lsc interrupt
handler if rxq interrupts are enabled.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Enable RX interrupt support on e1000 physical and emulated devices.
Implement the rxq interrupt related functions in the eth_dev_ops
structure.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Separate lsc and rxq interrupts since they have different interrupt handlers.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Only mask the lsc interrupt bit when setting up the device interrupt.
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Acked-by: Cunming Liang <cunming.liang@intel.com>
Implement the new CLIs for the fdir MAC VLAN and tunnel modes, including
flow_director_filter and flow_director_mask. Set the fdir mask.
Add, delete or update filter entries.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch marks rxq with RTE_SET_USED in
rx_desc_hlen_type_rss_to_pkt_flags(), when
ieee1588 is disabled. Previously a compilation
error occurred on unused-parameter.
Fixes: 1ce6591e23 ("igb: fix ieee1588 frame identification in i210")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch implements the VF RSS reta/hash query and update functions on
10G NICs, but the update function is only provided for x550. Because the
other NICs don't have separate registers for the VF, we don't want to let
a VF change the shared RSS reta/hash registers, as it may change the
behavior of the PF and other VFs without notice.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
On x550 there are separate registers provided for VF RSS, while on the
other 10G NICs, for example 82599, VF and PF share the same registers.
This patch lets x550 use the VF-specific registers when doing RSS
configuration on the VF. The behavior of other 10G NICs doesn't change.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Compared with older NICs, x550's RSS redirection table is enlarged to 512
entries. As the original code targets NICs with a 128-entry RSS table,
only part of the RSS table is set on x550, so RSS does not work as
expected and does not redirect packets evenly.
This patch configures the entries beyond 128 on x550 to make RSS work
properly, and also updates the query and update functions to support 512
entries.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch will drop flow control frames from being transmitted
from VSIs.
With this patch in place a malicious VF cannot send flow control
or PFC packets out on the wire.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
At the original point the rx_pkts[pos( + n)] pointers are not initialized,
so the code is prefetching random data.
Signed-off-by: Zoltan Kiss <zoltan.kiss@linaro.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
An incorrect operator in ixgbe_get_eeprom & ixgbe_set_eeprom prevents the
last byte of the EEPROM from being read/written, so the EEPROM cannot be
dumped or updated in its entirety using these functions.
Fixes: 0198848a47 ("ixgbe: add access to specific device info")
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
This patch adds the mspdc (MAC Short Packet Discard Count)
to the total rx errors, as discussed on the dev@dpdk mailing
list: http://comments.gmane.org/gmane.comp.networking.dpdk.devel/23717
Suggested-by: Igor Ryzhov <iryzhov@arccn.ru>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Remove receive jabber count (rjc) from ierrors count as the
register overlaps with the CRC error register, previously
causing some packets to be counted twice.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Ixgbe-based 82598 and 82599 have different priority receive link-on
register addresses. This is solved in base/ by providing the PXONRXC and
PXONXCNT as separate macros. This patch ensures the correct address is
read, avoiding reading garbage values.
Also, PXON2OFFCNT doesn't exist on 82598, so it is not read for that MAC.
This issue has existed since the drivers were imported into DPDK,
but was not easily discoverable as xstats were not available.
Tested using testpmd> show port xstats all
Fixes: af75078fec ("first public release")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
When setting TSO on ixgbe VF NICs, for example 82599 or x550, a message
is printed saying that TSO is not supported, but TSO is in fact supported
by ixgbe VF NICs.
Add TSO to the capability flags so the wrong message is no longer
printed.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Fix a misinterpretation of a VF statistic macro in e1000/igb.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Roger Melton <rmelton@cisco.com>
Fixed an issue where the flag PKT_RX_IEEE1588_PTP was not being set on
the Intel I210 NIC, as the EtherType in the RX descriptor is in bits 8:10
of the Packet Type field and not in the default bits 0:2.
Fixes the known issue "IEEE1588 support possibly not working with an
Intel Ethernet Controller I210 NIC".
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch enables the igb TSO feature; it works on both PF and VF.
TCP segmentation offload needs to write the offload related information
into the advanced context descriptors, which is similar to checksum
offload.
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch fixes a bug in the 64-bit register reading which caused the
total octets counters to show zero. Now the code reads both the lower and
upper 32 bits.
Tested in testpmd; byte values are correct.
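A generic sketch of the fix (register accessor and offsets are
placeholders, not the e1000 macros):

    #include <stdint.h>

    /* A 64-bit counter exposed as two 32-bit registers: read the low half
     * first, then the high half, and combine them. */
    static uint64_t
    read_counter64(uint32_t (*read_reg)(uint32_t off),
                   uint32_t lo_off, uint32_t hi_off)
    {
            uint64_t lo = read_reg(lo_off);
            uint64_t hi = read_reg(hi_off);

            return lo | (hi << 32);
    }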
Fixes: 805803445a ("e1000: support EM devices (also known as e1000/e1000e)")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
On power up, the MAC-PHY interface needs to be set to PCIe, even if the
cable is disconnected. On ME systems, the ME handles this on exit from
the Sx (sticky mode) state. On non-ME systems, the driver handles it.
Added a check for non-ME systems to the driver code that handles that.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Previously, in check_reset_block RSPCIPHY was polled for 100 ms before determining
that the ME veto is set. This needed to be increased to 300 ms.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Enabling ULP on link down when the cable is connected caused an infinite
loop of link up/down indications in the NDIS driver.
After discussion, the correct flow is to enable ULP only when the cable
is disconnected.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Requires driver changes!
Change e1000_set_eee_i350 and e1000_set_eee_i354 to have flags allowing
changes in the advertised EEE speeds.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Some images contain the ETrackID in an inverted format. This patch allows
reading this format.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
EEARBC has changed on i210. It means EEARBC has a different address on
i210 than on other NICs. So, add a new entity named EEARBC_I210 to the
register list and make sure the right one is being used on i210.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
These are the defaults for the packet buffer size registers that need to
be explicitly set back if someone changes them and comes back to a normal
driver.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch is for the following updates to the K1 configurations:
Tx idle period for entering K1 should be 128 ns.
Minimum Tx idle period in K1 should be 256 ns.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
In case auto-negotiation is not enabled, call
e1000_setup_copper_link_generic instead of e1000_phy_setup_autoneg.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
e1000_check_for_link_media_swap() is supposed to check PHY page 0 for
copper and PHY page 1 for "other" (fiber) link. We switched back from
page 1 to page 0 too soon, before e1000_check_for_link_82575() was
executed, and so we never found link on fiber (other).
Note: The precedence of link type is controlled by the PHY settings.
If the link is copper, as the M88E1112 page address is set to 1, it should be
set back to 0 before checking this link.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Fix for I217 Packet Loss issue - The Management Engine sets the FEXTNVM4
Beacon Duration incorrectly. This fix ensures that the correct value will
always be set. Correct value for this field is 8 usec.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The TIPG value is increased when setting the speed to 10 half-duplex to
prevent packet loss. However, it was never decreased again when the speed
changed, which caused performance issues in the NDIS driver.
Fix this by restoring TIPG to its default value when not at 10
half-duplex.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
For DH89XXCC_SGMII, a write flush leaves the registers of this device
trashed (0xFFFFFFFF). Added a check for this device.
Also, for both the Port SW Reset and Device Reset cases, the platform
should wait at least 3 ms before reading any registers. Since the wait
was conditionally executed only for Device Reset, the condition is
removed.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The i210 has two EEPROM access registers that are located in
non-standard offsets: EEARBC and EEMNGCTL. EEARBC was fixed previously
and EEMNGCTL should also be corrected.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
The wrong bit is being used in PHYREG16 for PHY power down. In addition,
the use of PHYREG 16 is unnecessary if bit 11 of PHYREG 0 is used.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
This patch changes the value of register 776.20[11:2] for jumbo mode from
0x1A to 0x1F, to enlarge the gap between the read and write pointers in
the TX FIFO.
It also replaces the magic number with a macro.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Several customers have reported a link flap issue on 82579. The symptoms
are random and intermittent link losses when 82579 is connected to specific
switches. The issue has been root-caused as an interoperability problem
between the NIC and at least some Broadcom PHYs in the Energy Efficient
Ethernet wake mechanism.
To fix the issue, we disable the Phase Locked Loop shutdown in 100M Low
Power Idle. This solution causes an increase in power on a 100M EEE link;
it may cost an additional 28 mW in this specific mode.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add a u32 return value to e1000_resume_workarounds_pchlan, so that the
calling function can detect PHY access failures during the resume flow.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Add code where missing to handle the case where calls to
e1000_read_kmrn_reg_80003es2lan and e1000_write_kmrn_reg_80003es2lan
return an error value.
Also, when accessing the E1000_KMRNCTRLSTA_INBAND_PARAM offset to disable
far-end loopback on 80003es2lan devices, make the handling of a read or
write failure consistent between hw_init and hw_reset.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>