Commit Graph

669 Commits

Author SHA1 Message Date
Hyong Youb Kim
d16623dd39 net/enic: revert mbuf fast free offload
This reverts the patch that enabled mbuf fast free.

There are two main reasons.

First, enic_fast_free_wq_bufs is broken. When
DEV_TX_OFFLOAD_MBUF_FAST_FREE is enabled, the driver calls this
function to free transmitted mbufs. This function currently does not
reset next and nb_segs. This is simply wrong as the fast-free flag
does not imply anything about next and nb_segs.

We could fix enic_fast_free_wq_bufs by making it to call
rte_pktmbuf_prefree_seg to reset the required fields. But, it negates
most of cycle saving.

Second, there are customer applications that blindly enable all Tx
offloads supported by the device. Some of these applications do not
satisfy the requirements of mbuf fast free (i.e. a single pool per
queue and refcnt = 1), and end up crashing or behaving badly.

Fixes: bcaa54c1a1 ("net/enic: support mbuf fast free offload")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2018-08-02 10:26:02 +02:00
Stephen Hemminger
2be90f7915 doc: fix typo in vdev_netvsc guide
Fixes: 56252de779 ("net/vdev_netvsc: add automatic probing")
Cc: stable@dpdk.org

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-07-26 22:56:51 +02:00
Adrien Mazarguil
20b71e92ef net/mlx5: lay groundwork for switch offloads
With mlx5, unlike normal flow rules implemented through Verbs for traffic
emitted and received by the application, those targeting different logical
ports of the device (VF representors for instance) are offloaded at the
switch level and must be configured through Netlink (TC interface).

This patch adds preliminary support to manage such flow rules through the
flow API (rte_flow).

Instead of rewriting tons of Netlink helpers and as previously suggested by
Stephen [1], this patch introduces a new dependency to libmnl [2]
(LGPL-2.1) when compiling mlx5.

[1] https://mails.dpdk.org/archives/dev/2018-March/092676.html
[2] https://netfilter.org/projects/libmnl/

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-26 14:05:52 +02:00
Hyong Youb Kim
dd35c0d693 doc: update enic guide and features
Make a few updates in preparation for 18.08.
- Use SPDX
- Add 1400 series VIC adapters to supported models
- Describe the VXLAN port number
- Expand the description for ig-vlan-rewrite
- Add inner RSS and checksum to the features

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2018-07-23 23:55:26 +02:00
Rahul Lakkireddy
174e54baf9 net/cxgbe: update release notes for flow API support
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2018-07-23 23:55:26 +02:00
Stephen Hemminger
beff6d8e8e net/netvsc: add documentation
Matching documentation for new netvsc device.
Includes a brief note about the restart issue.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
2018-07-13 23:48:07 +02:00
Moti Haimovsky
6bf10ab69b net/mlx5: support 32-bit systems
This patch adds support for building and running mlx5 PMD on
32bit systems such as i686.

The main issue to tackle was handling the 32bit access to the UAR
as quoted from the mlx5 PRM:
QP and CQ DoorBells require 64-bit writes. For best performance, it
is recommended to execute the QP/CQ DoorBell as a single 64-bit write
operation. For platforms that do not support 64 bit writes, it is
possible to issue the 64 bits DoorBells through two consecutive
writes,
each write 32 bits, as described below:
* The order of writing each of the Dwords is from lower to upper
  addresses.
* No other DoorBell can be rung (or even start ringing) in the midst
 of an on-going write of a DoorBell over a given UAR page.

The last rule implies that in a multi-threaded environment, the access
to a UAR page (which can be accessible by all threads in the process)
must be synchronized (for example, using a semaphore) unless an atomic
write of 64 bits in a single bus operation is guaranteed. Such a
synchronization is not required for when ringing DoorBells on different
UAR pages.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-07-12 14:34:59 +02:00
Adrien Mazarguil
6de569f5ec net/mlx5: add parameter for port representors
Prior to this patch, all port representors detected on a given device were
probed and Ethernet devices instantiated for each of them.

This patch adds support for the standard "representor" parameter, which
implies that port representors are not probed by default anymore, except
for the list provided through device arguments.

(Patch based on prior work from Yuanhan Liu)

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Reviewed-by: Xueming Li <xuemingl@mellanox.com>
2018-07-11 15:37:29 +02:00
Moti Haimovsky
ba576975a8 net/mlx4: support hardware TSO
Implement support for hardware TSO.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
2018-07-10 14:02:57 +02:00
Ferruh Yigit
b8a09c2e27 doc: update CRC feature with new offload flag
Fixes: 3d12dceed2df ("ethdev: add new offload flag to keep CRC")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-07-05 16:10:13 +02:00
Ferruh Yigit
ab3ce1e0c1 ethdev: remove old offload API
In DPDK 17.11, the ethdev offloads API has changed:
	commit cba7f53b71 ("ethdev: introduce Tx queue offloads API")
	commit ce17eddefc ("ethdev: introduce Rx queue offloads API")
The new API is documented in the programmer's guide:
	http://doc.dpdk.org/guides/prog_guide/poll_mode_drv.html#hardware-offload

For reminder, the main concepts in the new API were:
	- All offloads are disabled by default
	- Distinction between per port and per queue offloads.

The transition bits are now removed:
	- Translation of the old API in ethdev
	- rte_eth_conf.rxmode.ignore_offload_bitfield
	- ETH_TXQ_FLAGS_IGNORE

The old API bits are now removed:
	- Rx per-port rte_eth_conf.rxmode.[bit-fields]
	- Tx per-queue rte_eth_txconf.txq_flags
	- ETH_TXQ_FLAGS_NO*

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Shahaf Shuler <shahafs@mellanox.com>
2018-07-04 21:50:32 +02:00
Ido Goshen
53bf484034 net/pcap: capture only ingress packets from Rx iface
Support rx of in direction packets only
Useful for apps that also tx to eth_pcap ports in order to not see them
echoed back in as rx when out direction is also captured

Example:
In case using rx_iface and sending *single* packet to eth1
it will loop forever as the when it is sent to tx_iface=eth1
it will be captured again on the rx_iface=eth1 and so on
  $RTE_TARGET/app/testpmd l 0-3 -n 4 \
	--vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1'
  …
  ---------------------- Forward statistics for port 0  ------------
  RX-packets: 758            RX-dropped: 0             RX-total: 758
  TX-packets: 758            TX-dropped: 0             TX-total: 758
  ------------------------------------------------------------------
While if using rx_iface_in it will not be captured on the way out and
be forwarded only once
  $RTE_TARGET/app/testpmd l 0-3 -n 4 \
	--vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1'
  …
  ---------------------- Forward statistics for port 0  ------------
  RX-packets: 1              RX-dropped: 0             RX-total: 1
  TX-packets: 1              TX-dropped: 0             TX-total: 1
  ------------------------------------------------------------------

Signed-off-by: Ido Goshen <ido@cgstowernetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-07-04 20:54:05 +02:00
Hyong Youb Kim
bcaa54c1a1 net/enic: support mbuf fast free offload
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2018-07-03 01:54:20 +02:00
Hyong Youb Kim
e39c2756e2 net/enic: add devarg to specify ingress VLAN rewrite mode
Add a new devarg "ig-vlan-rewrite" to allow the user to set
non-default rewrite mode. The UCS VIC may add/remove/modify the VLAN
header of an ingress packet depending on the ingress VLAN rewrite
mode.

By default, the driver sets the pass-through mode, which tells the NIC
"do not touch VLAN header and preserve it as is". This mode is usually
sufficient, but can complicate deployments for certain environments.
For example, OVS-DPDK in UCS blade environments may want to use "untag
default VLAN mode", which removes the VLAN header from an ingress
packet if it matches vNIC's default VLAN.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
2018-07-03 01:54:15 +02:00
Wei Zhao
b7102e98fa net/fm10k: support descriptor status API
rte_eth_rx_descritpr_status and rte_eth_tx_descriptor_status
are supported by fm10K.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2018-07-03 01:35:58 +02:00
Qi Zhang
864a800d70 net/i40e: remove VF interrupt handler
For i40evf, internal rx interrupt and adminq interrupt share the same
source, that cause a lot cpu cycles be wasted on interrupt handler
on rx path. This is complained by customers which require low latency
(when set I40E_ITR_INTERVAL to small value), but have to be sufferred by
tremendous interrupts handling that eat significant CPU resources.

The patch disable pci interrupt and remove the interrupt handler,
replace it with a low frequency (50ms) interrupt polling daemon
which is implemented by registering a alarm callback periodly, this
save CPU time significently: On a typical x86 server with 2.1GHz CPU,
with low latency configure (32us) we saw CPU usage from top commmand
reduced from 20% to 0% on management core in testpmd).

Also with the new method we can remove compile option: I40E_ITR_INTERVAL
which is used to balance between low latency and low CPU usage previously.
Now we don't need it since we can reach both at same time.

Suggested-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
2018-07-03 01:35:58 +02:00
Marvin Liu
8f3bd7e870 net/virtio: add in-order Rx/Tx into selection
After IN_ORDER Rx/Tx paths added, need to update Rx/Tx path selection
logic.

Rx path select logic: If IN_ORDER and merge-able are enabled will select
IN_ORDER Rx path. If IN_ORDER is enabled, Rx offload and merge-able are
disabled will select simple Rx path. Otherwise will select normal Rx
path.

Tx path select logic: If IN_ORDER is enabled will select IN_ORDER Tx
path. Otherwise will select default Tx path.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Marvin Liu
488ed97a20 net/virtio-user: add mrg-rxbuf and in-order vdev parameters
Add parameters for configuring VIRTIO_NET_F_MRG_RXBUF and
VIRTIO_F_IN_ORDER feature bits. If feature is disabled, also update
corresponding unsupported feature bit.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-07-03 01:35:58 +02:00
Rasesh Mody
1fc3afdf78 doc: update qede management firmware guide
Fixes: c49a438fce ("doc: update qede guide and features")
Fixes: db86fbe54d ("doc: update qede PMD NIC guide")
Cc: stable@dpdk.org

Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
2018-07-03 01:35:58 +02:00
Yongseok Koh
1787eb7b4d net/mlx5: use stride index in Rx completion entry
Multi-Packet Receive Queue is to receive multiple packets on a single large
buffer. The number of consumed strides in CQE is accumulated to keep track
of the current stride index. However, it is safer to directly use stride
index in CQE to avoid out-of-order situation which can possibly be caused
by introducing LRO in the future.

If Rx CQE compression is enabled, HW can be configured to store the stride
index in a mini-CQE but this will need newer version of library/driver.
Therefore, since this change, MPRQ is only supported with the newer
library/driver and Rx hash result is not supported if MPRQ is enabled along
with Rx CQE compression.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-07-03 01:35:58 +02:00
Rakesh Kudurumalla
279d33199c net/thunderx: add support for hardware first skip feature
This feature is used to create a hole between HEADROOM
and actual data.Size of hole is specified in bytes as
module param to pmd

Signed-off-by: Rakesh Kudurumalla <rkudurumalla@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2018-07-03 01:35:58 +02:00
Xiao Wang
4b614e9504 net/ifc: make driver name consistent
Make the compiler switch name and document name consistent as ``ifc`` to
avoid confusion. Also rename the map file to standard name for meson
build in the process.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
2018-06-14 19:27:50 +02:00
Shagun Agrawal
ee61f5113b net/cxgbe: parse and validate flows
Introduce rte_flow skeleton and implement validate operation.

Parse and convert <item>, <action>, <attributes> into hardware
specification. Perform validation, including basic sanity tests
and underlying device's supported filter capability checks.

Currently add support for:
<item>: IPv4, IPv6, TCP, and UDP.
<action>: Drop, Queue, and Count.

Also add sanity checks to ensure filters are created at specified
index in LE-TCAM region. The index in LE-TCAM region indicates
the filter rule's priority with index 0 having the highest priority.
If no index is specified, filters are created at closest available
free index.

Signed-off-by: Shagun Agrawal <shaguna@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2018-06-14 19:27:50 +02:00
Tiwei Bie
eef1e585f9 doc: convert virtio guide to new offload API
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2018-05-25 17:07:40 +02:00
Rasesh Mody
c49a438fce doc: update qede guide and features
Add LINUX VFIO support to feature list.

Add GRE tunneling offload to supported feature list.

Update firmware references to use newer FW image 8.33.12.0. Update
management FW version as well to 8.34.x.x.

Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
2018-05-25 17:07:40 +02:00
Shahaf Shuler
5a768814a6 doc: drop support of Mellanox OFED 4.2
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-25 17:07:40 +02:00
Shijith Thotton
5aa495f402 doc: remove deprecated terms from liquidio guide
Remove reference to deprecated ethdev offload parameter from liquidio
documentation.

Fixes: dd6aab1671 ("net/liquidio: move to new offload API")

Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com>
2018-05-25 17:07:40 +02:00
Jerin Jacob
8db1caa962 doc: remove deprecated terms from octeontx guide
Remove reference to deprecated ethdev offload parameter from octeontx
documentation.

Fixes: a92870896b ("net/octeontx: use the new offload APIs")
Cc: stable@dpdk.org

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2018-05-25 17:07:40 +02:00
Jerin Jacob
6e2320bec5 doc: remove deprecated terms from thunderx guide
Remove reference to deprecated ethdev offload parameter from thunderx
documentation.

Fixes: c97da2cbc2 ("net/thunderx: convert to new offload API")
Cc: stable@dpdk.org

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
2018-05-25 17:07:40 +02:00
Qi Zhang
ba932ddabd doc: convert fm10k guide to new offload API
Cleanup document to convert old rxmode and txq_flags to new offload
bit mask.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-25 17:07:40 +02:00
Qi Zhang
a19b726531 doc: convert ixgbe guide to new offload API
Cleanup document to convert old rxmode flags to new offload
bit mask.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-25 17:07:40 +02:00
Olivier Matz
f83a3d3fa8 use SPDX tag for 6WIND copyrighted files
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
2018-05-25 10:47:06 +02:00
Qi Zhang
e435197a5f net/ixgbe: remove unnecessary macro
Since we move to new offload APIs, IXGBE_SIMPLE_FLAGS is not used.

Fixes: 51215925a3 ("net/ixgbe: convert to new Tx offloads API")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
2018-05-23 00:35:01 +02:00
Shagun Agrawal
f5b3c7b293 net/cxgbevf: fix inter-VM traffic when physical link down
Add force_link_up devargs to always force link as up for VFs.
This enables VFs on the same NIC to send traffic to each other
even when physical link is down.

Also add RTE_PMD_REGISTER_PARAM_STRING to export all supported
devargs.

Fixes: 011ebc236d ("net/cxgbe: add skeleton VF driver")

Signed-off-by: Shagun Agrawal <shaguna@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
2018-05-23 00:35:01 +02:00
Qiming Yang
aa4cef698c doc: fix spelling in i40e guide
This patch corrects some spelling issues in i40e.rst and clarifies
which controllers and connections are part of the 700 Series.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
2018-05-23 00:35:00 +02:00
Matan Azrad
1f106da2bf net/mlx5: support MPLS-in-GRE and MPLS-in-UDP
Add support for MPLS over GRE and MPLS over UDP tunnel types as
described in the next RFCs:
1. https://tools.ietf.org/html/rfc4023
2. https://tools.ietf.org/html/rfc7510
3. https://tools.ietf.org/html/rfc4385

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-17 12:31:42 +02:00
Shahaf Shuler
dd3331c6f1 net/mlx5: add Bluefield device id
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-05-17 12:31:42 +02:00
Qiming Yang
dbb36bb2a1 doc: add XXV710 support in i40e guide
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
7d6bf6b866 net/mlx5: add Multi-Packet Rx support
Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe
bandwidth by posting a single large buffer for multiple packets. Instead of
posting a buffer per a packet, one large buffer is posted in order to
receive multiple packets on the buffer. A MPRQ buffer consists of multiple
fixed-size strides and each stride receives one packet.

Rx packet is mem-copied to a user-provided mbuf if the size of Rx packet is
comparatively small, or PMD attaches the Rx packet to the mbuf by external
buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external
buffers will be allocated and managed by PMD.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
9797bfcce1 net/mlx4: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx4_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx4_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:52 +01:00
Yongseok Koh
2d684b911d net/mlx4: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
974f1e7ef1 net/mlx5: add new memory region support
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx5_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx5_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Yongseok Koh
d561b5dc13 net/mlx5: remove memory region support
This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:51 +01:00
Ophir Munk
ce07b1514d net/mlx4: fix CRC stripping capability report
There are two capabilities related to CRC stripping:
1. mlx4 HW capability to perform CRC stripping on a received packet.
This capability is built in mlx4 HW. It should be returned by the API
call mlx4_get_rx_queue_offloads().
2. mlx4 driver capability to enable/disable HW CRC stripping. This
capability is dependent on the driver version.

Before this commit the second capability was falsely returned by
the mentioned API. This commit fixes it by returning the first
capability.
mlx4 HW performs CRC stripping by default and this capability is
always reported as "true".

The ability to enable/disable CRC stripping is supported since this
commit and requires OFED version 4.3-1.5.0.0 or rdma-core version v18.
CRC stripping will be done by default regardless of its configuration
when working with OFED or rdma-core versions earlier than those
previously specified or before this commit.

Fixes: de1df14e6e ("net/mlx4: support CRC strip toggling")
Cc: stable@dpdk.org

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
2018-05-14 22:31:51 +01:00
Ophir Munk
8d54ede738 net/tap: report on supported RSS hash functions
Report on TAP supported RSS functions as part of dev_infos_get
callback: ETH_RSS_IP, ETH_RSS_UDP and ETH_RSS_TCP.
Known limitation: TAP supports all of the above hash functions together
and not in partial combinations.
Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations will be
verified.

[1] commit 8863a1fbfc ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2018-05-14 22:31:51 +01:00
John Daley
07ec1ddad5 doc: remove mention of unreleased nics from enic guide
Company policy discourages the mention of unreleased hardware in
guides and release notes.

Fixes: 08df773 ("doc: update enic guide and features")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
2018-05-14 22:31:50 +01:00
Hyong Youb Kim
e53f18e81c doc: update the enic guide and features
Add more descriptions regarding SR-IOV and RSS settings.

Remove 'Multicast MAC filter' and add 'Allmulticast mode' to the
features.

Cc: stable@dpdk.org

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
2018-05-14 22:31:50 +01:00
Xueming Li
c8122e93ac net/mlx5: document update for Tx
Add document for hw header parsing and SWP.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 22:31:49 +01:00
Hemant Agrawal
50245be05d bus/fslmc: support device blacklisting
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2018-05-14 00:35:47 +02:00
Hemant Agrawal
6e0752205b bus/dpaa: support device blacklisting
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
2018-05-14 00:34:25 +02:00