Commit Graph

8077 Commits

Author SHA1 Message Date
Yongseok Koh
97d37d2c1f net/mlx4: remove device register remap
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx4_proc_priv *)rte_eth_devices[port_id].process_private

The actual UAR table follows the data structure and the table is used
for both Tx and Rx.

For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX4_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.

For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
120dc4a7dc net/mlx5: remove device register remap
UAR (User Access Region) register does not need to be remapped for
primary process but it should be remapped only for secondary process.
UAR register table is in the process private structure in
rte_eth_devices[],
(struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private

The actual UAR table follows the data structure and the table is used
for both Tx and Rx.

For Tx, BlueFlame in UAR is used to ring the doorbell.
MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes
access its own private data to acquire the register from the UAR table.

For Rx, the doorbell in UAR is required in arming CQ event. However, it
is a known issue that the register isn't remapped for secondary process.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
d5c900d1dd net/mlx5: remove redundant queue index
Queue index is redundantly stored for both Rx and Tx structures.
E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single
storage - rxq->idx and txq->idx.

Also, rxq and txq are moved to the beginning of its control structure
(rxq_ctrl and txq_ctrl) for cacheline alignment.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
227684feb8 net/mlx5: fix recursive inclusion of header file
mlx5.h includes mlx5_rxtx.h and mlx5_rxtx.h includes mlx5.h recursively.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Qi Zhang
50cc9d2a6e net/ice: fix max frame size
Max frame size setup should consider double VLAN case.

Fixes: ae2bdd0219 ("net/ice: support MTU setting")
Cc: stable@dpdk.org

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
2019-04-12 11:02:02 +02:00
Yongseok Koh
b27669d691 net/mlx4: fix Tx doorbell register unmap
If rdma-core library doesn't support remapping UAR registers, the register
shouldn't be unmapped on device stop.

Fixes: 0203d33a10 ("net/mlx4: support secondary process")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Ori Kam
bbda883ca0 net/mlx5: fix build on Arm
In case of cross compilation on aarch64 we must add include for
stdlib in order to use the free function.

Fixes: cbb66daa3c ("net/mlx5: prepare Direct Verbs for Direct Rule")

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Andrew Rybchenko
bac46c6fef net/sfc: set min and max MTU
Advertise minimum and maximum MTU value in device information.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
701e0bfa84 net/nfp: fix resource leak on errors
Not closing the socket implies a resource leak.

Coverity issue: 336865
Fixes: 29a62d1476 ("net/nfp: add CPP bridge as service")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
8539ec0a40 net/nfp: fix memory leak
Coverity issue: 32806
Fixes: ef28aa96e5 ("net/nfp: support multiprocess")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
318e737d2d net/nfp: fix memory leak
If errors, release the allocated structure.

Coverity issue: 277222
Fixes: c7e9729da6 ("net/nfp: support CPP")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
0fec453d58 net/nfp: check return value
Call to CPP read (nfp_cpp_readl()) can fail, return 0 on fail.

If the call to _nfp6000_cppat_mu_locality fails, the function needs
to return with an error.

If the nfp_cpp_readl() call fails just returns 0.

Coverity issue: 277209, 277215, 277225
Fixes: c7e9729da6 ("net/nfp: support CPP")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
3243f7c0d8 net/nfp: fix potential integer overflow
Coverity issue: 277204
Fixes: defb9a5dd1 ("nfp: introduce driver initialization")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Alejandro Lucero
0c08589d9e net/nfp: fix file descriptor check
Although it is rather unlikely getting 0 as the descriptor handle, better
to contemplate that possibility.

Coverity issue: 195018
Fixes: 896c265ef9 ("net/nfp: use new CPP interface")
Cc: stable@dpdk.org

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
2019-04-12 11:02:02 +02:00
Michal Krawczyk
ef538c1a7f net/ena: fix checksum feature flag
The boolean value was assigned to Tx flag twice, so it could cause bug
whenever Rx checksum will not be supported and Tx will be.

Coverity issue: 336831
Fixes: 117ba4a604 ("net/ena: get device info statically")

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
942d13e6e7 net/mlx5: fix sharing context destroy order
At the mlx5 device closing the shared IB context was destroyed
before cleanup routines completion. As it was found on some
setups (Netlink fails with old kernel drivers and we have to use
sysfs to retrieve interface index, this requires IB device name,
which is stored in shared context) the mlx5_nl_mac_addr_flush()
requires IB device name, and if shared context is removed it
causes the segmentation fault.

Fixes: 17e19bc4dd ("net/mlx5: add IB shared context alloc/free functions")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
9c2bbd0488 net/mlx5: fix device probing for old kernel drivers
Retrieving network interface index via Netlink fails in
case of old ib_core kernel driver installed - mlx5_nl_ifindex()
routine fails due to RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not
supported by the old driver.

The patch allowing to retrieve the network interface index and
name via Netlink [1]. So, the problem depends on ib_core module
version - 4.16 supports getting ifindex via Netlink, 4.15 does not.

This error was ignored in previous versions of MLX5 PMD probing
routine. For single device ifindex was retrieved via sysfs
and link control was not lost, so problem just was not noticed.
In order to support MLX5 PMD functioning over old kernel driver
this patch adds ifindex retrieving via sysfs into probing routine.
It is worth to note this method works for master/standalone
device only.

[1] https://www.spinics.net/lists/linux-rdma/msg62948.html
    Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270)

Fixes: ad74bc6195 ("net/mlx5: support multiport IB device during probing")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Viacheslav Ovsiienko
ae4eb7dc79 net/mlx5: fix typos in comments
Fixes: 299d7dc28c ("net/mlx5: add representor recognition on Linux 5.x")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 11:02:02 +02:00
Jasvinder Singh
c1656328db meter: replace color definitions
This patch implements the changes proposed in the deprecation
note[1]. Replace multiple color definitions in various places such as
rte_meter.h, rte_tm.h and rte_mtr.h with single rte_color defined
in rte_meter.h.

This is simple search and replace exercise without any implementation
change.

[1] https://mails.dpdk.org/archives/dev/2019-January/123861.html

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2019-04-11 14:27:32 +02:00
Andrew Rybchenko
06b186a01d net/sfc: improve Rx free threshold default
Rx refill in one bulk (which is just 8 descriptors) by default is too
aggressive and makes too many MMIO writes (Rx doorbells) if packet rate
is high. Setting default to 1/8 of Rx descriptors number shows good
performance results. Anyway it is a default value which may be
overridden by Rx configuration provided by application.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Qiming Yang
414e31cd6f net/i40e: support VXLAN-GPE classification
Added VXLAN-GPE tunnel filter, supported filter to queue.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-04-05 17:45:22 +02:00
Qiming Yang
9d17021648 net/i40e: support VXLAN-GPE
Add new protocol type VXLAN-GPE support for UDP tunnel.
inner IP/TCP/UDP checksum and RSS configuration shared
the same implementation of VXLAN.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-04-05 17:45:22 +02:00
David Harton
f9409d6c28 net/ixgbevf: remove MTU setting limitation
Currently, if requested MTU is bigger than mbuf size and scattered
receive is not enabled, setting MTU to that value fails.

This patch allows setting this special MTU when device is stopped,
because scattered_rx will be re-configured during next port start
and driver may enable scattered receive according new MTU value.

After this patch, driver may select different receive function
automatically after MTU set, according MTU values selected.

Signed-off-by: David Harton <dharton@cisco.com>
Reviewed-by: Wei Zhao <wei.zhao1@intel.com>
2019-04-05 17:45:22 +02:00
Qi Zhang
9481b0902e net/ice: send driver version to firmware
The driver must send its version information to the firmware, so
the firmware knows the driver is up. Otherwise, it will cause unexpected
OS package downloading when multiple driver instances running on the
same device.

Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
2019-04-05 17:45:22 +02:00
Wei Zhao
646ca3a15e net/ixgbe: enable 10Mb/s link setup for X553
This patch enable 10Mb/s link for ixgbe X553.
This new device has own device id of 0x15E4 and 0x15E5, so
ixgbe PMD driver need to special check when setup link for
these two types of device.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
79e35d0d59 net/mlx5: share Direct Rules/Verbs flow related structures
Direct Rules/Verbs related structures are moved to
the shared context:
  - rx/tx namespaces, shared by master and representors
  - rx/tx flow tables
  - matchers
  - encap/decap action resources
  - flow tags (MARK actions)
  - modify action resources
  - jump tables

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko
b2177648b8 net/mlx5: add Direct Rules flow data alloc/free routines
We are going to share the Direct Rules and Direct Verbs flow
device data structures between master and representors in the
E-Switch configurations over multiport IB device.

The code of initializing and destroying these data is
moved to dedicated routines, this is just a preparation
step for actual data sharing.

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Jasvinder Singh
5024da517d net/softnic: fix unchecked return value
Fix unchecked return value issue reported by Coverity.

Coverity issue: 336852
Fixes: a958a5c07f ("net/softnic: support service cores")
Cc: stable@dpdk.org

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
2019-04-05 17:45:22 +02:00
Xiaolong Ye
f1debd77ef net/af_xdp: introduce AF_XDP PMD
Add a new PMD driver for AF_XDP which is a proposed faster version of
AF_PACKET interface in Linux. More info about AF_XDP, please refer to [1]
[2].

This is the vanilla version PMD which just uses a raw buffer registered as
the umem.

[1] https://fosdem.org/2018/schedule/event/af_xdp/
[2] https://lwn.net/Articles/745934/

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2019-04-05 17:45:22 +02:00
Ori Kam
684b9a1b1f net/mlx5: support jump action
When using Direct Rules we can add actions to jump between tables.
This is extra useful since rule insertion rate is much higher on other
tables compared to table zero.

If no group is selected the rule is added to group 0.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
4f84a19779 net/mlx5: add Direct Rules API
Adds calls to the Direct Rules API inside the glue functions.
Due to difference in parameters between the Direct Rules and Direct
Verbs some of the glue functions API was updated.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ori Kam
cbb66daa3c net/mlx5: prepare Direct Verbs for Direct Rule
This is the first patch of a series that is designed to enable the
Direct Rules API.

The main difference between Direct Verbs and Direct Rules from API
perspective is that in Direct Rules each action has it's own create
function and the object itself is of type void.

In this patch I'm adding functions to generate actions that currently
are done without create action, and I'm changing the action type to be
void *, so in next patches only the glue functions will need to change.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Ivan Malov
c1ce2ba218 net/sfc: support tunnel TSO on EF10 native Tx datapath
Handle VXLAN and GENEVE TSO on EF10 native Tx datapath.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Ivan Malov
9906cb2959 net/sfc: improve log about missing HW TSO support
Said message cannot be considered as warning since
the PMD anyway reports available offload capabilities
by means of device info interface. Make this log
message informational and improve its formatting
by placing the text itself on the same line.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Ivan Malov
41ef1ad50a net/sfc: factor out function to get IPv4 packet ID for TSO
As a result, code duplication will be avoided in the current
TSO implementations (EFX and EF10 native). The future patch to
add support for tunnel TSO will also reuse the new function.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
9b70500cf2 net/sfc: add TSO header length check to Tx prepare
Make Tx prepare function able to detect packets with invalid header
size when header linearization is required.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
f7a66f9365 net/sfc: introduce descriptor space check in Tx prepare
Add descriptor space check to Tx prepare function to inform a caller
that a packet that needs more than maximum Tx descriptors of a queue
can not be sent.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
a3895ef38c net/sfc: move TSO header checks from Tx burst to Tx prepare
Tx offloads checks should be done in Tx prepare.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
8c27fa78f1 net/sfc: support Tx preparation in EF10 simple datapath
Implement tx_prepare callback. The implementation checks for anything
only in RTE debug mode. No checks are done otherwise because EF10
simple datapath ignores Tx offloads.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
67330d32b8 net/sfc: support Tx preparation in EF10 datapath
Implement tx_prepare callback and update Tx burst function accordingly.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
07685524c9 net/sfc: support Tx preparation in EFX datapath
Implement generic checks in Tx prepare function and update Tx burst
function accordingly.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
11c3712fc9 net/sfc: make TSO descriptor numbers EF10-specific
Numbers of extra descriptors required for TSO are EF10-specific
in fact. Highlight it in define names.

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
3985802ecf net/sfc: improve TSO header length check in EF10 datapath
Move the check inside xmit function to the branch in which
the check is mandatory. It makes case when TSO header is not
fragmented a bit more faster.

Fixes: 6bc985e411 ("net/sfc: support TSO in EF10 Tx datapath")
Cc: stable@dpdk.org

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Igor Romanov
ac30699e9c net/sfc: improve TSO header length check in EFX datapath
Move the check inside xmit function to the branch in which
the check is mandatory. It makes case when TSO header is not
fragmented a bit more faster.

Fixes: fec33d5bb3 ("net/sfc: support firmware-assisted TSO")
Cc: stable@dpdk.org

Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
2019-04-05 17:45:22 +02:00
Ori Kam
f54aeb3ec0 net/mlx5: fix flow counters using devx
The API that was defined in OFED 4.5 was replaced both in OFED 4.6 and
in upstream.

This commit updates the API to match the upstream one.

Fixes: f5bf91de73 ("net/mlx5: support flow counters using devx")
Cc: stable@dpdk.org

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Thomas Monjalon
d874a4eed5 net/mlx5: use port sibling iterators
Iterating over siblings was done with RTE_ETH_FOREACH_DEV()
which skips the owned ports.
The new iterators RTE_ETH_FOREACH_DEV_SIBLING()
and RTE_ETH_FOREACH_DEV_OF() are more appropriate and more correct.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Tested-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
0b259b8e96 net/mlx4: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
f4efc0eb97 net/mlx4: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx4_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smalle
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
c18cf501a7 net/mlx5: enable secondary process to register DMA memory
The Memory Region (MR) for DMA memory can't be created from secondary
process due to lib/driver limitation. Whenever it is needed, secondary
process can make a request to primary process through the EAL IPC
channel (rte_mp_msg) which is established on initialization. Once a MR
is created by primary process, it is immediately visible to secondary
process because the MR list is global per a device. Thus, secondary
process can look up the list after the request is successfully returned.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00
Yongseok Koh
dceb502942 net/mlx5: add control of excessive memory pinning by kernel
A new PMD parameter (mr_ext_memseg_en) is added to control extension of
memseg when creating a MR. It is enabled by default.

If enabled, mlx5_mr_create() tries to maximize the range of MR
registration so that the LKey lookup tables on datapath become smaller
and get the best performance. However, it may worsen memory utilization
because registered memory is pinned by kernel driver. Even if a page in
the extended chunk is freed, that doesn't become reusable until the
entire memory is freed and the MR is destroyed.

To make freed pages available immediately, this parameter has to be
turned off but it could drop performance.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 17:45:22 +02:00