numam-dpdk

Author	SHA1	Message	Date
Viacheslav Ovsiienko	7b836d8443	net/mlx5: fix doorbell register offset type The doorbell register is mapped using mmap() and offset must have off_t instead of unsigned int. Bug is not critical due to only least significant bits of offset are currently tested to determine mapping mode. Fixes: `8409a28573` ("net/mlx5: control transmit doorbell register mapping") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2020-01-17 19:46:26 +01:00
Tonghao Zhang	de561750c3	net/mlx5: clean up redundant assignment The type of txq has been assigned in mlx5_txq_hairpin_new. Fixes: `ae18a1ae96` ("net/mlx5: support Tx hairpin queues") Cc: stable@dpdk.org Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Ori Kam <orika@mellanox.com>	2020-01-17 19:45:23 +01:00
Suanming Mou	4acb96fd52	net/mlx5: add GENEVE in tunnel offloads capabilities GENEVE is available in tunnel offloads. Add it as the default support option. Signed-off-by: Suanming Mou <suanmingm@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-26 18:22:27 +01:00
Viacheslav Ovsiienko	82e75f8323	net/mlx5: fix legacy multi-packet Tx descriptors ConnectX-4LX supports multiple packets within the single Tx descriptor. This feature is named as "Legacy Multi-Packet Write" and imposes a lot of limitations: - no ACLs, it means no NIC Tx Flows are supported and Tx metadata become meaningless - the required minimal inline data must be zero - no SR-IOV, it means no support in E-Switch configurations, - no priority and dscp forcing - no VLAN insertion - no TSO - all packets within MPW session must have the same size This legacy MPW feature is mainly intended for test purposes. To explicitly engage the feature on ConnectX-4LX the devargs should be specified: - txq_mpw_en=1 This feature was dropped in 19.08, this patch reverts it back. Fixes: `18a1c20044` ("net/mlx5: implement Tx burst template") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2019-11-26 18:22:27 +01:00
Viacheslav Ovsiienko	c4d314a352	net/mlx5: fix assert in Tx inline settings Assert condition is fixed to not alert for the case when multi-packet write is not supported/engaged at all. Fixes: `b53cd86965` ("net/mlx5: adjust inline setting for large Tx queue sizes") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-20 17:36:06 +01:00
Viacheslav Ovsiienko	f078ceb6ae	net/mlx5: fix Tx doorbell write memory barrier As the result of testing it was found that some hosts have the performance penalty imposed by required write memory barrier after doorbell writing. Before 19.08 release there was some heuristics to decide whether write memory barrier should be performed. For the bursts of recommended size (or multiple) it was supposed there were some extra ongoing packets in the next burst and write memory barrier may be skipped (supposed to be performed in the next burst, at least after descriptor writing). This patch restores that behaviour, the devargs tx_db_nc=2 must be specified to engage this performance tuning feature. Fixes: `8409a28573` ("net/mlx5: control transmit doorbell register mapping") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-20 17:36:06 +01:00
Viacheslav Ovsiienko	8409a28573	net/mlx5: control transmit doorbell register mapping The rdma core library can map doorbell register in two ways, depending on the environment variable "MLX5_SHUT_UP_BF": - as regular cached memory, the variable is either missing or set to zero. This type of mapping may cause the significant doorbell register writing latency and requires explicit memory write barrier to mitigate this issue and prevent write combining. - as non-cached memory, the variable is present and set to not "0" value. This type of mapping may cause performance impact under heavy loading conditions but the explicit write memory barrier is not required and it may improve core performance. The new devarg is introduced "tx_db_nc", if this parameter is set to zero, the doorbell register is forced to be mapped to cached memory and requires explicit memory barrier after writing to. If "tx_db_nc" is set to non-zero value the doorbell will be mapped as non-cached memory, not requiring the memory barrier. If "tx_db_nc" is missing the behaviour will be defined by presence of "MLX5_SHUT_UP_BF" in environment. If variable is missed the default value zero will be set for ARM64 hosts and one for others. In run time the code checks the mapping type and provides the memory barrier after writing to tx doorbell register if it is needed. The mapping type is extracted directly from the uar_mmap_offset field in the queue properties. Fixes: `18a1c20044` ("net/mlx5: implement Tx burst template") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2019-11-11 14:23:02 +01:00
Viacheslav Ovsiienko	9bf26e1318	ethdev: move egress metadata to dynamic field The dynamic mbuf fields were introduced by [1]. The egress metadata is a good candidate to be moved from statically allocated field tx_metadata to dynamic one. Because mbufs are used in half-duplex fashion only, it is safe to share this dynamic field with ingress metadata. The shared dynamic field contains either egress (if the application is going to transmit mbuf with tx_burst) or ingress (if mbuf is received with rx_burst) metadata and can be accessed by RTE_FLOW_DYNF_METADATA() macro or with rte_flow_dynf_metadata_set() and rte_flow_dynf_metadata_get() helper routines. PKT_TX_DYNF_METADATA/PKT_RX_DYNF_METADATA flag will be set along with the data. The mbuf dynamic field must be registered by calling rte_flow_dynf_metadata_register() prior to accessing the data. The availability of dynamic mbuf metadata field can be checked with rte_flow_dynf_metadata_avail() routine. DEV_TX_OFFLOAD_MATCH_METADATA offload and configuration flag is removed. The metadata support in PMDs is engaged on dynamic field registration. Metadata feature is getting complex. We might have some set of actions and items that might be supported by PMDs in multiple combinations, the supported values and masks are the subjects to query by performing trials (with rte_flow_validate). [1] http://patches.dpdk.org/patch/62040/ Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-11-08 23:15:05 +01:00
Ori Kam	ae18a1ae96	net/mlx5: support Tx hairpin queues This commit adds the support for creating Tx hairpin queues. Hairpin queue is a queue that is created using DevX and only used by the HW. Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-08 23:15:04 +01:00
Ori Kam	894c4a8e5a	net/mlx5: prepare Tx queues to have different types Currently all Tx queues are created using Verbs. This commit modifies the naming so it does not include verbs, since the next commit introduces a new type (hairpin). Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-11-08 23:15:04 +01:00
Viacheslav Ovsiienko	b53cd86965	net/mlx5: adjust inline setting for large Tx queue sizes The hardware may have limitations on maximal amount of supported Tx descriptors building blocks (WQEBB). Application requires the Tx queue must accept the specified amount of packets. If inline data feature is engaged the packet may require more WQEBBs and overall amount of blocks may exceed the hardware capabilities. Application has to make a trade-off between Tx queue size and maximal data inline size. In case if the inline settings are not requested explicitly with devarg keys the default values are used. This patch adjusts the applied default values if large Tx queue size is requested and default inline settings can not be satisfied due to hardware limitations. The explicitly requested inline setting may be aligned (enlarging only) by configurations routines to provide better WQEBB filling, this implicit alignment is the subject for adjustment either. The warning message is emitted to the log if adjustment happens. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-10-08 12:14:32 +02:00
Viacheslav Ovsiienko	46e10a4c1b	net/mlx5: move backing PCI device to private context Now all devices created over the same multiport IB device have shared context containing the backing PCI device field. For the VF LAG configurations it becomes possible the representors might be connected to VF created over different PFs. In this case representors have the different backing PCI devices and mentioned field should be moved to device private area. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2019-10-07 15:00:58 +02:00
Viacheslav Ovsiienko	06def9bc4e	net/mlx5: fix UAR remap initialization for 32-bit systems The txq_uar_init() routine uses the uninitialized uar_mmap_offset field in 32-bit configurations due to this field is initialized after txq_uar_init() call. Fixes: `120dc4a7dc` ("net/mlx5: remove device register remap") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2019-10-07 15:00:57 +02:00
Viacheslav Ovsiienko	c6f0485646	net/mlx5: fix inline data settings If the minimal inline data are required the data inline feature must be engaged. There were the incorrect settings enabling the entire small packet inline (in size up to 82B) which may result in the sending rate declining if there are not enough cores. The same problem was raised if inline was enabled to support VLAN tag insertion by software. Fixes: `38b4b397a5` ("net/mlx5: add Tx configuration and setup") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>	2019-08-06 17:42:12 +02:00
Dekel Peled	8791ff42ef	net/mlx5: update Tx queue create for LRO Update function mlx5_txq_ibv_new(), query and store the TIS transport domain value. It is required later on Rx side when creating matching TIR. Add field in mlx5 data structure to store Transport Domain ID. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-23 14:31:36 +02:00
Dekel Peled	970eb58c47	net/mlx5: remove redundant item from union A variable of type struct ibv_cq_ex is declared in 2 unions, but isn't used. This patch removes the 2 redundant declarations. Fixes: `6218063b39` ("net/mlx5: refactor Rx data path") Fixes: `1d88ba1719` ("net/mlx5: refactor Tx data path") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko	38b4b397a5	net/mlx5: add Tx configuration and setup This patch updates the Tx datapath control and configuration structures and code for managing Tx datapath settings. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-07-23 14:31:36 +02:00
Viacheslav Ovsiienko	a6bd4911ad	net/mlx5: remove Tx implementation This patch removes the existing Tx datapath code as preparation step before introducing the new implementation. The following entities are being removed: - deprecated devargs support - tx_burst() routines - related PRM definitions - SQ configuration code - Tx routine selection code - incompatible Tx completion code The following devargs are deprecated and ignored: - "txq_inline" is going to be converted to "txq_inline_max" for compatibility issue - "tx_vec_en" - "txqs_max_vec" - "txq_mpw_hdr_dseg_en" - "txq_max_inline_len" is going to be converted to "txq_inline_mpw" for compatibility issue The deprecated devarg keys are recognized by PMD and ignored/converted to the new ones in order not to block device probing. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-07-23 14:31:36 +02:00
Dekel Peled	42280dd91b	net/mlx5: fix typos in comments Some spelling mistakes were found in comments. This patch fixes them. Fixes: `d10b09db0a` ("net/mlx5: fix allocation when no memory on device NUMA node") Fixes: `fc2c498ccb` ("net/mlx5: add Direct Verbs translate items") Fixes: `7d6bf6b866` ("net/mlx5: add Multi-Packet Rx support") Fixes: `f6d9ab4e76` ("net/mlx5: check Tx queue size overflow") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>	2019-07-23 14:31:36 +02:00
Ali Alnubani	843e720557	net/mlx5: fix 32-bit build This is to fix the error: ``` drivers/net/mlx5/mlx5_defs.h:14:26: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'off_t {aka long long int}' [-Werror=format=] drivers/net/mlx5/mlx5_txq.c:569:48: note: format string is defined here DRV_LOG(DEBUG, "port %u: uar_mmap_offset 0x%lx" ~~^ %llx ``` Which reproduces with gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0. Fixes: `6bf10ab69b` ("net/mlx5: support 32-bit systems") Cc: stable@dpdk.org Signed-off-by: Ali Alnubani <alialnu@mellanox.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>	2019-06-28 20:31:49 +02:00
Matan Azrad	957e45fb7b	net/mlx5: handle Tx completion with error When WQEs are posted to the HW to send packets, the PMD may get a completion report with error from the HW, aka error CQE which is associated to a bad WQE. The error reason may be bad address, wrong lkey, bad sizes, etc. that can wrongly be configured by the PMD or by the user. Checking all the optional mistakes to prevent error CQEs doesn't make sense due to performance impacts and huge complexity. The error CQEs change the SQ state to error state, which causes all the next posted WQEs to be completed with CQE flush error forever. Currently, the PMD doesn't handle Tx error CQEs and may even crash when one of them appears. Extend the Tx data-path to detect these error CQEs, to report them by the statistics error counters, to recover the SQ by moving the state to ready again and adjusting the management variables appropriately. Sometimes the error CQE root cause is very hard to debug and even may be related to some corner cases which are not reproducible easily, hence a dump file with debug information will be created for the first number of error CQEs, this number can be configured by the PMD probe parameters. Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-06-14 00:01:06 +09:00
Dekel Peled	9e44476448	net/mlx5: fix memory free on queue create error In function mlx5_rxq_ibv_new(), pointer tmpl allocation is attempted at the start, but not validated or freed in case of error. In function mlx5_txq_ibv_new(), pointer txq_ibv allocation is attempted at the start, but not freed in case of error. This patch adds pointers initialization, validation and freeing. Fixes: `09cb5b5817` ("net/mlx5: separate DPDK from verbs Rx queue objects") Fixes: `faf2667fe8` ("net/mlx5: separate DPDK from verbs Tx queue objects") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Ori Kam <orika@mellanox.com>	2019-05-27 10:21:04 +02:00
Dekel Peled	a65c2d6637	net/mlx5: remove unused functions Functions implemented but never called: mlx5_rxq_ibv_releasable() mlx5_rxq_cleanup() mlx5_txq_ibv_releasable() Function declared but not implemented: rxq_alloc_mprq_buf() This patch removes these functions from code and header file. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-05-24 14:56:36 +02:00
Yongseok Koh	f6d9ab4e76	net/mlx5: check Tx queue size overflow If Tx packet inlining is enabled, rdma-core library should allocate large Tx WQ enough to support it. It is better for PMD to calculate the size of WQ based on the parameters and return error with appropriate message if it exceeds the device capability. Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-05-03 18:45:23 +02:00
Viacheslav Ovsiienko	ab3cffcfc2	net/mlx5: share Memory Regions for multiport device The multiport Infiniband device support was introduced [1]. All active ports, belonging to the same Infiniband device use the single shared Infiniband context of that device and share the resources: - QPs are created within shared context - Verbs flows are also created with specifying port index - DV/DR resources - Protection Domain - Event Handlers This patchset adds support for Memory Regions sharing between ports, created on the base of multiport Infiniband device. The datapath of mlx5 uses the layered cache subsystem for allocating/releasing Memory Regions, only the lowest layer L3 is subject to share due to performance issues. [1] http://patches.dpdk.org/cover/51800/ Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-05-03 18:45:23 +02:00
Dekel Peled	ebd4df0ffc	net/mlx5: fix comments mixing Rx and Tx In mlx5_rxq.c, in some comments, text includes "Tx" instead of "Rx". In mlx5_txq.c, in some comments, text includes "Rx" instead of "Tx". This patch fixes these typos. Fixes: `faf2667fe8` ("net/mlx5: separate DPDK from verbs Tx queue objects") Fixes: `a1366b1a2b` ("net/mlx5: add reference counter on DPDK Rx queues") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-05-03 18:45:22 +02:00
Yongseok Koh	120dc4a7dc	net/mlx5: remove device register remap UAR (User Access Region) register does not need to be remapped for primary process but it should be remapped only for secondary process. UAR register table is in the process private structure in rte_eth_devices[], (struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private The actual UAR table follows the data structure and the table is used for both Tx and Rx. For Tx, BlueFlame in UAR is used to ring the doorbell. MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes access its own private data to acquire the register from the UAR table. For Rx, the doorbell in UAR is required in arming CQ event. However, it is a known issue that the register isn't remapped for secondary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2019-04-12 11:02:02 +02:00
Yongseok Koh	d5c900d1dd	net/mlx5: remove redundant queue index Queue index is redundantly stored for both Rx and Tx structures. E.g. txq_ctrl->idx and txq->stats.idx. Both are consolidated to single storage - rxq->idx and txq->idx. Also, rxq and txq are moved to the beginning of its control structure (rxq_ctrl and txq_ctrl) for cacheline alignment. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-12 11:02:02 +02:00
Yongseok Koh	7be600c8d8	net/mlx5: rework PMD global data init There's more need to have PMD global data structure. This should be initialized once per a process regardless of how many PMD instances are probed. mlx5_init_once() is called during probing and make sure all the init functions are called once per a process. Currently, such global data and its initialization functions are even scattered. Rather than 'extern'-ing such variables and calling such functions one by one making sure it is called only once by checking the validity of such variables, it will be better to have a global storage to hold such data and a consolidated function having all the initializations. The existing shared memory gets more extensively used for this purpose. As there could be multiple secondary processes, a static storage (local to process) is also added. As the reserved virtual address for UAR remap is a PMD global resource, this doesn't need to be stored in the device priv structure, but in the PMD global data. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-04-05 17:45:22 +02:00
Viacheslav Ovsiienko	1e14090e31	net/mlx5: provide IB port for the object being created The code is updated to provide IB port index for the Verbs objects being created - QPs and Verbs Flows. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko	f048f3d479	net/mlx5: switch to the shared IB device context The code is updated to use the shared IB device context and device handles. The IB device context is shared between representors created over the single multiport IB device. All Verbs and DevX objects will be created within this shared context. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko	d485cdca01	net/mlx5: switch to the shared context IB attributes The code is updated to use the shared IB device attributes, located in the shared IB context. It saves some memory if there are representors created over the single Infiniband device with multiple ports. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-03-29 17:25:32 +01:00
Viacheslav Ovsiienko	1b782252cb	net/mlx5: switch to the shared protection domain The PMD code is updated to use Protected Domain from the shared IB device context. The Domain is shared between all devices belonging to the same multiport Infiniband device. If IB device has only one port, the PD is not shared, because there is only ethernet device created over IB one. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2019-03-29 17:25:32 +01:00
Thomas Monjalon	dbeba4cf18	net/mlx: prefix private structure The private structure stored in rte_eth_dev->data->dev_private was named "struct priv". In order to ease code browsing, the structure is renamed "struct mlx[45]_priv". Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2019-03-01 18:17:35 +01:00
Dekel Peled	6bd7fbd03c	net/mlx5: support metadata as flow rule criteria As described in series starting at [1], it adds option to set metadata value as match pattern when creating a new flow rule. This patch adds metadata support in mlx5 driver, in two parts: - Add the validation and setting of metadata value in matcher, when creating a new flow rule. - Add the passing of metadata value from mbuf to wqe when indicated by ol_flag, in different burst functions. [1] "ethdev: support metadata as flow rule criteria" http://mails.dpdk.org/archives/dev/2018-September/113269.html Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-10-26 22:14:06 +02:00
Moti Haimovsky	6bf10ab69b	net/mlx5: support 32-bit systems This patch adds support for building and running mlx5 PMD on 32bit systems such as i686. The main issue to tackle was handling the 32bit access to the UAR as quoted from the mlx5 PRM: QP and CQ DoorBells require 64-bit writes. For best performance, it is recommended to execute the QP/CQ DoorBell as a single 64-bit write operation. For platforms that do not support 64 bit writes, it is possible to issue the 64 bits DoorBells through two consecutive writes, each write 32 bits, as described below: * The order of writing each of the Dwords is from lower to upper addresses. * No other DoorBell can be rung (or even start ringing) in the midst of an on-going write of a DoorBell over a given UAR page. The last rule implies that in a multi-threaded environment, the access to a UAR page (which can be accessible by all threads in the process) must be synchronized (for example, using a semaphore) unless an atomic write of 64 bits in a single bus operation is guaranteed. Such a synchronization is not required for when ringing DoorBells on different UAR pages. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-07-12 14:34:59 +02:00
Adrien Mazarguil	9083982ce7	net/mlx5: drop useless support for several Verbs ports Unlike mlx4 from which this capability was inherited, mlx5 devices expose exactly one Verbs port per PCI bus address. Each physical port gets assigned its own bus address with a single Verbs port. While harmless, this code requires an extra loop that would get in the way of subsequent refactoring. No functional impact. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-07-11 15:36:55 +02:00
Shahaf Shuler	e46821e9fc	net/mlx5: separate generic tunnel TSO from the standard one The generic tunnel TSO depended on the regular TSO capabilities being enabled. Cc: stable@dpdk.org Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2018-07-03 01:35:58 +02:00
Nelio Laranjeiro	c44fbc7cc2	net/mlx5: clean-up developer logs Split maintainers logs from user logs. A lot of debug logs are present providing internal information on how the PMD works to users. Such logs should not be available for them and thus should remain available only when the PMD is compiled in debug mode. This commit removes some useless debug logs, moves the maintainers' ones under DEBUG and also moves dump into debug mode only. Cc: stable@dpdk.org Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>	2018-07-03 01:35:57 +02:00
Yongseok Koh	974f1e7ef1	net/mlx5: add new memory region support This is the new design of Memory Region (MR) for mlx PMD, in order to: - Accommodate the new memory hotplug model. - Support non-contiguous Mempool. There are multiple layers for MR search. L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized array by linear search. L0/L1 is in an inline function - mlx5_mr_lookup_cache(). If L1 misses, the bottom-half function is called to look up the address from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh() and it is not an inline function. Data structure for L2 is the Binary Tree. If L2 misses, the search falls into the slowest path which takes locks in order to access global device cache (priv->mr.cache) which is also a B-tree and caches the original MR list (priv->mr.mr_list) of the device. Unless the global cache is overflowed, it is all-inclusive of the MR list. This is L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and can't be expanded on the fly due to deadlock. Refer to the comments in the code for the details - mr_lookup_dev(). If L3 is overflowed, the list will have to be searched directly bypassing the cache although it is slower. If L3 misses, a new MR for the address should be created - mlx5_mr_create(). When it creates a new MR, it tries to register adjacent memsegs as much as possible which are virtually contiguous around the address. This must take two locks - memory_hotplug_lock and priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any allocation/free of memory inside. In the free callback of the memory hotplug event, freed space is searched from the MR list and corresponding bits are cleared from the bitmap of MRs. This can fragment a MR and the MR will have multiple search entries in the caches. 
Once there's a change by the event, the global cache must be rebuilt and all the per-queue caches will be flushed as well. If memory is frequently freed in run-time, that may cause jitter on dataplane processing in the worst case by incurring MR cache flush and rebuild. But, it would be the least probable scenario. To guarantee the most optimal performance, it is highly recommended to use an EAL option - '--socket-mem'. Then, the reserved memory will be pinned and won't be freed dynamically. And it is also recommended to configure per-lcore cache of Mempool. Even though there're many MRs for a device or MRs are highly fragmented, the cache of Mempool will be much helpful to reduce misses on per-queue caches anyway. '--legacy-mem' is also supported. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:51 +01:00
Yongseok Koh	d561b5dc13	net/mlx5: remove memory region support This patch removes current support of Memory Region (MR) in order to accommodate the dynamic memory hotplug patch. This patch can be compiled but traffic can't flow and HW will raise faults. Subsequent patches will add new MR support. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:51 +01:00
Wei Dai	a4996bd89c	ethdev: new Rx/Tx offloads API This patch checks if an input requested offloading is valid or not. Any requested offloading must be supported in the device capabilities. Any offloading is disabled by default if it is not set in the parameter dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and [rt]x_conf->offloads to rte_eth_[rt]x_queue_setup(). If any offloading is enabled in rte_eth_dev_configure() by application, it is enabled on all queues no matter whether it is per-queue or per-port type and no matter whether it is set or cleared in [rt]x_conf->offloads to rte_eth_[rt]x_queue_setup(). If a per-queue offloading hasn't been enabled in rte_eth_dev_configure(), it can be enabled or disabled for individual queue in rte_eth_[rt]x_queue_setup(). A new added offloading is the one which hasn't been enabled in rte_eth_dev_configure() and is requested to be enabled in rte_eth_[rt]x_queue_setup(), it must be per-queue type, otherwise trigger an error log. The underlying PMD must be aware that the requested offloadings to PMD specific queue_setup() function only carries those new added offloadings of per-queue type. This patch can make above such checking in a common way in rte_ethdev layer to avoid same checking in underlying PMD. This patch assumes that all PMDs in 18.05-rc2 have already converted to offload API defined in 17.11. It also assumes that all PMDs can return correct offloading capabilities in rte_eth_dev_infos_get(). In the beginning of [rt]x_queue_setup() of underlying PMD, add offloads = [rt]xconf->offloads | dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API defined in 17.11 to avoid upper application broken due to offload API change. PMD can use the info that input [rt]xconf->offloads only carry the new added per-queue offloads to do some optimization or some code change on base of this patch. 
Signed-off-by: Wei Dai <wei.dai@intel.com> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>	2018-05-14 22:31:51 +01:00
Yongseok Koh	df428ceef4	net/mlx5: change device reference for secondary process rte_eth_devices[] is not shared between primary and secondary process, but a static array to each process. The reverse pointer of device (priv->dev) is invalid. Instead, priv has the pointer to shared data of the device, struct rte_eth_dev_data *dev_data; Two macros are added, #define PORT_ID(priv) ((priv)->dev_data->port_id) #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)]) Signed-off-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:51 +01:00
Yongseok Koh	95d7e115be	net/mlx5: fix calculation of Tx TSO inline room size rdma-core doesn't add up max_tso_header size to max_inline_data size. The library takes bigger value between the two. Fixes: `43e9d9794c` ("net/mlx5: support upstream rdma-core") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>	2018-05-14 22:31:50 +01:00
Xueming Li	32d4246c90	net/mlx5: fix SW parser enabling Fixes: `5f8ba81c42` ("net/mlx5: support generic tunnel offloading") Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-05-14 22:31:49 +01:00
Xueming Li	5f8ba81c42	net/mlx5: support generic tunnel offloading This commit adds support for generic tunnel TSO and checksum offload. PMD will compute the inner/outer headers offset according to the mbuf fields. Hardware will do calculation based on offsets and types. Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-04-27 17:34:43 +01:00
Shahaf Shuler	5feecc57d9	align SPDX Mellanox copyrights Aligning Mellanox SPDX copyrights to a single format. In addition replace to SPDX licence files which were missed. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-04-11 01:47:47 +02:00
Shahaf Shuler	b7059e6e43	net/mlx5: fix TSO enablement TSO should be set if either of the TSO offload flags is requested. Fixes: `dbccb4cddc` ("net/mlx5: convert to new Tx offloads API") Cc: stable@dpdk.org Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>	2018-03-30 14:08:44 +02:00
Nélio Laranjeiro	a170a30d22	net/mlx5: use dynamic logging Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-03-30 14:08:44 +02:00
Nélio Laranjeiro	0f99970b4a	net/mlx5: use port id in PMD log Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>	2018-03-30 14:08:44 +02:00
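
The resolution order for the doorbell mapping described in commit `8409a28573` (devarg `tx_db_nc` first, then the `MLX5_SHUT_UP_BF` environment variable, then a per-architecture default) can be sketched in C as below. This is only an illustration of the commit message, not the PMD's code: `db_map`, `resolve_db_mapping()` and `db_needs_write_barrier()` are hypothetical names, and the sketch covers only the zero/non-zero semantics of that commit, not the `tx_db_nc=2` heuristic restored later by `f078ceb6ae`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names, for illustration only (not taken from the mlx5 PMD). */
enum db_map {
	DB_MAP_CACHED,     /* regular cached mapping: barrier needed after doorbell write */
	DB_MAP_NON_CACHED, /* non-cached mapping: no explicit barrier needed */
};

/*
 * tx_db_nc:   pointer to the devarg value, or NULL if the devarg is missing.
 * shut_up_bf: value of MLX5_SHUT_UP_BF, or NULL if it is not in the environment.
 * is_arm64:   true when running on an ARM64 host.
 */
static enum db_map
resolve_db_mapping(const int *tx_db_nc, const char *shut_up_bf, bool is_arm64)
{
	/* 1. An explicit devarg wins: zero -> cached, non-zero -> non-cached. */
	if (tx_db_nc != NULL)
		return *tx_db_nc ? DB_MAP_NON_CACHED : DB_MAP_CACHED;
	/* 2. Otherwise follow the environment variable: "0" -> cached. */
	if (shut_up_bf != NULL)
		return strcmp(shut_up_bf, "0") != 0 ?
		       DB_MAP_NON_CACHED : DB_MAP_CACHED;
	/* 3. Nothing set: default is zero (cached) on ARM64, one elsewhere. */
	return is_arm64 ? DB_MAP_CACHED : DB_MAP_NON_CACHED;
}

/* Only the cached mapping requires a write barrier after the doorbell write. */
static bool
db_needs_write_barrier(enum db_map map)
{
	return map == DB_MAP_CACHED;
}
```

In the driver itself the mapping type is read back at run time from the `uar_mmap_offset` field of the queue properties, as the commit message notes; the sketch passes it around as an enum purely to keep the example self-contained.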

126 Commits