159 Commits

Author SHA1 Message Date
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into an L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example, information like VLAN ID and SCOPE ID is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Hans Petter Selasky
1cbc85fd04 Move the mlx5 core device pointer first in the mlx5en priv. This helps simplify
checks to recognize own network devices when using mlx5ib. This patch fixes
an issue where mlx5ib fails to recognize mceX network devices for use with
RoCE.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-30 12:38:06 +00:00
Eitan Adler
02ca39cff2 sys/dev/mlx[45]: fix uses of 1 << 31
Reviewed by:		kib (D13858)
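
As a minimal sketch of the kind of fix named in the subject line: with a 32-bit int, the expression 1 << 31 shifts a signed value into the sign bit, which is undefined behaviour in C, while an unsigned constant is well defined. The macro and function names below are hypothetical and are not taken from the mlx4 or mlx5 sources.

```c
#include <stdint.h>

/*
 * Hypothetical flag, for illustration only.
 * (1 << 31) would shift a signed int into its sign bit, which is
 * undefined when int is 32 bits wide. (1U << 31) is well defined.
 */
#define EXAMPLE_TOP_BIT	(1U << 31)

/* Returns 1 when the top bit of a 32-bit register value is set. */
static inline int
example_top_bit_is_set(uint32_t reg)
{
	return ((reg & EXAMPLE_TOP_BIT) != 0);
}
```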
2018-01-12 06:36:44 +00:00
Konstantin Belousov
e44f4f3547 mlx5en: Avoid SFENCE on x86
The IA32 memory model guarantees that all writes are seen in the program
order.  Also, any access to the uncacheable memory flushes the store
buffers.  As a consequence, the SFENCE instruction is (almost) never needed,
in particular, it is not needed to ensure the correct order of updates as
seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() to only emit compiler barriers
on x86 there.  Other architectures get the right barrier instruction as
well.
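
The ordering idea can be sketched in portable C11, assuming that atomic_thread_fence(memory_order_release) plays the role that atomic_thread_fence_rel() plays in the kernel. The ring structure and function below are hypothetical and only illustrate "write the descriptor, fence, then write the doorbell"; they are not the mlx5en code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical descriptor ring with a doorbell word; names are illustrative. */
struct example_ring {
	uint64_t desc[8];
	volatile uint32_t doorbell;
};

/* Store a descriptor, then publish the new producer index. */
static inline void
example_post(struct example_ring *r, uint32_t idx, uint64_t d)
{
	r->desc[idx & 7] = d;	/* fill in the descriptor first */

	/*
	 * Release fence: stores above are ordered before stores below.
	 * On x86 this typically emits no instruction and only constrains
	 * the compiler; other architectures get a real barrier.
	 */
	atomic_thread_fence(memory_order_release);

	r->doorbell = idx + 1;	/* then ring the doorbell */
}
```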

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-12-19 14:11:41 +00:00
Konstantin Belousov
ef23f141bc Implement hardware mlx5(4) rx timestamps.
Driver support is only provided for ConnectX4/5.

System-time timestamp is calculated based on the free-running counter
timestamp provided by hardware.  Driver periodically samples the
counter to calibrate it against the system clock and uses linear
interpolation to convert.  Stability of the crystal which drives the
clock is +-50 ppm at the operational temperature, which makes the
algorithm good enough.

The calculation is somewhat delicate because all values are 64-bit and
overflow the naive formula for linear interpolation.  The calculation
drops the least significant bits in advance, see the PREC shift in
mlx5_mbuf_tstmp().
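
A rough sketch of the idea, not the code in mlx5_mbuf_tstmp(): the naive interpolation multiplies two 64-bit deltas, which can overflow, so each delta is shifted right before the multiply and the result is shifted back. The structure, field names, and the EXAMPLE_PREC value are assumptions made for illustration.

```c
#include <stdint.h>

#define EXAMPLE_PREC	10	/* hypothetical shift; the driver's value may differ */

/* Two calibration points: hardware counter samples and system time in ns. */
struct example_calib {
	uint64_t hw_prev, hw_curr;
	uint64_t ns_prev, ns_curr;
};

/*
 * Naive form: ns_prev + (hw - hw_prev) * (ns_curr - ns_prev) /
 * (hw_curr - hw_prev). Dropping EXAMPLE_PREC low bits from every delta
 * keeps the intermediate product smaller at the cost of some precision.
 */
static uint64_t
example_hw_to_nsec(const struct example_calib *c, uint64_t hw)
{
	uint64_t off = (hw - c->hw_prev) >> EXAMPLE_PREC;
	uint64_t ns_delta = (c->ns_curr - c->ns_prev) >> EXAMPLE_PREC;
	uint64_t hw_delta = (c->hw_curr - c->hw_prev) >> EXAMPLE_PREC;

	if (hw_delta == 0)	/* not calibrated yet, avoid dividing by zero */
		return (c->ns_prev);
	return (c->ns_prev + ((off * ns_delta / hw_delta) << EXAMPLE_PREC));
}
```

With calibration points (0, 1000 ns) and (2^20, 1000 ns + 2^21), a counter value of 2^19 maps to 1000 ns + 2^20 in this sketch.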

Hardware stamps can be turned off by 'ifconfig mceN -hwrxtstmp'.  Buggy
firmware might result in small but visible errors in the reported
timestamps, detectable e.g. by nonsensical (negative) RTT values for
LAN pings.

Reviewed by:	gallatin, hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D12638
2017-11-29 10:04:11 +00:00
Hans Petter Selasky
c3125bc5bf Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
Hans Petter Selasky
937d37fc6c Merge ^/head r325842 through r325998. 2017-11-19 12:36:03 +00:00
Hans Petter Selasky
b108f35740 Remove duplicate static function prototype to fix compilation of
mlx5_fs_tree.c after r325638 when using GCC.

Found by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-18 20:32:09 +00:00
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
Hans Petter Selasky
dd00abf2d7 Make sure the ib_wr_opcode enum is signed by adding a negative dummy element.
Different compilers may optimise the enum type in different ways. This ensures
coherency when range checking the value of enums in ibcore.
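
The technique can be illustrated with a small stand-in enum. The enumerator names here are hypothetical and are not the real ib_wr_opcode values; the point is only that a negative enumerator forces the compiler to pick a signed underlying type, so a check against zero behaves the same everywhere.

```c
/* Hypothetical opcode enum, for illustration only. */
enum example_opcode {
	EXAMPLE_OP_SEND,
	EXAMPLE_OP_WRITE,
	EXAMPLE_OP_READ,
	EXAMPLE_OP_DUMMY = -1,	/* negative dummy: forces a signed enum type */
};

/* Range check that relies on the enum being signed. */
static inline int
example_opcode_is_valid(enum example_opcode op)
{
	return (op >= EXAMPLE_OP_SEND && op <= EXAMPLE_OP_READ);
}
```

Without the negative element a compiler is free to choose an unsigned type, in which case a lower-bound comparison such as op >= 0 is always true and may be flagged or optimised away.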

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:51:37 +00:00
Hans Petter Selasky
059ecd56d0 The new mlx5ib(4) module requires some existing values to be redefined.
Sponsored by:	Mellanox Technologies
2017-11-10 15:28:17 +00:00
Hans Petter Selasky
8e6e287f8d Update mlx5ib(4) to match Linux 4.9 and the new ibcore APIs.
Sponsored by:	Mellanox Technologies
2017-11-10 15:02:17 +00:00
Hans Petter Selasky
4b109912f1 Add more and update existing mlx5 core firmware structure definitions and bits.
This change is part of coming ibcore and mlx5ib updates.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:39:03 +00:00
Hans Petter Selasky
53d7bb46d5 Expose the current hardware MTU in mlx5en(4) as a separate entry
in the sysctl tree.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:19:22 +00:00
Hans Petter Selasky
61fd7ac087 Add support for configuring local multicast and unicast data traffic loopback
in mlx5en(4) driver via the sysctl interface.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:14:54 +00:00
Hans Petter Selasky
bb3616ab20 Add support for disabling and enabling RX and TX DMA rings in mlx5en(4).
This is useful for supporting setups similar to Netmap.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:10:41 +00:00
Hans Petter Selasky
b35a986d25 Make physical address of init segment available in the priv of mlx5 core.
This change is needed by mlx5ib(4).

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:02:12 +00:00
Hans Petter Selasky
f6923226eb Add API function to query port performance counters for infiniband and RoCE
traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:58:49 +00:00
Hans Petter Selasky
f4554f7830 Add API functions to query and modify local loopback of multicast and
unicast traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:56:11 +00:00
Hans Petter Selasky
2fd90b8297 Add API function to query virtual port counters in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:53:53 +00:00
Hans Petter Selasky
0b3ebe412e Add API functions to modify the transport interface send object, TIS,
in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:50:08 +00:00
Hans Petter Selasky
27c29bc44b Add API functions to set and query dropless port mode in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:44:12 +00:00
Hans Petter Selasky
0e4248a114 Prevent mlx5 core from accessing host memory after shutdown by disabling
PCI busmaster.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:40:27 +00:00
Hans Petter Selasky
197563c294 Set ATOMIC endian mode in mlx5 core.
The hardware is capable of 2 requestor endianness modes for standard 8
byte atomics: BE (0x0) and host endianness (0x1). Read the supported
modes from hca atomic capabilities and configure HW to host endianness
mode if supported.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:38:43 +00:00
Hans Petter Selasky
500d0c409e Add const keyword to input-only argument in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:30:14 +00:00
Hans Petter Selasky
a4d6b00747 Make local variable 64-bits to avoid masking away bits in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:28:23 +00:00
Hans Petter Selasky
6c7057f7ba Implement support for decoding general port notification event in
the mlx5 core module.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:25:29 +00:00
Hans Petter Selasky
5a93b4cd52 Refactor the flowsteering APIs used by mlx5en(4). This change is needed by
the coming ibcore and mlx5ib updates in order to support traffic redirection
to so-called raw ethernet QPs.

Remove unused E-switch related routines and files while at it.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 09:49:08 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device is present. This resolves
interoperability issues between iWarp and RoCE in ibcore.

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Hans Petter Selasky
3cd4c11ab2 Use common rdma_ip2gid() function instead of custom mlx5_ip2gid() one.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-10-10 12:24:52 +00:00
Hans Petter Selasky
e5d6b589ce Make sure the doorbell lock is valid for the i386 version
of the mlx5en(4) driver.

Tested by:		gallatin @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-10-02 12:20:55 +00:00
Hans Petter Selasky
8ac4c959ab Compile fixes for LINT on 32-bit platforms.
MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-24 08:09:42 +00:00
Hans Petter Selasky
1251590741 Add new mlx5ib(4) driver to the kernel source tree which supports
Remote DMA over Converged Ethernet, RoCE, for the ConnectX-4 series of
PCI express network cards.

There is currently no user-space support and this driver only supports
kernel side non-routable RoCE V1. The krping kernel module can be used
to test this driver. Full user-space support including RoCE V2 will be
added as part of the ongoing upgrade to ibcore from Linux 4.9. Otherwise
this driver is feature equivalent to mlx4ib(4). The mlx5ib(4) kernel
module will only be built when WITH_OFED=YES is specified.

MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-23 12:09:37 +00:00
Hans Petter Selasky
8508e4d730 Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:49:36 +00:00
Hans Petter Selasky
869dd4b498 Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by:		gallatin@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:36:57 +00:00
Hans Petter Selasky
713dd5cb9e Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the associated DMA tag and do not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
Hans Petter Selasky
1d4905b5b0 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
Mark Johnston
c73cdca2c4 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
Hans Petter Selasky
8b48354659 Improve sysadmin visibility of physical port error counters in the
mlx5en driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-04-28 19:38:57 +00:00
Hans Petter Selasky
eac79e7755 Make "desc" pointer non-constant inside the mlx5_core_diagnostics_entry
structure. This fixes compilation with amd64-xtoolchain-gcc.

PR:			216588
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-30 08:35:15 +00:00
Hans Petter Selasky
1c807f6795 Use the busdma API to allocate all DMA-able memory.
The MLX5 driver has four different types of DMA allocations which are
now allocated using busdma:

1) The 4K firmware DMA-able blocks. One busdma object per 4K allocation.
2) Data for firmware commands use the 4K firmware blocks split into four 1K blocks.
3) The 4K firmware blocks are also used for doorbell pages.
4) The RQ-, SQ- and CQ- DMA rings. One busdma object per allocation.

After this patch the mlx5en driver can be used with DMAR enabled in
the FreeBSD kernel.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:46:55 +00:00
Hans Petter Selasky
30dfc0518a Add support for device surprise removal and other PCI errors.
- When device disappears from PCI indicate error device state and:
  1) Trigger command completion for all pending commands
  2) Prevent new commands from executing and return:
     - success for modify and remove/cleanup commands
     - failure for create/query commands
  3) When reclaiming pages for a device in error state don't ask FW to
     return all given pages, just release the allocated memory
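
The command policy in item 2 can be sketched as a small decision function. The command classes and the function name are hypothetical, and the choice of EIO as the failure code is an assumption; the real driver classifies firmware opcodes rather than using an enum like this.

```c
#include <errno.h>

/* Hypothetical command classes, for illustration only. */
enum example_cmd_class {
	EXAMPLE_CMD_CREATE,
	EXAMPLE_CMD_QUERY,
	EXAMPLE_CMD_MODIFY,
	EXAMPLE_CMD_DESTROY,
};

/*
 * Result reported for a new command once the device is in error state:
 * modify and remove/cleanup commands report success,
 * create and query commands report failure.
 */
static int
example_cmd_result_in_error_state(enum example_cmd_class c)
{
	switch (c) {
	case EXAMPLE_CMD_MODIFY:
	case EXAMPLE_CMD_DESTROY:
		return (0);
	default:
		return (-EIO);	/* assumed error code */
	}
}
```

Reporting success for cleanup commands presumably lets teardown paths run to completion even though the hardware is gone.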

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:29:33 +00:00
Hans Petter Selasky
44a03e91f3 Wait for all VFs pages to be reclaimed before closing EQ pages.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:19:06 +00:00
Hans Petter Selasky
310804912a Rename struct fw_page into struct mlx5_fw_page as a preparation step
for adding busdma support.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:03:58 +00:00
Hans Petter Selasky
f361e561a5 Fix command completion with callback scenario.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:56:03 +00:00
Hans Petter Selasky
8e7e0ce110 Minor code refactor as a preparation step for surprise removal of CX-4
PCI device(s), changes:
- alloc_entry() now clears bit for page slot entry as well
- update of cmd->ent_arr[] is now under cmd->alloc_lock
- complete command if alloc_entry() fails

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:47:53 +00:00
Hans Petter Selasky
d0ce5a0da7 Use ffs() to scan for first bit instead of using a for() loop.
Minor code refactor while at it.
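
For illustration, a minimal sketch of replacing a bit-scanning loop with ffs(), using the POSIX ffs() from <strings.h> as a stand-in for the kernel's version. The function name is hypothetical.

```c
#include <strings.h>

/*
 * Returns the zero-based index of the lowest set bit in mask,
 * or -1 when no bit is set. ffs() itself is one-based and
 * returns 0 for an empty mask.
 */
static int
example_first_set_bit(unsigned int mask)
{
	return (ffs((int)mask) - 1);
}
```

For a mask of 0x8 this yields index 3, and for an empty mask it yields -1.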

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:36:49 +00:00
Hans Petter Selasky
115bc9b1d3 Make fw_pages statistics counter 64-bit to avoid overflow.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:20:38 +00:00
Hans Petter Selasky
66d53750b9 Add support for reading advanced diagnostic counters.
By default reading the diagnostic counters is disabled. The firmware
decides which counters are supported and only those supported show up
in the dev.mce.X.diagnostics sysctl tree.

To enable reading of diagnostic counters set one or more of the
following sysctls to one:

dev.mce.X.conf.diag_general_enable=1
dev.mce.X.conf.diag_pci_enable=1

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:03:50 +00:00
Hans Petter Selasky
5e6a76be8a Enforce reading the consumer and producer counters once to ensure
consistent return values from the mlx5e_sq_has_room_for()
function. The two counters are incremented by different threads under
different locks.
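
A sketch of the "read once" pattern, under assumed names: each counter is copied into a local exactly once and the decision is made on that snapshot, so the two halves of the expression cannot see different values. The structure layout and the room formula are illustrative and may differ from the actual mlx5e_sq_has_room_for().

```c
#include <stdint.h>

/* Hypothetical send queue state; field names are illustrative. */
struct example_sq {
	volatile uint16_t cc;	/* consumer counter, updated by one thread */
	volatile uint16_t pc;	/* producer counter, updated by another */
	uint16_t sz_m1;		/* ring size minus one, ring size is a power of two */
};

static inline int
example_sq_has_room_for(struct example_sq *sq, uint16_t n)
{
	/* Snapshot both counters once; all later uses see the same values. */
	uint16_t cc = sq->cc;
	uint16_t pc = sq->pc;

	return ((((sq->sz_m1 - pc + cc) & sq->sz_m1) >= n) || (cc == pc));
}
```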

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 08:32:50 +00:00