419 Commits

Author SHA1 Message Date
Hans Petter Selasky
a2485fe5a6 Updates for PCI and health monitor recovery in mlx5core.
This patch accumulates the following Linux commits:

mlx5_health.c
- 78ccb25861d76a8fc5c678d762180e6918834200
  mlx5_core: Fix wrong name in struct
- 171bb2c560f45c0427ca3776a4c8f4e26e559400
  mlx5_core: Update health syndromes
- 0144a95e2ad53a40c62148f44fb0c1f9d2a0d1e9
  mlx5_core: Use accessor functions to read from device memory
- ac6ea6e81a80172612e0c9ef93720f371b198918
  mlx5_core: Use private health thread for each device
- fd76ee4da55abb21babfc69310d321b9cb9a32e0
  mlx5_core: Fix internal error detection conditions
- 2241007b3d783cbdbaa78c30bdb1994278b6f9b9
  mlx5: Clear health sick bit when starting health poll
- 712bfef60912d91033cb25739f7444d5b8d8c59f
  mlx5: Fix version printout in case of health issue
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver

mlx5_cmd.c
- be87544de8df2b1eb34bcb5e32691287d96f9ec4
  mlx5_core: Fix async commands return code
- a31208b1e11df334d443ec8cace7636150bb8ce2
  mlx5_core: New init and exit flow for mlx5_core
- 020446e01eebc9dbe7eda038e570ab9c7ab13586
  mlx5_core: Prepare cmd interface to system errors handling
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver
- 0d834442cc247c7b3f3bd6019512ae03e96dd99a
  mlx5: Fix teardown errors that happen in pci error handler

mlx5_main.c
- 5fc7197d3a256d9c5de3134870304b24892a4908
  mlx5: Add pci shutdown callback

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:47:09 +00:00
Hans Petter Selasky
2e9c3a4f99 Implement priority to traffic class mapping in mlx5core.
Add support for mapping priority to traffic class via sysctl

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:23:07 +00:00
Hans Petter Selasky
cfc9c386eb Implement rate limit per traffic class in mlx5core.
Add support for rate limiting traffic class via sysctl.

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:17:36 +00:00
Hans Petter Selasky
d91421515a Implement missing query for current port rate in mlx5ib(4).
- Factor out port speed definitions into new port.h header file,
  similarly as done in Linux upstream.
- Correct two existing port speed definitions in mlx5en according to
  Linux upstream.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:03:11 +00:00
Hans Petter Selasky
ecb4fcc48e Add log message for unsupported QSFPs in mlx5core.
Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:51:50 +00:00
Hans Petter Selasky
2289559114 Make sure default VNET is set when adding a new interface in mlx5core.
Adding an interface might be done outside the device_attach() routine
and will then cause a panic, due to the VNET not being set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:49:27 +00:00
Hans Petter Selasky
11546d068d Add timeout handle to commands with callback in mlx5core.
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get firmware response. Add delayed callback timeout work
before posting the command to firmware. In case of real firmware
command completion we will cancel the delayed work. In case of
firmware command timeout the callback timeout handler will be called
and it will simulate firmware completion with timeout error.
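
The idea above can be sketched in plain C as follows. This is a single-threaded illustration only: the struct, the function names and the choice of ETIMEDOUT are assumptions made for the example, and the real driver uses delayed work and locking that are not shown here. Whichever of the two events arrives first completes the command; the later one is ignored.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command with a completion callback. */
struct cmd {
	bool done;                           /* set once the command completed */
	int status;                          /* 0 or a negative errno value */
	void (*cb)(int status, void *arg);   /* optional completion callback */
	void *arg;
};

/* Complete the command exactly once; later calls are ignored. */
static void
cmd_complete(struct cmd *c, int status)
{
	if (c->done)
		return;
	c->done = true;
	c->status = status;
	if (c->cb != NULL)
		c->cb(status, c->arg);
}

/* Real firmware completion path. */
static void
cmd_fw_completion(struct cmd *c)
{
	cmd_complete(c, 0);
}

/* Timeout path: simulate a firmware completion with a timeout error. */
static void
cmd_timeout_handler(struct cmd *c)
{
	cmd_complete(c, -ETIMEDOUT);
}
```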

linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:41:29 +00:00
Hans Petter Selasky
2327a753b9 Fix potential deadlock in command mode change in mlx5core.
Call command completion handler in case of timeout when working in
interrupts mode. Avoid flushing the commands workqueue after acquiring
the semaphores to prevent a potential deadlock.

linux commit 9cba4ebcf374c3772f6eb61f2d065294b2451b49

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:35:28 +00:00
Hans Petter Selasky
ee1b5c5811 Use a macro in mlx5_command_str() instead of copying OP name.
linux commit 42ca502e179d0654ef441333a9d0f35c948734f3

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:29:30 +00:00
Hans Petter Selasky
2f2d3c0cf3 Disable unsupported disassociate ucontext functionality in mlx5ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:03:31 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into an L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example, information like VLAN ID and SCOPE ID is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Hans Petter Selasky
1cbc85fd04 Move the mlx5 core device pointer first in the mlx5en priv. This helps simplify
checks to recognize own network devices when using mlx5ib. This patch fixes
an issue where mlx5ib fails to recognize mceX network devices for use with
RoCE.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-30 12:38:06 +00:00
Eitan Adler
02ca39cff2 sys/dev/mlx[45]: fix uses of 1 << 31
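
The class of problem behind this fix can probably be illustrated as below: on platforms with a 32-bit int, `1 << 31` shifts into the sign bit of a signed value, which is undefined behaviour in C, while an unsigned constant is well defined. The macro and function names are hypothetical and do not come from the driver.

```c
#include <stdint.h>

/* Hypothetical flag name, for illustration only. */
/* #define FLAG_TOP_BAD (1 << 31)    undefined: overflows a 32-bit int */
#define FLAG_TOP (1U << 31)       /* well defined: unsigned shift */

/* Return nonzero when the top bit of a 32-bit register value is set. */
static int
top_bit_is_set(uint32_t reg)
{
	return ((reg & FLAG_TOP) != 0);
}
```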
Reviewed by:		kib (D13858)
2018-01-12 06:36:44 +00:00
Konstantin Belousov
e44f4f3547 mlx5en: Avoid SFENCE on x86
The IA32 memory model guarantees that all writes are seen in the program
order.  Also, any access to the uncacheable memory flushes the store
buffers.  As a consequence, the SFENCE instruction is (almost) never needed,
in particular, it is not needed to ensure the correct order of updates as
seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() to only emit compiler barriers
on x86 there.  Other architectures get the right barrier instruction as
well.
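
A rough userland analogue of this pattern, written with C11 atomics rather than the kernel's atomic_thread_fence_rel(), might look like the sketch below. The ring structure and the doorbell field are hypothetical stand-ins; the point is only that a release fence sits between filling the descriptor and ringing the doorbell, and that on x86 such a fence needs no SFENCE instruction.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical ring state; not the driver's real structures. */
struct ring {
	uint64_t desc[8];            /* descriptors read by the device */
	volatile uint32_t doorbell;  /* stands in for a device register */
};

/* Publish one descriptor, then ring the doorbell. */
static void
ring_post(struct ring *r, uint32_t idx, uint64_t d)
{
	r->desc[idx % 8] = d;
	/* Release fence: the store above is ordered before the one below. */
	atomic_thread_fence(memory_order_release);
	r->doorbell = idx + 1;
}
```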

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-12-19 14:11:41 +00:00
Konstantin Belousov
ef23f141bc Implement hardware mlx5(4) rx timestamps.
Driver support is only provided for ConnectX4/5.

System-time timestamp is calculated based on the free-running counter
timestamp provided by hardware.  Driver periodically samples the
counter to calibrate it against the system clock and uses linear
interpolation to convert.  Stability of the crystal which drives the
clock is +-50 ppm at the operational temperature, which makes the
algorithm good enough.

The calculation is somewhat delicate because all values are 64bit and
overflow the naive formula for linear interpolation.  The calculation
drops the least significant bits in advance, see the PREC shift in
mlx5_mbuf_tstmp().
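
A minimal sketch of the interpolation described above, assuming two calibration samples and a hypothetical precision shift of 10 bits. The struct, the names and the shift value are illustrative and are not taken from mlx5_mbuf_tstmp(). Dropping the low bits of each 64-bit operand before multiplying keeps the product from overflowing, at the cost of some resolution.

```c
#include <stdint.h>

#define PREC_SHIFT 10 /* hypothetical number of low bits to drop */

/* Two calibration samples: hardware counter and matching system time. */
struct calib {
	uint64_t hw_prev, hw_curr;   /* free-running counter values */
	uint64_t ns_prev, ns_curr;   /* system time in nanoseconds */
};

/* Convert a hardware counter value to nanoseconds by linear interpolation. */
static uint64_t
hw_to_ns(const struct calib *c, uint64_t hw)
{
	uint64_t hw_span = (c->hw_curr - c->hw_prev) >> PREC_SHIFT;
	uint64_t ns_span = (c->ns_curr - c->ns_prev) >> PREC_SHIFT;
	uint64_t hw_off = (hw - c->hw_prev) >> PREC_SHIFT;

	if (hw_span == 0)
		return (c->ns_prev);   /* samples too close to interpolate */
	/* Scale back up by the bits dropped from one of the operands. */
	return (c->ns_prev + (((hw_off * ns_span) / hw_span) << PREC_SHIFT));
}
```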

Hardware stamps can be turned off by 'ifconfig mceN -hwrxtstmp'.  Buggy
firmware might result in small but visible errors in the reported
timestamps, detectable e.g. by nonsensical (negative) RTT values for
LAN pings.

Reviewed by:	gallatin, hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D12638
2017-11-29 10:04:11 +00:00
Hans Petter Selasky
c3125bc5bf Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
Hans Petter Selasky
937d37fc6c Merge ^/head r325842 through r325998. 2017-11-19 12:36:03 +00:00
Hans Petter Selasky
b108f35740 Remove duplicate static function prototype to fix compilation of
mlx5_fs_tree.c after r325638 when using GCC.

Found by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-18 20:32:09 +00:00
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
Hans Petter Selasky
dd00abf2d7 Make sure the ib_wr_opcode enum is signed by adding a negative dummy element.
Different compilers may optimise the enum type in different ways. This ensures
coherency when range checking the value of enums in ibcore.
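
The technique can be sketched as follows. The enum below is hypothetical and much shorter than the real ib_wr_opcode; it only shows how a negative dummy member forces a signed type, so that a lower-bound check behaves the same way with every compiler.

```c
/* Hypothetical opcode enum; the real ib_wr_opcode has other members. */
enum wr_opcode {
	WR_DUMMY = -1,   /* negative member forces a signed type */
	WR_WRITE = 0,
	WR_SEND,
	WR_READ,
	WR_MAX
};

/* Range check that rejects negative values as well as too large ones. */
static int
wr_opcode_is_valid(enum wr_opcode op)
{
	return (op >= WR_WRITE && op < WR_MAX);
}
```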

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:51:37 +00:00
Hans Petter Selasky
059ecd56d0 The new mlx5ib(4) module requires some existing values to be redefined.
Sponsored by:	Mellanox Technologies
2017-11-10 15:28:17 +00:00
Hans Petter Selasky
8e6e287f8d Update mlx5ib(4) to match Linux 4.9 and the new ibcore APIs.
Sponsored by:	Mellanox Technologies
2017-11-10 15:02:17 +00:00
Hans Petter Selasky
4b109912f1 Add more and update existing mlx5 core firmware structure definitions and bits.
This change is part of coming ibcore and mlx5ib updates.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:39:03 +00:00
Hans Petter Selasky
53d7bb46d5 Expose the current hardware MTU in mlx5en(4) as a separate entry
in the sysctl tree.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:19:22 +00:00
Hans Petter Selasky
61fd7ac087 Add support for configuring local multicast and unicast data traffic loopback
in mlx5en(4) driver via the sysctl interface.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:14:54 +00:00
Hans Petter Selasky
bb3616ab20 Add support for disabling and enabling RX and TX DMA rings in mlx5en(4).
This is useful for supporting setups similar to Netmap.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:10:41 +00:00
Hans Petter Selasky
b35a986d25 Make physical address of init segment available in the priv of mlx5 core.
This change is needed by mlx5ib(4).

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:02:12 +00:00
Hans Petter Selasky
f6923226eb Add API function to query port performance counters for infiniband and RoCE
traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:58:49 +00:00
Hans Petter Selasky
f4554f7830 Add API functions to query and modify local loopback of multicast and
unicast traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:56:11 +00:00
Hans Petter Selasky
2fd90b8297 Add API function to query virtual port counters in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:53:53 +00:00
Hans Petter Selasky
0b3ebe412e Add API functions to modify the transport interface send object, TIS,
in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:50:08 +00:00
Hans Petter Selasky
27c29bc44b Add API functions to set and query dropless port mode in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:44:12 +00:00
Hans Petter Selasky
0e4248a114 Prevent mlx5 core from accessing host memory after shutdown by disabling
PCI busmaster.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:40:27 +00:00
Hans Petter Selasky
197563c294 Set ATOMIC endian mode in mlx5 core.
The hardware is capable of 2 requestor endianness modes for standard 8
byte atomics: BE (0x0) and host endianness (0x1). Read the supported
modes from hca atomic capabilities and configure HW to host endianness
mode if supported.
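
As a rough illustration of the selection logic, assuming the capability field is a bitmask in which bit N set means mode N is supported (an assumption made for this sketch, along with the names used):

```c
#include <stdint.h>

#define ATOMIC_MODE_BE   0x0  /* big endian, per the commit message */
#define ATOMIC_MODE_HOST 0x1  /* host endianness, per the commit message */

/*
 * Pick the requestor endianness mode. "supported" is assumed to be a
 * bitmask in which bit N set means mode N is available.
 */
static unsigned
pick_atomic_mode(uint32_t supported)
{
	if (supported & (1U << ATOMIC_MODE_HOST))
		return (ATOMIC_MODE_HOST);
	return (ATOMIC_MODE_BE);
}
```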

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:38:43 +00:00
Hans Petter Selasky
500d0c409e Add const keyword to input-only argument in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:30:14 +00:00
Hans Petter Selasky
a4d6b00747 Make local variable 64-bits to avoid masking away bits in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:28:23 +00:00
Hans Petter Selasky
6c7057f7ba Implement support for decoding general port notification event in
the mlx5 core module.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:25:29 +00:00
Hans Petter Selasky
5a93b4cd52 Refactor the flowsteering APIs used by mlx5en(4). This change is needed by
the coming ibcore and mlx5ib updates in order to support traffic redirection
to so-called raw ethernet QPs.

Remove unused E-switch related routines and files while at it.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 09:49:08 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device is present. This resolves
interoperability issues between iWarp and RoCE in ibcore.

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Hans Petter Selasky
3cd4c11ab2 Use common rdma_ip2gid() function instead of custom mlx5_ip2gid() one.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-10-10 12:24:52 +00:00
Hans Petter Selasky
e5d6b589ce Make sure the doorbell lock is valid for the i386 version
of the mlx5en(4) driver.

Tested by:		gallatin @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-10-02 12:20:55 +00:00
Hans Petter Selasky
8ac4c959ab Compile fixes for LINT on 32-bit platforms.
MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-24 08:09:42 +00:00
Hans Petter Selasky
1251590741 Add new mlx5ib(4) driver to the kernel source tree which supports
Remote DMA over Converged Ethernet, RoCE, for the ConnectX-4 series of
PCI express network cards.

There is currently no user-space support and this driver only supports
kernel side non-routable RoCE V1. The krping kernel module can be used
to test this driver. Full user-space support including RoCE V2 will be
added as part of the ongoing upgrade to ibcore from Linux 4.9. Otherwise
this driver is feature equivalent to mlx4ib(4). The mlx5ib(4) kernel
module will only be built when WITH_OFED=YES is specified.

MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-23 12:09:37 +00:00
Hans Petter Selasky
8508e4d730 Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:49:36 +00:00
Hans Petter Selasky
869dd4b498 Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by:		gallatin@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:36:57 +00:00
Hans Petter Selasky
713dd5cb9e Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the DMA tag they belong to and do not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
Hans Petter Selasky
1d4905b5b0 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
Mark Johnston
c73cdca2c4 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
Hans Petter Selasky
8b48354659 Improve sysadmin visibility of physical port error counters in the
mlx5en driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-04-28 19:38:57 +00:00
Hans Petter Selasky
eac79e7755 Make "desc" pointer non-constant inside the mlx5_core_diagnostics_entry
structure. This fixes compilation with amd64-xtoolchain-gcc.

PR:			216588
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-30 08:35:15 +00:00
Hans Petter Selasky
1c807f6795 Use the busdma API to allocate all DMA-able memory.
The MLX5 driver has four different types of DMA allocations which are
now allocated using busdma:

1) The 4K firmware DMA-able blocks. One busdma object per 4K allocation.
2) Data for firmware commands use the 4K firmware blocks split into four 1K blocks.
3) The 4K firmware blocks are also used for doorbell pages.
4) The RQ-, SQ- and CQ- DMA rings. One busdma object per allocation.
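
Item 2 above, splitting one 4K firmware block into four 1K command blocks, could be tracked with a small bitmask as in the sketch below. The structure and function names are hypothetical, and the busdma mapping itself is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define FW_PAGE_SIZE    4096
#define CMD_BLOCK_SIZE  1024
#define BLOCKS_PER_PAGE (FW_PAGE_SIZE / CMD_BLOCK_SIZE)

/* Hypothetical bookkeeping for one 4K page; not the driver's structure. */
struct fw_page {
	uint8_t data[FW_PAGE_SIZE];
	unsigned used_mask;   /* bit N set: 1K block N is in use */
};

/* Hand out a free 1K block, or NULL when all four are taken. */
static void *
block_alloc(struct fw_page *p)
{
	for (unsigned i = 0; i < BLOCKS_PER_PAGE; i++) {
		if ((p->used_mask & (1U << i)) == 0) {
			p->used_mask |= 1U << i;
			return (p->data + i * CMD_BLOCK_SIZE);
		}
	}
	return (NULL);
}

/* Return a block previously obtained from block_alloc(). */
static void
block_free(struct fw_page *p, void *blk)
{
	size_t i = (size_t)((uint8_t *)blk - p->data) / CMD_BLOCK_SIZE;

	p->used_mask &= ~(1U << i);
}
```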

After this patch the mlx5en driver can be used with DMAR enabled in
the FreeBSD kernel.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:46:55 +00:00
Hans Petter Selasky
30dfc0518a Add support for device surprise removal and other PCI errors.
- When device disappears from PCI indicate error device state and:
  1) Trigger command completion for all pending commands
  2) Prevent new commands from executing and return:
     - success for modify and remove/cleanup commands
     - failure for create/query commands
  3) When reclaiming pages for a device in error state don't ask FW to
     return all given pages, just release the allocated memory
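
The policy in step 2 can be sketched as a small decision function. The command classes and the EIO error code are assumptions for this example; the driver itself decides based on firmware opcodes.

```c
#include <errno.h>

/* Hypothetical command classes; the real driver switches on opcodes. */
enum cmd_class { CMD_CREATE, CMD_QUERY, CMD_MODIFY, CMD_DESTROY };

/*
 * Status of a command issued while the device is in the error state:
 * modify and teardown commands report success so cleanup can finish,
 * create and query commands fail.
 */
static int
cmd_status_in_error_state(enum cmd_class c)
{
	switch (c) {
	case CMD_MODIFY:
	case CMD_DESTROY:
		return (0);
	default:
		return (-EIO);
	}
}
```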

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:29:33 +00:00
Hans Petter Selasky
44a03e91f3 Wait for all VFs pages to be reclaimed before closing EQ pages.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:19:06 +00:00
Hans Petter Selasky
310804912a Rename struct fw_page into struct mlx5_fw_page as a preparation step
for adding busdma support.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:03:58 +00:00
Hans Petter Selasky
f361e561a5 Fix command completion with callback scenario.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:56:03 +00:00
Hans Petter Selasky
8e7e0ce110 Minor code refactor as a preparation step for surprise removal of CX-4
PCI device(s), changes:
- alloc_entry() now clears bit for page slot entry as well
- update of cmd->ent_arr[] is now under cmd->alloc_lock
- complete command if alloc_entry() fails

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:47:53 +00:00
Hans Petter Selasky
d0ce5a0da7 Use ffs() to scan for first bit instead of using a for() loop.
Minor code refactor while at it.
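
For illustration, a loop that tests each bit in turn can be replaced by a single ffs() call along these lines. The wrapper name is hypothetical; ffs() itself is the POSIX function declared in <strings.h>.

```c
#include <strings.h>

/*
 * Return the index of the lowest set bit in "mask", or -1 if none.
 * ffs() is 1-based and returns 0 for an empty mask.
 */
static int
first_set_bit(unsigned mask)
{
	return (ffs((int)mask) - 1);
}
```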

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:36:49 +00:00
Hans Petter Selasky
115bc9b1d3 Make fw_pages statistics counter 64-bit to avoid overflow.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:20:38 +00:00
Hans Petter Selasky
66d53750b9 Add support for reading advanced diagnostic counters.
By default reading the diagnostic counters is disabled. The firmware
decides which counters are supported and only those supported show up
in the dev.mce.X.diagnostics sysctl tree.

To enable reading of diagnostic counters set one or more of the
following sysctls to one:

dev.mce.X.conf.diag_general_enable=1
dev.mce.X.conf.diag_pci_enable=1

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:03:50 +00:00
Hans Petter Selasky
5e6a76be8a Enforce reading the consumer and producer counters once to ensure
consistent return values from the mlx5e_sq_has_room_for()
function. The two counters are incremented by different threads under
different locks.
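
A sketch of the read-once pattern, assuming a hypothetical queue layout with 16-bit wrapping counters. It is not a copy of mlx5e_sq_has_room_for(). Each counter is copied into a local exactly once, so the comparison works on one consistent snapshot even if another thread updates the counters meanwhile.

```c
#include <stdint.h>

/* Hypothetical queue; "pc" and "cc" are updated by different threads. */
struct sq {
	volatile uint16_t pc;   /* producer counter, wraps at 65536 */
	volatile uint16_t cc;   /* consumer counter, wraps at 65536 */
	uint16_t size;          /* number of slots in the ring */
};

/* Check for "n" free slots using a single snapshot of each counter. */
static int
sq_has_room_for(const struct sq *sq, uint16_t n)
{
	uint16_t cc = sq->cc;   /* read exactly once */
	uint16_t pc = sq->pc;   /* read exactly once */
	uint16_t used = (uint16_t)(pc - cc);

	return ((uint16_t)(sq->size - used) >= n);
}
```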

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 08:32:50 +00:00
Hans Petter Selasky
e16c241deb Remove superfluous return statement.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 15:47:29 +00:00
Hans Petter Selasky
b98ba64027 Allow transmit packet bufring in software to be disabled.
- Add new sysctl node to control the transmit packet bufring.

- Add optimised version of the transmit routine which outputs packets
directly to the DMA ring instead of using bufring in case the transmit
lock is congested. This can reduce the number of task switches, which in
turn influences the overall system CPU usage, depending on the
workload.

- Add " TX" suffix to debug name for transmit mutexes to silence some
witness warnings about acquiring duplicate locks having same name.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
Suggested by:		gallatin @
2017-01-20 15:45:21 +00:00
Hans Petter Selasky
3dfa7645c5 Make draining a sendqueue more robust.
Add own state variable to track if a sendqueue is stopped or not.
This will prevent traffic from entering the sendqueue while it is
being destroyed.

Update drain function to wait for traffic to be transmitted before
returning when the link state is active.

Add extra checks in transmit path for stopped SQ's.

While at it:
- Use likely() for a mbuf pointer check.
- Remove redundant IFF_DRV_RUNNING check.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 12:02:40 +00:00
Hans Petter Selasky
d2bf00a918 Add runtime support for modifying the SQ and RQ completion event
moderation mode. The presence of this feature is indicated through the
firmware capabilities.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 11:11:49 +00:00
Hans Petter Selasky
0402eb6bcc Update firmware interface structures and definitions adding support
for new features and commands.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 10:47:32 +00:00
Hans Petter Selasky
41aa095b2f Make a read only pointer constant.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:12:19 +00:00
Hans Petter Selasky
436659135a Add more comments regarding collection of statistics counters.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:11:03 +00:00
Hans Petter Selasky
de83258d59 Remove useless NULL checks.
NULL is not returned when allocating memory passing the M_WAITOK flag.

Submitted by:		trasz @
Differential Revision:  https://reviews.freebsd.org/D5772
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-02 09:41:54 +00:00
Hans Petter Selasky
6f4cab6cc3 Add timer to watch the RQ when we are out of mbufs.
The firmware/hardware does not generate additional completion
events unless we post new buffers. Use a timer to try to post
more buffers in case we are temporarily out of mbufs. Else
the receive schedule completely stops.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:39:45 +00:00
Hans Petter Selasky
cb02244355 Add more firmware related structures and update existing ones in the
MLX5 core module. Update the set and query diagnostics counter API.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:28:50 +00:00
Hans Petter Selasky
627ef61aab Query flow table capabilities according to the correct capability bit
for infiniband.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:26:25 +00:00
Hans Petter Selasky
adea303c2a Correct checksum fields in the "mlx5_mini_cqe8" structure. The fields
in question are currently not used.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:22:50 +00:00
Hans Petter Selasky
91951e3978 Ensure the firmware is notified of any host memory allocation
failures. Else firmware commands may time out waiting for host
memory.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:20:13 +00:00
Hans Petter Selasky
97ac390861 When a firmware command times out do not free the command structure to
avoid use after free.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:15:40 +00:00
Hans Petter Selasky
478c1a9932 Set hardware stats flag to avoid double counting the number of incoming bytes.
Found by:	Ben RUBSON <ben.rubson@gmail.com>
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-29 16:35:52 +00:00
Hans Petter Selasky
a2c320d7c7 mlx5en: Fix duplicate mbuf free-by-code.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:57:48 +00:00
Hans Petter Selasky
14997cc16a mlx5en: Remove unused pdev pointer.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:55:38 +00:00
Hans Petter Selasky
f5344e8333 mlx5en: Verify port type is ethernet before creating network device
Else the mlx5en driver might attach to infiniband ports.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:53:53 +00:00
Hans Petter Selasky
431fe47416 mlx5en: Allow setting the software MTU size below 1500 bytes
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.
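
The resulting mismatch might be pictured as below. This is a simplified assumption: the helper name is hypothetical, and any header overhead that the driver adds when programming the hardware MTU is ignored.

```c
#define HW_MTU_MIN 1500 /* lower bound stated in the commit message */

/* Hardware MTU to program for a requested software MTU. */
static unsigned
hw_mtu_for(unsigned sw_mtu)
{
	return (sw_mtu < HW_MTU_MIN ? HW_MTU_MIN : sw_mtu);
}
```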

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:51:31 +00:00
Hans Petter Selasky
7b4e6e4ac9 mlx5en: Factor out common sendqueue code for use with rate limiting SQs.
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:47:16 +00:00
Hans Petter Selasky
81b3cdc1bb mlx5en: Properly declare doorbell lock for 32-bit CPUs.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:45:35 +00:00
Hans Petter Selasky
5eadc44ceb mlx5en: Optimise away duplicate UAR pointers.
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:40:45 +00:00
Hans Petter Selasky
28f22ccea3 mlx5en: Make the mlx5e_open_cq() and mlx5e_close_cq() functions global.
Make some functions and structures global to allow for code reuse
when creating rate limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:39:15 +00:00
Hans Petter Selasky
941cd5d1a4 mlx5en: Minor completion queue control path code refactor.
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:37:35 +00:00
Hans Petter Selasky
98626886ee mlx5en: Separate the sendqueue from using the mlx5e_channel structure.
This change allows for reusing the transmit path for so called
rate limited sendqueues. While at it optimise some pointer lookups
in the fast path.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:35:45 +00:00
Hans Petter Selasky
cb4e4a6ed6 Update the MLX5 core module:
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:28:16 +00:00
Hans Petter Selasky
351a9c7c0b Increase the maximum RX/TX queue size. This allows for a RX/TX queue
size of 16384 mbufs. Previously the limit was 8192.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-22 13:43:25 +00:00
Hans Petter Selasky
5f9e5b5e62 Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. Only when
the parent mlx5 module is unloaded the bsddev is destroyed. This fixes
a panic on listing sysctls which refer to strings in the bsddev after the
mlx5en module has been unloaded.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-09 07:43:15 +00:00
Hans Petter Selasky
57d5dd7907 Switch to the new block based LRO input function for the mlx5en
driver. This change significantly increases the overall RX aggregation
ratio for heavily loaded networks handling 10-80 thousand simultaneous
connections.

Remove the turbo LRO code and all references to it, which have now been
superseded by the tcp_lro_queue_mbuf() function.

Tested by:	Netflix
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-08 16:22:16 +00:00
Hans Petter Selasky
5b0c29decf Use correct Q-counter output array.
Sponsored by:	Mellanox Technologies
Approved by:	re (kib)
MFC after:	3 days
2016-06-23 09:23:37 +00:00
Hans Petter Selasky
76a5241f2c Add SR-IOV guest support to the mlx5en driver.
This patch adds the missing pieces needed for device setup using the
mlx5en driver inside a virtual machine which is providing hardware
access through SR-IOV.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-06-07 13:58:52 +00:00
Sepherosa Ziehau
36ad8372d4 net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties
Reviewed by:	hps, erj, tuexen
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6688
2016-06-07 04:51:50 +00:00
Hans Petter Selasky
fa201e28fc Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
Hans Petter Selasky
82d2623e5a Verify one sysctl parameter at a time. When a mlx5en sysctl parameter
is updated only verify the changed one instead of all.

No functional change.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 07:07:27 +00:00
Hans Petter Selasky
af89c4aff6 Optimise use of doorbell and remove redundant NOPs
Store the last doorbell write in the mlx5e_sq structure and write the
doorbell to the hardware when the transmit routine finishes
transmitting all queued mbufs.
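
One way to picture the deferred doorbell write, using hypothetical names and a plain variable standing in for the hardware register:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical deferred doorbell state; not the mlx5e_sq layout. */
struct dbq {
	uint32_t last_value;       /* most recent doorbell value recorded */
	bool pending;              /* true while last_value is unwritten */
	volatile uint32_t *reg;    /* stands in for the hardware doorbell */
	unsigned writes;           /* register writes, for illustration */
};

/* Called per queued packet: remember the value, do not touch hardware. */
static void
dbq_record(struct dbq *q, uint32_t value)
{
	q->last_value = value;
	q->pending = true;
}

/* Called once when the transmit routine finishes: one register write. */
static void
dbq_flush(struct dbq *q)
{
	if (!q->pending)
		return;
	*q->reg = q->last_value;
	q->writes++;
	q->pending = false;
}
```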

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:59:38 +00:00
Hans Petter Selasky
376bcf6331 Implement TX completion event interleaving.
This patch implements a sysctl which allows setting a factor, N, for
how many work queue elements can be generated before requiring a
completion event. When a completion event happens the code simulates N
completion events instead of only one. When draining a transmit queue,
N-1 NOPs are transmitted at most, to force generation of the final
completion event.  Further a timer is running every HZ ticks to flush
any remaining data off the transmit queue when the tx_completion_fact
> 1.

The goal of this feature is to reduce the PCI bandwidth needed when
transmitting data.
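
The counting logic behind the factor N can be sketched as follows. The struct and function names are hypothetical, and the periodic flush timer is left out. With N elements per completion event, a drain never needs more than N-1 NOPs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-queue state for completion event interleaving. */
struct txq {
	uint32_t factor;    /* N: work queue elements per completion event */
	uint32_t pending;   /* elements posted since the last request */
};

/* Called per posted element; true when it should request a completion. */
static bool
txq_wants_completion(struct txq *q)
{
	if (++q->pending >= q->factor) {
		q->pending = 0;
		return (true);
	}
	return (false);
}

/* NOPs needed while draining to force out the final completion event. */
static uint32_t
txq_drain_nops(const struct txq *q)
{
	return (q->pending == 0 ? 0 : q->factor - q->pending);
}
```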

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:54:58 +00:00
Hans Petter Selasky
83c5d190fe Correct some error codes to native FreeBSD ones.
Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:01:06 +00:00
Hans Petter Selasky
21dd652701 Add function to detect the presence of a port module and use this
function to error out early when EEPROM access is attempted and no
port module is present. This also prevents error messages from filling
up dmesg.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:00:12 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
Hans Petter Selasky
d7633a3070 Fix an issue where the network adapter could be left in down state
after changing the HW LRO sysctl when previously in up state.

Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4941
2016-01-19 10:24:47 +00:00
Hans Petter Selasky
636d1fec4d Add clarifying comment about CQE zipping.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4940
2016-01-19 10:19:33 +00:00
Hans Petter Selasky
1558d49bb1 Declare local variables at top of function.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4939
2016-01-19 10:17:24 +00:00
Hans Petter Selasky
4d3b91a762 Allow RX and TX pause frames to be set through ifconfig.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4817
2016-01-19 10:10:02 +00:00
Hans Petter Selasky
f03f517b5e Add support for modifying coalescing parameters at runtime.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-12-30 15:01:47 +00:00
Hans Petter Selasky
4fbd91a5af Allow I2C to read address 0x51 as well as address 0x50.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:58:55 +00:00
Hans Petter Selasky
4f18ce8ae0 10G ER/LR should present itself as LR.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:54:08 +00:00
Hans Petter Selasky
90cc1c7724 Add support for CQE zipping. CQE zipping reduces PCI overhead by
coalescing and zipping multiple CQEs into a single merged CQE. The
feature is enabled by default and can be disabled by a sysctl.

To implement this feature, mlx5_cqwq_pop() has been separated from
mlx5e_get_cqe().

MFC after:	1 week
Submitted by:	Mark Bloch <markb@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4598
Sponsored by:	Mellanox Technologies
2015-12-28 18:50:18 +00:00
Hans Petter Selasky
ec0143b260 Add support for sysctl tunables to 10-stable and older. Pushed through
head first to simplify driver maintenance.

MFC after:	1 week
Submitted by:	Drew Gallatin <gallatin@freebsd.org>
Differential Revision:	https://reviews.freebsd.org/D4552
Sponsored by:	Mellanox Technologies
2015-12-28 18:36:00 +00:00
Hans Petter Selasky
ee41fc8f8c Make the eeprom dump function more readable and rename variables for
better clarity.

MFC after:	1 week
Submitted by:	Daria Genzel <dariaz@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4551
Sponsored by:	Mellanox Technologies
2015-12-28 18:28:18 +00:00
Hans Petter Selasky
98a998d5e7 Update the mlx5 shared driver code to the latest version, which
includes the following changes:

- Added eswitch ACL table management
  Introduce an API for managing the ACL table.
  This API includes the following features:
  1) vlan filter - for VST/VGT+ support.
  2) spoofcheck.
  3) robust functionality to allow/drop general untagged/tagged traffic.
  4) support for both ingress and egress ACL types.

- Added loopback filter to the vacl table.

- Added multicast list set in the vPort context

- Added promiscuous mode set in the vPort context

- Set the vlan list in vPort context
  1) Check caps to ensure the VLAN list is not longer than the FW supports
  2) Set MODIFY_NIC_VPORT_CONTEXT command

- Changed MLX5_EEPROM_MAX_BYTES from 48 to 32 so that a single EEPROM
  read cannot cross the 128-byte boundary. Previously the MCIA
  register was read in batches of 48 bytes. The third read would then
  run past byte 127, which means that part of the low page and part of
  the high page would be read at the same time, which created a bug:
    1st: 0-47 bytes
    2nd: 48-95 bytes
    3rd: 96-143 bytes
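
The boundary arithmetic above can be checked with a short sketch. The function name and the `EEPROM_PAGE_BYTES` constant are illustrative assumptions, not identifiers from the driver; only the 48, 32 and 128 values come from the commit text.

```c
#include <assert.h>
#include <stdbool.h>

#define EEPROM_PAGE_BYTES 128	/* low page / high page boundary */

/*
 * Returns true if any read of "chunk" bytes, issued back to back from
 * offset 0 until "total" bytes are covered, spans the page boundary.
 */
static bool
chunked_read_crosses_page(unsigned chunk, unsigned total)
{
	unsigned off, last;

	assert(chunk > 0);
	for (off = 0; off < total; off += chunk) {
		last = off + chunk - 1;
		if (last >= total)
			last = total - 1;
		/* First and last byte of this read must be in the same page. */
		if (off / EEPROM_PAGE_BYTES != last / EEPROM_PAGE_BYTES)
			return (true);
	}
	return (false);
}
```

For a 256-byte range, a chunk size of 48 crosses the boundary on the third read (bytes 96-143), while a chunk size of 32 never does, because 128 is a multiple of 32.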

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4411
2015-12-07 13:16:48 +00:00
Hans Petter Selasky
278ce1c919 Add full support for Receive Side Scaling, RSS, to the mlx5en
driver. This includes binding all interrupt and worker threads
according to the RSS configuration, setting up correct Toeplitz
hashing keys as given by RSS and setting the correct mbuf
hashtype for all received traffic.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4410
2015-12-07 12:38:51 +00:00
Hans Petter Selasky
74540a3183 Add support for setting the TX moderation mode via a sysctl entry. TX
completion events can be moderated in the same way as RX completion
events. Expose this functionality through a sysctl variable.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4409
2015-12-07 11:04:50 +00:00
Hans Petter Selasky
2a5ac376e4 The firmware no longer supports setting a port MTU of zero bytes.
Instead, set the port MTU, then query it and report any problems.

MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4408
2015-12-07 10:57:42 +00:00
Hans Petter Selasky
bb3853c6bd Style changes, mostly automated.
Differential Revision:	https://reviews.freebsd.org/D4179
Submitted by:	Daria Genzel <dariaz@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:28:51 +00:00
Hans Petter Selasky
ee09079968 Accumulate the out-of-RX-buffers count into a 64-bit value and
subtract it from the number of received packets.

Differential Revision:	https://reviews.freebsd.org/D4178
Submitted by:	Drew Gallatin <gallatin@freebsd.org>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:23:10 +00:00
Hans Petter Selasky
36c1007d35 Maintain the "hw_lro" configuration variable correctly.
Setting sysctl dev....conf.hw_lro may fail if the net device lro is
turned off. Due to the nature of our sysctl handler we need to set the
values back to 0 and issue an error.

Differential Revision:	https://reviews.freebsd.org/D4177
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:18:13 +00:00
Hans Petter Selasky
7e1b8bc0c9 Print cable name, if cable type is not recognized.
Differential Revision:	https://reviews.freebsd.org/D4180
Submitted by:	Mark Bloch <markb@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:10:52 +00:00
Hans Petter Selasky
03ab395e29 Compile fix for 32-bit platforms:
- The Linux timers data field is "unsigned long".

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 09:52:37 +00:00
Hans Petter Selasky
dc7e38ac4d Add mlx5 and mlx5en driver(s) for ConnectX-4 and ConnectX-4LX cards
from Mellanox Technologies. The current driver supports ethernet
speeds up to and including 100 GBit/s. Infiniband support will be
done later.

The code added is not compiled by default, which will be done by a
separate commit.

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-10 12:20:22 +00:00