Commit Graph

349 Commits

Author SHA1 Message Date
Slava Shwartsman
8718eb63f8 mlx5: Add software tx_jumbo_packets counter
This counter will represent transmitted packets which has more than
1518 octets.
The NIC has multiple hardware counters for counting transmitted
packets larger than 1518 octets. Each counter counts the packets
in specific range.
We accumulate those counters to have a single counter.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:37 +00:00
Slava Shwartsman
feb5f357ea mlx5: Implement support for configuring PCIe packet write ordering via a sysctl.
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:08 +00:00
Slava Shwartsman
70b417cf90 mlx5: Extend vector argument to u64.
Else the MLX5_TRIGGERED_CMD_COMP flag will be masked away.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:38 +00:00
Slava Shwartsman
29e544513e mlx5: Add global control to disable firmware reset, for all mlx5 devices.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:08 +00:00
Slava Shwartsman
2119f825d1 mlx5: Fix use-after-free in self-healing flow
When the mlx5 health mechanism detects a problem while the driver
is in the middle of init_one or remove_one, the driver needs to prevent
the health mechanism from scheduling future work; if future work
is scheduled, there is a problem with use-after-free: the system WQ
tries to run the work item (which has been freed) at the scheduled
future time.

Prevent this by disabling work item scheduling in the health mechanism
when the driver is in the middle of init_one() or remove_one().

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:37 +00:00
Slava Shwartsman
8f7f07368d mlx5: Move hw.mlx5 node definition to mlx5_core.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:07 +00:00
Slava Shwartsman
63cc6d1bc2 mlx5: Convert some spaces into tabs and use device_printf() instead of printf().
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:42:36 +00:00
Slava Shwartsman
abb28d287b mlx5: Add SRQ fixes from Linux
Combine multiple fixes from Linux to SRQ.
Linux commits:
c73b791 IB/mlx5: Assign SRQ type earlier
0fd27a8 IB/mlx5: Fix out-of-bound access
c2b37f7 IB/mlx5: Fix integer overflows in mlx5_ib_create_srq
d63c467 RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error path

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:42:06 +00:00
Slava Shwartsman
3b21d18587 mlx5: Fix for potential memory leaks.
Make sure allocated data gets freed in error cases.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:41:37 +00:00
Slava Shwartsman
07b624ed71 mlx5: Discard unused return values.
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:41:06 +00:00
Slava Shwartsman
843a89d37e mlx5: Raise fatal IB event when sys error occurs
All other mlx5_events report the port number as 1 based, which is how FW
reports it in the port event EQE. Reporting 0 for this event causes
mlx5_ib to not raise a fatal event notification to registered clients
due to a seemingly invalid port.

All switch cases in mlx5_ib_event that go through the port check are
supposed to set the port now, so just do it once at variable
declaration.

Linux commit:
aba462134634b502d720e15b23154f21cfa277e5

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:40:36 +00:00
Slava Shwartsman
2bf40c3608 mlx5: Fix integer overflow while resizing CQ
The user can provide very large cqe_size which will cause to integer
overflow.

Linux commit:
28e9091e3119933c38933cb8fc48d5618eb784c8

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:40:05 +00:00
Slava Shwartsman
0c79f82cf0 mlx5: Notify user that the ConnectX-6 shutdown its port due to power limitation
If power exceed the slot limit, or slot limit is unknown the ConnectX-6
firmware will shutdown its port.
Inform the user via debug message.

MFC after:      3 days
Approved by:    hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-10-22 10:38:38 +00:00
Hans Petter Selasky
6ed134c41b Make the MSIX module parameter limit per device, in mlx5en(4).
MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:41:09 +00:00
Hans Petter Selasky
16ae32f927 Add support for receive side scaling stride, RSSS, in mlx5en(4).
The receive side scaling stride parameter is a value which define the interval
between active receive side queues. The traffic for the inactive queues is
redirected to the nearest active queue by use of modulus. The default value
of this parameter is one, which means all receive side queues are used.

The point of this feature is to redirect more traffic to fewer receive side
queues in order to take more advantage of sorted large receive offload,
sorted LRO. The sorted LRO works better when more packets are accumulated
per service interval.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:28:06 +00:00
Hans Petter Selasky
0be5034007 Don't stall transmit queue on drops in mlx5en(4).
When a transmitted packet is dropped don't stall the transmit queue.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:19:36 +00:00
Hans Petter Selasky
2d32b0a304 Maximum number of mbuf frags is off-by-one for worst case scenario in mlx5en(4).
Inspecting the PRM no more than 0x3F data segments, DS, of size 16 bytes is
allowed.

Worst case scenario summary of DS usage:
Header is fixed:	2 DS
Maximum inlining:	98 => (98 - 2) / 16 = 6 DS
Remainder:		0x3F - 2 - 6 = 55 DS (mbuf frags)

Previously a value of 56 DS was used and this would work in the
normal case because not all inline data area was used up.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 11:06:07 +00:00
Hans Petter Selasky
7b9b93a8dd Update version information for the mlx5 and mlx5en(4) modules.
While at it bump some copyright dates.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-18 10:12:53 +00:00
Hans Petter Selasky
0539900214 Do not inline transmit headers and use HW VLAN tagging if supported by mlx5en(4).
Query the minimal inline mode supported by the card.
When creating a send queue, cache the queried mode and optimize the transmit
if no inlining is required.  In this case, we can avoid touching the headers
cache line and avoid dirtying several more lines by copying headers into
the send WQEs.  Also, if no inline headers are used, hardware assists in
the VLAN tag framing.

Submitted by:		kib@, slavash@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-18 10:03:30 +00:00
Hans Petter Selasky
90c8e44125 Use a mbuf header instead of a mbuf cluster for debugging interrupts in mlx5en(4).
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:53:37 +00:00
Hans Petter Selasky
a6b2d28d05 Add module parameter to limit number of MSIX EQ vectors in mlx5en(4).
For setups having a large amount of PCI devices, it makes sense to limit the
number of MSIX vectors per PCI device, in order to avoid running out of IRQ
vectors.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:47:56 +00:00
Hans Petter Selasky
aa9f073c9b Add missing newline.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:43:43 +00:00
Hans Petter Selasky
2f17f76aa4 Handle jumbo frames without requiring big clusters in mlx5en(4).
The scatter list is formed by the chunks of MCLBYTES each, and larger
than default packets are returned to the stack as the mbuf chain.

Submitted by:		kib@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:42:05 +00:00
Hans Petter Selasky
f8c3349737 Enable both receive and transmit pauseframes by default in mlx5en(4).
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:21:02 +00:00
Hans Petter Selasky
f2b4782c81 Add context numbers for HW elements in mlx5en(4).
To access the data, set sysctl dev.mce.N.conf.debug_stats to 1.
This enables the sysctl node dev.mce.N.hw_ctx_debug.  Its content is
the mapping of each channel' number to used receive queue and associated
completion queue, set of the transmit queues numbers and corresponding
completion queues.

Trimmed example output:
channel 30 rq 188 cq 1085
channel 30 tc 0 sq 187 cq 1084
channel 31 rq 191 cq 1087
channel 31 tc 0 sq 190 cq 1086

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:18:01 +00:00
Hans Petter Selasky
a880c1ff6a Do not hint about 'trust both' mode when the mlx5en(4) hardware does not support it.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:11:30 +00:00
Hans Petter Selasky
f0474ab919 Correctly write atomic variable in mlx5en(4).
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 11:08:40 +00:00
Hans Petter Selasky
52ff436841 Remove redundant call to mlx5_vsc_find_cap() in mlx5core.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 10:27:46 +00:00
Hans Petter Selasky
6d54b22db7 Make sure the state variable is set atomically instead of using a mutex in mlx5core.
Device detach and setting error state may deadlock over the interface mutex
like this:

a) Detach code in mlx5en waits until error state is set while the interface
mutex is locked.
b) The set error handler needs to lock the interface mutex before it can
set the error state.

The solution is to use atomics to set the error state.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 10:20:01 +00:00
Hans Petter Selasky
b575d8c850 Refactor access to CR-space into using VSC APIs in mlx5core.
Remove no longer used files and APIs.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 10:16:32 +00:00
Hans Petter Selasky
9fc929d2e2 Remove redundant newline character in mlx5core.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 10:11:00 +00:00
Hans Petter Selasky
18450a3b10 Update version information for the mlx5ib module.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 10:07:40 +00:00
Hans Petter Selasky
62bfa774ae Don't pass unsupported events to ibcore from mlx5ib.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:59:55 +00:00
Hans Petter Selasky
14a1b9bd3a Use static device naming instead of dynamic one in mlx5ib.
When resetting mlx5core instances it can happen that the order of attach and
detach for mlx5ib instances is changed. Take the unit number for mlx5_%d from
the parent PCI device, similarly to what is done in mlx5en(4), so that there
is a direct relationship between mce<N> and mlx5_<N>.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:58:11 +00:00
Hans Petter Selasky
ed0cee0bf4 Implement support for Differentiated Service Code Point, DSCP, in mlx5en(4).
The DSCP feature is controlled using a set of sysctl(8) fields under
the qos sysctl directory entry for mlx5en(4).

For Routable RoCE QPs, the DSCP should be set in the QP's address path.
The DSCP's value is derived from the traffic class.

Linux commit:
ed88451e1f2d400fd6a743d0a481631cf9f97550

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:56:40 +00:00
Hans Petter Selasky
f4546fa376 Add support for prio-tagged traffic for RDMA in ibcore.
When receiving a PCP change all GID entries are reloaded.
This ensures the relevant GID entries use prio tagging,
by setting VLAN present and VLAN ID to zero.

The priority for prio tagged traffic is set using the regular
rdma_set_service_type() function.

Fake the real network device to have a VLAN ID of zero
when prio tagging is enabled. This is logic is hidden inside
the rdma_vlan_dev_vlan_id() function which must always be used
to retrieve the VLAN ID throughout all of ibcore and the
infiniband network drivers.

The VLAN presence information then propagates through all
of ibcore and so incoming connections will have the VLAN
bit set. The incoming VLAN ID is then checked against the
return value of rdma_vlan_dev_vlan_id().

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-17 09:11:53 +00:00
Hans Petter Selasky
38535d6cab Add support for hardware rate limiting to mlx5en(4).
The hardware rate limiting feature is enabled by the RATELIMIT kernel
option. Please refer to ifconfig(8) and the txrtlmt option and the
SO_MAX_PACING_RATE set socket option for more information. This
feature is compatible with hardware transmit send offload, TSO.

A set of sysctl(8) knobs under dev.mce.<N>.rate_limit are provided to
setup the ratelimit table and also to fine tune various rate limit
related parameters.

Sponsored by:	Mellanox Technologies
2018-05-29 14:04:57 +00:00
Matt Macy
4f6c66cc9c UDP: further performance improvements on tx
Cumulative throughput while running 64
  netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

Single stream throughput increases from 910kpps to 1.18Mpps

Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15409
2018-05-23 21:02:14 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Konstantin Belousov
952e75c763 mlx5en: Always allow VLAN id 0.
According to the 802.1Q-2014 9.6 VLAN Tag Control Information, VID value 0
means that there is no VLAN tag assigned to the packet, and only PCP and
DEI values from the tag are meaningful.  Current flow table programming
filter out such packets.

When programming VLAN filter for flow table, unconditionally add rule which
accept packets with VLAN id 0.  The packets are already handled correctly
by the network stack.

Reviewed by:	hselasky, slavash
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-05-02 20:22:03 +00:00
Hans Petter Selasky
28cfdee769 Bump driver version number in mlx5en(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-04-04 10:45:06 +00:00
Hans Petter Selasky
d77004ab47 Remove unused structure field in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-03-30 19:58:58 +00:00
Hans Petter Selasky
76ee71dcd3 Bump mlx5core driver version.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-03-30 19:55:31 +00:00
Hans Petter Selasky
4d5fdbe9b8 Fix for use after free in mlx5core.
Make sure the command completion handler is not called when the device is
in internal error state. This can easily trigger use after free situations.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-03-30 19:50:45 +00:00
Hans Petter Selasky
ca2345a05d Make sure Giant is locked when allocating bus resources in mlx5core.
During health care IRQ resources will be reallocated.
Newbus requires that Giant is locked before accessing
these resources.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-03-30 19:49:35 +00:00
Hans Petter Selasky
92d23c82cd Collect firmware dump when mlx5core is in device error state.
Firmware dump collecting should be triggered in case firmware syndrome
with request for reset bit is set.

MFC after:	3 days
Submitted by:	slavash@
Sponsored by:	Mellanox Technologies
2018-03-30 19:48:25 +00:00
Hans Petter Selasky
d28b6b55ba Reorganize health recovery in mlx5core.
- Move the semaphore locking and unlocking to the same function.
- Flags are no longer needed if the reset and crdump will be done in the
  same function.

MFC after:	3 days
Submitted by:	slavash@
Sponsored by:	Mellanox Technologies
2018-03-30 19:45:48 +00:00
Hans Petter Selasky
3c1274bd64 Prepare for FW dump in error state in mlx5core.
- Move firmware dump prep and cleanup to init_one() and remove_one() so that
the init and cleanup will happen only upon driver reload.
- Add some prints to indicate firmware dump.

MFC after:	3 days
Submitted by:	slavash@
Sponsored by:	Mellanox Technologies
2018-03-30 19:43:15 +00:00
Hans Petter Selasky
0a752b05a8 Properly check if crspace is supported in mlx5core.
The old code checked for MLX5_CR_SPACE_DOMAIN which is irrelevant here.
However, if dev->vsec_addr would be 0, an access to wrong offset would
happen.

MFC after:	3 days
Submitted by:	slavash@
Sponsored by:	Mellanox Technologies
2018-03-30 19:39:27 +00:00
Hans Petter Selasky
4950c6ec72 Add missing newline character in print in mlx5core.
MFC after:	3 days
Submitted by:	slavash@
Sponsored by:	Mellanox Technologies
2018-03-30 19:35:31 +00:00
Brooks Davis
541d96aaaf Use an accessor function to access ifr_data.
This fixes 32-bit compat (no ioctl command defintions are required
as struct ifreq is the same size).  This is believed to be sufficent to
fully support ifconfig on 32-bit systems.

Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Relnotes:	yes
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14900
2018-03-30 18:50:13 +00:00
Hans Petter Selasky
9a3d0cf097 Remove redundant prototype to fix compilation with GCC.
Reported by:	jeff@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-25 08:55:53 +00:00
Hans Petter Selasky
c0cea51b46 Don't wait for completions when a mlx5en(4) device is in internal
error state.

If the device is in internal error state the hardware will not
generate completions. Just move on to destroy the resources.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:38:12 +00:00
Hans Petter Selasky
9cd6fc88be Fix incorrect page count when mlx5core is in internal error.
Change page cleanup flow when in internal error to properly decrement
the page counts when reclaiming pages. That prevents timing out
waiting for extra pages that were actually cleaned up previously.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:35:59 +00:00
Hans Petter Selasky
94790180f3 Don't save PCI state when PCI error is detected in mlx5core.
When a PCI error is detected the PCI state could be corrupt, don't
save it in that flow. Save the state after initialization. After
restoring the PCI state during slot reset save it again, restoring
the state destroys the previously saved state info.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:34:35 +00:00
Hans Petter Selasky
f20b553d75 Add mutual exclusion mechanism for software reset of firmware in mlx5core.
Since the FW can be shared between PCI functions it is common that
more than one health poll will detected a failure, this can lead to
multiple resets.

The solution is to use a FW locking mechanism using semaphore space to
provide a way to synchronize between functions. The FW semaphore is
acquired via config cycle access. First the VSEC gateway must be
acquired, then the semaphore can be locked by writing a value to it
and confirmed it's locked by reading the same value back. The process
in the same to free the semaphore, except the value written should be
zero.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:32:03 +00:00
Hans Petter Selasky
fe242ba7c1 Issue a software reset on firmware assert in mlx5core.
If a FW assert is considered fatal, indicated by a new bit in the
health buffer, reset the FW. After the reset, follow the normal
recovery flow.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:24:09 +00:00
Hans Petter Selasky
1900b6f887 Handle software reset of firmware in error flow in mlx5core.
Some mlx5 adapter firmware allows the driver to reset the firmware in
the event of an error. When a software reset is issued on any physical
function all PFs enter reset state. This is a recoverable condition.
The existing recovery flow was designed to allow the recovery of a
VF after a PF driver reload. This patch expands the scope of that
flow to recover PFs or VFs after a SW reset has been issued.
When a software reset is issued the following occurs:

1. The NIC interface mode is set to SW_RESET (7) while the reset is in
   progress.
2. Once the reset completes the NIC interface mode is set to NIC
   disabled (1).

After the reset has been issued (added in a subsequent patch) the
health poll for other functions will detect that the NIC interface
state has been set to disabled. This will cause it to enter the
existing recovery flow.  If the PCI is still working (meaning it
doesn't return 0xff on all reads) it means recovery can proceed
immediately instead of waiting 60 seconds.

The error detetion has also been refactored to avoid incorrect or
misleading log messages.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:20:42 +00:00
Hans Petter Selasky
1fb6089c3b Hide verbose proclamation of error when forced in mlx5core.
When mlx5_enter_error_state() operation is forced by shutdown, the
messages surrounding setting the error state are not informational
and confuse users.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:11:06 +00:00
Hans Petter Selasky
519774ea5a Cancel delayed recovery work when unloading the mlx5core driver.
linux commit 2a0165a034ac024b60cca49c61e46f4afa2e4d98

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:09:09 +00:00
Hans Petter Selasky
c09025693b Add support for fast unload in shutdown flow in mlx5core.
This patch accumulates the following Linux commits:

- 8812c24d28f4972c4f2b9998bf30b1f2a1b62adf
  net/mlx5: Add fast unload support in shutdown flow
- 59211bd3b6329c3e5f4a90ac3d7f87ffa7867073
  net/mlx5: Split the load/unload flow into hardware and software flows
- 4525abeaae54560254a1bb8970b3d4c225d32ef4
  net/mlx5: Expose command polling interface

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 18:02:20 +00:00
Hans Petter Selasky
4bb7662b09 Improve support for health recovery in mlx5core.
This patch accumulates the following Linux commits:

- 04c0c1ab38e95105d950db5b84e727637e149ce7
  net/mlx5: PCI error recovery health care simulation
- 0179720d6be2096b8d0a4d143254ff9e77747daa
  net/mlx5: Introduce trigger_health_work function
- 3fece5d676939f42f434c63dfe1bd42d7d94e6f0
  net/mlx5: Continue health polling until it is explicitly stopped

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 17:33:14 +00:00
Hans Petter Selasky
177781564f Create designated workqueue for each mlx5en(4) device instance.
The mlx5e_destroy_ifp() function may be called from the system workqueue and
in this case trying to flush all works will cause a dead lock.
Instead of using the system workqueue, create a designated workqueue
for each mlx5en(4) device instance.

Submitted by:	slavash@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-23 16:59:51 +00:00
Conrad Meyer
918622dbc0 mlx5(4): Remove redundant declaration of mlx5_enter_error_state
Broken in r330644.

Sponsored by:	Dell EMC Isilon
2018-03-10 00:59:48 +00:00
Konstantin Belousov
2cec152827 Make mlx5 compilable on ILP32 arches.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-03-08 22:03:43 +00:00
Hans Petter Selasky
50e3ba10e8 Set correct SL in completion for RoCE in mlx5ib(4).
There is a difference when parsing a completion entry between Ethernet
and IB ports. When link layer is Ethernet the bits describe the type of
L3 header in the packet. In the case when link layer is Ethernet and VLAN
header is present the value of SL is equal to the 3 UP bits in the VLAN
header. If VLAN header is not present then the SL is undefined and consumer
of the completion should check if IB_WC_WITH_VLAN is set.

While that, this patch also fills the vlan_id field in the completion if
present.

linux commit 12f8fedef2ec94c783f929126b20440a01512c14

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:27:31 +00:00
Hans Petter Selasky
0285276c92 Add call to setup firmware data dump structure during device load in
mlx5core.

Do not consider the inability to create a firmware dump fatal, but
inform about the situation and allow the driver to attach. The device
might not implement the needed VSC, or we might not know the layout of
the registers map. In either case, only firmware dump functionality is
limited, the network operations should be fine.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 16:19:01 +00:00
Hans Petter Selasky
9cb83c4689 Avoid more LFENCE/SFENCe on x86 in mlx5en(4),
by using the FreeBSD native fences.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:58:30 +00:00
Hans Petter Selasky
7d69d339ad Fix mlx5en(4) driver to properly call m_defrag().
When the mlx5en(4) driver was converted to using BUSDMA(9) the call to
m_defrag() was moved after the part of the TX routine that strips the
header from the mbuf chain. Before it called m_defrag it first trimmed
off the now-empty mbufs from the start of the chain. This has the side
effect of also removing the head of the chain that has M_PKTHDR set.
m_defrag() will not defrag a chain that does not have M_PKTHDR set,
thus it was effectively never defragging the mbuf chains.

As it turns out, trimming the mbufs in this fashion is unnecessary since
the call to bus_dmamap_load_mbuf_sg doesn't map empty mbufs anyway, so
remove it.

Differential Revision:	https://reviews.freebsd.org/D12050
Submitted by:	mjoras@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:53:04 +00:00
Hans Petter Selasky
1eb09ad434 Use vport rather than physical-port MTU in mlx5en(4).
Set and report vport MTU rather than physical MTU,
The driver will set both vport and physical port mtu
and will rely on the query of vport mtu.

SRIOV VFs have to report their MTU to their vport manager (PF),
and this will allow them to work with any MTU they need
without failing the request.

Also for some cases where the PF is not a port owner, PF can
work with MTU less than the physical port mtu if set physical
port mtu didn't take effect.

Based on Linux upstream commit:
cd255efff9baadd654d6160e52d17ae7c568c9d3

Submitted by:	Meny Yossefi <menyy@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:47:17 +00:00
Hans Petter Selasky
7c22ae80d2 Use the device unit number for naming the ifnet interface in mlx5en(4).
Currently the ifnet interface is named mceX, where X is a monotonically
incremented value. If the device is reset due to a fatal error, then the
interface name will change.  Using the device unit number will keep the
naming consistent across the reset logic.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:43:41 +00:00
Hans Petter Selasky
1a872967ad Remove duplicate prototypes.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:37:09 +00:00
Hans Petter Selasky
e808190a59 Add kernel and userspace code to dump the firmware state of supported
ConnectX-4/5 devices in mlx5core.

The dump is obtained by reading a predefined register map from the
non-destructive crspace, accessible by the vendor-specific PCIe
capability (VSC). The dump is stored in preallocated kernel memory and
managed by the mlx5tool(8), which communicates with the driver using a
character device node.

The utility allows to store the dump in format
    <address> <value>
into a file, to reset the dump content, and to manually initiate the
dump.

A call to mlx5_fwdump() should be added at the places where a dump
must be fetched automatically. The most likely place is right before a
firmware reset request.

Submitted by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 15:21:56 +00:00
Hans Petter Selasky
4b95c6659a Add vendor specific capability interface support in mlx5core.
Add the ability to access the vendor specific space gateway in order
to support reading and writing data into the different configuration
domains.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:59:47 +00:00
Hans Petter Selasky
fba294620c Use device_printf() instead of printf() when printing warnings and errors
to dmesg(8) in mlx5core.

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:58:27 +00:00
Hans Petter Selasky
10b0804509 Add support for per priority flow control, PFC, to mlx5en(4).
Add support for PFC and implement reading the per priority statistics
using the sysctl(8) interface. PFC is used together with VLAN priority
and can be enabled and disabled on a per priority basis.

Global pause frames and PFC are incompatible features and surrounding
logic has been added to warn the user about misconfiguration.

Update relevant mlx5core APIs for PFC configuration.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:40:39 +00:00
Hans Petter Selasky
118063fb70 Add support for explicit congestion notification, ECN, to mlx5ib(4).
ECN configuration and statistics is available through a set of sysctl(8)
nodes under sys.class.infiniband.mlx5_X.cong . The ECN configuration
nodes can also be used as loader tunables.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 11:23:14 +00:00
Hans Petter Selasky
788333d9a6 Use the autogenerated interface file for all commands in mlx5core.
This patch accumulates the following Linux commits:
- 90b3e38d048f09b22fb50bcd460cea65fd00b2d7
  mlx5_core: Modify CQ moderation parameters
- 09a7d9eca1a6cf5eb4f9abfdf8914db9dbd96f08
  mlx5_core: QP/XRCD commands via mlx5 ifc
- 1a412fb1caa2c1b77719ccb5ed8b0c3c2bc65da7
  mlx5_core: Modify QP commands via mlx5 ifc
- ec22eb53106be1472ba6573dc900943f52f8fd1e
  mlx5_core: MKey/PSV commands via mlx5 ifc
- 73b626c182dff06867ceba996a819e8372c9b2ce
  mlx5_core: EQ commands via mlx5 ifc
- 20ed51c643b6296789a48adc3bc2cc875a1612cf
  mlx5_core: Access register and MAD IFC commands via mlx5 ifc
- a533ed5e179cd15512d40282617909d3482a771c
  mlx5_core: Pages management commands via mlx5 ifc
- b8a4ddb2e8f44f872fb93bbda2d541b27079fd2b
  mlx5_core: Add MLX5_ARRAY_SET64 to fix BUILD_BUG_ON
- af1ba291c5e498973cc325c501dd8da80b234571
  mlx5_core: Refactor internal SRQ API
- b06e7de8a9d8d1d540ec122bbdf2face2a211634
  mlx5_core: Refactor device capability function
- c4f287c4a6ac489c18afc4acc4353141a8c53070
  mlx5_core: Unify and improve command interface

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 10:43:42 +00:00
Hans Petter Selasky
ca55159467 Fix race between PCI error handlers and health work in mlx5core.
linux commit 05ac2c0b7438ea08c5d54b48797acf9b22cb2f6f

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:58:41 +00:00
Hans Petter Selasky
7053deeb3d Avoid calling sleeping function from the health poll thread in mlx5core.
linux commit c1d4d2e92ad670168a17a57dfa182a5a5baa72d4

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:51:33 +00:00
Hans Petter Selasky
a2485fe5a6 Updates for PCI and health monitor recovery in mlx5core.
This patch accumulates the following Linux commits:

mlx5_health.c
- 78ccb25861d76a8fc5c678d762180e6918834200
  mlx5_core: Fix wrong name in struct
- 171bb2c560f45c0427ca3776a4c8f4e26e559400
  mlx5_core: Update health syndromes
- 0144a95e2ad53a40c62148f44fb0c1f9d2a0d1e9
  mlx5_core: Use accessor functions to read from device memory
- ac6ea6e81a80172612e0c9ef93720f371b198918
  mlx5_core: Use private health thread for each device
- fd76ee4da55abb21babfc69310d321b9cb9a32e0
  mlx5_core: Fix internal error detection conditions
- 2241007b3d783cbdbaa78c30bdb1994278b6f9b9
  mlx5: Clear health sick bit when starting health poll
- 712bfef60912d91033cb25739f7444d5b8d8c59f
  mlx5: Fix version printout in case of health issue
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver

mlx5_cmd.c
- be87544de8df2b1eb34bcb5e32691287d96f9ec4
  mlx5_core: Fix async commands return code
- a31208b1e11df334d443ec8cace7636150bb8ce2
  mlx5_core: New init and exit flow for mlx5_core
- 020446e01eebc9dbe7eda038e570ab9c7ab13586
  mlx5_core: Prepare cmd interface to system errors handling
- 89d44f0a6c732db23b219be708e2fe1e03ee4842
  mlx5_core: Add pci error handlers to mlx5_core driver
- 0d834442cc247c7b3f3bd6019512ae03e96dd99a
  mlx5: Fix teardown errors that happen in pci error handler

mlx5_main.c
- 5fc7197d3a256d9c5de3134870304b24892a4908
  mlx5: Add pci shutdown callback

Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-08 09:47:09 +00:00
Hans Petter Selasky
2e9c3a4f99 Implement priority to traffic class mapping in mlx5core.
Add support for mapping priority to traffic class via sysctl

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:23:07 +00:00
Hans Petter Selasky
cfc9c386eb Implement rate limit per traffic class in mlx5core.
Add support for rate limiting traffic class via sysctl.

Submitted by:	Slava Shwartsman <slavash@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:17:36 +00:00
Hans Petter Selasky
d91421515a Implement missing query for current port rate in mlx5ib(4).
- Factor out port speed definitions into new port.h header file,
  similarly as done in Linux upstream.
- Correct two existing port speed definitions in mlx5en according to
  Linux upstream.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 15:03:11 +00:00
Hans Petter Selasky
ecb4fcc48e Add log message for unsupported QSFPs in mlx5core.
Submitted by:	Matthew Finlay <matt@mellanox.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:51:50 +00:00
Hans Petter Selasky
2289559114 Make sure default VNET is set when adding a new interface in mlx5core.
Adding an interface might be done outside the device_attach() routine
and will then cause a panic, due to the VNET not being set.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:49:27 +00:00
Hans Petter Selasky
11546d068d Add timeout handle to commands with callback in mlx5core.
The current implementation does not handle timeout in case of command
with callback request, and this can lead to deadlock if the command
doesn't get firmware response. Add delayed callback timeout work
before posting the command to firmware. In case of real firmware
command completion we will cancel the delayed work. In case of
firmware command timeout the callback timeout handler will be called
and it will simulate firmware completion with timeout error.

linux commit 65ee67084589c1783a74b4a4a5db38d7264ec8b5

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:41:29 +00:00
Hans Petter Selasky
2327a753b9 Fix potential deadlock in command mode change in mlx5core.
Call command completion handler in case of timeout when working in
interrupts mode. Avoid flushing the commands workqueue after acquiring
the semaphores to prevent a potential deadlock.

linux commit commit 9cba4ebcf374c3772f6eb61f2d065294b2451b49

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:35:28 +00:00
Hans Petter Selasky
ee1b5c5811 Use a macro in mlx5_command_str() instead of copying OP name.
linux commit 42ca502e179d0654ef441333a9d0f35c948734f3

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:29:30 +00:00
Hans Petter Selasky
2f2d3c0cf3 Disable unsupported disassociate ucontext functionality in mlx5ib(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-07 14:03:31 +00:00
Hans Petter Selasky
1456d97c01 Optimize ibcore RoCE address handle creation from user-space.
Creating a UD address handle from user-space or from the kernel-space,
when the link layer is ethernet, requires resolving the remote L3
address into a L2 address. Doing this from the kernel is easy because
the required ARP(IPv4) and ND6(IPv6) address resolving APIs are readily
available. In userspace such an interface does not exist and kernel
help is required.

It should be noted that in an IP-based GID environment, the GID itself
does not contain all the information needed to resolve the destination
IP address. For example information like VLAN ID and SCOPE ID, is not
part of the GID and must be fetched from the GID attributes. Therefore
a source GID should always be referred to as a GID index. Instead of
going through various racy steps to obtain information about the
GID attributes from user-space, this is now all done by the kernel.

This patch optimises the L3 to L2 address resolving using the existing
create address handle uverbs interface, retrieving back the L2 address
as an additional user-space information structure.

This commit combines the following Linux upstream commits:

IB/core: Let create_ah return extended response to user
IB/core: Change ib_resolve_eth_dmac to use it in create AH
IB/mlx5: Make create/destroy_ah available to userspace
IB/mlx5: Use kernel driver to help userspace create ah
IB/mlx5: Report that device has udata response in create_ah

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-05 14:34:52 +00:00
Hans Petter Selasky
1cbc85fd04 Move the mlx5 core device pointer first in the mlx5en priv. This help simplify
checks to recognize own network devices when using mlx5ib. This patch fixes
an issues where mlx5ib fails to recognize mceX network devices for use with
RoCE.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-30 12:38:06 +00:00
Eitan Adler
02ca39cff2 sys/dev/mlx[45]: fix uses of 1 << 31
Reviewed by:		kib (D13858)
2018-01-12 06:36:44 +00:00
Konstantin Belousov
e44f4f3547 mlx5en: Avoid SFENCe on x86
The IA32 memory model guarantees that all writes are seen in the program
order.  Also, any access to the uncacheable memory flushes the store
buffers.  As the consequence, SFENCE instruction is (almost) never needed,
in particular, it is not needed to ensure the correct order of updates as
seen by a PCIe device.

Use atomic_thread_fence_rel() instead of wb() to only emit compiler barriers
on x86 there.  Other architectures get the right barrier instruction as
well.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-12-19 14:11:41 +00:00
Konstantin Belousov
ef23f141bc Implement hardware mlx5(4) rx timestamps.
Driver support is only provided for ConnectX4/5.

System-time timestamp is calculated based on the free-running counter
timestamp provided by hardware.  Driver periodically samples the
counter to calibrate it against the system clock and uses linear
interpolation to convert.  Stability of the crystal which drives the
clock is +-50 ppm at the operational temperature, which makes the
algorithm good enough.

The calculation is somewhat delicate because all values are 64bit and
overflow the naive formula for linear interpolation.  The calculation
drops the least significant bits in advance, see the PREC shift in
mlx5_mbuf_tstmp().

Hardware stamps can be turned off by 'ifconfig mceN -hwrxtsmp'.  Buggy
firmware might result in small but visible errors in the reported
timestamps, detectable e.g. by nonsensical (negative) RTT values for
LAN pings.

Reviewed by:	gallatin, hselasky
Sponsored by:	Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D12638
2017-11-29 10:04:11 +00:00
Hans Petter Selasky
c3125bc5bf Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
Hans Petter Selasky
937d37fc6c Merge ^/head r325842 through r325998. 2017-11-19 12:36:03 +00:00
Hans Petter Selasky
b108f35740 Remove duplicate static function prototype to fix compilation of
mlx5_fs_tree.c after r325638 when using GCC.

Found by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-18 20:32:09 +00:00
Hans Petter Selasky
55b1c6e7e4 Merge ^/head r325663 through r325841. 2017-11-15 11:28:11 +00:00
Hans Petter Selasky
dd00abf2d7 Make sure the ib_wr_opcode enum is signed by adding a negative dummy element.
Different compilers may optimise the enum type in different ways. This ensures
coherency when range checking the value of enums in ibcore.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-14 14:51:37 +00:00
Hans Petter Selasky
059ecd56d0 The new mlx5ib(4) module requires some existing values to be redefined.
Sponsored by:	Mellanox Technologies
2017-11-10 15:28:17 +00:00
Hans Petter Selasky
8e6e287f8d Update mlx5ib(4) to match Linux 4.9 and the new ibcore APIs.
Sponsored by:	Mellanox Technologies
2017-11-10 15:02:17 +00:00
Hans Petter Selasky
4b109912f1 Add more and update existing mlx5 core firmware structure definitions and bits.
This change is part of coming ibcore and mlx5ib updates.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:39:03 +00:00
Hans Petter Selasky
53d7bb46d5 Expose the current hardware MTU in mlx5en(4) as a separate entry
in the sysctl tree.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:19:22 +00:00
Hans Petter Selasky
61fd7ac087 Add support for configuring local multicast and unicast data traffic loopback
in mlx5en(4) driver via the sysctl interface.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:14:54 +00:00
Hans Petter Selasky
bb3616ab20 Add support for disabling and enabling RX and TX DMA rings in mlx5en(4).
This is useful for supporting setups similar to Netmap.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:10:41 +00:00
Hans Petter Selasky
b35a986d25 Make physical address of init segment available in the priv of mlx5 core.
This change is needed by mlx5ib(4).

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 14:02:12 +00:00
Hans Petter Selasky
f6923226eb Add API function to query port performance counters for infiniband and RoCE
traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:58:49 +00:00
Hans Petter Selasky
f4554f7830 Add API functions to query and modify local loopback of multicast and
unicast traffic in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:56:11 +00:00
Hans Petter Selasky
2fd90b8297 Add API function to query virtual port counters in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:53:53 +00:00
Hans Petter Selasky
0b3ebe412e Add API functions to modify the transport interface send object, TIS,
in mlx5 core.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:50:08 +00:00
Hans Petter Selasky
27c29bc44b Add API functions to set and query dropless port mode in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:44:12 +00:00
Hans Petter Selasky
0e4248a114 Prevent mlx5 core from accessing host memory after shutdown by disabling
PCI busmaster.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:40:27 +00:00
Hans Petter Selasky
197563c294 Set ATOMIC endian mode in mlx5 core.
The hardware is capable of 2 requestor endianness modes for standard 8
byte atomics: BE (0x0) and host endianness (0x1). Read the supported
modes from hca atomic capabilities and configure HW to host endianness
mode if supported.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:38:43 +00:00
Hans Petter Selasky
500d0c409e Add const keyword to input-only argument in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:30:14 +00:00
Hans Petter Selasky
a4d6b00747 Make local variable 64-bits to avoid masking away bits in mlx5 core.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:28:23 +00:00
Hans Petter Selasky
6c7057f7ba Implement support for decoding general port notification event in
the mlx5 core module.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 13:25:29 +00:00
Hans Petter Selasky
5a93b4cd52 Refactor the flowsteering APIs used by mlx5en(4). This change is needed by
the coming ibcore and mlx5ib updates in order to support traffic redirection
to so-called raw ethernet QPs.

Remove unused E-switch related routines and files while at it.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-10 09:49:08 +00:00
Hans Petter Selasky
d05554bb99 The remote DMA TCP portspace selector, RDMA_PS_TCP, is used for both
iWarp and RoCE in ibcore. The selection of RDMA_PS_TCP can not be used
to indicate iWarp protocol use. Backport the proper IB device
capabilities from Linux upstream to distinguish between iWarp and
RoCE. Only allocate the additional socket required for iWarp for RDMA
IDs when at least one iWarp device present. This resolves
interopability issues between iWarp and RoCE in ibcore

Reviewed by:		np @
Differential Revision:	https://reviews.freebsd.org/D12563
Sponsored by:		Mellanox Technologies
MFC after:		3 days
2017-10-20 08:20:15 +00:00
Hans Petter Selasky
3cd4c11ab2 Use common rdma_ip2gid() function instead of custom mlx5_ip2gid() one.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-10-10 12:24:52 +00:00
Hans Petter Selasky
e5d6b589ce Make sure the doorbell lock is valid for the i386 version
of the mlx5en(4) driver.

Tested by:		gallatin @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-10-02 12:20:55 +00:00
Hans Petter Selasky
8ac4c959ab Compile fixes for LINT on 32-bit platforms.
MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-24 08:09:42 +00:00
Hans Petter Selasky
1251590741 Add new mlx5ib(4) driver to the kernel source tree which supports
Remote DMA over Converged Ethernet, RoCE, for the ConnectX-4 series of
PCI express network cards.

There is currently no user-space support and this driver only supports
kernel side non-routable RoCE V1. The krping kernel module can be used
to test this driver. Full user-space support including RoCE V2 will be
added as part of the ongoing upgrade to ibcore from Linux 4.9. Otherwise
this driver is feature equivalent to mlx4ib(4). The mlx5ib(4) kernel
module will only be built when WITH_OFED=YES is specified.

MFC after:		2 weeks
Sponsored by:		Mellanox Technologies
2017-08-23 12:09:37 +00:00
Hans Petter Selasky
8508e4d730 Make sure the received IP header gets 32-bit aligned for short packets
in the mlx5en(4) driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:49:36 +00:00
Hans Petter Selasky
869dd4b498 Count drop events due to lack of PCI bandwidth as queue drops and not as
input errors in the mlx5en(4) driver. This improves the sysadmin view of
physical port errors.

Submitted by:		gallatin@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-08-08 11:36:57 +00:00
Hans Petter Selasky
713dd5cb9e Resolve locking issue for non-sleepable context in the mlx5core.
Code inspection reveals the busdma unload and free functions
do not write to the belonging dma tag and does not need to be
serialized. This allows mlx5_fwp_free() to be called from
software interrupt context.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:14:43 +00:00
Hans Petter Selasky
1d4905b5b0 Using GFP_ATOMIC with firmware commands is not supported after busdma was
introduced in the mlx5core, because busdma might sleep when loading memory
into DMA.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2017-08-03 09:11:51 +00:00
Mark Johnston
c73cdca2c4 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
Hans Petter Selasky
8b48354659 Improve sysadmin visibility of physical port error counters in the
mlx5en driver.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-04-28 19:38:57 +00:00
Hans Petter Selasky
eac79e7755 Make "desc" pointer non-constant inside the mlx5_core_diagnostics_entry
structure. This fixes compilation with amd64-xtoolchain-gcc.

PR:			216588
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-30 08:35:15 +00:00
Hans Petter Selasky
1c807f6795 Use the busdma API to allocate all DMA-able memory.
The MLX5 driver has four different types of DMA allocations which are
now allocated using busdma:

1) The 4K firmware DMA-able blocks. One busdma object per 4K allocation.
2) Data for firmware commands use the 4K firmware blocks split into four 1K blocks.
3) The 4K firmware blocks are also used for doorbell pages.
4) The RQ-, SQ- and CQ- DMA rings. One busdma object per allocation.

After this patch the mlx5en driver can be used with DMAR enabled in
the FreeBSD kernel.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:46:55 +00:00
Hans Petter Selasky
30dfc0518a Add support for device surprise removal and other PCI errors.
- When device disappears from PCI indicate error device state and:
  1) Trigger command completion for all pending commands
  2) Prevent new commands from executing and return:
     - success for modify and remove/cleanup commands
     - failure for create/query commands
  3) When reclaiming pages for a device in error state don't ask FW to
     return all given pages, just release the allocated memory

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:29:33 +00:00
Hans Petter Selasky
44a03e91f3 Wait for all VFs pages to be reclaimed before closing EQ pages.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:19:06 +00:00
Hans Petter Selasky
310804912a Rename struct fw_page into struct mlx5_fw_page as a preparation step
for adding busdma support.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 11:03:58 +00:00
Hans Petter Selasky
f361e561a5 Fix command completion with callback scenario.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:56:03 +00:00
Hans Petter Selasky
8e7e0ce110 Minor code refactor as a preparation step for suprise removal of CX-4
PCI device(s), changes:
- alloc_entry() now clears bit for page slot entry aswell
- update of cmd->ent_arr[] is now under cmd->alloc_lock
- complete command if alloc_entry() fails

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:47:53 +00:00
Hans Petter Selasky
d0ce5a0da7 Use ffs() to scan for first bit instead of using a for() loop.
Minor code refactor while at it.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:36:49 +00:00
Hans Petter Selasky
115bc9b1d3 Make fw_pages statistics counter 64-bit to avoid overflow.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:20:38 +00:00
Hans Petter Selasky
66d53750b9 Add support for reading advanced diagnostic counters.
By default reading the diagnostic counters is disabled. The firmware
decides which counters are supported and only those supported show up
in the dev.mce.X.diagnostics sysctl tree.

To enable reading of diagnostic counters set one or more of the
following sysctls to one:

dev.mce.X.conf.diag_general_enable=1
dev.mce.X.conf.diag_pci_enable=1

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 10:03:50 +00:00
Hans Petter Selasky
5e6a76be8a Enforce reading the consumer and producer counters once to ensure
consistent return values from the mlx5e_sq_has_room_for()
function. The two counters are incremented by different threads under
different locks.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-27 08:32:50 +00:00
Hans Petter Selasky
e16c241deb Remove superfluous return statement.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 15:47:29 +00:00
Hans Petter Selasky
b98ba64027 Allow transmit packet bufring in software to be disabled.
- Add new sysctl node to control the transmit packet bufring.

- Add optimised version of the transmit routine which output packets
directly to the DMA ring instead of using bufring in case the transmit
lock is congested. This can reduce the number of taskswitches which in
turn influence the overall system CPU usage, depending on the
workload.

- Add " TX" suffix to debug name for transmit mutexes to silence some
witness warnings about aquiring duplicate locks having same name.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
Suggested by:		gallatin @
2017-01-20 15:45:21 +00:00
Hans Petter Selasky
3dfa7645c5 Make draining a sendqueue more robust.
Add own state variable to track if a sendqueue is stopped or not.
This will prevent traffic from entering the sendqueue while it is
being destroyed.

Update drain function to wait for traffic to be transmitted before
returning when the link state is active.

Add extra checks in transmit path for stopped SQ's.

While at it:
- Use likely() for a mbuf pointer check.
- Remove redundant IFF_DRV_RUNNING check.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 12:02:40 +00:00
Hans Petter Selasky
d2bf00a918 Add runtime support for modifying the SQ and RQ completion event
moderation mode. The presence of this feature is indicated through the
firmware capabilities.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 11:11:49 +00:00
Hans Petter Selasky
0402eb6bcc Update firmware interface structures and definitions adding support
for new features and commands.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-01-20 10:47:32 +00:00
Hans Petter Selasky
41aa095b2f Make a read only pointer constant.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:12:19 +00:00
Hans Petter Selasky
436659135a Add more comments regarding collection of statistics counters.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-22 10:11:03 +00:00
Hans Petter Selasky
de83258d59 Remove useless NULL checks.
NULL is not returned when allocating memory passing the M_WAITOK flag.

Submitted by:		trasz @
Differential Revision:  https://reviews.freebsd.org/D5772
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-02 09:41:54 +00:00
Hans Petter Selasky
6f4cab6cc3 Add timer to watch the RQ when we are out of mbufs.
The firmware/hardware does not generate additional completion
events unless we post new buffers. Use a timer to try to post
more buffers in case we are temporarily out of mbufs. Else
the receive schedule completely stops.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:39:45 +00:00
Hans Petter Selasky
cb02244355 Add more firmware related structures and update existing ones in the
MLX5 core module. Update the set and query diagnostics counter API.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:28:50 +00:00
Hans Petter Selasky
627ef61aab Query flow table capabilities according to the correct capability bit
for infiniband.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:26:25 +00:00
Hans Petter Selasky
adea303c2a Correct checksum fields in the "mlx5_mini_cqe8" structure. The fields
in question are currently not used.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:22:50 +00:00
Hans Petter Selasky
91951e3978 Ensure the firmware is notified of any host memory allocation
failures. Else firmware commands may time out waiting for host
memory.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:20:13 +00:00
Hans Petter Selasky
97ac390861 When a firmware command times out do not free the command structure to
avoid use after free.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-11-07 11:15:40 +00:00
Hans Petter Selasky
478c1a9932 Set hardware stats flag to avoid double counting the number of incoming bytes.
Found by:	Ben RUBSON <ben.rubson@gmail.com>
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-29 16:35:52 +00:00
Hans Petter Selasky
a2c320d7c7 mlx5en: Fix duplicate mbuf free-by-code.
When mlx5e_sq_xmit() returns an error code and the mbuf pointer is set,
we should not free the mbuf, because the caller will keep the mbuf in
the drbr. Make sure the mbuf pointer is correctly set upon function
exit.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:57:48 +00:00
Hans Petter Selasky
14997cc16a mlx5en: Remove unused pdev pointer.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:55:38 +00:00
Hans Petter Selasky
f5344e8333 mlx5en: Verify port type is ethernet before creating network device
Else the mlx5en driver might attach to infiniband ports.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:53:53 +00:00
Hans Petter Selasky
431fe47416 mlx5en: Allow setting the software MTU size below 1500 bytes
The hardware MTU size can't be set to a value less than 1500 bytes due
to side-band management support. Allow setting the software MTU size
below 1500 bytes, thus creating a mismatch between hardware and
software MTU sizes.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:51:31 +00:00
Hans Petter Selasky
7b4e6e4ac9 mlx5en: Factor out common sendqueue code for use with rate limiting SQs.
Try to reuse code to setup sendqueues when possible by making some static
functions global. Further split the mlx5e_close_sq_wait() function to
separate out reusable parts.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:47:16 +00:00
Hans Petter Selasky
81b3cdc1bb mlx5en: Properly declare doorbell lock for 32-bit CPUs.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:45:35 +00:00
Hans Petter Selasky
5eadc44ceb mlx5en: Optimise away duplicate UAR pointers.
This change also reduces the size of the mlx5e_sq structure so that the last
queue_state element will fit into the previous cacheline and then the mlx5e_sq
structure becomes one cacheline less for amd64.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:40:45 +00:00
Hans Petter Selasky
28f22ccea3 mlx5en: Make the mlx5e_open_cq() and mlx5e_close_cq() functions global.
Make some functions and structures global to allow for code reuse
when creating rate limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:39:15 +00:00
Hans Petter Selasky
941cd5d1a4 mlx5en: Minor completion queue control path code refactor.
Move setting of CQ moderation mode together with the other
CQ moderation parameters. Pass completion event vector as
a separate argument to mlx5e_open_cq(), because its value is
different for each call. Pass mlx5e_priv pointer instead of
mlx5e_channel pointer so that code can be used by rate
limiting sendqueues.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:37:35 +00:00
Hans Petter Selasky
98626886ee mlx5en: Separate the sendqueue from using the mlx5e_channel structure.
This change allows for reusing the transmit path for so called
rate limited senqueues. While at it optimise some pointer lookups
in the fast path.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:35:45 +00:00
Hans Petter Selasky
cb4e4a6ed6 Update the MLX5 core module:
- Add new firmware commands and update existing ones.
- Add more firmware related structures and update existing ones.
- Some minor fixes, like adding missing \n to some prints.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-09-16 11:28:16 +00:00
Hans Petter Selasky
351a9c7c0b Increase the maximum RX/TX queue size. This allows for a RX/TX queue
size of 16384 mbufs. Previously the limit was 8192.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-22 13:43:25 +00:00
Hans Petter Selasky
5f9e5b5e62 Fix for use after free.
Clear the device description to avoid use after free because the
bsddev is not destroyed when the mlx5en module is unloaded. Only when
the parent mlx5 module is unloaded the bsddev is destroyed. This fixes
a panic on listing sysctls which refer strings in the bsddev after the
mlx5en module has been unloaded.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-09 07:43:15 +00:00
Hans Petter Selasky
57d5dd7907 Switch to the new block based LRO input function for the mlx5en
driver. This change significantly increases the overall RX aggregation
ratio for heavily loaded networks handling 10-80 thousand simultaneous
connections.

Remove the turbo LRO code and all references to it which has now been
superceeded by the tcp_lro_queue_mbuf() function.

Tested by:	Netflix
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-08-08 16:22:16 +00:00
Hans Petter Selasky
5b0c29decf Use correct Q-counter output array.
Sponsored by:	Mellanox Technologies
Approved by:	re (kib)
MFC after:	3 days
2016-06-23 09:23:37 +00:00
Hans Petter Selasky
76a5241f2c Add SR-IOV guest support to the mlx5en driver.
This patch adds the missing pieces needed for device setup using the
mlx5en driver inside a virtual machine which is providing hardware
access through SR-IOV.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-06-07 13:58:52 +00:00
Sepherosa Ziehau
36ad8372d4 net: Use M_HASHTYPE_OPAQUE_HASH if the mbuf flowid has hash properties
Reviewed by:	hps, erj, tuexen
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D6688
2016-06-07 04:51:50 +00:00
Hans Petter Selasky
fa201e28fc Prepare for activation of LinuxKPI module parameters as read-only
tunable SYSCTL's. Linux module parameters are associated with the
module they belong to. FreeBSD does not share this concept of a parent
module. Instead add macros which define the prefix to use for the
module parameters in the LinuxKPI consumers.

While at it convert all "bool" LinuxKPI module parameters to "byte"
type, because we don't have a "bool" type of SYSCTL in FreeBSD.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:03:21 +00:00
Hans Petter Selasky
82d2623e5a Verify one sysctl parameter at a time. When a mlx5en sysctl parameter
is updated only verify the changed one instead of all.

No functional change.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 07:07:27 +00:00
Hans Petter Selasky
af89c4aff6 Optimise use of doorbell and remove redundant NOPs
Store the last doorbell write in the mlx5e_sq structure and write the
doorbell to the hardware when the transmit routine finishes
transmitting all queued mbufs.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:59:38 +00:00
Hans Petter Selasky
376bcf6331 Implement TX completion event interleaving.
This patch implements a sysctl which allows setting a factor, N, for
how many work queue elements can be generated before requiring a
completion event. When a completion event happens the code simulates N
completion events instead of only one. When draining a transmit queue,
N-1 NOPs are transmitted at most, to force generation of the final
completion event.  Further a timer is running every HZ ticks to flush
any remaining data off the transmit queue when the tx_completion_fact
> 1.

The goal of this feature is to reduce the PCI bandwidth needed when
transmitting data.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-05-20 06:54:58 +00:00
Hans Petter Selasky
83c5d190fe Correct some error codes to native FreeBSD ones.
Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:01:06 +00:00
Hans Petter Selasky
21dd652701 Add function to detect the presence of a port module and use this
function to error out early when no port module is present and doing
eeprom access. This also prevents error codes from filling up in
dmesg.

Sponsored by:	Mellanox Technologies
Tested by:	Netflix
MFC after:	1 week
2016-04-29 11:00:12 +00:00
Sepherosa Ziehau
6dd38b8716 tcp/lro: Use tcp_lro_flush_all in device drivers to avoid code duplication
And factor out tcp_lro_rx_done, which deduplicates the same logic with
netinet/tcp_lro.c

Reviewed by:	gallatin (1st version), hps, zbb, np, Dexuan Cui <decui microsoft com>
Sponsored by:	Microsoft OSTC
Differential Revision:	https://reviews.freebsd.org/D5725
2016-04-01 06:28:33 +00:00
Hans Petter Selasky
d7633a3070 Fix an issue where the network adapter could be left in down state
after changing the HW LRO sysctl when previously in up state.

Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4941
2016-01-19 10:24:47 +00:00
Hans Petter Selasky
636d1fec4d Add clarifying comment about CQE zipping.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4940
2016-01-19 10:19:33 +00:00
Hans Petter Selasky
1558d49bb1 Declare local variables at top of function.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4939
2016-01-19 10:17:24 +00:00
Hans Petter Selasky
4d3b91a762 Allow RX and TX pause frames to be set through ifconfig.
Reviewed by:	gnn
Sponsored by:	Mellanox Technologies
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D4817
2016-01-19 10:10:02 +00:00
Hans Petter Selasky
f03f517b5e Add support for modifying coalescing parameters runtime.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2015-12-30 15:01:47 +00:00
Hans Petter Selasky
4fbd91a5af Allow I2C to read address 0x51 as well as address 0x50.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:58:55 +00:00
Hans Petter Selasky
4f18ce8ae0 10G ER/LR should present itself as LR.
MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
2015-12-30 14:54:08 +00:00
Hans Petter Selasky
90cc1c7724 Add support for CQE zipping. CQE zipping reduces PCI overhead by
coalescing and zipping multiple CQEs into a single merged CQE. The
feature is enabled by default and can be disabled by a sysctl.

Implementing this feature mlx5_cqwq_pop() has been separated from
mlx5e_get_cqe().

MFC after:	1 week
Submitted by:	Mark Bloch <markb@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4598
Sponsored by:	Mellanox Technologies
2015-12-28 18:50:18 +00:00
Hans Petter Selasky
ec0143b260 Add support for sysctl tunables to 10-stable and older. Pushed through
head first to simplify driver maintenance.

MFC after:	1 week
Submitted by:	Drew Gallatin <gallatin@freebsd.org>
Differential Revision:	https://reviews.freebsd.org/D4552
Sponsored by:	Mellanox Technologies
2015-12-28 18:36:00 +00:00
Hans Petter Selasky
ee41fc8f8c Make the eeprom dump function more readable and rename variables for
better clarity.

MFC after:	1 week
Submitted by:	Daria Genzel <dariaz@mellanox.com>
Differential Revision:	https://reviews.freebsd.org/D4551
Sponsored by:	Mellanox Technologies
2015-12-28 18:28:18 +00:00
Hans Petter Selasky
98a998d5e7 Update the mlx5 shared driver code to the latest version, which
include the following list of changes:

- Added eswitch ACL table management
  Introduce API for managing ACL table.
  This API include the following features:
  1) vlan filter - for VST/VGT+ support.
  2) spoofcheck.
  3) robust functionality to allow/drop general untagged/tagged traffic.
  4) support for both ingress and egress ACL types.

- Added loopback filter to the vacl table.

- Added multicast list set in the vPort context

- Added promiscuous mode set in the vPort context

- Set the vlan list in vPort context
  1) Check caps if VLAN list is not longer than FW supports
  2) Set MODIFY_NIC_VPORT_CONTEXT command

- Changed MLX5_EEPROM_MAX_BYTES from 48 to 32 so that a single EEPROM
  reading cannot cross the 128-byte boundary. Previously reading the
  MCIA register was done in batches of 48 bytes. The third reading
  would then by-pass the 127th byte, which means that part of the low
  page and part of the high page would be read at the same time, which
  created a bug:
    1st: 0-47 bytes
    2nd: 48-95 bytes
    3rd: 96-143 bytes

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4411
2015-12-07 13:16:48 +00:00
Hans Petter Selasky
278ce1c919 Add full support for Receive Side Scaling, RSS, to the mlx5en
driver. This includes binding all interrupt and worker threads
according to the RSS configuration, setting up correct Toeplitz
hashing keys as given by RSS and setting the correct mbuf
hashtype for all received traffic.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4410
2015-12-07 12:38:51 +00:00
Hans Petter Selasky
74540a3183 Add support for setting the TX moderation mode via a sysctl entry. TX
completion events can be moderated in the same way like RX completion
events. Expose this functionality by a sysctl variable.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4409
2015-12-07 11:04:50 +00:00
Hans Petter Selasky
2a5ac376e4 The firmware no longer supports setting a port MTU of zero bytes.
Set the port MTU and then query it and report if any problems instead.

MFC after:	1 week
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
Differential Revision:	https://reviews.freebsd.org/D4408
2015-12-07 10:57:42 +00:00
Hans Petter Selasky
bb3853c6bd Style changes, mostly automated.
Differential Revision:	https://reviews.freebsd.org/D4179
Submitted by:	Daria Genzel <dariaz@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:28:51 +00:00
Hans Petter Selasky
ee09079968 Accumulate out of RX buffers into a 64-bit value and subtract out of
RX buffers from number of received packets.

Differential Revision:	https://reviews.freebsd.org/D4178
Submitted by:	Drew Gallatin <gallatin@freebsd.org>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:23:10 +00:00
Hans Petter Selasky
36c1007d35 Maintain the "hw_lro" configuration variable correctly.
Setting sysctl dev....conf.hw_lro may fail if the net device lro is
turned off. Due to the nature of our sysctl handler we need to set the
values back to 0 and issue an error.

Differential Revision:	https://reviews.freebsd.org/D4177
Submitted by:	Shahar Klein <shahark@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:18:13 +00:00
Hans Petter Selasky
7e1b8bc0c9 Print cable name, if cable type is not recognized.
Differential Revision:	https://reviews.freebsd.org/D4180
Submitted by:	Mark Bloch <markb@mellanox.com>
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2015-11-19 10:10:52 +00:00
Hans Petter Selasky
03ab395e29 Compile fix for 32-bit platforms:
- The Linux timers data field is "unsigned long".

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-12 09:52:37 +00:00
Hans Petter Selasky
dc7e38ac4d Add mlx5 and mlx5en driver(s) for ConnectX-4 and ConnectX-4LX cards
from Mellanox Technologies. The current driver supports ethernet
speeds up to and including 100 GBit/s. Infiniband support will be
done later.

The code added is not compiled by default, which will be done by a
separate commit.

Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2015-11-10 12:20:22 +00:00