243 Commits

Author SHA1 Message Date
Hans Petter Selasky
91f13f8368 Implement fast close of RX channel in mlx5en(4).
Instead of waiting for all jobs to be cancelled, simply close the completion
queue to prevent more completion events and let mlx5e_destroy_rq() clean up
the remaining mbufs.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:42 +00:00
Hans Petter Selasky
243853215d Correct number of elements for priority to traffic class mappings in mlx5en(4).
The number of priorities is always 8, while the number of traffic classes
supported can vary. While at it convert the sysctl node into an array.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:14 +00:00
Hans Petter Selasky
ffadb62f20 Remove unused module parameter in mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:29 +00:00
Hans Petter Selasky
6428c27faf Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:09 +00:00
Hans Petter Selasky
069963d772 Destroy port stats debug context in correct order in mlx5en(4).
Destroy children nodes before parent nodes.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:22 +00:00
Hans Petter Selasky
c66537d7b2 Fix tx_jumbo_packets counter in mlx5en(4).
Instead of reading the Ethernet RFC 2819 pXtoYoctets counters from
hardware, which count RX octets, use the tx_stat_pXtoYoctets
Ethernet extended counters, which count TX octets.

TX jumbo counters should be accumulated only after the PPCNT
counters were fetched from hardware with their latest value.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:03 +00:00
Hans Petter Selasky
bcfad02593 Update Ethernet extended counters in mlx5en(4).
Expose all Ethernet extended counters via the debug_stats
sysctl:
dev.mce.X.debug_stats

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:31:32 +00:00
Hans Petter Selasky
5169fb81ca Protect from infinite sw-reset loop in mlx5core.
Avoid an infinite software firmware reset loop that may be caused by a
hardware bug by limiting the maximum number of resets.
The counter between resets is reset by request for reset, and not by a
successful reset.
The interval between two resets can be configured via sysctl:
hw.mlx5.sw_reset_timeout
which is global to all mlx5 devices in the system.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:47 +00:00
Hans Petter Selasky
192fc18d49 Disable all MSIX interrupts before shutdown in mlx5.
Make sure the interrupt handlers don't race with the fast unload
code in the shutdown handler.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:18 +00:00
Hans Petter Selasky
a3a31fde6d Import Linux code to implement mlx5_ib_disassociate_ucontext() in mlx5ib.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:29:45 +00:00
Hans Petter Selasky
983026ea83 Add temperature warning event to log in mlx5core.
Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.

Linux commit:
1865ea9adbfaf341c5cd5d8f7d384f19948b2fe9

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:28:18 +00:00
Hans Petter Selasky
7646dc2347 Correctly define the interface state bits in mlx5en(4).
While at it remove unused interface state bits. This also fixes an issue
during shutdown:

There is an issue where the firmware fails during mlx5_load_one,
the health_care timer detects the issue and schedules a health_care call.
Then the mlx5_load_one detects the issue, cleans up and quits. Then
the health_care starts and calls mlx5_unload_one to clean up the resources
that no longer exist and causes kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails. The solution is removing the bit
MLX5_INTERFACE_STATE_DOWN and quit mlx5_unload_one if the
bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN
is redundant and we can use MLX5_INTERFACE_STATE_UP instead.

Linux commit:
10a8d00707082955b177164d4b4e758ffcbd4017
b3cb5388499c5e219324bfe7da2e46cbad82bfcf

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:27:29 +00:00
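The guard described in 7646dc2347 can be illustrated with a minimal sketch. The struct, the flag value, and the function names below are hypothetical stand-ins, not the driver's actual definitions; the only point is that the unload path returns early when the UP bit was never set, so a failed load cannot be torn down a second time.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MLX5_INTERFACE_STATE_UP; the bit position is assumed. */
#define FAKE_IFACE_STATE_UP (1UL << 0)

struct fake_dev {
	unsigned long state;	/* interface state bits */
	int resources;		/* non-zero while resources are held */
	int teardowns;		/* how many times resources were released by unload */
};

/* Load: acquire resources and set UP only when the firmware step succeeds. */
static int fake_load_one(struct fake_dev *dev, bool fw_ok)
{
	dev->resources = 1;
	if (!fw_ok) {
		dev->resources = 0;	/* clean up and quit; UP stays clear */
		return -1;
	}
	dev->state |= FAKE_IFACE_STATE_UP;
	return 0;
}

/* Unload: do nothing unless the interface actually came up. */
static void fake_unload_one(struct fake_dev *dev)
{
	if (!(dev->state & FAKE_IFACE_STATE_UP))
		return;
	dev->state &= ~FAKE_IFACE_STATE_UP;
	dev->resources = 0;
	dev->teardowns++;
}
```

With this shape, a health-care style caller can invoke fake_unload_one() after a failed fake_load_one() without touching resources that no longer exist.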
Hans Petter Selasky
e5eae1dc7d Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:26:33 +00:00
Hans Petter Selasky
c322dbafd5 Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:25:14 +00:00
Hans Petter Selasky
423530be04 Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).
Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:23:33 +00:00
Andrew Gallatin
50575ce11c Track TCP connection's NUMA domain in the inpcb
Drivers can now pass up numa domain information via the
mbuf numa domain field.  This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.

Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.

Reviewed by:	markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20028
2019-04-25 15:37:28 +00:00
Andrew Gallatin
7687707dd4 Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory
This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets.  When called with a domain on a NUMA machine,
if_alloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node.  Similarly, if_alloc_dev() is a wrapper for if_alloc_domain()
which uses a driver supplied device_t to call if_alloc_domain() with
the appropriate domain.

Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.

Reviewed by:	glebius, kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19930
2019-04-22 19:24:21 +00:00
Andrew Gallatin
538ff57b75 mlx5en: Enable new pfil(9) KPI ethernet filtering hooks
This allows efficient filtering at packet ingress on mlx5en.

Note that the packets are filtered (and potentially dropped) *before*
the driver has committed to (re)allocating an mbuf for the
packet. Dropped packets are treated essentially the same as an
error. Nothing is allocated, and the existing buffer is recycled. This
allows us to drop malicious packets at close to line rate with very
little CPU use.

Reviewed by:	hselasky, slavash, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19063
2019-04-15 17:14:50 +00:00
Konstantin Belousov
a748d99a17 Fix build with option RSS, removing unused variables.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 21:52:40 +00:00
Konstantin Belousov
e206dc6479 Appease gcc build, remove duplicated declaration.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 19:20:00 +00:00
Sean Bruno
5fc3b4acab Change u32 to uint32_t to allow the native-xtools target to build
libsysdecode.

Submitted by:	kib
2018-12-06 18:59:33 +00:00
Slava Shwartsman
0f3b263d83 mlx4/mlx5: Updated driver version to 3.5.0
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:34 +00:00
Slava Shwartsman
cc971b2261 mlx5en: Implement backpressure indication.
The backpressure indication is implemented using an unlimited rate type of
mbuf send tag. When the upper layers, typically the socket layer, have obtained such
a tag, it can then query the destination driver queue for the current
amount of space available in the send queue.

A single mbuf send tag may be referenced multiple times and a refcount has been added
to the mlx5e_priv structure to track its usage. Because the send tag resides
in the mlx5e_channel structure, there is no need to wait for refcounts to reach
zero until the mlx5en(4) driver is detached. The channels structure is persistent
during the lifetime of the mlx5en(4) driver it belongs to and can thus be accessed
without any need of synchronization.

The mlx5e_snd_tag structure was extended to contain a type field, because there are now
two different tag types which end up in the driver which need to be distinguished.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:03 +00:00
Slava Shwartsman
71defeda26 mlx5en: Improve configuration of HW LRO.
In order to enable HW LRO, both the "hw_lro" sysctl in the mlx5en(4) config
space must be set, and the ifconfig(8) LRO capability must be set. Any other
settings will disable HW LRO.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:33 +00:00
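The rule stated in 71defeda26 amounts to a logical AND of two independent settings. A minimal sketch follows; the capability bit and the function name are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical capability bit standing in for the ifconfig(8) LRO capability. */
#define FAKE_IFCAP_LRO 0x1u

/*
 * HW LRO is enabled only when both the "hw_lro" sysctl and the interface
 * LRO capability are set; any other combination disables it.
 */
static bool hw_lro_enabled(bool hw_lro_sysctl, unsigned int if_capenable)
{
	return hw_lro_sysctl && (if_capenable & FAKE_IFCAP_LRO) != 0;
}
```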
Slava Shwartsman
01f02abfec mlx5en: Count all transmitted and received bytes.
Add counters for all transmitted and received bytes. Previously only
transmitted and received packets were counted. Fix description of RX LRO
counters while at it.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:02 +00:00
Slava Shwartsman
3230c29d72 mlx5en: Statically allocate and free the channel structure(s).
By allocating the worst case size channel structure array
at attach time we can eliminate various NULL checks in the
fast path and reduce the chance of use-after-free
issues in the transmit fast path.

This change is also a requirement for implementing
backpressure support.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:31 +00:00
Slava Shwartsman
b3cf149325 mlx5en: Fix race in mlx5e_ethtool_debug_stats().
Writing to the debug stats variable must be locked,
else serialization will be lost which might cause
various kernel panics due to creating and destroying
sysctls out of order.

Make sure the sysctl context is initialized after freeing
the sysctl nodes, else they can be freed twice.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:01 +00:00
Slava Shwartsman
42390bb884 mlx5en: Add support for IFM_10G_LR and IFM_40G_ER4 media types.
Inspect the ethernet compliance code to figure out actual cable type by reading
the PDDR module info register.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:22:30 +00:00
Slava Shwartsman
f292a94d6b mlx5en: Don't set rate on SQs when the SQ is already stopped.
This can happen when connections are short lived and leads to
a firmware error printout in dmesg, syndrome 0x51cfb0, because
the SQ is in the wrong state.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:59 +00:00
Slava Shwartsman
3e581cabf0 mlx5en: Fix for inlining issues in transmit path
1) Don't exceed the driver's own hardcoded TX inline limit.

The blueflame register size can be much greater than the hardcoded limit
for inlining. Make sure we don't exceed the driver's own limit, because this
also means that the maximum number of TX fragments becomes invalid and
then memory size assumptions in the TX path no longer hold up.

2) Make sure the mlx5_query_min_inline() function returns an error code.

3) Header inlining is required when using TSO.

4) Catch failure to compute inline header size for TSO.

5) Add support for UDP when computing inline header size.

6) Fix for inlining issues with regards to DSCP.

Make sure we inline 4 bytes beyond the ethernet and/or
VLAN header to work around a hardware bug extracting
the DSCP field from the IPv4/v6 header.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:28 +00:00
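Items 1 and 6 of 3e581cabf0 can be sketched as two small helpers. The limit value and all names here are hypothetical; the log does not state the driver's real hardcoded limit. The header sizes used (14-byte Ethernet header, 4-byte VLAN tag) are the standard ones.

```c
#include <stdint.h>

/* Hypothetical value; the driver's actual hardcoded inline limit is not given in the log. */
#define FAKE_MAX_TX_INLINE 128u

/* Item 1: never let the blueflame register size push inlining past the driver's own limit. */
static uint32_t clamp_tx_inline(uint32_t bf_reg_size)
{
	return bf_reg_size < FAKE_MAX_TX_INLINE ? bf_reg_size : FAKE_MAX_TX_INLINE;
}

/* Item 6: inline 4 bytes beyond the Ethernet header and the optional VLAN tag. */
static uint32_t dscp_inline_len(int has_vlan)
{
	const uint32_t ether_hdr_len = 14;	/* dst MAC + src MAC + ethertype */
	const uint32_t vlan_tag_len = 4;	/* 802.1Q tag */

	return ether_hdr_len + (has_vlan ? vlan_tag_len : 0) + 4;
}
```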
Slava Shwartsman
d51ced5fae mlx5en: Remove the DRBR and associated logic in the transmit path.
The hardware queues are deep enough currently and using the DRBR and associated
callbacks only leads to more task switching in the TX path. There is also a race
setting the queue_state which can lead to hung TX rings.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:20:57 +00:00
Slava Shwartsman
e870c0ab61 mlx5en: Implement support for bandwidth limiting by ratio, ETS.
Add support for setting the bandwidth limit as a ratio rather than in bits per
second. The ratio must be an integer number between 1 and 100 inclusively.

Implement the needed firmware commands and SYSCTLs through mlx5en(4).

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:20:26 +00:00
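The range constraint from e870c0ab61 could be checked along these lines. The function name and the choice of EINVAL as the error value are assumptions made for illustration, not something the log confirms.

```c
#include <errno.h>

/*
 * Validate an ETS bandwidth ratio. The commit message requires an integer
 * between 1 and 100 inclusive; returning EINVAL on failure is an assumption.
 */
static int check_bw_ratio(int ratio)
{
	if (ratio < 1 || ratio > 100)
		return EINVAL;
	return 0;
}
```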
Slava Shwartsman
d82f1c13ad mlx5fpga: Add set and query connect/disconnect FPGA
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:19:55 +00:00
Slava Shwartsman
085b35bb69 mlx5fpga: IOCTL for FPGA temperature measurement
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:19:23 +00:00
Slava Shwartsman
b5f5751275 mlx5fpga: Support MorseQ board
Added and supported new enum "morseQ = 4" for fpga_id field

Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:18:52 +00:00
Slava Shwartsman
d50c55f17d mlx5fpga_tools initial code import.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:17:22 +00:00
Slava Shwartsman
e9dcd83155 mlx5fpga: Initial code import.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:11:20 +00:00
Slava Shwartsman
67db687393 mlx5ib: Set default active width and speed when querying port.
Make sure the active width and speed is set in case the
translate_eth_proto_oper() function doesn't recognize the
current port operation mask.

Linux commit:
7672ed33c4c15dbe9d56880683baaba4227cf940

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:49:11 +00:00
Slava Shwartsman
d3300d4aed mlx5ib: Make sure the congestion work timer does not escape the drain procedure.
If the mlx5_ib_read_cong_stats() function was running when mlx5ib was unloaded,
because this function unconditionally restarts the timer, the timer can still
be pending after the delayed work has been cancelled. To fix this simply loop
on the delayed work cancel procedure as long as it returns non-zero.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:48:39 +00:00
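The drain loop from d3300d4aed can be modeled with a toy work item. Everything below is a hypothetical simulation, not the kernel's delayed work API: the cancel helper reports whether something was pending and, to mimic a handler that unconditionally restarts its timer, may leave the item pending again.

```c
#include <stdbool.h>

/* Toy model of a delayed work item whose handler re-arms its own timer. */
struct fake_dwork {
	bool pending;		/* timer or work is queued */
	int rearm_budget;	/* times a concurrently running handler will still re-arm */
};

/*
 * Hypothetical cancel: returns non-zero if something was pending. A handler
 * that was running at the time may have restarted the timer in the meantime.
 */
static int fake_cancel_delayed_work_sync(struct fake_dwork *w)
{
	int was_pending = w->pending;

	w->pending = false;
	if (was_pending && w->rearm_budget > 0) {
		w->rearm_budget--;
		w->pending = true;	/* handler restarted the timer */
	}
	return was_pending;
}

/* Drain: keep cancelling until a call reports that nothing was pending. */
static void fake_drain(struct fake_dwork *w)
{
	while (fake_cancel_delayed_work_sync(w) != 0)
		;
}
```

A single cancel call can leave the item pending in this model, which is the escape the commit message describes; looping until the call returns zero closes it.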
Slava Shwartsman
ac2fdeb4e7 mlx5ib: Fix null pointer dereference in mlx5_ib_create_srq
Although "create_srq_user" does overwrite "in.pas" on some paths, it
also contains at least one feasible path which does not overwrite it.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:48:10 +00:00
Slava Shwartsman
c9e9b5c104 mlx5ib: Fix sign extension in mlx5_ib_query_device
"fw_rev_min(dev->mdev)" with type "unsigned short" (16 bits, unsigned) is
promoted in "fw_rev_min(dev->mdev) << 16" to type "int" (32 bits, signed), then
sign-extended to type "unsigned long" (64 bits, unsigned). If
"fw_rev_min(dev->mdev) << 16" is greater than 0x7FFFFFFF, the upper bits of the
result will all be 1.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:47:41 +00:00
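The sign extension described in c9e9b5c104 can be reproduced outside the driver. In the sketch below the function names are hypothetical. Because shifting a promoted int into its sign bit is formally undefined in C, the buggy path is modeled with explicit casts instead; the unsigned-to-signed conversion is implementation-defined and is assumed to wrap, as it does on common two's-complement compilers.

```c
#include <stdint.h>

/*
 * Models the buggy pattern: the 16-bit value ends up as a signed 32-bit
 * quantity after the shift and is then widened to 64 bits, which
 * sign-extends when bit 31 is set.
 */
static uint64_t fw_minor_shift_sign_extended(uint16_t minor)
{
	int32_t shifted = (int32_t)((uint32_t)minor << 16);	/* assumed to wrap */

	return (uint64_t)shifted;
}

/* Widen to 64 bits before shifting so no sign bit is involved. */
static uint64_t fw_minor_shift_fixed(uint16_t minor)
{
	return (uint64_t)minor << 16;
}
```

For inputs below 0x8000 both helpers agree; from 0x8000 upward the first one sets all of the upper 32 bits.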
Slava Shwartsman
31c3f64819 mlx5: Fix driver version location
Driver description should be set by core and not by the Ethernet driver.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:47:10 +00:00
Slava Shwartsman
721a1a6a69 mlx5: Fixes to allow command polling mode to exist alongside event mode.
A command is either polling or event driven and the mode cannot change
during execution of a command. Make sure the event handler only handles
commands which are not polled. This is done by checking the command mode
in the command handler before completing commands.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:46:39 +00:00
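The mode check from 721a1a6a69 might look roughly like this. The enum, struct, and handler are hypothetical simplifications, not the driver's command interface; they only show an event handler that skips commands issued in polling mode.

```c
#include <stdbool.h>
#include <stddef.h>

enum fake_cmd_mode { FAKE_CMD_MODE_POLLING, FAKE_CMD_MODE_EVENTS };

struct fake_cmd {
	enum fake_cmd_mode mode;	/* fixed when the command is issued */
	bool completed;
};

/*
 * Event handler: complete only event-driven commands that are still
 * outstanding. Polled commands are left for their poller.
 * Returns the number of commands completed by this call.
 */
static int fake_event_handler(struct fake_cmd *cmds, size_t n)
{
	int done = 0;

	for (size_t i = 0; i < n; i++) {
		if (cmds[i].mode != FAKE_CMD_MODE_EVENTS || cmds[i].completed)
			continue;
		cmds[i].completed = true;
		done++;
	}
	return done;
}
```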
Slava Shwartsman
b084af6cdc mlx5: Fix wrong size allocation for QoS ETC TC register
The driver allocates wrong size (due to wrong struct name) when issuing
a query/set request to NIC's register.

Linux commit:
d14fcb8d877caf1b8d6bd65d444bf62b21f2070c

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:46:09 +00:00
Slava Shwartsman
8718eb63f8 mlx5: Add software tx_jumbo_packets counter
This counter will represent transmitted packets which have more than
1518 octets.
The NIC has multiple hardware counters for counting transmitted
packets larger than 1518 octets. Each counter counts the packets
in specific range.
We accumulate those counters to have a single counter.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:37 +00:00
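The accumulation described in 8718eb63f8 can be sketched as a sum over per-range counters. The struct layout and the ranges used in any caller are hypothetical; only the 1518-octet threshold comes from the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-range TX packet counter; real hardware ranges may differ. */
struct fake_tx_range_counter {
	uint32_t min_octets;	/* smallest packet size counted by this range */
	uint64_t packets;	/* packets transmitted in this range */
};

/* Sum every range that only counts packets larger than 1518 octets. */
static uint64_t tx_jumbo_packets(const struct fake_tx_range_counter *c, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (c[i].min_octets > 1518)
			total += c[i].packets;
	}
	return total;
}
```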
Slava Shwartsman
feb5f357ea mlx5: Implement support for configuring PCIe packet write ordering via a sysctl.
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:08 +00:00
Slava Shwartsman
70b417cf90 mlx5: Extend vector argument to u64.
Else the MLX5_TRIGGERED_CMD_COMP flag will be masked away.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:38 +00:00
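The masking problem behind 70b417cf90 is easy to show in isolation. The flag value below is an assumption: the commit message only implies that MLX5_TRIGGERED_CMD_COMP does not fit in the narrower argument, so a bit above bit 31 is used here as a stand-in, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for MLX5_TRIGGERED_CMD_COMP: a flag above bit 31. */
#define FAKE_TRIGGERED_CMD_COMP ((uint64_t)1 << 32)

/* Narrow argument: the flag has already been truncated away by the caller. */
static bool triggered_u32(uint32_t vector)
{
	return ((uint64_t)vector & FAKE_TRIGGERED_CMD_COMP) != 0;
}

/* Wide argument: the flag survives alongside the low vector bits. */
static bool triggered_u64(uint64_t vector)
{
	return (vector & FAKE_TRIGGERED_CMD_COMP) != 0;
}
```

Passing the same value through both helpers shows the difference: the 32-bit variant can never observe the flag.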
Slava Shwartsman
29e544513e mlx5: Add global control to disable firmware reset, for all mlx5 devices.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:08 +00:00
Slava Shwartsman
2119f825d1 mlx5: Fix use-after-free in self-healing flow
When the mlx5 health mechanism detects a problem while the driver
is in the middle of init_one or remove_one, the driver needs to prevent
the health mechanism from scheduling future work; if future work
is scheduled, there is a problem with use-after-free: the system WQ
tries to run the work item (which has been freed) at the scheduled
future time.

Prevent this by disabling work item scheduling in the health mechanism
when the driver is in the middle of init_one() or remove_one().

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:37 +00:00
Slava Shwartsman
8f7f07368d mlx5: Move hw.mlx5 node definition to mlx5_core.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:07 +00:00