Commit Graph

231 Commits

Author SHA1 Message Date
Hans Petter Selasky
e5eae1dc7d Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:26:33 +00:00
Hans Petter Selasky
c322dbafd5 Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:25:14 +00:00
Hans Petter Selasky
423530be04 Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).
Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:23:33 +00:00
Andrew Gallatin
50575ce11c Track TCP connection's NUMA domain in the inpcb
Drivers can now pass up numa domain information via the
mbuf numa domain field.  This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.

Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.

Reviewed by:	markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20028
2019-04-25 15:37:28 +00:00
Andrew Gallatin
7687707dd4 Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory
This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets.  When called with a domain on a NUMA machine,
ifalloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node.  Similarly, if_alloc_dev() is a wrapper for if_alloc_domain
which uses a driver supplied device_t to call ifalloc_domain() with
the appropriate domain.

Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.

Reviewed by:	glebius, kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19930
2019-04-22 19:24:21 +00:00
Andrew Gallatin
538ff57b75 mlx5en: Enable new pfil(9) KPI ethernet filtering hooks
This allows efficient filtering at packet ingress on mlx5en.

Note that the packets are filtered (and potentially dropped) *before*
the driver has committed to (re)allocating an mbuf for the
packet. Dropped packets are treated essentially the same as an
error. Nothing is allocated, and the existing buffer is recycled. This
allows us to drop malicious packets at close to line rate with very
little CPU use.

Reviewed by:	hselasky, slavash, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19063
2019-04-15 17:14:50 +00:00
Konstantin Belousov
a748d99a17 Fix build with option RSS, removing unused variables.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 21:52:40 +00:00
Konstantin Belousov
e206dc6479 Appease gcc build, remove duplicated declaration.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 19:20:00 +00:00
Sean Bruno
5fc3b4acab Change u32 to uint32_t to allow the native-xtools target to build
libsysdecode.

Submitted by:	kib
2018-12-06 18:59:33 +00:00
Slava Shwartsman
0f3b263d83 mlx4/mlx5: Updated driver version to 3.5.0
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:34 +00:00
Slava Shwartsman
cc971b2261 mlx5en: Implement backpressure indication.
The backpressure indication is implemented using an unlimited rate type of
mbuf send tag. When the upper layers typically the socket layer has obtained such
a tag, it can then query the destination driver queue for the current
amount of space available in the send queue.

A single mbuf send tag may be referenced multiple times and a refcount has been added
to the mlx5e_priv structure to track its usage. Because the send tag resides
in the mlx5e_channel structure, there is no need to wait for refcounts to reach
zero until the mlx4en(4) driver is detached. The channels structure is persistant
during the lifetime of the mlx5en(4) driver it belongs to and can so be accessed
without any need of synchronization.

The mlx5e_snd_tag structure was extended to contain a type field, because there are now
two different tag types which end up in the driver which need to be distinguished.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:03 +00:00
Slava Shwartsman
71defeda26 mlx5en: Improve configuration of HW LRO.
In order to enable HW LRO, both the "hw_lro" sysctl in the mlx5en(4) config
space must be set, and the ifconfig(8) LRO capability must be set. Any other
settings will disable HW LRO.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:33 +00:00
Slava Shwartsman
01f02abfec mlx5en: Count all transmitted and received bytes.
Add counter for all transmitted and received bytes. Currently only all
transmitted and received packets were counted. Fix description of RX LRO
counters while at it.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:02 +00:00
Slava Shwartsman
3230c29d72 mlx5en: Statically allocate and free the channel structure(s).
By allocating the worst case size channel structure array
at attach time we can eliminate various NULL checks in the
fast path. And also reduce the chance for use-after-free
issues in the transmit fast path.

This change is also a requirement for implementing
backpressure support.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:31 +00:00
Slava Shwartsman
b3cf149325 mlx5en: Fix race in mlx5e_ethtool_debug_stats().
Writing to the debug stats variable must be locked,
else serialization will be lost which might cause
various kernel panics due to creating and destroying
sysctls out of order.

Make sure the sysctl context is initialized after freeing
the sysctl nodes, else they can be freed twice.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:01 +00:00
Slava Shwartsman
42390bb884 mlx5en: Add support for IFM_10G_LR and IFM_40G_ER4 media types.
Inspect the ethernet compliance code to figure out actual cable type by reading
the PDDR module info register.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:22:30 +00:00
Slava Shwartsman
f292a94d6b mlx5en: Don't set rate on SQs when the SQ is already stopped.
This can happen when connections are short lived and leads to
a firmware error printout in dmesg, syndrome 0x51cfb0, because
the SQ is in the wrong state.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:59 +00:00
Slava Shwartsman
3e581cabf0 mlx5en: Fix for inlining issues in transmit path
1) Don't exceed the drivers own hardcoded TX inline limit.

The blueflame register size can be much greater than the hardcoded limit
for inlining. Make sure we don't exceed the drivers own limit, because this
also means that the maximum number of TX fragments becomes invalid and
then memory size assumptions in the TX path no longer hold up.

2) Make sure the mlx5_query_min_inline() function returns an error code.

3) Header inlining is required when using TSO.

4) Catch failure to compute inline header size for TSO.

5) Add support for UDP when computing inline header size.

6) Fix for inlining issues with regards to DSCP.

Make sure we inline 4 bytes beyond the ethernet and/or
VLAN header to workaround a hardware bug extracting
the DSCP field from the IPv4/v6 header.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:28 +00:00
Slava Shwartsman
d51ced5fae mlx5en: Remove the DRBR and associated logic in the transmit path.
The hardware queues are deep enough currently and using the DRBR and associated
callbacks only leads to more task switching in the TX path. The is also a race
setting the queue_state which can lead to hung TX rings.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:20:57 +00:00
Slava Shwartsman
e870c0ab61 mlx5en: Implement support for bandwidth limiting in by ratio, ETS.
Add support for setting the bandwidth limit as a ratio rather than in bits per
second. The ratio must be an integer number between 1 and 100 inclusivly.

Implement the needed firmware commands and SYSCTLs through mlx5en(4).

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:20:26 +00:00
Slava Shwartsman
d82f1c13ad mlx5fpga: Add set and query connect/disconnect FPGA
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:19:55 +00:00
Slava Shwartsman
085b35bb69 mlx5fpga: IOCTL for FPGA temperature measurement
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:19:23 +00:00
Slava Shwartsman
b5f5751275 mlx5fpga: Support MorseQ board
Added and supported new enum "morseQ = 4" for fpga_id field

Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:18:52 +00:00
Slava Shwartsman
d50c55f17d mlx5fpga_tools initial code import.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:17:22 +00:00
Slava Shwartsman
e9dcd83155 mlx5fpga: Initial code import.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:11:20 +00:00
Slava Shwartsman
67db687393 mlx5ib: Set default active width and speed when querying port.
Make sure the active width and speed is set in case the
translate_eth_proto_oper() function doesn't recognize the
current port operation mask.

Linux commit:
7672ed33c4c15dbe9d56880683baaba4227cf940

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:49:11 +00:00
Slava Shwartsman
d3300d4aed mlx5ib: Make sure the congestion work timer does not escape the drain procedure.
If the mlx5_ib_read_cong_stats() function was running when mlx5ib was unloaded,
because this function unconditionally restarts the timer, the timer can still
be pending after the delayed work has been cancelled. To fix this simply loop
on the delayed work cancel procedure as long as it returns non-zero.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:48:39 +00:00
Slava Shwartsman
ac2fdeb4e7 mlx5ib: Fix null pointer dereference in mlx5_ib_create_srq
Although "create_srq_user" does overwrite "in.pas" on some paths, it
also contains at least one feasible path which does not overwrite it.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:48:10 +00:00
Slava Shwartsman
c9e9b5c104 mlx5ib: Fix sign extension in mlx5_ib_query_device
"fw_rev_min(dev->mdev)" with type "unsigned short" (16 bits, unsigned) is
promoted in "fw_rev_min(dev->mdev) << 16" to type "int" (32 bits, signed), then
sign-extended to type "unsigned long" (64 bits, unsigned). If
"fw_rev_min(dev->mdev) << 16" is greater than 0x7FFFFFFF, the upper bits of the
result will all be 1.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:47:41 +00:00
Slava Shwartsman
31c3f64819 mlx5: Fix driver version location
Driver description should be set by core and not by the Ethernet driver.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:47:10 +00:00
Slava Shwartsman
721a1a6a69 mlx5: Fixes to allow command polling mode to exist alongside event mode.
A command is either polling or event driven and the mode cannot change
during execution of a command. Make sure the event handler only handle
commands which are not polled. This is done by checking the command mode
in the command handler before completing commands.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:46:39 +00:00
Slava Shwartsman
b084af6cdc mlx5: Fix wrong size allocation for QoS ETC TC register
The driver allocates wrong size (due to wrong struct name) when issuing
a query/set request to NIC's register.

Linux commit:
d14fcb8d877caf1b8d6bd65d444bf62b21f2070c

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:46:09 +00:00
Slava Shwartsman
8718eb63f8 mlx5: Add software tx_jumbo_packets counter
This counter will represent transmitted packets which has more than
1518 octets.
The NIC has multiple hardware counters for counting transmitted
packets larger than 1518 octets. Each counter counts the packets
in specific range.
We accumulate those counters to have a single counter.

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:37 +00:00
Slava Shwartsman
feb5f357ea mlx5: Implement support for configuring PCIe packet write ordering via a sysctl.
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:45:08 +00:00
Slava Shwartsman
70b417cf90 mlx5: Extend vector argument to u64.
Else the MLX5_TRIGGERED_CMD_COMP flag will be masked away.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:38 +00:00
Slava Shwartsman
29e544513e mlx5: Add global control to disable firmware reset, for all mlx5 devices.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:44:08 +00:00
Slava Shwartsman
2119f825d1 mlx5: Fix use-after-free in self-healing flow
When the mlx5 health mechanism detects a problem while the driver
is in the middle of init_one or remove_one, the driver needs to prevent
the health mechanism from scheduling future work; if future work
is scheduled, there is a problem with use-after-free: the system WQ
tries to run the work item (which has been freed) at the scheduled
future time.

Prevent this by disabling work item scheduling in the health mechanism
when the driver is in the middle of init_one() or remove_one().

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:37 +00:00
Slava Shwartsman
8f7f07368d mlx5: Move hw.mlx5 node definition to mlx5_core.
Submitted by:   kib@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:43:07 +00:00
Slava Shwartsman
63cc6d1bc2 mlx5: Convert some spaces into tabs and use device_printf() instead of printf().
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:42:36 +00:00
Slava Shwartsman
abb28d287b mlx5: Add SRQ fixes from Linux
Combine multiple fixes from Linux to SRQ.
Linux commits:
c73b791 IB/mlx5: Assign SRQ type earlier
0fd27a8 IB/mlx5: Fix out-of-bound access
c2b37f7 IB/mlx5: Fix integer overflows in mlx5_ib_create_srq
d63c467 RDMA/mlx5: Fix memory leak in mlx5_ib_create_srq() error path

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:42:06 +00:00
Slava Shwartsman
3b21d18587 mlx5: Fix for potential memory leaks.
Make sure allocated data gets freed in error cases.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:41:37 +00:00
Slava Shwartsman
07b624ed71 mlx5: Discard unused return values.
Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:41:06 +00:00
Slava Shwartsman
843a89d37e mlx5: Raise fatal IB event when sys error occurs
All other mlx5_events report the port number as 1 based, which is how FW
reports it in the port event EQE. Reporting 0 for this event causes
mlx5_ib to not raise a fatal event notification to registered clients
due to a seemingly invalid port.

All switch cases in mlx5_ib_event that go through the port check are
supposed to set the port now, so just do it once at variable
declaration.

Linux commit:
aba462134634b502d720e15b23154f21cfa277e5

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:40:36 +00:00
Slava Shwartsman
2bf40c3608 mlx5: Fix integer overflow while resizing CQ
The user can provide very large cqe_size which will cause to integer
overflow.

Linux commit:
28e9091e3119933c38933cb8fc48d5618eb784c8

Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 13:40:05 +00:00
Slava Shwartsman
0c79f82cf0 mlx5: Notify user that the ConnectX-6 shutdown its port due to power limitation
If power exceed the slot limit, or slot limit is unknown the ConnectX-6
firmware will shutdown its port.
Inform the user via debug message.

MFC after:      3 days
Approved by:    hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-10-22 10:38:38 +00:00
Hans Petter Selasky
6ed134c41b Make the MSIX module parameter limit per device, in mlx5en(4).
MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:41:09 +00:00
Hans Petter Selasky
16ae32f927 Add support for receive side scaling stride, RSSS, in mlx5en(4).
The receive side scaling stride parameter is a value which define the interval
between active receive side queues. The traffic for the inactive queues is
redirected to the nearest active queue by use of modulus. The default value
of this parameter is one, which means all receive side queues are used.

The point of this feature is to redirect more traffic to fewer receive side
queues in order to take more advantage of sorted large receive offload,
sorted LRO. The sorted LRO works better when more packets are accumulated
per service interval.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:28:06 +00:00
Hans Petter Selasky
0be5034007 Don't stall transmit queue on drops in mlx5en(4).
When a transmitted packet is dropped don't stall the transmit queue.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 12:19:36 +00:00
Hans Petter Selasky
2d32b0a304 Maximum number of mbuf frags is off-by-one for worst case scenario in mlx5en(4).
Inspecting the PRM no more than 0x3F data segments, DS, of size 16 bytes is
allowed.

Worst case scenario summary of DS usage:
Header is fixed:	2 DS
Maximum inlining:	98 => (98 - 2) / 16 = 6 DS
Remainder:		0x3F - 2 - 6 = 55 DS (mbuf frags)

Previously a value of 56 DS was used and this would work in the
normal case because not all inline data area was used up.

MFC after:		3 days
Approved by:		re (marius)
Sponsored by:		Mellanox Technologies
2018-09-06 11:06:07 +00:00
Hans Petter Selasky
7b9b93a8dd Update version information for the mlx5 and mlx5en(4) modules.
While at it bump some copyright dates.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-07-18 10:12:53 +00:00