Commit Graph

562 Commits

Author SHA1 Message Date
Konstantin Belousov
bab0c4b1a0 mlx5en: stop ignoring pauses and flow in the media reqs.
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2020-11-12 02:23:27 +00:00
Konstantin Belousov
559dbeac47 mlx5en: Register all combinations of FDX/RXPAUSE/TXPAUSE as valid media types.
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2020-11-12 02:22:16 +00:00
Konstantin Belousov
4ead80241a mlx5en: Refactor repeated code to register media type to mlx5e_ifm_add().
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2020-11-12 02:21:14 +00:00
John Baldwin
b7d92a6683 Remove IF_SND_TAG_TYPE_TLS_RATE_LIMIT conditionals.
Support for TLS rate limit tags is now in the tree, so this macro is
always defined.

Reviewed by:	hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D27020
2020-10-30 21:05:50 +00:00
John Baldwin
418b5444f8 Fix a couple of silly bugs in r367149.
- Assign the TLS rate limit value to the correct member of the
  rl_params for the nested rate limit tag.

- Remove a dead condition.

Pointy hat to:	jhb
2020-10-30 00:06:36 +00:00
John Baldwin
36e0a362ac Add m_snd_tag_alloc() as a wrapper around if_snd_tag_alloc().
This gives a more uniform API for send tag life cycle management.

Reviewed by:	gallatin, hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D27000
2020-10-29 23:28:39 +00:00
John Baldwin
638000c0b6 Use public interfaces to manage the nested rate limit send tag.
Each TLS send tag in mlx5 contains a nested rate limit send tag.
Previously, the driver was calling internal functions to manage the
nested tag.  Calling free methods directly instead of m_snd_tag_rele()
leaked send tag references and references on the ifp.  Changes to use
the ifp methods for the nested tag for other methods are more cosmetic
but do simplify the code.

Reviewed by:	gallatin, hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26996
2020-10-29 22:22:27 +00:00
John Baldwin
521eac97f3 Support hardware rate limiting (pacing) with TLS offload.
- Add a new send tag type for a send tag that supports both rate
  limiting (packet pacing) and TLS offload (mostly similar to D22669
  but adds a separate structure when allocating the new tag type).

- When allocating a send tag for TLS offload, check to see if the
  connection already has a pacing rate.  If so, allocate a tag that
  supports both rate limiting and TLS offload rather than a plain TLS
  offload tag.

- When setting an initial rate on an existing ifnet KTLS connection,
  set the rate in the TCP control block inp and then reset the TLS
  send tag (via ktls_output_eagain) to reallocate a TLS + ratelimit
  send tag.  This allocates the TLS send tag asynchronously from a
  task queue, so the TLS rate limit tag alloc is always sleepable.

- When modifying a rate on a connection using KTLS, look for a TLS
  send tag.  If the send tag is only a plain TLS send tag, assume we
  failed to allocate a TLS ratelimit tag (either during the
  TCP_TXTLS_ENABLE socket option, or during the send tag reset
  triggered by ktls_output_eagain) and ignore the new rate.  If the
  send tag is a ratelimit TLS send tag, change the rate on the TLS tag
  and leave the inp tag alone.

- Lock the inp lock when setting sb_tls_info for a socket send buffer
  so that the routines in tcp_ratelimit can safely dereference the
  pointer without needing to grab the socket buffer lock.

- Add an IFCAP_TXTLS_RTLMT capability flag and associated
  administrative controls in ifconfig(8).  TLS rate limit tags are
  only allocated if this capability is enabled.  Note that TLS offload
  (whether unlimited or rate limited) always requires IFCAP_TXTLS[46].

Reviewed by:	gallatin, hselasky
Relnotes:	yes
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26691
2020-10-29 00:23:16 +00:00
Hans Petter Selasky
194ddc011a Properly cleanup driver during remove_one() in mlx5core.
Cleanup all host resources, SYSCTLs, MSIX vectors and memory used
by the host and only leave the device allocated memory behind, if any,
because it may still be in use, when the PCI remove function is called.
Else future probe calls may fail due to SYSCTLs already existing.

MFC after:		1 week
Sponsored by:		Mellanox Technologies // NVIDIA Networking
2020-10-07 17:46:49 +00:00
John Baldwin
56fb710f1b Store the send tag type in the common send tag header.
Both cxgbe(4) and mlx5(4) wrapped the existing send tag header with
their own identical headers that stored the type that the
type-specific tag structures inherited from, so in practice it seems
drivers need this in the tag anyway.  This permits removing these
extra header indirections (struct cxgbe_snd_tag and struct
mlx5e_snd_tag).

In addition, this permits driver-independent code to query the type of
a tag, e.g. to know what type of tag is being queried via
if_snd_query.

Reviewed by:	gallatin, hselasky, np, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26689
2020-10-06 17:58:56 +00:00
Hans Petter Selasky
39b0f9c389 Poll statistics more frequently in mlx5en(4).
This makes traffic steering algorithms more accurate.

MFC after:	1 week
Submitted by:	gallatin @
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-09-14 14:24:54 +00:00
Hans Petter Selasky
78ae1e6e15 Make hardware TLS send tag allocation synchronous in mlx5en(4).
Previously the send tag was setup in the background, and all packets for
the given send tag were dropped until ready. Change this to be blocking
behaviour so that once the setsocketopt() for enabling TLS completes,
the socket is ready to send packets. Do this by simply flushing the
work request which does the needed firmware programming during send
tag allocation.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // Nvidia
2020-09-01 12:21:17 +00:00
Konstantin Belousov
596b98ba16 mlx5 sriov: Add controls for VFs to set port/node GUIDs.
Setting GUIDs make RoCE offloads functional on VFs.

Reported and tested by:	chuck
Sponsored by:	Mellanox Technologies - Nvidia
MFC after:	1 week
2020-08-31 16:32:17 +00:00
Konstantin Belousov
cca1f7a12f mlx5 sriov: add error message for failed MAC programming on VF.
Sponsored by:	Mellanox Technologies - Nvidia
MFC after:	1 week
2020-08-31 16:30:52 +00:00
Konstantin Belousov
2ea114b34e mlx5en: Implement SIOCGIFDOWNREASON.
Sponsored by:	Mellanox Technologies - Nvidia
MFC after:	1 week
2020-08-31 16:27:03 +00:00
Konstantin Belousov
62daa4b6e8 mlx5_core: add mlx5_query_pddr().
And use it in mlx5_query_pddr_range_info() instead of direct register
access.

Sponsored by:	Mellanox Technologies - Nvidia
MFC after:	1 week
2020-08-31 16:25:55 +00:00
Konstantin Belousov
e088db5eae mlx5_core: Import PDDR register definitions
PDDR (Port Diagnostics Database Register) is used to read the physical
layer debug database, which contains helpful troubleshooting information
regarding the state of the link.

PDDR register can only be queried when PCAM register reports it as
supported in its register mask. A new helper macro was added to
the MLX5_CAP_* infrastructure in order to access this mask.

Sponsored by:	Mellanox Technologies - Nvidia
MFC after:	1 week
2020-08-31 16:23:51 +00:00
Hans Petter Selasky
1866c98e64 Infiniband clients must be attached and detached in a specific order in ibcore.
Currently the linking order of the infiniband, IB, modules decide in which
order the clients are attached and detached. For example one IB client may
use resources from another IB client. This can lead to a potential deadlock
at shutdown. For example if the ipoib is unregistered after the ib_multicast
client is detached, then if ipoib is using multicast addresses a deadlock may
happen, because ib_multicast will wait for all its resources to be freed before
returning from the remove method.

Fix this by using module_xxx_order() instead of module_xxx().

Differential Revision:	https://reviews.freebsd.org/D23973
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-06 08:50:11 +00:00
Konstantin Belousov
92d8df2f37 mlx5_core: remove unneccessary LFENCE instruction.
Use fence instead of barrier, which is optimized to take advantage of
the x86 TSO memory model.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2020-07-02 10:44:45 +00:00
Hans Petter Selasky
11304ef50e Fix HW TLS offload regression issue after r359919, in mlx5en(4).
Changes in the mbuf layout regarding HW TLS, resulted in wrong detection
of starting mbuf. Use a boolean variable to handle this and pass m_adj()
the top mbuf, so that the packet header is adjusted correctly.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-17 11:14:54 +00:00
Ryan Moeller
cbb9ccf735 Avoid trying to toggle TSO twice
Remove TSO from the toggle mask when automatically disabled by TXCKSUM* in
various NIC drivers.

Reviewed by:	hselasky, np, gallatin, jpaetzel
Approved by:	mav (mentor)
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25120
2020-06-15 16:35:27 +00:00
Hans Petter Selasky
6fe9e470bb Make sure packets generated by raw IP code is let through by mlx5en(4).
Allow the TCP header to reside in the mbuf following the IP header.
Else such packets will get dropped.

Backtrace:
mlx5e_sq_xmit()
mlx5e_xmit()
ether_output_frame()
ether_output()
ip_output_send()
ip_output()
rip_output()
sosend_generic()
sosend()
kern_sendit()
sendit()
sys_sendto()
amd64_syscall()
fast_syscall_common()

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:41:54 +00:00
Hans Petter Selasky
b63b61cc75 Extend use of unlikely() in the fast path, in mlx5en(4).
Typically the TCP/IP headers fit within the first mbuf and should not
trigger any of the error cases. Use unlikely() for these cases.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:38:51 +00:00
Hans Petter Selasky
9eb1e4aa21 Use const keyword when parsing the TCP/IP header in the fast path in mlx5en(4).
When parsing the TCP/IP header in the fast path, make it clear by using
the const keyword, no fields are to be modified inside the transmitted
packet.

No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-11 09:36:37 +00:00
Hans Petter Selasky
bf43f9812c Sync with Linux packet pacing enhancements in mlx5en(4).
Linux commit:
05d3ac978ed25b753bfe34fe76c50c31ee506a82

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-26 07:41:46 +00:00
Hans Petter Selasky
ce69b84204 Improve set progress parameters, SET PSV for HW TLS in mlx5en(4).
There is no need for a fence and there is no need to provide
the TCP sequence number.

Sponsored by:	Mellanox Technologies
2020-05-25 12:37:45 +00:00
Hans Petter Selasky
233a6665b6 Correctly set the initial vector for TLS v1.3 for mlx5en(4).
For TLS v1.3 the 12 bytes of the initial vector, IV, should just be copied
as-is from the kernel to the gcm_iv field, which hold the first 4 bytes,
and the remaining 8 bytes go to the subsequent implicit_iv field.
There is no need to consider the byte order on the 12 bytes of IV like
initially done.

Sponsored by:	Mellanox Technologies
2020-05-25 12:34:15 +00:00
Hans Petter Selasky
9550e3403e Update the TLS capability bit after recent PRM changes in mlx5en(4).
A CX6-DX firmware version equal to or newer than 12.27.0372 is
now required.

Sponsored by:	Mellanox Technologies
2020-05-25 12:31:48 +00:00
Konstantin Belousov
d0a4068359 mlx5_core: add more port module event types to decode.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2020-05-20 11:20:45 +00:00
Konstantin Belousov
6418350cf4 mlx5_core: add "PMD type not enabled" port module event type.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	3 days
2020-05-20 11:10:10 +00:00
Gleb Smirnoff
6edfd179c8 Step 4.1: mechanically rename M_NOMAP to M_EXTPG
Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D24598
2020-05-03 00:21:11 +00:00
Gleb Smirnoff
7b6c99d08d Step 3: anonymize struct mbuf_ext_pgs and move all its fields into mbuf
within m_epg namespace.
All edits except the 'struct mbuf' declaration and mb_dupcl() were done
mechanically with sed:

s/->m_ext_pgs.nrdy/->m_epg_nrdy/g
s/->m_ext_pgs.hdr_len/->m_epg_hdrlen/g
s/->m_ext_pgs.trail_len/->m_epg_trllen/g
s/->m_ext_pgs.first_pg_off/->m_epg_1st_off/g
s/->m_ext_pgs.last_pg_len/->m_epg_last_len/g
s/->m_ext_pgs.flags/->m_epg_flags/g
s/->m_ext_pgs.record_type/->m_epg_record_type/g
s/->m_ext_pgs.enc_cnt/->m_epg_enc_cnt/g
s/->m_ext_pgs.tls/->m_epg_tls/g
s/->m_ext_pgs.so/->m_epg_so/g
s/->m_ext_pgs.seqno/->m_epg_seqno/g
s/->m_ext_pgs.stailq/->m_epg_stailq/g

Reviewed by:	gallatin
Differential Revision:	https://reviews.freebsd.org/D24598
2020-05-03 00:12:56 +00:00
Gleb Smirnoff
6fbcdeb6f1 Step 2.4: Stop using 'struct mbuf_ext_pgs' in drivers.
Reviewed by:	gallatin, hselasky
Differential Revision:	https://reviews.freebsd.org/D24598
2020-05-02 23:58:20 +00:00
Hans Petter Selasky
decb087cc2 Add support for reading temperature in mlx5en(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-27 14:35:39 +00:00
Andrew Gallatin
23feb56348 KTLS: Re-work unmapped mbufs to carry ext_pgs in the mbuf itself.
While the original implementation of unmapped mbufs was a large
step forward in terms of reducing cache misses by enabling mbufs
to carry more than a single page for sendfile, they are rather
cache unfriendly when accessing the ext_pgs metadata and
data. This is because the ext_pgs part of the mbuf is allocated
separately, and almost guaranteed to be cold in cache.

This change takes advantage of the fact that unmapped mbufs
are never used at the same time as pkthdr mbufs. Given this
fact, we can overlap the ext_pgs metadata with the mbuf
pkthdr, and carry the ext_pgs meta directly in the mbuf itself.
Similarly, we can carry the ext_pgs data (TLS hdr/trailer/array
of pages) directly after the existing m_ext.

In order to be able to carry 5 pages (which is the minimum
required for a 16K TLS record which is not perfectly aligned) on
LP64, I've had to steal ext_arg2. The only user of this in the
xmit path is sendfile, and I've adjusted it to use arg1 when
using unmapped mbufs.

This change is almost entirely mechanical, except that we
change mb_alloc_ext_pgs() to no longer allow allocating
pkthdrs, the change to avoid ext_arg2 as mentioned above,
and the removal of the ext_pgs zone,

This change saves roughly 2% "raw" CPU (~59% -> 57%), or over
3% "scaled" CPU on a Netflix 100% software kTLS workload at
90+ Gb/s on Broadwell Xeons.

In a follow-on commit, I plan to remove some hacks to avoid
access ext_pgs fields of mbufs, since they will now be in
cache.

Many thanks to glebius for helping to make this better in
the Netflix tree.

Reviewed by:	hselasky, jhb, rrs, glebius (early version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D24213
2020-04-14 14:46:06 +00:00
Hans Petter Selasky
bd88e5f28f Account out of buffer as dropped packets in mlx5en(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-08 08:56:27 +00:00
Hans Petter Selasky
d182de8661 Remove obsolete bufring stats in mlx5en(4).
Leftover from when DRBR was removed.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-08 08:53:31 +00:00
Hans Petter Selasky
cd1442c0ff Don't drop packets having too many TCP option headers in mlx5en(4).
When using SACK it can happen there are multiple option headers.
Don't drop these packets, but instead limit the amount of inlining
to the maximum supported.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-06 09:50:20 +00:00
Hans Petter Selasky
9c9b73403c Ensure a minimum inline size of 16 bytes in mlx5en(4).
This includes 14 bytes of ethernet header and 2 bytes of VLAN header.

This allows for making assumptions about the inline size limit
in the fast transmit path later on.

Use a signed integer variable to catch underflow.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-06 09:45:49 +00:00
Hans Petter Selasky
f504949065 Count number of times transmit ring is out of buffers in mlx5en(4).
Differential Revision:	https://reviews.freebsd.org/D24273
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-06 09:41:22 +00:00
Konstantin Belousov
4ad58ea85b mlx5_core: lower the severity of message noting that no SR-IOV cap is present.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:47:14 +00:00
Konstantin Belousov
bbcb656af2 mlx5: Route NIC_VPORT_CHANGE events to eswitch code.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:44:48 +00:00
Konstantin Belousov
90959e7e37 mlx5: Read number of VF ports from the SR-IOV cap.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:43:39 +00:00
Konstantin Belousov
18a70fa574 mlx5: Use eswitch interface to configure VFs steering.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:40:26 +00:00
Konstantin Belousov
8982c8003b mlx5: Add 'follow' vport state, relevant for VFs.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:38:57 +00:00
Konstantin Belousov
f6ca0b216a mlx5: Integrate eswitch and mpfs management code.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:33:39 +00:00
Konstantin Belousov
91ad1bd953 mlx5: Restore eswitch management code from attic.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:30:56 +00:00
Konstantin Belousov
9dfa078252 mlx5: Basic PCIe side of SR-IOV support.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:17:01 +00:00
Konstantin Belousov
e19a968f15 mlx5_core: add sysctls to report device capabilities.
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 21:54:32 +00:00
Konstantin Belousov
96dad2b720 mlx5en: Support 50GBase-KR4 media type in mlx5en driver.
Submitted by:	Adam Peace <adam.e.peace@gmail.com>
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2020-03-04 17:13:35 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Gleb Smirnoff
e87c494015 Although most of the NIC drivers are epoch ready, due to peer pressure
switch over to opt-in instead of opt-out for epoch.

Instead of IFF_NEEDSEPOCH, provide IFF_KNOWSEPOCH. If driver marks
itself with IFF_KNOWSEPOCH, then ether_input() would not enter epoch
when processing its packets.

Now this will create recursive entrance in epoch in >90% network
drivers, but will guarantee safeness of the transition.

Mark several tested drivers as IFF_KNOWSEPOCH.

Reviewed by:		hselasky, jeff, bz, gallatin
Differential Revision:	https://reviews.freebsd.org/D23674
2020-02-24 21:07:30 +00:00
Hans Petter Selasky
9b6e45d459 Fix broken MLX5_IB_INDEX() macro in mlx5ib(4).
The index should be computed as distance from arg[0] and not
the beginning of struct mlx5_ib_congestion .

While at it fix a use of zero length array to avoid depending
on undefined compiler behaviour.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-21 10:14:02 +00:00
John Baldwin
ff8c6681c8 Don't check the auth algorithm for GCM.
The upstream OpenSSL changes only set the cipher for GCM since the
authentication is redundant, and changes to OCF will soon remove the
GCM authentication algorithm constants entirely for the same reason.
In addition, ktls_create_session() already validates these fields and
wouldn't pass down an invalid auth_algorithm value to any drivers or
ktls backends.

Reviewed by:	hselasky
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D23671
2020-02-13 23:04:11 +00:00
Hans Petter Selasky
01651e9615 Add support for debugnet in mlx5en(4).
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-12 10:03:25 +00:00
Hans Petter Selasky
f14d849862 Add support for disabling and polling MSIX interrupts in mlx5core.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-02-12 09:58:19 +00:00
Hans Petter Selasky
e48813009c Widen EPOCH(9) usage in mlx5en(4).
Make completion event path mostly lockless using EPOCH(9).

Implement a mechanism using EPOCH(9) which allows us to make
the callback path for completion events mostly lockless.

Simplify draining callback events using epoch_wait().

While at it make sure all receive completion callbacks are
covered by the network EPOCH(9), because this is required
when calling if_input() and ether_input() after r357012.

Sponsored by:	Mellanox Technologies
2020-01-30 12:35:13 +00:00
Hans Petter Selasky
0a833dff03 Fix compilation issue with mlx5core and sparc64 (gcc48):
sys/dev/mlx5/mlx5_en/mlx5_en_tx.c:335: error: requested alignment is not a constant

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2019-12-06 16:20:22 +00:00
Hans Petter Selasky
7272f9cd77 Implement hardware TLS via send tags for mlx5en(4), which is supported by
ConnectX-6 DX.

Currently TLS v1.2 and v1.3 with AES 128/256 crypto over TCP/IP (v4
and v6) is supported.

A per PCI device UMA zone is used to manage the memory of the send
tags.  To optimize performance some crypto contexts may be cached by
the UMA zone, until the UMA zone finishes the memory of the given send
tag.

An asynchronous task is used manage setup of the send tags towards the
firmware. Most importantly setting the AES 128/256 bit pre-shared keys
for the crypto context.

Updating the state of the AES crypto engine and encrypting data, is
all done in the fast path. Each send tag tracks the TCP sequence
number in order to detect non-contiguous blocks of data, which may
require a dump of prior unencrypted data, to restore the crypto state
prior to wire transmission.

Statistics counters have been added to count the amount of TLS data
transmitted in total, and the amount of TLS data which has been dumped
prior to transmission. When non-contiguous TCP sequence numbers are
detected, the software needs to dump the beginning of the current TLS
record up until the point of retransmission. All TLS counters utilize
the counter(9) API.

In order to enable hardware TLS offload the following sysctls must be set:
kern.ipc.mb_use_ext_pgs=1
kern.ipc.tls.ifnet.permitted=1
kern.ipc.tls.enable=1

Sponsored by:	Mellanox Technologies
2019-12-06 15:36:32 +00:00
Konstantin Belousov
0cf6ff0a77 mlx5: Do not poke hardware for statistic after teardown is started.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-12-05 15:21:13 +00:00
Hans Petter Selasky
04f1690bf0 Add basic support for TCP/IP based hardware TLS offload to mlx5core.
The hardware offload is primarily targeted for TLS v1.2 and v1.3,
using AES 128/256 bit pre-shared keys. This patch adds all the needed
hardware structures, capabilites and firmware commands.

Sponsored by:	Mellanox Technologies
2019-12-05 15:16:19 +00:00
Konstantin Belousov
c47b170b02 mlx5: Do not try to enable fwdumps if scan space did not responded.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-12-02 14:22:55 +00:00
Konstantin Belousov
9a4af029b0 mlx5: Downgrade assert about misbehaving hardware to error message.
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2019-12-02 14:21:40 +00:00
Gleb Smirnoff
20b26072b8 Convert to if_foreach_llmaddr() KPI.
Reviewed by:	hselasky
2019-10-14 20:23:16 +00:00
Hans Petter Selasky
08650d170c Fix regression issue after r352989:
As noted by the commit message, callouts are now persistant
and should not be in the auto-zero section of the RQ's and SQ's.
This fixes an assert when using the TX completion event
factor feature with mlx5en(4).

Found by:	gallatin@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-08 19:49:25 +00:00
Hans Petter Selasky
bb2d815739 Fix build failure for gcc after r352983, due to
not using static variable declared by net/sff8472.h .

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 12:45:39 +00:00
Hans Petter Selasky
67a254a074 Fix build failure for 32-bit platforms after r352991, due to
incorrect printf() formatter string.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 12:02:14 +00:00
Hans Petter Selasky
d3d47408fc Bump driver version for mlx5core, mlx5en(4) and mlx5ib(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:15:35 +00:00
Hans Petter Selasky
fedc7bd218 Print numeric error_type and module_status in mlx5core
in case the strings are not available.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:06:01 +00:00
Hans Petter Selasky
02fd8723c2 Add print to show user a reason for rejecting buffer size change in mlx5en(4).
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:05:05 +00:00
Hans Petter Selasky
e525a7f0eb Only update lossy buffers config when manual PFC configuration was done
in mlx5en(4).

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:02:54 +00:00
Hans Petter Selasky
63dd1be931 Improve mlx5_fwdump_prep logging in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 11:01:05 +00:00
Hans Petter Selasky
7bf46a63ce Randomize the delay when waiting for VSC flag in mlx5core.
The PRM suggests random 0 - 10ms to prevent multiple waiters on the same
interval in order to avoid starvation.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:59:44 +00:00
Hans Petter Selasky
59efbf791d Wait for FW readiness before initializing command interface in mlx5core.
Before attempting to initialize the command interface we must wait till
the fw_initializing bit is clear.

If we fail to meet this condition the hardware will drop our
configuration, specifically the descriptors page address.  This scenario
can happen when the firmware is still executing an FLR flow and did not
finish yet so the driver needs to wait for that to finish.

Linux commits:
6c780a0267b8
b8a92577f4be.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:53:28 +00:00
Hans Petter Selasky
c84e0068ce Fix regression issue about bad refcounting of unlimited send tags
in mlx5en(4) after r348254.

The unlimited send tags are shared amount multiple connections and are
not allocated per send tag allocation request. Only increment the refcount.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:46:57 +00:00
Hans Petter Selasky
eeb1ff9800 Seal transmit path with regards to using destroyed mutex in mlx5en(4).
It may happen during link down that the running state may be observed
non-zero in the transmit routine, right before the running state is
cleared. This may end up using a destroyed mutex.

Make all channel mutexes and callouts persistant.

Preserve receive and send queue statistics during link toggle.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:43:49 +00:00
Hans Petter Selasky
b4672400e6 Remove unused cpu field from channel structure in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:26:26 +00:00
Hans Petter Selasky
a2d65bfd8f Remove mkey_be from channel structure in mlx5en(4).
Use value from priv structure instead.
This saves some space in the channel structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:25:47 +00:00
Hans Petter Selasky
3e40712eb0 Return an error from ioctl(MLX5_FW_RESET) if reset was rejected in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:24:13 +00:00
Hans Petter Selasky
96425f44c9 Add sysctl(8) to get and set forward error correction, FEC, configuration
in mlx5en(4).

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:22:15 +00:00
Hans Petter Selasky
048ddb58bc Move EEPROM information query from a sysctl in mlx5en(4) to an ioctl
in mlx5core. The EEPROM information is not only a property of the
mlx5en(4) driver.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:14:55 +00:00
Hans Petter Selasky
6deb0b1e94 Add support for buffer parameter manipulations in mlx5en(4).
The following sysctls are added:
dev.mce.N.conf.qos.cable_length
dev.mce.N.conf.qos.buffers_size
dev.mce.N.conf.qos.buffers_prio

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:08:04 +00:00
Hans Petter Selasky
c28ef24918 Import Linux code to query/set buffer state in mlx5en(4).
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 10:05:34 +00:00
Hans Petter Selasky
d585ff62c4 Add mlx5e_dbg() compatibility macro.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:59:42 +00:00
Hans Petter Selasky
006ae571da Update definitons for PPTB and PBMC registers layouts in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:58:00 +00:00
Hans Petter Selasky
207ff00e26 Add definition for the Port Buffer Status Register in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:57:12 +00:00
Hans Petter Selasky
8ae1c36f8b Sort the ports registers definitions numerically in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:56:27 +00:00
Hans Petter Selasky
6b4040d8ff Unify prints in mlx5en(4).
All prints in mlx5en(4) should use on of the macros:
mlx5_en_err/dbg/warn

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:49:44 +00:00
Hans Petter Selasky
a2f4f59ca8 Unify prints in mlx5core.
All prints in mlx5core should use on of the macros:
mlx5_core_err/dbg/warn

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:48:01 +00:00
Hans Petter Selasky
c9bb26aef1 Add proper print in case of 0x0 health syndrome in mlx5core.
In case of health counter fails to increment it indicates a bad device health.
In case when the syndrome indicated by firmware is 0x0, this indicates that
firmware is unable to respond to initialization segment reads.
Add proper print in this case.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:46:14 +00:00
Hans Petter Selasky
95c05e056f Add missing blank line at the end of the print in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:45:07 +00:00
Hans Petter Selasky
4bc8507b82 Remove no longer needed fwdump register tables from mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:43:48 +00:00
Hans Petter Selasky
4745819044 Read rege map from crdump scan space in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:40:23 +00:00
Hans Petter Selasky
f29c160ef3 Define MLX5_VSC_DOMAIN_SCAN_CRSPACE.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:34:34 +00:00
Hans Petter Selasky
4d98df72cd Use the MLX5_VSC_DOMAIN_SEMAPHORES constant instead of hand-rolled symbol
in mlx5core.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:33:38 +00:00
Hans Petter Selasky
04910901bc Move mlx5_ifc_vsc_space_bits and mlx5_ifc_vsc_addr_bits to mlx5_ifc.h.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:32:41 +00:00
Hans Petter Selasky
e456decc55 Make the mlx5_vsc_wait_on_flag(9) function global.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:31:36 +00:00
Hans Petter Selasky
111b57c359 Add port module event software counters in mlx5core.
While at it, fixup PME based on latest PRM defines.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:29:55 +00:00
Hans Petter Selasky
55221653c0 Correct and update some counter names in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:27:56 +00:00
Hans Petter Selasky
2110251a39 Export channel IRQ number as part of the "hw_ctx_debug" sysctl(8) in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:27:08 +00:00
Hans Petter Selasky
6226306b47 Cleanup naming of IRQ vectors in mlx5en.
Remove unused IRQ naming functions and arrays.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:23:33 +00:00
Hans Petter Selasky
66b38bfe3d Add support for Multi-Physical Function Switch, MPFS, in mlx5en.
MPFS is a logical switch in the Mellanox device which forward packets
based on a hardware driven L2 address table, to one or more physical-
or virtual- functions. The physical- or virtual- function is required
to tell the MPFS by using the MPFS firmware commands, which unicast
MAC addresses it is requesting from the physical port's traffic.
Broadcast and multicast traffic however, is copied to all listening
physical- and virtual- functions and does not need a rule in the MPFS
switching table.

Linux commit:	eeb66cdb682678bfd1f02a4547e3649b38ffea7e
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:22:22 +00:00
Hans Petter Selasky
2db3dd5061 Implement macro for asserting priv lock in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:16:17 +00:00
Hans Petter Selasky
2e16440940 Fix for missing cleanup code in error case in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:15:07 +00:00
Hans Petter Selasky
53784e3632 Check return value of mlx5_vector2eqn() function in mlx5en.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:14:01 +00:00
Hans Petter Selasky
8e773e55f4 Make sure the number of IRQ vectors doesn't exceed 256 in mlx5core.
The "intr" field in "struct mlx5_ifc_eqc_bits" is only 8 bits wide.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:12:53 +00:00
Hans Petter Selasky
b632492966 Update warning and error print formats in mlx5ib.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:11:01 +00:00
Hans Petter Selasky
c788dcead1 Fix reported max SGE calculation in mlx5ib.
Add the 512 bytes limit of RDMA READ and the size of remote address to the max
SGE calculation.

Submitted by:	slavash@
Linux commit:	288c01b746aa
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-10-02 09:09:28 +00:00
Mark Johnston
ef34be8beb Fix warnings about unused identifiers when compiling without RATELIMIT. 2019-08-02 15:19:11 +00:00
Randall Stewart
20abea6663 This adds the third step in getting BBR into the tree. BBR and
an updated rack depend on having access to the new
ratelimit api in this commit.

Sponsored by:	Netflix Inc.
Differential Revision:	https://reviews.freebsd.org/D20953
2019-08-01 14:17:31 +00:00
John Baldwin
e37240f9f3 Add support for IFCAP_NOMAP to mlx5(4).
Since mlx5 uses bus_dma, this only required adding the capability
flag.

Submitted by:	gallatin
Reviewed by:	gallatin, hselasky, rrs
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20616
2019-06-29 00:53:07 +00:00
Hans Petter Selasky
76a3555808 Make sure the DMA tags get freed in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-06-04 08:06:51 +00:00
John Baldwin
fb3bc59600 Restructure mbuf send tags to provide stronger guarantees.
- Perform ifp mismatch checks (to determine if a send tag is allocated
  for a different ifp than the one the packet is being output on), in
  ip_output() and ip6_output().  This avoids sending packets with send
  tags to ifnet drivers that don't support send tags.

  Since we are now checking for ifp mismatches before invoking
  if_output, we can now try to allocate a new tag before invoking
  if_output sending the original packet on the new tag if allocation
  succeeds.

  To avoid code duplication for the fragment and unfragmented cases,
  add ip_output_send() and ip6_output_send() as wrappers around
  if_output and nd6_output_ifp, respectively.  All of the logic for
  setting send tags and dealing with send tag-related errors is done
  in these wrapper functions.

  For pseudo interfaces that wrap other network interfaces (vlan and
  lagg), wrapper send tags are now allocated so that ip*_output see
  the wrapper ifp as the ifp in the send tag.  The if_transmit
  routines rewrite the send tags after performing an ifp mismatch
  check.  If an ifp mismatch is detected, the transmit routines fail
  with EAGAIN.

- To provide clearer life cycle management of send tags, especially
  in the presence of vlan and lagg wrapper tags, add a reference count
  to send tags managed via m_snd_tag_ref() and m_snd_tag_rele().
  Provide a helper function (m_snd_tag_init()) for use by drivers
  supporting send tags.  m_snd_tag_init() takes care of the if_ref
  on the ifp meaning that code alloating send tags via if_snd_tag_alloc
  no longer has to manage that manually.  Similarly, m_snd_tag_rele
  drops the refcount on the ifp after invoking if_snd_tag_free when
  the last reference to a send tag is dropped.

  This also closes use after free races if there are pending packets in
  driver tx rings after the socket is closed (e.g. from tcpdrop).

  In order for m_free to work reliably, add a new CSUM_SND_TAG flag in
  csum_flags to indicate 'snd_tag' is set (rather than 'rcvif').
  Drivers now also check this flag instead of checking snd_tag against
  NULL.  This avoids false positive matches when a forwarded packet
  has a non-NULL rcvif that was treated as a send tag.

- cxgbe was relying on snd_tag_free being called when the inp was
  detached so that it could kick the firmware to flush any pending
  work on the flow.  This is because the driver doesn't require ACK
  messages from the firmware for every request, but instead does a
  kind of manual interrupt coalescing by only setting a flag to
  request a completion on a subset of requests.  If all of the
  in-flight requests don't have the flag when the tag is detached from
  the inp, the flow might never return the credits.  The current
  snd_tag_free command issues a flush command to force the credits to
  return.  However, the credit return is what also frees the mbufs,
  and since those mbufs now hold references on the tag, this meant
  that snd_tag_free would never be called.

  To fix, explicitly drop the mbuf's reference on the snd tag when the
  mbuf is queued in the firmware work queue.  This means that once the
  inp's reference on the tag goes away and all in-flight mbufs have
  been queued to the firmware, tag's refcount will drop to zero and
  snd_tag_free will kick in and send the flush request.  Note that we
  need to avoid doing this in the middle of ethofld_tx(), so the
  driver grabs a temporary reference on the tag around that loop to
  defer the free to the end of the function in case it sends the last
  mbuf to the queue after the inp has dropped its reference on the
  tag.

- mlx5 preallocates send tags and was using the ifp pointer even when
  the send tag wasn't in use.  Explicitly use the ifp from other data
  structures instead.

- Sprinkle some assertions in various places to assert that received
  packets don't have a send tag, and that other places that overwrite
  rcvif (e.g. 802.11 transmit) don't clobber a send tag pointer.

Reviewed by:	gallatin, hselasky, rgrimes, ae
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20117
2019-05-24 22:30:40 +00:00
Conrad Meyer
e12be3218a Include eventhandler.h in more compilation units
This was enumerated with exhaustive search for sys/eventhandler.h includes,
cross-referenced against EVENTHANDLER_* usage with the comm(1) utility.  Manual
checking was performed to avoid redundant includes in some drivers where a
common os_bsd.h (for example) included sys/eventhandler.h indirectly, but it is
possible some of these are redundant with driver-specific headers in ways I
didn't notice.

(These CUs did not show up as missing eventhandler.h in tinderbox.)

X-MFC-With:	r347984
2019-05-21 01:18:43 +00:00
Hans Petter Selasky
dfea1c3e32 Fix LINT compilation issue.
"mdev" is unused when building LINT targets.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 12:27:16 +00:00
Hans Petter Selasky
cf59f7e108 Bump the Mellanox driver version numbers and the FreeBSD version number.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:15:07 +00:00
Hans Petter Selasky
8d1eeedb5d Make command workqueue persistant in mlx5core.
There is no reason to re-create the command workqueue during healthcare.
This also fixes an issue where a previous work struct may refer to a
destroyed workqueue.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:09:08 +00:00
Hans Petter Selasky
cf551f955d Fix race between driver unload and dumping firmware in mlx5core.
Present code uses lock-less accesses to the dump data to prevent top
level ioctls from blocking bottom-level call to dump.  Unfortunately, this
depends on the type stability of the dump data structure, which makes it
non-functional during driver teardown.

Switch to the mutex locking scheme where top levels use the mutex in the
bound regions, while copyouts and drain for completion utilize condvars.
The mutex lifetime is guaranteed to be strictly larger than the time
interval where driver can initiate dump, and most of the control fields
of the old struct mlx5_dump_data are directly embedded into struct
mlx5_core_dev.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:08:48 +00:00
Hans Petter Selasky
39c6d43ee5 Ensure the flowtable rules are not freed twice in mlx5en(4).
This can happen when re-loading the driver.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:08:21 +00:00
Hans Petter Selasky
f5233a73d8 Undo previous steps upon returning failure in mlx5en(4).
Else flowtable resources may not be properly freed.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:08:01 +00:00
Hans Petter Selasky
47d93c5c43 Make sure the flow destination structure does not use values off the stack
in mlx5en(4).

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:07:42 +00:00
Hans Petter Selasky
a0a4fd7734 Flush command workqueue when command completion is triggered in mlx5core.
Avoid race for command completion when triggering a command completions event.
Serialize operation by queueing all commands on the same work queue.
This can happen when healthcare triggers.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:07:20 +00:00
Hans Petter Selasky
4f22751048 Make command timeout way shorter in mlx5core.
The command timeout is terribly long, whole two hours. Make it 60s so if
things do go wrong, the user gets feedback in relatively short time, so
they can take corrective actions and/or investigate using tools and such.

Linux commit:
6b6c07bdcdc97ccac2596063bfc32a5faddfe884

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:07:00 +00:00
Hans Petter Selasky
4d0e6d8452 Remove non-functional MLX5E_MAX_RX_SEGS macro in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:06:42 +00:00
Hans Petter Selasky
8b825a1857 Fix for compilation warning in mlx5en(4).
Function 'mlx5e_alloc_rx_wqe' can never be inlined because it uses alloca
(override using the always_inline attribute)

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:06:22 +00:00
Hans Petter Selasky
bd802cea53 Rename functions from mlx5_fwdump to mlx5_ctl in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:05:59 +00:00
Hans Petter Selasky
998c9a2bbc Implement firmware reset from userspace in mlx5tool(8).
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:05:09 +00:00
Hans Petter Selasky
939c79a213 Add Firmware Reset Level, MFRL, register accessors in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:04:40 +00:00
Hans Petter Selasky
c62f4d8de8 Expose per-lane counters before correction mechanism in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:03:29 +00:00
Hans Petter Selasky
5f9484f3f1 Add support for extended PCIe counters in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 11:02:36 +00:00
Hans Petter Selasky
67fd194170 Extend the counters framework in mlx5en(4).
Allow more macro arguments and split the variable type and name into
separate arguments. This allows simple and powerful copy and extraction
of values from IFC based structures into SYSCTLs with the use of a single
macro.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:59:16 +00:00
Hans Petter Selasky
c71a71bafc Update performance counter bits in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:58:41 +00:00
Hans Petter Selasky
adb6fd50c8 Implement reading PCI power status in mlx5core.
Implement a watchdog as part of the healtcare subsystem which
reads the PCI power status during startup and upon the PCI
power status change event and store it into the core device
structure. This value is then exported to user-space via a
read-only SYSCTL. A dmesg print has been added to inform
the admin about the PCI power status.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:58:06 +00:00
Hans Petter Selasky
40218d734d Move workqueue from mlx5en(4) to mlx5core.
This avoids creating more workqueues in mlx5core to do
simple firmware command polling tasks.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:57:37 +00:00
Hans Petter Selasky
ebf7e777cd Always return success for RoCE modify port in mlx5ib.
CM layer calls ib_modify_port() regardless of the link layer.

For the Ethernet ports, qkey violation and Port capabilities
are meaningless. Therefore, always return success for ib_modify_port
calls on the Ethernet ports.

Linux Commit:
ec2558796d25e6024071b6bcb8e11392538d57bf

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:57:16 +00:00
Hans Petter Selasky
86a3977962 Add support for new rates to mlx5ib.
Submitted by:	slavash@
MFC after:      3 days
Sponsored by:   Mellanox Technologies
2019-05-08 10:56:51 +00:00
Hans Petter Selasky
52786315fc Do not add IFM_10G_LR and IFM_40G_ER4 to supported media types by default in
mlx5en(4).

IFM_10G_LR and IFM_40G_ER4 media should be added only if the device
has the needed capability bit set for it.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:55:15 +00:00
Hans Petter Selasky
ac87880ac1 Add support for 200Gb ethernet speeds to mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:54:54 +00:00
Hans Petter Selasky
6d1dc6524e Remove unused speed enums in mlx5core.
Submitted by:	slavash@
MFC after:      3 days
Sponsored by:   Mellanox Technologies
2019-05-08 10:54:24 +00:00
Hans Petter Selasky
1bda99bff5 Control automatic update of firmware on driver load with a tunable in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:54:05 +00:00
Hans Petter Selasky
945f398487 Correct check for the calibration generation in mlx5en(4).
If generation is cleared due to hardware clock failure, check for it before
the divisor is used.  Actually clear generation when failure occurs.

While there, stop doing the calculations inside the generation loop.  Since
all members of mlx5e_clbr_point are used for calculations, get the
local copy of the structure and use it after generation stabilized.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:53:47 +00:00
Hans Petter Selasky
752b8aabfa Let rx_out_of_buffer be a 32-bit counter in mlx5en(4).
This fixes counting issues when the firmware resets the counter during
allocation of the counter set where the counter belongs.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:53:25 +00:00
Hans Petter Selasky
c29b90ba68 Add vnic steering drop statistics in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:53:01 +00:00
Hans Petter Selasky
94175bc92d Use software counters for rx_packets and rx_bytes in mlx5en(4).
The physical- and virtual- port counters might not reflect the amount
of data received after address filtering. Use the software counters
instead for rx_packets and rx_bytes to know exactly how much data
was received.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:52:32 +00:00
Hans Petter Selasky
be2b4a690a Add mlx5_firmware_update() in mlx5core.
Add support for upgrading firmware on mlx5 module load.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:52:11 +00:00
Hans Petter Selasky
03ee9ea9bf Fix for double bus master disable in mlx5core.
mlx5_pci_disable_device is calling pci_disable_device which disables
bus master. No need to explicitly call pci_clear_master.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:51:29 +00:00
Hans Petter Selasky
ea78f07b5e Implement userspace firmware update for ConnectX-4/5/6.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:50:35 +00:00
Hans Petter Selasky
b255ca093a Rename mlx5_fwdump_addr to more neutral mlx5_tool_addr in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:50:08 +00:00
Hans Petter Selasky
20a63539d6 Add mlxfw callbacks in mlx5core.
Add mlx5 implementation for the ones defined by the mlxfw
shared module to be used while flashing the device firmware.

The callbacks do their job through the MCQI, MCC and MCDA registers.

Linux commit:
62bd22cf326dc4ac5be673c11cef4602dc1f5e47

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:49:36 +00:00
Hans Petter Selasky
8d593abae4 Convert remaining module parameters into SYSCTLs in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:44:53 +00:00
Hans Petter Selasky
a5754bd222 Remove redundant line of code in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:44:27 +00:00
Hans Petter Selasky
07e68e62a0 Change implicit and probably erronous EPERM to EIO on command status error
in mlx5core.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:44:02 +00:00
Hans Petter Selasky
0b9cdc8e53 Fix netstat counters mapping in mlx5en(4).
The current mapping of driver counters to netstat counters is wrong.
For example, a single jabber packet, will cause the Ierrs counter to
count three times.

The work for mapping the hardware and software counters to their right
place in netstat counters were already done in Linux, take that as is
to the FreeBSD driver.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:42:33 +00:00
Hans Petter Selasky
8b1b42c150 Avoid leaking send queue mbufs during error recovery in mlx5en(4).
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:44 +00:00
Hans Petter Selasky
031b1981d2 Add helper functions to set/query MCC/MCDA/MCQI registers in mlx5core.
To be used by the mlx5 callbacks exposed to the mlxfw module.

Linux commit:
d2ad488b0073bd1a2c3f5d2ea50a7eb632103e5d

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:21 +00:00
Hans Petter Selasky
9e3c099977 Enhance MCAM reg to allow query on access reg support in mlx5core.
Enhance MCAM to allow the driver to query which access regs are
supported. For now, expose the regs needed for FW flashing.

Linux commit:
0ab87743cc8c5bcd482daf71961ed5fc45349e01

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:41:00 +00:00
Hans Petter Selasky
d5d52dd748 Add MCC (Management Component Control) register definitions in mlx5core.
MCC (Management Component Control) allows to control a firmware
component update.

MCDA (Management Component Data Access) allows to read and write
a firmware component.

MCQI (Management Component Query Information) allows to query
information about firmware components.

Linux commit:
4717628938423fcba0aa8fa889e9fed4eb6a655f

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:40:41 +00:00
Hans Petter Selasky
58e63779f2 Add reading the mcam_reg in mlx5core.
Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:40:13 +00:00
Hans Petter Selasky
5a8145f6f2 Query and cache PCAM, MCAM registers on initialization in mlx5core.
On load_one, we now cache our capabilities registers internally, similar
to QUERY_HCA_CAP. Capabilities can later be queried using macros
introduced in this patch.

Linux commit:
71862561f3a62015a11de16d1c306481e8415c08

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:53 +00:00
Hans Petter Selasky
0358212d37 Implement PCAM, MCAM access register commands in mlx5core.
Introduced registers will expose capabilities of new registers and
features related to port/management.
Driver will query MCAM and PCAM in order to avoid failing on old
firmwares with lack of support.

Linux commit:
c835ad64683bd3e2d1b31ed2cb1ff4366932edb1

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:25 +00:00
Hans Petter Selasky
ae73b04113 Expose PCAM, MCAM registers infrastructure in mlx5core.
PCAM: Ports capabilities mask register.
MCAM: Management capabilities mask register.

PCAM and MCAM registers will provide information regarding firmware
support for different features, in order to avoid cases where new driver
combined with old firmware results in syndromes (for ex. PCIe counters
before this patchset).

Linux commit:
cfdcbceaeffc669b70d904d80a2df9c86c232566

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:39:01 +00:00
Hans Petter Selasky
fba52edb89 Add sysctl(8) to control fast unload support in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:38:31 +00:00
Hans Petter Selasky
a4130a344c Add Fast teardown support to mlx5core.
Today mlx5 devices support two teardown modes:
1- Regular teardown
2- Force teardown

This change introduces the enhanced version of the "Force teardown" that
allows SW to perform teardown in a faster way without the need to reclaim
all the pages.

Fast teardown provides the following advantages:
1- Fix a FW race condition that could cause command timeout
2- Avoid moving to polling mode
3- Close the vport to prevent PCI ACK to be sent without been
   scattered to memory

Linux commit:
fcd29ad17c6ff885dfae58f557e9323941e63ba2

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:38:06 +00:00
Hans Petter Selasky
2647c1e4c5 Make sure the running variable is properly set for ratelimited SQs in mlx5en(4).
Else the SQs won't be properly released when closing rate-limited connections
leading to wrong state transitions on the SQ.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:37:31 +00:00
Hans Petter Selasky
ba11bcecd8 Implement get and set nic state as global functions in mlx5core.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:37:03 +00:00
Hans Petter Selasky
c2a1e80706 Ticks are integer type in FreeBSD.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:36:32 +00:00
Hans Petter Selasky
a005c157e4 Configure firmware to use RX hash format in mini CQE in mlx5en(4).
When using CQE zipping, one can choose between RX hash and Checksum.
This will indicate the parameter on which a zipping session should be
stopped.

While porting the Linux code, Checksum was chosen. However, the value
of Checksum is not being used anywhere.
For the FreeBSD driver, we prefer to use the RX hash format which will
guarantee the RX hash value for all the mini CQEs.
While at it, make sure to initialize the Checksum value in the
decompressed CQE.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:55 +00:00
Hans Petter Selasky
d52ffcb71c Disable CQE zipping by default in mlx5en(4).
After doing performance measurements, it seems like CQE zipping doesn't
have any significant benefit.
Moreover, we know that this feature is disabled by default on other
operating systems (Linux for example).

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:35 +00:00
Hans Petter Selasky
f0dcb8dff5 Split mlx5e_update_stats_work() in mlx5en(4).
Split the function into the mlx5e_update_stats_locked() core and make
mlx5e_update_stats_work() call the _locked helper, similar to many other
places in the kernel. This improves the code structure, making the
locking clean.

Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:35:14 +00:00
Hans Petter Selasky
91f13f8368 Implement fast close of RX channel in mlx5en(4).
Instead of waiting for all jobs to be cancelled, simply close the completion
queue to prevent more completion events and let mlx5e_destroy_rq() cleanup
the remaining mbufs.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:42 +00:00
Hans Petter Selasky
243853215d Correct number of elements for priority to traffic class mappings in mlx5en(4).
The number of priorities is always 8, while the number of traffic classes
supported can vary. While at it convert the sysctl node into an array.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:34:14 +00:00
Hans Petter Selasky
ffadb62f20 Remove unused module parameter in mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:29 +00:00
Hans Petter Selasky
6428c27faf Make sure to error out when arming the CQ fails in mlx4ib and mlx5ib.
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:33:09 +00:00
Hans Petter Selasky
069963d772 Destroy port stats debug context in correct order in mlx5en(4).
Destroy children nodes before parent nodes.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:22 +00:00
Hans Petter Selasky
c66537d7b2 Fix tx_jumbo_packets counter in mlx5en(4).
Instead of reading Ethernet RFC 2819 pXtoYoctets counters from
hardware which counts RX octets, count tx_stat_pXtoYoctets from
Ethernet extended counters which counts TX octets.

TX jumbo counters should be accumulated only after the PPCNT
counters were fetched from hardware with their latest value.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:32:03 +00:00
Hans Petter Selasky
bcfad02593 Update Ethernet extended counters in mlx5en(4).
Expose all Ethernet extended counters those counters via debug_stats
sysctl:
dev.mce.X.debug_stats

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:31:32 +00:00
Hans Petter Selasky
5169fb81ca Protect from infinite sw-reset loop in mlx5core.
Avoid an infinite software firmware reset loop that may be caused by a
hardware bug by limiting the maximum number of resets.
The counter between resets is reset by request for reset, and not by a
successful reset.
The interval between two resets can be configured via sysctl:
hw.mlx5.sw_reset_timeout
which is global to all mlx5 devices in the system.

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:47 +00:00
Hans Petter Selasky
192fc18d49 Disable all MSIX interrupts before shutdown in mlx5.
Make sure the interrupt handlers don't race with the fast unload one
code in the shutdown handler.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:30:18 +00:00
Hans Petter Selasky
a3a31fde6d Import Linux code to implement mlx5_ib_disassociate_ucontext() in mlx5ib.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:29:45 +00:00
Hans Petter Selasky
983026ea83 Add temperature warning event to log in mlx5core.
Temperature warning event is sent by FW to indicate high temperature
as detected by one of the sensors on the board.
Add handling of this event by writing the numbers of the alert sensors
to the kernel log.

Linux commit:
1865ea9adbfaf341c5cd5d8f7d384f19948b2fe9

Submitted by:	slavash@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:28:18 +00:00
Hans Petter Selasky
7646dc2347 Correctly define the interface state bits in mlx5en(4).
While at it remove unused interface state bits. This also fixes and issue
during shutdown:

There is an issue where the firmware fails during mlx5_load_one,
the health_care timer detects the issue and schedules a health_care call.
Then the mlx5_load_one detects the issue, cleans up and quits. Then
the health_care starts and calls mlx5_unload_one to clean up the resources
that no longer exist and causes kernel panic.

The root cause is that the bit MLX5_INTERFACE_STATE_DOWN is not set
after mlx5_load_one fails. The solution is removing the bit
MLX5_INTERFACE_STATE_DOWN and quit mlx5_unload_one if the
bit MLX5_INTERFACE_STATE_UP is not set. The bit MLX5_INTERFACE_STATE_DOWN
is redundant and we can use MLX5_INTERFACE_STATE_UP instead.

Linux commit:
10a8d00707082955b177164d4b4e758ffcbd4017
b3cb5388499c5e219324bfe7da2e46cbad82bfcf

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:27:29 +00:00
Hans Petter Selasky
e5eae1dc7d Enable FPGA and FPGA QP errors for EQ and call the handler in mlx5core.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:26:33 +00:00
Hans Petter Selasky
c322dbafd5 Add MLX5_FPGA_RELOAD IOCTL(2) to mlx5fpga.
Submitted by:	kib@
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:25:14 +00:00
Hans Petter Selasky
423530be04 Add support for Dynamic Interrupt Moderation, DIM, in mlx5en(4).
Add support for DIM based on Linux,
with some minor adaptions specific to FreeBSD.

Linux commit
f97c3dc3c0e8d23a5c4357d182afeef4c67f5c33

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2019-05-08 10:23:33 +00:00
Andrew Gallatin
50575ce11c Track TCP connection's NUMA domain in the inpcb
Drivers can now pass up numa domain information via the
mbuf numa domain field.  This information is then used
by TCP syncache_socket() to associate that information
with the inpcb. The domain information is then fed back
into transmitted mbufs in ip{6}_output(). This mechanism
is nearly identical to what is done to track RSS hash values
in the inp_flowid.

Follow on changes will use this information for lacp egress
port selection, binding TCP pacers to the appropriate NUMA
domain, etc.

Reviewed by:	markj, kib, slavash, bz, scottl, jtl, tuexen
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20028
2019-04-25 15:37:28 +00:00
Andrew Gallatin
7687707dd4 Track device's NUMA domain in ifnet & alloc ifnet from NUMA local memory
This commit adds new if_alloc_domain() and if_alloc_dev() methods to
allocate ifnets.  When called with a domain on a NUMA machine,
ifalloc_domain() will record the NUMA domain in the ifnet, and it will
allocate the ifnet struct from memory which is local to that NUMA
node.  Similarly, if_alloc_dev() is a wrapper for if_alloc_domain
which uses a driver supplied device_t to call ifalloc_domain() with
the appropriate domain.

Note that the new if_numa_domain field fits in an alignment pad in
struct ifnet, and so does not alter the size of the structure.

Reviewed by:	glebius, kib, markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19930
2019-04-22 19:24:21 +00:00
Andrew Gallatin
538ff57b75 mlx5en: Enable new pfil(9) KPI ethernet filtering hooks
This allows efficient filtering at packet ingress on mlx5en.

Note that the packets are filtered (and potentially dropped) *before*
the driver has committed to (re)allocating an mbuf for the
packet. Dropped packets are treated essentially the same as an
error. Nothing is allocated, and the existing buffer is recycled. This
allows us to drop malicious packets at close to line rate with very
little CPU use.

Reviewed by:	hselasky, slavash, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19063
2019-04-15 17:14:50 +00:00
Konstantin Belousov
a748d99a17 Fix build with option RSS, removing unused variables.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 21:52:40 +00:00
Konstantin Belousov
e206dc6479 Appease gcc build, remove duplicated declaration.
Reported by:	np
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-12-06 19:20:00 +00:00
Sean Bruno
5fc3b4acab Change u32 to uint32_t to allow the native-xtools target to build
libsysdecode.

Submitted by:	kib
2018-12-06 18:59:33 +00:00
Slava Shwartsman
0f3b263d83 mlx4/mlx5: Updated driver version to 3.5.0
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:34 +00:00
Slava Shwartsman
cc971b2261 mlx5en: Implement backpressure indication.
The backpressure indication is implemented using an unlimited rate type of
mbuf send tag. When the upper layers typically the socket layer has obtained such
a tag, it can then query the destination driver queue for the current
amount of space available in the send queue.

A single mbuf send tag may be referenced multiple times and a refcount has been added
to the mlx5e_priv structure to track its usage. Because the send tag resides
in the mlx5e_channel structure, there is no need to wait for refcounts to reach
zero until the mlx4en(4) driver is detached. The channels structure is persistant
during the lifetime of the mlx5en(4) driver it belongs to and can so be accessed
without any need of synchronization.

The mlx5e_snd_tag structure was extended to contain a type field, because there are now
two different tag types which end up in the driver which need to be distinguished.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:25:03 +00:00
Slava Shwartsman
71defeda26 mlx5en: Improve configuration of HW LRO.
In order to enable HW LRO, both the "hw_lro" sysctl in the mlx5en(4) config
space must be set, and the ifconfig(8) LRO capability must be set. Any other
settings will disable HW LRO.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:33 +00:00
Slava Shwartsman
01f02abfec mlx5en: Count all transmitted and received bytes.
Add counter for all transmitted and received bytes. Currently only all
transmitted and received packets were counted. Fix description of RX LRO
counters while at it.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:24:02 +00:00
Slava Shwartsman
3230c29d72 mlx5en: Statically allocate and free the channel structure(s).
By allocating the worst case size channel structure array
at attach time we can eliminate various NULL checks in the
fast path. And also reduce the chance for use-after-free
issues in the transmit fast path.

This change is also a requirement for implementing
backpressure support.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:31 +00:00
Slava Shwartsman
b3cf149325 mlx5en: Fix race in mlx5e_ethtool_debug_stats().
Writing to the debug stats variable must be locked,
else serialization will be lost which might cause
various kernel panics due to creating and destroying
sysctls out of order.

Make sure the sysctl context is initialized after freeing
the sysctl nodes, else they can be freed twice.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:23:01 +00:00
Slava Shwartsman
42390bb884 mlx5en: Add support for IFM_10G_LR and IFM_40G_ER4 media types.
Inspect the ethernet compliance code to figure out actual cable type by reading
the PDDR module info register.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:22:30 +00:00
Slava Shwartsman
f292a94d6b mlx5en: Don't set rate on SQs when the SQ is already stopped.
This can happen when connections are short lived and leads to
a firmware error printout in dmesg, syndrome 0x51cfb0, because
the SQ is in the wrong state.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:59 +00:00
Slava Shwartsman
3e581cabf0 mlx5en: Fix for inlining issues in transmit path
1) Don't exceed the drivers own hardcoded TX inline limit.

The blueflame register size can be much greater than the hardcoded limit
for inlining. Make sure we don't exceed the drivers own limit, because this
also means that the maximum number of TX fragments becomes invalid and
then memory size assumptions in the TX path no longer hold up.

2) Make sure the mlx5_query_min_inline() function returns an error code.

3) Header inlining is required when using TSO.

4) Catch failure to compute inline header size for TSO.

5) Add support for UDP when computing inline header size.

6) Fix for inlining issues with regards to DSCP.

Make sure we inline 4 bytes beyond the ethernet and/or
VLAN header to workaround a hardware bug extracting
the DSCP field from the IPv4/v6 header.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:21:28 +00:00
Slava Shwartsman
d51ced5fae mlx5en: Remove the DRBR and associated logic in the transmit path.
The hardware queues are deep enough currently and using the DRBR and associated
callbacks only leads to more task switching in the TX path. The is also a race
setting the queue_state which can lead to hung TX rings.

Submitted by:   hselasky@
Approved by:    hselasky (mentor)
MFC after:      1 week
Sponsored by:   Mellanox Technologies
2018-12-05 14:20:57 +00:00