Commit Graph

579 Commits

Author SHA1 Message Date
Hans Petter Selasky
1943c40cd6 mlx5en(4): Don't wait for receive queue to fill up with mbufs during open channels.
Failure to get mbufs may be transient.
Don't permanently fail to open the channels due to lack of mbufs.
This also makes modifying channel parameters faster.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
6bd4bb9bdb mlx5en(4): Explain why CQE zipping is off.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
80b4ef6d10 mlx5: Remove unused debugfs node pointers.
No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
aa7bbdabde mlx5: Implement diagostic counters as sysctl(8) nodes.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
95bf70a4bf mlx5: Don't give zero number of pages to the firmware.
Can happen when using virtual mlx5_core<N> functions, VFs.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
273bfac08f mlx5: Implement mlx5_core_modify_cq_by_mask().
Implement one CQ modify function supporting all firmware versions,
instead of having more variants of CQ modify.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
2f7e9a8a21 mlx5: Fix duplicate free of default flow rule in error case.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
b0b87d9151 mlx5: Make mlx5_del_flow_rule() NULL safe.
This change factors out repeated NULL checks.

No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
3bb3e4768f mlx5: Make MLX5_COMP_EQ_SIZE tunable.
When using hardware pacing, this value can be increased, because more SQ's
means more EQ events aswell. Make it tunable, hw.mlx5.comp_eq_size .

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch.  A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Zhenlei Huang
da4068c4e1 mlx5ib(4): Mark driver knows net epoch
This driver has already been EPOCH(9) aware since e48813009c.

Reviewed by:	hselasky
Tested by:	hselasky
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D39406
2023-04-06 00:08:23 +08:00
Justin Hibbits
70510800d1 mlx5: Enter network epoch when using if_foreach()
Summary: IFNET_RLOCK() is not sufficient, the epoch needs entered.

Reviewed by:	hselasky
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38711
2023-03-06 11:04:15 -05:00
Justin Hibbits
5dc00f00b7 Mechanically convert mlx5en(4) to IfAPI
Reviewed by:	zlei
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38595
2023-02-15 09:32:41 -05:00
Gleb Smirnoff
caf32b260a pfil: add pfil_mem_{in,out}() and retire pfil_run_hooks()
The 0b70e3e78b changed the original design of a single entry point
into pfil(9) chains providing separate functions for the filtering
points that always provide mbufs and know the direction of a flow.
The motivation was to reduce branching.  The logical continuation
would be to do the same for the filtering points that always provide
a memory pointer and retire the single entry point.

o Hooks now provide two functions: one for mbufs and optional for
  memory pointers.
o pfil_hook_args() has a new member and pfil_add_hook() has a
  requirement to zero out uninitialized data. Bump PFIL_VERSION.
o As it was before, a hook function for a memory pointer may realloc
  into an mbuf.  Such mbuf would be returned via a pointer that must
  be provided in argument.
o The only hook that supports memory pointers is ipfw:default-link.
  It is rewritten to provide two functions.
o All remaining uses of pfil_run_hooks() are converted to
  pfil_mem_in().
o Transparent union of pfil_packet_t and tricks to fix pointer
  alignment are retired. Internal pfil_realloc() reduces down to
  m_devget() and thus is retired, too.

Reviewed by:		mjg, ocochard
Differential revision:	https://reviews.freebsd.org/D37977
2023-02-14 10:02:49 -08:00
Elliott Mitchell
1c5e687117 mlx5: purge EOL release compatibility
Remove FreeBSD 10 support code

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/603
Differential Revision: https://reviews.freebsd.org/D35560
2023-02-04 09:13:09 -07:00
Konstantin Belousov
01143ba118 ifcapnv: fix IFCAP2 usage
IFCAP2_XXX constants are integers, they do not need shift for the
definition.  But their usage as bitmask for if_capenable2 does require
shift.  Add convenience macro IFCAP2_BIT() for consumers.

Fix the only existing consumer, mlx5(4) RXTLS enable bits.

Reported by:	jhb
Reviewed by:	jhb, jhibbits, hselasky
Coverity CID:	1501659
Sponsored by:	NVIDIA networking
Differential revision:	https://reviews.freebsd.org/D37862
2023-01-03 11:48:16 +02:00
John Baldwin
47cc457bd6 mlx5: Fix mismatch in array bounds.
Reported by:	GCC -Warray-parameter
Reviewed by:	hselasky, imp, emaste (earlier version)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D37549
2022-12-07 12:32:54 -08:00
John Baldwin
f49fd63a6a kmem_malloc/free: Use void * instead of vm_offset_t for kernel pointers.
Reviewed by:	kib, markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D36549
2022-09-22 15:09:19 -07:00
Randall Stewart
7cc3ea9c6f mlx5 M_TSTMP accuracy looses quite a bit of precision so lets fix it.
The way that the clock is synchronized between the system and the current mlx5 for the purposes of the M_TSTMP
being carried we loose a lot of precision. Instead lets change the math that calculates this to separate out
the seconds/nanoseconds and operate on the two values so we don't get overflow instead of just
shifting the value down and loosing precision.

Reviewed by: kib, hselasky
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D36327
2022-09-20 13:12:16 -04:00
Gordon Bergling
e1a40dd294 mlx5en(4): Correct a typo in a kernel error message
- s/ouput/output

MFC after:	5 days
2022-09-03 19:29:33 +02:00
Gordon Bergling
a4181a3ec3 mlx5en(4): Fix a typo in a source code comment
- s/functino/function/

MFC after:	3 days
2022-07-31 10:28:20 +02:00
Dimitry Andric
3a0b243282 Fix unused variable warning in mlx5_ib_devx.c
With clang 15, the following -Werror warning is produced:

    sys/dev/mlx5/mlx5_ib/mlx5_ib_devx.c:1926:6: error: variable 'num_alloc_xa_entries' set but not used [-Werror,-Wunused-but-set-variable]
            int num_alloc_xa_entries = 0;
                ^

The 'num_alloc_xa_entries' variable appears to have been a debugging aid
that has never been used for anything, so remove it.

MFC after:	3 days
2022-07-25 00:40:13 +02:00
Dimitry Andric
6332ad8673 Fix unused variable warning in mlx5_fs_tree.c
With clang 15, the following -Werror warning is produced:

    sys/dev/mlx5/mlx5_core/mlx5_fs_tree.c:1408:15: error: variable 'candidate_group_num' set but not used [-Werror,-Wunused-but-set-variable]
            unsigned int candidate_group_num = 0;
                         ^

The 'candidate_group_num' variable appears to have been a debugging aid
that has never been used for anything, so remove it.

MFC after:	3 days
2022-07-25 00:40:13 +02:00
Hans Petter Selasky
e4d178d093 mlx5ib: Fix memory leak in clean_mr() error path
In the clean_mr() error path the 'mr' should be freed.

Linux commit:
5942d8ae411775b76e5e1ab0cce57b0666516f2d

PR:		264653
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-06-13 17:00:16 +02:00
Hans Petter Selasky
d5d6949031 mlx5en(4): Allow RX TLS to be enabled and disabled by ifconfig(8).
While at it, fix double initialization of the "drv_ioctl_data_d" structure
and the "mask" variable.

Reviewed by:	kib@
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-06-08 13:53:26 +02:00
Hans Petter Selasky
cb27627968 mlx5en(4): Set the leaf network interface field in the mbuf packet header.
This will be used for TLS RX.

Submitted by:	jhb@
Differential revision:	https://reviews.freebsd.org/D32356
Sponsored by:	NVIDIA Networking
2022-06-07 12:54:42 +02:00
Konstantin Belousov
3a364a6b91 mlx5en: formally declare supoort for RXTLS
Reviewed by:	hselasky
Sponsored by:	NVIDIA Networking
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D32551
2022-05-24 23:59:32 +03:00
Konstantin Belousov
f7ea19958b Convert mlx5_en to SIOCSIFCAPNV
Reviewed by:	hselasky, jhb, kp (previous version)
Sponsored by:	NVIDIA Networking
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D32551
2022-05-24 23:59:32 +03:00
Hans Petter Selasky
d735d604f0 mlx5en(4): Use hard-coded 4K page size for RQ/SQ/CQ.
The page size specified for RQ, SQ and CQ is always in units of 4KBytes.
Make sure we subtract MLX5_ADAPTER_PAGE_SHIFT, 12, instead of PAGE_SHIFT
which may vary. This fixes support for using the mlx5en driver on systems
having non-4K page size.

Linux commit:
68cdf5d6e91068c98d6091b193dc7a5ab7dcf5eb

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-05-03 13:48:43 +02:00
John Baldwin
75095cd38f mlx5: fs_tcp is only used for INET or INET6. 2022-04-13 16:08:21 -07:00
John Baldwin
c3b1cbc9e6 mlx5 RATELIMIT: Remove an unused variable. 2022-04-12 14:59:00 -07:00
John Baldwin
834533b9c5 mlx5: Remove unused variables.
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D34827
2022-04-08 17:25:13 -07:00
John Baldwin
ebb16d5e93 mlx5: Pass the correct data pointer to the add_dst_cb instead of NULL.
Reported by:	-Wunused-but-set-variable
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D34812
2022-04-07 10:46:48 -07:00
John Baldwin
6301649e0e mlx5: Remove write-only variables. 2022-04-06 16:45:28 -07:00
Gordon Bergling
4a87beeccb mlx5en(4): Fix a few typos in source code comments
- s/persistant/persistent/

MFC after:	3 days
2022-03-28 19:36:32 +02:00
Hans Petter Selasky
b18c510844 mlx5/mlx4: Bump driver version to 3.7.1
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-03-08 13:12:03 +01:00
Hans Petter Selasky
eb16e362d6 mlx5core: Add PCI IDs for ConnectX-8.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-21 09:35:19 +01:00
Hans Petter Selasky
91c8ffd7e6 mlx5ib: Add support for NDR link speed.
The IBTA specification has new speed - NDR. That speed supports signaling
rate of 100Gb. mlx5 IB driver translates link modes reported by ConnectX
device to IB speed and width. Added translation of new 100Gb, 200Gb and
400Gb link modes to NDR IB type and width of x1, x2 or x4 respectively.

Linux commits:
f946e45f59ef01ff54ffb3b1eba3a8e7915e7326

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-21 09:35:19 +01:00
Hans Petter Selasky
ea8aacc523 mlx5core: Add PCI IDs for ConnectX-7.
Linux commits:
505a7f5478062c6cd11e22022d9f1bf64cd8eab3
dd8595eabeb486d41ad9994e6cece36e0e25e313

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-21 09:35:19 +01:00
Hans Petter Selasky
bc531a1faa mlx5en: Improve CQE error debugging.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-17 13:13:09 +01:00
Hans Petter Selasky
015f22f5d0 mlx5en: Fix TLS worker thread race.
Create a dedicated free state, in case the taskqueue worker is still pending,
to avoid re-activation of a freed send tag.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-17 13:13:09 +01:00
Hans Petter Selasky
ebdb700649 mlx5en: Improve RX- and TX- TLS refcounting.
Use the send tag refcounting mechanism to refcount the RX- and TX- TLS
send tags. Then it is no longer needed to wait for refcounts to reach
zero when destroying RX- and TX- TLS send tags as a result of pending
data or WQE commands.

This also ensures that when TX-TLS and rate limiting is used at the same
time, the underlying SQ is not prematurely destroyed.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-17 13:13:09 +01:00
Hans Petter Selasky
d2a788a522 mlx5en: Add missing refcount decrement on link-down.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-17 13:13:09 +01:00
Mark Johnston
235ed6a486 mlx5e: Make TLS tag zones unmanaged
These zones are cache zones used to allocate TLS offload contexts from
firmware.  Releasing items from the cache is a sleepable operation due
to the need to await a response from the firmware command freeing the
tag, so items cannot be reclaimed from the zone in non-sleepable
contexts.  Since the cache size is limited by firmware limits, avoid
this by setting UMA_ZONE_UNMANAGED to avoid reclamation by uma_timeout()
and the low memory handler.

Reviewed by:	hselasky, kib
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34142
2022-02-15 09:25:34 -05:00
Hans Petter Selasky
a30f71704e mlx5ib: Add support for parsing udata in mlx5_ib_create_flow().
Backport from Linux 5.17 (drivers/infiniband/hw/mlx5/fs.c)

This fixes creating flow rules from user-space after the
kernel space update based on Linux 5.7-rc1 .

Sponsored by:	NVIDIA Networking
2022-02-10 11:17:42 +01:00
Hans Petter Selasky
04f407a3e5 mlx5en: Make sure the NIC IP addresses are written to firmware on link up.
Fixes e059c120b4 .

PR:		261746
MFC after:	1 day
Sponsored by:	NVIDIA Networking
2022-02-10 11:17:42 +01:00
Hans Petter Selasky
84d7b8e75f mlx5en: Implement TLS RX support.
TLS RX support is modeled after TLS TX support. The basic structures and layouts
are almost identical, except that the send tag created filters RX traffic and
not TX traffic.

The TLS RX tag keeps track of past TLS records up to a certain limit,
approximately 1 Gbyte of TCP data. TLS records of same length are joined
into a single database record.

Regularly the HW is queried for TLS RX progress information. The TCP sequence
number gotten from the HW is then matches against the database of TLS TCP
sequence number records and lengths. If a match is found a static params WQE
is queued on the IQ and the hardware should immediately resume decrypting TLS
data until the next non-sequential TCP packet arrives.

Offloading TLS RX data is supported for untagged, prio-tagged, and
regular VLAN traffic.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:17 +01:00
Hans Petter Selasky
e6d7ac1d03 mlx5core: Set driver version into firmware.
If the driver_version capability bit is enabled, send the driver
version to firmware after the init HCA command, for display purposes.

Example of driver version: "FreeBSD,mlx5_core,14.0.0,3.x-xxx"

Linux commits:
012e50e109fd27ff989492ad74c50ca7ab21e6a1

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:17 +01:00
Hans Petter Selasky
8e332232a5 mlx5en: Implement one RQT object per channel.
These objects will eventually be used to switch TLS RX traffic.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:17 +01:00
Hans Petter Selasky
ea00d7e8ca mlx5: Add raw ethernet local loopback support.
Currently, unicast/multicast loopback raw ethernet (non-RDMA) packets
are sent back to the vport.  A unicast loopback packet is the packet
with destination MAC address the same as the source MAC address.  For
multicast, the destination MAC address is in the vport's multicast
filter list.

Moreover, the local loopback is not needed if there is one or none
user space context.

After this patch, the raw ethernet unicast and multicast local
loopback are disabled by default. When there is more than one user
space context, the local loopback is enabled.

Note that when local loopback is disabled, raw ethernet packets are
not looped back to the vport and are forwarded to the next routing
level (eswitch, or multihost switch, or out to the wire depending on
the configuration).

Linux commits:
c85023e153e3824661d07307138fdeff41f6d86a
8978cc921fc7fad3f4d6f91f1da01352aeeeff25

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00