272288 Commits

Author SHA1 Message Date
Hans Petter Selasky
5381f93647 mlx5en: Create TIRs before flowtables.
Because flowtables may redirect traffic to TIRs.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
001106f807 mlx5en: Create flowtables in correct order.
Because it affects how the flow tables may re-direct traffic.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
2c0ade806a mlx5: Implement flow steering helper functions for TCP sockets.
This change adds convenience functions to setup a flow steering rule based on
a TCP socket. The helper function gets all the address information from the
socket and returns a steering rule, to be used with HW TLS RX offload.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
0ee1b09eaa mlx5: Implement offloads flowtable namespace.
This namespace will be used for TCP offloads, like hardware decryption
of TLS TCP data.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
e059c120b4 mlx5en: Create and destroy all flow tables and rules when the network interface attaches and detaches.
Previously flow steering tables and rules were only created and destroyed
at link up and down events, respectivly. Due to new requirements for adding
TLS RX flow tables and rules, the main flow steering table must always be
available as there are permanent redirections from the TLS RX flow table
to the vlan flow table.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
a8e715d21b mlx5en: Add race protection for SQ remap
Add a refcount for posted WQEs to avoid a race between
post WQE and FW command flows.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:16 +01:00
Hans Petter Selasky
aabca1034c mlx5en: Properly account for no-checksum on tunneled packets.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
06c2bd1872 mlx5en: Force all packets through the indirection table.
All packets must go through the indirection table, RQT,
because it is not possible to modify the RQN of the TIR
for direct dispatchment after it is created, typically
when the link goes up and down.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
266c81aae3 mlx5/mlx5en: Add SQ remap support
Add support to map an SQ to a specific schedule queue using a
special WQE as performance enhancement.

SQ remap operation is handled by a privileged internal queue, IQ,
and the mapping is enabled from one rate to another.

The transition from paced to non-paced should however always go
through FW.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
1c407d0494 mlx5: Properly define the reg_umr_sq networking offload capability bit.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
9680b1ba71 mlx5en: Only delete installed VxLAN rules.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
6176a5e338 mlx5en: Fix inverted logical assignment.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
694263572f mlx5en: Implement support for internal queues, IQ.
Internal send queues are regular sendqueues which are reserved for WQE commands
towards the hardware and firmware. These queues typically carry resync
information for ongoing TLS RX connections and when changing schedule queues
for rate limited connections.

The internal queue, IQ, code is more or less a stripped down copy
of the existing SQ managing code with exception of:

1) An optional single segment memory buffer which can be read or
   written as a whole by the hardware, may be provided.
2) An optional completion callback for all transmit operations, may
   be provided.
3) Does not support mbufs.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
21228c67ab mlx5en: Implement helper functions to open and close TLS TIR context.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:15 +01:00
Hans Petter Selasky
75767cb889 mlx5en: Share DEK objects with TLS RX.
The TLS RX support also needs to be able to allocate DEK objects.
Share the available objects 1:1.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
fad4b7d1f2 mlx5en: Add missing TLS structure prototype.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
3a1bf85503 mlx5en: Remove unused hardware TLS field.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
33a6a7a72a mlx5en: Make the receive packet indirection table, RQT, static instead of dynamic.
Allocate the RQT once, pointing all initial entries to the drop RQN.
When opening the channels simplify modify the RQT, directing all traffic
to the new RQNs. Similarly when closing the channels point all RQT entries
back to the so-called drop RQN.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
7800af352a mlx5en: Set CQN in RQ parameters for drop RQ.
Else creating the drop RQ fails.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
03567b0dfa mlx5en: Set channel pointer for drop receive queue.
A valid channel pointer is needed to get the priv pointer during init.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
4e40e984da mlx5en: Print error code when opening drop RQ fails.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
27b778ae55 mlx5en: Implement dummy receive queue, RQ, for dropping packets.
What is a drop RQ and why is it needed?

The RSS indirection table, also called the RQT, selects the
destination RQ based on the receive queue number, RQN. The RQT is
frequently referred to by flow steering rules to distribute traffic
among multiple RQs. The problem is that the RQs cannot be destroyed
before the RQT referring them is destroyed too. Further, TLS RX
rules may still be referring to the RQT even if the link went
down. Because there is no magic RQN for dropping packets, we create
a dummy RQ, also called drop RQ, which sole purpose is to drop all
received packets. When the link goes down this RQN is filled in all
RQT entries, of the main RQT, so the real RQs which are about to be
destroyed can be released and the TLS RX rules can be sustained.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
a60f953424 mlx5en: Make the hw_lro parameter read only tunable.
This prevents the so-called TIR context from changing during runtime.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:14 +01:00
Hans Petter Selasky
788e9e7478 mlx5: Remove support for FreeBSD 10 and older.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:13 +01:00
Hans Petter Selasky
2d5e5a0d75 mlx5en: Patch to inhibit transmit doorbell writes during packet reception.
During packet reception the network stack frequently transmit data in
response to TCP window updates. To reduce the number of transmit doorbells
needed, inhibit all transmit doorbells designated for the same channel until
after the reception of packets for the given channel is completed.

While at it slightly refactor the mlx5e_tx_notify_hw() function:

1) The doorbell information is always stored into sq->doorbell.d64 .
No need to pass a separate pointer to this variable.

2) Move checks for skipping doorbell writes inside this function.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 16:21:13 +01:00
Konstantin Belousov
0f7b6e11c0 mlx5en: Use a UMA cache zone for managing TLS send tags
Instead of allocating directly from a normal zone. This way
import and release are guaranteed to process all allocated and then
deallocated items. Also, the release occurs in a sleepable context when
caller of uma_zfree() or uma_zdestroy() can sleep itself.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:58 +02:00
Konstantin Belousov
028130b8e4 mlx5ib: idiomatic use of preprocessor, in particular paths
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:58 +02:00
Konstantin Belousov
7060097908 mlx5ib: normalize use of the opt_*.h files
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:57 +02:00
Konstantin Belousov
89918a2375 mlx5en: idiomatic use of preprocessor, in particular paths
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:57 +02:00
Konstantin Belousov
b984b95693 mlx5en: normalize use of the opt_*.h files
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:57 +02:00
Hans Petter Selasky
12c56d7dc4 mlx5: idiomatic use of preprocessor, in particular paths
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:57 +02:00
Konstantin Belousov
ee9d634bd3 mlx5: normalize use of the opt_*.h files
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2022-02-01 14:45:57 +02:00
Andrew Turner
c363da4ae8 Add the Arm SPE interrupt to acpidump
To support the Arm Statistical Profiling Extension (SPE) ACPI 6.3 added
a place to hold the SPE interrupt. Add to acpidump to show when printing
the Arm Generic Interrupt data.

Sponsored by:	The FreeBSD Foundation
2022-02-01 11:43:13 +00:00
Konstantin Belousov
303d3ae7e8 ufs, msdosfs: do not record witness order when creating vnode
When allocating new vnode, we need to lock it exclusively before
making it externally visible.  Since other threads cannot observe the
vnode yet, current lock order cannot create LoR conditions.

Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34126
2022-02-01 10:51:55 +02:00
Konstantin Belousov
d51b0786a2 msdosfs_denode.c: some style
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D34126
2022-02-01 10:51:48 +02:00
Konstantin Belousov
99aa3b731c ffs: lock buffers after snaplk with LK_NOWITNESS
Reviewed by:	mckusick
Discussed with:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34073
2022-02-01 06:54:50 +02:00
Konstantin Belousov
c02780b78c Add GB_NOWITNESS flag
It prevents WITNESS from recording the lock order for the buffer lock
acquired by getblkx().

Reviewed by:	mckusick
Discussed with:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34073
2022-02-01 06:54:50 +02:00
Konstantin Belousov
e11b2b69c5 ffs_alloc.c: order includes alphabetically
Reviewed by:	mckusick
Discussed with:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34073
2022-02-01 06:54:50 +02:00
Konstantin Belousov
d950c5898a vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h
instead of fatty sys/systm.h.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov
f4cdb9d7c3 vm/vm_pager.h: use sys/systm.h header
it is needed for __read_mostly attribute definition, which right now
comes from vm/vm_page.h including sys/systm.h

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov
54d34bfbdf Introduce sys/kassert.h
It contains assert-related definitions previously provided by
sys/systm.h.  The new header is leaner than whole systm.h.
Include kassert.h from systm.h for compatibility.

The copyright assignment to Eivind Eklund was suggested by Kirk McKusick
and is based in the commit 5526d2d920eb17b1507499f35b275b486f7fe8d0.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:14:14 +02:00
John Baldwin
58862c0bea fstyp: Remove __packed from struct exfat_de_label.
This fixes a -Waddress-of-packed-member warning about a possibly
unaligned pointer from GCC 9 when calling convert_label().

__packed has to be removed from struct exfat_dirent as well to fix an
alignment warning when casting from a struct exfat_dirent pointer to a
struct exfat_de_label pointer.

Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D32144
2022-01-31 17:33:31 -08:00
John Baldwin
6c9ed42828 ggatec: Use ANSI C definition for init_initial_buffer_size.
This fixes -Wstrict-prototypes and -Wold-style-definition warnings
from GCC 9.
2022-01-31 17:12:04 -08:00
John Baldwin
53e938e408 hyperv storvsc: Don't abuse struct sglist to hold virtual addresses.
struct sglist is intended for holding S/G lists of physical address
ranges, not virtual address ranges.  GCC 9.x issues several warnings
due to casts between pointers and integers of different sizes as a
result (vm_paddr_t is 64-bits on i386).  Instead, add a local 'struct
hv_sglist' which uses an array of 'struct iovec' to hold the S/G list
of virtual address ranges.

Differential Revision:	https://reviews.freebsd.org/D31933
2022-01-31 17:11:27 -08:00
John Baldwin
d782385e9b tcp_ratelimit: Handle some edge cases with TLS + RL send tags.
- After a connection has fallen back from NIC TLS to SW TLS, any
  pacing rate changes should modify the inpcb send tag even though
  SB_TLS_IFNET is set.

- If a connection tries to modify the pacing rate before the send
  tag has been converted from plain TLS to TLS + RL, don't fail
  the rate request set but let it fall through to setting the rate
  on the non-TLS inpcb RL tag.

Reviewed by:	gallatin, rrs, hselasky
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D34085
2022-01-31 16:40:04 -08:00
John Baldwin
d958bc7963 ktls: Try to enable TOE TLS after marking existing data not ready.
At the moment this is mostly a no-op but in the future there will be
in-flight encrypted data which requires software decryption.  This
same setup is also needed for NIC TLS RX.

Note that this does break TOE TLS RX for AES-CBC ciphers since there
is no software fallback for AES-CBC receive.  This will be resolved
one way or another before 14.0 is released.

Reviewed by:	hselasky
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D34082
2022-01-31 16:39:21 -08:00
Gordon Tetlow
9ad859dab2 Fix minor grammar nit. 2022-01-31 15:35:23 -08:00
Ed Maste
e38610abca ssh: remove unused header
Fixes:		0746301c4995 ("ssh: pass 0 to procctl(2) to operate...")
Sponsored by:	The FreeBSD Foundation
2022-01-31 17:15:21 -05:00
Mark Johnston
773e3a71b2 pf: Initialize pf_kpool mutexes earlier
There are some error paths in ioctl handlers that will call
pf_krule_free() before the rule's rpool.mtx field is initialized,
causing a panic with INVARIANTS enabled.

Fix the problem by introducing pf_krule_alloc() and initializing the
mutex there.  This does mean that the rule->krule and pool->kpool
conversion functions need to stop zeroing the input structure, but I
don't see a nicer way to handle this except perhaps by guarding the
mtx_destroy() with a mtx_initialized() check.

Constify some related functions while here and add a regression test
based on a syzkaller reproducer.

Reported by:	syzbot+77cd12872691d219c158@syzkaller.appspotmail.com
Reviewed by:	kp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34115
2022-01-31 16:14:00 -05:00
Robert Wing
b4cc5d63b6 bhyve/virtio: use correct device id for virtio-scsi
Section 4.1.2.1 of the virtio spec states that the transitional PCI
device id for a scsi device is 0x1004.

Fix suggested by reporter.

PR:             259961
Reported by:    me@nanaya.pro
Reviewed by:	imp, jhb
Fixes:  f9c005a17f4e ("Add bhyve virtio-scsi storage backend support.")
Differential Revision:	https://reviews.freebsd.org/D34103
2022-01-31 09:44:47 -09:00