147104 Commits

Author SHA1 Message Date
Mateusz Guzik
5e954b9216 tmpfs: add missing vop_fplookup ops to tmpfs_fifoop_entries
Reported by:	gbe
PR:	270917
2023-04-18 18:06:30 +00:00
Marius Strobl
8defc88c13 gem(4): Remove onboard-only Sun ERI and remnants of SBus support
These bits are obsolete since 58aa35d429.
This change reverts part of 9ba2b298df as
well as effectively bd3d9826d7, i.e. the
SBus-related modifications. This also gets rid of a nasty hack required
as bus_{read,write}_N(9) doesn't really fit bus_space_subregion(9).
2023-04-18 19:17:24 +02:00
Marius Strobl
bd15d31cef mmc(4): Don't call bridge driver for timings not requiring tuning
The original idea behind calling into the bridge driver was to have the
logic deciding whether tuning is actually required for a particular bus
timing in a given slot as well as doing the sanity checking only on the
controller layer which also generally is better suited for these due to
say SDHCI_SDR50_NEEDS_TUNING. On another thought, not every such driver
should need to check whether tuning is required at all, though, and not
everything is SDHCI in the first place.
Adjust sdhci{,_fsl_fdt}(4) accordingly, but keep sdhci_generic_tune() a
bit cautious still.
2023-04-18 19:17:24 +02:00
Randall Stewart
2ad584c555 tcp: Inconsistent use of hpts_calling flag
Gleb noticed some inconsistencies in the way the inp_hpts_calls flag was being used. One
such inconsistency results in a bug when we can't allocate enough sendmap entries to entertain a call to
rack_output(): a timer won't get started like it should. Also, in cleaning this up, I found that the
"no_output" side of input needs to be adjusted to make sure we don't try to re-pace too quickly outside
the hpts assurance of 250 microseconds.

Another problem is that we end up with duplicate calls to tcp_output(), which should not happen. If packets go
from hpts for processing, the input side of tcp will call the output side of tcp on the last packet if it is needed.
This means that when that occurs, a second call to tcp_output() would be made that is not needed and, if pacing
is going on, may be harmful.

Let's fix all this and explicitly state the contract that hpts is making with transports that care about the
flag.

Reviewed by: tuexen, glebius
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39653
2023-04-17 17:10:26 -04:00
Steve Kiernan
fb5ff7384c arm64: Use FULLKERNEL instead of .ALLSRC in .bin target
Using .ALLSRC may get additional arguments that we may not want
and could cause the objcopy to fail.

Reviewed by:	emaste
Obtained from:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D39639
2023-04-18 11:41:57 -04:00
Kristof Provost
af94d8cc17 pf: fix incorrect lock define
PF_TABLE_STATS_ASSERT() should be checking pf_table_stats_lock not
pf_rules_lock.

Fortunately the define is not yet used anywhere so this was harmless.
Fix it anyway, in case it does get used.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-04-18 15:51:05 +02:00
Hans Petter Selasky
1943c40cd6 mlx5en(4): Don't wait for receive queue to fill up with mbufs during open channels.
Failure to get mbufs may be transient.
Don't permanently fail to open the channels due to lack of mbufs.
This also makes modifying channel parameters faster.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
6bd4bb9bdb mlx5en(4): Explain why CQE zipping is off.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
80b4ef6d10 mlx5: Remove unused debugfs node pointers.
No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
aa7bbdabde mlx5: Implement diagnostic counters as sysctl(8) nodes.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:07 +02:00
Hans Petter Selasky
95bf70a4bf mlx5: Don't give zero number of pages to the firmware.
Can happen when using virtual mlx5_core<N> functions, VFs.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
273bfac08f mlx5: Implement mlx5_core_modify_cq_by_mask().
Implement one CQ modify function supporting all firmware versions,
instead of having more variants of CQ modify.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
2f7e9a8a21 mlx5: Fix duplicate free of default flow rule in error case.
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Hans Petter Selasky
b0b87d9151 mlx5: Make mlx5_del_flow_rule() NULL safe.
This change factors out repeated NULL checks.

No functional change intended.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
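The idea of a NULL-safe delete function can be sketched as follows. This is a minimal userland illustration, not the mlx5 code: `struct rule`, `rule_del()` and the `rules_freed` counter are hypothetical names introduced only for this example. Moving the NULL check into the callee lets every error path call the function unconditionally.

```c
#include <stdlib.h>

/* Hypothetical rule object; stands in for a driver-owned resource. */
struct rule {
	int id;
};

/* Counts real frees so the behaviour can be observed in this sketch. */
static int rules_freed;

/*
 * NULL-safe delete: callers no longer need their own
 * "if (r != NULL)" check before calling.
 */
static void
rule_del(struct rule *r)
{
	if (r == NULL)
		return;
	free(r);
	rules_freed++;
}
```

A caller can then write `rule_del(r);` on every cleanup path, whether or not `r` was ever allocated.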
Hans Petter Selasky
3bb3e4768f mlx5: Make MLX5_COMP_EQ_SIZE tunable.
When using hardware pacing, this value can be increased, because more SQs
means more EQ events as well. Make it tunable, hw.mlx5.comp_eq_size.

MFC after:	1 week
Sponsored by:	NVIDIA Networking
2023-04-18 15:01:06 +02:00
Randall Stewart
37229fed38 tcp: Blackbox logging and tcp accounting together can cause a crash.
If BB logging and TCP Accounting are both turned on, we can crash when we run out of
memory, because a NULL check is missing. Also, let's make sure we
don't divide by 0 when calculating any BB ratios.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision: https://reviews.freebsd.org/D39622
2023-04-17 13:52:00 -04:00
Alexander V. Chernikov
28abf63277 netlink: sync interface IFLA attributes
MFC after:	2 weeks
2023-04-18 12:34:05 +00:00
Gordon Bergling
105e397eb6 kern_sysctl: Remove double words in source code comments
- s/on on/on/

MFC after:	5 days
2023-04-18 07:14:57 +02:00
Gordon Bergling
93e4914816 net80211: Remove double words in source code comments
- s/we we/we/

MFC after:	5 days
2023-04-18 07:14:50 +02:00
Stephen J. Kiernan
76735c7439 flash: Add "n25q64" to mx25l driver
This is for 64Mb Micron N25Q serial NOR flash memory

Obtained from:	Juniper Networks, Inc.
2023-04-18 00:21:17 -04:00
Jason A. Harmening
0c01203e47 vfs_lookup(): re-check v_mountedhere on lock upgrade
The VV_CROSSLOCK handling logic may need to upgrade the covered
vnode lock depending upon the requirements of the filesystem into
which vfs_lookup() is walking.  This may involve transiently
dropping the lock, which can allow the target mount to be unmounted.

Tested by:	pho
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
93fe61afde unionfs_mkdir(): handle dvp reclamation
The underlying VOP_MKDIR() implementation may temporarily drop the
parent directory vnode's lock.  If the vnode is reclaimed during that
window, the unionfs vnode will effectively become unlocked because
its v_vnlock field will be reset.  To uphold the locking
requirements of VOP_MKDIR() and to avoid triggering various VFS
assertions, explicitly re-lock the unionfs vnode before returning
in this case.

Note that there are almost certainly other cases in which we'll
similarly need to handle vnode relocking by the underlying FS; this
is the only one that's caused problems in stress testing so far.
A more general solution, such as that employed for nullfs in
null_bypass(), will likely need to be implemented.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision: https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
d711884e60 Remove unionfs_islocked()
The implementation is racy; if the unionfs vnode is not in fact
locked, vnode private data may be concurrently altered or freed.
Instead, simply rely upon the standard implementation to query the
v_vnlock field, which is type-stable and will reflect the correct
lower/upper vnode configuration for the unionfs node.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
a5d82b55fe Remove an impossible condition from unionfs_lock()
We hold the vnode interlock, so vnode private data cannot suddenly
become NULL.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
a18c403fbd unionfs: remove LK_UPGRADE if falling back to the standard lock
The LK_UPGRADE operation may have temporarily dropped the upper or
lower vnode's lock.  If the unionfs vnode was reclaimed during that
window, its lock field will be reset to no longer point at the
upper/lower vnode lock, so the lock operation will use the standard
lock stored in v_lock.  Remove LK_UPGRADE from the flags in this case
to avoid a lockmgr assertion, as this lock has not been previously
owned by the calling thread.

Reported by:	pho
Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Ed Maste
00172f3416 geom: use bool for one-bit wide bit-field
A one-bit wide bit-field can take only the values 0 and -1.  Clang 16
introduced a warning that "implicit truncation from 'int' to a one-bit
wide bit-field changes value from 1 to -1".  Fix by using c99 bool.

Reported by:	Clang, via dim
Reviewed by:	dim
Sponsored by:	The FreeBSD Foundation
2023-04-17 15:43:00 -04:00
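The bit-field behaviour described in the geom commit above can be reproduced in a small userland sketch. The struct and function names here are hypothetical and are not taken from the geom sources. The result for the signed field is implementation-defined in C; the value noted in the comment is what GCC and Clang produce.

```c
#include <stdbool.h>

/* Hypothetical flag holders; not the actual geom structures. */
struct flags_signed {
	signed int enabled:1;	/* one signed bit: representable values are 0 and -1 */
};

struct flags_bool {
	bool enabled:1;		/* one bool bit: representable values are 0 and 1 */
};

/* Store v in a signed one-bit field and read it back. */
static int
store_in_signed_bit(int v)
{
	struct flags_signed f;

	f.enabled = v;		/* 1 is truncated; GCC and Clang yield -1 */
	return (f.enabled);
}

/* Store v in a bool one-bit field and read it back. */
static int
store_in_bool_bit(int v)
{
	struct flags_bool f;

	f.enabled = (v != 0);
	return (f.enabled);
}
```

With the signed field, a later test such as `f.enabled == 1` is never true, which is the kind of surprise the Clang 16 warning points at.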
Gleb Smirnoff
3232b1f4a9 tcp: fix build
The recent 25685b7537 came in conflict with a540cdca31.  Remove the
code that cleans up the old style input queue.  Note that two lines
below we assert that the new style input queue is empty.  The TCP
stacks that use the queue are supposed to flush it in their
tfb_tcp_fb_fini method.
2023-04-17 10:24:20 -07:00
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the case since early 2020. The net
interrupts that are marked with INTR_TYPE_NET have been entering the epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT(), and
all drivers that were known back in 2020 were marked with it in
6c3e93cb5a. However, in e87c494015 we took a conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicking with INVARIANTS if there is a missed driver. With
INVARIANTS we will run the in_epoch() check, print a warning and enter
the net epoch.  A driver that prints the warning can be quickly fixed with the
IFF_NEEDSEPOCH flag, but would better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Gleb Smirnoff
a540cdca31 tcp_hpts: use queue(9) STAILQ for the input queue
Reviewed by:		rrs
Differential Revision:	https://reviews.freebsd.org/D39574
2023-04-17 09:07:23 -07:00
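For readers unfamiliar with queue(9), the following is a minimal userland sketch of a STAILQ used as a FIFO input queue. It is not the tcp_hpts code: `struct pkt`, `struct pktq` and `fifo_order_holds()` are hypothetical names, and it assumes a `<sys/queue.h>` that provides the STAILQ macros (as the BSDs and glibc do).

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical queued item; the link field is what STAILQ threads through. */
struct pkt {
	int seq;
	STAILQ_ENTRY(pkt) link;
};

STAILQ_HEAD(pktq, pkt);

/*
 * Enqueue three items at the tail, then drain from the head.
 * Returns 1 if the items come out in insertion order and the
 * queue is empty afterwards.
 */
static int
fifo_order_holds(void)
{
	struct pktq q = STAILQ_HEAD_INITIALIZER(q);
	struct pkt pkts[3];
	struct pkt *p;
	int i, expect = 0;

	for (i = 0; i < 3; i++) {
		pkts[i].seq = i;
		STAILQ_INSERT_TAIL(&q, &pkts[i], link);
	}
	while ((p = STAILQ_FIRST(&q)) != NULL) {
		STAILQ_REMOVE_HEAD(&q, link);
		if (p->seq != expect++)
			return (0);
	}
	return (expect == 3 && STAILQ_EMPTY(&q));
}
```

A singly-linked tail queue gives constant-time insertion at the tail and removal at the head, which is all a FIFO needs.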
Steve Kiernan
48ffacbc84 veriexec: Add function to get label associated with a file
Add mac_veriexec_metadata_get_file_label to avoid the need to
expose internals to other MAC modules.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:33 -04:00
Steve Kiernan
bd4742c970 veriexec: Rename old VERIEXEC_SIGNED_LOAD as VERIEXEC_SIGNED_LOAD32
We need to handle the old ioctl from old binaries.

Add some missing ioctls.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Steve Kiernan
d195f39d1d veriexec: Add option MAC_VERIEXEC_DEBUG
Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Simon J. Gerraty
8c3e263dc1 veriexec: mac_veriexec_syscall compat32 support
Some 32bit apps may need to be able to use
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL
MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL

Therefore compat32 support is required.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Steve Kiernan
8512d82ea0 veriexec: Additional functionality for MAC/veriexec
Ensure veriexec opens the file before doing any read operations.

When the MAC_VERIEXEC_CHECK_PATH_SYSCALL syscall is requested, veriexec
needs to open the file before calling mac_veriexec_check_vp. This is to
ensure any set up is done by the file system. Most file systems do not
explicitly need an open, but some (e.g. virtfs) require initialization
of access tokens (file identifiers, etc.) before doing any read or write
operations.

The evaluate_fingerprint() function needs to ensure it has an open file
for reading in order to evaluate the fingerprint. The ideal solution is
to have a hook after the VOP_OPEN call in vn_open. For now, we open the
file for reading, evaluate the fingerprint, and close the file. While
this leaves a potential hole that could possibly be taken advantage of
by a dedicated adversary, this code path is not typically visited often
in our use cases, as we primarily encounter verified mounts and not
individual files. This should be considered a temporary workaround until
discussions about the post-open hook have concluded and the hook becomes
available.

Add MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL and
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL to mac_veriexec_syscall so we can
fetch and check label contents in an unconstrained manner.

Add a check for PRIV_VERIEXEC_CONTROL to do ioctl on /dev/veriexec

Make it clear that a trusted process cannot be debugged. Attempts to debug
a trusted process already fail, but the failure path is very obscure.
Add an explicit check for VERIEXEC_TRUSTED in
mac_veriexec_proc_check_debug.

We need mac_veriexec_priv_check to not block PRIV_KMEM_WRITE if
mac_priv_grant() says it is ok.

Reviewed by:	sjg
Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Mark Johnston
d95fbf4e1a riscv: save the thread pointer in both modes
The contents of frame->tf_tp are uninitialized if accessed by DTrace (in
probe context), resulting in a panic when trying to access the memory
pointed to by tp. This saves the thread pointer to the trap frame when
handling both userland and kernel exceptions.

Reviewed by:	markj, mhorne
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39582
2023-04-17 09:49:52 -04:00
Alexander V. Chernikov
f656a96020 tests: make ktest build on ppc.
MFC after:	2 weeks
2023-04-17 13:47:07 +00:00
Alexander V. Chernikov
9742519b22 netlink: fix operations with link-local routes/gateways.
MFC after:	3 days
2023-04-17 12:04:43 +00:00
Alexander V. Chernikov
b8da3b62a5 tests: add ktest modules to build
MFC after:	2 weeks
2023-04-17 10:46:05 +00:00
Pawel Jakub Dawidek
068913e4ba zfs: Add vfs.zfs.bclone_enabled sysctl.
Keep block cloning disabled by default for now, but allow enabling and
using it by setting vfs.zfs.bclone_enabled to 1, so people can easily
try it.

Approved by:	oshogbo
Reviewed by:	mm, oshogbo
Differential Revision:	https://reviews.freebsd.org/D39613
2023-04-17 03:38:30 -07:00
Zhenlei Huang
401f03445e lagg(4): Correctly define some sysctl variables
939a050ad9 virtualized lagg(4), but the corresponding sysctls of some
virtualized global variables are not marked with CTLFLAG_VNET. An attempt to
operate on those variables via sysctl will effectively go to the 'master'
copies and the virtualized ones are not read or set accordingly. As a
side effect, on updating the 'master' copy, the virtualized global
variables of newly created vnets will have correct values.

PR:		270705
Reviewed by:	kp
Fixes:		939a050ad9 Virtualize lagg(4) cloner
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D39467
2023-04-17 18:24:35 +08:00
Zhenlei Huang
a7acce3491 vnet: Fix a typo in a source code comment
- s/form/from/

MFC after:	3 days
2023-04-17 18:24:35 +08:00
Pawel Jakub Dawidek
1959e122d9 zfs: Merge https://github.com/openzfs/zfs/pull/14739
The zfs_log_clone_range() function is never called from the
zfs_clone_range_replay() function, so I assumed it is safe to assert
that zil_replaying() is never TRUE here. It turns out zil_replaying()
also returns TRUE when the sync property is set to disabled.

Fix the problem by just returning if zil_replaying() returns TRUE.

Reported by: Florian Smeets
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>

Approved by: oshogbo, mm
2023-04-17 02:22:56 -07:00
Pawel Jakub Dawidek
e0bb199925 zfs: cherry-pick openzfs/zfs@c71fe7164
Fix data corruption when cloning embedded blocks

Don't overwrite blk_phys_birth, as for embedded blocks it is part of
the payload.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Issue #13392
Closes #14739

Approved by: oshogbo, mm
2023-04-17 02:19:49 -07:00
Stephen J. Kiernan
88a3358ea4 veriexec: Add SPDX-License-Identifier
2023-04-16 21:23:00 -04:00
Stephen J. Kiernan
894bcc876d sys/modules/Makefile: conditionally add MAC/veriexec modules
Only build MAC/veriexec modules when MK_VERIEXEC is yes or we
are building all modules.

Add VERIEXEC knob to kernel __DEFAULT_NO_OPTIONS

Reviewed by:	sjg
Obtained from:	Juniper Networks, Inc.
2023-04-16 20:24:54 -04:00
Stephen J. Kiernan
8050e0a429 sys/modules/Makefile: add MAC/veriexec modules into the build
Build the MAC/veriexec module and the SHA2, SHA256, SHA384, and
SHA512 fingerprint modules.

Obtained from:	Juniper Networks, Inc.
2023-04-16 19:18:55 -04:00
Simon J. Gerraty
6ae8d57652 mac_veriexec: add mac_priv_grant check for NODEV
Allow other MAC modules to override some veriexec checks.

We need two new privileges:
PRIV_VERIEXEC_DIRECT	process wants to override 'indirect' flag
			on interpreter
PRIV_VERIEXEC_NOVERIFY	typically associated with PRIV_VERIEXEC_DIRECT
			allow override of O_VERIFY

We also need to check for PRIV_VERIEXEC_NOVERIFY override
for FINGERPRINT_NODEV and FINGERPRINT_NOENTRY.
This will only happen if parent had PRIV_VERIEXEC_DIRECT override.

This allows for MAC modules to selectively allow some applications to
run without verification.

Needless to say, this is extremely dangerous and should only be used
sparingly and carefully.

Obtained from:	Juniper Networks, Inc.

Reviewers: sjg
Subscribers: imp, dab

Differential Revision: https://reviews.freebsd.org/D39537
2023-04-16 19:14:40 -04:00
Stephen J. Kiernan
4819e5aeda Add new privilege PRIV_KDB_SET_BACKEND
Summary:
Check for PRIV_KDB_SET_BACKEND before allowing a thread to change
the KDB backend.

Obtained from:	Juniper Networks, Inc.
Reviewers: sjg, emaste
Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D39538
2023-04-16 14:37:58 -04:00
Val Packett
77f0e198d9 procctl: add state flags to PROC_REAP_GETPIDS reports
For a process supervisor using the reaper API to track process subtrees,
it is very useful to know the state of the processes on the list.

Sponsored by:   https://www.patreon.com/valpackett
Reviewed by:    kib
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D39585
2023-04-16 13:48:20 +03:00
Stephen J. Kiernan
b1a00c2b13 Quiet compiler warnings for fget_noref and fdget_noref
Summary:
Typecasting both parts of the comparison to u_int quiets compiler
warnings about signed/unsigned comparison and takes care of positive
and negative numbers for the file descriptor in a single comparison.

Obtained from:	Juniper Networks, Inc.

Reviewers: mjg

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D39593
2023-04-15 23:50:54 -04:00
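The single-comparison range check described in the fget_noref/fdget_noref commit above can be sketched as follows. This is an illustration only: `fd_in_range()` and its `nfiles` parameter are hypothetical and do not mirror the actual kernel functions. Casting both sides to unsigned makes a negative descriptor wrap to a very large value, so one comparison rejects both negative and too-large descriptors.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when fd is a valid index into a
 * table of nfiles entries (nfiles is assumed to be non-negative).
 */
static bool
fd_in_range(int fd, int nfiles)
{
	/*
	 * (unsigned int)-1 is UINT_MAX, so a negative fd fails the same
	 * test that rejects fd >= nfiles, and there is no signed/unsigned
	 * comparison for the compiler to warn about.
	 */
	return ((unsigned int)fd < (unsigned int)nfiles);
}
```

This replaces the two-part test `fd >= 0 && fd < nfiles` with one comparison.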