Commit Graph

147086 Commits

Author SHA1 Message Date
Gordon Bergling
93e4914816 net80211: Remove double words in source code comments
- s/we we/we/

MFC after:	5 days
2023-04-18 07:14:50 +02:00
Stephen J. Kiernan
76735c7439 flash: Add "n25q64" to mx25l driver
This is for 64Mb Micron N25Q serial NOR flash memory

Obtained from:	Juniper Networks, Inc.
2023-04-18 00:21:17 -04:00
Jason A. Harmening
0c01203e47 vfs_lookup(): re-check v_mountedhere on lock upgrade
The VV_CROSSLOCK handling logic may need to upgrade the covered
vnode lock depending upon the requirements of the filesystem into
which vfs_lookup() is walking.  This may involve transiently
dropping the lock, which can allow the target mount to be unmounted.

Tested by:	pho
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
93fe61afde unionfs_mkdir(): handle dvp reclamation
The underlying VOP_MKDIR() implementation may temporarily drop the
parent directory vnode's lock.  If the vnode is reclaimed during that
window, the unionfs vnode will effectively become unlocked because
the its v_vnlock field will be reset.  To uphold the locking
requirements of VOP_MKDIR() and to avoid triggering various VFS
assertions, explicitly re-lock the unionfs vnode before returning
in this case.

Note that there are almost certainly other cases in which we'll
similarly need to handle vnode relocking by the underlying FS; this
is the only one that's caused problems in stress testing so far.
A more general solution, such as that employed for nullfs in
null_bypass(), will likely need to be implemented.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision: https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
d711884e60 Remove unionfs_islocked()
The implementation is racy; if the unionfs vnode is not in fact
locked, vnode private data may be concurrently altered or freed.
Instead, simply rely upon the standard implementation to query the
v_vnlock field, which is type-stable and will reflect the correct
lower/upper vnode configuration for the unionfs node.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
a5d82b55fe Remove an impossible condition from unionfs_lock()
We hold the vnode interlock, so vnode private data cannot suddenly
become NULL.

Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Jason A. Harmening
a18c403fbd unionfs: remove LK_UPGRADE if falling back to the standard lock
The LK_UPGRADE operation may have temporarily dropped the upper or
lower vnode's lock.  If the unionfs vnode was reclaimed during that
window, its lock field will be reset to no longer point at the
upper/lower vnode lock, so the lock operation will use the standard
lock stored in v_lock.  Remove LK_UPGRADE from the flags in this case
to avoid a lockmgr assertion, as this lock has not been previously
owned by the calling thread.

Reported by:	pho
Tested by:	pho
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D39272
2023-04-17 20:31:40 -05:00
Ed Maste
00172f3416 geom: use bool for one-bit wide bit-field
A one-bit wide bit-field can take only the values 0 and -1.  Clang 16
introduced a warning that "implicit truncation from 'int' to a one-bit
wide bit-field changes value from 1 to -1".  Fix by using c99 bool.

Reported by:	Clang, via dim
Reviewed by:	dim
Sponsored by:	The FreeBSD Foundation
2023-04-17 15:43:00 -04:00
Gleb Smirnoff
3232b1f4a9 tcp: fix build
The recent 25685b7537 came in conflict with a540cdca31.  Remove the
code that cleans up the old style input queue.  Note that two lines
below we assert that the new style input queue is empty.  The TCP
stacks that use the queue are supposed to flush it in their
tfb_tcp_fb_fini method.
2023-04-17 10:24:20 -07:00
Gleb Smirnoff
a6b55ee6be net: replace IFF_KNOWSEPOCH with IFF_NEEDSEPOCH
Expect that drivers call into the network stack with the net epoch
entered. This has already been the fact since early 2020. The net
interrupts, that are marked with INTR_TYPE_NET, were entering epoch
since 511d1afb6b. For the taskqueues there is NET_TASK_INIT() and
all drivers that were known back in 2020 we marked with it in
6c3e93cb5a. However in e87c494015 we took conservative approach
and preferred to opt-in rather than opt-out for the epoch.

This change not only reverts e87c494015 but adds a safety belt to
avoid panicing with INVARIANTS if there is a missed driver. With
INVARIANTS we will run in_epoch() check, print a warning and enter
the net epoch.  A driver that prints can be quickly fixed with the
IFF_NEEDSEPOCH flag, but better be augmented to properly enter the
epoch itself.

Note on TCP LRO: it is a backdoor to enter the TCP stack bypassing
some layers of net stack, ignoring either old IFF_KNOWSEPOCH or the
new IFF_NEEDSEPOCH.  But the tcp_lro_flush_all() asserts the presence
of network epoch.  Indeed, all NIC drivers that support LRO already
provide the epoch, either with help of INTR_TYPE_NET or just running
NET_EPOCH_ENTER() in their code.

Reviewed by:		zlei, gallatin, erj
Differential Revision:	https://reviews.freebsd.org/D39510
2023-04-17 09:08:35 -07:00
Gleb Smirnoff
a540cdca31 tcp_hpts: use queue(9) STAILQ for the input queue
Reviewed by:		rrs
Differential Revision:	https://reviews.freebsd.org/D39574
2023-04-17 09:07:23 -07:00
Steve Kiernan
48ffacbc84 veriexec: Add function to get label associated with a file
Add mac_veriexec_metadata_get_file_label to avoid the need to
expose internals to other MAC modules.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:33 -04:00
Steve Kiernan
bd4742c970 veriexec: Rename old VERIEXEC_SIGNED_LOAD as VERIEXEC_SIGNED_LOAD32
We need to handle old ioctl from old binary.

Add some missing ioctls.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Steve Kiernan
d195f39d1d veriexec: Add option MAC_VERIEXEC_DEBUG
Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Simon J. Gerraty
8c3e263dc1 veriexec: mac_veriexec_syscall compat32 support
Some 32bit apps may need to be able to use
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL
MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL

Therefore compat32 support is required.

Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Steve Kiernan
8512d82ea0 veriexec: Additional functionality for MAC/veriexec
Ensure veriexec opens the file before doing any read operations.

When the MAC_VERIEXEC_CHECK_PATH_SYSCALL syscall is requested, veriexec
needs to open the file before calling mac_veriexec_check_vp. This is to
ensure any set up is done by the file system. Most file systems do not
explicitly need an open, but some (e.g. virtfs) require initialization
of access tokens (file identifiers, etc.) before doing any read or write
operations.

The evaluate_fingerprint() function needs to ensure it has an open file
for reading in order to evaluate the fingerprint. The ideal solution is
to have a hook after the VOP_OPEN call in vn_open. For now, we open the
file for reading, envaluate the fingerprint, and close the file. While
this leaves a potential hole that could possibly be taken advantage of
by a dedicated aversary, this code path is not typically visited often
in our use cases, as we primarily encounter verified mounts and not
individual files. This should be considered a temporary workaround until
discussions about the post-open hook have concluded and the hook becomes
available.

Add MAC_VERIEXEC_GET_PARAMS_PATH_SYSCALL and
MAC_VERIEXEC_GET_PARAMS_PID_SYSCALL to mac_veriexec_syscall so we can
fetch and check label contents in an unconstrained manner.

Add a check for PRIV_VERIEXEC_CONTROL to do ioctl on /dev/veriexec

Make it clear that trusted process cannot be debugged. Attempts to debug
a trusted process already fail, but the failure path is very obscure.
Add an explicit check for VERIEXEC_TRUSTED in
mac_veriexec_proc_check_debug.

We need mac_veriexec_priv_check to not block PRIV_KMEM_WRITE if
mac_priv_gant() says it is ok.

Reviewed by:	sjg
Obtained from:	Juniper Networks, Inc.
2023-04-17 11:47:32 -04:00
Mark Johnston
d95fbf4e1a riscv: save the thread pointer in both modes
The contents of frame->tf_tp are uninitialized if accessed by DTrace (in
probe context), resulting in a panic when trying to access the memory
pointed to by tp. This saves the thread pointer to the trap frame when
handling both userland and kernel exceptions.

Reviewed by:	markj, mhorne
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D39582
2023-04-17 09:49:52 -04:00
Alexander V. Chernikov
f656a96020 tests: make ktest build on ppc.
MFC after:	2 weeks
2023-04-17 13:47:07 +00:00
Alexander V. Chernikov
9742519b22 netlink: fix operations with link-local routes/gateways.
MFC after:	3 days
2023-04-17 12:04:43 +00:00
Alexander V. Chernikov
b8da3b62a5 tests: add ktest modules to build
MFC after:	2 weeks
2023-04-17 10:46:05 +00:00
Pawel Jakub Dawidek
068913e4ba zfs: Add vfs.zfs.bclone_enabled sysctl.
Keep block cloning disabled by default for now, but allow to enable and
use it after setting vfs.zfs.bclone_enabled to 1, so people can easily
try it.

Approved by:	oshogbo
Reviewed by:	mm, oshogbo
Differential Revision:	https://reviews.freebsd.org/D39613
2023-04-17 03:38:30 -07:00
Zhenlei Huang
401f03445e lagg(4): Correctly define some sysctl variables
939a050ad9 virtualized lagg(4), but the corresponding sysctl of some
virtualized global variables are not marked with CTLFLAG_VNET. A try to
operate on those variables via sysctl will effectively go to the 'master'
copies and the virtualized ones are not read or set accordingly. As a
side effect, on updating the 'master' copy, the virtualized global
variables of newly created vnets will have correct values.

PR:		270705
Reviewed by:	kp
Fixes:		939a050ad9 Virtualize lagg(4) cloner
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D39467
2023-04-17 18:24:35 +08:00
Zhenlei Huang
a7acce3491 vnet: Fix a typo in a source code comment
- s/form/from/

MFC after:	3 days
2023-04-17 18:24:35 +08:00
Pawel Jakub Dawidek
1959e122d9 zfs: Merge https://github.com/openzfs/zfs/pull/14739
The zfs_log_clone_range() function is never called from the
zfs_clone_range_replay() function, so I assumed it is safe to assert
that zil_replaying() is never TRUE here. It turns out zil_replaying()
also returns TRUE when the sync property is set to disabled.

Fix the problem by just returning if zil_replaying() returns TRUE.

Reported by: Florian Smeets
Signed-off-by: Pawel Jakub Dawidek pawel@dawidek.net

Approved by: oshogbo, mm
2023-04-17 02:22:56 -07:00
Pawel Jakub Dawidek
e0bb199925 zfs: cherry-pick openzfs/zfs@c71fe7164
Fix data corruption when cloning embedded blocks

Don't overwrite blk_phys_birth, as for embedded blocks it is part of
the payload.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Issue #13392
Closes #14739

Approved by: oshogbo, mm
2023-04-17 02:19:49 -07:00
Stephen J. Kiernan
88a3358ea4 veriexec: Add SPDX-License-Identifier 2023-04-16 21:23:00 -04:00
Stephen J. Kiernan
894bcc876d sys/modules/Makefile: conditionally add MAC/veriexec modules
Only build MAC/veriexec modules when MK_VERIEXEC is yes or we
are building all modules.

Add VERIEXEC knob to kernel __DEFAULT_NO_OPTIONS

Reviewed by:	sjg
Obtained from:	Juniper Networks, Inc.
2023-04-16 20:24:54 -04:00
Stephen J. Kiernan
8050e0a429 sys/modules/Makefile: add MAC/veriexec modules into the build
Build the MAC/veriexec module and the SHA2, SHA256, SHA384, and
SHA512 fingerprint modules.

Obtained from:	Juniper Networks, Inc.
2023-04-16 19:18:55 -04:00
Simon J. Gerraty
6ae8d57652 mac_veriexec: add mac_priv_grant check for NODEV
Allow other MAC modules to override some veriexec checks.

We need two new privileges:
PRIV_VERIEXEC_DIRECT	process wants to override 'indirect' flag
			on interpreter
PRIV_VERIEXEC_NOVERIFY	typically associated with PRIV_VERIEXEC_DIRECT
			allow override of O_VERIFY

We also need to check for PRIV_VERIEXEC_NOVERIFY override
for FINGERPRINT_NODEV and FINGERPRINT_NOENTRY.
This will only happen if parent had PRIV_VERIEXEC_DIRECT override.

This allows for MAC modules to selectively allow some applications to
run without verification.

Needless to say, this is extremely dangerous and should only be used
sparingly and carefully.

Obtained from:	Juniper Networks, Inc.

Reviewers: sjg
Subscribers: imp, dab

Differential Revision: https://reviews.freebsd.org/D39537
2023-04-16 19:14:40 -04:00
Stephen J. Kiernan
4819e5aeda Add new privilege PRIV_KDB_SET_BACKEND
Summary:
Check for PRIV_KDB_SET_BACKEND before allowing a thread to change
the KDB backend.

Obtained from:	Juniper Networks, Inc.
Reviewers: sjg, emaste
Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D39538
2023-04-16 14:37:58 -04:00
Val Packett
77f0e198d9 procctl: add state flags to PROC_REAP_GETPIDS reports
For a process supervisor using the reaper API to track process subtrees,
it is very useful to know the state of the processes on the list.

Sponsored by:   https://www.patreon.com/valpackett
Reviewed by:    kib
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D39585
2023-04-16 13:48:20 +03:00
Stephen J. Kiernan
b1a00c2b13 Quiet compiler warnings for fget_noref and fdget_noref
Summary:
Typecasting both parts of the comparison to u_int quiets compiler
warnings about signed/unsigned comparison and takes care of positive
and negative numbers for the file descriptor in a single comparison.

Obtained from:	Juniper Netwowrks, Inc.

Reviewers: mjg

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D39593
2023-04-15 23:50:54 -04:00
Warner Losh
214909d669 Revert "cam: fix up world compilation after previous"
This reverts commit 1d35493e46. It was the wrong fix. 757fc6666b has
the proper fix to include stdbool for userland.

Sponsored by:		Netflix
2023-04-15 18:25:55 -06:00
Warner Losh
757fc6666b cam: Include stdbool.h for userland
Sponsored by:		Netflix
2023-04-15 18:25:22 -06:00
Mateusz Guzik
1d35493e46 cam: fix up world compilation after previous
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-04-15 23:11:27 +00:00
Warner Losh
fd02926a68 cam: Properly mask out the status bits to get completion code
ccb_h.status has two parts: the actual status and some addition bits to
indicate additional information. It must be masked before comparing
against completion codes. Add new inline function cam_ccb_success to
simplify this to test whether or not the request succeeded. Most of the
code already does this, but a few places don't (the rest likely should
be converted to use cam_ccb_status and/or cam_ccb_success, but that's
for another day). This caused at least one bug in recognizing devices
behind a SATA port multiplexer, though some of these checks were
fine with the special knowledge of the code paths involved.

PR:			270459
Sponsored by:		Netflix
MFC After:		1 week (and maybe a EN requst)
Reviewed by:		ken, mav
Differential Revision:	https://reviews.freebsd.org/D39572
2023-04-15 16:32:41 -06:00
Mateusz Guzik
63ee747feb zfs: Revert "ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()"
This reverts commit 519851122b.

It results in data corruption, see:
https://github.com/openzfs/zfs/issues/14753

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-04-15 21:34:54 +00:00
Mateusz Guzik
46ac8f2e7d zfs: don't use zfs_freebsd_copy_file_range
There is one data corruption problem reported and fixed upstream, not
cherry-picked here yet.

On top of it the following fires under load:
        VERIFY(zil_replaying(zfsvfs->z_log, tx));

The patch which introduced the entire machinery is a revert candidate,
but as the machinery came with a dedicated feature flag, doing so would
render affected pools read-only at best. To be figured out.

As a temporary bandaid at least stop the active usage.
Note this patch does not make the feature disappear from zpool upgrade.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-04-15 21:34:54 +00:00
Bjoern A. Zeeb
42742fe725 KASAN: add bus_space*read*_8 for aarch64
Add the remaining bus_space*read*_8 functions conditionally for
only arm64 in order to not break KASAN builds with new code using
one of them.

Suggested by:	markj
Reviewed by:	markj
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D39581
2023-04-15 16:13:56 +00:00
Eugene Grosbein
5ee1c90e50 tmpfs: unbreak module build outside of kernel build environment
MFC after:	3 days
2023-04-15 11:00:03 +07:00
Konstantin Belousov
1e0e335b0f amd64: fix PKRU and swapout interaction
When vm_map_remove() is called from vm_swapout_map_deactivate_pages()
due to swapout, PKRU attributes for the removed range must be kept
intact.  Provide a variant of pmap_remove(), pmap_map_delete(), to
allow pmap to distinguish between real removes of the UVA mappings
and any other internal removes, e.g. swapout.

For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove().

Reported by:	andrew
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D39556
2023-04-15 02:53:59 +03:00
Randall Stewart
3cc7b66732 tcp: stack unloading crash in rack and bbr
Its possible to induce a crash in either rack or bbr. This would be done
if the rack stack were say the default and bbr was being used by a connection.
If the bbr stack is then unloaded and it was active, we will trigger a MPASS assert
in tcp_hpts since the new stack (default rack) would start a timer, and the old stack
(bbr) would have the inp already in hpts.

Reviewed by: tuexen
Sponsored by: Netflix Inc
Differential Revision:https://reviews.freebsd.org/D39576
2023-04-14 15:42:23 -04:00
Alexander V. Chernikov
9f324d8ac2 netlink: make netlink work correctly on CHERI.
Current Netlink message writer code relies on executing callbacks
 with arbitrary data (pointer or integer) to flush the completed
 messages.
This arbitrary data is stored as a union of { void *, uint64_t }.
At some stage, the message flushing code copied this data, using
 direct uint64_t assignment instead of copying the union. It lead
 to failure on CHERI, as sizeof(pointer) == 16 there.

Fix the code by making union non-anonymous and copying it entirely.

Reviewed by:	br, jhb, jrtc27
Differential Revision: https://reviews.freebsd.org/D39557
MFC after:	2 weeks
2023-04-14 16:33:43 +00:00
Alexander V. Chernikov
3e5d0784b9 Testing: add framework for the kernel unit tests.
This changes intends to reduce the bar to the kernel unit-testing by
 introducing a new kernel-testing framework ("ktest") based on Netlink,
 loadable test modules and python test suite integration.

This framework provides the following features:
* Integration to the FreeBSD test suite
* Automatic test discovery
* Automatic test module loading
* Minimal boiler-plate code in both kernel and userland
* Passing any metadata to the test
* Convenient environment pre-setup using python testing framework
* Streaming messages from the kernel to the userland
* Running tests in the dedicated taskqueues
* Skipping or parametrizing tests

Differential Revision: https://reviews.freebsd.org/D39385
MFC after:	2 weeks
2023-04-14 15:47:55 +00:00
Mikhail Pchelin
2f53b5991c net80211: fix a typo in Rx MCS set for unequal modulation case
RX MCS set defines which MCSs are supported for RX, bits 0-31 are for equal
modulation of the streams, bits 33-76 are for unequal case. Current code checks
txstreams variable instead of rxstreams to set bits from 53 to 76 for 4 spatial
streams case.

The modulations are defined in tables 19-38 and 19-41 of the IEEE Std
802.11-2020.

Spotted by bz in https://reviews.freebsd.org/D39476

Reviewed by:		bz
Approved by:		bz
Sponsored by:		Serenity Cybersecurity, LLC
Differential Revision:	https://reviews.freebsd.org/D39568
2023-04-14 18:20:09 +03:00
Mikhail Pchelin
ea26545cc5 net80211: wrong transmit MCS set in HT cap IE
Current code checks whether or not txstreams are equal to rxstreams and if it
isn't - sets needed bits in "Transmit MCS Set". But if they are equal it sets
whole set to zero, which contradicts the standard, if tx and rx streams are
equal 'Tx MCS Set Defined' (table 9-186, IEEE Std 802.11-2020) must be set to
one.

Reviewed by:		bz
Approved by:		bz
Sponsored by:		Serenity Cybersecurity, LLC
Differential Revision:	https://reviews.freebsd.org/D39476
2023-04-14 18:16:29 +03:00
Kyle Evans
d1b6271118 uart(4): add Sunrise Point UART controllers
Sponsored by:	Zenith Electronics LLC
Sponsored by:	Klara, Inc.
2023-04-14 09:58:00 -05:00
Elliott Mitchell
6d765bff6f xen: move common variables off of sys/x86/xen/hvm.c
The xen_domain_type and HYPERVISOR_shared_info variables are shared by
all Xen architectures, so they should be in common rather than
reimplemented by each architecture.

hvm_start_flags is used by xen_initial_domain() and so needs to be in
common.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D28982
2023-04-14 15:59:11 +02:00
Julien Grall
5e2183dab8 xen/intr: move sys/x86/xen/xen_intr.c to sys/dev/xen/bus/
The event channel source code or equivalent is needed on all
architectures.  Since much of this is viable to share, get this moved out
of x86-land.  Each interrupt interface then needs a distinct back-end
implementation.

Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2014-01-13 17:41:04
Differential Revision: https://reviews.freebsd.org/D30236
2023-04-14 15:58:57 +02:00
Elliott Mitchell
6699c22c1c xen/intr: move interrupt allocation/release to architecture
Simply moving the interrupt allocation and release functions into files
which belong to the architecture.  Since x86 interrupt handling is quite
distinct from other architectures, this is a crucial necessary step.

Identifying the border between x86 and architecture-independent is
actually quite tricky.  Similarly, getting the prototypes for the
border right is also quite tricky.

Inspired by the work of Julien Grall <julien@xen.org>,
2015-10-20 09:14:56, but heavily adjusted.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30936
2023-04-14 15:58:56 +02:00