Commit Graph

281645 Commits

Author SHA1 Message Date
Justin Hibbits
68b47dcbe3 Finish mechanical conversion of axgbe(4) to IfAPI.
Sponsored by:	Juniper Networks, Inc.
2023-02-14 10:21:20 -05:00
Justin Hibbits
aac2d19d93 IfAPI: Style cleanup
Summary:
Clean up style issues from IfAPI additions.

Casts to (struct ifnet *) made sense when `if_t` was a `void *`, but
since it's a `struct ifnet *` it no longer makes sense.  Fix whitespace
errors, among others.

Reviewed by:	kib, glebius
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38499
2023-02-14 10:21:20 -05:00
Justin Hibbits
a3a76c3d90 IfAPI: Add capabilities2/capenable2 accessors
Summary:
As a stopgap measure add basic accessors for the if_capabilities2 and
if_capenable2 members to further hide the ifnet details.

Sponsored by:	Juniper Networks, Inc.
Reviewed by:	glebius, kib
Differential Revision: https://reviews.freebsd.org/D38487
2023-02-14 10:21:20 -05:00
Justin Hibbits
189c3729d8 IfAPI: More accessors
Summary:
Add the following accessors needed by infiniband drivers:
* if_getaddrlen()
* if_setbroadcastaddr()
* if_resolvemulti()

With these accessors, and additional changes on the drivers' side, an
amd64 kernel can be compiled with `struct ifnet` completely hidden.

Reviewed by:	melifaro
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38488
2023-02-14 10:21:19 -05:00
Justin Hibbits
8d5feede40 IfAPI: Finish changes of ice(4).
Reviewed by:	erj
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D38423
2023-02-14 10:21:19 -05:00
Justin Hibbits
e330262f34 Mechanically convert netmap(4) to IfAPI
Reviewed by:	vmaffione, zlei
Sponsored by:	Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D37814
2023-02-14 10:21:19 -05:00
Justin Hibbits
4366ea339d mlx4: Finish conversion to IfAPI
Fix a few stragglers found with further IfAPI work.

Sponsored by:	Juniper Networks, Inc.
2023-02-14 10:21:19 -05:00
Justin Hibbits
b29549c7f9 etherswitch: Fix leftovers from IfAPI conversion
Sponsored by:	Juniper Networks, Inc.
2023-02-14 10:21:18 -05:00
Mark Johnston
636b19ead4 tcp: Disallow re-connection of a connected socket
soconnectat() tries to ensure that one cannot connect a connected
socket.  However, the check is racy and does not really prevent two
threads from attempting to connect the same TCP socket.

Modify tcp_connect() and tcp6_connect() to perform the check again, this
time synchronized by the inpcb lock, under which we call
soisconnecting().

Reported by:	syzkaller
Reviewed by:	glebius
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Modirum MDPay
Differential Revision:	https://reviews.freebsd.org/D38507
2023-02-14 10:07:19 -05:00
Dmitry Chagin
825fbd087e linux(4): Trim unused opt_usb.h from modules Makefiles
MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
c8a79231a5 linux(4): Rename linux_timer.h to linux_time.h
To avoid confusing people, rename linux_timer.h to linux_time.h,
as linux_timer.c is the implementation of timer syscalls only,
while linux_time.c contains implementation of all stuff declared
in linux_time.h.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
55d3e181fc linux(4): Cleanup includes under arm64/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
496a9a4f60 linux(4): Cleanup includes under x86/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
2456a45929 linux(4): Cleanup includes under amd64/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
f4a512a5f1 linux(4): Cleanup includes under i386/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
d8e53d94fa linux(4): Cleanup includes under compat/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
e2028292e5 linux(4): Cleanup sys/sysctl.h from linux_misc.h
Leftover after c5156c77 (r374538).

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
77f66834b0 linux(4): Fix brackets of local include opt_inet6
MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
32fdc75fe7 linux(4): Move use_real_names knob to the linux.c
MI linux.[c|h] are the module independent in terms of the Linux emulation
layer (ie, intended for both ISA - 32 & 64 bit), analogue of MD linux.h.
There must be a code here that cannot be placed into the corresponding by
common sense MI source and header files, i.e., code is machine independent,
but ISA dependent.
For the use_real_names knob, the code must be placed into the
linux_socket.[c|h], however linux_socket is ISA dependent.

MFC after:		2 weeks
2023-02-14 17:46:32 +03:00
Dmitry Chagin
acbbd5c039 linux(4): Cleanup sys/uio.h where linux_uitl.h is included
MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
50c85a32d9 linux(4): Move uselib() to i386
This obsolete system call is not supported by glibc. In ancient libc
versions (before glibc 2.0), uselib() was used to load the shared
libraries with names found in an array of names in the binary.
On Linux, since 3.15, this system call is available only when
the kernel is configured with the CONFIG_USELIB option.

It doesn't look like anyone needs this syscall for others Linuxulators,
so move it to the corresponding MD Linuxulator.

MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
86f9efef2c linux(4): Cleanup abi_compat.h include from linux_timer.h
Leftover after timespec copyin/copyout routines was implemented.

MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
007986714c linux(4): Cleanup sys/queue.h from linux.h
Leftover after converting futexes to the umtx API.

NFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
513eb69edf linux(4): Cleanup sys/sysent.h from linux_util
Include sys/sysent.h directly where it needed. The linux_util.h included
in a most source files of the Linuxulator, avoid collecting a rarely used
includes here.

MFC after:		2 weeks
2023-02-14 17:46:31 +03:00
Dmitry Chagin
31e938c531 linux(4): Cleanup vm includes from linux_util.h
Include vm headers directly where they needed. The linux_util.h included
in a most source files of the Linuxulator, avoid collecting a rarely used
includes here.

MFC after:		2 weeks
2023-02-14 17:46:30 +03:00
Dmitry Chagin
81e7a80055 linux(4): Cleanup unneeded includes from linux_util.h
MFC after:		2 weeks
2023-02-14 17:46:30 +03:00
Stefan Eßer
f20058955c sys/kbio.h: make pre-unicode keymap support optional
FreeBSD-9 had introduced support for the full set of Unicode
characters to the parsing and processing of keymap character tables.

This support has been extended to cover the table for accented
characters that are reached via dead key combinations in FreeBSD-13.2.

New ioctls have been introduced to support both the pre-Unicode and
the Unicode formats and keyboard drivers have been extended to support
those ioctls.

This commit makes the ABI compatibility functions in the kernel
optional and dependent on COMPAT_FREEBSD13 in -CURRENT.

The kbdcontrol command in -CURRENT and 13-STABLE (before 13.2) has
been made ABI compatible with old kernels to allow a new world to be
run on an old kernel (that does not have full Unicode support for
keymaps).

This commit is not to merged back to 12-STABLE or 13-STABLE. It is
part of review D38465, which has been split into 3 separate commits
due to different MFC and life-time requirements of either commit.

Approved by:	imp
Differential Revision:	https://reviews.freebsd.org/D38465
2023-02-14 14:03:28 +01:00
Stefan Eßer
c2bb66023f kbdcontrol: enable pre-Unicode dead key table compatibility
The definition of pre-Unicode keymap ioctls will be made optional and
dependent on COMPAT_FREEBSD13 in a follow-up commit to 14-CURRENT.

While we generally provide ABI compatibility for older binaries on
a new kernel, but not functionally extended userland programs on an
old kernel, it has been specifically requested to preserve ABI
compatibility for the kbdcontrol program for both these cases.

Passing the kernel configuration option COMPAT_FREEBSD13 to the build
of kbdcontrol will make ioctls visible to the build that are normally
hidden, but required to implement compatibility with kernels that only
support 8 bit characters in dead key maps.

This commit is not to be merged to any previous FreeBSD version and
it shall be reverted as soon as this type of ABI compatibility is no
longer deemed necessary (probably before 14-STABLE is branched).

This commit is a part of review D38465 and split off to allow it to be
reverted using the commit ID.
2023-02-14 13:49:06 +01:00
Stefan Eßer
b4eab621f2 kbdcontrol.c: make pre-Unicode compatibility conditional
Support for the full range of Unicode character codes has been added
to the main keymap back in 2009, with compatibility shims added in
2011 (to support an older kbdcontrol command on a new kernel during
an upgrade from FreeBSD-8 to FreeBSD-9).

Unicode support for accented characters that are reached via dead key
combinations has been added just recently, again with compatibility
shims to allow all combinations of old/new kernel and old/new
kbdcontrol command to load and display the keymaps including the dead
key table. (But full Unicode in the dead key table requires both a new
kernel and kbdcontrol command.)

This commit makes the compatibility shims depend on the respective
compatibility ioctls (OGIO_KEYMAP, OPIO_KEYMAP, OGIO_DEADKEYMAP, and
OPIO_DEADKEYMAP) being defined in sys/kbio.h. This is true for all of
them in 13-STABLE, none in 12-STABLE (as of now), and will become
optional due to a follow-up commit to sys/kbio.h in -CURRENT.

This commit is the only part of review D38465 that should be merged
back to 12-STABLE and 13-STABLE.

MFC after:	1 month
2023-02-14 13:27:27 +01:00
Peter Holm
31a2528768 stress2: Add UFS+SU test scenario 2023-02-14 11:48:18 +01:00
Peter Holm
3f1f48f1d9 stress2: Add a regression test for D38549 2023-02-14 09:44:58 +01:00
Corvin Köhne
b11081dca7
bhyve: add emulation for qemu's fwcfg data port
The data port returns the data of the fwcfg item.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38333
2023-02-14 08:28:49 +01:00
Corvin Köhne
151d8131a8
bhyve: add emulation for the qemu fwcfg selector port
The selector port is used to select the desired fwcfg item.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38332
2023-02-14 08:28:43 +01:00
Corvin Köhne
9b99de77f1
bhyve: add basic qemu fwcfg implementation
qemu's fwcfg and bhyve's fwctl are both used to configure ovmf. qemu's
fwcfg is much more powerfull than bhyve's fwctl. For that reason, add
support for qemu's fwcfg.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38331
2023-02-14 08:28:37 +01:00
Corvin Köhne
fbd045021d
bhyve: maintain a list of acpi devices
The list is used to generate the dsdt entry for every acpi device.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D3830
2023-02-14 08:28:31 +01:00
Corvin Köhne
682a522d61
bhyve: add helper func to write a dsdt entry
The guest will check the dsdt to detect acpi devices. Therefore, add a
helper function to create such a dsdt entry for an acpi device.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38329
2023-02-14 08:28:27 +01:00
Corvin Köhne
13a1df5b85
bhyve: add helper func to add acpi resources
These helper function can be used to assign acpi resources to an
acpi_device.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38328
2023-02-14 08:28:21 +01:00
Corvin Köhne
1231f047c3
bhyve: add helper struct for acpi device handling
To simplify the handling of different acpi devices like qemu fwcfg or a
tpm, add a helper struct. It will handle the reporting of acpi
resources.

Reviewed by:		markj
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38327
2023-02-14 08:28:17 +01:00
Piotr Kubaj
8923de5905
ice(4): Update to 1.37.7-k
Notable changes include:

- DSCP QoS Support (leveraging support added in
  rG9c950139051298831ce19d01ea5fb33ec6ea7f89)
- Improved PFC handling and TC queue assignments (now all remaining
  queues are assigned to TC 0 when more than one TC is enabled and the
  number of available queues does not evenly divide between them)
- Support for dumping the internal FW state for additional debugging by
  Intel support
- Support for allowing "No FEC" to be a valid state for the LESM to
  negotiate when using non-standard compliant modules

Also includes various bug fixes and smaller enhancements, too.

Signed-off-by: Eric Joyner <erj@FreeBSD.org>

Reviewed by:	erj@
Tested by:	Jeff Pieper <jeffrey.pieper@intel.com>
MFC after:	3 days
Relnotes:	yes
Sponsored by:	Intel Corporation
Differential Revision:	https://reviews.freebsd.org/D38109
2023-02-13 17:29:44 -08:00
Rich Ercolani
cfd57573ff
quick fix for lingering snapdir unmount problems
Unfortunately, even after e79b6807, I still, much more rarely,
tripped asserts when playing with many ctldir mounts at once.

Since this appears to happen if we dispatched twice too fast, just
ignore it. We don't actually need to do anything if someone already
started doing it for us.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #14462
2023-02-13 16:40:13 -08:00
George Amanakis
34ce4c42ff
Fix a race condition in dsl_dataset_sync() when activating features
The zio returned from arc_write() in dmu_objset_sync() uses
zio_nowait(). However we may reach the end of dsl_dataset_sync()
which checks if we need to activate features in the filesystem
without knowing if that zio has even run through the ZIO pipeline yet.
In that case we will flag features to be activated in
dsl_dataset_block_born() but dsl_dataset_sync() has already
completed its run and those features will not actually be activated.
Mitigate this by moving the feature activation code in
dsl_dataset_sync_done(). Also add new ASSERTs in
dsl_scan_visitbp() checking if a block contradicts any filesystem
flags.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: George Amanakis <gamanakis@gmail.com>
Closes #13816
2023-02-13 16:37:46 -08:00
Prakash Surya
13312e2fa1
Reduce need for contiguous memory for ioctls
We've had cases where we trigger an OOM despite having memory freely
available on the system. For example, here, we had about 21GB free:

    kernel: Node 0 Normal: 2418758*4kB (UME) 1549533*8kB (UE) 0*16kB
    0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =
    22071296kB

The problem being, all the memory is in 4K and 8K contiguous regions,
but the allocation request was for a 16K contiguous region:

    kernel: SafeExecutors-4 invoked oom-killer:
    gfp_mask=0x42dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_COMP|__GFP_ZERO),
    order=2, oom_score_adj=0

The offending allocation came from this call trace:

    kernel: Call Trace:
    kernel:  dump_stack+0x57/0x7a
    kernel:  dump_header+0x4f/0x1e1
    kernel:  oom_kill_process.cold.33+0xb/0x10
    kernel:  out_of_memory+0x1ad/0x490
    kernel:  __alloc_pages_slowpath+0xd55/0xe40
    kernel:  __alloc_pages_nodemask+0x2df/0x330
    kernel:  kmalloc_large_node+0x42/0x90
    kernel:  __kmalloc_node+0x25a/0x320
    kernel:  ? spl_kmem_free_impl+0x21/0x30 [spl]
    kernel:  spl_kmem_alloc_impl+0xa5/0x100 [spl]
    kernel:  spl_kmem_zalloc+0x19/0x20 [spl]
    kernel:  zfsdev_ioctl+0x2b/0xe0 [zfs]
    kernel:  do_vfs_ioctl+0xa9/0x640
    kernel:  ? __audit_syscall_entry+0xdd/0x130
    kernel:  ksys_ioctl+0x67/0x90
    kernel:  __x64_sys_ioctl+0x1a/0x20
    kernel:  do_syscall_64+0x5e/0x200
    kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
    kernel: RIP: 0033:0x7fdca3674317

The problem is, for each ioctl that ZFS makes, it has to allocate a
zfs_cmd_t structure, which is 13744 bytes in size (on my system):

    sdb> sizeof zfs_cmd
    (size_t)13744

This size, coupled with the fact that we currently allocate it with
kmem_zalloc, means we need a 16K contiguous region of memory to satisfy
the request.

The solution taken by this change, is to use "vmem" instead of "kmem" to
do the allocation, such that we don't necessarily need a contiguous 16K
memory region to satisfy the allocation.

Arguably, a better solution would be not to require such a large
allocation to begin with (e.g. reduce the size of the zfs_cmd_t
structure), but that'd be a much larger change than this "one liner".
Thus, I've opted for this approach for now; we can always circle back
and attempt to reduce the size of the structure in the future.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Don Brady <don.brady@delphix.com>
Signed-off-by: Prakash Surya <prakash.surya@delphix.com>
Closes #14474
2023-02-13 16:35:59 -08:00
Konstantin Belousov
3a3450eda6 tmpfs_rename(): use tmpfs_access_locked instead of VOP_ACCESS()
Protect the call with the node lock. We cannot lock the fvp vnode
sleepable there, because we already own other participating vnode's
locks. Taking it without sleeping require unwinding the whole locking
state in one more place.

Note that the liveness of the node is guaranteed by the lock on the
parent directory vnode.

Reported and tested by:	pho
Fixes:	cbac1f3464956185cf95955344b6009e2cc3ae40ESC
Reviewed by:	markj, mjg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38557
2023-02-14 01:16:38 +02:00
Konstantin Belousov
adc3506d56 Extract tmpfs-specific part of tmpfs_access() into a helper
The helper tmpfs_access_locked() requires either the vnode or node
locked for consistency of the access check, unlike the pure vnode op.

Reviewed by:	markj, mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38557
2023-02-14 01:16:38 +02:00
Konstantin Belousov
889f074601 tmpfs_access(): style fixes and remove redundand assertions
Note that MPASS(VOP_ISLOCKED(vp)) is simply broken.

Reviewed by:	markj, mjg
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38557
2023-02-14 01:16:38 +02:00
Rick Macklem
9d329bbc9a nfsd: Continue adding macros so nfsd can run in a vnet prison
Commit 7344856e3a6d added a lot of macros that will front end
vnet macros so that nfsd(8) can run in vnet prison.
This patch adds some more of them and also a lot of uses of
nfsstatsv1_p instead of nfsstatsv1. nfsstatsv1_p points to
nfsstatsv1 for prison0, but will point to a malloc'd structure
for other prisons.

It also puts nfsstatsv1_p in nfscommon.ko instead of nfsd.ko.

MFC after:	3 months
2023-02-13 15:07:17 -08:00
Konstantin Belousov
0152d453a0 msdosfs deextend: validate pages of the partial buffer
Suppose that the cluster size is larger than page size. If the buffer
at the old EOF (before extending) was partial and dirty, it cannot be
automatically neither written out nor validated by the buffer cache,
since extending buffer adds invalid pages at the end.

Correct the buffer state by calling vfs_bio_clrbuf() on it, to mark
newly added and zeroed pages as valid.

Note that UFS is immune to the problem because ffs_truncate() always
allocate the block and buffer for the last byte of the file.

PR:	269341
Reported by:	asomers
In collaboration with:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38549
2023-02-14 00:29:42 +02:00
Konstantin Belousov
67dc1e7b04 msdosfs deextend(): memoize DETOV(dep)
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38549
2023-02-14 00:29:32 +02:00
Konstantin Belousov
e59180ea09 msdosfs: correct handling of vnode pager size on file extension error
If extension fails, vnode pager recorded size might be left increased.
Only update vnode pager when extension is past the point of no rollback.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38549
2023-02-14 00:29:19 +02:00
Konstantin Belousov
020e8a4d06 allocbuf(): convert direct panic() calls to KASSERT()s
Also do minor style adjustments.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38549
2023-02-14 00:28:42 +02:00