Commit Graph

278828 Commits

Author SHA1 Message Date
Takanori Watanabe
7b5d62bb73 ofw: add BUS_GET_DEVICE_PATH interface to openfirm/fdt, somewhat incomplete.
This adds the BUS_GET_DEVICE_PATH interface,
which shows the device tree of openfirm/fdt.

In qemu-system-arm64 with "virt" machine with device-tree firmware,
% devctl getpath OFW cpu0

Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D37031
2022-10-18 16:55:47 +09:00
Colin Percival
469ad86031 amd64: Add FIRECRACKER kernel configuration
This kernel configuration supports the Firecracker VMM environment.

Relnotes:	FreeBSD can now run inside the Firecracker VMM
		via the amd64 FIRECRACKER kernel configuration.
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36672
2022-10-17 23:02:22 -07:00
Colin Percival
13f34e211b PVH: Set bootmethod to PVH
Now that we can PVH boot on a non-Xen hypervisor, we shouldn't set
machdep.bootmethod to "XEN".  Instead, set it to "PVH"; there are
other ways to discern the hypervisor.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36191
2022-10-17 23:02:22 -07:00
Colin Percival
c4a4011c74 PVH: support whitespace cmdline splitting
For historical reasons, Xen kernel command lines have options
separated by commas.  Every other FreeBSD platform uses whitespace;
this is also necessary in PVH in order to support the Firecracker
VMM.  Allow options to be separated by any combination of commas
and whitespace.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36190
2022-10-17 23:02:22 -07:00
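
Editorial sketch (not part of commit c4a4011c74): the splitting rule described above, "any combination of commas and whitespace", can be illustrated in userland C roughly as follows. The function name split_cmdline and the delimiter set are assumptions made for illustration; the kernel code may be structured quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed delimiter set: commas plus common ASCII whitespace. */
static const char delims[] = ", \t\n\r";

/*
 * Split `cmdline` in place into at most `max` options and store pointers
 * to them in `argv`.  Runs of delimiters are treated as one separator.
 * Returns the number of options found.  (Hypothetical helper.)
 */
static size_t
split_cmdline(char *cmdline, char **argv, size_t max)
{
	size_t n = 0;
	char *p = cmdline;

	assert(cmdline != NULL && argv != NULL);
	while (n < max) {
		p += strspn(p, delims);		/* skip leading delimiters */
		if (*p == '\0')
			break;
		argv[n++] = p;			/* start of an option */
		p += strcspn(p, delims);	/* find the end of the option */
		if (*p == '\0')
			break;
		*p++ = '\0';			/* terminate it and move on */
	}
	return (n);
}
```

With this sketch, an input such as "a,b c,, d" yields the four options a, b, c and d, whether the separators are commas, whitespace, or a mix.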
Colin Percival
a8ea154064 x86: Distinguish Xen from non-Xen PVH boots
The PVH boot protocol, introduced by Xen, is now used by some non-Xen
platforms (e.g. the Firecracker VM) as well.  In order to accommodate
these, we use CPUID to detect Xen and only perform Xen-specific setup
when running on that platform.

The "isxen" function duplicates some work done by identcpu.c later in
the boot process; but we need it here since this is the very first C
code which runs when PVH booting (even before hammer_time).

In many places the existing code had
        xc_printf(...);
        HYPERVISOR_shutdown(SHUTDOWN_crash);
making use of Xen functionality to print a message and shut down; in
the places where this idiom can be reached in the non-Xen case, we
replace it with a CRASH(...) macro which calls those in the Xen
case and halts in the non-Xen case.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35801
2022-10-17 23:02:22 -07:00
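
Editorial sketch (not part of commit a8ea154064): hypervisors conventionally report a 12-byte vendor signature in EBX, ECX and EDX of a CPUID hypervisor leaf (starting at 0x40000000), and Xen's signature is "XenVMMXenVMM". The check below only compares already-fetched register values. The function name and the regs[] layout (EAX, EBX, ECX, EDX) are assumptions for illustration, and the commit's "isxen" function may differ, for example in which leaves it probes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Return nonzero if the CPUID result in regs[] (EAX, EBX, ECX, EDX)
 * carries the Xen hypervisor signature in EBX:ECX:EDX.
 * (Hypothetical helper; the CPUID instruction itself is not executed here.)
 */
static int
is_xen_signature(const uint32_t regs[4])
{
	assert(regs != NULL);
	/* The three registers hold the 12 signature bytes in memory order. */
	return (memcmp(&regs[1], "XenVMMXenVMM", 12) == 0);
}
```

A caller would fill regs[] from a CPUID query of a hypervisor leaf and take the Xen-specific path only when this returns nonzero.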
Colin Percival
023a025b5c x86: Add support for PVH version 1 memmap
Version 0 of PVH booting uses a Xen hypercall to retrieve the system
memory map; in version 1 the memory map can be provided via the
start_info structure.

Using the memory map from the version 1 start_info structure allows
FreeBSD to use PVH booting on systems other than Xen, e.g. on the
Firecracker VM.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35800
2022-10-17 23:02:22 -07:00
Colin Percival
d1ca8cc638 x86: Add MPTABLE_LINUX_BUG_COMPAT option
Linux has two bugs in its handling of the x86 MP table:
1. It assumes that there is always 640 kB of base memory, and looks for
the MP table in the top kB of this even if the memory map indicates
that memory location does not exist.
2. It ignores the entry_count field and instead iterates through the
MP table by scanning until it runs out of bytes in the table.

The Firecracker VM (and probably other related VMs) relies on both of
these bugs.  With the MPTABLE_LINUX_BUG_COMPAT option, we search for
the MP table at address 639k even if that isn't in the memory map; and
replace a zeroed entry_count with a value computed from scanning the
table until we run out of table bytes.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35799
2022-10-17 23:02:22 -07:00
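
Editorial sketch (not part of commit d1ca8cc638): computing an entry count "from scanning the table until we run out of table bytes" might look roughly like this. It relies on the MP specification's base table layout, where processor entries (type 0) are 20 bytes and bus, I/O APIC, I/O interrupt and local interrupt entries (types 1 to 4) are 8 bytes each. The function and macro names are hypothetical, and the kernel's actual implementation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	ENTRY_TYPE_PROCESSOR	0	/* 20-byte entry */
#define	ENTRY_TYPE_MAX		4	/* types 1..4 are 8-byte entries */

/*
 * Count the entries in the `len` bytes that follow the MP configuration
 * table header.  Scanning stops at an unknown entry type or when the
 * remaining bytes cannot hold a whole entry.  (Hypothetical helper.)
 */
static uint16_t
mptable_count_entries(const uint8_t *entries, size_t len)
{
	size_t off = 0;
	uint16_t count = 0;

	assert(entries != NULL || len == 0);
	while (off < len) {
		uint8_t type = entries[off];	/* first byte is the type */
		size_t sz;

		if (type == ENTRY_TYPE_PROCESSOR)
			sz = 20;
		else if (type <= ENTRY_TYPE_MAX)
			sz = 8;
		else
			break;			/* unknown type: stop */
		if (len - off < sz)
			break;			/* truncated entry: stop */
		off += sz;
		count++;
	}
	return (count);
}
```

The result would stand in for entry_count only when the table reports zero, as the commit message describes.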
Colin Percival
2297a1633d Add NO_LEGACY_PCIB kernel option to i386, amd64
On systems without a PCI bus, legacy_pcib_identify by default creates
one anyway:
    legacy_pcib_identify: no bridge found, adding pcib0 anyway

This commit adds a kernel option NO_LEGACY_PCIB which disables this,
allowing systems to be fully PCI-free.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35798
2022-10-17 23:02:22 -07:00
Colin Percival
c4b68e7e53 ns8250: Check if flush via FCR succeeded
The emulated UART in the Firecracker VMM (aka the implementation in the
rust-vmm/vm-superio project) includes FIFOs but does not implement the
FCR register, which is used by ns8250_flush to flush the FIFOs.

Check the LSR to see if there is still data in the FIFOs and call
ns8250_drain if necessary.

Discussed with:	emaste, imp, jrtc27
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36979
2022-10-17 23:02:21 -07:00
Colin Percival
782105f7c8 vtblk: Use busdma
We assume that bus addresses from busdma are the same thing as
"physical" addresses in the Virtio specification; this seems to
be true for the platforms on which Virtio is currently supported.

For block devices which are limited to a single data segment per
I/O, we request PAGE_SIZE alignment from busdma; this is necessary
in order to support unaligned I/Os from userland, which may cross
a boundary between two non-physically-contiguous pages.  On devices
which support more data segments per I/O, we retain the existing
behaviour of limiting I/Os to (max data segs - 1) * PAGE_SIZE.

Reviewed by:	bryanv
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36667
2022-10-17 23:02:21 -07:00
Colin Percival
3a8aff9d08 vtblk: Include pointer to softc in request
No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36666
2022-10-17 23:02:21 -07:00
Colin Percival
cc25cfc9cf vtblk: Requeue inside vtblk_request_execute
Most virtio_blk requests are launched from vtblk_startio; prior to this
commit, if vtblk_request_execute failed (e.g. due to a lack of space on
the virtio queue) vtblk_startio would requeue the request to be
reattempted later.

Add a flag "vbr_requeue_on_error" to requests and perform the requeuing
from inside vtblk_request_execute instead.

No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36665
2022-10-17 23:02:21 -07:00
Colin Percival
86f8f5ccb7 vtblk: Make vtblk_request_execute return void.
The error, if any, now gets stashed in the request structure.  (Step 1
of reworking this driver to use busdma.)

No functional change intended.

Reviewed by:	bryanv, imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36664
2022-10-17 23:02:21 -07:00
Colin Percival
0e1f5ab7db virtio_mmio: Support command-line parameters
The Virtio MMIO bus driver was added in 2014 with support for devices
exposed via FDT; in 2018 support was added to discover Virtio MMIO
devices via ACPI tables, as in QEMU.  The Firecracker VMM eschews both
FDT and ACPI, instead presenting device information via kernel command
line arguments of the form virtio_mmio.device=<parameters>.

These command line parameters get converted into kernel environment
variables; this adds support for parsing those variables and attaching
virtio_mmio children to nexus.

There is a case to be made that it would be cleaner to have a new
"cmdlinebus" attached to nexus and virtio_mmio children attached to
that.  A future commit might do that.

Discussed with:	imp, jrtc27
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36189
2022-10-17 23:02:21 -07:00
Colin Percival
c32bd97641 kern: Support duplicate variables in early kenv
Some virtual machines pass virtio MMIO device parameters via the kernel
command line as a series of virtio_mmio.device=<parameters> options.
These get translated into FreeBSD kernel environment variables; but
unfortunately they all use the same variable name, which resulted in
all but the first such parameter being ignored when the dynamic kernel
environment is set up from the initial environment buffers.

With this commit, duplicate environment settings will instead be stored
as ${name}_1, ${name}_2... ${name}_9999.  In the unlikely event that
the same variable is set over 10000 times before the dynamic kernel
environment is set up, we panic.

Variable settings after the dynamic environment is initialized continue
to override the previously-set value; the change is limited to the very
early kernel boot (prior to SI_SUB_KMEM + 1) and changes behaviour from
"ignore" to "store with a different name" only.

Reviewed by:	imp
Feedback from:	kevans
Sponsored by:	https://patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36187
2022-10-17 23:02:20 -07:00
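
Editorial sketch (not part of commit c32bd97641): the renaming scheme for duplicate early settings, ${name}_1 through ${name}_9999, can be modelled in userland C as below. The table, the function names env_get and env_set_early, and the size limits are all assumptions for illustration; where the kernel panics, this sketch simply returns -1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define	MAX_VARS	64	/* hypothetical table size */
#define	MAX_NAME	64	/* hypothetical name buffer size */
#define	MAX_DUPS	9999	/* highest suffix named in the commit message */

struct env_var {
	char name[MAX_NAME];
	const char *value;
};

static struct env_var env[MAX_VARS];
static size_t env_count;

/* Look up a variable; returns NULL if it is not set. */
static const char *
env_get(const char *name)
{
	for (size_t i = 0; i < env_count; i++)
		if (strcmp(env[i].name, name) == 0)
			return (env[i].value);
	return (NULL);
}

/*
 * Store name=value.  If `name` is already set, store the value under the
 * first unused name_1 .. name_9999 instead of dropping it.  Returns 0 on
 * success, -1 when no slot or suffix is left.
 */
static int
env_set_early(const char *name, const char *value)
{
	char candidate[MAX_NAME];

	assert(name != NULL && value != NULL);
	/* Reserve room for the longest suffix, "_9999", plus the NUL. */
	if (env_count == MAX_VARS || strlen(name) + sizeof("_9999") > MAX_NAME)
		return (-1);
	snprintf(candidate, sizeof(candidate), "%s", name);
	for (int i = 1; env_get(candidate) != NULL; i++) {
		if (i > MAX_DUPS)
			return (-1);
		snprintf(candidate, sizeof(candidate), "%s_%d", name, i);
	}
	strcpy(env[env_count].name, candidate);
	env[env_count].value = value;
	env_count++;
	return (0);
}
```

Setting virtio_mmio.device three times in this model leaves virtio_mmio.device, virtio_mmio.device_1 and virtio_mmio.device_2, each holding its own value.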
Gleb Smirnoff
d6eabdac2e dpaa2: fix build without WITNESS
Using mutex(9) requires including <sys/lock.h> per manual page.  With
WITNESS the header was cryptically included via dpaa_ni.h -> mbuf.h.
2022-10-17 22:38:40 -07:00
Gleb Smirnoff
2782ed8f6c dpaa2: fix standalone module build
2022-10-17 22:38:24 -07:00
Gleb Smirnoff
7fb975c8fb dpaa2: fix build without FDT
2022-10-17 22:38:02 -07:00
Eric Joyner
9c95013905 iflib: Introduce v2 of TX Queue Select Functionality
For v2, iflib will parse packet headers before queueing a packet.

This commit also adds a new field in the structure that holds parsed
header information from packets; it stores the IP ToS/traffic class
field found in the IPv4/IPv6 header.

To limit the cost, it only partially parses packet headers before queueing
them, using a new header parsing function that does less than the
current header parsing function; for our purposes we only need up to the
minimal IP header in order to get the IP ToS information and don't need
to pull up more data.

For now, v1 and v2 co-exist in this patch; v1 still offers a
less-invasive method where none of the packet is parsed in iflib before
queueing.

This also bumps the sys/param.h version.

Signed-off-by:	Eric Joyner <erj@FreeBSD.org>
Tested by:	IntelNetworking
MFC after:	3 days
Sponsored by:	Intel Corporation
Differential Revision: 	https://reviews.freebsd.org/D34742
2022-10-17 14:59:55 -07:00
Ed Maste
9f6097d6a6 linuxkpi: retire now-unused MIPS support
Reviewed by:	bz, manu
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37023
2022-10-17 16:31:20 -04:00
Gleb Smirnoff
a3da8329c5 carp: fix regression panic from ccd69bd573
Reported & tested by:	Oleg Ginzburg <olevole olevole.ru>
Fixes:			ccd69bd573
2022-10-17 11:39:40 -07:00
Ed Maste
101ba46bb6 libproc: retire now-unused MIPS support
Discussed with:	imp
2022-10-17 14:17:25 -04:00
Mitchell Horne
4a9b1a1463 getpagesize(3): cross-reference getpagesizes(3)
MFC after:	3 days
2022-10-17 15:16:12 -03:00
Ali Abdallah
ba4782022a ksched: correct return code for invalid priority
By convention, EINVAL is returned when validating arguments, not EPERM.
This matches the documented behaviour of sched_setscheduler(3), and that
of SCHED_OTHER.

PR:		227735
MFC after:	1 week
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D37021
2022-10-17 15:12:13 -03:00
Mitchell Horne
04620006b9 pthread_setschedparam(3): document EPERM return
In kern_sched_setparam(), before setting any parameters, p_cansched() is
called to check that the thread has appropriate privileges.

PR:		175687
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D37020
2022-10-17 15:12:12 -03:00
Mitchell Horne
1e6577831d config(5): drop mention of mips
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D37019
2022-10-17 15:12:12 -03:00
Ed Maste
9a86a3cd9b truss: remove now-unused special case for MIPS
Reviewed by:	mhorne, imp
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37022
2022-10-17 14:02:49 -04:00
Kenneth D. Merry
11778fca4a Fix mpr(4) panic during a firmware update.
Issue Description:
The RequestCredits field of IOCFacts changed between the Phase23
firmware and the Phase24 firmware. As part of a firmware update, the
driver therefore has to free the resources and pools created from the
Phase23 firmware's IOCFacts data (i.e. at driver load time) and
reallocate them using the Phase24 IOCFacts data. The driver freed the
interrupts but failed to reallocate them, so the config page read
operation timed out and the controller went into recursive reinit
(controller reset) operations, leading to a kernel panic.

Fix:
Reallocate the interrupts if the interrupts are disabled as part of
firmware update/downgrade operation.

Submitted by:	Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Tested by:	ken
MFC after:	3 days
2022-10-17 12:48:34 -04:00
Konstantin Belousov
ca2560bd85 rtld: fix typo in comment
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-10-17 17:10:03 +03:00
Gert Doering
2e797555f7 if_ovpn(4): implement ioctl() to set if_flags
Fully working openvpn(8) --iroute support needs real subnet config
on ovpn(4) interfaces (IFF_BROADCAST), while client-side/p2p
configs need IFF_POINTOPOINT setting.  So make this configurable.

Reviewed by:	kp
2022-10-17 15:33:45 +02:00
Alan Somers
3c3b906b54 fusefs: After successful F_GETLK, l_whence should be SEEK_SET
PR:		266886
Reported by:	John Millikin <jmillikin@gmail.com>
MFC after:	2 weeks
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D37014
2022-10-17 07:09:50 -06:00
liu-du
0ca740d9a6 xargs: fix exit code when using -P
Currently, when xargs runs in parallel mode (e.g. -P2), it sometimes
incorrectly returns a zero exit code.  This commit fixes that and also
adds tests.

Reviewed by:	mjg
PR:	267110
2022-10-17 10:39:04 +00:00
Kristof Provost
b136983a8a if_ovpn: fix use-after-free
ovpn_encrypt_tx_cb() calls ovpn_encap() to transmit a packet, then adds
the length of the packet to the "tunnel_bytes_sent" counter.  However,
after ovpn_encap() returns 0, the mbuf chain may have been freed, so the
load of m->m_pkthdr.len may be a use-after-free.

Reported by:	markj
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-10-17 09:24:41 +02:00
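
Editorial sketch (not part of commit b136983a8a): a common way to avoid this kind of use-after-free is to read the length before handing the buffer to a routine that may free it. The userland model below uses a hypothetical struct pkt in place of an mbuf and a hypothetical transmit() in place of ovpn_encap(); the actual fix in if_ovpn may be written differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a packet buffer. */
struct pkt {
	size_t len;
};

/* Hypothetical stand-in for a byte counter. */
static size_t bytes_sent;

/* Stand-in transmit routine: it consumes (frees) the packet. */
static int
transmit(struct pkt *p)
{
	free(p);
	return (0);
}

/*
 * Send a packet and account for its size.  The length is saved while the
 * caller still owns the packet, because `p` may already be freed by the
 * time transmit() returns.
 */
static int
send_and_count(struct pkt *p)
{
	size_t len;
	int error;

	assert(p != NULL);
	len = p->len;		/* read before ownership passes */
	error = transmit(p);	/* p must not be dereferenced after this */
	if (error == 0)
		bytes_sent += len;
	return (error);
}
```

The counter is updated from the saved local, so nothing reads through the pointer once the packet has been handed off.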
Cy Schubert
865f46b255 unbound: Reapply Vendor import 1.17.0
Reapply 643f9a0581. 64d318ea98 was a
mismerge during fake rebase. Let's reapply it.

Changes include: Added ACL per interface, proxy protocol and bug fixes.

Announcement:   https://nlnetlabs.nl/news/2022/Oct/13/unbound-1.17.0-released/

Merge commit '643f9a0581e8aac7eb790ced1164748939829826' into main
2022-10-16 14:08:33 -07:00
Cy Schubert
8cee2ebac5 Revert "unbound: Vendor import 1.17.0"
This reverts commit 64d318ea98, reversing
changes made to 8063dc0320.

Revert a mismerge which reversed 8063dc0320.
2022-10-16 13:42:15 -07:00
Cy Schubert
64d318ea98 unbound: Vendor import 1.17.0
Added ACL per interface, proxy protocol and bug fixes.

Announcement:   https://nlnetlabs.nl/news/2022/Oct/13/unbound-1.17.0-released/

Merge commit '643f9a0581e8aac7eb790ced1164748939829826' into new_merge
2022-10-16 13:32:55 -07:00
Rick Macklem
8063dc0320 nfsd: Make Setxattr/Removexattr NFSv4.2 ops IO_SYNC
When the NFS server does Setxattr or Removexattr, the
operations must be done IO_SYNC. If a server
crashes/reboots immediately after replying it must
have the extended attribute changes.

Since UFS does extended attributes asynchronously
by default and there is no "ioflag" argument in
the VOP calls, follow the VOP calls with VOP_FSYNC(),
to ensure the operation has been done synchronously.

This was found by inspection while investigating a
bug discovered during a recent IETF NFSv4 testing
event, where the Change attribute wasn't changed
in the operation reply.

This bug will take further work for ZFS and the
pNFS server configuration, but is now fixed for
a non-pNFS UFS exported file system.

MFC after:	1 month
2022-10-16 13:27:32 -07:00
Cy Schubert
643f9a0581 unbound: Vendor import 1.17.0
Added ACL per interface, proxy protocol and bug fixes.

Announcement:	https://nlnetlabs.nl/news/2022/Oct/13/unbound-1.17.0-released/
2022-10-16 12:24:20 -07:00
Ashish SHUKLA
e6901a29bc kvm_close(3): Check kd->sparse_map != NULL before munmap
PR:		266113
Reviewed by:	markj
2022-10-16 16:49:16 +00:00
Mitchell Horne
204a5f5800 sbuf(9): reference the correct function
This was most likely a copy-paste error.

PR:		262433
MFC after:	3 days
Reported by:	Boris Ivanovsky <bivanovsky@gmail.com>
Sponsored by:	The FreeBSD Foundation
2022-10-15 15:51:44 -03:00
Mitchell Horne
39888ed7a3 kern_intr: Check for NULL event in intr_destroy()
It likely won't happen, but is consistent with the other functions of
this KPI.

Reviewed by:	imp, jhb
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33479
2022-10-15 15:51:44 -03:00
Mitchell Horne
2af741fc53 intr_event(9): update copyright
To reflect my work on the rewrite, which is in-part sponsored by
the FreeBSD Foundation.

I have also included a copyright entry for trhodes@, who wrote the patch
beginning this rewrite in PR 100803.

Reviewed by:	imp, jhb, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D36935
2022-10-15 15:51:00 -03:00
Mitchell Horne
cb9425e21c intr_event(9): update existing function descriptions
Document new arguments and behaviours for these functions as compared to
the old ithread_* versions.

Reviewed by:	pauamma
Input from:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33478
2022-10-15 15:50:25 -03:00
Mitchell Horne
dfc91493ab intr_event(9): update top-level description
The ithread has been subsumed by the 'interrupt event' object, so
update the description to reflect this by describing an interrupt event
and its contents. We've also moved on from having a single handler
function to the split filter-and-handler model. Explain the purpose and
constraints of these two types of handlers.

Reviewed by:	jhb, pauamma
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33477
2022-10-15 15:50:25 -03:00
Mitchell Horne
0cec1648b4 intr_event(9): update the example of swi_add()
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33476
2022-10-15 15:50:25 -03:00
Mitchell Horne
3cdbaee354 ithread(9): update functions to current day
The public KPI is now intr_event_*:
 - Convert existing documented functions to their equivalents.
 - Fix up the function arguments
 - Fix up the possible error return values for each
 - Remove ithread_schedule() completely
 - Rename man page to intr_event(9)
 - Update cross-references

Future changes will update the descriptive text for these functions.

PR:		100803
Based on work by: trhodes
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33475
2022-10-15 15:49:33 -03:00
Rick Macklem
7d9dc91a99 nfscl: Fix the NFSv4.0 mount so that it does not crash
Commit efe58855f3 modifies IN_LOOPBACK() so that it uses a VNET
variable. Without this patch, nfscl_getmyip() uses IN_LOOPBACK()
when the VNET is not set and crashes the system.
nfscl_getmyip() is only called when a NFSv4.0 (not NFSv4.1/4.2)
mount is done.

This patch re-organizes nfscl_getmyip() so that IN_LOOPBACK()
is before the CURVNET_RESTORE() macro, to avoid the crashes.

Reviewed by:	karels, zlei.huang_gmail.com
Differential Revision:	https://reviews.freebsd.org/D37008
2022-10-15 08:38:07 -07:00
Zhenlei Huang
43f8c763cd if_me: Use dedicated network privilege
Separate if_me privileges from if_gif.

Reviewed by:		kp
Differential Revision:	https://reviews.freebsd.org/D36691
2022-10-15 17:05:36 +02:00
Kristof Provost
b37707bb39 pf: fix LINT-NOINET6 build
2022-10-15 10:02:35 +02:00
Rick Macklem
82512c17ea clnt_vc.c: Replace msleep() with pause() to avoid assert panic
An msleep() in clnt_vc.c used a global "fake_wchan" wchan argument
along with the mutex in a CLIENT structure.  As such, it was
possible to use different mutexes for the same wchan and
cause a panic assert.  Since this is in a rarely executed code
path, the assert panic was only recently observed.

Since "fake_wchan" never gets a wakeup, this msleep() can
be replaced with a pause() to avoid the panic assert,
which is what this patch does.

Reviewed by:	kib, markj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D36977
2022-10-14 15:46:55 -07:00