Commit Graph

1358 Commits

Author SHA1 Message Date
Elliott Mitchell
09bd542d17 xen/intr: rework xen_intr_alloc_isrc() call structure
The call structure around xen_intr_alloc_isrc() was rather awful.
Notably finding a structure for reuse is part of allocation, but this
was done outside xen_intr_alloc_isrc().  Move this into
xen_intr_alloc_isrc() so the function handles all allocation steps.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30726
2023-04-14 15:58:49 +02:00
Elliott Mitchell
149c581018 xen/intr: adjust xenisrc types, adjust format strings to match
As "CPUs", IRQs (vector) and virtual IRQs are always positive integers,
adjust the Xen code to use unsigned integers.  Several format strings
need adjustment to match.  Additionally single-bit bitfields are
boolean.

No functional change expected.

Reviewed by: royger
2023-04-14 15:58:49 +02:00
Julien Grall
28a78d860e xen: introduce XEN_CPUID_TO_VCPUID()/XEN_VCPUID()
Part of the series for allowing FreeBSD/ARM to run on Xen.  On ARM the
function is a trivial pass-through, other architectures need distinct
implementations.

While implementing XEN_VCPUID() as a call to XEN_CPUID_TO_VCPUID()
works, that involves multiple accesses to the PCPU region.  As such make
this a distinct macro.  Only callers in machine independent code have
been switched.

Add a wrapper for the x86 PIC interface to use matching the old
prototype.

Partially inspired by the work of Julien Grall <julien@xen.org>,
2015-08-01 09:45:06, but XEN_VCPUID() was redone by Elliott Mitchell on
2022-06-13 12:51:57.

Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2014-04-19 08:57:40
Original implementation: Julien Grall <julien@xen.org>, 2014-04-19 14:32:01
Differential Revision: https://reviews.freebsd.org/D29404
2023-04-14 15:58:46 +02:00
Elliott Mitchell
054073c283 xen/intr: xen_intr_bind_isrc() always set handle
Previously the upper layer handle was being set before the last
potential error condition.  The reasoning appears to have been it was
assumed invalid in case of an error being returned.  Now ensure it is
invalid until just before a successful return.

Fixes: 76acc41fb7 ("Implement vector callback for PVHVM and unify event channel implementations")
Fixes: 6d54cab1fe ("xen: allow to register event channels without handlers")
Reviewed by: royger
2023-04-14 15:58:45 +02:00
Elliott Mitchell
b94341afcb xen/intr: rework xen_intr_resume() for in-place remapping
The prior implementation of xen_intr_resume() was wiping
xen_intr_port_to_isrc[] and then rebuilding from the x86 interrupt
table.  Rework to instead wipe the channel numbers (->xi_port) and then
scan the table for sources with invalid channels.

This will be slower due to scanning the whole table, but this removes
the dependency on the x86 interrupt code.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30599
[royger]
Split line over 80 characters.
2023-03-29 09:51:45 +02:00
Elliott Mitchell
e45c8ea31c xen/intr: merge parts of resume functionality into new function
The portions of xen_rebind_ipi() and xen_rebind_virq() were already
near-identical.  While xen_rebind_ipi() should panic() on
single-processor, still having the functionality to invoke seems
harmless.

Meanwhile much of the loop from xen_intr_resume() seemed to want to be
closer to this same code.  This pushes related bits closer together.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30598
2023-03-29 09:51:44 +02:00
Julien Grall
910bd069f8 xen/intr: remove x86 APIC headers from xen_intr.c
Remove these no longer needed headers.  Key for making xen_intr.c
machine-independent as they don't exist on other architectures.

Originally this was part of a much larger commit, but was broken off
for submission to the FreeBSD project.

Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2015-10-20 09:14:56
MFC after: 1 week
2023-03-29 09:51:43 +02:00
Elliott Mitchell
40ad9aaa88 xen/intr: stop passing shared_info_t to xen_intr_active_ports()
There is only a single global HYPERVISOR_shared_info pointer, so
directly use the global pointer.

Reviewed by: royger
MFC after: 1 week
2023-03-29 09:51:43 +02:00
Elliott Mitchell
9f3be3a6ec xen: switch to using core atomics for synchronization
Now that the atomic macros are always genuinely atomic on x86, they can
be used for synchronization with Xen.  A single core VM isn't too
unusual, but actual single core hardware is uncommon.

Replace an open-coding of evtchn_clear_port() with the inline.

Substantially inspired by work done by Julien Grall <julien@xen.org>,
2014-01-13 17:40:58.

Reviewed by: royger
MFC after: 1 week
2023-03-29 09:51:42 +02:00
Elliott Mitchell
49ca3167b7 xen/intr: add check for intr_register_source() errors
While unusual, intr_register_source() can return failure.  A likely
cause might be another device grabbing from Xen's interrupt range.
This should NOT happen, but could happen due to a bug.  As such check
for this and fail if it occurs.

This theoretical situation also effects xen_intr_find_unused_isrc().
There, .is_pic must be tested to ensure such an intrusion doesn't cause
misbehavior.

Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31995
2023-03-29 09:51:41 +02:00
Elliott Mitchell
1797ff9627 xen/intr: cleanup event channel number use
Consistently use ~0 instead of 0 when clearing xenisrc structures.
0 is a valid event channel number, even though it is reserved by Xen.
Whereas ~0 is guaranteed invalid.

Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
2023-03-29 09:51:41 +02:00
Elliott Mitchell
2b2415bafa xen/intr: fix corruption of event channel table
In xen_intr_release_isrc(), the isrc should only be removed if it is
assigned to a valid port.  This had been mitigated by using 0 for not
having a port, but this is actually corrupting the table.  Fix this bug
as modifying the code would cause this bug to manifest as kernel memory
corruption.  Similar issue for the vCPU bitmap masks.

The KASSERT() doesn't need lock protection.

Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
2023-03-29 09:51:40 +02:00
Elliott Mitchell
0ebf9bb42d xen/intr: fix overflow of Xen interrupt range
The comparison was wrong.  Hopefully this never occurred in the wild,
but now ensure the error message will occur before damage is caused.
This appears non-exploitable as exploitation would require a guest to
force Domain 0 to allocate all event channels, which a guest shouldn't
be able to do.

Adjust the error message to better describe what has occurred.

Reviewed by: royger
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D30743
2023-03-29 09:51:39 +02:00
Elliott Mitchell
2d5e325303 xen/intr: always set xi_close in xen_intr_bind_isrc()
Appears errors are uncommon since calling xen_intr_release_isrc() on a
xenisrc with xi_close in an undefined state could be bad.  Fix this
problematic lurking nasty.

Reviewed by: royger
MFC after: 1 week
2023-03-29 09:51:39 +02:00
Corvin Köhne
e8988d60d2
pci: expose intel_graphics_stolen as sysctl
The Intel graphics stolen memory is used by the Intel GOP driver on
boot. When using bhyve with GPU passthrough, it's also used by the guest
GOP driver at guest boot. For that reason, bhyve needs to know the
address and size of this region to inform the guest about this region.
Exposing the variables as sysctl allows bhyve to easily read them.
2023-03-27 11:40:49 +02:00
Brooks Davis
eb232cffc9 amd64: reduce header pollution in _stdint.h
In 38d1ac34ff SIGATOMIC_{MIN,MAX} were
defined in terms of LONG_{MIN,MAX}.  Later, they were switched to
__LONG_{MIN,MAX} in 78fe75bc28 where an
include of machine/_limits.h was added.  Switch to using fixed width
INT64_{MIN,MAX} and remove the header pollution.

No functional change.

Reviewed by:	theraven, emaste
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D39196
2023-03-22 16:23:57 +00:00
Jean-Sébastien Pédron
d780c6a6ab
x86/pci_early_quirks: Support Intel 11th+ gen
Newer Intel CPUs/iGPUs use a new method to determine the base address of
the stolen memory. This code was ported from Linux.

Reviewed by:	manu
Approved by:	manu
Differential Revision:	https://reviews.freebsd.org/D39057
2023-03-20 21:47:36 +01:00
Mitchell Horne
99bd5c1fe3 x86: nexus code tidy-up
Make a pass at the various nexus implementations, fixing some very minor
style issues, obsolete comments, etc.

The method declaration section has become unwieldy in many respects.
Attempt to tame it by:
 - Using generated method typedefs
 - Grouping methods roughly by category, and then alphabetically.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D38495
2023-03-20 17:35:47 -03:00
Roger Pau Monné
a78e46a7db xen: take struct size into account for video information
The xenpf_dom0_console_t structure can grow as more data is added, and
hence we need to check that the fields we accesses have been filled by
Xen.  The only extra field FreeBSD currently uses is the top 32 bits
for the frame buffer physical address.

Note that this field is present in all the versions that make the
information available from the platform hypercall interface, so the
check here is mostly cosmetic, and to remember us that newly added
fields require checking the size of the returned data.

Fixes: 6f80738b22 ('xen: fetch dom0 video console information from Xen')
Sponsored by: Citrix Systems R&D
2023-03-14 09:59:08 +01:00
Roger Pau Monné
6f80738b22 xen: fetch dom0 video console information from Xen
It's possible for Xen to switch the video mode set by the boot loader,
so that the information passed in the kernel metadata is no longer
valid.  Fetch the video mode used by Xen using an hypercall and update
the medatada for the kernel to use the correct video mode.

Sponsored by: Citrix Systems R&D
2023-03-09 17:13:17 +01:00
Roger Pau Monné
5489d7e93a xen: bump used interface version
This is required for a further change that will make use of a field
that was added in version 0x00040d00.

No functional change expected.

Sponsored by: Citrix Systems R&D
2023-03-09 17:13:17 +01:00
John-Mark Gurney
2fee875629
abstract out the vm detection via smbios..
This makes the detection of VMs common between platforms that
have SMBios.

Reviewed by:		imp, kib
Differential Revision:	https://reviews.freebsd.org/D38800
2023-03-02 16:54:21 -08:00
Mina Galić
499171a98c apic: prevent divide by zero in CPU frequency init
If a CPU for some reason returns 0 as CPU frequency, we currently panic
on the resulting divide by zero when trying to initialize the CPU(s) via
APIC. When this happens, we'll fallback to measuring the frequency
instead.

PR: 269767
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/664
2023-02-25 09:47:40 -07:00
Mateusz Guzik
f4a9e9fc79 x86: whack kernel gcov vestige
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-02-23 16:42:38 +00:00
Dmitry Chagin
496a9a4f60 linux(4): Cleanup includes under x86/linux
Cleanup unneeded includes, sort the rest according to style(9).
No functional changes.

MFC after:		2 weeks
2023-02-14 17:46:33 +03:00
Dmitry Chagin
10d16789a3 linux(4): Get rid of the opt_compat.h include.
Since e013e369 COMPAT_LINUX, COMPAT_LINUX32 build options are removed,
so include of opt_compat.h is no more needed.

MFC after:		2 weeks
2023-02-12 20:24:32 +03:00
Val Packett
4a1c4de232 Allow sysctl hw.machine/hw.machine_arch in capability mode
There's no harm in reading strings like 'amd64'.

Reviewed by: emaste, manu
Sponsored by: https://www.patreon.com/valpackett
Differential Revision: https://reviews.freebsd.org/D28703
2023-02-06 14:00:52 -05:00
Mark Johnston
2bed14192c pvclock: Export a vDSO page even without rdtscp available
When the cycle counter is "stable", i.e., synchronized across vCPUs by
the hypervisor, userspace can use a serialized rdtsc instead of relying
on rdtscp, just like the kernel timecounter does.  This can be useful
for performance in guests where the hypervisor hides rdtscp for some
reason.

To avoid breaking compatibility with older userspace which expects
rdtscp to be usable when pvclock exports timekeeping info, hide this
feature behind a sysctl.

Reviewed by:	kib
Tested by:	Shrikanth R Kamath <kshrikanth@juniper.net>
MFC after:	2 weeks
Sponsored by:	Klara, Inc.
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D38342
2023-02-03 11:48:25 -05:00
Dmitry Chagin
a95cb95e12 linux(4): Preserve fpu fxsave state across signal delivery on amd64.
PR:			240768
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D38302
MFC after:		1 week
2023-02-02 20:21:37 +03:00
Corvin Köhne
55f1ca209d
atrtc: expose power loss as sysctl
Exposing the a power loss of the rtc as an sysctl makes it easier to
detect an empty cmos battery.

Reviewed by:		manu
MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D38325
2023-02-02 08:25:08 +01:00
Konstantin Belousov
2555f175b3 Move kstack_contains() and GET_STACK_USAGE() to MD machine/stack.h
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38320
2023-02-02 00:59:26 +02:00
Dmitry Chagin
5c32146723 amd64: Eliminate write only cpu_fxsr.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D38289
MFC after:		1 week
2023-02-01 18:17:06 +03:00
Dmitry Chagin
290afc5d55 mp_x86: Trim trailing whitespaces.
MFC after:		1 week
2023-01-29 16:18:39 +03:00
Dmitry Chagin
6fdf04a2be smp: Drop confusing braces and return statement as panic() is never returns.
Reviewed by:		imp, kib
Differential Revision:	https://reviews.freebsd.org/D38235
MFC after:		1 week
2023-01-29 15:33:16 +03:00
Konstantin Belousov
11989314dc x86: add more definitions for XCR0 bits
This covers all currently defined bits, adding PKRU and TILE.

Reviewed by:	jhb, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38219
2023-01-27 19:44:49 +02:00
Corvin Köhne
122405c903
x86: ignore stepping for APL30 errata
The issue is present in all apollolake cpus and it doesn't look like
there'll be a fix in the future.

See
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-j-series-datasheet-spec-update.pdf

MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37621
2023-01-12 10:08:17 +01:00
Konstantin Belousov
45ac7755a7 amd64: identify small cores
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37770
2023-01-01 00:09:45 +02:00
Li-Wen Hsu
611c6b233d
Complete retire cp(4)
And fix the LINT build.

Sponsored by:	The FreeBSD Foundation

Fixes:	895992bb66
2022-12-14 11:38:55 +08:00
Mateusz Guzik
c3f1a13902 Retire broken GPROF support from the kernel
The option is not even recognized and with that patched it does not
compile. Even if it did work, it would be prohibitively expensive to
use.

Interested parties can use pmcstat or dtrace instead.
2022-11-15 14:17:10 +00:00
Warner Losh
1d21f64149 bhyve: Implement MSR_MISC_FEATURES_ENABLES
Linux reads MISC_FEATURES_ENABLES to manage the CPUID faulting feature
(undocumented in the Intel SDM, but documented in 323850-004 (Intel
Virtualization Technology FlexMigration Application Note). Since bhyve
doesn't emulate this feature, we always return 0. Neither does bhyve
support the MONITOR/MWAIT fault bit also in this MSR (which is
documented in the sdm), so always return 0.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D36602
2022-10-27 11:34:41 -06:00
Ed Maste
c8113dad7e Increase MAX_APIC_ID safeguard to 0x800
MAX_APIC_ID must be at least twice MAXCPU.  Increase it to 0x800 so that
it is possible to set MAXCPU to 512 or 1024 in a custom kernel config
file.

Note that increasing this limit does not itself cause any allocations
to be larger; it just allows madt_parse_cpu() to process higher APIC
IDs.

APIC IDs may be sparse and so we can waste memory.  This is independent
of this change, but becomes more of an issue as the maximum APIC ID
grows.  This should be addressed with future work.

Reviewed by:	royger
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37067
2022-10-27 12:33:34 -04:00
Konstantin Belousov
829145388b x86/include/elf.h: make inclusion blocks for elf32.h and elf64.h similar
They were copy-pasted when x86/include/elf.h file was merged from its
i386 and amd64 counterparts.  Having the text around inclusions
significantly different is somewhat confusing.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37085
2022-10-25 19:00:44 +03:00
Konstantin Belousov
5f00525dfc i386: move hard-coded load address for PIE below default linker base
both for i386 native and compat32 amd64.  We know the ld-elf.so.1 size
in advance, it fits there.  Trying to push it up after the end of a
binary cannot work reliably and eventually fail for large binaries.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37085
2022-10-25 19:00:44 +03:00
Colin Percival
b7761f1f08 x86/busdma: Limit reserved pages if low nsegs
When bus_dmamap_create is called, if bouncing might be required we
reserve enough pages for a maximum-length request, subject to the
MAX_BPAGES constraint (32 MB on amd64; 32 MB or 2 MB on i386
depending on the amount of RAM).

Since pages used for bouncing are typically non-consecutive, each
bounced page will typically constitute a busdma segment; as such, we
are unlikely to ever successfully use more pages than the nsegments
limit.  Limit the number of pages reserved to nsegments.

On FreeBSD/Firecracker, this reduces bounce page memory consumption
from 32 MB to 512 kB, making VMs with 128 MB of RAM usable.

Reviewed by:	imp, mav
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D37082
2022-10-21 22:47:33 -07:00
Colin Percival
13f34e211b PVH: Set bootmethod to PVH
Now that we can PVH boot on a non-Xen hypervisor, we shouldn't set
machdep.bootmethod to "XEN".  Instead, set it to "PVH"; there are
other ways to discern the hypervisor.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36191
2022-10-17 23:02:22 -07:00
Colin Percival
c4a4011c74 PVH: support whitespace cmdline splitting
For historical reasons, Xen kernel command lines have options
separated by commas.  Every other FreeBSD platform uses whitespace;
this is also necessary in PVH in order to support the Firecracker
VMM.  Allow options to be separated by any combination of commas
and whitespace.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36190
2022-10-17 23:02:22 -07:00
Colin Percival
a8ea154064 x86: Distinguish Xen from non-Xen PVH boots
The PVH boot protocol, introduced by Xen, is now used by some non-Xen
platforms (e.g. the Firecracker VM) as well.  In order to accommodate
these, we use CPUID to detect Xen and only perform Xen-specific setup
when running on that platform.

The "isxen" function duplicates some work done by identcpu.c later in
the boot process; but we need it here since this is the very first C
code which runs when PVH booting (even before hammer_time).

In many places the existing code had
        xc_printf(...);
        HYPERVISOR_shutdown(SHUTDOWN_crash);
making use of Xen functionality to print a message and shut down; in
the places where this idiom can be reached in the non-xen case, we
replace it idiom with a CRASH(...) macro which calls those in the Xen
case and halts in the non-Xen case.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35801
2022-10-17 23:02:22 -07:00
Colin Percival
023a025b5c x86: Add support for PVH version 1 memmap
Version 0 of PVH booting uses a Xen hypercall to retrieve the system
memory map; in version 1 the memory map can be provided via the
start_info structure.

Using the memory map from the version 1 start_info structure allows
FreeBSD to use PVH booting on systems other than Xen, e.g. on the
Firecracker VM.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35800
2022-10-17 23:02:22 -07:00
Colin Percival
d1ca8cc638 x86: Add MPTABLE_LINUX_BUG_COMPAT option
Linux has two bugs in its handling of the x86 MP table:
1. It assumes that there is always 640 kB of base memory, and looks for
the MP table in the top kB of this even if the memory map indicates
that memory location does not exist.
2. It ignores that entry_count field and instead iterates through the
MP table by scanning until it runs out of bytes in the table.

The Firecracker VM (and probably other related VMs) relies on both of
these bugs.  With the MPTABLE_LINUX_BUG_COMPAT option, we search for
the MP table at address 639k even if that isn't in the memory map; and
replace a zeroed entry_count with a value computed from scanning the
table until we run out of table bytes.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35799
2022-10-17 23:02:22 -07:00
Colin Percival
2297a1633d Add NO_LEGACY_PCIB kernel option to i386, amd64
On systems without a PCI bus, legacy_pcib_identify by default creates
one anyway:
    legacy_pcib_identify: no bridge found, adding pcib0 anyway

This commit adds a kernel option NO_LEGACY_PCIB which disables this,
allowing systems to be fully PCI-free.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35798
2022-10-17 23:02:22 -07:00