1328 Commits

Author SHA1 Message Date
Konstantin Belousov
2555f175b3 Move kstack_contains() and GET_STACK_USAGE() to MD machine/stack.h
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38320
2023-02-02 00:59:26 +02:00
Dmitry Chagin
5c32146723 amd64: Eliminate write only cpu_fxsr.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D38289
MFC after:		1 week
2023-02-01 18:17:06 +03:00
Dmitry Chagin
290afc5d55 mp_x86: Trim trailing whitespaces.
MFC after:		1 week
2023-01-29 16:18:39 +03:00
Dmitry Chagin
6fdf04a2be smp: Drop confusing braces and return statement as panic() never returns.
Reviewed by:		imp, kib
Differential Revision:	https://reviews.freebsd.org/D38235
MFC after:		1 week
2023-01-29 15:33:16 +03:00
Konstantin Belousov
11989314dc x86: add more definitions for XCR0 bits
This covers all currently defined bits, adding PKRU and TILE.

Reviewed by:	jhb, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38219
2023-01-27 19:44:49 +02:00
Corvin Köhne
122405c903 x86: ignore stepping for APL30 errata
The issue is present in all apollolake cpus and it doesn't look like
there'll be a fix in the future.

See
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/pentium-celeron-n-series-j-series-datasheet-spec-update.pdf

MFC after:		1 week
Sponsored by:		Beckhoff Automation GmbH & Co. KG
Differential Revision: https://reviews.freebsd.org/D37621
2023-01-12 10:08:17 +01:00
Konstantin Belousov
45ac7755a7 amd64: identify small cores
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37770
2023-01-01 00:09:45 +02:00
Li-Wen Hsu
611c6b233d Completely retire cp(4)
And fix the LINT build.

Sponsored by:	The FreeBSD Foundation

Fixes:	895992bb66
2022-12-14 11:38:55 +08:00
Mateusz Guzik
c3f1a13902 Retire broken GPROF support from the kernel
The option is not even recognized and with that patched it does not
compile. Even if it did work, it would be prohibitively expensive to
use.

Interested parties can use pmcstat or dtrace instead.
2022-11-15 14:17:10 +00:00
Warner Losh
1d21f64149 bhyve: Implement MSR_MISC_FEATURES_ENABLES
Linux reads MISC_FEATURES_ENABLES to manage the CPUID faulting feature
(undocumented in the Intel SDM, but documented in 323850-004, the Intel
Virtualization Technology FlexMigration Application Note). Since bhyve
doesn't emulate this feature, we always return 0. bhyve also does not
support the MONITOR/MWAIT fault bit in this MSR (which is documented
in the SDM), so we always return 0 for it as well.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D36602
2022-10-27 11:34:41 -06:00
Ed Maste
c8113dad7e Increase MAX_APIC_ID safeguard to 0x800
MAX_APIC_ID must be at least twice MAXCPU.  Increase it to 0x800 so that
it is possible to set MAXCPU to 512 or 1024 in a custom kernel config
file.

Note that increasing this limit does not itself cause any allocations
to be larger; it just allows madt_parse_cpu() to process higher APIC
IDs.

APIC IDs may be sparse and so we can waste memory.  This is independent
of this change, but becomes more of an issue as the maximum APIC ID
grows.  This should be addressed with future work.

Reviewed by:	royger
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37067
2022-10-27 12:33:34 -04:00
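The constraint stated in c8113dad7e (MAX_APIC_ID must be at least twice MAXCPU) can be sketched as a compile-time and a run-time check. This is only an illustration: the SKETCH_ names and apic_limit_ok() are hypothetical and do not come from the kernel sources.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; a custom kernel config would set the real MAXCPU. */
#define SKETCH_MAXCPU		1024
#define SKETCH_MAX_APIC_ID	0x800

/* Compile-time form of the rule from the commit message. */
_Static_assert(SKETCH_MAX_APIC_ID >= 2 * SKETCH_MAXCPU,
    "MAX_APIC_ID must be at least twice MAXCPU");

/* Run-time form of the same rule, usable with arbitrary values. */
static bool
apic_limit_ok(unsigned int max_apic_id, unsigned int maxcpu)
{
	return (max_apic_id >= 2 * maxcpu);
}
```

With the limit at 0x800 (2048), both MAXCPU values named in the message, 512 and 1024, satisfy the rule.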
Konstantin Belousov
829145388b x86/include/elf.h: make inclusion blocks for elf32.h and elf64.h similar
They were copy-pasted when x86/include/elf.h file was merged from its
i386 and amd64 counterparts.  Having the text around inclusions
significantly different is somewhat confusing.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37085
2022-10-25 19:00:44 +03:00
Konstantin Belousov
5f00525dfc i386: move hard-coded load address for PIE below default linker base
both for i386 native and compat32 amd64.  We know the ld-elf.so.1 size
in advance, it fits there.  Trying to push it up after the end of a
binary cannot work reliably and eventually fails for large binaries.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37085
2022-10-25 19:00:44 +03:00
Colin Percival
b7761f1f08 x86/busdma: Limit reserved pages if low nsegs
When bus_dmamap_create is called, if bouncing might be required we
reserve enough pages for a maximum-length request, subject to the
MAX_BPAGES constraint (32 MB on amd64; 32 MB or 2 MB on i386
depending on the amount of RAM).

Since pages used for bouncing are typically non-consecutive, each
bounced page will typically constitute a busdma segment; as such, we
are unlikely to ever successfully use more pages than the nsegments
limit.  Limit the number of pages reserved to nsegments.

On FreeBSD/Firecracker, this reduces bounce page memory consumption
from 32 MB to 512 kB, making VMs with 128 MB of RAM usable.

Reviewed by:	imp, mav
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D37082
2022-10-21 22:47:33 -07:00
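The clamping described in b7761f1f08 can be sketched roughly as follows. This is not the busdma code itself: bounce_pages_to_reserve(), the SKETCH_ constants, and the assumption of 4 KB pages are all illustrative.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed 4 KB pages */
#define SKETCH_MAX_BPAGES	8192u	/* 32 MB expressed in 4 KB pages */

static size_t
min_size(size_t a, size_t b)
{
	return (a < b ? a : b);
}

/*
 * Number of bounce pages to reserve for a tag with the given maximum
 * request size and segment limit (hypothetical helper).
 */
static size_t
bounce_pages_to_reserve(size_t maxsize, size_t nsegments)
{
	/* Pages needed for a maximum-length request, rounded up. */
	size_t pages = (maxsize + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

	pages = min_size(pages, SKETCH_MAX_BPAGES);	/* existing cap */
	pages = min_size(pages, nsegments);		/* cap added by the change */
	if (pages == 0)
		pages = 1;				/* always reserve something */
	return (pages);
}
```

Under these assumptions, a hypothetical tag with a 32 MB maximum size and 128 segments would reserve 128 pages, which is 512 kB instead of 32 MB.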
Colin Percival
13f34e211b PVH: Set bootmethod to PVH
Now that we can PVH boot on a non-Xen hypervisor, we shouldn't set
machdep.bootmethod to "XEN".  Instead, set it to "PVH"; there are
other ways to discern the hypervisor.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36191
2022-10-17 23:02:22 -07:00
Colin Percival
c4a4011c74 PVH: support whitespace cmdline splitting
For historical reasons, Xen kernel command lines have options
separated by commas.  Every other FreeBSD platform uses whitespace;
this is also necessary in PVH in order to support the Firecracker
VMM.  Allow options to be separated by any combination of commas
and whitespace.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36190
2022-10-17 23:02:22 -07:00
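The splitting rule from c4a4011c74 (options separated by any combination of commas and whitespace) might look like the sketch below. split_cmdline() and sketch_seps are hypothetical names; the actual kernel parser is not shown in this log and may be structured differently.

```c
#include <stddef.h>
#include <string.h>

/* Separators accepted between options: commas and whitespace. */
static const char sketch_seps[] = ", \t\r\n";

/*
 * Split cmdline in place, storing up to max option pointers in argv.
 * Returns the number of options stored.
 */
static size_t
split_cmdline(char *cmdline, char **argv, size_t max)
{
	size_t n = 0;
	char *p = cmdline;

	while (n < max) {
		p += strspn(p, sketch_seps);	/* skip a run of separators */
		if (*p == '\0')
			break;
		argv[n++] = p;
		p += strcspn(p, sketch_seps);	/* advance to end of option */
		if (*p == '\0')
			break;
		*p++ = '\0';			/* terminate this option */
	}
	return (n);
}
```

Because runs of separators are skipped as a unit, both a Xen-style line such as "-s,-v" and a whitespace-separated line such as "-s -v" produce the same two options.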
Colin Percival
a8ea154064 x86: Distinguish Xen from non-Xen PVH boots
The PVH boot protocol, introduced by Xen, is now used by some non-Xen
platforms (e.g. the Firecracker VM) as well.  In order to accommodate
these, we use CPUID to detect Xen and only perform Xen-specific setup
when running on that platform.

The "isxen" function duplicates some work done by identcpu.c later in
the boot process; but we need it here since this is the very first C
code which runs when PVH booting (even before hammer_time).

In many places the existing code had
        xc_printf(...);
        HYPERVISOR_shutdown(SHUTDOWN_crash);
making use of Xen functionality to print a message and shut down; in
the places where this idiom can be reached in the non-Xen case, we
replace the idiom with a CRASH(...) macro which calls those in the Xen
case and halts in the non-Xen case.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35801
2022-10-17 23:02:22 -07:00
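The CPUID-based detection mentioned in a8ea154064 can be illustrated by comparing the hypervisor signature registers against the "XenVMMXenVMM" string that Xen reports. The sketch takes the register values as parameters so that it runs anywhere; sig_is_xen() and the SKETCH_ names are hypothetical, and it is an assumption here that the caller obtained EBX, ECX and EDX from a hypervisor CPUID leaf (base 0x40000000).

```c
#include <stdbool.h>
#include <stdint.h>

/* "XenVMMXenVMM" as little-endian ASCII in EBX, ECX, EDX. */
#define SKETCH_XEN_SIG_EBX	0x566e6558u	/* "XenV" */
#define SKETCH_XEN_SIG_ECX	0x65584d4du	/* "MMXe" */
#define SKETCH_XEN_SIG_EDX	0x4d4d566eu	/* "nVMM" */

/* Returns true if the three signature registers spell the Xen signature. */
static bool
sig_is_xen(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	return (ebx == SKETCH_XEN_SIG_EBX &&
	    ecx == SKETCH_XEN_SIG_ECX &&
	    edx == SKETCH_XEN_SIG_EDX);
}
```

A boot path in this style would run the Xen-specific setup only when the check succeeds, and fall back to a plain halt otherwise.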
Colin Percival
023a025b5c x86: Add support for PVH version 1 memmap
Version 0 of PVH booting uses a Xen hypercall to retrieve the system
memory map; in version 1 the memory map can be provided via the
start_info structure.

Using the memory map from the version 1 start_info structure allows
FreeBSD to use PVH booting on systems other than Xen, e.g. on the
Firecracker VM.

Reviewed by:	royger
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35800
2022-10-17 23:02:22 -07:00
Colin Percival
d1ca8cc638 x86: Add MPTABLE_LINUX_BUG_COMPAT option
Linux has two bugs in its handling of the x86 MP table:
1. It assumes that there is always 640 kB of base memory, and looks for
the MP table in the top kB of this even if the memory map indicates
that memory location does not exist.
2. It ignores the entry_count field and instead iterates through the
MP table by scanning until it runs out of bytes in the table.

The Firecracker VM (and probably other related VMs) relies on both of
these bugs.  With the MPTABLE_LINUX_BUG_COMPAT option, we search for
the MP table at address 639k even if that isn't in the memory map; and
replace a zeroed entry_count with a value computed from scanning the
table until we run out of table bytes.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35799
2022-10-17 23:02:22 -07:00
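Recomputing a zeroed entry_count by scanning the table, as d1ca8cc638 describes, could be sketched like this. The entry sizes follow the MP specification (processor entries are 20 bytes, the other base entry types are 8 bytes); count_mp_entries() is a hypothetical helper, not the function added by the commit.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MPCT_PROC	0	/* processor entry type */
#define SKETCH_MPCT_LAST	4	/* highest base entry type */

/*
 * Count the entries in the len bytes that follow the MP configuration
 * table header.  The first byte of each entry is its type.
 */
static size_t
count_mp_entries(const uint8_t *entries, size_t len)
{
	size_t off = 0, count = 0;

	while (off < len) {
		uint8_t type = entries[off];
		size_t sz;

		if (type > SKETCH_MPCT_LAST)
			break;			/* unknown type: stop scanning */
		sz = (type == SKETCH_MPCT_PROC) ? 20 : 8;
		if (len - off < sz)
			break;			/* truncated entry */
		off += sz;
		count++;
	}
	return (count);
}
```

The scan stops when the table bytes run out, which mirrors the Linux behaviour the option is meant to be compatible with.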
Colin Percival
2297a1633d Add NO_LEGACY_PCIB kernel option to i386, amd64
On systems without a PCI bus, legacy_pcib_identify by default creates
one anyway:
    legacy_pcib_identify: no bridge found, adding pcib0 anyway

This commit adds a kernel option NO_LEGACY_PCIB which disables this,
allowing systems to be fully PCI-free.

Reviewed by:	imp
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35798
2022-10-17 23:02:22 -07:00
John Baldwin
a9fca3b987 Fix various places which cast a pointer to a vm_paddr_t or vice versa.
GCC warns about the mismatched sizes on i386 where vm_paddr_t is 64
bits.

Reviewed by:	imp, markj
Differential Revision:	https://reviews.freebsd.org/D36750
2022-10-03 16:10:41 -07:00
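One common way to avoid the size-mismatch warning described in a9fca3b987 is to cast through uintptr_t, which always has pointer width. The sketch below uses a hypothetical sketch_paddr_t in place of vm_paddr_t and shows only the cast pattern; whether the commit uses exactly this form at each site is an assumption.

```c
#include <stdint.h>

/* Stands in for a 64-bit vm_paddr_t, as on i386. */
typedef uint64_t sketch_paddr_t;

/* Pointer to address type: narrow or widen via uintptr_t first. */
static sketch_paddr_t
ptr_to_paddr(const void *p)
{
	return ((sketch_paddr_t)(uintptr_t)p);
}

/* Address type to pointer: go back through uintptr_t. */
static void *
paddr_to_ptr(sketch_paddr_t pa)
{
	return ((void *)(uintptr_t)pa);
}
```

A direct cast between a 32-bit pointer and a 64-bit integer is what GCC flags; the intermediate uintptr_t makes the width change explicit.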
John Baldwin
f49fd63a6a kmem_malloc/free: Use void * instead of vm_offset_t for kernel pointers.
Reviewed by:	kib, markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D36549
2022-09-22 15:09:19 -07:00
John Baldwin
7ae99f80b6 pmap_unmapdev/bios: Accept a pointer instead of a vm_offset_t.
This matches the return type of pmap_mapdev/bios.

Reviewed by:	kib, markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D36548
2022-09-22 15:08:52 -07:00
Konstantin Belousov
fd25c62278 i386: check that trap() and syscall() run on the thread kstack
and not on the trampoline stack.  This is a useful way to ensure that
we did not enable interrupts while on user %cr3 or trampoline stack.

Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-09-14 18:46:32 +03:00
Gordon Bergling
9755e244c9 x86: Correct a typo in source code comment
- s/occured/occurred/

MFC after:	3 days
2022-09-04 13:36:53 +02:00
Colin Percival
02ab915ae0 lapic_init: Reduce LOOPS
While I'm here, instrument lapic_init with TSLOG so it shows up (or
typically not, after this change) on flamecharts.

Reviewed by:	kib
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D36186
2022-08-13 15:28:09 -07:00
Mateusz Guzik
648edd6378 x86: remove MP_WATCHDOG
It does not work with ULE, which has been the default scheduler for
over a decade.

Reviewed by:	emaste, kib
Differential Revision:	https://reviews.freebsd.org/D36094
2022-08-11 21:35:32 +00:00
Emmanuel Vadot
821b850a3b x86: Remove redundant parentheses
Reported by:	avg
Sponsored by:	Beckhoff Automation GmbH & Co. KG
MFC after:	1 week
MFC-With:	b223c1f1a0 ("x86: Add another cpuid for Apollo Lake errata APL30")
2022-08-09 09:46:50 +02:00
Corvin Köhne
b223c1f1a0 x86: Add another cpuid for Apollo Lake errata APL30
Sponsored by:	Beckhoff Automation GmbH & Co. KG
MFC after:	1 week
2022-08-09 09:07:59 +02:00
Alan Cox
7f46deccbe x86/iommu: Reduce the number of queued invalidation interrupts
Restructure dmar_qi_task() so as to reduce the number of invalidation
completion interrupts.  Specifically, because processing completed
invalidations in dmar_qi_task() can take quite some time, don't reenable
completion interrupts until processing has completed a first time. Then,
check a second time after reenabling completion interrupts, so that
any invalidations that complete just before interrupts are reenabled
do not linger until a future invalidation might raise an interrupt.
(Recent changes have made checking for completed invalidations cheap; no
locking is required.)

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D36054
2022-08-06 13:05:58 -05:00
Alexander Motin
ac64943ca8 mca: Add sysctl to mute corrected errors.
Setting hw.mca.log_corrected to 0 will mute logging of corrected errors,
except those marked by hardware as reaching the Yellow threshold.

MFC after:	1 week
2022-08-05 13:48:05 -04:00
Alan Cox
4670f90846 iommu_gas: Eliminate redundant parameters and push down lock acquisition
Since IOMMU map entries store a reference to the domain in which they
reside, there is no need to pass the domain to iommu_gas_free_entry(),
iommu_gas_free_space(), and iommu_gas_free_region().

Push down the acquisition and release of the IOMMU domain lock into
iommu_gas_free_space() and iommu_gas_free_region().

Both of these changes allow for simplifications in the callers of the
functions without really complicating the functions themselves.
Moreover, the latter change eliminates the direct use of the IOMMU
domain lock from the x86-specific DMAR code.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D35995
2022-07-30 14:28:48 -05:00
Alan Cox
42736dc44d x86/iommu: Reduce DMAR lock contention
Replace the DMAR unit's tlb_flush TAILQ by a custom list implementation
that enables dmar_qi_task() to dequeue entries without holding the DMAR
lock.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D35951
2022-07-29 00:11:33 -05:00
Alan Cox
c251563470 x86/iommu: Correct a recent change to iommu_domain_unload_entry()
Correct 8bc3673847.  When iommu_domain_unload_entry() performs a
synchronous IOTLB invalidation, it must call dmar_domain_free_entry()
to remove the entry from the domain's RB_TREE.

Push down the acquisition and release of the DMAR lock into the
recently introduced function dmar_qi_invalidate_sync_locked() and
remove the _locked suffix.

MFC with:	8bc3673847
2022-07-26 01:07:21 -05:00
Alan Cox
8bc3673847 iommu_gas: Eliminate a possible case of use-after-free
Eliminate a possible case of use-after-free in an error handling path
after a mapping failure.  Specifically, eliminate IOMMU_MAP_ENTRY_QI_NF
and instead perform the IOTLB invalidation synchronously.  Otherwise,
when iommu_domain_unload_entry() is called and told not to free the
IOMMU map entry, the caller could free the entry before dmar_qi_task()
is finished with it.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D35878
2022-07-25 11:14:58 -05:00
Dimitry Andric
eadef926b0 Adjust linux_vdso_{cpu,tsc}_selector_idx() definitions to avoid clang 15 warnings
With clang 15, the following -Werror warnings are produced:

    sys/x86/linux/linux_vdso_selector_x86.c:44:28: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    linux_vdso_tsc_selector_idx()
                               ^
                                void
    sys/x86/linux/linux_vdso_selector_x86.c:62:28: error: a function declaration without a prototype is deprecated in all versions of C [-Werror,-Wstrict-prototypes]
    linux_vdso_cpu_selector_idx()
                               ^
                                void

This is because linux_vdso_tsc_selector_idx() and
linux_vdso_cpu_selector_idx() are declared with (void) argument lists, but
defined with empty argument lists. Make the definitions match the
declarations.

MFC after:	3 days
2022-07-25 00:40:13 +02:00
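The mismatch fixed in eadef926b0 can be reduced to a few lines. sketch_selector_idx() is a hypothetical function used only to show a declaration and a definition that agree on a (void) parameter list.

```c
/* Declaration with an explicit (void) parameter list. */
int sketch_selector_idx(void);

/*
 * Definition that matches the declaration.  Writing the definition as
 * "int sketch_selector_idx()" instead is what -Wstrict-prototypes
 * complains about.
 */
int
sketch_selector_idx(void)
{
	return (0);
}
```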
Alan Cox
4eaaacc755 x86/iommu: Shrink the critical section in dmar_qi_task()
It is safe to test and clear the Invalidation Wait Descriptor
Complete flag before acquiring the DMAR lock in dmar_qi_task(),
rather than waiting until the lock is held.

Reviewed by:	kib
MFC after:	2 weeks
2022-07-18 22:23:13 -05:00
Colin Percival
05350f0936 x86: Remove 1 second DELAY from cpu_reset
On SMP systems, cpu_reset broadcasts a message telling the APs to stop
themselves, and then the BSP waits 1 second before actually resetting
itself; this behaviour dates back to 1998-05-17.

I assume that this delay was added in order to allow the APs to stop
themselves before the BSP resets; but we wait until the APs have all
acknowledged entering the "stopped" state, so it no longer seems to
serve any purpose.

Reviewed by:	jhb, kib
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D35797
2022-07-18 17:23:25 -07:00
Mitchell Horne
c84c5e00ac ddb: annotate some commands with DB_CMD_MEMSAFE
This is not completely exhaustive, but covers a large majority of
commands in the tree.

Reviewed by:	markj
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D35583
2022-07-18 22:06:09 +00:00
Alan Cox
da55f86c61 x86/iommu: Eliminate redundant wrappers
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35832
2022-07-16 18:05:37 -05:00
Alan Cox
db0110a536 iommu: Shrink the iommu map entry structure
Eliminate the unroll_entry field from struct iommu_map_entry, shrinking
the struct by 16 bytes on 64-bit architectures.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35769
2022-07-15 22:24:52 -05:00
Mark Johnston
03f868b163 x86: Add a required store-load barrier in cpu_idle()
ULE's tdq_notify() tries to avoid delivering IPIs to the idle thread.
In particular, it tries to detect whether the idle thread is running.
There are two mechanisms for this:
- tdq_cpu_idle, an MI flag which is set prior to calling cpu_idle().  If
  tdq_cpu_idle == 0, then no IPI is needed;
- idle_state, an x86-specific state flag which is updated after
  cpu_idleclock() is called.

The implementation of the second mechanism is racy; the race can cause a
CPU to go to sleep with pending work.  Specifically, cpu_idle_*() set
idle_state = STATE_SLEEPING, then check for pending work by loading the
tdq_load field of the CPU's runqueue.  These operations can be reordered
so that the idle thread observes tdq_load == 0, and tdq_notify()
observes idle_state == STATE_RUNNING.

Some counters indicate that the idle_state check in tdq_notify()
frequently elides an IPI.  So, fix the problem by inserting a fence
after the store to idle_state, immediately before idling the CPU.

PR:		264867
Reviewed by:	mav, kib, jhb
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35777
2022-07-14 10:28:01 -04:00
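The store-load ordering problem described in 03f868b163 can be modelled with C11 atomics. This is a user-space sketch with hypothetical sketch_ names, not the kernel code: the idle side stores its state, issues a full fence, and only then loads the run-queue load. The fence on the notify side is an assumption made so that the model is symmetric.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { SKETCH_STATE_RUNNING, SKETCH_STATE_SLEEPING };

static atomic_int sketch_idle_state = SKETCH_STATE_RUNNING;
static atomic_int sketch_tdq_load;

/* Idle side: returns true if the CPU may go to sleep. */
static bool
sketch_idle_prepare(void)
{
	atomic_store_explicit(&sketch_idle_state, SKETCH_STATE_SLEEPING,
	    memory_order_relaxed);
	/* Keep the load below from being ordered before the store above. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&sketch_tdq_load,
	    memory_order_relaxed) != 0) {
		/* Work is pending: stay awake. */
		atomic_store_explicit(&sketch_idle_state,
		    SKETCH_STATE_RUNNING, memory_order_relaxed);
		return (false);
	}
	return (true);
}

/* Notify side: queue work, then report whether an IPI is needed. */
static bool
sketch_notify(void)
{
	atomic_fetch_add_explicit(&sketch_tdq_load, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return (atomic_load_explicit(&sketch_idle_state,
	    memory_order_relaxed) == SKETCH_STATE_SLEEPING);
}
```

With both fences in place, at least one side observes the other's store, so the outcome from the commit message (the idle thread sees no load while the notifier sees STATE_RUNNING) cannot occur in this model.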
Mark Johnston
ece453d5fa eventtimer: Simplify KTR traces
Stop including the current CPU in all event messages, since it's already
saved in KTR log entries and thus is redundant.  All eventtimer traces
occur in a context where CPU migration is not possible.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-07-11 15:58:43 -04:00
Mitchell Horne
258958b3c7 ddb: use _FLAGS command macros where appropriate
Some command definitions were forced to use DB_FUNC in order to specify
their required flags, CS_OWN or CS_MORE. Use the new macros to simplify
these.

Reviewed by:	markj, jhb
MFC after:	3 days
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D35582
2022-07-05 11:56:55 -03:00
Dmitry Chagin
03473e8ec8 linux(4): Use saved cpu feature bits
MFC after:		3 days
2022-07-04 23:42:07 +03:00
Warner Losh
26031009cf amd64/efi: Stop falling back to hints for RSDP
All boot loaders for the last 6 years set acpi.rsdp in addition to the
hints. This was planned for removal ~5 years ago. Belatedly remove it
from here.

Sponsored by:		Netflix
Reviewed by:		jhb
Differential Revision:	https://reviews.freebsd.org/D35633
2022-07-02 08:02:12 -06:00
Roger Pau Monné
77cb05db0c x86/xen: stop assuming kernel memory loading order in PVH
Do not assume that start_info will always be loaded at the highest
memory address, and instead check the position of all the loaded
elements in order to find the last loaded one, and thus a likely safe
place to use as early boot allocation memory space.

Reported by: markj, cperciva
Sponsored by: Citrix Systems R&D
Reviewed by: markj
Differential revision: https://reviews.freebsd.org/D35628
2022-06-30 08:53:16 +02:00
Dmitry Chagin
050f5a8405 amd64: Reload CPU ext features after resume or cr4 changes
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D35555
MFC after:		2 weeks
2022-06-29 10:34:43 +03:00
Roger Pau Monné
091febc04a xen/blkback: do not use x86 CPUID in generic code
Move checker for whether Xen creates IOMMU mappings for foreign pages
into a helper that's defined in arch-specific code.

Reported by: Elliott Mitchell <ehem+freebsd@m5p.com>
Fixes: 1d528f95e8 ('xen/blkback: remove bounce buffering mode')
Sponsored by: Citrix Systems R&D
2022-06-28 09:51:57 +02:00
John Baldwin
15a6642da6 x86 mptable: Include <x86/legacyvar.h> for legacy_get_pcibus().
Fixes:		b076d8d54c mptable_hostb: Use legacy_get_pcibus() to fetch PCI bus number.
MFC after:	1 week
2022-06-23 15:00:12 -07:00