Commit Graph

8667 Commits

Author SHA1 Message Date
Mark Johnston
a422084abb Add the KMSAN runtime
KMSAN enables the use of LLVM's MemorySanitizer in the kernel.  This
enables precise detection of uses of uninitialized memory.  As with
KASAN, this feature has substantial runtime overhead and is intended to
be used as part of some automated testing regime.

The runtime maintains a pair of shadow maps.  One is used to track the
state of memory in the kernel map at bit-granularity: a bit in the
kernel map is initialized when the corresponding shadow bit is clear,
and is uninitialized otherwise.  The second shadow map stores
information about the origin of uninitialized regions of the kernel map,
simplifying debugging.

KMSAN relies on being able to intercept certain functions which cannot
be instrumented by the compiler.  KMSAN thus implements interceptors
which manually update shadow state and in some cases explicitly check
for uninitialized bytes.  For instance, all calls to copyout() are
subject to such checks.

The runtime exports several functions which can be used to verify the
shadow map for a given buffer.  Helpers provide the same functionality
for a few structures commonly used for I/O, such as CAM CCBs, BIOs and
mbufs.  These are handy when debugging a KMSAN report whose
proximate and root causes are far away from each other.

Obtained from:	NetBSD
Sponsored by:	The FreeBSD Foundation
2021-08-10 21:27:53 -04:00
Mark Johnston
f95f780ea4 amd64: Define KVA regions for KMSAN shadow maps
KMSAN requires two shadow maps, each one-to-one with the kernel map.
Allocate regions of the kernels PML4 page for them.  Add functions to
create mappings in the shadow map regions, these will be used by the
KMSAN runtime.

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:27:52 -04:00
Mark Johnston
4fd450a87d amd64 pmap: Pre-set PG_M on 2MB KASAN shadow map entries
Also remove a redundant assertion in pmap_kasan_enter().

Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31295
2021-08-10 21:22:07 -04:00
Mark Johnston
41335c6b7f vmm: Make iommu ops tables const
While here, use designated initializers and rename some AMD iommu method
implementations to match the corresponding op names.  No functional
change intended.

Reviewed by:	grehan
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31462
2021-08-09 13:28:27 -04:00
Mark Johnston
e54ae8258d amd64: Fix output operand specs for the stmxcsr and vmread intrinsics
This does not appear to affect code generation, at least with the
default toolchain.

Noticed because incorrect output specifications lead to false positives
from KMSAN, as the instrumentation uses them to update shadow state for
output operands.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31466
2021-08-09 13:28:08 -04:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Ka Ho Ng
df95cc76af vmm: Bump vmname buffer in struct vm to VM_MAX_NAMELEN + 1
In hw.vmm.create sysctl handler the maximum length of vm name is
VM_MAX_NAMELEN. However in vm_create() the maximum length allowed is
only VM_MAX_NAMELEN - 1 chars. Bump the length of the internal buffer to
allow the length of VM_MAX_NAMELEN for vm name.

MFC after:	3 days
Reviewed by:	grehan
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31372
2021-08-02 17:55:08 +08:00
Roger Pau Monné
82bf6a2566 xen/timer: fix amd64 LINT kernel build
On amd64 XENHVM depends on the xentimer device for PVH early startup,
so both should be added or removed together (like the current
dependency with xenpci). Fix this by adding xentimer to NOTES and
updating the comments on the config files. Note that on i386 there's
no such dependency between xentimer and XENHVM, since there's no PVH
support.

While there also fix the MINIMAL i386 build to include the xentimer,
so it keeps the same functionality as before xentimer was split from
XENHVM.

Reported by: lwhsu
PR: 257549
Fixes: ae59812748 ('xen/timer: make xen timer optional')
2021-08-02 10:33:35 +02:00
Konstantin Belousov
665895db26 amd64 pmap_vm_page_alloc_check(): loose the assert
Current expression checks that vm_page_alloc(9) never returns a page
belonging to the preload area.  This is not true if something was freed
from there, for instance a preloaded module was unloaded, or ucode update
freed.

Only check that we never allow to allocate a page belonging to the kernel
proper, check against _end.

Reported and tested by:	dhw
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-08-02 03:28:33 +03:00
Konstantin Belousov
1a55a3a729 amd64 pmap_vm_page_alloc_check(): print more data for failed assert
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-08-01 16:42:02 +03:00
Konstantin Belousov
041b7317f7 Add pmap_vm_page_alloc_check()
which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Konstantin Belousov
e18380e341 amd64: do not assume that kernel is loaded at 2M physical
Allow any 2M aligned contiguous location below 4G for the staging
area location.  It should still be mapped by loader at KERNBASE.

The assumption kernel makes about loader->kernel handoff with regard to
the MMU programming are explicitly listed at the beginning of hammer_time(),
where kernphys is calculated.  Now kernphys is the variable instead of
symbol designating the physical address.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Mark Johnston
a90d053b84 Simplify kernel sanitizer interceptors
KASAN and KCSAN implement interceptors for various primitive operations
that are not instrumented by the compiler.  KMSAN requires them as well.
Rather than adding new cases for each sanitizer which requires
interceptors, implement the following protocol:
- When interceptor definitions are required, define
  SAN_NEEDS_INTERCEPTORS and SANITIZER_INTERCEPTOR_PREFIX.
- In headers that declare functions which need to be intercepted by a
  sanitizer runtime, use SANITIZER_INTERCEPTOR_PREFIX to provide
  declarations.
- When SAN_RUNTIME is defined, do not redefine the names of intercepted
  functions.  This is typically the case in files which implement
  sanitizer runtimes but is also needed in, for example, files which
  define ifunc selectors for intercepted operations.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-29 21:13:32 -04:00
Konstantin Belousov
2572376f7f amd64: do not touch low memory in AP startup unless we used legacy boot
This fixes several ommisions in 48216088b1

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31343
2021-07-30 01:20:45 +03:00
Konstantin Belousov
b27fe1c3ba amd64: stop doing special allocation for the AP startup trampoline
There is no reason now why do we need to allocate trampoline page very
early in the boot process.  The only requirement for the page is that
it is below 1M to be usable by the real mode during init.  This can be
handled by vm_alloc_contig() when we do the startup.

Also assert that startup trampoline fits into single page.  In principle
we can do multi-page allocation if needed, but it is not.

Move the alloc_ap_trampoline() function and the boot_address variable to
i386/mp_machdep.c.  Keep existing mechanism of early alloc on i386.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31343
2021-07-30 01:20:45 +03:00
Mark Johnston
4b136ef259 amd64: Set GS.base before calling init_secondary() on APs
KMSAN instrumentation requires thread-local storage to track
initialization state for function parameters and return values.  This
buffer is accessed as part of each function prologue.  It is provided by
the KMSAN runtime, which looks up a pointer in the current thread's
structure.

When KMSAN is configured, init_secondary() is instrumented, but this
means that GS.base must be initialized first, otherwise the runtime
cannot safely access curthread.  Work around this by loading GS.base
before calling init_secondary(), so that the runtime can at least check
curthread == NULL and return a pointer to some dummy storage.  Note that
init_secondary() still must reload GS.base after calling lgdt(), which
loads a selector into %gs, which in turn clears the base register.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31336
2021-07-29 10:22:37 -04:00
Mark Johnston
e153745083 amd64: Set MSR_KGSBASE to 0 during AP startup
There is no reason to initialize it to anything else, and this matches
initialization of the BSP.  No functional change intended.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31336
2021-07-29 10:14:05 -04:00
Dmitry Chagin
f337940144 linux(4): Fix gcc buld.
gcc failed as it didn't inlined the builtins and generates calls to
the libgcc, ld can't find libgcc as cross-toolchain libgcc is not installed.
To avoid this add internal vDSO ffs functions without optimized builtins.

Reported by:		jhb
MFC after:		2 weeks
2021-07-29 09:52:33 +03:00
Julien Grall
ae59812748 xen/timer: make xen timer optional
The timer is not used on ARM.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29041
2021-07-28 17:27:03 +02:00
Konstantin Belousov
d6717f8778 amd64: rework AP startup
Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA.  Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The features are:
- now there is one less step for startup asm to perform
- the startup code still needs to be at lower 1G because CPU starts in
  real mode. But everything else can be located anywhere in low 4G
  because it is accessed by non-paged 32bit protected mode.  Note that
  kernel page table root page is at low 4G, as well as the kernel itself.
- the page table pages can be allocated by normal allocator, there is
  no need to carve them from the phys_avail segments at very early time.
  The allocation of the page for startup code still requires some magic.
  Pages are freed after APs are ignited.
- la57 startup for APs is less tricky, we directly load the final page
  table and do not need to tweak the paging mode.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-27 20:11:15 +03:00
Edward Tomasz Napierala
30c6d98219 linux: implement sigaltstack(2) on arm64
... by making it machine-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31286
2021-07-27 13:34:49 +00:00
Edward Tomasz Napierala
b54838003c linux: fix sigaltstack on amd64
To determine whether to use alternate signal stack or not,
we need to use the native signal number, not the one translated
with bsd_to_linux_signal().

In practical terms, this fixes golang.

Reviewed By:	dchagin
Fixes:		135dd0cab5
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31298
2021-07-26 11:57:52 +01:00
Alan Cox
3687797618 amd64: Don't repeat unnecessary tests when cmpset fails
When a cmpset for removing the PG_RW bit in pmap_promote_pde() fails,
there is no need to repeat the alignment, PG_A, and PG_V tests just to
reload the PTE's value.  The only bit that we need be concerned with at
this point is PG_M.  Use fcmpset instead.

MFC after:	1 week
2021-07-24 13:06:47 -05:00
Konstantin Belousov
48216088b1 amd64: do not touch low memory in AP startup unless we used legacy boot
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-24 18:52:45 +03:00
Konstantin Belousov
6a3821369f amd64: make efi_boot global
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-24 18:52:44 +03:00
Konstantin Belousov
c8bae074d9 amd64: add pmap_alloc_page_below_4g()
Suggested and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-24 18:52:44 +03:00
Konstantin Belousov
34516d4ad1 amd64 pti init: fix calculation of the kernel text start
Old expression happens to provide the correct answer, but assumes that
kernel is loaded at physical address zero, with 2M gap.  Do not use
kernphys to calculate KVA of kernel text start, just explicitly write
out KERNBASE and the hole size.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-24 18:52:44 +03:00
Edward Tomasz Napierala
72f7ddb587 linux: implement rt_sigsuspend(2) on arm64
... by making it architecture-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31259
2021-07-23 20:13:00 +00:00
Alan Cox
b7de535288 amd64: Eliminate a redundant test from pmap_enter_object()
The call to pmap_allow_2m_x_page() in pmap_enter_object() is redundant.
Specifically, even without the call to pmap_allow_2m_x_page() in
pmap_enter_object(), pmap_allow_2m_x_page() is eventually called by
pmap_enter_pde(), so the outcome will be the same.  Essentially,
calling pmap_allow_2m_x_page() in pmap_enter_object() amounts to
"optimizing" for the unexpected case.

Reviewed by:	kib
MFC after:	1 week
2021-07-23 23:15:42 -05:00
Mark Johnston
f95e683fa2 Annotate amd64 stack unwinders with __nomemorysanitize
Sponsored by:	The FreeBSD Foundation
2021-07-23 10:47:13 -04:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
ae8330b448 linux(4): Add arch name to the some printfs.
Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30904
MFC after:		2 weeks
2021-07-20 10:05:08 +03:00
Dmitry Chagin
09cffde975 linux(4): Fixup the vDSO initialization order.
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();

As the exec_sysvec_init() called with SI_ORDER_ANY (last) at SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D30902
MFC after		2 weeks
2021-07-20 10:02:34 +03:00
Dmitry Chagin
a543556c81 linux(4): Constify vdso install/deinstall.
In order to reduce diff between arches constify vdso install/deinstall
functions like arm64.

Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30901
MFC after:		2 weeks
2021-07-20 10:01:47 +03:00
Dmitry Chagin
9931033bbf linux(4); Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, has no process private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid contex-switch overhead frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

System calls implemenation based on a native timekeeping code with some
limitations:
- ifunc can't be used, as vDSO r/o mapped to the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case on any error vDSO system calls fallback to the kernel system
calls. For unimplemented vDSO system calls added prototypes which call
corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
Dmitry Chagin
5fd9cd53d2 linux(4): Modify sv_onexec hook to return an error.
Temporary add stubs to the Linux emulation layer which calls the existing hook.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30911
MFC after:		2 weeks
2021-07-20 09:56:25 +03:00
Dmitry Chagin
815165be20 linux(4): Remove function prototypes from the vDSO.
In preparation for vDSO code revision get rid of incomplete vDSO methods
from locore, but leave .note.Linux section commented out.
.note.Linux section is used by glibc rtld to get the kernel version, that
saves one system call call. I'll try to implement it later, if figure out
how to use it with jails.

MFC after:	2 weeks
2021-07-20 09:52:08 +03:00
Jessica Clarke
439097486b acpi: Fix a repeated comment typo 2021-07-19 17:19:23 +01:00
Jessica Clarke
52fba9a943 acpi: Fix a repeated vm_offset_t that should be a vm_size_t
The underlying types for both are the same so arguably this doesn't
really matter, but using the wrong type is still confusing and
technically incorrect.
2021-07-19 17:19:23 +01:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
Alan Cox
325ff93274 Clear the accessed bit when copying a managed superpage mapping
pmap_copy() is used to speculatively create mappings, so those mappings
should not have their access bit preset.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31162
2021-07-14 13:06:10 -05:00
Warner Losh
c0c703342d pccard: remove pccard device from all kernels
All the PC Card drivers have been removed from the tree. Remove the
pccard drivers from all the kernels.

Sponsored by:		Netflix
2021-07-13 20:39:31 -06:00
Alan Cox
d411b285bc pmap: Micro-optimize pmap_remove_pages() on amd64 and arm64
Reduce the live ranges for three variables so that they do not span the
call to PHYS_TO_VM_PAGE().  This enables the compiler to generate
slightly smaller machine code.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31161
2021-07-13 17:33:23 -05:00
Ka Ho Ng
b5c74dfd64 vmm: Fix AMD-vi using wrong rid range
The ACPI parsing code around rid range was wrong on assuming there is
only one pair of start/end device id range. Besides, ivhd_dev_parse()
never work as supposed. The start/end rid info was always zero.

Restructure the code to build dynamic-sized tables for each IOMMU softc
holding device entries. The device entries are enumerated to find a
suitable IOMMU unit. Operations on devices not governed (e.g. the IOMMU
unit itself) are no-op from now on. There are also a minor fix on wrong
%b formatting string usage.

Tested on my EPYC 7282.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D30827
2021-07-14 01:53:10 +08:00
Peter Grehan
517904de5c igc(4): Introduce new driver for the Intel I225 Ethernet controller.
This controller supports 2.5G/1G/100MB/10MB speeds, and allows
tx/rx checksum offload, TSO, LRO, and multi-queue operation.

The driver was derived from code contributed by Intel, and modified
by Netgate to fit into the iflib framework.

Thanks to Mike Karels for testing and feedback on the driver.

Reviewed by:	bcr (manpages), kbowling, scottl, erj
MFC after:	1 month
Relnotes:	yes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30668
2021-07-12 14:57:18 +10:00
Helge Oldach
b21f19c9e0 MINIMAL: remove debugging and some loadable network modules
Remove deugging stuff, since it's arguably not needed in a minimal
setup. Also vlan, tuntap and gif since they can be loaded.

imp didn't include the part of the patch that removed xen guest support.
Xen guest is relatively small and has no way of being loaded.

Reviewed by:	imp
PR:		229564
MFC After:	3 days
2021-07-11 10:35:42 -06:00
David Chisnall
d2b558281a Revert "Pass the syscall number to capsicum permission-denied signals"
This broke the i386 build.

This reverts commit 3a522ba1bc.
2021-07-10 20:26:01 +01:00
David Chisnall
3a522ba1bc Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-10 17:19:52 +01:00
Konstantin Belousov
fdc71fa112 amd64 pmap: unexpand the NBPDR macro definition
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-07-10 14:46:54 +03:00
Konstantin Belousov
71463a34ab amd64 mpboot.S: fix typo in comment
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-07-10 14:46:54 +03:00
Konstantin Belousov
63664df720 amd64 locore.S: add FF copyright for LA57 work
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-07-10 14:46:53 +03:00
Konstantin Belousov
9dc715230c amd64 locore.S: trim .globl list from symbols gone for long time
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-07-10 14:46:53 +03:00
Konstantin Belousov
55e63ed307 x86: use ANSI C definition style for trap_fatal
PR:	257062
Submitted by:	Vijay Sharma <vijaysh312@gmail.com>
MFC after:	1 week
2021-07-10 14:46:53 +03:00
Mark Johnston
f08f0ae524 amd64: Mark the trapframe as initialized in trap()
Otherwise KASAN may generate false positives if the trapframe was
written into a poisoned region of the stack.

Reported by:	pho
Sponsored by:	The FreeBSD Foundation
2021-07-09 20:38:50 -04:00
Konstantin Belousov
28a66fc3da Do not call FreeBSD-ABI specific code for all ABIs
Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI.  Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by:	dchagin
Reviewed by:	dchagin, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:07 +03:00
Alan Cox
e41fde3ed7 On a failed fcmpset don't pointlessly repeat tests
In a few places, on a failed compare-and-set, both the amd64 pmap and
the arm64 pmap repeat tests on bits that won't change state while the
pmap is locked.  Eliminate some of these unnecessary tests.

Reviewed by:	andrew, kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31014
2021-07-05 21:07:40 -05:00
Edward Tomasz Napierala
447636e43c linux(4): implement coredump support
Implement dumping core for Linux binaries on amd64, for both
32- and 64-bit executables.  Some bits are still missing.

This is based on a prototype by chuck@.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30019
2021-06-30 22:45:06 +01:00
Alan Cox
1a8bcf30f9 amd64: a simplication to pmap_remove_{all,write}
Eliminate some unnecessary unlocking and relocking when we have to retry
the operation to avoid deadlock.  (All of the other pmap functions that
iterate over a PV list already implemented retries without these same
unlocking and relocking operations.)

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30951
2021-06-30 13:12:25 -05:00
Edward Tomasz Napierala
435754a59e Add infrastructure required for Linux coredump support
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values.  It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication.  It will be used
for Linux coredumps.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30921
2021-06-29 08:49:12 +01:00
Mateusz Guzik
9a8e4527f0 amd64: typo fix: memcmpy -> memcmp in a comment
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-06-26 16:24:46 +00:00
Dmitry Chagin
4aae133469 linux(4): Make vDSO defines private.
Hide the vDSO defines to the linux32_sysvec as they are not intended to
be used outside of it. Fix LINUX32_PS_STRINGS, use the size of
struct linux32_ps_strings instead of a numeric constant.

MFC after:	2 weeks
2021-06-25 18:41:04 +03:00
Konstantin Belousov
33e1287b6a amd64: do not touch BIOS reset flag halfword, unless we boot through BIOS
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30872
2021-06-24 00:38:00 +03:00
Dmitry Chagin
c1da89fec2 linux(4): Retire linux_kplatform.
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().

I have added linux_kplatform for consistency with linux_platform which is
placed in to vdso to avoid excess copyout it on stack for AT_PLATFORM at
exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:		trasz, imp
Differential Revision:	https://reviews.freebsd.org/D30774
MFC after:		2 weeks
2021-06-22 08:36:21 +03:00
Dmitry Chagin
e013e36939 linux(4): Get rid of Linuxulator kernel build options.
Stop confusing people, retire COMPAT_LINUX and COMPAT_LINUX32 kernel
build options. Since we have 32 and 64 bit Linux emulators, we can't build both
emulators together into the kernel. I don't think it matters, Linux emulation
depends on loadable modules (via rc).

Cut LINPROCFS and LINSYSFS for consistency.

PR:			215061
Reviewed by:		bcr (manpages), trasz
Differential Revision:	https://reviews.freebsd.org/D30751
MFC after:		2 weeks
2021-06-22 08:32:39 +03:00
Edward Tomasz Napierala
135dd0cab5 linux: reduce differences between rt_sendsig() and sendsig()
This makes it easier to compare the two.  This involves moving
the mutex slightly lower down, but there should be no functional
changes.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30541
2021-06-21 17:51:56 +01:00
Dmitry Chagin
8fe8bb7cb5 linux(4): Regen for linux_poll system call.
MFC after:	2 weeks
2021-06-22 08:09:55 +03:00
Dmitry Chagin
2eff670fde linux(4): Implement poll system call via linux_common_ppol()
for the sake of converting events to/from native.

MFC after:	2 weeks
2021-06-22 08:07:46 +03:00
Dmitry Chagin
26795a0378 linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses in kernel ppoll(2) without
conversion of user supplied fd 'events', and does not convert the
kernel supplied fd 'revents'.

At least POLLRDHUP is handled by FreeBSD differently than by
Linux. Seems that Linux silencly ignores POLLRDHUP on non socket fd's
unlike FreeBSD, which does more strictly check and fails.

Rework the Linux ppoll, using kern_poll and converting 'events'
and 'revents' values.
While here, move poll events defines to the MI part of code as they
mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks
2021-06-22 08:06:05 +03:00
Ka Ho Ng
210e6aec4f vmm: Fix ivrs_drv device_printf usage
The original %b description string is wrong.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	imp, jhb
Differential Revision:	https://reviews.freebsd.org/D30805
2021-06-19 14:07:26 +08:00
Konstantin Belousov
0247c33e89 amd64 efirt: initialize vm_pages backing EFI runtime memory
so that PHYS_TO_VM_PAGE() and consequently physcopyin() work for them

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30785
2021-06-17 16:58:51 +03:00
Konstantin Belousov
870e197d52 Add quirks for Linux ABI signals handling
Require queueing of the signals with default action, and disable
dequeueing SIGCHLD on wait for live process.

Reported and tested by:	dchagin
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30675
2021-06-16 02:01:35 +03:00
Mark Johnston
70dd5eebc0 amd64: Fix propagation of LDT updates
When a process has used sysarch(2) to specify descriptors for its
private LDT, upon rfork(RFMEM) descriptors are copied into the new child
process.  Any updates to the descriptors are thus reflected to all other
processes sharing the vmspace.  However, this is incorrect in the rather
obscure case where the child process was created before the LDT was
modified.  Fix this by only modifying other processes which already
share the LDT.

Reported by:	syzkaller
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-06-14 17:32:18 -04:00
Olivier Houchard
30b915d7b2 an: Remove driver
Now that an(4) is gone, remove it from GENERIC kernel config files.

Reported by:	flo
2021-06-12 01:08:54 +02:00
Dmitry Chagin
89f15b79b1 linux(4): Regen for ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:19:12 +03:00
Dmitry Chagin
ed61e0ce1d linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:18:46 +03:00
Dmitry Chagin
981a60f112 linux(4): Regen for pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:04:37 +03:00
Dmitry Chagin
f6d075ecd7 linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:03:30 +03:00
Dmitry Chagin
c002529000 linux(4): Regen for rt_sigtimedwait_time64 system call.
MFC after:	2 weeks
2021-06-10 14:52:43 +03:00
Dmitry Chagin
db4a1f331b linux(4): Implement rt_sigtimedwait_time64 system call.
It still does not work as intended, awaits D30675.

MFC after:	2 weeks
2021-06-10 14:51:30 +03:00
Dmitry Chagin
985978806e linux(4): Regen for futex_time64 system call.
MFC after:	2 weeks
2021-06-10 14:28:25 +03:00
Dmitry Chagin
2e46d0c3d9 linux(4): Implement futex_time64 system call.
MFC after:	2 weeks
2021-06-10 14:27:06 +03:00
Dmitry Chagin
ee64d98204 linux(4): Regen for futex system call.
MFC after:	2 weeks
2021-06-10 14:16:40 +03:00
Dmitry Chagin
3c1de151e3 linux(4): Change Linux futex syscall definition to match Linux actual one.
MFC after:	2 weeks
2021-06-10 14:00:00 +03:00
Edward Tomasz Napierala
f102b61d0e linux: make sure to zero the l_siginfo structure for ptrace(2)
Reported By:	dchagin
Sponsored By:	EPSRC
2021-06-08 10:18:29 +01:00
Konstantin Belousov
598f6fb49c linuxolator: Add compat.linux.setid_allowed knob
PR:	21463
Reported by:	kris
Reviewed by:	dchagin
Tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28154
2021-06-06 21:43:00 +03:00
Dmitry Chagin
f4e801085b linux(4): optimize ksiginfo to siginfo conversion.
Retire ksiginfo_to_lsiginfo function, use siginfo_to_lsiginfo instead.
Convert rt_sigtimedwait siginfo variables to well known names.

MFC after:	2 weeks
2021-06-07 06:06:17 +03:00
Dmitry Chagin
e29ea22f70 Regen for ('0f8dab45404f347752470579feccc6d2739b9570') Linux
rt_sigtimedwait system call.

MFC after:	2 weeks
2021-06-07 05:39:29 +03:00
Dmitry Chagin
0f8dab4540 linux(4): Fix timeout parameter of rt_sigtimedwait syscall, which is
timespec not a timeval.

MFC after:	2 weeks
2021-06-07 05:35:35 +03:00
Dmitry Chagin
56b187005c Regen for ('6501370a7dfb358daf07555136742bc064e68cb7') Linux
clock_nanosleep_time64 system call.

MFC after:	2 weeks
2021-06-07 05:29:27 +03:00
Dmitry Chagin
6501370a7d linux(4): Implement clock_nanosleep_time64 system call.
MFC after:	2 weeks
2021-06-07 05:26:48 +03:00
Dmitry Chagin
773d9153c3 Regen for ('187715a420237e1ed94dd5aef158eada7dcdc559') Linux
clock_getres_time64 system call.

MFC after:	2 weeks
2021-06-07 05:21:48 +03:00
Dmitry Chagin
187715a420 linux(4): Implement clock_getres_time64 system call.
MFC after:	2 weeks
2021-06-07 05:21:32 +03:00
Dmitry Chagin
82e3848654 Regen for ('19f9a0e4df54f8d1e99234146024422bdcfa09ce') Linux
clock_settime64 system call.

MFC after:	2 weeks
2021-06-07 05:14:04 +03:00
Dmitry Chagin
19f9a0e4df linux(4): Implement clock_settime64 system call.
MFC after:	2 weeks
2021-06-07 05:11:25 +03:00
Dmitry Chagin
9e07ae7a09 Regen for ('99b6f430698fa00a33184dd61591d8b6518ed9d3') Linux
clock_gettime64 system call.

MFC after:	2 weeks
2021-06-07 05:08:11 +03:00
Dmitry Chagin
99b6f43069 linux(4): Implement clock_gettime64 system call.
MFC after:	2 weeks
2021-06-07 05:04:42 +03:00
Dmitry Chagin
ea7fa5583c Regen for ('e4bffb80bbc6a2e4b3be89aefcbd5bb2c2fc0ba0') Linux
utimensat_time64 syscall.

MFC after:	2 weeks
2021-06-07 04:56:58 +03:00
Dmitry Chagin
e4bffb80bb linux(4): Implement utimensat_time64 system call.
MFC after:	2 weeks
2021-06-07 04:54:30 +03:00
Dmitry Chagin
bfcce1a9f6 linux(4): add struct timespec64 definition and conversion routine for
future use.

MFC after:		2 weeks
2021-06-07 04:47:12 +03:00
Mark Johnston
8cd05b8833 amd64: Clear the local TSS when creating a new thread
Otherwise it is copied from the creating thread.  Then, if either thread
exits, the other is left with a dangling pointer, typically resulting in
a page fault upon the next context switch.

Reported by:	syzkaller
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30607
2021-06-01 19:38:22 -04:00
Mark Johnston
6cda627556 amd64: Relax the assertion added in commit 4a59cbc12
We only need to ensure that interrupts are disabled when handling a
fault from iret.  Otherwise it's possible to trigger the assertion
legitimately, e.g., by copying in from an invalid address.

Fixes:		4a59cbc12
Reported by:	pho
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30594
2021-06-01 19:38:09 -04:00
Mark Johnston
4a59cbc125 amd64: Avoid enabling interrupts when handling kernel mode prot faults
When PTI is enabled, we may have been on the trampoline stack when iret
faults.  So, we have to switch back to the regular stack before
re-entering trap().

trap() has the somewhat strange behaviour of re-enabling interrupts when
handling certain kernel-mode execeptions.  In particular, it was doing
this for exceptions raised during execution of iret.  When switching
away from the trampoline stack, however, the thread must not be migrated
to a different CPU.  Fix the problem by simply leaving interrupts
disabled during the window.

Reported by:	syzbot+6cfa544fd86ad4647ffc@syzkaller.appspotmail.com
Reported by:	syzbot+cfdfc9e5a8f28f11a7f5@syzkaller.appspotmail.com
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30578
2021-05-31 18:49:33 -04:00
Dmitry Chagin
19593f775c linux(4); Retire unnecessary __packed attribute from some struct's
definition.

Differential Revision:	https://reviews.freebsd.org/D30482
MFC after:		2 weeks
2021-05-31 21:56:34 +03:00
Konstantin Belousov
c56de177d2 x86: initialize initial FPU state earlier
Make it under SI_SUB_CPU sysinit, instead of much later SI_SUB_DRIVERS.
The SI_SUB_DRIVERS survived from times when FPU used real ISA attachment,
now it is only pnp stub claiming id.

PR:	255997
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30512
2021-05-28 21:38:32 +03:00
Edward Tomasz Napierala
c0f171736a Regen after 6d926e850d.
Sponsored By:	EPSRC
2021-05-28 09:04:17 +01:00
Edward Tomasz Napierala
6d926e850d linux: add new syscall numbers
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30193
2021-05-28 09:02:16 +01:00
Mark Johnston
4c599db71a vmm: Let guests enable SMEP/SMAP if the host supports it
Reviewed by:	kib, grehan, jhb
Tested by:	grehan (AMD)
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30462
2021-05-26 09:34:52 -04:00
Konstantin Belousov
a59f028537 amd64/linux*: add required header to get the constant value
Otherwise asm silently interpret it as the external global symbol.

Reported by:	bz
Sponsored by:	The FreeBSD Foundation
Fixes:	91aae953cb
2021-05-26 01:24:09 +03:00
Konstantin Belousov
91aae953cb amd64: clear PSL.AC in the right frame
If copyin family of routines fault, kernel does clear PSL.AC on the
fault entry, but the AC flag of the faulted frame is kept intact.  Since
onfault handler is effectively jump, AC survives until syscall exit.

Reported by:	m00nbsd, via Sony
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
admbugs:	975
2021-05-25 18:20:46 +03:00
Edward Tomasz Napierala
95c19e1d65 linux: refactor bsd_to_linux_regset() out of linux_ptrace.c
This will be used for Linux coredump support.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30365
2021-05-21 07:26:07 +01:00
Ceri Davies
c1a148873d sys/*/conf/*, docs: fix links to handbook
While here, fix all links to older en_US.ISO8859-1 documentation
in the src/ tree.

PR:             255026
Reported by:    Michael Büker <freebsd@michael-bueker.de>
Reviewed by:    dbaio
Approved by:    blackend (mentor), re (gjb)
MFC after:      10 days
Differential Revision: https://reviews.freebsd.org/D30265
2021-05-20 09:27:10 +01:00
Roger Pau Monné
9e14ac116e x86/xen: further PVHv1 removal cleanup
The AP startup extern variable declarations are not longer needed,
since PVHv2 uses the native AP startup path using the lapic. Remove
the declaration and make the variables static to mp_machdep.c

Sponsored by: Citrix Systems R&D
2021-05-18 10:43:31 +02:00
Roger Pau Monné
ac3ede5371 x86/xen: remove PVHv1 code
PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: Citrix Systems R&D
Reviewed by: kib, Elliott Mitchell
Differential Revision: https://reviews.freebsd.org/D30228
2021-05-17 11:41:21 +02:00
Mitchell Horne
c93e6ea344 xen: remove support for PVHv1 bootpath
PVHv1 is a legacy interface supported only by Xen versions 4.4 through
4.9.

Reviewed by: royger
2021-05-17 10:56:52 +02:00
Mark Johnston
60cb98a1bd linux: Fix a mistake in commit fb58045145
The change to futex_andl_smap() should have ordered stac before the
load from a user address, otherwise it does not fix anything.

Fixes:	fb58045145 ("linux: Fix SMAP-enabled futex routines")
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-16 22:23:14 -04:00
Mark Johnston
fb58045145 linux: Fix SMAP-enabled futex routines
Some of them were dereferencing the user pointer before disabling SMAP.

PR:		255591
Reviewed by:	kib
Tested by:	pitwuu@gmail.com
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30276
2021-05-16 13:42:08 -04:00
Ed Maste
2c9764f36b regen syscall files after d51198d63b63 2021-05-13 14:09:58 -04:00
Edward Tomasz Napierala
916f3dba45 linux(4): make arch_prctl(2) support GET_CET_STATUS, report unknown codes
This is largely a no-op, to make future debugging slightly easier.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30035
2021-05-06 09:33:42 +01:00
Edward Tomasz Napierala
023bff7990 linux(4): fix ptrace(2) to properly handle orig_rax
This fixes strace(1) erroneously reporting return values
as "Function not implemented", combined with reporting the binary
ABI as X32.

Very similar code in linux_ptrace_getregs() is left as it is - it's
probably wrong too, but I don't have a way to test it.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29927
2021-05-04 15:21:06 +01:00
Andrew Turner
c78ad207ba Switch the EFI virtual address to a uint64_t
It is defined as a uint64_t in the UEFI spec. As it's not used as a
pointer by the kernel follow this and define it as the same in the
kernel.

Reviewed by:	kib, manu, imp
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D29759
2021-05-01 06:01:20 +00:00
Konstantin Belousov
72a42ec63b amd64: disable LA57 by default for now
A testing on the real hardware uncovered an issue, and since I do not have
access to the machine, disable until the bug can be fixed.

Reported by:	"Pieper, Jeffrey E" <jeffrey.e.pieper@intel.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-04-30 17:43:45 +03:00
Konstantin Belousov
21fc6a2a10 amd64: invalidate TLB between page table update and access
When setting up trampoline mapping for LA57 switcher, it is possible
that TLB still has some random mapping at that address.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-04-30 17:43:45 +03:00
Mark Johnston
20e3b9d8bd kasan: Use vm_offset_t for the first parameter to kasan_shadow_map()
No functional change intended.

Sponsored by:	The FreeBSD Foundation
2021-04-29 11:39:02 -04:00
Alexander V. Chernikov
6993187a8c Add FIB_ALGO to GENERIC on amd64/arm64.
Option `FIB_ALGO` gates new modular fib lookup functionality,
 enabling more performant routing table lookups and improving
 control plane convergence under the load.

Detailed feature description is available in D27401.

Reviewed By: olivier, gnn
Differential Revision: https://reviews.freebsd.org/D28434
2021-04-24 23:22:58 +00:00
Edward Tomasz Napierala
77651151f3 linux: make ptrace(2) return EIO when trying to peek invalid address
Previously we've returned the error from native ptrace(2), ENOMEM.
This confused Linux strace(2).

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29925
2021-04-24 11:37:50 +01:00
Ka Ho Ng
6fe60f1d5c AMD-vi: Fortify IVHD device_identify process
- Use malloc(9) to allocate ivhd_hdrs list. The previous assumption
  that there are at most 10 IVHDs in a system is not true. A counter
  example would be a system with 4 IOMMUs, and each IOMMU is related
  to IVHDs type 10h, 11h and 40h in the ACPI IVRS table.
- Always scan through the whole ivhd_hdrs list to find IVHDs that has
  the same DeviceId but less prioritized IVHD type.

Sponsored by:	The FreeBSD Foundation
MFC with:	74ada297e8
Reviewed by:	grehan
Approved by:	lwhsu (mentor)
Differential Revision:	https://reviews.freebsd.org/D29525
2021-04-19 16:08:13 +08:00
Mark Johnston
f115c06121 amd64: Add MD bits for KASAN
- Initialize KASAN before executing SYSINITs.
- Add a GENERIC-KASAN kernel config, akin to GENERIC-KCSAN.
- Increase the kernel stack size if KASAN is enabled.  Some of the
  ASAN instrumentation increases stack usage and it's enough to
  trigger stack overflows in ZFS.
- Mark the trapframe as valid in interrupt handlers if it is
  assigned to td_intr_frame.  Otherwise, an interrupt in a function
  which creates a poisoned alloca region can trigger false positives.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29455
2021-04-13 17:42:20 -04:00
Mark Johnston
6faf45b34b amd64: Implement a KASAN shadow map
The idea behind KASAN is to use a region of memory to track the validity
of buffers in the kernel map.  This region is the shadow map.  The
compiler inserts calls to the KASAN runtime for every emitted load
and store, and the runtime uses the shadow map to decide whether the
access is valid.  Various kernel allocators call kasan_mark() to update
the shadow map.

Since the shadow map tracks only accesses to the kernel map, accesses to
other kernel maps are not validated by KASAN.  UMA_MD_SMALL_ALLOC is
disabled when KASAN is configured to reduce usage of the direct map.
Currently we have no mechanism to completely eliminate uses of the
direct map, so KASAN's coverage is not comprehensive.

The shadow map uses one byte per eight bytes in the kernel map.  In
pmap_bootstrap() we create an initial set of page tables for the kernel
and preloaded data.

When pmap_growkernel() is called, we call kasan_shadow_map() to extend
the shadow map.  kasan_shadow_map() uses pmap_kasan_enter() to allocate
memory for the shadow region and map it.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29417
2021-04-13 17:42:20 -04:00
Edward Tomasz Napierala
ca6e1fa3ce linux: adjust ordering of Linux auxv and add dummy AT_HWCAP2
This should be a no-op; the purpose of this is to reduce
a spurious difference between Linuxulator and Linux, to make
debugging core dumps slightly easier.

Note that AT_HWCAP2 we pass to Linux binaries is always 0,
instead of being equal to 'cpu_feature2'.  This matches what
I've observed under Ubuntu Focal VM.

Reviewed By:	chuck, dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29609
2021-04-13 13:14:30 +01:00
Andrew Turner
5d2d599d3f Create VM_MEMATTR_DEVICE on all architectures
This is intended to be used with memory mapped IO, e.g. from
bus_space_map with no flags, or pmap_mapdev.

Use this new memory type in the map request configured by
resource_init_map_request, and in pciconf.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29692
2021-04-12 06:15:31 +00:00
Piotr Pawel Stefaniak
a212f56d10 Balance parentheses in sysctl descriptions 2021-04-11 10:30:55 +02:00
Konstantin Belousov
94172affa4 amd64: clear debug registers on execing 32bit Linux binary
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:02 +03:00
Konstantin Belousov
d50adfec9e amd64: clear debug registers on execing 32bit native binary
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:02 +03:00
Konstantin Belousov
2f15884747 amd64 linux64: use x86_clear_dbregs()
instead of manually inlining it

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:02 +03:00
Konstantin Belousov
290b0d123a x86: use x86_clear_dbregs() on fork
instead of manual zeroing of the debug registers file in pcb.
This centralizes the cleaning code, but the practical difference is
that PCB_DBREGS flag is cleared, saving some operations on context
switching.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:02 +03:00
Konstantin Belousov
a8b75a57c9 x86: add x86_clear_dbregs() helper
Move the code from exec_setregs() to reset debug registers state on exec,
to the x86_clear_dbregs() helper

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:01 +03:00
Greg V
a29bff7a52 smbios: support getting address from EFI
On some systems (e.g. Lenovo ThinkPad X240, Apple MacBookPro12,1)
the SMBIOS entry point is not found in the <0xFFFFF space.

Follow the SMBIOS spec and use the EFI Configuration Table for
locating the entry point on EFI systems.

Reviewed by:	rpokala, dab
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D29276
2021-04-07 14:46:29 -05:00
Mark Johnston
0f07c234ca Remove more remnants of sio(4)
Reviewed by:	imp
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29626
2021-04-07 14:33:02 -04:00
Konstantin Belousov
aa3ea612be x86: remove gcov kernel support
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29529
2021-04-02 15:41:51 +03:00
Ka Ho Ng
03efa462b2 AMD-vi: Mixed format IVHD block should replace fixed format IVHD block
This fixes double IVHD_SETUP_INTR calls on the same IOMMU device.

Sponsored by:	The FreeBSD Foundation
MFC with:	74ada297e8
Reported by:	Oleg Ginzburg <olevole@olevole.ru>
Reviewed by:	grehan
Approved by:	philip (mentor)
Differential Revision:	https://reviews.freebsd.org/D29521
2021-04-01 15:31:24 +08:00
Ka Ho Ng
cf76495e0a AMD-vi: Fix mismatched NULL checking in amdiommu teardown path
Sponsored by:	The FreeBSD Foundation
Approved by:	lwhsu (mentor)
MFC with:	74ada297e8
2021-04-01 03:34:34 +08:00
Jason A. Harmening
8dc8feb53d Clean up a couple of MD warts in vm_fault_populate():
--Eliminate a big ifdef that encompassed all currently-supported
architectures except mips and powerpc32.  This applied to the case
in which we've allocated a superpage but the pager-populated range
is insufficient for a superpage mapping.  For platforms that don't
support superpages the check should be inexpensive as we shouldn't
get a superpage in the first place.  Make the normal-page fallback
logic identical for all platforms and provide a simple implementation
of pmap_ps_enabled() for MIPS and Book-E/AIM32 powerpc.

--Apply the logic for handling pmap_enter() failure if a superpage
mapping can't be supported due to additional protection policy.
Use KERN_PROTECTION_FAILURE instead of KERN_FAILURE for this case,
and note Intel PKU on amd64 as the first example of such protection
policy.

Reviewed by:	kib, markj, bdragon
Differential Revision:	https://reviews.freebsd.org/D29439
2021-03-30 18:15:55 -07:00
Konstantin Belousov
8223717ce6 x86: clear %db registers in new process
Reported by:	 Michał Górny <mgorny@gentoo.org>
PR:	254661
Reviewed by:	emaste, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29496
2021-03-31 02:07:35 +03:00
Mitchell Horne
7446b0888d gdb: report specific stop reason for watchpoints
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.

[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html

Reviewed by:	jhb
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	51
Differential Revision:	https://reviews.freebsd.org/D29174
2021-03-30 11:36:41 -03:00
Mitchell Horne
9d81dd5404 ddb: replace watchpoint set/clear functions
Use the new kdb variants. Print more specific error messages.

Reviewed by:	jhb, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29156
2021-03-29 12:05:44 -03:00
Mitchell Horne
15dc1d4452 x86: implement kdb watchpoint functions
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.

Reviewed by:	jhb, kib, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29155
2021-03-29 12:05:43 -03:00
Mark Johnston
7ae2e70336 amd64: Make KPDPphys local to pmap.c
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-03-24 09:57:31 -04:00
Ka Ho Ng
be97fc8dce bhyve amd: Small cleanups in amdvi_dump_cmds
Bump offset with MOD_INC instead in amdvi_dump_cmds.

Reviewed by:	jhb
Approved by:	philip (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28862
2021-03-23 16:12:41 +08:00
Mark Johnston
3ead60236f Generalize bus_space(9) and atomic(9) sanitizer interceptors
Make it easy to define interceptors for new sanitizer runtimes, rather
than assuming KCSAN.  Lay a bit of groundwork for KASAN and KMSAN.

When a sanitizer is compiled in, atomic(9) and bus_space(9) definitions
in atomic_san.h are used by default instead of the inline
implementations in the platform's atomic.h.  These definitions are
implemented in the sanitizer runtime, which includes
machine/{atomic,bus}.h with SAN_RUNTIME defined to pull in the actual
implementations.

No functional change intended.

MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2021-03-22 22:21:53 -04:00