Commit Graph

1197 Commits

Author SHA1 Message Date
Kornel Duleba
06e6ca6dd3 dmar: Disable protected memory regions after initialization
Some BIOSes protect memory region they reside in by using DMAR to
prevent devices from doing any DMA transactions to that part of RAM.
AMI refers to this as "DMA Control Guarantee".
Disable the protection when address translation is enabled.
I stumbled upon this while investigation a failing coredump on a device
which has this feature enabled.

Sponsored by:		Stormshield
Obtained from:		Semihalf
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D32591
2021-10-29 10:08:25 +02:00
Kornel Duleba
3c02da8096 dmar: Don't try to reserve PCI regions for non-existing devices
In some cases we might have to create DMAR context before the
corresponding device has been enumerated by the PCI bus.
In that case we get called with NULL dev, because of that trying
to reserve PCI regions causes a NULL pointer dereference in
pci_find_pcie_root_port.

Sponsored by:		Stormshield
Obtained from:		Semihalf
MFC after:		2 weeks
Reviewed by:		kib, rlibby
Differential revision:	https://reviews.freebsd.org/D32589
2021-10-29 10:08:25 +02:00
Gleb Smirnoff
6aae3517ed Retire synchronous PPP kernel driver sppp(4).
The last two drivers that required sppp are cp(4) and ce(4).

These devices are still produced and can be purchased
at Cronyx <http://cronyx.ru/hardware/wan.html>.

Since Roman Kurakin <rik@FreeBSD.org> has quit them, they no
longer support FreeBSD officially.  Later they have dropped
support for Linux drivers to.  As of mid-2020 they don't even
have a developer to maintain their Windows driver.  However,
their support verbally told me that they could provide aid to
a FreeBSD developer with documentaion in case if there appears
a new customer for their devices.

These drivers have a feature to not use sppp(4) and create an
interface, but instead expose the device as netgraph(4) node.
Then, you can attach ng_ppp(4) with help of ports/net/mpd5 on
top of the node and get your synchronous PPP.  Alternatively
you can attach ng_frame_relay(4) or ng_cisco(4) for HDLC.
Actually, last time I used cp(4) back in 2004, using netgraph(4)
instead of sppp(4) was already the right way to do.

Thus, remove the sppp(4) related part of the drivers and enable
by default the negraph(4) part.  Further maintenance of these
drivers in the tree shouldn't be a big deal.

While doing that, remove some cruft and enable cp(4) compilation
on amd64.  The ce(4) for some unknown reason marks its internal
DDK functions with __attribute__ fastcall, which most likely is
safe to remove, but without hardware I'm not going to do that, so
ce(4) remains i386-only.

Reviewed by:		emaste, imp, donner
Differential Revision:	https://reviews.freebsd.org/D32590
See also:		https://reviews.freebsd.org/D23928
2021-10-22 11:41:36 -07:00
Konstantin Belousov
661bd70bd7 DMAR: clean up warnings about write-only variables
For some of them, used only when KTR or KMSAN are configured, apply
__unused attribute directly.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-21 21:40:46 +03:00
Mark Johnston
06ebadc5f5 x86: Remove some leftover APM support
This is obsolete since commit 8c576a279e ("Remove APM BIOS support").

Reviewed by:	imp, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32510
2021-10-18 09:56:59 -04:00
Mark Johnston
de8554295b cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

This is a re-application of commit
9068f6ea69.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-10-18 09:56:58 -04:00
Konstantin Belousov
f8d3368b43 apic: initialize lapic_paddr statically
The default value for LAPIC registers page physical address
is usually right. Having this value available early makes
pmap_force_invalidate_cache_range(), used on non-self-snoop machines,
avoid flushing LAPIC range for early calls.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32318
2021-10-06 05:52:56 +03:00
Mitchell Horne
ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Konstantin Belousov
24a3897c2c x86 bounce_bus_dmamem_alloc(): use malloc_aligned() only when possible
malloc_domainset_aligned() requires that alignment is less than
page size. Fall back to other allocation methods, most likely
kmem_alloc_contig(), when malloc_aligned() cannot fullfill the driver
request.

Reported by:	Loic F <loic.f@hardenedbsd.org>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32127
2021-09-25 15:58:12 +03:00
Alexander Motin
d3a8f98acb Make CPU children explicitly share parent unit numbers.
Before this device unit number match was coincidental and broke if I
disabled some CPU device(s).  Aside of cosmetics, for some drivers
(may be considered broken) it caused talking to wrong CPUs.
2021-09-24 23:31:51 -04:00
Alexander Motin
ef50d5fbc3 x86: Add NUMA nodes into CPU topology.
Depending on hardware, NUMA nodes may match last level caches, or
they may be above them (AMD Zen 2/3) or below (Intel Xeon w/ SNC).
This information is provided by ACPI instead of CPUID, and it is
provided for each CPU individually instead of mask widths, but
this code should be able to properly handle all the above cases.

This change should immediately allow idle stealing in sched_ule(4)
to prefer load from NUMA-local CPUs to remote ones when the node
does not match LLC.  Later we may think of how to better handle it
on sched_pickcpu() side.

MFC after:	1 month
2021-09-23 14:31:38 -04:00
Konstantin Belousov
e36d0e86e3 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
This reverts commit 0f6829488e.
Also it changes the type of md_usr_fpu_save struct mdthread member
to void *, which is what uncovered this trouble.  Now the save area
is untyped, but since it is hidden behind accessors, it is not too
significant.  Since apparently there are consumers affected outside
the tree, this hack is better than one from the reverted revision.

PR:	258678
Reported by:	cy
Reviewed by:	cy, kevans, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32060
2021-09-22 23:17:47 +03:00
Mark Johnston
bcdc599dc2 Revert "cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it"
This reverts commit 9068f6ea69.

The underlying macro needs to be reworked to avoid problems with control
flow statements.

Reported by:	rlibby
2021-09-21 13:51:42 -04:00
Konstantin Belousov
0f6829488e linux32: add a hack to avoid redefining the type of the savefpu tag
when compiling in amd64 kernel environment with -m32.  This is a temporal
workaround for some future proper (but unclear) fix.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Mark Johnston
9068f6ea69 cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-09-21 12:07:47 -04:00
Konstantin Belousov
2b6eec531a x86: duplicate acpi_wakeup.c per i386 and amd64
The file as is is the maze of #ifdef passages, all slightly different.
Divorcing i386 and amd64 version actually makes changing the code
easier, also no changes for i386 are planned.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31931
2021-09-14 00:23:14 +03:00
Konstantin Belousov
db2ba218d9 amd64 acpi_wakeup: map 1:1 whole low 4G for the trampoline page table
This is required since kernel text might be physically located
anywhere below 4G.

PR:	258432
Reported by:	Taku YAMAMOTO <taku@tackymt.homeip.net>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:13 +03:00
Konstantin Belousov
ceca8ac1ce x86 acpi_install_wakeup_handler(): style
Do not use tab between type and variable name in local declarations.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:52:06 +03:00
Konstantin Belousov
e99255c8a6 amd64: do not touch low memory in acpi_wakeup_ap() if booted by UEFI
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31916
2021-09-13 19:51:52 +03:00
Colin Percival
cd165c8bf0 x86/tsc.c: Add TSLOG to test_tsc
On my benchmark system this takes ~ 14 ms; enough to be worth
recording in the boot time profile.
2021-09-09 17:02:15 -07:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Dimitry Andric
8e3c56d6b6 xen: Fix warning by adding KERNBASE to modlist_paddr before casting
Clang 13 produces the following warning for hammer_time_xen():

sys/x86/xen/pv.c:183:19: error: the pointer incremented by -2147483648 refers past the last possible element for an array in 64-bit address space containing 256-bit (32-byte) elements (max possible 576460752303423488 elements) [-Werror,-Warray-bounds]
                    (vm_paddr_t)start_info->modlist_paddr + KERNBASE;
                                ^                               ~~~~~~~~
sys/xen/interface/arch-x86/hvm/start_info.h:131:5: note: array 'modlist_paddr' declared here
    uint64_t modlist_paddr;         /* Physical address of an array of           */
    ^

This is because the expression first casts start_info->modlist_paddr to
struct hvm_modlist_entry * (via vmpaddr_t), and *then* adds KERNBASE,
which is then interpreted as KERNBASE * sizeof(struct
hvm_modlist_entry).

Instead, parenthesize the addition to get the intended result, and cast
it to struct hvm_modlist_entry * afterwards. Also remove the cast to
vmpaddr_t since it is not necessary.

Reviewed by:	royger
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D31711
2021-08-29 19:43:00 +02:00
Adam Fenn
d4b2d3035a pvclock: Add vDSO support
Add vDSO support for timekeeping devices that support the KVM/XEN
paravirtual clock API.

Also, expose, in the userspace-accessible '<machine/pvclock.h>',
definitions that will be needed by 'libc' to support
'VDSO_TH_ALGO_X86_PVCLK'.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31418
2021-08-14 15:57:54 +03:00
Adam Fenn
6c69c6bb4c kvm_clock: KVM paravirtual clock support
Add support for the KVM paravirtual clock device.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29733
2021-08-14 15:57:54 +03:00
Adam Fenn
0b3382b863 pvclock: Add 'struct pvclock' API
Consolidate more hypervisor-agnostic functionality behind a new 'struct
pvclock' API.

This should also make it easier to subsequently add hypervisor-agnostic
vDSO timekeeping support.

Also, perform some clean-up:
    - Remove 'pvclock_get_last_cycles()'; do not allow external access
      to 'pvclock_last_systime' since this is not necessary.
    - Consolidate/simplify wall and system time reading codepaths.
    - Ensure correct ordering within wall and system time reading
      codepaths via 'atomic(9)' and 'rdtsc_ordered()' rather than via
      'rmb()'.
    - Remove some extra newlines.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31418
2021-08-14 15:57:54 +03:00
Adam Fenn
652ae7b114 x86: cpufunc: Add rdtsc_ordered()
Add a variant of 'rdtsc()' that performs the ordered version of 'rdtsc'
appropriate for the invoking x86 variant.

Also, expose the 'lfence'-ed and 'mfence'-ed 'rdtsc()' variants needed
by 'rdtsc_ordered()' for general use.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31416
2021-08-14 15:57:53 +03:00
NagaChaitanya Vellanki
2a9b4076dc
Merge common parts of i386 and amd64's ieeefp.h into x86/x86_ieeefp.h
MFC after:	1 week
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D26292
2021-08-12 18:45:22 +08:00
Dmitry Chagin
de8374df28 fork: Allow ABI to specify fork return values for child.
At least Linux x86 ABI's does not use carry bit and expects that the dx register
is preserved. For this add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31472
MFC after:		2 weeks
2021-08-12 11:45:25 +03:00
Mark Johnston
693c9516fa busdma: Add KMSAN integration
Sanitizer instrumentation of course cannot automatically update shadow
state when devices write to host memory.  KMSAN thus hooks into busdma,
both to update shadow state after a device write, and to verify that the
kernel does not publish uninitalized bytes to devices.

To implement this, when KMSAN is configured, each dmamap embeds a memory
descriptor describing the region currently loaded into the map.
bus_dmamap_sync() uses the operation flags to determine whether to
validate the loaded region or to mark it as initialized in the shadow
map.

Note that in cases where the amount of data written is less than the
buffer size, the entire buffer is marked initialized even when it is
not.  For example, if a NIC writes a 128B packet into a 2KB buffer, the
entire buffer will be marked initialized, but subsequent accesses past
the first 128 bytes are likely caused by bugs.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31338
2021-08-10 21:27:54 -04:00
Mark Johnston
3a1802fef4 busdma: Add an internal BUS_DMA_FORCE_MAP flag to x86 bounce_busdma
Use this flag to indicate that busdma should allocate a map structure
even no bouncing is required to satisfy the tag's constraints.  This
will be used for KMSAN.

Also fix a memory leak that can occur if the kernel fails to allocate
bounce pages in bounce_bus_dmamap_create().

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31338
2021-08-10 21:27:54 -04:00
Mark Johnston
b0f71f1bc5 amd64: Add MD bits for KMSAN
Interrupt and exception handlers must call kmsan_intr_enter() prior to
calling any C code.  This is because the KMSAN runtime maintains some
TLS in order to track initialization state of function parameters and
return values across function calls.  Then, to ensure that this state is
kept consistent in the face of asynchronous kernel-mode excpeptions, the
runtime uses a stack of TLS blocks, and kmsan_intr_enter() and
kmsan_intr_leave() push and pop that stack, respectively.

Use these functions in amd64 interrupt and exception handlers.  Note
that handlers for user->kernel transitions need not be annotated.

Also ensure that trap frames pushed by the CPU and by handlers are
marked as initialized before they are used.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31467
2021-08-10 21:27:53 -04:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Konstantin Belousov
d0bc4b4666 x86_msr_op: extend the KPI to allow MSR read and single-CPU operations
Reivewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31386
2021-08-05 18:46:37 +03:00
Mark Johnston
b2ed7e988a bus: Convert to the new interceptor scheme
This was missed in commit a90d053b84.

Fixes:		a90d053b84
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-30 15:15:27 -04:00
Mark Johnston
a90d053b84 Simplify kernel sanitizer interceptors
KASAN and KCSAN implement interceptors for various primitive operations
that are not instrumented by the compiler.  KMSAN requires them as well.
Rather than adding new cases for each sanitizer which requires
interceptors, implement the following protocol:
- When interceptor definitions are required, define
  SAN_NEEDS_INTERCEPTORS and SANITIZER_INTERCEPTOR_PREFIX.
- In headers that declare functions which need to be intercepted by a
  sanitizer runtime, use SANITIZER_INTERCEPTOR_PREFIX to provide
  declarations.
- When SAN_RUNTIME is defined, do not redefine the names of intercepted
  functions.  This is typically the case in files which implement
  sanitizer runtimes but is also needed in, for example, files which
  define ifunc selectors for intercepted operations.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2021-07-29 21:13:32 -04:00
Konstantin Belousov
b27fe1c3ba amd64: stop doing special allocation for the AP startup trampoline
There is no reason now why do we need to allocate trampoline page very
early in the boot process.  The only requirement for the page is that
it is below 1M to be usable by the real mode during init.  This can be
handled by vm_alloc_contig() when we do the startup.

Also assert that startup trampoline fits into single page.  In principle
we can do multi-page allocation if needed, but it is not.

Move the alloc_ap_trampoline() function and the boot_address variable to
i386/mp_machdep.c.  Keep existing mechanism of early alloc on i386.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31343
2021-07-30 01:20:45 +03:00
Alexander Motin
5a49f19141 Do not expose to scheduler caches of single CPU.
Before this change my dual-Xeon(R) Gold 6242R always reported 3 levels
or topology (root, package/L3 and core/L2).  But with SMT disabled
core/L2 matches thread, so additional topology level only causes more
traversal work.  With this change SMT case is reported same as before,
while non-SMT is reported with only 2 much more simple levels.

MFC after:	2 weeks
2021-07-28 16:38:01 -04:00
Julien Grall
ac959cf544 xen: introduce xen_has_percpu_evtchn()
xen_vector_callback_enabled is x86 specific and availability of
per-cpu event channel delivery differs on other architectures.

Introduce a new helper to check if there's support for per-cpu event
channel injection.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29402
2021-07-28 17:27:05 +02:00
Julien Grall
0b4f30c236 xen/control: introduce xen_pv_shutdown_handler()
While x86 only register PV shutdown handler for PV guests. ARM guests
are always using HVM and requires the PV shutdown handler.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29406
2021-07-28 17:27:04 +02:00
Julien Grall
69c6eee756 xen: introduce xen_pv_disks_disabled()
ARM guest is considered as HVM in Freebsd but they only support PV disk
(no emulation available).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29403
2021-07-28 17:27:04 +02:00
Julien Grall
5f70008327 xen/netfront: introduce xen_pv_nics_disabled()
ARM guest is considered as HVM but it only supports PV nics (no
emulation available).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29405
2021-07-28 17:27:04 +02:00
Elliott Mitchell
c89f1f12b0 xen/xen-os: move inclusion of machine/xen-os.h later
Several of x86 enable/disable functions depend upon the xen*domain()
functions.  As such the xen*domain() functions need to be declared
before machine/xen-os.h.

Officially declare direct inclusion of machine/xen/xen-os.h verboten as
such will break these functions/macros.  Remove one such soon to be
broken inclusion.

Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29811
2021-07-28 17:27:04 +02:00
Elliott Mitchell
9976c5a540 xen/intr: use __func__ instead of function names
Functions tend to get renamed and unless the developer is careful
often debugging messages are missed. As such using func is far
superior.  Replace several instances of hard-coded function names.

Reviewed by: royger
Differential revision: https://reviews.freebsd.org/D29499
2021-07-28 17:27:03 +02:00
Elliott Mitchell
5ca00e0c98 xen/intr: use struct xenisrc * as xen_intr_handle_t
Since xen_intr_handle_t is meant to be an opaque handle and the only
use is retrieving the associated struct xenisrc *, directly use it as
the opaque handler.

Also add a wrapper function for converting the other direction.  If some
other value becomes appropriate in the future, these two functions will
be the only spots needing modification.

Reviewed by: mhorne, royger
Differential Revision: https://reviews.freebsd.org/D29500
2021-07-28 17:27:03 +02:00
Elliott Mitchell
b6ff9345a4 xen: create VM_MEMATTR_XEN for Xen memory mappings
The requirements for pages shared with Xen/other VMs may vary from
architecture to architecture.  As such create a macro which various
architectures can use.

Remove a use of PAT_WRITE_BACK in xenstore.c.  This is a x86-ism which
shouldn't have been present in a common area.

Original idea: Julien Grall <julien@xen.org>, 2014-01-14 06:44:08
Approach suggested by: royger
Reviewed by: royger, mhorne
Differential Revision: https://reviews.freebsd.org/D29351
2021-07-28 17:27:02 +02:00
Julien Grall
a48f7ba444 xen: move x86/xen/xenpv.c to dev/xen/bus/xenpv.c
Minor changes are necessary to make this processor-independent, but
moving the file out of x86 and into common is the first step (so
others don't add /more/ x86-isms).

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29042
2021-07-28 17:27:02 +02:00
Konstantin Belousov
d6717f8778 amd64: rework AP startup
Stop using temporal page table with 1:1 mapping of low 1G populated over
the whole VA.  Use 1:1 mapping of low 4G temporarily installed in the
normal kernel page table.

The features are:
- now there is one less step for startup asm to perform
- the startup code still needs to be at lower 1G because CPU starts in
  real mode. But everything else can be located anywhere in low 4G
  because it is accessed by non-paged 32bit protected mode.  Note that
  kernel page table root page is at low 4G, as well as the kernel itself.
- the page table pages can be allocated by normal allocator, there is
  no need to carve them from the phys_avail segments at very early time.
  The allocation of the page for startup code still requires some magic.
  Pages are freed after APs are ignited.
- la57 startup for APs is less tricky, we directly load the final page
  table and do not need to tweak the paging mode.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-27 20:11:15 +03:00
Mark Johnston
f95e683fa2 Annotate amd64 stack unwinders with __nomemorysanitize
Sponsored by:	The FreeBSD Foundation
2021-07-23 10:47:13 -04:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
9931033bbf linux(4); Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, has no process private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid contex-switch overhead frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

System calls implemenation based on a native timekeeping code with some
limitations:
- ifunc can't be used, as vDSO r/o mapped to the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case on any error vDSO system calls fallback to the kernel system
calls. For unimplemented vDSO system calls added prototypes which call
corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00