This existing helper function is preferable to the hand-rolled
calculation of the kstack bounds.
Make some small style improvements while here. Notably, rename every
instance of "r", the return address, to "ra". Tidy the includes in the
affected files.
Reviewed by: jkoshy
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39909
When vm_map_remove() is called from vm_swapout_map_deactivate_pages()
due to swapout, PKRU attributes for the removed range must be kept
intact. Provide a variant of pmap_remove(), pmap_map_delete(), to
allow pmap to distinguish between real removes of the UVA mappings
and any other internal removes, e.g. swapout.
For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove().
Reported by: andrew
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D39556
sig_atomic_t is defined as a long and thus is 64-bit on arm64. For some
reason its limit was incorrectly specified as a 32-bit number. This had
the unfortunate side effect of causing gnulib to override most of the
definitions in stdint.h. On CheriBSD this breaks all software that uses
gnulib in annoying and hard to debug ways.
Technically updating the limits might be an ABI change, but these
defines are largely unused (the only use in tree is in the libc++ test
suite where it's use an assertion that will fail due to this bug).
Further, since the underlying type remains the same, we're just
increasing the range of values a paranoid program might use.
Reviewed by: emaste
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D39194
The unwind logic was copied from AArch64 which follows the peculiar
AACPS (where, unlike typical RISC architectures, its frame pointer
follows an x86/stack machine-like convention where the frame pointer
points at the bottom of the frame record, not the top). Delete the
pointless riscv_frame struct and fix this.
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D28054
This code was originally written under the assumption that the ISA
string would only contain single-letter extensions. The RISC-V
specification has extended its description of the format quite a bit,
allowing for much longer ISA strings containing multi-letter extension
names.
Newer versions of QEMU (7.1.0) will append to the riscv,isa property
indicating the presence of multi-letter standard extensions such as
Zfencei. This triggers a KASSERT about the expected length of the
string, preventing boot.
Increase the size of the isa array significantly, and teach the code
to parse (skip over) multi-letter extensions, and optional extension
version numbers. We currently ignore them completely, but this will
change in the future as we start supporting supervisor-level extensions.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D36601
Add a <sys/_pv_entry.h> intended for use in <machine/pmap.h> to
define struct pv_entry, pv_chunk, and related macros and inline
functions.
Note that powerpc does not yet use this as while the mmu_radix pmap
in powerpc uses the new scheme (albeit with fewer PV entries in a
chunk than normal due to an used pv_pmap field in struct pv_entry),
the Book-E pmaps for powerpc use the older style PV entries without
chunks (and thus require the pv_pmap field).
Suggested by: kib
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36685
This matches the return type of pmap_mapdev/bios.
Reviewed by: kib, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36548
This applies one of the changes from
5567d6b441 to other architectures
besides arm64.
Reviewed by: kib
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D36263
The devmap variants used vm_offset_t for some reason, and a few places
explicitly cast bus addresses to vm_offset_t. (Probably those casts
along with similar casts for vm_size_t should just be removed and
instead permit the compiler to DTRT.)
Reviewed by: markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D35961
After the addition of SV48 support, VIRT_IS_VALID() did not exclude
addresses that are in the SV39 address space hole but not in the SV48
address space hole. This can result in mishandling of accesses to that
range when in SV39 mode.
Fix the problem by modifying VIRT_IS_VALID() to use the runtime address
space bounds. Then, if the address is invalid, and pcb_onfault is set,
give vm_fault_trap() a chance to veto the access instead of panicking.
PR: 265439
Reviewed by: jhb
Reported and tested by: Robert Morris <rtm@lcs.mit.edu>
Fixes: 31218f3209 ("riscv: Add support for enabling SV48 mode")
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35952
These files no longer depend on the macros required when these checks
were added.
PR: 263102 (exp-run)
Reviewed by: brooks, imp, emaste
Differential Revision: https://reviews.freebsd.org/D34804
Since physical memory management is now handled by subr_physmem.c, the
need to keep this global array has diminished. It is not referenced
outside of early boot-time, and is populated by physmem_avail() in
pmap_bootstrap(). Just allocate the array on the stack for the duration
of its lifetime.
The check against physmap[0] in initriscv() can be dropped altogether,
as there is no consequence for excluding a memory range twice.
Reviewed by: markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34778
This increases the size of the user map from 256GB to 128TB. The kernel
map is left unchanged for now.
For now SV48 mode is left disabled by default, but can be enabled with a
tunable. Note that extant hardware does not implement SV48, but QEMU
does.
- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
mode. If the write to SATP doesn't take, the kernel continues to run
in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit. In SV39 mode,
the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
mappable.
Reviewed by: jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34280
Instead of having the one-off load_satp(), just use csr_write(). No
functional change intended.
Reviewed by: alc, jhb
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34271
In SV48 mode, the top-level page will be an L0 page rather than an L1
page. Rename the field accordingly. No functional change intended.
Reviewed by: alc, jhb
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34270
- Move busdma_lock_mutex to subr_bus_dma.c.
- Move _busdma_lock_dflt to subr_bus_dma.c. This function was named a
couple of different things previously. It is not a public API but
an internal helper used in place of a NULL pointer. The prototype
is in <sys/bus_dma.h> as not all backends include
<sys/bus_dma_internal.h>.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33694
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages. For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm). However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi. I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux. However, in the current scheme there's no need for the mux.
Instead, remove swi_vm and vm_ih. Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly. This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.
One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.
Reviewed by: imp, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D33447
The header exports the following:
- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)
Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).
Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr. libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).
A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.
Reviewed by: kib (older version of x86/tls.h), jrtc27
Sponsored by: The University of Cambridge, Google Inc.
Differential Revision: https://reviews.freebsd.org/D33351
After a round of cleanups in late 2020, all definitions are
functionally identical.
This removes a rotted __aligned(8) on arm. It was added in
b7112ead32 and was intended to align the
args member so that 64-bit types (off_t, etc) could be safely read on
armeb compiled with clang. With the removal of armev, this is no
longer needed (armv7 requires that 32-bit aligned reads of 64-bit
values be supported and we enable such support on armv6). As further
evidence this is unnecessary, cleanups to struct syscall_args have
resulted in args being 32-bit aligned on 32-bit systems. The sole
effect is to bloat the struct by 4 bytes.
Reviewed by: kib, jhb, imp
Differential Revision: https://reviews.freebsd.org/D33308
This definition enables callers to estimate remaining space on the
kstack, and take action on it. Notably, it enables optimizations in the
GEOM and netgraph subsystems to directly dispatch work items when there
is sufficient stack space, rather than queuing them for a worker thread.
Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
not go unimplemented elsewhere.
PR: 259157
Reviewed by: mav, kib, markj (previous version)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D32580
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.
This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.
Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.
The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.
Reviewed by: kib, markj, jhb
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D31989
This is needed for LinuxKPI's _ioremap_attr. This reuses the generic
implementation introduced for aarch64, and itself requires implementing
pmap_kenter, which is trivial to do given riscv currently treats all
mapping attributes the same due to the Svpbmt extension not yet being
ratified and in hardware.
Reviewed by: markj, mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D32445
When handling a kernel page fault, check explicitly that stval resides
in either the user or kernel address spaces, and make the page fault
fatal if not. Otherwise, a properly crafted address may appear to
pmap_fault() as a valid and present page in the kernel map, causing the
page fault to be retried continuously. This is mainly due to the fact
that the upper bits of virtual addresses are not validated by most of
the pmap code.
Faults of this nature should only occur due to some kind of bug in the
kernel, but it is best to handle them gracefully when they do.
Handle user page faults in the same way, sending a SIGSEGV immediately
when a malformed address is encountered.
Add an assertion to pmap_l1(), which should help catch other bugs of
this kind that make it this far.
Reviewed by: jrtc27, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31208
pmap_change_attr is required by drm-kmod so we need the function to
exist. Since the Svpbmt extension is on the horizon we will likely end
up with a real implementation of it, so this stub implementation does
all the necessary page table walking to validate the input, ensuring
that no new errors are returned once it's implemented fully (other than
due to out of memory conditions when demoting L2 entries) and providing
a skeleton for that future implementation.
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D31996
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.
Reviewed by: markj
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D31885
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).
Reviewed by: imp, markj
Sponsored by: DARPA, AFRL (original work)
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D19830
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).
Sponsored by: The FreeBSD Foundation
which is the place to put MD asserts about allocated pages.
On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D31121
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.
Approved by: markj (mentor)
Reviewed by: kib, bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D29185
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned. This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.
Approved by: markj (mentor)
Reviewed by: kib, bcr (manpages)
Differential Revision: https://reviews.freebsd.org/D29185
Many of these typedefs are the same across all architectures or can
be set based on an architecture-independent compiler-provided macro
(e.g. __SIZEOF_SIZE_T__). These macros have been available since GCC 4.6
and Clang sometime before 3.0 (godbolt.org does not have any older clang
versions installed).
I originally considered using the compiler-provided `__FOO_TYPE__` directly.
However, in order to do so we have to check that those match the previous
typedef exactly (not just that they have the same size) since any change
would be an ABI break. For example, changing `long` to `long long` results
in different C++ name mangling. Additionally, Clang and GCC disagree on
the underlying type for some of (u)int*_fast_t types, so this change
only moves the definitions that are identical across all architectures
and does not touch those types.
This de-deduplication will allow us to have a smaller diff downstream in
CheriBSD: we only have to only change the (u)intptr_t definition in
sys/_types.h in CheriBSD instead of having to change machine/_types.h for
all CHERI-enabled architectures (currently RISC-V, AArch64 and MIPS).
Reviewed By: imp, kib
Differential Revision: https://reviews.freebsd.org/D29895
This is consistent with other platforms, specifically arm and arm64. No
functional change intended.
Reviewed by: jrtc27
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D30645
This basically mirrors what already exists in ddb, but provides a
slightly improved interface. It allows the caller to specify the
watchpoint access type, and returns more specific error codes to
differentiate failure cases.
This will be used to support hardware watchpoints in gdb(4).
Stubs are provided for architectures lacking hardware watchpoint logic
(mips, powerpc, riscv), while other architectures are added individually
in follow-up commits.
Reviewed by: jhb, kib, markj
MFC after: 3 weeks
Sponsored by: NetApp, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D29155
This change serves two purposes.
First, we take advantage of the compiler provided endian definitions to
eliminate some long-standing duplication between the different versions
of this header. __BYTE_ORDER__ has been defined since GCC 4.6, so there
is no need to rely on platform defaults or e.g. __MIPSEB__ to determine
endianness. A new common sub-header is added, but there should be no
changes to the visibility of these definitions.
Second, this eliminates the hand-rolled __bswapNN() routines, again in
favor of the compiler builtins. This was done already for x86 in
e6ff6154d2. The benefit here is that we no longer have to maintain our
own implementations on each arch, and can instead rely on the compiler
to emit appropriate instructions or libcalls, as available. This should
result in equivalent or better code generation. Notably 32-bit arm will
start using the `rev` instruction for these routines, which is available
on armv6+.
PR: 236920
Reviewed by: arichardson, imp
Tested by: bdragon (BE powerpc)
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D29012
e4b8deb222 removed the last in-tree uses of PCPU_INC(). Its
potential benefit is also practically nonexistent. Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction. The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.
[0]: https://www.agner.org/optimize/instruction_tables.pdf
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D29308
This appears to be a copy-and-paste error that has simply been
overlooked. The tree contains only two calls to any of the affected
variants, but recent additions to the test suite started exercising the
call to atomic_clear_rel_int() in ng_leave_write(), reliably causing
panics.
Apparently, the issue was inherited from the arm64 atomic header. That
instance was addressed in c90baf6817, but the fix did not make its way
to RISC-V.
Note that the particular test case ng_macfilter_test:main still appears
to fail on this platform, but this change reduces the panic to a
timeout.
PR: 253237
Reported by: Jenkins, arichardson
Reviewed by: kp, arichardson
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D29064
The System Reset extension provides functions to shutdown or reboot the
system via SBI firmware. This newly defined extension supersedes the
functionality of the legacy shutdown extension.
Update the SBI code to use the new System Reset extension when
available, and fall back to the legacy one.
Reviewed By: kp, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D28226