Access faults in user mode are treated like TLB misses, which leads to an
endless loop of faults. It's less serious than the same fault in kernel mode,
because we can just terminate the process, but that's not ideal.
Treat user mode access faults as a bus error.
Suggested by: jrtc27
Reviewed by: br, jhb
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D26621
Load/store/fetch access exceptions always indicate a violation of a PMP
rule. We can't treat those as page faults, because updating the page
table and trying again will only result in exactly the same access
exception recurring. This leaves us in an endless exception loop.
We cannot recover from these exceptions, so panic instead.
Reviewed by: jhb
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D26544
Every other architecture defines this and this is required for
interrupts to work when using QEMU's PCI VirtIO devices (which all
report an interrupt line of 0) for two reasons.
Firstly, interrupt line 0 is wrong; they use one of 0x20-0x23 with the
lines being cycled across devices like normal. Moreover, RISC-V uses
INTRNG, whose IRQs are virtual as indices into its irq_map, so even if
we have the right interrupt line we still need to try and route the
interrupt in order to ultimately call into intr_map_irq and get back a
unique index into the map for the given line, otherwise we will use
whatever happens to be in irq_map[line] (which for QEMU where the line
is initialised to 0 results in using the first allocated interrupt,
namely the RTC on IRQ 11 at time of commit).
Note that pci_assign_interrupt will still do the wrong thing for INTRNG
when using a tunable, as it will bypass INTRNG entirely and use the
tunable's value as the index into irq_map, when it should instead
(indirectly) call intr_map_irq to allocate a new entry for the given
IRQ and treat the tunable as stating the physical line in use, which is
what one would expect. This, however, is a problem shared by all INTRNG
architectures, and not exclusive to RISC-V.
Reviewed by: kib
Approved by: kib
Differential Revision: https://reviews.freebsd.org/D26564
In the spirit of the GENERIC config, we should include the drivers required to
run on most supported platforms.
Reviewed by: kp
Differential Revision: https://reviews.freebsd.org/D26501
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).
Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.
Reviewed by: jhb
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26131
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.
Reviewed by: kib, markj
Approved by: scottl (implicit)
MFC after: 1 week
Sponsored by: Ampere Computing, Inc.
Differential Revision: https://reviews.freebsd.org/D26129
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.
Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.
This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).
MFC after: 1 month
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D25371
that can be extended, but also ensure compile-time type checking. Refactor
common code out of arch-specific implementations. Move the mpr and mps
drivers to this new API. The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
RISC-V is currently built with -Wno-format, which is how these went
undetected. Address them now before re-enabling those warnings.
Differential Revision: https://reviews.freebsd.org/D26319
The kernel adjusts the stack by TF_SIZE and the RISC-V ABI requires
that it remain 16-byte aligned.
Reported by: CHERI, jrtc27
Reviewed by: mhorne
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D26328
Currently we use a single bit to indicate whether the virtual page is
part of a superpage. To support a forthcoming implementation of
non-transparent 1GB superpages, it is useful to provide more detailed
information about large page sizes.
The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind)
values, indicating a mapping of size psind, where psind is an index into
the pagesizes array returned by getpagesizes(3), which in turn comes
from the hw.pagesizes sysctl. MINCORE_PSIND(1) is equal to the old
value of MINCORE_SUPER.
For now, two bits are used to record the page size, permitting values
of MAXPAGESIZES up to 4.
Reviewed by: alc, kib
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D26238
This allows privileged userspace processes to find information about the
physical page backing a given mapping. It is useful in applications
such as DPDK which perform some of their own memory management.
Reviewed by: kib, jhb (previous version)
MFC after: 2 weeks
Sponsored by: Juniper Networks, Inc.
Sponsored by: Klara Inc.
Differential Revision: https://reviews.freebsd.org/D26237
sbi_init() sets mimpid, we can use that value.
Reviewed by: philip (mentor), kp (mentor)
Approved by: philip (mentor), kp (mentor)
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D26092
I observed hangs post-r362977 in QEMU with -smp 2, in which one thread
would acquire write access to an rm_lock (sysctllock) and get stuck
waiting in smp_rendezvous_cpus while the other CPU was servicing a trap.
The other thread was waiting for read access to the same lock, thus
causing deadlock.
It's clear that this is just one symptom of a larger problem. The
general expectation of MI kernel code is that interrupts are enabled.
Violating this assumption will at best create some additional latency,
but otherwise might cause locking or other unforeseen issues. All other
architectures do so for some subset of trap values, but this somehow got
missed in the RISC-V port. Enable interrupts now during kernel page
faults and for all user trap types.
The code in exception.S already knows to disable interrupts while
handling the return from exception, so there are no changes required
there.
Reviewed by: jhb, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26017
- Properly set up the frame pointer
- Hang if we return from mi_startup
- Whitespace
Clearing the frame pointer marks the end of the backtrace. This fixes
"bt 0" in ddb, which previously would unwind one frame too far.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D26016
This same check is used on other architectures. Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).
Reviewed by: mhorne
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25996
There was an additional 7 bytes of compiler-inserted padding at the
end of the structure visible via 'ptype /o' in gdb.
Reviewed by: mhorne
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25867
RISC-V doesn't support floating-point exceptions.
RISC-V Instruction Set Manual: Volume I: User-Level ISA, 11.2 Floating-Point
Control and Status Register: "As allowed by the standard, we do not support
traps on floating-point exceptions in the base ISA, but instead require
explicit checks of the flags in software. We considered adding branches
controlled directly by the contents of the floating-point accrued exception
flags, but ultimately chose to omit these instructions to keep the ISA simple."
We still need these functions, because some applications (notably Perl) call
them, but we cannot provide a meaningful implementation.
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25740
QEMU's RISC-V virt machine provides syscon-power and syscon-reset
devices as the means by which to shutdown and reboot. We also need to
ensure that we have attached the syscon_generic device before attaching
any syscon_power devices, and so we introduce a new riscv_syscon device
akin to aw_syscon added in r327936. Currently the SiFive test finisher
is used as the specific implementation of such a syscon device.
Reviewed by: br, brooks (mentor), jhb (mentor)
Approved by: br, brooks (mentor), jhb (mentor)
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D25725
This device was originally used as part of the goldfish virtual hardware
platform used for emulating Android on QEMU, but is now also used as the
RTC for the RISC-V virt machine in QEMU. It provides a simple 64-bit
nanosecond timer exposed via a pair of memory-mapped 32-bit registers,
although only with 1s granularity.
Reviewed by: brooks (mentor), jhb (mentor), kp
Approved by: brooks (mentor), jhb (mentor), kp
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D25717
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.
Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).
Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
The size of the containing structure was passed instead of the size of
the array. This happened to be harmless as the extra word copied is
one we copy in the next line anyway.
Reported by: CHERI (bounds check violation)
Reviewed by: brooks, imp
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25791
During device attachment, all interrupt sources will bind to the BSP,
as it is the only processor online. This means interrupts must be
redistributed ("shuffled") later, during SI_SUB_SMP.
For the EARLY_AP_STARTUP case, this is no longer true. SI_SUB_SMP will
execute much earlier, meaning APs will be online and available before
devices begin attachment, and there will therefore be nothing to
shuffle.
All PIC-conforming interrupt controllers will handle this early
distribution properly, except for RISC-V's PLIC. Make the necessary
tweak to the PLIC driver.
While here, convert irq_assign_cpu from a boolean_t to a bool.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25693
The FDT may contain a short /chosen/bootargs string which we should pass
to boot_parse_cmdline. Notably, this allows the use of qemu's -append
option to pass things like -s to boot to single user mode.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25544
This removes SCTP from in-tree kernel configuration files. Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.
Reviewed by: tuexen
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25611
We cannot complete the interrupt (i.e. write to the claims/complete register
until the interrupt handler has actually run. We don't run the interrupt
handler immediately from intr_isrc_dispatch(), we only schedule it for later
execution.
If we immediately complete it (i.e. before the interrupt handler proper has
run) the interrupt may be triggered again if the interrupt source remains set.
From RISC-V Instruction Set Manual: Volume II: Priviliged Architecture, 7.4
Interrupt Gateways:
"If a level-sensitive interrupt source deasserts the interrupt after the PLIC
core accepts the request and before the interrupt is serviced, the interrupt
request remains present in the IP bit of the PLIC core and will be serviced by
a handler, which will then have to determine that the interrupt device no
longer requires service."
In other words, we may receive interrupts twice.
Avoid that by postponing the completion until after the interrupt handler has
run.
If the interrupt is handled by a filter rather than by scheduling an interrupt
thread we must also complete the interrupt, so set up a post_filter handler
(which is the same as the post_ithread handler).
Reviewed by: mhorne
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25531
This implementation doesn't have any major deviations from the other EFI
ports. I've copied the boilerplate from arm and arm64.
I've tested this with the following boot flows:
OpenSBI (M-mode) -> u-boot (S-mode) -> loader.efi -> FreeBSD
OpenSBI (M-mode) -> u-boot (S-mode) -> boot1.efi -> loader.efi -> FreeBSD
Due to the way that u-boot handles secondary CPUs, OpenSBI >= v0.7 is required,
as the HSM extension is needed to bring them up explicitly. Because of this,
using BBL as the SBI implementation will not be possible. Additionally, there
are a few recent u-boot changes that are required as well, all of which will be
present in the upcoming v2020.07 release.
Looks good: emaste
Differential Revision: https://reviews.freebsd.org/D25135
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.
[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kp, mhorne
Differential Revision: https://reviews.freebsd.org/D25523
A very minor micro-optimization; t0 is not clobbered between the loop top and
bottom and there appear to be no other branches to this label.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: mhorne
Differential Revision: https://reviews.freebsd.org/D25524
If we panic we dump the registers for debugging. This is very useful, but it
missed several registers (ra, sp, gp and tp).
Log these as well. Especially the return address value is extremely useful.
Sponsored by: Axiado
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D24911
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we recieve when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.
The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.
Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.
Reviewed by: markj, jrtc27 (asm portion)
Differential Revision: https://reviews.freebsd.org/D24912
Currently we only call sbi_shutdown in cpu_reset, which means we reach
"Please press any key to reboot." even when RB_POWEROFF is set, and only
once the user presses a key do we then shutdown. Instead, register a
shutdown_final event handler and make an SBI shutdown call if
RB_POWEROFF is set.
Reviewed by: br, jhb (mentor), kp
Approved by: br, jhb (mentor), kp
Differential Revision: https://reviews.freebsd.org/D25183