sbi_init() sets mimpid, we can use that value.
Reviewed by: philip (mentor), kp (mentor)
Approved by: philip (mentor), kp (mentor)
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D26092
I observed hangs post-r362977 in QEMU with -smp 2, in which one thread
would acquire write access to an rm_lock (sysctllock) and get stuck
waiting in smp_rendezvous_cpus while the other CPU was servicing a trap.
The other thread was waiting for read access to the same lock, thus
causing deadlock.
It's clear that this is just one symptom of a larger problem. The
general expectation of MI kernel code is that interrupts are enabled.
Violating this assumption will at best create some additional latency,
but otherwise might cause locking or other unforeseen issues. All other
architectures do so for some subset of trap values, but this somehow got
missed in the RISC-V port. Enable interrupts now during kernel page
faults and for all user trap types.
The code in exception.S already knows to disable interrupts while
handling the return from exception, so there are no changes required
there.
Reviewed by: jhb, markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D26017
- Properly set up the frame pointer
- Hang if we return from mi_startup
- Whitespace
Clearing the frame pointer marks the end of the backtrace. This fixes
"bt 0" in ddb, which previously would unwind one frame too far.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D26016
This same check is used on other architectures. Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).
Reviewed by: mhorne
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25996
There was an additional 7 bytes of compiler-inserted padding at the
end of the structure visible via 'ptype /o' in gdb.
Reviewed by: mhorne
Obtained from: CheriBSD
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25867
RISC-V doesn't support floating-point exceptions.
RISC-V Instruction Set Manual: Volume I: User-Level ISA, 11.2 Floating-Point
Control and Status Register: "As allowed by the standard, we do not support
traps on floating-point exceptions in the base ISA, but instead require
explicit checks of the flags in software. We considered adding branches
controlled directly by the contents of the floating-point accrued exception
flags, but ultimately chose to omit these instructions to keep the ISA simple."
We still need these functions, because some applications (notably Perl) call
them, but we cannot provide a meaningful implementation.
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25740
QEMU's RISC-V virt machine provides syscon-power and syscon-reset
devices as the means by which to shutdown and reboot. We also need to
ensure that we have attached the syscon_generic device before attaching
any syscon_power devices, and so we introduce a new riscv_syscon device
akin to aw_syscon added in r327936. Currently the SiFive test finisher
is used as the specific implementation of such a syscon device.
Reviewed by: br, brooks (mentor), jhb (mentor)
Approved by: br, brooks (mentor), jhb (mentor)
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D25725
This device was originally used as part of the goldfish virtual hardware
platform used for emulating Android on QEMU, but is now also used as the
RTC for the RISC-V virt machine in QEMU. It provides a simple 64-bit
nanosecond timer exposed via a pair of memory-mapped 32-bit registers,
although only with 1s granularity.
Reviewed by: brooks (mentor), jhb (mentor), kp
Approved by: brooks (mentor), jhb (mentor), kp
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D25717
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.
Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).
Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
The size of the containing structure was passed instead of the size of
the array. This happened to be harmless as the extra word copied is
one we copy in the next line anyway.
Reported by: CHERI (bounds check violation)
Reviewed by: brooks, imp
Obtained from: CheriBSD
MFC after: 1 week
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D25791
During device attachment, all interrupt sources will bind to the BSP,
as it is the only processor online. This means interrupts must be
redistributed ("shuffled") later, during SI_SUB_SMP.
For the EARLY_AP_STARTUP case, this is no longer true. SI_SUB_SMP will
execute much earlier, meaning APs will be online and available before
devices begin attachment, and there will therefore be nothing to
shuffle.
All PIC-conforming interrupt controllers will handle this early
distribution properly, except for RISC-V's PLIC. Make the necessary
tweak to the PLIC driver.
While here, convert irq_assign_cpu from a boolean_t to a bool.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D25693
The FDT may contain a short /chosen/bootargs string which we should pass
to boot_parse_cmdline. Notably, this allows the use of qemu's -append
option to pass things like -s to boot to single user mode.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: mhorne
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D25544
This removes SCTP from in-tree kernel configuration files. Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.
Reviewed by: tuexen
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D25611
We cannot complete the interrupt (i.e. write to the claims/complete register
until the interrupt handler has actually run. We don't run the interrupt
handler immediately from intr_isrc_dispatch(), we only schedule it for later
execution.
If we immediately complete it (i.e. before the interrupt handler proper has
run) the interrupt may be triggered again if the interrupt source remains set.
From RISC-V Instruction Set Manual: Volume II: Priviliged Architecture, 7.4
Interrupt Gateways:
"If a level-sensitive interrupt source deasserts the interrupt after the PLIC
core accepts the request and before the interrupt is serviced, the interrupt
request remains present in the IP bit of the PLIC core and will be serviced by
a handler, which will then have to determine that the interrupt device no
longer requires service."
In other words, we may receive interrupts twice.
Avoid that by postponing the completion until after the interrupt handler has
run.
If the interrupt is handled by a filter rather than by scheduling an interrupt
thread we must also complete the interrupt, so set up a post_filter handler
(which is the same as the post_ithread handler).
Reviewed by: mhorne
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D25531
This implementation doesn't have any major deviations from the other EFI
ports. I've copied the boilerplate from arm and arm64.
I've tested this with the following boot flows:
OpenSBI (M-mode) -> u-boot (S-mode) -> loader.efi -> FreeBSD
OpenSBI (M-mode) -> u-boot (S-mode) -> boot1.efi -> loader.efi -> FreeBSD
Due to the way that u-boot handles secondary CPUs, OpenSBI >= v0.7 is required,
as the HSM extension is needed to bring them up explicitly. Because of this,
using BBL as the SBI implementation will not be possible. Additionally, there
are a few recent u-boot changes that are required as well, all of which will be
present in the upcoming v2020.07 release.
Looks good: emaste
Differential Revision: https://reviews.freebsd.org/D25135
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.
[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: kp, mhorne
Differential Revision: https://reviews.freebsd.org/D25523
A very minor micro-optimization; t0 is not clobbered between the loop top and
bottom and there appear to be no other branches to this label.
Submitted by: Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by: mhorne
Differential Revision: https://reviews.freebsd.org/D25524
If we panic we dump the registers for debugging. This is very useful, but it
missed several registers (ra, sp, gp and tp).
Log these as well. Especially the return address value is extremely useful.
Sponsored by: Axiado
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D24911
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we recieve when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.
The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.
Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.
Reviewed by: markj, jrtc27 (asm portion)
Differential Revision: https://reviews.freebsd.org/D24912
Currently we only call sbi_shutdown in cpu_reset, which means we reach
"Please press any key to reboot." even when RB_POWEROFF is set, and only
once the user presses a key do we then shutdown. Instead, register a
shutdown_final event handler and make an SBI shutdown call if
RB_POWEROFF is set.
Reviewed by: br, jhb (mentor), kp
Approved by: br, jhb (mentor), kp
Differential Revision: https://reviews.freebsd.org/D25183
This can happen with very large kernels (e.g. ones embedding a root
filesystem). The DTB written by OpenSBI/BBL is quite small so this is
unlikely to hit important data, but if it does this can result in very
confusing and hard-to-debug crashes. Add a KASSERT() and a verbose print
to catch this problem with debug kernels.
While this will not print any output by default if it fails (that would
depend on EARLY_PRINTF), at least the kernel now halts reliably instead
of randomly crashing.
Reviewed By: mhorne
Differential Revision: https://reviews.freebsd.org/D25153
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.
With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.
See also https://github.com/riscv/opensbi/issues/169
Reviewed By: mhorne
Differential Revision: https://reviews.freebsd.org/D25151
The trampoline code used for loading gzipped a.out kernels on arm was
removed in r350436. A portion of this code allowed for DDB to find the
symbol tables when booting without loader(8), and some of this was
untouched in the removal. Remove it now.
Differential Revision: https://reviews.freebsd.org/D24950
This is in preparation for booting via loader(8). Lift these macros from arm64
so we don't need to worry about the size when inserting new elements. This
could have been done in r359673, but I didn't think I would be returning to
this function so soon.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D24910
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro. Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.
Prior commit message:
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults. It's just
an older incarnation of the now-more-common strlcpy().
Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.
Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).
Remove N redundant MI implementations of copystr. For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.
Reviewed by: jhb (earlier version)
Discussed with: brooks (thanks!)
Differential Revision: https://reviews.freebsd.org/D24672
When protecting a superpage, we would previously fall through to the
non-superpage case and read the contents of the superpage as PTEs,
potentially modifying them and trying to look up underlying VM pages that
don't exist if they happen to look like PTEs we would care about. This led
to nginx causing an unexpected page fault in pmap_protect that panic'ed the
kernel. Instead, if we see a superpage, we are done for this range and
should continue to the next.
Reviewed by: markj, jhb (mentor)
Approved by: markj, jhb (mentor)
Differential Revision: https://reviews.freebsd.org/D24827
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults. It's just
an older incarnation of the now-more-common strlcpy().
Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.
Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy.
Remove N redundant MI implementations of copystr. For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D24672
The addition of the HSM SBI extension to OpenSBI introduces a new
breaking change: secondary harts will remain parked in the firmware,
until they are brought up explicitly via sbi_hsm_hart_start(). Add
the call to do this, sending the secondary harts to mpentry.
If the HSM extension is not present, secondary harts are assumed to be
released by the firmware, as is the case for OpenSBI =< v0.6 and BBL.
In the case that the HSM call fails we exclude the CPU, notify the
user, and allow the system to proceed with booting.
Reviewed by: markj (older version)
Differential Revision: https://reviews.freebsd.org/D24497
APs enter the kernel at the same point as the BSP, the _start routine.
They then jump to mpentry, but not before storing the kernel's physical
load address in the s9 register. Extract this calculation into its own
routine, so that APs can be instructed to enter directly from mpentry.
Differential Revision: https://reviews.freebsd.org/D24495
Now that hw.machine_arch handles soft-float vs hard-float there is no
longer a reason for this config.
Submitted by: mhorne (kern.mk hunk)
Reviewed by: imp (earlier version), kp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24544
For userland, MACHINE_ARCH reflects the current ABI via preprocessor
directives. For the kernel, the hw.machine_arch sysctl uses the ELF
header flags of the current process to select the correct MACHINE_ARCH
value.
Reviewed by: imp, kp
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D24543
pmap_bootstrap() expects the kernel's physical load address, but we have
been providing the start of physical memory. This had the nice effect of
protecting the memory used by the SBI runtime firmware, but now that we
have alternate means of achieving that, we should provide the correct
value. This will free up any memory between the SBI firmware and the
kernel for allocation.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D24156
The device tree may contain a "reserved-memory" node, whose purpose is
to communicate sections of physical memory that should not be used for
general allocations. Add the logic to parse and exclude these regions.
The particular motivation for this is protection of the SBI runtime
firmware. Currently, there is no mechanism through which the SBI
can communicate the details of its reserved memory region(s) to
a supervisor payload. There has been some discussion recently on how
this can be achieved [1], and it seems that the path going forward
will be to add an entry to the reserved-memory node.
This hasn't caused any issues for us yet, since we exclude all physical
memory below the kernel's load address from being allocated, and on all
currently supported platforms this covers the SBI firmware region. This
will change in another commit, so as a safety measure, ensure that the
lowest 2MB of memory is excluded if this region has not been reported.
[1] https://github.com/riscv/riscv-sbi-doc/pull/37
Reviewed by: markj, nick (older version)
Differential Revision: https://reviews.freebsd.org/D24155
Replace our hand-rolled functions with the generic ones provided by
kern/subr_physmem.c. This greatly simplifies the initialization of
physical memory regions and kernel globals.
Tested by: nick
Differential Revision: https://reviews.freebsd.org/D24154
The location of the device-tree blob is passed to the kernel by the
previous booting stage (i.e. BBL or OpenSBI). Currently, we leave it
untouched and mark the 1MB of memory holding it as unavailable.
Instead, do what is done by other fake_preload_metadata() routines and
copy to the DTB to KVA space. This is more in line with what loader(8)
will provide us in the future, and it allows us to reclaim the hole in
physical memory.
Reviewed by: markj, kp (earlier version)
Differential Revision: https://reviews.freebsd.org/D24152
The only way to flush the local hart's icache is with a FENCE.I (or an
equivalent SBI call); a normal FENCE is insufficient and, for the
single-hart case, unnecessary.
Reviewed by: jhb (mentor), markj
Approved by: jhb (mentor), markj
Differential Revision: https://reviews.freebsd.org/D24317