This is a follow up to r356481. In locore.S, before virtual memory is
set up, we should avoid using indirect address lookups through the GOT.
Therefore we need to convert uses of the la instruction to lla, which
always generates an auipc/addi pair of instructions. This conversion was
done for the BSP case, but not the AP case, resulting in a fault
somewhere before mpva and a failure to bring APs online.
Reported by: lwhsu
Reviewed by: lwhsu, jrtc27 (accepted in a comment)
Differential Revision: https://reviews.freebsd.org/D23138
lld on RISC-V is not yet able to handle undefined weak symbols for
non-PIC code in the code model (medany/medium) used by the RISC-V
kernel.
Both GCC and clang emit an auipc / addi pair of instructions to
generate an address relative to the current PC with a 31-bit offset.
Undefined weak symbols need to have an address of 0, but the kernel
runs with PC values much greater than 2^31, so there is no way to
construct a NULL pointer as a PC-relative value. The bfd linker
rewrites the instruction pair to use lui / addi with values of 0 to
force a NULL pointer address. (There are similar cases for 'ld'
becoming auipc / ld that bfd rewrites to lui / ld with an address of
0.)
To work around this, compile the kernel with -fPIE when using lld.
This does not make the kernel position-independent, but it does
force the compiler to indirect address lookups through GOT entries
(so auipc / ld against a GOT entry to fetch the address). This
adds extra memory indirections for global symbols, so should be
disabled once lld is finally fixed.
A few 'la' instructions in locore that depend on PC-relative
addressing to load physical addresses before paging is enabled have to
use auipc / addi and not indirect via GOT entries, so change those to
use 'lla' which always uses auipc / addi for both PIC and non-PIC.
Submitted by: jrtc27
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D23064
Implement support for the UART as found on the SiFive FU540. It should
also work on, but has not been tested with, the FU310.
Reviewed by: philip
Sponsored by: Axiado
There's no point in checking for absent CPUs if we're not going to do anything
about either the present or absent case. This loop can just be removed.
Reviewed by: philip
Sponsored by: Axiado
With WITNESS enabled we see the following warning:
lock order reversal: (sleepable after non-sleepable)
1st 0xffffffd0847c7210 fu540spi0 (fu540spi0) @
/usr/home/kp/axiado/hornet-freebsd/src/sys/riscv/sifive/fu540_spi.c:297
2nd 0xffffffc00372bb30 Clock topology lock (Clock topology lock) @
/usr/home/kp/axiado/hornet-freebsd/src/sys/dev/extres/clk/clk.c:1137
stack backtrace:
#0 0xffffffc0002a579e at witness_checkorder+0xb72
#1 0xffffffc0002a5556 at witness_checkorder+0x92a
#2 0xffffffc000254c7a at _sx_slock_int+0x66
#3 0xffffffc00025537a at _sx_slock+0x8
#4 0xffffffc000123022 at clk_get_freq+0x38
#5 0xffffffc0005463e4 at __clzdi2+0x2bb8
#6 0xffffffc00014af58 at randomdev_getkey+0x76e
#7 0xffffffc0001278b0 at simplebus_add_device+0x7ee
#8 0xffffffc00027c9a8 at device_attach+0x2e6
#9 0xffffffc00027c634 at device_probe_and_attach+0x7a
#10 0xffffffc00027d76a at bus_generic_attach+0x10
#11 0xffffffc00014aab0 at randomdev_getkey+0x2c6
#12 0xffffffc00027c9a8 at device_attach+0x2e6
#13 0xffffffc00027c634 at device_probe_and_attach+0x7a
#14 0xffffffc00027d76a at bus_generic_attach+0x10
#15 0xffffffc000278bd2 at config_intrhook_oneshot+0x52
#16 0xffffffc000278b3e at config_intrhook_establish+0x146
#17 0xffffffc000278cf2 at config_intrhook_disestablish+0xfe
The clock topology lock can sleep, which means we cannot attempt to
acquire it while holding the non-sleepable mutex.
Fix that by retrieving the clock speed once, during attach and not every
time during SPI transaction setup.
Submitted by: kp
Sponsored by: Axiado
Due to clang and LLD's tendency to use a PLT for builtins, and as they
don't have full support for EABI, we sometimes have to deal with a PLT in
.ko files in a clang-built kernel.
As such, augment the in-kernel linker to support jump table processing.
As there is no particular reason to support lazy binding in kernel modules,
only implement Secure-PLT immediate binding.
As part of these changes, add elf_cpu_parse_dynamic() to the MD API of the
in-kernel linker (except on platforms that use raw object files.)
The new function will allow MD code to act on MD tags in _DYNAMIC.
Use this new function in the PowerPC MD code to ensure BSS-PLT modules using
PLT will be rejected during insertion, and to poison the runtime resolver to
ensure we get a clear panic reason if a call is made to the resolver.
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D22608
off the stack, initialized to default values, and then filled in with
driver-specific values, all without having to worry about the numerous
other fields in the tag. The resulting template is then passed into
busdma and the normal opaque tag object created. See the man page for
details on how to initialize a template.
Templates do not support tag filters. Filters have been broken for many
years, and only existed for an ancient make/model of hardware that had a
quirky DMA engine. Instead of breaking the ABI/API and changing the
arugment signature of bus_dma_tag_create() to remove the filter arguments,
templates allow us to ignore them, and also significantly reduce the
complexity of creating and managing tags.
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D22906
The SiFive FU540 Power Reset Clocking Interrupt block contains a PLL
that turns the input crystal (33.3MHz) into a 1-1.5GHz clock.
This clock in turn is divided by two to produce the tlclk, which is fed
into devices such as the SPI and I2C controllers.
Register a new clock device for the PRCI so that those devices can
read the correct clock through the clk framework.
Submitted by: kp
Sponsored by: Axiado
fix an assert violation introduced in r355784. Without this spinlock_exit()
may see owepreempt and switch before reducing the spinlock count. amd64
had been optimized to do a single critical enter/exit regardless of the
number of spinlocks which avoided the problem and this optimization had
not been applied elsewhere.
Reported by: emaste
Suggested by: rlibby
Discussed with: jhb, rlibby
Tested by: manu (arm64)
This is a 32-bit structure embedded in each vm_page, consisting mostly
of page queue state. The use of a structure makes it easy to store a
snapshot of a page's queue state in a stack variable and use cmpset
loops to update that state without requiring the page lock.
This change merely adds the structure and updates references to atomic
state fields. No functional change intended.
Reviewed by: alc, jeff, kib
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D22650
o Remove All Rights Reserved from my notices
o imp@FreeBSD.org everywhere
o regularize punctiation, eliminate date ranges
o Make sure that it's clear that I don't claim All Rights reserved by listing
All Rights Reserved on same line as other copyright holders (but not
me). Other such holders are also listed last where it's clear.
- Use ustringp for the location of the argv and environment strings
and allow destp to travel further down the stack for the stackgap
and auxv regions.
- Update the Linux copyout_strings variants to move destp down the
stack as was done for the native ABIs in r263349.
- Stop allocating a space for a stack gap in the Linux ABIs. This
used to hold translated system call arguments, but hasn't been used
since r159992.
Reviewed by: kib
Tested on: md64 (amd64, i386, linux64), i386 (i386, linux)
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22501
RISC-V inherited this code from arm64, so implement the fix from r354712.
See the revision for the full description.
Submitted by: kevans (arm64 version)
Change the FreeBSD ELF ABIs to use this new hook to copyout ELF auxv
instead of doing it in the sv_fixup hook. In particular, this new
hook allows the stack space to be allocated at the same time the auxv
values are copied out to userland. This allows us to avoid wasting
space for unused auxv entries as well as not having to recalculate
where the auxv vector is by walking back up over the argv and
environment vectors.
Reviewed by: brooks, emaste
Tested on: amd64 (amd64 and i386 binaries), i386, mips, mips64
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22355
SBI version 0.2 introduces functions for obtaining the details of the
SBI implementation, such as version and implemntation ID. Print this
info at startup when it is available.
Reviewed by: jhb, kp
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22327
The Supervisor Binary Interface (SBI) specification v0.2 is a backwards
incompatible update to the SBI call interface for kernels running in
supervisor mode. The goal of this update was to make it easier for new
and optional functionality to be added to the SBI.
SBI functions are now called by passing an "extension ID" and a
"function ID" which are passed in a7 and a6 respectively. SBI calls
will also return an error and value in the following struct:
struct sbi_ret {
long error;
long value;
}
This version introduces several new functions under the "base"
extension. It is expected that all SBI implementations >= 0.2 will
support this base set of functions, as they implement some essential
services such as obtaining the SBI version, CPU implementation info, and
extension probing.
Existing SBI functions have been designated as "legacy". For the time
being they will remain implemented, but it is expected that in the
future their functionality will be duplicated or replaced by new SBI
extensions. Each legacy function has been assigned its own extension ID,
and for now we simply probe and assert for their existence.
Compatibility with legacy SBI implementations (such as BBL) is
maintained by checking the output of sbi_get_spec_version(). This
function is guaranteed to succeed by the new spec, but will return an
error in legacy implementations. We use this as an indicator of whether
or not we can rely on the new SBI base extensions.
For further info on the Supervisor Binary Interface, see:
https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc
Reviewed by: kp, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22326
Allow for an additional argument to sbi_call which will be passed in a6.
This is required for SBI spec 0.2 support, as a6 will indicate the SBI
function ID.
While here, introduce some macros to clean up the calls.
Reviewed by: kp, jhb
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D22325
Our PLIC implementation only enables interrupts on the boot cpu.
Implement plic_bind_intr() so that they can be redistributed near the
end of boot during intr_irq_shuffle().
This also slightly modifies how enable bits are handled in an attempt to
better fit the PIC interface. plic_enable_intr()/plic_disable_intr() are
converted to manage an interrupt source's threshold value, since this
value can be used as to globally enable/disable an irq. All handing of the
per-context enable bits is moved to the new methods plic_setup_intr()
and plic_bind_intr().
Reviewed by: br
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21928
The RISC-V PLIC (platform level interrupt controller) registers are divided up
by "context", which is purposefully left ambiguous in the PLIC spec. Currently
we assume each CPU number corresponds 1-to-1 with a context number, but that is
not correct. Most existing PLIC implementations (such as SiFive's) have
multiple contexts per-cpu. For example, a single CPU might have a context for
machine mode interrupts and a context for supervisor mode interrupts. To
complicate things further, FreeBSD renumbers the CPUs during boot, but the PLIC
driver still assumes that CPU ID equals the RISC-V hart number, meaning
interrupt enables/claims might be performed for the wrong context registers.
To fix this, we must calculate each CPU's context number during
attachment. This is done by reading the interrupt properties from the
device tree, from which a mapping from context to RISC-V hart to CPU
number can be created.
Reviewed by: br
MFC after: 3 weeks
Differential Revision: https://reviews.freebsd.org/D21927
This option is causing boot to fail for the Hifive Unleashed and older
versions of QEMU (3.1.1). Remove it from the GENERIC config for now.
Reported by: br
MFC after: 1 week
The fill_elf_hwcap() function expects to find only cpu nodes under the
/cpus entry of the device tree. Newer versions of QEMU insert a cpu-map
node which describes the CPU topology, breaking this function. To fix
this, simply skip any non-cpu entries.
Reviewed by: markj, kp, jhb
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D22151
The lr.w instruction used to read the value from memory sign-extends
the value read from memory. GCC sign-extends the 32-bit comparison
value passed in whereas clang currently does not. As a result, if the
value being compared has the MSB set, the comparison fails for
matching 32-bit values when compiled with clang.
Use a cast to explicitly sign-extend the unsigned comparison value.
This works with both GCC and clang.
There is commentary in the RISC-V spec that suggests that GCC's
approach is more correct, but it is not clear if the commentary in the
RISC-V spec is binding.
Reviewed by: mhorne
Obtained from: Axiado
MFC after: 2 weeks
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D22084
- td_kstack_pages was not being initialized.
- td_kstack is supposed to be the base address of the stack region,
not the top.
The arm ports seem to have similar problems and will be fixed next.
Reported by: Jenkins via lwhsu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
After r352110 the page lock no longer protects a page's identity, so
there is no purpose in locking the page in pmap_mincore(). Instead,
if vm.mincore_mapped is set to the non-default value of 0, re-lookup
the page after acquiring its object lock, which holds the page's
identity stable.
The change removes the last callers of vm_page_pa_tryrelock(), so
remove it.
Reviewed by: kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21823
callers hold it.
This simplifies pmap code and removes a dependency on the object lock.
Reviewed by: kib, markj
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D21596
busy acquires while held.
This allows code that would need to acquire and release a very large number
of page busy locks to use the old mechanism where busy is only checked and
not held. This comes at the cost of false positives but never false
negatives which the single consumer, vm_fault_soft_fast(), handles.
Reviewed by: kib
Tested by: pho
Sponsored by: Netflix, Intel
Differential Revision: https://reviews.freebsd.org/D21592
be called on bootverbose. Do so on RISV-V too.
Submitted by: Nicholas O'Brien <nickisobrien_gmail.com>
Reviewed by: imp, kp
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D21998
the static device mappings.
While RISC-V support was added to subr_devmap.c in r298631, it was never
actually initialised in the machine dependent code.
Submitted by: Nicholas O'Brien <nickisobrien_gmail.com>
Reviewed by: br, kp
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D21975
During a CoW fault, we must check for both 4KB and 2MB mappings before
clearing PGA_WRITEABLE on the old mapping's page. Previously we were
only checking for 4KB mappings. This was missed in r344106.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
pmap_remove_l3() may remove the last mapping of a page, in which case
it must clear PGA_WRITEABLE.
Reported by: Jenkins, via lwhsu
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
We must also check for large mappings. pmap_page_is_mapped() is
mostly used in assertions, so the problem was not very noticeable.
Reviewed by: alc
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21824
Centralize calculation of signal and ucode delivered on unhandled page
fault in new function vm_fault_trap(). MD trap_pfault() now almost
always uses the signal numbers and error codes calculated in
consistent MI way.
This introduces the protection fault compatibility sysctls to all
non-x86 architectures which did not have that bug, but apparently they
were already much more wrong in selecting delivered signals on
protection violations.
Change the delivered signal for accesses to mapped area after the
backing object was truncated. According to POSIX description for
mmap(2):
The system shall always zero-fill any partial page at the end of an
object. Further, the system shall never write out any modified
portions of the last page of an object which are beyond its
end. References within the address range starting at pa and
continuing for len bytes to whole pages following the end of an
object shall result in delivery of a SIGBUS signal.
An implementation may generate SIGBUS signals when a reference
would cause an error in the mapped object, such as out-of-space
condition.
Adjust according to the description, keeping the existing
compatibility code for SIGSEGV/SIGBUS on protection failures.
For situations where kernel cannot handle page fault due to resource
limit enforcement, SIGBUS with a new error code BUS_OBJERR is
delivered. Also, provide a new error code SEGV_PKUERR for SIGSEGV on
amd64 due to protection key access violation.
vm_fault_hold() is renamed to vm_fault(). Fixed some nits in
trap_pfault()s like mis-interpreting Mach errors as errnos. Removed
unneeded truncations of the fault addresses reported by hardware.
PR: 211924
Reviewed by: alc
Discussed with: jilles, markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D21566
In a few cases, the symbol lookup is missing before attempting to
perform the relocation. While the relocation types affected are
currently unused, this results in an uninitialized variable warning,
that is escalated to an error when building with clang.
Reviewed by: markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21773
Fix some style(9) violations.
This also changes the name of the machine-dependent sysctl kern.debug_kld to
debug.kld_reloc, and changes its type from int to bool. This is acceptable
since we are not currently concerned with preserving the RISC-V ABI.
Reviewed by: markj, kp
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D21772
Convert all remaining references to that field to "ref_count" and update
comments accordingly. No functional change intended.
Reviewed by: alc, kib
Sponsored by: Intel, Netflix
Differential Revision: https://reviews.freebsd.org/D21768
The EARLY_AP_STARTUP option initializes non-boot processors
much sooner during startup. This adds support for this option
on RISC-V and enables it by default for GENERIC.
Reviewed by: jhb, markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D21661
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.
Reported by: alc
Reviewed by: alc, kib
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D21639