Commit Graph

174 Commits

Author SHA1 Message Date
Eric van Gyzen
984969cd96 Fix reporting of SS_ONSTACK
Fix reporting of SS_ONSTACK in nested signal delivery when sigaltstack()
is used on some architectures.

Add a unit test for this.  I tested the test by introducing the bug
on amd64.  I did not test it on other architectures.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18347
2018-11-30 22:44:33 +00:00
Eric van Gyzen
4d5a108409 Prevent kernel stack disclosure in signal delivery
On arm64 and riscv platforms, sendsig() failed to zero the signal
frame before copying it out to userspace.  Zero it.

On arm, I believe all the contents of the frame were initialized,
so there was no disclosure.  However, explicitly zero the whole frame
because that fact could inadvertently change in the future,
it's more clear to the reader, and I could be wrong in the first place.

MFC after:	2 days
Security:	similar to FreeBSD-EN-18:12.mem and CVE-2018-17155
Sponsored by:	Dell EMC Isilon
2018-11-26 20:52:53 +00:00
Mark Johnston
6f8ba91638 RISC-V: Implement get_cyclecount(9).
Add the missing implementation for get_cyclecount(9) on RISC-V by
reading the cycle CSR.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17953
2018-11-13 18:20:27 +00:00
Mark Johnston
1e2ceeb16a RISC-V: Add macros for reading performance counter CSRs.
The RISC-V spec defines several performance counter CSRs such as: cycle,
time, instret, hpmcounter(3...31).  They are defined to be 64-bits wide
on all RISC-V architectures.  On RV64 and RV128 they can be read from a
single CSR.  On RV32, additional CSRs (given the suffix "h") are present
which contain the upper 32 bits of these counters, and must be read as
well.  (See section 2.8 in the User ISA Spec for full details.)

This change adds macros for reading these values safely on any RISC-V
ISA length.  Obviously we aren't supporting anything other than RV64
at the moment, but this ensures we won't need to change how we read
these values if we ever do.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17952
2018-11-13 18:12:06 +00:00
John Baldwin
c5e797a836 Drop the legacy ELF brandinfo for the old rtld from arm64 and riscv.
These architectures never shipped binaries with an rtld path of
/usr/libexec/ld-elf.so.1.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17876
2018-11-07 18:28:55 +00:00
John Baldwin
274c0a806a Enable use of a global shared page for RISC-V.
machine/vmparam.h already defines the SHAREDPAGE constant.  This
change just enables it for ELF executables.  The only use of the
shared page currently is to hold the signal trampoline.

Reviewed by:	markj, kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17875
2018-11-07 18:27:43 +00:00
John Baldwin
4cbbb74888 Add a KPI for the delay while spinning on a spin lock.
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI.  Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms.  However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.

Reviewed by:	kib
MFC after:	3 days
2018-11-05 21:34:17 +00:00
John Baldwin
ff9738d954 Rework setting PTE_D for kernel mappings.
Rather than unconditionally setting PTE_D for all writeable kernel
mappings, set PTE_D for writable mappings of unmanaged pages (whether
user or kernel).  This matches what amd64 does and also matches what
the RISC-V spec suggests (preset the A and D bits on mappings where
the OS doesn't care about the state).

Suggested by:	alc
Reviewed by:	alc, markj
Sponsored by:	DARPA
2018-11-05 20:00:36 +00:00
John Baldwin
d198cb6d83 Restrict setting PTE execute permissions on RISC-V.
Previously, RISC-V was enabling execute permissions in PTEs for any
readable page.  Now, execute permissions are only enabled if they were
explicitly specified (e.g. via PROT_EXEC to mmap).  The one exception
is that the initial kernel mapping in locore still maps all of the
kernel RWX.

While here, change the fault type passed to vm_fault and
pmap_fault_fixup to only include a single VM_PROT_* value representing
the faulting access to match other architectures rather than passing a
bitmask.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17783
2018-11-01 22:23:15 +00:00
John Baldwin
6f888020df Set PTE_A and PTE_D for user mappings in pmap_enter().
This assumes that an access according to the prot in 'flags' triggered
a fault and is going to be retried after the fault returns, so the two
flags are set preemptively to avoid refaulting on the retry.

While here, only bother setting PTE_D for kernel mappings in pmap_enter
for writable mappings.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17782
2018-11-01 22:17:51 +00:00
John Baldwin
a751b25546 SBI calls expect a pointer to a u_long rather than a pointer.
This is just cosmetic.

A weirder issue is that the SBI doc claims the hart mask pointer should
be a physical address, not a virtual address.  However, the implementation
in bbl seems to just dereference the address directly.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17781
2018-11-01 22:15:25 +00:00
John Baldwin
344adeab18 Don't allow debuggers to modify SSTATUS, only to read it.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17771
2018-11-01 22:13:22 +00:00
John Baldwin
ada1ceef0b Implement ptrace_set_pc() and fail PT_*STEP requests explicitly.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17769
2018-11-01 22:11:26 +00:00
Kyle Evans
be352d20d5 Compile in VERBOSE_SYSINIT support by default, remain silent by default
The loader tunable 'debug.verbose_sysinit' may be used to toggle verbosity.
This is added to the debugging section of these kernconfs to be turned off
in stable branches for clarity of intent.

MFC after:	never
2018-10-31 22:38:19 +00:00
Ruslan Bukin
b7b391934d o Add pmap lock around pmap_fault_fixup() to ensure other thread will not
modify l3 pte after we loaded old value and before we stored new value.
o Preset A(accessed), D(dirty) bits for kernel mappings.

Reported by:	kib
Reviewed by:	markj
Discussed with:	jhb
Sponsored by:	DARPA, AFRL
2018-10-26 12:27:07 +00:00
Brooks Davis
c3adaa3305 Consolidate identical ELF auxargs type defintions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
Ruslan Bukin
b977d81946 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:25:07 +00:00
Ruslan Bukin
3c8efd61f5 Revert r339421 due to unintended files included to commit.
Reported by:	ian
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-10-18 15:17:58 +00:00
Ruslan Bukin
53c6ad1d62 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:08:14 +00:00
Ruslan Bukin
94036a2587 Invalidate TLB on a local hart.
This was missed in r339367 ("Various fixes for TLB management on RISC-V.").

This fixes operation on lowRISC.

Reviewed by:	jhb
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17583
2018-10-16 16:03:17 +00:00
John Baldwin
73efa2fbd1 Various fixes for TLB management on RISC-V.
- Remove the arm64-specific cpu_*cache* and cpu_tlb_flush* functions.
  Instead, add RISC-V specific inline functions in cpufunc.h for the
  fence.i and sfence.vma instructions.
- Catch up to changes in the arm64 pmap and remove all the cpu_dcache_*
  calls, pmap_is_current, pmap_l3_valid_cacheable, and PTE_NEXT bits from
  pmap.
- Remove references to the unimplemented riscv_setttb().
- Remove unused cpu_nullop.
- Add a link to the SBI doc to sbi.h.
- Add support for a 4th argument in SBI calls.  It's not documented but
  it seems implied for the asid argument to SBI_REMOVE_SFENCE_VMA_ASID.
- Pass the arguments from sbi_remote_sfence*() to the SEE.  BBL ignores
  them so this is just cosmetic.
- Flush icaches on other CPUs when they resume from kdb in case the
  debugger wrote any breakpoints while the CPUs were paused in the IPI_STOP
  handler.
- Add SMP vs UP versions of pmap_invalidate_* similar to amd64.  The
  UP versions just use simple fences.  The SMP versions use the
  sbi_remove_sfence*() functions to perform TLB shootdowns.  Since we
  don't have a valid pm_active field in the riscv pmap, just IPI all
  CPUs for all invalidations for now.
- Remove an extraneous TLB flush from the end of pmap_bootstrap().
- Don't do a TLB flush when writing new mappings in pmap_enter(), only if
  modifying an existing mapping.  Note that for COW faults a TLB flush is
  only performed after explicitly clearing the old mapping as is done in
  other pmaps.
- Sync the i-cache on all harts before updating the PTE for executable
  mappings in pmap_enter and pmap_enter_quick.  Previously the i-cache was
  only sync'd after updating the PTE in pmap_enter.
- Use sbi_remote_fence() instead of smp_rendezvous in pmap_sync_icache().

Reviewed by:	markj
Approved by:	re (gjb, kib)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17414
2018-10-15 18:56:54 +00:00
Ruslan Bukin
05efeb8430 Initialize interrupt priority to 0 on all sources.
Without this hardware raises an interrupt regardless of any
pending bits set.

This fixes operation on RocketChip and derivatives (e.g. lowRISC).

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:51:41 +00:00
Ruslan Bukin
053ec0508e Add support for the UART device found in lowRISC system-on-a-chip.
The only source of documentation for this device is verilog,
so driver is minimalistic.

Reviewed by:	Dr Jonathan Kimmitt <jrrk2@cam.ac.uk>
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:19:41 +00:00
John Baldwin
7a102e0463 Implement pmap_sync_icache().
This invokes "fence" on the hart performing the write followed by an IPI
to execute "fence.i" on all harts.

This is required to support userland debuggers setting breakpoints in
user processes.

Reviewed by:	br (earlier version), markj
Approved by:	re (gjb)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17139
2018-09-24 17:41:29 +00:00
John Baldwin
232d0b87e0 Various fixes for floating point on RISC-V.
- Explicitly load an empty initial state into FP registers when taking
  the fault on the first FP instruction in a thread.  Setting
  SSTATE.FS to INITIAL is just a marker to let context switch restore
  code know that it can load FP registers with zeroes instead of
  memory loads.  It does not imply that the hardware will reset all
  registers to zero on first access.  In addition, set the state to
  CLEAN instead of INITIAL after the first FP instruction.
  cpu_switch() doesn't do anything for INITIAL and only restores from
  the pcb if the state is CLEAN.  We could perhaps change cpu_switch
  to call fpe_state_clear if the state was INITIAL and leave SSTATE.FS
  set to INITIAL instead of CLEAN after the first FP instruction.
  However, adding this complexity to cpu_switch() doesn't seem worth
  the supposed gain.
- Only save the current FPU registers in fill_fpregs() if the request
  is made to save the current thread's registers.  Previously if a
  debugger requested FP registers via ptrace() it was getting a copy
  of the debugger's FP registers rather than the debugee's.
- Zero the entire FP register set structure returned for ptrace() if a
  thread hasn't used FP registers rather than leaking garbage in the
  fp_fcsr field.
- If a debugger writes FP registers via ptrace(), always mark the pcb
  as having valid FP registers and set SSTATUS.FS_MASK to CLEAN so
  that the registers will be restored when the debugged thread
  resumes.
- Be more explicit about clearing the SSTATUS.FS field before setting
  it to CLEAN on the first FP instruction trap.

Submitted by:	br, markj
Approved by:	re (rgrimes)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17141
2018-09-19 23:45:18 +00:00
Ruslan Bukin
bd528a398e Enable VIMAGE support for RISC-V.
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:13:54 +00:00
Ruslan Bukin
752a8ea48e Use elf_relocaddr() to find the address for R_RISCV_RELATIVE
relocation.

elf_relocaddr() has a hook to handle VIMAGE data addresses.

This fixes VIMAGE support for RISC-V when built as a module.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:12:34 +00:00
Ruslan Bukin
157654d0c4 Permit supervisor to access user VA space for certain functions only.
This is done by setting SUM (permit Supervisor User Memory access)
bit in sstatus register.

The functions we allow access for are routines in assembly that
explicitly handle crossing the user kernel boundary.

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 11:34:58 +00:00
Ruslan Bukin
93952a8b1b Fix bug: compare uaddr to VM_MAXUSER_ADDRESS, not to a tmp value
left by SET_FAULT_HANDLER().

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 09:53:55 +00:00
Ruslan Bukin
378a495661 Add support for 'C'-compressed ISA extension to DTrace FBT provider.
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-03 14:34:09 +00:00
Ruslan Bukin
0f669630ac Fix an integer overflow while setting the kernel address (MODINFO_ADDR).
This eliminates build warning and makes kldstat happy.

Approved by:	re (marius)
2018-08-31 16:15:46 +00:00
Konstantin Belousov
f0165b1ca6 Remove {max/min}_offset() macros, use vm_map_{max/min}() inlines.
Exposing max_offset and min_offset defines in public headers is
causing clashes with variable names, for example when building QEMU.

Based on the submission by:	royger
Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation (kib)
MFC after:	1 week
Approved by:	re (marius)
Differential revision:	https://reviews.freebsd.org/D16881
2018-08-29 12:24:19 +00:00
Mark Johnston
36716fe2e6 Prepare the kernel linker to handle PC-relative ifunc relocations.
The boot-time ifunc resolver assumes that it only needs to apply
IRELATIVE relocations to PLT entries.  With an upcoming optimization,
this assumption no longer holds, so add the support required to handle
PC-relative relocations targeting GNU_IFUNC symbols.
- Provide a custom symbol lookup routine that can be used in early boot.
  The default lookup routine uses kobj, which is not functional at that
  point.
- Apply all existing relocations during boot rather than filtering
  IRELATIVE relocations.
- Ensure that we continue to apply ifunc relocations in a second pass
  when loading a kernel module.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16749
2018-08-22 20:44:30 +00:00
Alan Cox
83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
John Baldwin
8cd385fda0 Make 'device crypto' lines more consistent.
- In configurations with a pseudo devices section, move 'device crypto'
  into that section.
- Use a consistent comment.  Note that other things common in kernel
  configs such as GELI also require 'device crypto', not just IPSEC.

Reviewed by:	rgrimes, cem, imp
Differential Revision:	https://reviews.freebsd.org/D16775
2018-08-18 20:32:08 +00:00
Conrad Meyer
08d77c0178 Riscv: Include crypto for IPSec
Similar to r337944.  I think this is the last configuration that includes IPsec
but not crypto.
2018-08-17 01:08:22 +00:00
Ruslan Bukin
9aa2d5e4fa Remove unused code.
Sponsored by:	DARPA, AFRL
2018-08-14 16:22:14 +00:00
Ruslan Bukin
2cfd37def0 Rewrite RISC-V disassembler:
- Use macroses from encoding.h generated by riscv-opcodes.
- Add support for C-compressed ISA extension.

Sponsored by:	DARPA, AFRL
2018-08-14 16:03:03 +00:00
Ruslan Bukin
c1d0e057d8 Add RISC-V instructions encoding.
This is the output of
$ cat opcodes opcodes-rvc-pseudo opcodes-rvc opcodes-custom |
    ./parse-opcodes -c

It is confirmed by author that the output of parse-opcodes is
in the public domain.

This will be required for DDB disassembler.

Discussed with: Andrew Waterman <waterman@eecs.berkeley.edu>
Obtained from:	https://github.com/riscv/riscv-opcodes
Sponsored by:	DARPA, AFRL
2018-08-13 16:07:18 +00:00
Ruslan Bukin
6371d0bd64 Implement uma_small_alloc(), uma_small_free().
Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16628
2018-08-08 16:08:38 +00:00
Marius Strobl
13a10f3414 Implement atomic_swap_{int,long,ptr}(9). 2018-08-07 18:56:51 +00:00
Ruslan Bukin
c50c8f642c Return ENAMETOOLONG if the latest copied character
is not null terminator.

Sponsored by:	DARPA, AFRL
2018-08-03 16:44:56 +00:00
Ruslan Bukin
385a185b43 Don't overwrite tp in set_mcontext().
This makes libthr/swapcontext_test:swapcontext1 happy.

Sponsored by:	DARPA, AFRL
2018-08-02 12:13:52 +00:00
Ruslan Bukin
84154f4b9e o Don't overwrite tp in fork_trampoline().
o Save and restore tp in cpu_switch().
o Restore tp in cpu_throw().
o Save tp in savectx().

This makes libthr tests happy. In particular fork_test:fork.

Sponsored by:	DARPA, AFRL
2018-08-02 12:12:13 +00:00
Ruslan Bukin
7bb4a84ad3 o Correctly set user tls base: consider TP_OFFSET.
o Ensure tp (thread pointer) saved before copying the pcb.

Sponsored by:	DARPA, AFRL
2018-08-02 12:08:52 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Ruslan Bukin
a304bc9729 Disable VIMAGE on RISC-V.
Similar to r326179 ("Temporarily disable VIMAGE on arm64") creation of
if_lagg or epair on RISC-V results a kernel panic.

Sponsored by:	DARPA, AFRL
2018-07-30 12:22:49 +00:00
Ruslan Bukin
b51092c7ec Use SPP (Supervisor Previous Privilege) bit in the sstatus
register to determine if trap is from userspace.

Otherwise if we jump to kernel address from userspace, then
TRAPF_USERMODE failed to detect usermode and then do_ast
triggers a panic "ast in kernel mode".

Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16469
2018-07-27 16:13:06 +00:00
Mark Johnston
a3055a5e47 Implement pmap_mincore() for riscv.
Reviewed by:	alc, br
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16444
2018-07-26 16:08:26 +00:00
Ruslan Bukin
87f9acc9e1 Remove unused string.
Reported by:	markj@
Sponsored by:	DARPA, AFRL
2018-07-25 15:44:49 +00:00