Commit Graph

467 Commits

Author SHA1 Message Date
Kyle Evans
29a5f63951 riscv: use the common sub-word {,f}cmpset implementation
Reviewed by:	mhorne
Differential Revision:	https://reviews.freebsd.org/D21888
2019-10-06 01:35:31 +00:00
Mark Johnston
d4586dd328 Implement pmap_page_is_mapped() correctly on arm64 and riscv.
We must also check for large mappings.  pmap_page_is_mapped() is
mostly used in assertions, so the problem was not very noticeable.

Reviewed by:	alc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21824
2019-09-27 23:37:01 +00:00
Konstantin Belousov
df08823d07 Improve MD page fault handlers.
Centralize calculation of signal and ucode delivered on unhandled page
fault in new function vm_fault_trap().  MD trap_pfault() now almost
always uses the signal numbers and error codes calculated in
consistent MI way.

This introduces the protection fault compatibility sysctls to all
non-x86 architectures which did not have that bug, but apparently they
were already much more wrong in selecting delivered signals on
protection violations.

Change the delivered signal for accesses to mapped area after the
backing object was truncated.  According to POSIX description for
mmap(2):
   The system shall always zero-fill any partial page at the end of an
   object. Further, the system shall never write out any modified
   portions of the last page of an object which are beyond its
   end. References within the address range starting at pa and
   continuing for len bytes to whole pages following the end of an
   object shall result in delivery of a SIGBUS signal.

   An implementation may generate SIGBUS signals when a reference
   would cause an error in the mapped object, such as out-of-space
   condition.
Adjust according to the description, keeping the existing
compatibility code for SIGSEGV/SIGBUS on protection failures.

For situations where kernel cannot handle page fault due to resource
limit enforcement, SIGBUS with a new error code BUS_OBJERR is
delivered.  Also, provide a new error code SEGV_PKUERR for SIGSEGV on
amd64 due to protection key access violation.

vm_fault_hold() is renamed to vm_fault().  Fixed some nits in
trap_pfault()s like mis-interpreting Mach errors as errnos.  Removed
unneeded truncations of the fault addresses reported by hardware.

PR:	211924
Reviewed by:	alc
Discussed with:	jilles, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21566
2019-09-27 18:43:36 +00:00
Mitchell Horne
c81e8f9891 Fix some broken relocation handling
In a few cases, the symbol lookup is missing before attempting to
perform the relocation. While the relocation types affected are
currently unused, this results in an uninitialized variable warning,
that is escalated to an error when building with clang.

Reviewed by:	markj
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D21773
2019-09-26 00:58:47 +00:00
Mitchell Horne
8b86850749 Cleanup of elf_machdep.c
Fix some style(9) violations.

This also changes the name of the machine-dependent sysctl kern.debug_kld to
debug.kld_reloc, and changes its type from int to bool. This is acceptable
since we are not currently concerned with preserving the RISC-V ABI.

Reviewed by:	markj, kp
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D21772
2019-09-26 00:54:07 +00:00
Mark Johnston
b119329d81 Complete the removal of the "wire_count" field from struct vm_page.
Convert all remaining references to that field to "ref_count" and update
comments accordingly.  No functional change intended.

Reviewed by:	alc, kib
Sponsored by:	Intel, Netflix
Differential Revision:	https://reviews.freebsd.org/D21768
2019-09-25 16:11:35 +00:00
Mitchell Horne
b698d9178d RISC-V: Support EARLY_AP_STARTUP
The EARLY_AP_STARTUP option initializes non-boot processors
much sooner during startup. This adds support for this option
on RISC-V and enables it by default for GENERIC.

Reviewed by:	jhb, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21661
2019-09-16 22:17:16 +00:00
Mark Johnston
e8bcf6966b Revert r352406, which contained changes I didn't intend to commit. 2019-09-16 15:04:45 +00:00
Mark Johnston
41fd4b9422 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:03:12 +00:00
Konstantin Belousov
6e7abad2fa riscv trap_pfault: remove unneeded hold of the process around vm_fault() call.
This is re-appearance of the nop code removed from other arches in r287625.

Reviewed by:	alc (as part of the larger patch), markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
DIfferential revision:	https://reviews.freebsd.org/D21645
2019-09-13 20:17:14 +00:00
Kristof Provost
2085ab1d5f riscv: Add missing header
r352218 missing an include statement, causing the build to fail.

Submitted by:	Nicholas O'Brien (nickisobrien_gmail.com)
Sponsored by:	Axiado
2019-09-11 18:07:15 +00:00
Kristof Provost
36fd137698 riscv: Small fix to CPU compatibility identification
fdt_is_compatible_strict() inspects the first compatible property.
We need to inspect the following properties for 'riscv'.
ofw_bus_node_is_compatible() does a recursive search.

This patch fixes "Can't find CPU" error message when bootverbose = true.

Submitted by:	Nicholas O'Brien (nickisobrien_gmail.com)
Reviewed by:	philip, kp
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D21576
2019-09-11 16:16:53 +00:00
Mark Johnston
fee2a2fa39 Change synchonization rules for vm_page reference counting.
There are several mechanisms by which a vm_page reference is held,
preventing the page from being freed back to the page allocator.  In
particular, holding the page's object lock is sufficient to prevent the
page from being freed; holding the busy lock or a wiring is sufficent as
well.  These references are protected by the page lock, which must
therefore be acquired for many per-page operations.  This results in
false sharing since the page locks are external to the vm_page
structures themselves and each lock protects multiple structures.

Transition to using an atomically updated per-page reference counter.
The object's reference is counted using a flag bit in the counter.  A
second flag bit is used to atomically block new references via
pmap_extract_and_hold() while removing managed mappings of a page.
Thus, the reference count of a page is guaranteed not to increase if the
page is unbusied, unmapped, and the object's write lock is held.  As
a consequence of this, the page lock no longer protects a page's
identity; operations which move pages between objects are now
synchronized solely by the objects' locks.

The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
requires that either the object lock or the busy lock is held.  The
latter no longer has a return value and may free the page if it releases
the last reference to that page.  vm_page_unwire_noq() behaves the same
as before; the caller is responsible for checking its return value and
freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
introduced for use in pmap_extract_and_hold().  It fails if the page is
concurrently being unmapped, typically triggering a fallback to the
fault handler.  vm_page_wire() no longer requires the page lock and
vm_page_unwire() now internally acquires the page lock when releasing
the last wiring of a page (since the page lock still protects a page's
queue state).  In particular, synchronization details are no longer
leaked into the caller.

The change excises the page lock from several frequently executed code
paths.  In particular, vm_object_terminate() no longer bounces between
page locks as it releases an object's pages, and direct I/O and
sendfile(SF_NOCACHE) completions no longer require the page lock.  In
these latter cases we now get linear scalability in the common scenario
where different threads are operating on different files.

__FreeBSD_version is bumped.  The DRM ports have been updated to
accomodate the KPI changes.

Reviewed by:	jeff (earlier version)
Tested by:	gallatin (earlier version), pho
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20486
2019-09-09 21:32:42 +00:00
Mitchell Horne
c7ac71f2ca Fix compilation of locore.S with clang
The branch from _start to mpentry has to cross a large section of data;
an offset larger than can be specified with a 12-bit branch immediate.
Fix this by converting the branch to an unconditional jump. The gcc
assembler does this conversion silently but it is not done automatically
by clang.

Reported by:	Jeremy Bennett <jeremy.bennett@embecosm.com>
Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D21437
2019-09-08 19:53:11 +00:00
Mitchell Horne
5b4f0f860e Remove a duplicate KTR entry
Reviewed by:	markj
Approved by:	markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D21438
2019-09-08 19:46:34 +00:00
Philip Paeps
bdc786cc7c riscv: restore default HZ=1000, keep QEMU at HZ=100
This reverts r351918 and r351919.

Discussed with:	br, ian, imp
2019-09-07 05:13:31 +00:00
Philip Paeps
7f0b970948 QEMU: use default HZ
HZ=100 by default on riscv since r351918.
2019-09-06 01:22:16 +00:00
Konstantin Belousov
a2a0f90654 Centralize __pcpu definitions.
Many extern struct pcpu <something>__pcpu declarations were
copied/pasted in sources.  The issue is that the definition is MD, but
it cannot be provided by machine/pcpu.h due to actual struct pcpu
defined in sys/pcpu.h later than the inclusion of machine/pcpu.h.
This forced the copying when other code needed direct access to
__pcpu.  There is no way around it, due to machine/pcpu.h supplying
part of struct pcpu fields.

To work around the problem, add a new machine/pcpu_aux.h header, which
should fill any needed MD definitions after struct pcpu definition is
completed. This allows to remove copies of __pcpu spread around the
source.  Also on x86 it makes it possible to remove work arounds like
OFFSETOF_CURTHREAD or clang specific warnings supressions.

Reported and tested by:	lwhsu, bcran
Reviewed by:	imp, markj (previous version)
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D21418
2019-08-29 07:25:27 +00:00
Jeff Roberson
2194393787 Move phys_avail definition into MI code. It is consumed in the MI layer and
doing so adds more flexibility with less redundant code.

Reviewed by:	jhb, markj, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21250
2019-08-16 00:45:14 +00:00
Kristof Provost
52bb6100e9 riscv: Fix copyin/copyout
r343275 introduced a performance optimisation to the copyin/copyout
routines by attempting to copy word-per-word rather than byte-per-byte
where possible.

This optimisation failed to account for cases where the buffer is longer
than XLEN_BYTES, but due to misalignment does not not allow for any
word-sized copies. E.g. a 9 byte buffer (with XLEN_BYTES == 8) which is
misaligned by 2 bytes. The code nevertheless did a single full-word
copy, which meant we copied too much data. This potentially clobbered
other data.

This is most easily demonstrated by a simple `sysctl -a`.

Fix it by not assuming that we'll always have at least one full-word
copy to do, but instead checking the remaining length first.

Reviewed by:	markj@, mhorne@, br@ (previous version)
MFC after:	1 week
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D21100
2019-07-29 14:59:14 +00:00
Alan Cox
5d18382b72 Simplify the handling of superpages in pmap_clear_modify(). Specifically,
if a demotion succeeds, then all of the 4KB page mappings within the
superpage-sized region must be valid, so there is no point in testing the
validity of the 4KB page mapping that is going to be write protected.

Deindent the nearby code.

Reviewed by:	kib, markj
Tested by:	pho (amd64, i386)
X-MFC after:	r350004 (this change depends on arm64 dirty bit emulation)
Differential Revision:	https://reviews.freebsd.org/D21027
2019-07-25 22:02:55 +00:00
Kristof Provost
cd7795a5a4 riscv: Return vm_paddr_t in pmap_early_vtophys()
We can't use a u_int to compute the physical address in
pmap_early_vtophys(). Our int is 32-bit, but the physical address is
64-bit. This works fine if everything lives in below 0x100000000, but as
soon as it doesn't this breaks.

MFC after:	1 week
Sponsored by:	Axiado
2019-07-17 21:25:26 +00:00
John Baldwin
c18ca74916 Don't pass error from syscallenter() to syscallret().
syscallret() doesn't use error anymore.  Fix a few other places to permit
removing the return value from syscallenter() entirely.
- Remove a duplicated assertion from arm's syscall().
- Use td_errno for amd64_syscall_ret_flush_l1d.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D2090
2019-07-15 21:25:16 +00:00
Mark Johnston
1fd21cb005 pmap_clear_modify() needs to clear PTE_W.
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2019-07-15 15:45:33 +00:00
Mark Johnston
24074d28ec Fix reference counting in pmap_ts_referenced() on RISC-V.
pmap_ts_referenced() does not necessarily clear the access bit from
all accessed mappings of a given page.  Thus, if a scan of the mappings
needs to be restarted, we should be careful to avoid double-counting
accessed mappings whose access bits were not cleared in a previous
attempt.

Reported by:	alc
Reviewed by:	alc
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20926
2019-07-15 15:43:15 +00:00
Konstantin Belousov
30b3018d48 Provide protection against starvation of the ll/sc loops when accessing userpace.
Casueword(9) on ll/sc architectures must be prepared for userspace
constantly modifying the same cache line as containing the CAS word,
and not loop infinitely.  Otherwise, rogue userspace livelocks the
kernel.

To fix the issue, change casueword(9) interface to return new value 1
indicating that either comparision or store failed, instead of relying
on the oldval == *oldvalp comparison.  The primitive no longer retries
the operation if it failed spuriously.  Modify callers of
casueword(9), all in kern_umtx.c, to handle retries, and react to
stops and requests to terminate between retries.

On x86, despite cmpxchg should not return spurious failures, we can
take advantage of the new interface and just return PSL.ZF.

Reviewed by:	andrew (arm64, previous version), markj
Tested by:	pho
Reported by:	https://xenbits.xen.org/xsa/advisory-295.txt
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D20772
2019-07-12 18:43:24 +00:00
Mark Johnston
eeacb3b02f Merge the vm_page hold and wire mechanisms.
The hold_count and wire_count fields of struct vm_page are separate
reference counters with similar semantics.  The remaining essential
differences are that holds are not counted as a reference with respect
to LRU, and holds have an implicit free-on-last unhold semantic whereas
vm_page_unwire() callers must explicitly determine whether to free the
page once the last reference to the page is released.

This change removes the KPIs which directly manipulate hold_count.
Functions such as vm_fault_quick_hold_pages() now return wired pages
instead.  Since r328977 the overhead of maintaining LRU for wired pages
is lower, and in many cases vm_fault_quick_hold_pages() callers would
swap holds for wirings on the returned pages anyway, so with this change
we remove a number of page lock acquisitions.

No functional change is intended.  __FreeBSD_version is bumped.

Reviewed by:	alc, kib
Discussed with:	jeff
Discussed with:	jhb, np (cxgbe)
Tested by:	pho (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D19247
2019-07-08 19:46:20 +00:00
Alan Cox
7fe5c13c05 Merge r349526 from amd64. When we protect an L3 entry, we only call
vm_page_dirty() when, in fact, we are write protecting the page and the L3
entry has PTE_D set.  However, pmap_protect() was always calling
vm_page_dirty() when an L2 entry has PTE_D set.  Handle L2 entries the
same as L3 entries so that we won't perform unnecessary calls to
vm_page_dirty().

Simplify the loop calling vm_page_dirty() on L2 entries.
2019-07-05 05:23:23 +00:00
Navdeep Parhar
57f317e60a Display the approximate space needed when a minidump fails due to lack
of space.

Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D20801
2019-06-30 03:14:04 +00:00
Conrad Meyer
c363b16c63 sys: Remove DEV_RANDOM device option
Remove 'device random' from kernel configurations that reference it (most).
Replace perhaps mistaken 'nodevice random' in two MIPS configs with 'options
RANDOM_LOADABLE' instead.  Document removal in UPDATING; update NOTES and
random.4.

Reviewed by:	delphij, markm (previous version)
Approved by:	secteam(delphij)
Differential Revision:	https://reviews.freebsd.org/D19918
2019-06-21 00:16:30 +00:00
Mitchell Horne
ffedb98b3e RISC-V: expose extension bits in AT_HWCAP
AT_HWCAP is a field in the elf auxiliary vector meant to describe
cpu-specific hardware features. For RISC-V we want to use this to
indicate the presence of any standard extensions supported by the CPU.
This allows userland applications to query the system for supported
extensions using elf_aux_info(3).

Support for an extension is indicated by the presence of its
corresponding bit in AT_HWCAP -- e.g. systems supporting the 'c'
extension (compressed instructions) will have the second bit set.

Extensions advertised through AT_HWCAP are only those that are supported
by all harts in the system.

Reviewed by:	jhb, markj
Approved by:	markj (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20493
2019-06-11 00:55:54 +00:00
Mitchell Horne
ff0e7938f2 Remove unused mcall_trap() function
The mcall_trap() dummy function is unused, and should be removed as we
are unlikely to support M-mode traps any time soon.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D20494
2019-06-09 15:52:26 +00:00
Mitchell Horne
c04c594daa RISC-V: Clean up some GENERIC options
Some of the config options that are disabled by default seem to be only
for historical reasons. Enable those that appear to no longer be
problematic. This includes WITH_CTF, STACK, GEOM_RAID, and re-enabling
blacklisted kernel modules.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20495
2019-06-09 15:50:35 +00:00
Mitchell Horne
bffa317ff2 RISC-V: Announce real and available memory at boot
Most architectures print their total (real) and available memory during
boot. Properly initialize the realmem global and print these messages.
Also print the physical memory chunks (behind a bootverbose flag).

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D20496
2019-06-09 15:48:36 +00:00
Mitchell Horne
93ca8057c5 Add TSLOG events to initriscv()
Add the enter and exit events, similar to what's found in
hammer_time() on amd64. We must use TSRAW as the pcpu isn't yet
initialized.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D20497
2019-06-09 15:45:48 +00:00
Mitchell Horne
6ae48dd870 Fix global pointer relaxations in the RISC-V kernel
The gp register is intended to used by the linker as another means of
performing relaxations, and should point to the small data section (.sdata).

Currently gp is being used as the pcpu pointer within the kernel, but the more
appropriate choice for this is the tp register, which is unused.

Swap existing usage of gp with tp within the kernel, and set up gp properly
at boot with the value of __global_pointer$ for all harts.

Additionally, remove some cases of accessing tp from the PCB, as it is not
part of the per-thread state. The user's tp and gp should be tracked only
through the trapframe.

Reviewed by:	markj, jhb
Approved by:	markj (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D19893
2019-06-09 15:43:38 +00:00
Mitchell Horne
fc261c16bd Remove block of dead code
Approved by:	markj (mentor)
2019-06-09 15:36:51 +00:00
Alan Cox
1fe65d054b Correct a new KASSERT() in r348828.
X-MFC with:	r348828
2019-06-09 05:55:58 +00:00
Alan Cox
fd2dae0a30 Implement an alternative solution to the amd64 and i386 pmap problem that we
previously addressed in r348246.

This pmap problem also exists on arm64 and riscv.  However, the original
solution developed for amd64 and i386 cannot be used on arm64 and riscv.  In
particular, arm64 and riscv do not define a PG_PROMOTED flag in their level
2 PTEs.  (A PG_PROMOTED flag makes no sense on arm64, where unlike x86 or
riscv we are required to break the old 4KB mappings before making the 2MB
mapping; and on riscv there are no unused bits in the PTE to define a
PG_PROMOTED flag.)

This commit implements an alternative solution that can be used on all four
architectures.  Moreover, this solution has two other advantages.  First, on
older AMD processors that required the Erratum 383 workaround, it is less
costly.  Specifically, it avoids unnecessary calls to pmap_fill_ptp() on a
superpage demotion.  Second, it enables the elimination of some calls to
pagezero() in pmap_kernel_remove_{l2,pde}().

In addition, remove a related stale comment from pmap_enter_{l2,pde}().

Reviewed by:	kib, markj (an earlier version)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D20538
2019-06-09 03:36:10 +00:00
Mark Johnston
88ea538a98 Replace uses of vm_page_unwire(m, PQ_NONE) with vm_page_unwire_noq(m).
These calls are not the same in general: the former will dequeue the
page if it is enqueued, while the latter will just leave it alone.  But,
all existing uses of the former apply to unmanaged pages, which are
never enqueued in the first place.  No functional change intended.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20470
2019-06-07 18:23:29 +00:00
Conrad Meyer
daec92844e Include ktr.h in more compilation units
Similar to r348026, exhaustive search for uses of CTRn() and cross reference
ktr.h includes.  Where it was obvious that an OS compat header of some kind
included ktr.h indirectly, .c files were left alone.  Some of these files
clearly got ktr.h via header pollution in some scenarios, or tinderbox would
not be passing prior to this revision, but go ahead and explicitly include it
in files using it anyway.

Like r348026, these CUs did not show up in tinderbox as missing the include.

Reported by:	peterj (arm64/mp_machdep.c)
X-MFC-With:	r347984
Sponsored by:	Dell EMC Isilon
2019-05-21 20:38:48 +00:00
Conrad Meyer
e2e050c8ef Extract eventfilter declarations to sys/_eventfilter.h
This allows replacing "sys/eventfilter.h" includes with "sys/_eventfilter.h"
in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
pollution substantially.

EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).

As a side effect of reduced header pollution, many .c files and headers no
longer contain needed definitions.  The remainder of the patch addresses
adding appropriate includes to fix those files.

LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
sys/mutex.h since r326106 (but silently protected by header pollution prior
to this change).

No functional change (intended).  Of course, any out of tree modules that
relied on header pollution for sys/eventhandler.h, sys/lock.h, or
sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
2019-05-20 00:38:23 +00:00
Ruslan Bukin
b803d0b790 Add support for HiFive Unleashed -- the board with a multi-core RISC-V SoC
from SiFive, Inc.

The first core on this SoC (hart 0) is a 64-bit microcontroller.

o Pick a hart to run boot process using hart lottery.
  This allows to exclude hart 0 from running the boot process.
  (BBL releases hart 0 after the main harts, so it never wins the lottery).
o Renumber CPUs early on boot.
  Exclude non-MMU cores. Store the original hart ID in struct pcpu. This
  allows to find out the correct destination for IPIs and remote sfence
  calls.

Thanks to SiFive, Inc for the board provided.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20225
2019-05-12 16:17:05 +00:00
Ruslan Bukin
ef68f03ec2 RISC-V ISA does not specify how to manage physical memory attributes (PMA).
So do nothing in pmap_page_set_memattr() and don't panic.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20209
2019-05-10 11:21:57 +00:00
Andrew Gallatin
542970fa2d Remove IPSEC from GENERIC due to performance issues
Having IPSEC compiled into the kernel imposes a non-trivial
performance penalty on multi-threaded workloads due to IPSEC
refcounting. In my benchmarks of multi-threaded UDP
transmit (connected sockets), I've seen a roughly 20% performance
penalty when the IPSEC option is included in the kernel (16.8Mpps
vs 13.8Mpps with 32 senders on a 14 core / 28 HTT Xeon
2697v3)). This is largely due to key_addref() incrementing and
decrementing an atomic reference count on the default
policy. This cause all CPUs to stall on the same cacheline, as it
bounces between different CPUs.

Given that relatively few users use ipsec, and that it can be
loaded as a module, it seems reasonable to ask those users to
load the ipsec module so as to avoid imposing this penalty on the
GENERIC kernel. Its my hope that this will make FreeBSD look
better in "out of the box" benchmark comparisons with other
operating systems.

Many thanks to ae for fixing auto-loading of ipsec.ko when
ifconfig tries to configure ipsec, and to cy for volunteering
to ensure the the racoon ports will load the ipsec.ko module

Reviewed by:	cem, cy, delphij, gnn, jhb, jpaetzel
Differential Revision:	https://reviews.freebsd.org/D20163
2019-05-09 22:38:15 +00:00
Ruslan Bukin
fcc3a0f630 Connect Xilinx AXI drivers and Cadence Ethernet MAC to the RISC-V build.
Sponsored by:	DARPA, AFRL
2019-05-08 16:06:54 +00:00
Kyle Evans
251a32b5b2 tun/tap: merge and rename to tuntap
tun(4) and tap(4) share the same general management interface and have a lot
in common. Bugs exist in tap(4) that have been fixed in tun(4), and
vice-versa. Let's reduce the maintenance requirements by merging them
together and using flags to differentiate between the three interface types
(tun, tap, vmnet).

This fixes a couple of tap(4)/vmnet(4) issues right out of the gate:
- tap devices may no longer be destroyed while they're open [0]
- VIMAGE issues already addressed in tun by kp

[0] emaste had removed an easy-panic-button in r240938 due to devdrn
blocking. A naive glance over this leads me to believe that this isn't quite
complete -- destroy_devl will only block while executing d_* functions, but
doesn't block the device from being destroyed while a process has it open.
The latter is the intent of the condvar in tun, so this is "fixed" (for
certain definitions of the word -- it wasn't really broken in tap, it just
wasn't quite ideal).

ifconfig(8) also grew the ability to map an interface name to a kld, so
that `ifconfig {tun,tap}0` can continue to autoload the correct module, and
`ifconfig vmnet0 create` will now autoload the correct module. This is a
low overhead addition.

(MFC commentary)

This may get MFC'd if many bugs in tun(4)/tap(4) are discovered after this,
and how critical they are. Changes after this are likely easily MFC'd
without taking this merge, but the merge will be easier.

I have no plans to do this MFC as of now.

Reviewed by:	bcr (manpages), tuexen (testing, syzkaller/packetdrill)
Input also from:	melifaro
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D20044
2019-05-08 02:32:11 +00:00
Ruslan Bukin
bf03b1f1f9 Disable interrupts first and then set spinlock_count to 1.
Otherwise interrupt can be generated just after setting spinlock_count
and before disabling interrupts.

Sponsored by:	DARPA, AFRL
2019-05-07 14:32:17 +00:00
Ruslan Bukin
75cf8837a9 Provide a template for busdma code for RISC-V.
RISC-V ISA specifies no cache management instructions so leave cache
operations in cpufunc.h as no-op for now.

Note some new hardware comes with their own memory-mapped cache
management controller.

Tested on HiFive Unleashed board with cgem(4).

Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D20126
2019-05-07 13:41:43 +00:00
Ruslan Bukin
adf208e786 Deactivate IRQ resource by calling to intr_deactivate_irq().
This is the part of INTRNG support that was missed.

Sponsored by:	DARPA, AFRL
2019-05-01 15:03:12 +00:00
Ruslan Bukin
7bad03a8b5 Implement pic_pre_ithread(), pic_post_ithread().
Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19819
2019-04-24 13:41:46 +00:00
Mitchell Horne
e196d237be RISC-V: initialize pcpu slightly earlier
In certain scenarios, it is possible for PCPU data to be
accessed before it has been initialized (e.g. during printf
if the kernel was built with the TSLOG option).

Initialize the PCPU pointer for hart 0 at the beginning of
initriscv() rather than near the end.

Reviewed by:		markj
Approved by:		markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D19726
2019-04-07 20:12:24 +00:00
Ruslan Bukin
a8cb655d2e o Grab the number of devices supported by PLIC from FDT.
o Fix bug in PLIC_ENABLE macro when irq >= 32.

Tested on the real hardware, which is HiFive Unleashed board.

Thanks to SiFive, Inc. for the board provided.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19775
2019-04-02 12:02:35 +00:00
Ruslan Bukin
61fef9e860 Grab timer frequency from FDT.
RISC-V timer has no dedicated DTS node and we have to get timer
frequency from cpus node.

Tested on Government Furnished Equipment (GFE) cores synthesized
on Xilinx VCU118.

Reviewed by:	markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19727
2019-03-27 16:26:03 +00:00
Konstantin Belousov
fd8d844f76 amd64 KPTI: add control from procctl(2).
Add the infrastructure to allow MD procctl(2) commands, and use it to
introduce amd64 PTI control and reporting.  PTI mode cannot be
modified for existing pmap, the knob controls PTI of the new vmspace
created on exec.

Requested by:	jhb
Reviewed by:	jhb, markj (previous version)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:44:33 +00:00
Konstantin Belousov
6f1fe3305a amd64: Add md process flags and first P_MD_PTI flag.
PTI mode for the process pmap on exec is activated iff P_MD_PTI is set.

On exec, the existing vmspace can be reused only if pti mode of the
pmap matches the P_MD_PTI flag of the process.  Add MD
cpu_exec_vmspace_reuse() callback for exec_new_vmspace() which can
vetoed reuse of the existing vmspace.

MFC note: md_flags change struct proc KBI.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D19514
2019-03-16 11:31:01 +00:00
Mark Johnston
f3af92bd36 Reorder copyright lines to preserve the source of "All rights reserved."
Reported by:	rgrimes
MFC with:	r344829, r344830
2019-03-06 16:50:14 +00:00
Mark Johnston
3b5b20292b Implement minidump support for RISC-V.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D18320
2019-03-06 00:01:06 +00:00
Mark Johnston
3a3dfb2815 Initialize dump_avail[] on riscv.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D19170
2019-03-05 23:58:16 +00:00
Mark Johnston
91c3fda00b Add pmap_get_tables() for riscv.
This mirrors the arm64 implementation and is for use in the minidump
code.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D18321
2019-03-05 23:56:40 +00:00
Edward Tomasz Napierala
1699546def Remove sv_pagesize, originally introduced with r100384.
In all of the architectures we have today, we always use PAGE_SIZE.
While in theory one could define different things, none of the
current architectures do, even the ones that have transitioned from
32-bit to 64-bit like i386 and arm. Some ancient mips binaries on
other systems used 8k instead of 4k, but we don't support running
those and likely never will due to their age and obscurity.

Reviewed by:	imp (who also contributed the commit message)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D19280
2019-03-01 16:16:38 +00:00
Konstantin Belousov
e7a9df16e6 Add kernel support for Intel userspace protection keys feature on
Skylake Xeons.

See SDM rev. 68 Vol 3 4.6.2 Protection Keys and the description of the
RDPKRU and WRPKRU instructions.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D18893
2019-02-20 09:51:13 +00:00
Konstantin Belousov
72091bb393 Enable enabling ASLR on non-x86 architectures.
Discussed with:	emaste
Sponsored by:	The FreeBSD Foundation
2019-02-14 14:44:53 +00:00
Mark Johnston
35c91b0c27 Implement per-CPU pmap activation tracking for RISC-V.
This reduces the overhead of TLB invalidations by ensuring that we
only interrupt CPUs which are using the given pmap.  Tracking is
performed in pmap_activate(), which gets called during context switches:
from cpu_throw(), if a thread is exiting or an AP is starting, or
cpu_switch() for a regular context switch.

For now, pmap_sync_icache() still must interrupt all CPUs.

Reviewed by:	kib (earlier version), jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18874
2019-02-13 17:50:01 +00:00
Mark Johnston
91c85dd88b Implement pmap_clear_modify() for RISC-V.
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18875
2019-02-13 17:38:47 +00:00
Mark Johnston
f6893f09d5 Implement transparent 2MB superpage promotion for RISC-V.
This includes support for pmap_enter(..., psind=1) as described in the
commit log message for r321378.

The changes are largely modelled after amd64.  arm64 has more stringent
requirements around superpage creation to avoid the possibility of TLB
conflict aborts, and these requirements do not apply to RISC-V, which
like amd64 permits simultaneous caching of 4KB and 2MB translations for
a given page.  RISC-V's PTE format includes only two software bits, and
as these are already consumed we do not have an analogue for amd64's
PG_PROMOTED.  Instead, pmap_remove_l2() always invalidates the entire
2MB address range.

pmap_ts_referenced() is modified to clear PTE_A, now that we support
both hardware- and software-managed reference and dirty bits.  Also
fix pmap_fault_fixup() so that it does not set PTE_A or PTE_D on kernel
mappings.

Reviewed by:	kib (earlier version)
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18863
Differential Revision:	https://reviews.freebsd.org/D18864
Differential Revision:	https://reviews.freebsd.org/D18865
Differential Revision:	https://reviews.freebsd.org/D18866
Differential Revision:	https://reviews.freebsd.org/D18867
Differential Revision:	https://reviews.freebsd.org/D18868
2019-02-13 17:19:37 +00:00
Ed Maste
ac979af451 riscv: default to non-executable stack
There's no need to worry about potential backwards compatibility issues
in a brand-new architecture, so avoid stack PROT_EXEC as with arm64.

Discussed with:	br
2019-02-06 19:22:15 +00:00
David E. O'Brien
09efc56d66 Follow arm[32] and sparc64 KAPI and provide the FreeBSD standard spelling
across all architectures for this header.

Reviewed by:	stevek
Obtained from:	Juniper Networks
2019-01-29 20:10:27 +00:00
Mark Johnston
8fc2164b47 Remove a redundant test.
The existence of a PV entry for a mapping guarantees that the mapping
exists, so we should not need to test for that.

Reviewed by:	kib
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18866
2019-01-28 16:23:56 +00:00
Mark Johnston
80fe23594c Optimize RISC-V copyin(9)/copyout(9) routines.
The existing copyin(9) and copyout(9) routines on RISC-V perform only a
simple byte-by-byte copy.  Improve their performance by performing
word-sized copies where possible.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18851
2019-01-21 19:38:53 +00:00
Mark Johnston
45272d0568 Deduplicate common code in copyin()/copyout() with a macro.
No functional change intended.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18850
2019-01-21 19:37:12 +00:00
Mark Johnston
23732c0fe3 Don't enable interrupts in init_secondary().
The MI kernel assumes that interrupts will not be enabled on APs until
after the first context switch.  In particular, the problem was causing
occasional deadlocks during boot.

Remove an unneeded intr_disable() added in r335005.

Reviewed by:	jhb (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18738
2019-01-04 17:14:50 +00:00
Mark Johnston
c1959ba49b Fix dirty bit handling in pmap_remove_write().
Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18732
2019-01-04 17:10:16 +00:00
Mark Johnston
b679dc7fee Clear PGA_WRITEABLE in pmap_remove_pages().
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18731
2019-01-04 17:08:45 +00:00
Mark Johnston
7c59ec14e6 Fix a use-after-free in the riscv pmap_release() implementation.
Don't bother zeroing the top-level page before freeing it.  Previously,
the page was freed before being zeroed.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18720
2019-01-03 16:26:52 +00:00
Mark Johnston
bad66a29d4 Synchronize access to the allpmaps list.
The list will be removed with some future work.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18721
2019-01-03 16:24:03 +00:00
Mark Johnston
60af34002e Fix some issues with the riscv pmap_protect() implementation.
- Handle VM_PROT_EXECUTE.
- Clear PTE_D and mark the page dirty when removing write access
  from a mapping.
- Atomically clear PTE_W to avoid clobbering a hardware PTE update.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18719
2019-01-03 16:21:44 +00:00
Mark Johnston
8ccaccd522 Set PTE_U on PTEs created by pmap_enter_quick().
Otherwise prefaulted entries are not accessible from user mode and
end up triggering a fault upon access, so prefaulting has no effect.

Reviewed by:	jhb, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18718
2019-01-03 16:19:32 +00:00
Mark Johnston
619999ff9f Use regular stores to update PTEs in the riscv pmap layer.
There's no need to use atomics when the previous value isn't needed.
No functional change intended.

Reviewed by:	kib
Discussed with:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18717
2019-01-03 16:15:28 +00:00
Mark Johnston
7b1e32a5be Configure hz=100 in the QEMU target.
We currently don't have a good way to dynamically detect whether the
kernel is running as a guest.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18715
2019-01-03 16:11:21 +00:00
Mateusz Guzik
628888f0e0 Remove iBCS2, part2: general kernel
Reviewed by:	kib (previous version)
Sponsored by:	The FreeBSD Foundation
2018-12-19 21:57:58 +00:00
Mark Johnston
53941c0a73 Replace uses of sbadaddr with stval.
The sbadaddr register was renamed in version 1.10 of the privileged
architecture specification.  No functional change intended.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18594
2018-12-19 17:52:09 +00:00
Mark Johnston
5268e09865 Implement cpu_halt() for RISC-V.
Submitted by:	Mitchell Horne <mhorne063@gmail.com>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D18595
2018-12-19 17:45:16 +00:00
Mark Johnston
3ec68206f5 Add some more checking to the RISC-V page fault handler.
- Panic immediately if witness says we're holding non-sleepable locks.
  This helps ensure that we don't recurse on the pmap lock in
  pmap_fault_fixup().
- Panic if the kernel faults on a user address without setting an
  onfault handler.
- Panic if the fault occurred in a critical section or interrupt
  handler, like we do on other platforms.
- Fix some style issues in trap_pfault().

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18561
2018-12-14 21:07:12 +00:00
Mark Johnston
105c317166 Avoid needless TLB invalidations in pmap_remove_pages().
pmap_remove_pages() is called during process termination, when it is
guaranteed that no other CPU may access the mappings being torn down.
In particular, it unnecessary to invalidate each mapping individually
since we do a pmap_invalidate_all() at the end of the function.

Also don't call pmap_invalidate_all() while holding a PV list lock, the
global pvh lock is sufficient.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18562
2018-12-14 21:04:30 +00:00
Mark Johnston
4f86ff4e47 Assume that pmap_l1() will return a PTE.
pmaps on RISC-V always have an L1 page table page, so we don't need to
check for this when performing lookups.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18563
2018-12-14 21:03:01 +00:00
Mark Johnston
01cd6fba6c Add a QEMU config for RISC-V.
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18560
2018-12-14 21:00:41 +00:00
Mark Johnston
fb50c41448 Enable witness(4) in the RISC-V GENERIC config.
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18559
2018-12-14 20:57:57 +00:00
Mark Johnston
4a02086817 Clean up the riscv pmap_bootstrap() implementation.
- Build up phys_avail[] in a single loop, excluding memory used by
  the loaded kernel.
- Fix an array indexing bug in the aforementioned phys_avail[]
  initialization.[1]
- Remove some unneeded code copied from the arm64 implementation.

PR:		231515 [1]
Reviewed by:	jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18464
2018-12-14 18:50:32 +00:00
Mark Johnston
a64886cef3 Remove an unused malloc(9) type.
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-12-11 02:16:27 +00:00
Mark Johnston
e7d46a1d71 Use inline tests for individual PTE bits in the RISC-V pmap.
Inline tests for PTE_* bits are easy to read and don't really require a
predicate function, and predicates which operate on a pt_entry_t are
inconvenient when working with L1 and L2 page table entries.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18461
2018-12-11 02:15:56 +00:00
Mark Johnston
1a153f42fa Update the description of the address space layout on RISC-V.
This adds more detail and fixes some inaccuracies.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18463
2018-12-07 15:56:40 +00:00
Mark Johnston
1f5e341b46 Rename sptbr to satp per v1.10 of the privileged architecture spec.
Add a subroutine for updating satp, for use when updating the
active pmap.  No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D18462
2018-12-07 15:55:23 +00:00
Eric van Gyzen
984969cd96 Fix reporting of SS_ONSTACK
Fix reporting of SS_ONSTACK in nested signal delivery when sigaltstack()
is used on some architectures.

Add a unit test for this.  I tested the test by introducing the bug
on amd64.  I did not test it on other architectures.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D18347
2018-11-30 22:44:33 +00:00
Eric van Gyzen
4d5a108409 Prevent kernel stack disclosure in signal delivery
On arm64 and riscv platforms, sendsig() failed to zero the signal
frame before copying it out to userspace.  Zero it.

On arm, I believe all the contents of the frame were initialized,
so there was no disclosure.  However, explicitly zero the whole frame
because that fact could inadvertently change in the future,
it's more clear to the reader, and I could be wrong in the first place.

MFC after:	2 days
Security:	similar to FreeBSD-EN-18:12.mem and CVE-2018-17155
Sponsored by:	Dell EMC Isilon
2018-11-26 20:52:53 +00:00
Mark Johnston
6f8ba91638 RISC-V: Implement get_cyclecount(9).
Add the missing implementation for get_cyclecount(9) on RISC-V by
reading the cycle CSR.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17953
2018-11-13 18:20:27 +00:00
Mark Johnston
1e2ceeb16a RISC-V: Add macros for reading performance counter CSRs.
The RISC-V spec defines several performance counter CSRs such as: cycle,
time, instret, hpmcounter(3...31).  They are defined to be 64-bits wide
on all RISC-V architectures.  On RV64 and RV128 they can be read from a
single CSR.  On RV32, additional CSRs (given the suffix "h") are present
which contain the upper 32 bits of these counters, and must be read as
well.  (See section 2.8 in the User ISA Spec for full details.)

This change adds macros for reading these values safely on any RISC-V
ISA length.  Obviously we aren't supporting anything other than RV64
at the moment, but this ensures we won't need to change how we read
these values if we ever do.

Submitted by:	Mitchell Horne <mhorne063@gmail.com>
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D17952
2018-11-13 18:12:06 +00:00
John Baldwin
c5e797a836 Drop the legacy ELF brandinfo for the old rtld from arm64 and riscv.
These architectures never shipped binaries with an rtld path of
/usr/libexec/ld-elf.so.1.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17876
2018-11-07 18:28:55 +00:00
John Baldwin
274c0a806a Enable use of a global shared page for RISC-V.
machine/vmparam.h already defines the SHAREDPAGE constant.  This
change just enables it for ELF executables.  The only use of the
shared page currently is to hold the signal trampoline.

Reviewed by:	markj, kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17875
2018-11-07 18:27:43 +00:00
John Baldwin
4cbbb74888 Add a KPI for the delay while spinning on a spin lock.
Replace a call to DELAY(1) with a new cpu_lock_delay() KPI.  Currently
cpu_lock_delay() is defined to DELAY(1) on all platforms.  However,
platforms with a DELAY() implementation that uses spin locks should
implement a custom cpu_lock_delay() doesn't use locks.

Reviewed by:	kib
MFC after:	3 days
2018-11-05 21:34:17 +00:00
John Baldwin
ff9738d954 Rework setting PTE_D for kernel mappings.
Rather than unconditionally setting PTE_D for all writeable kernel
mappings, set PTE_D for writable mappings of unmanaged pages (whether
user or kernel).  This matches what amd64 does and also matches what
the RISC-V spec suggests (preset the A and D bits on mappings where
the OS doesn't care about the state).

Suggested by:	alc
Reviewed by:	alc, markj
Sponsored by:	DARPA
2018-11-05 20:00:36 +00:00
John Baldwin
d198cb6d83 Restrict setting PTE execute permissions on RISC-V.
Previously, RISC-V was enabling execute permissions in PTEs for any
readable page.  Now, execute permissions are only enabled if they were
explicitly specified (e.g. via PROT_EXEC to mmap).  The one exception
is that the initial kernel mapping in locore still maps all of the
kernel RWX.

While here, change the fault type passed to vm_fault and
pmap_fault_fixup to only include a single VM_PROT_* value representing
the faulting access to match other architectures rather than passing a
bitmask.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17783
2018-11-01 22:23:15 +00:00
John Baldwin
6f888020df Set PTE_A and PTE_D for user mappings in pmap_enter().
This assumes that an access according to the prot in 'flags' triggered
a fault and is going to be retried after the fault returns, so the two
flags are set preemptively to avoid refaulting on the retry.

While here, only bother setting PTE_D for kernel mappings in pmap_enter
for writable mappings.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17782
2018-11-01 22:17:51 +00:00
John Baldwin
a751b25546 SBI calls expect a pointer to a u_long rather than a pointer.
This is just cosmetic.

A weirder issue is that the SBI doc claims the hart mask pointer should
be a physical address, not a virtual address.  However, the implementation
in bbl seems to just dereference the address directly.

Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17781
2018-11-01 22:15:25 +00:00
John Baldwin
344adeab18 Don't allow debuggers to modify SSTATUS, only to read it.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17771
2018-11-01 22:13:22 +00:00
John Baldwin
ada1ceef0b Implement ptrace_set_pc() and fail PT_*STEP requests explicitly.
Reviewed by:	markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17769
2018-11-01 22:11:26 +00:00
Kyle Evans
be352d20d5 Compile in VERBOSE_SYSINIT support by default, remain silent by default
The loader tunable 'debug.verbose_sysinit' may be used to toggle verbosity.
This is added to the debugging section of these kernconfs to be turned off
in stable branches for clarity of intent.

MFC after:	never
2018-10-31 22:38:19 +00:00
Ruslan Bukin
b7b391934d o Add pmap lock around pmap_fault_fixup() to ensure other thread will not
modify l3 pte after we loaded old value and before we stored new value.
o Preset A(accessed), D(dirty) bits for kernel mappings.

Reported by:	kib
Reviewed by:	markj
Discussed with:	jhb
Sponsored by:	DARPA, AFRL
2018-10-26 12:27:07 +00:00
Brooks Davis
c3adaa3305 Consolidate identical ELF auxargs type defintions.
All platforms except powerpc use the same values and powerpc shares a
majority of them.

Go ahead and declare AT_NOTELF, AT_UID, and AT_EUID in favor of the
unused AT_DCACHEBSIZE, AT_ICACHEBSIZE, and AT_UCACHEBSIZE for powerpc.

Reviewed by:	jhb, imp
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17397
2018-10-22 22:24:32 +00:00
Ruslan Bukin
b977d81946 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:25:07 +00:00
Ruslan Bukin
3c8efd61f5 Revert r339421 due to unintended files included to commit.
Reported by:	ian
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-10-18 15:17:58 +00:00
Ruslan Bukin
53c6ad1d62 Support RISC-V implementations that do not manage the A and D bits
(e.g. RocketChip, lowRISC and derivatives).

RISC-V page table entries support A (accessed) and D (dirty) bits. The
spec makes hardware support for these bits optional. Implementations that
do not manage these bits in hardware raise page faults for accesses to a
valid page without A set and writes to a writable page without D set.
Check for these types of faults when handling a page fault and fixup the
PTE without calling vm_fault if they occur.

Reviewed by:	jhb, markj
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17424
2018-10-18 15:08:14 +00:00
Ruslan Bukin
94036a2587 Invalidate TLB on a local hart.
This was missed in r339367 ("Various fixes for TLB management on RISC-V.").

This fixes operation on lowRISC.

Reviewed by:	jhb
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D17583
2018-10-16 16:03:17 +00:00
John Baldwin
73efa2fbd1 Various fixes for TLB management on RISC-V.
- Remove the arm64-specific cpu_*cache* and cpu_tlb_flush* functions.
  Instead, add RISC-V specific inline functions in cpufunc.h for the
  fence.i and sfence.vma instructions.
- Catch up to changes in the arm64 pmap and remove all the cpu_dcache_*
  calls, pmap_is_current, pmap_l3_valid_cacheable, and PTE_NEXT bits from
  pmap.
- Remove references to the unimplemented riscv_setttb().
- Remove unused cpu_nullop.
- Add a link to the SBI doc to sbi.h.
- Add support for a 4th argument in SBI calls.  It's not documented but
  it seems implied for the asid argument to SBI_REMOVE_SFENCE_VMA_ASID.
- Pass the arguments from sbi_remote_sfence*() to the SEE.  BBL ignores
  them so this is just cosmetic.
- Flush icaches on other CPUs when they resume from kdb in case the
  debugger wrote any breakpoints while the CPUs were paused in the IPI_STOP
  handler.
- Add SMP vs UP versions of pmap_invalidate_* similar to amd64.  The
  UP versions just use simple fences.  The SMP versions use the
  sbi_remove_sfence*() functions to perform TLB shootdowns.  Since we
  don't have a valid pm_active field in the riscv pmap, just IPI all
  CPUs for all invalidations for now.
- Remove an extraneous TLB flush from the end of pmap_bootstrap().
- Don't do a TLB flush when writing new mappings in pmap_enter(), only if
  modifying an existing mapping.  Note that for COW faults a TLB flush is
  only performed after explicitly clearing the old mapping as is done in
  other pmaps.
- Sync the i-cache on all harts before updating the PTE for executable
  mappings in pmap_enter and pmap_enter_quick.  Previously the i-cache was
  only sync'd after updating the PTE in pmap_enter.
- Use sbi_remote_fence() instead of smp_rendezvous in pmap_sync_icache().

Reviewed by:	markj
Approved by:	re (gjb, kib)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17414
2018-10-15 18:56:54 +00:00
Ruslan Bukin
05efeb8430 Initialize interrupt priority to 0 on all sources.
Without this hardware raises an interrupt regardless of any
pending bits set.

This fixes operation on RocketChip and derivatives (e.g. lowRISC).

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:51:41 +00:00
Ruslan Bukin
053ec0508e Add support for the UART device found in lowRISC system-on-a-chip.
The only source of documentation for this device is verilog,
so driver is minimalistic.

Reviewed by:	Dr Jonathan Kimmitt <jrrk2@cam.ac.uk>
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-10-12 15:19:41 +00:00
John Baldwin
7a102e0463 Implement pmap_sync_icache().
This invokes "fence" on the hart performing the write followed by an IPI
to execute "fence.i" on all harts.

This is required to support userland debuggers setting breakpoints in
user processes.

Reviewed by:	br (earlier version), markj
Approved by:	re (gjb)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17139
2018-09-24 17:41:29 +00:00
John Baldwin
232d0b87e0 Various fixes for floating point on RISC-V.
- Explicitly load an empty initial state into FP registers when taking
  the fault on the first FP instruction in a thread.  Setting
  SSTATE.FS to INITIAL is just a marker to let context switch restore
  code know that it can load FP registers with zeroes instead of
  memory loads.  It does not imply that the hardware will reset all
  registers to zero on first access.  In addition, set the state to
  CLEAN instead of INITIAL after the first FP instruction.
  cpu_switch() doesn't do anything for INITIAL and only restores from
  the pcb if the state is CLEAN.  We could perhaps change cpu_switch
  to call fpe_state_clear if the state was INITIAL and leave SSTATE.FS
  set to INITIAL instead of CLEAN after the first FP instruction.
  However, adding this complexity to cpu_switch() doesn't seem worth
  the supposed gain.
- Only save the current FPU registers in fill_fpregs() if the request
  is made to save the current thread's registers.  Previously if a
  debugger requested FP registers via ptrace() it was getting a copy
  of the debugger's FP registers rather than the debugee's.
- Zero the entire FP register set structure returned for ptrace() if a
  thread hasn't used FP registers rather than leaking garbage in the
  fp_fcsr field.
- If a debugger writes FP registers via ptrace(), always mark the pcb
  as having valid FP registers and set SSTATUS.FS_MASK to CLEAN so
  that the registers will be restored when the debugged thread
  resumes.
- Be more explicit about clearing the SSTATUS.FS field before setting
  it to CLEAN on the first FP instruction trap.

Submitted by:	br, markj
Approved by:	re (rgrimes)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17141
2018-09-19 23:45:18 +00:00
Ruslan Bukin
bd528a398e Enable VIMAGE support for RISC-V.
Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:13:54 +00:00
Ruslan Bukin
752a8ea48e Use elf_relocaddr() to find the address for R_RISCV_RELATIVE
relocation.

elf_relocaddr() has a hook to handle VIMAGE data addresses.

This fixes VIMAGE support for RISC-V when built as a module.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
2018-09-12 08:12:34 +00:00
Ruslan Bukin
157654d0c4 Permit supervisor to access user VA space for certain functions only.
This is done by setting SUM (permit Supervisor User Memory access)
bit in sstatus register.

The functions we allow access for are routines in assembly that
explicitly handle crossing the user kernel boundary.

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 11:34:58 +00:00
Ruslan Bukin
93952a8b1b Fix bug: compare uaddr to VM_MAXUSER_ADDRESS, not to a tmp value
left by SET_FAULT_HANDLER().

Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-05 09:53:55 +00:00
Ruslan Bukin
378a495661 Add support for 'C'-compressed ISA extension to DTrace FBT provider.
Approved by:	re (kib)
Sponsored by:	DARPA, AFRL
2018-09-03 14:34:09 +00:00
Ruslan Bukin
0f669630ac Fix an integer overflow while setting the kernel address (MODINFO_ADDR).
This eliminates build warning and makes kldstat happy.

Approved by:	re (marius)
2018-08-31 16:15:46 +00:00
Konstantin Belousov
f0165b1ca6 Remove {max/min}_offset() macros, use vm_map_{max/min}() inlines.
Exposing max_offset and min_offset defines in public headers is
causing clashes with variable names, for example when building QEMU.

Based on the submission by:	royger
Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation (kib)
MFC after:	1 week
Approved by:	re (marius)
Differential revision:	https://reviews.freebsd.org/D16881
2018-08-29 12:24:19 +00:00
Mark Johnston
36716fe2e6 Prepare the kernel linker to handle PC-relative ifunc relocations.
The boot-time ifunc resolver assumes that it only needs to apply
IRELATIVE relocations to PLT entries.  With an upcoming optimization,
this assumption no longer holds, so add the support required to handle
PC-relative relocations targeting GNU_IFUNC symbols.
- Provide a custom symbol lookup routine that can be used in early boot.
  The default lookup routine uses kobj, which is not functional at that
  point.
- Apply all existing relocations during boot rather than filtering
  IRELATIVE relocations.
- Ensure that we continue to apply ifunc relocations in a second pass
  when loading a kernel module.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16749
2018-08-22 20:44:30 +00:00
Alan Cox
83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
John Baldwin
8cd385fda0 Make 'device crypto' lines more consistent.
- In configurations with a pseudo devices section, move 'device crypto'
  into that section.
- Use a consistent comment.  Note that other things common in kernel
  configs such as GELI also require 'device crypto', not just IPSEC.

Reviewed by:	rgrimes, cem, imp
Differential Revision:	https://reviews.freebsd.org/D16775
2018-08-18 20:32:08 +00:00
Conrad Meyer
08d77c0178 Riscv: Include crypto for IPSec
Similar to r337944.  I think this is the last configuration that includes IPsec
but not crypto.
2018-08-17 01:08:22 +00:00
Ruslan Bukin
9aa2d5e4fa Remove unused code.
Sponsored by:	DARPA, AFRL
2018-08-14 16:22:14 +00:00
Ruslan Bukin
2cfd37def0 Rewrite RISC-V disassembler:
- Use macroses from encoding.h generated by riscv-opcodes.
- Add support for C-compressed ISA extension.

Sponsored by:	DARPA, AFRL
2018-08-14 16:03:03 +00:00
Ruslan Bukin
c1d0e057d8 Add RISC-V instructions encoding.
This is the output of
$ cat opcodes opcodes-rvc-pseudo opcodes-rvc opcodes-custom |
    ./parse-opcodes -c

It is confirmed by author that the output of parse-opcodes is
in the public domain.

This will be required for DDB disassembler.

Discussed with: Andrew Waterman <waterman@eecs.berkeley.edu>
Obtained from:	https://github.com/riscv/riscv-opcodes
Sponsored by:	DARPA, AFRL
2018-08-13 16:07:18 +00:00
Ruslan Bukin
6371d0bd64 Implement uma_small_alloc(), uma_small_free().
Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16628
2018-08-08 16:08:38 +00:00
Marius Strobl
13a10f3414 Implement atomic_swap_{int,long,ptr}(9). 2018-08-07 18:56:51 +00:00
Ruslan Bukin
c50c8f642c Return ENAMETOOLONG if the latest copied character
is not null terminator.

Sponsored by:	DARPA, AFRL
2018-08-03 16:44:56 +00:00
Ruslan Bukin
385a185b43 Don't overwrite tp in set_mcontext().
This makes libthr/swapcontext_test:swapcontext1 happy.

Sponsored by:	DARPA, AFRL
2018-08-02 12:13:52 +00:00
Ruslan Bukin
84154f4b9e o Don't overwrite tp in fork_trampoline().
o Save and restore tp in cpu_switch().
o Restore tp in cpu_throw().
o Save tp in savectx().

This makes libthr tests happy. In particular fork_test:fork.

Sponsored by:	DARPA, AFRL
2018-08-02 12:12:13 +00:00
Ruslan Bukin
7bb4a84ad3 o Correctly set user tls base: consider TP_OFFSET.
o Ensure tp (thread pointer) saved before copying the pcb.

Sponsored by:	DARPA, AFRL
2018-08-02 12:08:52 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Ruslan Bukin
a304bc9729 Disable VIMAGE on RISC-V.
Similar to r326179 ("Temporarily disable VIMAGE on arm64") creation of
if_lagg or epair on RISC-V results a kernel panic.

Sponsored by:	DARPA, AFRL
2018-07-30 12:22:49 +00:00
Ruslan Bukin
b51092c7ec Use SPP (Supervisor Previous Privilege) bit in the sstatus
register to determine if trap is from userspace.

Otherwise if we jump to kernel address from userspace, then
TRAPF_USERMODE failed to detect usermode and then do_ast
triggers a panic "ast in kernel mode".

Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16469
2018-07-27 16:13:06 +00:00
Mark Johnston
a3055a5e47 Implement pmap_mincore() for riscv.
Reviewed by:	alc, br
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16444
2018-07-26 16:08:26 +00:00
Ruslan Bukin
87f9acc9e1 Remove unused string.
Reported by:	markj@
Sponsored by:	DARPA, AFRL
2018-07-25 15:44:49 +00:00
Mark Johnston
c03590f5a2 Embed a simplebus_softc in struct soc_softc.
This is required by the definition of the soc driver.

Reviewed by:	br
Sponsored by:	The FreeBSD Foundation
2018-07-24 21:02:11 +00:00
Ruslan Bukin
8eca6e4855 Fix setjmp for RISC-V:
o The correct value for _JB_SIGMASK is 27.
o The storage size for double-precision floating
  point register is 8 bytes.

Submitted by:	"James Clarke" <jrtc4@cam.ac.uk>
Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16344
2018-07-23 09:54:28 +00:00
Warner Losh
9ecd7fdebe Remove VM_FREELIST_ISADMA. It's not needed on these architectures.
Differential Review: https://reviews.freebsd.org/D16290
2018-07-17 21:07:53 +00:00
Alan Cox
afeed44dc5 Invalidate the mapping before updating its physical address.
Doing so ensures that all threads sharing the pmap have a consistent
view of the mapping.  This fixes the problem described in the commit
log message for r329254 without the overhead of an extra page fault
in the common case.  (Now that all pmap_enter() implementations are
similarly modified, the workaround added in r329254 can be removed,
reducing the overhead of COW faults.)

With this change we can reuse the PV entry from the old mapping,
potentially avoiding a call to reclaim_pv_chunk().  Otherwise, there is
nothing preventing the old PV entry from being reclaimed.  In rare
cases this could result in the PTE's page table page being freed,
leading to a use-after-free of the page when the updated PTE is written
following the allocation of the PV entry for the new mapping.

Reviewed by:	br, markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D16261
2018-07-14 20:14:00 +00:00
Matt Macy
ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00
Sean Bruno
096da4de18 riscv: Remove unused variable "code"
gcc found that the variabl "code", while being assigned a value, isn't
be used for anything.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D16114
2018-07-05 17:26:44 +00:00
Sean Bruno
96744f0225 Make ZSTD a real option via ZSTDIO.
It looks like the intent was to allow ZSTD support to be
compiled into the kernel with options ZSTDIO. But it doesn't look
like that was ever implemented or I'm missing how to do it.

I did a cursory audit of kernel config files and made a decision to
enable ZSTDIO in riscv GENERIC and mips MALTA configurations.  All other
kernel configurations already had this option in their kernel configs
but they didn't do anything useful as the feature was declared as
"standard" prior to this.

Reviewed by:	cem allanjude
Differential Revision:	https://reviews.freebsd.org/D16007
2018-07-05 17:07:23 +00:00
Ruslan Bukin
a08232301d Include UART driver since it is now provided in QEMU.
Sponsored by:	DARPA, AFRL
2018-06-29 10:55:42 +00:00
Ruslan Bukin
c43e3c8659 PLIC driver was sponsored by ECATS contract, not CTSRD one. 2018-06-21 11:52:09 +00:00
Ruslan Bukin
b626c976dc Don't jump to VA space until kernel is ready.
This fixes the race when first core sets up the pagetables, while
secondary cores do translating the address of __riscv_boot_ap.

This now allows us to smpboot in QEMU with 8 cores just fine.

Sponsored by:	DARPA, AFRL
2018-06-13 10:32:21 +00:00
Ruslan Bukin
6fdc57357e Include VirtIO devices to the GENERIC configuration file.
These are now available in QEMU/RISC-V.

Sponsored by:	DARPA, AFRL
2018-06-12 17:55:40 +00:00
Ruslan Bukin
2d53a67c2c o Add driver for PLIC (Platform-Level Interrupt Controller) device.
o Convert interrupt machdep support to use INTRNG code.

Sponsored by:	DARPA, AFRL
2018-06-12 17:45:15 +00:00
Ruslan Bukin
ebdf0baf3a Add simplebus-like RISC-V SoC bus.
This is required in order to probe and attach devices described under
"riscv-virtio-soc" node of DTS.

Sponsored by:	DARPA, AFRL
2018-06-12 17:07:30 +00:00
Ruslan Bukin
f2e299880a Release secondary cores from WFI (wait for interrupt) by sending them
an IPI.

This does not work however yet in QEMU. As a temporary workaround set
software interrupt pending bit manually on a local core to ensure WFI
doesn't halt the hart.

This is required to smpboot in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:47:33 +00:00
Ruslan Bukin
a9063ba1d7 Align virtual addressing entries.
This is required due to C-compressed ISA extension option being turned on.

This fixes SMP operation in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:19:27 +00:00
John Baldwin
ca75fa17ee Export a breakpoint() function to userland for riscv.
As a result, enable tests using breakpoint() on riscv.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D15191
2018-05-16 16:56:35 +00:00
Warner Losh
794af7cfdc Remove extra copy of bcopy.c now that we're using the libkern version
of this file.
2018-05-12 01:43:32 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Konstantin Belousov
8c8ee2ee1c Unify bulk free operations in several pmaps.
Submitted by:	Yoshihiro Ota
Reviewed by:	markj
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D13485
2018-03-04 20:53:20 +00:00
Warner Losh
ef1fcaf0f5 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread' policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Scetched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Warner Losh
62bca77843 Move __va_list and related defines to sys/sys/_types.h
__va_list and related defines are identical in all the
ARCH/include/_types.h files. Move them to sys/sys/_types.h

Sponsored by: Netflix
2018-02-12 14:48:20 +00:00
Warner Losh
33e959abab Use standard pattern for stdargs.h
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:05 +00:00
Mark Johnston
ab7c09f121 Use vm_page_unwire_noq() instead of directly modifying page wire counts.
No functional change intended.

Reviewed by:	alc, kib (previous revision)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14266
2018-02-08 19:28:51 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
Jeff Roberson
ab3185d15e Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
Colin Percival
d5d7606c0c Use the TSLOG framework to record entry/exit timestamps for DELAY and
_vprintf; these functions are called in many places and can contribute
meaningfully to the total time spent booting.
2017-12-31 09:24:41 +00:00
Konstantin Belousov
30d4f9e888 Add atomic_load(9) and atomic_store(9) operations.
They provide relaxed-ordered atomic access semantic.  Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses.  The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.

The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations.  It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.

Suggested by:	jhb
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 09:59:20 +00:00
Bruce Evans
fb3cc1c37d Move instantiation of msgbufp from 9 MD files to subr_prf.c.
This variable should be pure MI except possibly for reading it in MD
dump routines.  Its initialization was pure MD in 4.4BSD, but FreeBSD
changed this in r36441 in 1998.  There were many imperfections in
r36441.  This commit fixes only a small one, to simplify fixing the
others 1 arch at a time.  (r47678 added support for
special/early/multiple message buffer initialization which I want in
a more general form, but this was too fragile to use because hacking
on the msgbufp global corrupted it, and was only used for 5 hours in
-current...)
2017-12-07 07:55:38 +00:00
Pedro F. Giffuni
796df753f4 SPDX: Consider code from Carnegie-Mellon University.
Interesting cases, most likely from CMU Mach sources.
2017-11-30 15:48:35 +00:00
Ed Schouten
814629dd64 Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.

Reviewed by:	kib, jhibbits
Differential Revision:	https://reviews.freebsd.org/D13180
2017-11-24 07:35:08 +00:00
Ruslan Bukin
0d4435dfab o Invalidate the correct page in pmap_protect().
With this bug fix we don't need to invalidate all the entries.
o Remove a call to pmap_invalidate_all(). This was never called
  as the anyvalid variable is never set.

Obtained from:	arm64/pmap.c (r322797, r322800)
Sponsored by:	DARPA, AFRL
2017-11-22 14:10:58 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Eitan Adler
a2aef24aa3 Update several more URLs
- Primarily http -> https
- Primarily FreeBSD project URLs
2017-10-29 08:17:03 +00:00
Michal Meloun
904d8c492f Add AT_HWCAP2 ELF auxiliary vector.
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
 - expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
   same way as for AT_HWCAP.

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12699
2017-10-21 12:05:01 +00:00
Bjoern A. Zeeb
8e94025b41 With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD.  Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by:		many
Reviewed by:		kristof, emaste, hiren
X-MFC after:		never
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D12639
2017-10-20 21:40:59 +00:00
Alan Cox
2582d7a969 Sync with amd64/arm/arm64/i386/mips pmap change r288256:
Exploit r288122 to address a cosmetic issue.  Since PV chunk pages don't
belong to a vm object, they can't be paged out.  Since they can't be paged
out, they are never enqueued in a paging queue.  Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue.  As of r288122, we can avoid
this false impression by passing PQ_NONE.

MFC after:	1 week
2017-09-20 04:19:49 +00:00
Josh Paetzel
c77037f16f Fix indentation for r323068
PR:	220170
Reported by:	lidl
MFC after:	3 days
Pointyhat to:	jpaetzel
2017-09-19 20:40:05 +00:00
John Baldwin
c2f37b9245 Add AT_HWCAP and AT_EHDRFLAGS on all platforms.
A new 'u_long *sv_hwcap' field is added to 'struct sysentvec'.  A
process ABI can set this field to point to a value holding a mask of
architecture-specific CPU feature flags.  If an ABI does not wish to
supply AT_HWCAP to processes the field can be left as NULL.

The support code for AT_EHDRFLAGS was already present on all systems,
just the #define was not present.  This is a step towards unifying the
AT_* constants across platforms.

Reviewed by:	kib
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12290
2017-09-14 14:26:55 +00:00
Mateusz Guzik
4dabeda46c Fix riscv and powerpc compilation after r323329.
On these archs bzero is a C function, which triggers a compilation error
as the compiler tries to expand the macro.
2017-09-09 05:56:04 +00:00
Josh Paetzel
9d0ec2a920 Revert r323087
This needs more thinking out and consensus, and the commit message
was wrong AND there was a typo in the commit.

pointyhat:	jpaetzel
2017-09-01 17:03:48 +00:00
Josh Paetzel
0be04b100c Take options IPSEC out of GENERIC
PR:	220170
Submitted by:	delphij
Reviewed by:	ae, glebius
MFC after:	2 weeks
Differential Revision:	D11806
2017-09-01 15:54:53 +00:00
Josh Paetzel
3b65550eec Allow kldload tcpmd5
PR:	220170
MFC after:	2 weeks
2017-08-31 20:16:28 +00:00
Ruslan Bukin
af19cc59ca Support for v1.10 (latest) of RISC-V privilege specification.
New version is not compatible on supervisor mode with v1.9.1
(previous version).

Highlights:
    o BBL (Berkeley Boot Loader) provides no initial page tables
      anymore allowing us to choose VM, to build page tables manually
      and enable MMU in S-mode.
    o SBI interface changed.
    o GENERIC kernel.
      FDT is now chosen standard for RISC-V hardware description.
      DTB is now provided by Spike (golden model simulator). This
      allows us to introduce GENERIC kernel. However, description
      for console and timer devices is not provided in DTB, so move
      these devices temporary to nexus bus.
    o Supervisor can't access userspace by default. Solution is to
      set SUM (permit Supervisor User Memory access) bit in sstatus
      register.
    o Compressed extension is now turned on by default.
    o External GCC 7.1 compiler used.
    o _gp renamed to __global_pointer$
    o Compiler -march= string is now in use allowing us to choose
      required extensions (compressed, FPU, atomic, etc).

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11800
2017-08-10 14:18:09 +00:00
Jason A. Harmening
eb36b1d0bc Clean up MD pollution of bus_dma.h:
--Remove special-case handling of sparc64 bus_dmamap* functions.
  Replace with a more generic mechanism that allows MD busdma
  implementations to generate inline mapping functions by
  defining WANT_INLINE_DMAMAP in <machine/bus_dma.h>.  This
  is currently useful for sparc64, x86, and arm64, which all
  implement non-load dmamap operations as simple wrappers
  around map objects which may be bus- or device-specific.

--Remove NULL-checked bus_dmamap macros.  Implement the
  equivalent NULL checks in the inlined x86 implementation.
  For non-x86 platforms, these checks are a minor pessimization
  as those platforms do not currently allow NULL maps.  NULL
  maps were originally allowed on arm64, which appears to have
  been the motivation behind adding arm[64]-specific barriers
  to bus_dma.h, but that support was removed in r299463.

--Simplify the internal interface used by the bus_dmamap_load*
  variants and move it to bus_dma_internal.h

--Fix some drivers that directly include sys/bus_dma.h
  despite the recommendations of bus_dma(9)

Reviewed by:	kib (previous revision), marius
Differential Revision:	https://reviews.freebsd.org/D10729
2017-07-01 05:35:29 +00:00
Ruslan Bukin
5c118142b4 Undefine temporary macro.
This fixes world build.

Sponsored by:	DARPA, AFRL
2017-06-17 07:36:46 +00:00
Konstantin Belousov
2d88da2f06 Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
Konstantin Belousov
43f41dd393 Make struct syscall_args visible to userspace compilation environment
from machine/proc.h, consistently on all architectures.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 20:53:44 +00:00
Ruslan Bukin
4d3b6bd5df Follow r317061 "Remove struct vmmeter from struct pcpu"
with MD changes for RISC-V.

This unbreaks RISC-V build.

Sponsored by:	DARPA, AFRL
2017-04-19 17:06:32 +00:00
Ruslan Bukin
370bf3020d Provide a NULL pointer to device tree blob so GENERIC kernel
can be compiled.
We will need to get pointer to DTB from hardware, so mark as TODO.

Sponsored by:	DARPA, AFRL
2017-04-12 10:34:50 +00:00
Patrick Kelsey
67d955aab4 Corrected misspelled versions of rendezvous.
The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by:	gnn, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10313
2017-04-09 02:00:03 +00:00
Bruce Evans
f434f3515b Fix printing of negative offsets (typically from frame pointers) again.
I fixed this in 1997, but the fix was over-engineered and fragile and
was broken in 2003 if not before.  i386 parameters were copied to 8
other arches verbatim, mostly after they stopped working on i386, and
mostly without the large comment saying how the values were chosen on
i386.  powerpc has a non-verbatim copy which just changes the uncritical
parameter and seems to add a sign extension bug to it.

Just treat negative offsets as offsets if they are no more negative than
-db_offset_max (default -64K), and remove all the broken parameters.

-64K is not very negative, but it is enough for frame and stack pointer
offsets since kernel stacks are small.

The over-engineering was mainly to go more negative than -64K for the
negative offset format, without affecting printing for more than a
single address.

Addresses in the top 64K of a (full 32-bit or 64-bit) address space
are now printed less well, but there aren't many interesting ones.
For arches that have many interesting ones very near the top (e.g.,
68k has interrupt vectors there), there would be no good limit for
the negative offset format and -64K is a good as anything.
2017-03-26 18:46:35 +00:00
Ruslan Bukin
43b595f6a5 Implement atomic_fcmpset_*() for RISC-V.
Requested by: mjg
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9447
2017-02-05 00:32:12 +00:00
Konstantin Belousov
9fb10d635e Define the vm_ooffset_t and vm_pindex_t types as machine-independend.
The types are for the byte offset and page index in vm object.  They
are similar to off_t, which is defined as 64bit MI integer.  Using MI
definitions will allow to provide consistent MD values of vm
object-related maximum sizes.

Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-02-04 12:26:38 +00:00
Li-Wen Hsu
8ff21c23a5 Add RISC-V support for truss(1)
While here, extract NARGREG as a definition.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D9249
2017-01-24 09:41:44 +00:00
Ruslan Bukin
250b1bf3c5 Disable superpages reservations as we don't have implemented them yet.
Requested by:	Alan Cox <alc@rice.edu>
Sponsored by:	DARPA, AFRL
2016-11-21 12:00:31 +00:00
Ruslan Bukin
7804dd5212 Add full softfloat and hardfloat support for RISC-V.
Hardfloat is now default (use riscv64sf as TARGET_ARCH
for softfloat).

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D8529
2016-11-16 15:21:32 +00:00
Ruslan Bukin
71e35b7da8 Check if L2 entry exists for the given VA before loading L3 entry.
This is a fix for a panic that was easy to reproduce executing
"(/bin/ls &)" in the shell.

Sponsored by:	DARPA, AFRL
2016-11-14 18:30:03 +00:00
Ruslan Bukin
426b08bcd3 System Binary Interface (SBI) page was moved in latest version of
Berkeley Boot Loader (BBL) due to code size increase.

We will need to dehardcode this somehow.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-11-04 13:07:54 +00:00
Ruslan Bukin
2ad1d09f16 o Add support for long double.
o Add support for latest RISC-V GNU toolchain.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-11-03 13:06:17 +00:00
Andriy Voskoboinyk
7453645f2a rtwn(4), urtwn(4): merge common code, add support for 11ac devices.
All devices:
- add support for rate adaptation via ieee80211_amrr(9);
- use short preamble for transmitted frames when needed;
- multi-bss support:
 * for RTL8821AU: 2 VAPs at the same time;
 * other: 1 any VAP + 1 sta VAP.
RTL8188CE:
- fix IQ calibration bug (reason of significant speed degradation);
- add h/w crypto acceleration support.
USB:
- A-MPDU Tx support;
- short GI support;
Other:
- add support for RTL8812AU / RTL8821AU chipsets
(a/b/g/n only; no ac yet);
- split merged code into subparts:
 * bus glue (usb/*, pci/*, rtl*/usb/*, rtl*/pci/*)
 * common (if_rtwn*)
 * chip-specific (rtl*/*)
- various other bugfixes.

Due to code reorganization, module names / requirements were changed too:
urtwn urtwnfw -> rtwn rtwn_usb rtwnfw
rtwn  rtwnfw  -> rtwn rtwn_pci rtwnfw

Tested with RTL8188CE, RTL8188CUS, RTL8188EU and RTL8821AU.

Tested by:	kevlo, garga,
		Peter Garshtja <peter.garshtja@ambient-md.com>,
		Kevin McAleavey <kevin.mcaleavey@knosproject.com>,
		Ilias-Dimitrios Vrachnis <id@vrachnis.com>,
		<otacilio.neto@bsd.com.br>
Relnotes:	yes
2016-10-17 20:38:24 +00:00
Warner Losh
b2a7ac4802 Fix building on i386 and arm. But 'public domain' headers on the files
with no creative content. Include "lost" changes from git:
o Use /dev/efi instead of /dev/efidev
o Remove redundant NULL checks.

Submitted by: kib@, dim@, zbb@, emaste@
2016-10-13 06:56:23 +00:00
Jonathan T. Looney
bd79708dbf In the TCP stack, the hhook(9) framework provides hooks for kernel modules
to add actions that run when a TCP frame is sent or received on a TCP
session in the ESTABLISHED state. In the base tree, this functionality is
only used for the h_ertt module, which is used by the cc_cdg, cc_chd, cc_hd,
and cc_vegas congestion control modules.

Presently, we incur overhead to check for hooks each time a TCP frame is
sent or received on an ESTABLISHED TCP session.

This change adds a new compile-time option (TCP_HHOOK) to determine whether
to include the hhook(9) framework for TCP. To retain backwards
compatibility, I added the TCP_HHOOK option to every configuration file that
already defined "options INET". (Therefore, this patch introduces no
functional change. In order to see a functional difference, you need to
compile a custom kernel without the TCP_HHOOK option.) This change will
allow users to easily exclude this functionality from their kernel, should
they wish to do so.

Note that any users who use a custom kernel configuration and use one of the
congestion control modules listed above will need to add the TCP_HHOOK
option to their kernel configuration.

Reviewed by:	rrs, lstewart, hiren (previous version), sjg (makefiles only)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D8185
2016-10-12 02:16:42 +00:00
Warner Losh
943ac2b07e Include stubs even on the platforms we don't support so libsysdecode
continues to build.
2016-10-11 22:54:29 +00:00
Alan Cox
8cb0c1029d Various changes to pmap_ts_referenced()
Move PMAP_TS_REFERENCED_MAX out of the various pmap implementations and
into vm/pmap.h, and describe what its purpose is.  Eliminate the archaic
"XXX" comment about its value.  I don't believe that its exact value, e.g.,
5 versus 6, matters.

Update the arm64 and riscv pmap implementations of pmap_ts_referenced()
to opportunistically update the page's dirty field.

On amd64, use the PDE value already cached in a local variable rather than
dereferencing a pointer again and again.

Reviewed by:	kib, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D7836
2016-09-10 16:49:25 +00:00
Mark Johnston
dbbaf04f1e Remove support for idle page zeroing.
Idle page zeroing has been disabled by default on all architectures since
r170816 and has some bugs that make it seemingly unusable. Specifically,
the idle-priority pagezero thread exacerbates contention for the free page
lock, and yields the CPU without releasing it in non-preemptive kernels. The
pagezero thread also does not behave correctly when superpage reservations
are enabled: its target is a function of v_free_count, which includes
reserved-but-free pages, but it is only able to zero pages belonging to the
physical memory allocator.

Reviewed by:	alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D7714
2016-09-03 20:38:13 +00:00
Ruslan Bukin
9862cef040 o Separate rtc and timecmp registers: they are different across
RISC-V cpu implementations.
o Update RocketChip device tree source (DTS).

We now support latest verison of RocketChip synthesized on
Xilinx FPGA (Zedboard).

RocketChip is an implementation of RISC-V processor written on
Chisel hardware construction language.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-09-01 14:58:11 +00:00
Ruslan Bukin
5f8228b2f3 o Remove operation in machine mode.
Machine privilege level was specially designed to use in vendor's
  firmware or bootloader. We have implemented operation in machine
  mode in FreeBSD as part of understanding RISC-V ISA, but it is time
  to remove it.
  We now use BBL (Berkeley Boot Loader) -- standard RISC-V firmware,
  which provides operation in machine mode for us.
  We now use standard SBI calls to machine mode, instead of handmade
  'syscalls'.
o Remove HTIF bus.
  HTIF bus is now legacy and no longer exists in RISC-V specification.
  HTIF code still exists in Spike simulator, but BBL do not provide
  raw interface to it.
  Memory disk is only choice for now to have multiuser booted in Spike,
  until Spike has implemented more devices (e.g. Virtio, etc).

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-10 12:41:36 +00:00
Ruslan Bukin
98f50c44e3 Update RISC-V port to Privileged Architecture Version 1.9.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-08-02 14:50:14 +00:00
Konstantin Belousov
5c2cf81845 Update comments for the MD functions managing contexts for new
threads, to make it less confusing and using modern kernel terms.

Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
  cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
  cpu_set_upcall -> cpu_copy_thread (for forks)
  cpu_set_upcall_kse -> cpu_set_upcall (for new threads creation)

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (hrs)
Differential revision:	https://reviews.freebsd.org/D6731
2016-06-16 12:05:44 +00:00
Ruslan Bukin
e933649fb9 Remove duplicate define. 2016-06-08 13:57:18 +00:00
Ruslan Bukin
03e4a374c4 Fix typos. 2016-06-02 15:14:40 +00:00
Ruslan Bukin
99ea4d4733 Add support for loadable kernel modules.
Submitted by:	Yukishige Shibata <y-shibat@mtd.biglobe.ne.jp>
2016-06-01 14:12:31 +00:00
Ruslan Bukin
22d5f3540a * Enable KDTRACE options as we support DTrace now.
* Add bpf device to kernel config.
2016-06-01 12:19:00 +00:00
Ruslan Bukin
6f397ac71b Increase the size and alignment of the setjmp buffer.
This is required for future CPU extentions.

Reviewed by:	brooks
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-05-26 10:03:30 +00:00
Ruslan Bukin
fed1ca4b71 Add initial DTrace support for RISC-V.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-05-24 16:41:37 +00:00
Ruslan Bukin
892933d079 Store the original value of stack pointer to the exception frame
(the value we had before supervisor exception occurred).
This helps consumers (e.g. DTrace) to not proceed additional calculations.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-05-24 13:59:13 +00:00
Alan Cox
d3ffaee8e6 Eliminate an unused #include. For a brief period of time, _unrhdr.h was
used to implement PCID support on amd64.

Reviewed by:	kib
2016-05-13 20:14:41 +00:00
Ruslan Bukin
9af9422682 Rework the list of all pmaps: embed the list link into pmap. 2016-04-26 14:38:18 +00:00
Ruslan Bukin
3f8f5599a3 o Add device tree files and kernel configuration files
for RISC-V cpus synthesized on FPGA hardware.
o Include new files to the build.
2016-04-26 13:22:08 +00:00
Ruslan Bukin
00106e52c2 Add the non-standard "IO interrupt" vector used by lowRISC.
For now they provide UART irq only.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-26 12:56:44 +00:00
Ruslan Bukin
6c0d33bcb3 Add the implementation of basic bus_space_read/write functions.
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-26 12:45:01 +00:00
Ruslan Bukin
63270b0b65 Add the implementation of OF_decode_addr(). 2016-04-26 12:33:25 +00:00
Ruslan Bukin
30b72b6871 Move arm's devmap to some generic place, so it can be used
by other architectures.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D6091
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-26 11:53:37 +00:00
Ruslan Bukin
02a371289a o Implement shared pagetables and switch from 4 to 3 levels page
memory system.

RISC-V ISA has only single page table base register for both kernel
and user addresses translation. Before this commit we were using an
extra (4th) level of pagetables for switching between kernel and user
pagetables, but then realized FPGA hardware has 3-level page system
hardcoded. It is also become clear that the bitfile synthesized for
4-level system is untested/broken, so we can't use extra level for
switching.

We are now share level 1 of pagetables between kernel and user VA.
This requires to keep track of all the user pmaps created and once we
adding L1 page to kernel pmap we have to add it to all the user pmaps.

o Change the VM layout as we must have topmost bit to be 1 in the
  selected page system for kernel addresses and 0 for user addresses.
o Implement pmap_kenter_device().
o Create the l3 tables for the early devmap.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-25 14:47:51 +00:00
Ruslan Bukin
21e2c118b8 Do not setup machine exception vector.
Sounds strange, but both RocketCore and lowRISC do not operate
if we set it.

All the known implementations (Spike, QEMU, RocketCore, lowRISC) uses
default machine trap vector address and operates fine with this.

Original Berkeley Boot Loader (bbl) does not set this as well.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-25 13:30:37 +00:00
Ruslan Bukin
f8f69c9385 Revert r298477 ("Clear the DDR memory").
There is no need to clear all the DDR memory (we only need to clear
BSS section).
I was playing with non-default version of hardware (the bitfile
synthesized for 4-level page memory system) and clearing was helpful,
but then realized support for 4-level page system is untested/broken
in both RocketCore and lowRISC.
2016-04-25 13:20:57 +00:00
Ruslan Bukin
ce2b4fcfb9 Clear the DDR memory. This should be done by bootloaders,
but they have no such feature yet.

This fixes operation on Rocket Core and lowRISC.
2016-04-22 16:15:58 +00:00
Ruslan Bukin
fd3dc9f439 Add memory barriers (fence instructions) so the data wrotten by hardware
to physical address now can be read by VA.

This fixes operation on Rocket Core (FPGA).

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-22 15:12:05 +00:00
Ruslan Bukin
6c1838e37e Correct the event queue initialization.
This fixes operation on Rocket Core.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-04-22 15:04:46 +00:00
Pedro F. Giffuni
b93c97c14a risc-v: for pointers replace 0 with NULL.
These are mostly cosmetical, no functional change.

Found with devel/coccinelle.
2016-04-14 17:25:50 +00:00
Ruslan Bukin
d52d6d7ca7 Add support for ddb(4).
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-03-10 15:51:43 +00:00
Andrew Turner
a43760692b Make the fdt_get_mem_regions memsize argument optional. It's only used in
by a few callers.

Sponsored by:	ABT Systems Ltd
2016-03-01 09:45:27 +00:00
Justin Hibbits
e665eafb25 Correct the memory rman ranges to be to BUS_SPACE_MAXADDR
Summary:
As part of the migration of rman_res_t to be typed to uintmax_t, memory ranges
must be clamped appropriately for the bus, to prevent completely bogus addresses
from being used.

This is extracted from D4544.

Reviewed By: cem
Sponsored by:	Alex Perez/Inertial Computing
Differential Revision: https://reviews.freebsd.org/D5134
2016-03-01 02:59:06 +00:00
Wojciech Macek
e571e15cb0 Fix fdt_get_mem_regions() to work with 64-bit addresses
Use u_long instead of uint32_t variables to avoid overflow
    in case of PA space bigger than 32-bit.

Obtained from:         Semihalf
Submitted by:          Michal Stanek <mst@semihalf.com>
Sponsored by:          Annapurna Labs
Approved by:           cognet (mentor)
Reviewed by:           andrew, br, wma
Differential revision: https://reviews.freebsd.org/D5393
2016-02-29 09:22:39 +00:00
Ruslan Bukin
14232d424d o Use uint64_t for page number as it doesn't fit uint32_t.
o Implement growkernel bits for L1 level of pagetables.

This allows us to boot with 128GB of physical memory.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-02-26 14:04:00 +00:00
Ruslan Bukin
17696c12f5 Add support for symmetric multiprocessing (SMP).
Tested on Spike simulator with 2 and 16 cores (tlb enabled),
so set MAXCPU to 16 at this time.

This uses FDT data to get information about CPUs
(code based on arm64 mp_machdep).

Invalidate entire TLB cache as it is the only way yet.

Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
2016-02-24 16:50:34 +00:00
Ruslan Bukin
cea32c0742 o Grab physical memory regions information from the device tree.
o Increase memory size.
2016-02-23 14:21:46 +00:00
Ruslan Bukin
e1b6d4a9f4 Add basic trap handlers for illegal instruction and breakpoint
exceptions.
2016-02-22 14:54:50 +00:00
Ruslan Bukin
041c26e7ff Fix comment. 2016-02-22 14:19:45 +00:00
Ruslan Bukin
c7977d4cee Remove duplicates. 2016-02-22 14:13:05 +00:00
Ruslan Bukin
0b7bfc0beb Provide stack(9) MD stubs for RISC-V so ktr(9) can be compiled in. 2016-02-22 14:01:46 +00:00
Ruslan Bukin
dfb5ca561e Fix ktrace call. 2016-02-22 13:52:08 +00:00
Svatopluk Kraus
35a0bc1260 As <machine/vmparam.h> is included from <vm/vm_param.h>, there is no
need to include it explicitly when <vm/vm_param.h> is already included.

Suggested by:	alc
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D5379
2016-02-22 09:08:04 +00:00
Justin Hibbits
7915adb560 Introduce a RMAN_IS_DEFAULT_RANGE() macro, and use it.
This simplifies checking for default resource range for bus_alloc_resource(),
and improves readability.

This is part of, and related to, the migration of rman_res_t from u_long to
uintmax_t.

Discussed with:	jhb
Suggested by:	marcel
2016-02-20 01:32:58 +00:00