Commit Graph

135 Commits

Author SHA1 Message Date
Ruslan Bukin
6371d0bd64 Implement uma_small_alloc(), uma_small_free().
Reviewed by:	markj
Obtained from:	arm64
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16628
2018-08-08 16:08:38 +00:00
Marius Strobl
13a10f3414 Implement atomic_swap_{int,long,ptr}(9). 2018-08-07 18:56:51 +00:00
Ruslan Bukin
c50c8f642c Return ENAMETOOLONG if the latest copied character
is not null terminator.

Sponsored by:	DARPA, AFRL
2018-08-03 16:44:56 +00:00
Ruslan Bukin
385a185b43 Don't overwrite tp in set_mcontext().
This makes libthr/swapcontext_test:swapcontext1 happy.

Sponsored by:	DARPA, AFRL
2018-08-02 12:13:52 +00:00
Ruslan Bukin
84154f4b9e o Don't overwrite tp in fork_trampoline().
o Save and restore tp in cpu_switch().
o Restore tp in cpu_throw().
o Save tp in savectx().

This makes libthr tests happy. In particular fork_test:fork.

Sponsored by:	DARPA, AFRL
2018-08-02 12:12:13 +00:00
Ruslan Bukin
7bb4a84ad3 o Correctly set user tls base: consider TP_OFFSET.
o Ensure tp (thread pointer) saved before copying the pcb.

Sponsored by:	DARPA, AFRL
2018-08-02 12:08:52 +00:00
Konstantin Belousov
e45b89d23d Add pmap_is_valid_memattr(9).
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:45:51 +00:00
Ruslan Bukin
a304bc9729 Disable VIMAGE on RISC-V.
Similar to r326179 ("Temporarily disable VIMAGE on arm64") creation of
if_lagg or epair on RISC-V results a kernel panic.

Sponsored by:	DARPA, AFRL
2018-07-30 12:22:49 +00:00
Ruslan Bukin
b51092c7ec Use SPP (Supervisor Previous Privilege) bit in the sstatus
register to determine if trap is from userspace.

Otherwise if we jump to kernel address from userspace, then
TRAPF_USERMODE failed to detect usermode and then do_ast
triggers a panic "ast in kernel mode".

Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16469
2018-07-27 16:13:06 +00:00
Mark Johnston
a3055a5e47 Implement pmap_mincore() for riscv.
Reviewed by:	alc, br
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16444
2018-07-26 16:08:26 +00:00
Ruslan Bukin
87f9acc9e1 Remove unused string.
Reported by:	markj@
Sponsored by:	DARPA, AFRL
2018-07-25 15:44:49 +00:00
Mark Johnston
c03590f5a2 Embed a simplebus_softc in struct soc_softc.
This is required by the definition of the soc driver.

Reviewed by:	br
Sponsored by:	The FreeBSD Foundation
2018-07-24 21:02:11 +00:00
Ruslan Bukin
8eca6e4855 Fix setjmp for RISC-V:
o The correct value for _JB_SIGMASK is 27.
o The storage size for double-precision floating
  point register is 8 bytes.

Submitted by:	"James Clarke" <jrtc4@cam.ac.uk>
Reviewed by:	markj@
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16344
2018-07-23 09:54:28 +00:00
Warner Losh
9ecd7fdebe Remove VM_FREELIST_ISADMA. It's not needed on these architectures.
Differential Review: https://reviews.freebsd.org/D16290
2018-07-17 21:07:53 +00:00
Alan Cox
afeed44dc5 Invalidate the mapping before updating its physical address.
Doing so ensures that all threads sharing the pmap have a consistent
view of the mapping.  This fixes the problem described in the commit
log message for r329254 without the overhead of an extra page fault
in the common case.  (Now that all pmap_enter() implementations are
similarly modified, the workaround added in r329254 can be removed,
reducing the overhead of COW faults.)

With this change we can reuse the PV entry from the old mapping,
potentially avoiding a call to reclaim_pv_chunk().  Otherwise, there is
nothing preventing the old PV entry from being reclaimed.  In rare
cases this could result in the PTE's page table page being freed,
leading to a use-after-free of the page when the updated PTE is written
following the allocation of the PV entry for the new mapping.

Reviewed by:	br, markj
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D16261
2018-07-14 20:14:00 +00:00
Matt Macy
ab3059a8e7 Back pcpu zone with domain correct pages
- Change pcpu zone consumers to use a stride size of PAGE_SIZE.
  (defined as UMA_PCPU_ALLOC_SIZE to make future identification easier)

- Allocate page from the correct domain for a given cpu.

- Don't initialize pc_domain to non-zero value if NUMA is not defined
  There are some misconceptions surrounding this field. It is the
  _VM_ NUMA domain and should only ever correspond to valid domain
  values as understood by the VM.

The former slab size of sizeof(struct pcpu) was somewhat arbitrary.
The new value is PAGE_SIZE because that's the smallest granularity
which the VM can allocate a slab for a given domain. If you have
fewer than PAGE_SIZE/8 counters on your system there will be some
memory wasted, but this is obviously something where you want the
cache line to be coming from the correct domain.

Reviewed by: jeff
Sponsored by: Limelight Networks
Differential Revision:  https://reviews.freebsd.org/D15933
2018-07-06 02:06:03 +00:00
Sean Bruno
096da4de18 riscv: Remove unused variable "code"
gcc found that the variabl "code", while being assigned a value, isn't
be used for anything.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D16114
2018-07-05 17:26:44 +00:00
Sean Bruno
96744f0225 Make ZSTD a real option via ZSTDIO.
It looks like the intent was to allow ZSTD support to be
compiled into the kernel with options ZSTDIO. But it doesn't look
like that was ever implemented or I'm missing how to do it.

I did a cursory audit of kernel config files and made a decision to
enable ZSTDIO in riscv GENERIC and mips MALTA configurations.  All other
kernel configurations already had this option in their kernel configs
but they didn't do anything useful as the feature was declared as
"standard" prior to this.

Reviewed by:	cem allanjude
Differential Revision:	https://reviews.freebsd.org/D16007
2018-07-05 17:07:23 +00:00
Ruslan Bukin
a08232301d Include UART driver since it is now provided in QEMU.
Sponsored by:	DARPA, AFRL
2018-06-29 10:55:42 +00:00
Ruslan Bukin
c43e3c8659 PLIC driver was sponsored by ECATS contract, not CTSRD one. 2018-06-21 11:52:09 +00:00
Ruslan Bukin
b626c976dc Don't jump to VA space until kernel is ready.
This fixes the race when first core sets up the pagetables, while
secondary cores do translating the address of __riscv_boot_ap.

This now allows us to smpboot in QEMU with 8 cores just fine.

Sponsored by:	DARPA, AFRL
2018-06-13 10:32:21 +00:00
Ruslan Bukin
6fdc57357e Include VirtIO devices to the GENERIC configuration file.
These are now available in QEMU/RISC-V.

Sponsored by:	DARPA, AFRL
2018-06-12 17:55:40 +00:00
Ruslan Bukin
2d53a67c2c o Add driver for PLIC (Platform-Level Interrupt Controller) device.
o Convert interrupt machdep support to use INTRNG code.

Sponsored by:	DARPA, AFRL
2018-06-12 17:45:15 +00:00
Ruslan Bukin
ebdf0baf3a Add simplebus-like RISC-V SoC bus.
This is required in order to probe and attach devices described under
"riscv-virtio-soc" node of DTS.

Sponsored by:	DARPA, AFRL
2018-06-12 17:07:30 +00:00
Ruslan Bukin
f2e299880a Release secondary cores from WFI (wait for interrupt) by sending them
an IPI.

This does not work however yet in QEMU. As a temporary workaround set
software interrupt pending bit manually on a local core to ensure WFI
doesn't halt the hart.

This is required to smpboot in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:47:33 +00:00
Ruslan Bukin
a9063ba1d7 Align virtual addressing entries.
This is required due to C-compressed ISA extension option being turned on.

This fixes SMP operation in QEMU.

Sponsored by:	DARPA, AFRL
2018-06-12 16:19:27 +00:00
John Baldwin
ca75fa17ee Export a breakpoint() function to userland for riscv.
As a result, enable tests using breakpoint() on riscv.

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D15191
2018-05-16 16:56:35 +00:00
Warner Losh
794af7cfdc Remove extra copy of bcopy.c now that we're using the libkern version
of this file.
2018-05-12 01:43:32 +00:00
Brooks Davis
9c11d8d483 Remove the unused fuwintr() and suiwintr() functions.
Half of implementations always failed (returned (-1)) and they were
previously used in only one place.

Reviewed by:	kib, andrew
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15102
2018-04-17 18:04:28 +00:00
Ed Maste
fc2a8776a2 Rename assym.s to assym.inc
assym is only to be included by other .s files, and should never
actually be assembled by itself.

Reviewed by:	imp, bdrewery (earlier)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D14180
2018-03-20 17:58:51 +00:00
Konstantin Belousov
8c8ee2ee1c Unify bulk free operations in several pmaps.
Submitted by:	Yoshihiro Ota
Reviewed by:	markj
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D13485
2018-03-04 20:53:20 +00:00
Warner Losh
ef1fcaf0f5 Do not include float interfaces when using libsa.
We don't support float in the boot loaders, so don't include
interfaces for float or double in systems headers. In addition, take
the unusual step of spiking double and float to prevent any more
accidental seepage.
2018-02-23 04:04:25 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread' policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Scetched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Warner Losh
62bca77843 Move __va_list and related defines to sys/sys/_types.h
__va_list and related defines are identical in all the
ARCH/include/_types.h files. Move them to sys/sys/_types.h

Sponsored by: Netflix
2018-02-12 14:48:20 +00:00
Warner Losh
33e959abab Use standard pattern for stdargs.h
We don't support older compilers. Most of the code in these files is
for pre-3.0 gcc, which is at least 15 years obsolete. Move to using
phk's sys/_stdargs.h for all these platforms.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D14323
2018-02-12 14:48:05 +00:00
Mark Johnston
ab7c09f121 Use vm_page_unwire_noq() instead of directly modifying page wire counts.
No functional change intended.

Reviewed by:	alc, kib (previous revision)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D14266
2018-02-08 19:28:51 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00
Jeff Roberson
ab3185d15e Implement NUMA support in uma(9) and malloc(9). Allocations from specific
domains can be done by the _domain() API variants.  UMA also supports a
first-touch policy via the NUMA zone flag.

The slab layer is now segregated by VM domains and is precise.  It handles
iteration for round-robin directly.  The per-cpu cache layer remains
a mix of domains according to where memory is allocated and freed.  Well
behaved clients can achieve perfect locality with no performance penalty.

The direct domain allocation functions have to visit the slab layer and
so require per-zone locks which come at some expense.

Reviewed by:	Attilio (a slightly older version)
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2018-01-12 23:25:05 +00:00
Colin Percival
d5d7606c0c Use the TSLOG framework to record entry/exit timestamps for DELAY and
_vprintf; these functions are called in many places and can contribute
meaningfully to the total time spent booting.
2017-12-31 09:24:41 +00:00
Konstantin Belousov
30d4f9e888 Add atomic_load(9) and atomic_store(9) operations.
They provide relaxed-ordered atomic access semantic.  Due to the
FreeBSD memory model, the operations are syntaxical wrappers around
the volatile accesses.  The volatile qualifier is used to ensure that
the access not optimized out and in turn depends on the volatile
semantic as implemented by supported compilers.

The motivation for adding the operation is to help people coming from
other systems or knowing the C11/C++ standards where atomics have
special type and require use of the special access operations.  It is
still the case that FreeBSD requires plain load and stores of aligned
integer types to be atomic.

Suggested by:	jhb
Reviewed by:	alc, jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13534
2017-12-19 09:59:20 +00:00
Bruce Evans
fb3cc1c37d Move instantiation of msgbufp from 9 MD files to subr_prf.c.
This variable should be pure MI except possibly for reading it in MD
dump routines.  Its initialization was pure MD in 4.4BSD, but FreeBSD
changed this in r36441 in 1998.  There were many imperfections in
r36441.  This commit fixes only a small one, to simplify fixing the
others 1 arch at a time.  (r47678 added support for
special/early/multiple message buffer initialization which I want in
a more general form, but this was too fragile to use because hacking
on the msgbufp global corrupted it, and was only used for 5 hours in
-current...)
2017-12-07 07:55:38 +00:00
Pedro F. Giffuni
796df753f4 SPDX: Consider code from Carnegie-Mellon University.
Interesting cases, most likely from CMU Mach sources.
2017-11-30 15:48:35 +00:00
Ed Schouten
814629dd64 Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.

Reviewed by:	kib, jhibbits
Differential Revision:	https://reviews.freebsd.org/D13180
2017-11-24 07:35:08 +00:00
Ruslan Bukin
0d4435dfab o Invalidate the correct page in pmap_protect().
With this bug fix we don't need to invalidate all the entries.
o Remove a call to pmap_invalidate_all(). This was never called
  as the anyvalid variable is never set.

Obtained from:	arm64/pmap.c (r322797, r322800)
Sponsored by:	DARPA, AFRL
2017-11-22 14:10:58 +00:00
Pedro F. Giffuni
df57947f08 spdx: initial adoption of licensing ID tags.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.

Initially, only tag files that use BSD 4-Clause "Original" license.

RelNotes:	yes
Differential Revision:	https://reviews.freebsd.org/D13133
2017-11-18 14:26:50 +00:00
Eitan Adler
a2aef24aa3 Update several more URLs
- Primarily http -> https
- Primarily FreeBSD project URLs
2017-10-29 08:17:03 +00:00
Michal Meloun
904d8c492f Add AT_HWCAP2 ELF auxiliary vector.
- allocate value for new AT_HWCAP2 auxiliary vector on all platforms.
 - expand 'struct sysentvec' by new 'u_long *sv_hwcap2', in exactly
   same way as for AT_HWCAP.

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D12699
2017-10-21 12:05:01 +00:00
Bjoern A. Zeeb
8e94025b41 With r181803 on 2008-08-17 23:27:27Z the first VIMAGE commit went into
HEAD.  Enable VIMAGE in GENERIC kernels and some others (where GENERIC does
not exist) on HEAD.

Disable building LINT-VIMAGE with VIMAGE being default.

This should give it a lot more exposure in the run-up to 12 to help
us evaluate whether to keep it on by default or not.
We are also hoping to get better performance testing.
The feature can be disabled using nooptions.

Requested by:		many
Reviewed by:		kristof, emaste, hiren
X-MFC after:		never
Relnotes:		yes
Differential Revision:	https://reviews.freebsd.org/D12639
2017-10-20 21:40:59 +00:00
Alan Cox
2582d7a969 Sync with amd64/arm/arm64/i386/mips pmap change r288256:
Exploit r288122 to address a cosmetic issue.  Since PV chunk pages don't
belong to a vm object, they can't be paged out.  Since they can't be paged
out, they are never enqueued in a paging queue.  Nonetheless, passing
PQ_INACTIVE to vm_page_unwire() creates the appearance that these pages
are being enqueued in the inactive queue.  As of r288122, we can avoid
this false impression by passing PQ_NONE.

MFC after:	1 week
2017-09-20 04:19:49 +00:00