Commit Graph

527 Commits

Author SHA1 Message Date
Mitchell Horne
0d2224733e Implement GET_STACK_USAGE on remaining archs
This definition enables callers to estimate remaining space on the
kstack, and take action on it. Notably, it enables optimizations in the
GEOM and netgraph subsystems to directly dispatch work items when there
is sufficient stack space, rather than queuing them for a worker thread.

Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
not go unimplemented elsewhere.

PR:		259157
Reviewed by:	mav, kib, markj (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32580
2021-11-30 11:15:56 -04:00
Mark Johnston
ecbbe83144 netinet: Deduplicate most in_cksum() implementations
in_cksum() and related routines are implemented separately for each
platform, but only i386 and arm have optimized versions.  Other
platforms' copies of in_cksum.c are identical except for style
differences and support for big-endian CPUs.

Deduplicate the implementations for the rest of the platforms.  This
will make it easier to implement in_cksum() for unmapped mbufs.  On arm
and i386, define HAVE_MD_IN_CKSUM to mean that the MI implementation is
not to be compiled.

No functional change intended.

Reviewed by:	kp, glebius
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33095
2021-11-24 13:31:16 -05:00
Warner Losh
d2bf8c544a riscv: Make machine/regs.h self-contained
Make sys/reg.h self-contained by making riscv's machine/reg.h
self-contained.

Sponsored by:		Netflix
2021-11-23 21:21:17 -07:00
Mitchell Horne
10fe6f80a6 minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992
2021-11-19 15:05:52 -04:00
Mitchell Horne
1d2d1418b4 minidump: Use provided msgbuf pointer
Don't assume we are dumping the global message buffer, but use the one
provided by the state argument. While here, drop superfluous
cast to char *.

Reviewed by:	markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31991
2021-11-19 15:05:52 -04:00
Mitchell Horne
681bd71047 minidump: reduce the amount direct accesses to page tables
During a live dump, we may race with updates to the kernel page tables.
This is generally okay; we accept that the state of the system while
dumping may be somewhat inconsistent with its state when the dump was
invoked. However, when walking the kernel page tables, it is important
that we load each PDE/PTE only once while operating on it. Otherwise, it
is possible to have the relevant PTE change underneath us. For example,
after checking the valid bit, but before reading the physical address.

Convert the loads to atomics, and add some validation around the
physical addresses, to ensure that we do not try to dump a non-existent
or non-canonical physical address.

Similarly, don't read kernel_vm_end more than once, on the off chance
that pmap_growkernel() is called between the two page table walks.

Reviewed by:	kib, markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31990
2021-11-19 15:05:52 -04:00
Mitchell Horne
1adebe3cd6 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989
2021-11-19 15:05:52 -04:00
Kristof Provost
4e85b64890 Add a COMPAT_FREEBSD13 kernel option
Use it wherever COMPAT_FREEBSD11 is currently specified.

Reviewed by:	jhb (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33005
2021-11-17 03:08:40 +01:00
Kristof Provost
23e1961e78 riscv: add COMPAT_FREEBSD12 option
Turn on compat option for older FreeBSD versions (i.e. 12). We do not
enable the compat options for 11 or older because riscv was never
supported in those versions.

Reviewed by:	jrtc27 (previous version)
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33015
2021-11-17 03:08:14 +01:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overriden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Revired by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Kyle Evans
6a8ea6d174 sched: split sched_ap_entry() out of sched_throw()
sched_throw() can no longer take a NULL thread, APs enter through
sched_ap_entry() instead.  This completely removes branching in the
common case and cleans up both paths.  No functional change intended.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D32829
2021-11-05 15:45:51 -05:00
Kyle Evans
589aed00e3 sched: separate out schedinit_ap()
schedinit_ap() sets up an AP for a later call to sched_throw(NULL).

Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
a call to sched_throw(NULL); this results in a window where curthread is
setup in platforms' init_secondary(), but it has the wrong td_lock.
Typical platform AP startup procedure looks something like:

- Setup curthread
- ... other stuff, including cpu_initclocks_ap()
- Signal smp_started
- sched_throw(NULL) to enter the scheduler

cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
attempt to sched_add() for this AP, but this attempt fails because
of the noted violated assumption leading to locking heartburn in
sched_setpreempt().

Interrupts are still disabled until cpu_throw() so we're not really at
risk of being preempted -- just let the scheduler in on it a little
earlier as part of setting up curthread.

Reviewed by:	alfredo, kib, markj
Triage help from:	andrew, markj
Smoke-tested by:	alfredo (ppc), kevans (arm64, x86), mhorne (arm)
Differential Revision:	https://reviews.freebsd.org/D32797
2021-11-03 15:54:59 -05:00
Philip Paeps
91feb4f420 riscv: add iicbus and iicoc to GENERIC
The iicoc driver supports the OpenCores I2C IP.  This is included in at
least the SiFive "Unleashed" and "Unmatched" cores and probably others.

Suggested by:	jrtc27
2021-11-01 13:19:55 +08:00
Mark Johnston
84c3922243 Convert consumers to vm_page_alloc_noobj_contig()
Remove now-unneeded page zeroing.  No functional change intended.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32006
2021-10-19 21:22:56 -04:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Jessica Clarke
682c00a6ce riscv: Implement pmap_mapdev_attr
This is needed for LinuxKPI's _ioremap_attr. This reuses the generic
implementation introduced for aarch64, and itself requires implementing
pmap_kenter, which is trivial to do given riscv currently treats all
mapping attributes the same due to the Svpbmt extension not yet being
ratified and in hardware.

Reviewed by:	markj, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32445
2021-10-17 15:31:35 +01:00
Konstantin Belousov
4cc167a352 Restore PPS_SYNC in NOTES
This partially reverts e81e77c5a0, leaving the option both in
GENERICs on amd64/arm64/arm, and in global NOTES file.  Apparently
this better matches existing practice, where we do not try to hard
to make LINT and GENERIC complimentary.

Requested and reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-12 23:10:35 +03:00
Konstantin Belousov
e81e77c5a0 Enable PPS_SYNC on amd64, arm64 and armv7
Remove the option from NOTES/LINT, and add to NOTES for powerpc and
riscv.

PR:	259036
Requested by:	John Hay <john@sanren.ac.za>
Discussed with:	ian, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-10 22:34:40 +03:00
Mitchell Horne
17f790f49f arm, arm64, riscv: adjust top-level nexus comment
These platforms don't manage resources for DMA request lines or I/O
ports, this is specific to x86. Remove the references from the comments.

Reviewed by:	imp, jhb
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32358
2021-10-08 14:16:32 -03:00
Konstantin Belousov
aba66031f2 riscv: move signal delivery code to exec_machdep.c
Reviewed by:	emaste, imp
Discussed with:	jrtc27
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32310
2021-10-08 03:20:42 +03:00
Mitchell Horne
8babb5582e riscv: fix VM_MAXUSER_ADDRESS checks in asm routines
There are two issues with the checks against VM_MAXUSER_ADDRESS. First,
the comparison should consider the values as unsigned, otherwise
addresses with the high bit set will fail to branch. Second, the value
of VM_MAXUSER_ADDRESS is, by convention, one larger than the maximum
mappable user address and invalid itself. Thus, use the bgeu instruction
for these comparisons.

Add a regression test case for copyin(9).

PR:		257193
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D31209
2021-10-07 18:12:30 -03:00
Mitchell Horne
4a9f2f8b07 riscv: handle page faults in the unmappable region
When handling a kernel page fault, check explicitly that stval resides
in either the user or kernel address spaces, and make the page fault
fatal if not. Otherwise, a properly crafted address may appear to
pmap_fault() as a valid and present page in the kernel map, causing the
page fault to be retried continuously. This is mainly due to the fact
that the upper bits of virtual addresses are not validated by most of
the pmap code.

Faults of this nature should only occur due to some kind of bug in the
kernel, but it is best to handle them gracefully when they do.

Handle user page faults in the same way, sending a SIGSEGV immediately
when a malformed address is encountered.

Add an assertion to pmap_l1(), which should help catch other bugs of
this kind that make it this far.

Reviewed by:	jrtc27, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31208
2021-10-07 18:12:17 -03:00
Jessica Clarke
2404f03fca riscv: Add vt and kbdmux to GENERIC for video console support
No in-tree drivers are supported for RISC-V (given it supports UEFI we
could enable the EFI framebuffer, but U-Boot has very limited hardware
support and EDK2 remains a work in progress), but drm-kmod exists with
drivers for video cards that can be used with the HiFive Unmatched.

Reviewed by:	imp, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32001
2021-10-03 19:34:53 +01:00
Jessica Clarke
1be2e16df1 riscv: Add a stub pmap_change_attr implementation
pmap_change_attr is required by drm-kmod so we need the function to
exist. Since the Svpbmt extension is on the horizon we will likely end
up with a real implementation of it, so this stub implementation does
all the necessary page table walking to validate the input, ensuring
that no new errors are returned once it's implemented fully (other than
due to out of memory conditions when demoting L2 entries) and providing
a skeleton for that future implementation.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31996
2021-10-03 19:33:47 +01:00
John Baldwin
0177102173 arm64, riscv: Fix TRAF_PC() to return the PC, not the return address.
Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D31969
2021-10-01 11:53:12 -07:00
Mitchell Horne
ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Mitchell Horne
31991a5a45 minidump: De-duplicate is_dumpable()
The function is identical in each minidump implementation, so move it to
vm_phys.c. The only slight exception is powerpc where the function was
public, for use in moea64_scan_pmap().

Reviewed by:	kib, markj, imp (earlier version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31884
2021-09-29 16:41:52 -03:00
Thomas Skibo
f5d78bea1f sifive_spi: Add missing case for SPIBUS_MODE_NONE
Otherwise sckmode is left uninitialised, not zero. This mode is used for
the on-board flash on the HiFive Unmatched board. Whilst here, catch
unknown modes and return an error rather than silently continuing.

Reviewed by:	#riscv, jrtc27
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31562
2021-08-30 23:38:02 +01:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Mateusz Guzik
c69cc8d101 riscv: retire bzero
Unused since ba96f37758 ("Use __builtin for various mem* and b* (e.g. bzero)
routines.")

Reviewed by:	mhorne
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-23 18:38:05 +00:00
Jessica Clarke
98138bbde0 riscv: Fix pmap_alloc_l2 when it should allocate a new L1 entry
The current code checks the RWX bits are 0 but does not check the V bit
is non-zero, meaning not-yet-allocated L1 entries that are still zero
are regarded as being allocated. This is likely due to copying the arm64
code that checks ATTR_DESC_MASK is L1_TABLE, which emcompasses both the
type and the validity in a single field, and erroneously translating
that to a check of just PTE_RWX being 0 to indicate non-leaf, forgetting
about the V bit. This then results in the following panic:

    panic: Fatal page fault at 0xffffffc0005cf292: 0x00000000000050
    cpuid = 1
    time = 1628379581
    KDB: stack backtrace:
    db_trace_self() at db_trace_self
    db_trace_self_wrapper() at db_trace_self_wrapper+0x38
    kdb_backtrace() at kdb_backtrace+0x2c
    vpanic() at vpanic+0x148
    panic() at panic+0x2a
    page_fault_handler() at page_fault_handler+0x1ba
    do_trap_supervisor() at do_trap_supervisor+0x7a
    cpu_exception_handler_supervisor() at
    cpu_exception_handler_supervisor+0x70
    --- exception 13, tval = 0x50
    pmap_enter_l2() at pmap_enter_l2+0xb2
    pmap_enter_object() at pmap_enter_object+0x15e
    vm_map_pmap_enter() at vm_map_pmap_enter+0x228
    vm_map_insert() at vm_map_insert+0x4ec
    vm_map_find() at vm_map_find+0x474
    vm_map_find_min() at vm_map_find_min+0x52
    vm_mmap_object() at vm_mmap_object+0x1ba
    vn_mmap() at vn_mmap+0xf8
    kern_mmap() at kern_mmap+0x4c4
    sys_mmap() at sys_mmap+0x38
    do_trap_user() at do_trap_user+0x208
    cpu_exception_handler_user() at cpu_exception_handler_user+0x72
    --- exception 8, tval = 0x1dd

Instead, we should just check the V bit, as on amd64, and assert that
any valid L1 entries are not leaves, since an L1 leaf would render the
entire range allocated and thus we should not have attempted to map that
VA in the first place.

Reported by:	David Gilbert <dgilbert@daveg.ca>
MFC after:	1 week
Reviewed by:	markj, mhorne
Differential Revision:	https://reviews.freebsd.org/D31460
2021-08-09 20:28:37 +01:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Jessica Clarke
c5e5202a3d riscv: Sync NOTES with GENERIC changes
USB is already in sys/conf/NOTES, but NVMe is not, nor of course are the
new SiFive device drivers.

MFC after:	1 week
2021-08-07 23:20:38 +01:00
Jessica Clarke
0a4cb54506 riscv: Add hwreset to NOTES to fix LINT build
Fixes:		8e7e0690ec ("sifive_prci: Add reset support for the FU540 and FU740")
MFC after:	1 week
2021-08-07 23:15:20 +01:00
Jessica Clarke
6e162bd2f2 riscv: Add NVMe, USB and HID support to GENERIC
The SiFive FU740 has both NVMe and USB so we need both to ensure we can
mount root, and HID is a dependency of USB.

Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31036
2021-08-07 19:27:33 +01:00
Jessica Clarke
896e217a0e fu740_pci_dw: Add SiFive FU740 PCIe controller driver
Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31033
2021-08-07 19:27:31 +01:00
Jessica Clarke
b47e5c5dbe sifive_gpio: Add SiFive GPIO controller driver
This is present on both the FU540 and FU740, but only needed for the
FU740 in order to assert reset and power enable signals for its PCIe
controller.

Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31031
2021-08-07 19:27:31 +01:00
Jessica Clarke
90a089cf2a fu540_spi: Rename to sifive_spi
The FU740 also uses the same SPI controller.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31026
2021-08-07 19:27:30 +01:00
Jessica Clarke
8e7e0690ec sifive_prci: Add reset support for the FU540 and FU740
This is needed for FU740 PCIe support. Whilst we don't need the FU540's
resets they are also defined for completeness.

Reviewed by:	manu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31024
2021-08-07 19:27:29 +01:00
Jessica Clarke
dcbea9a2f4 sifive_prci: Delay attachment until after clk_fixed
This avoids noisy output from early attempts to attach before clk_fixed
has attached to the parent clocks.

Reviewed by:	kp, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31023
2021-08-07 19:27:29 +01:00
Jessica Clarke
589d8a78a5 sifive_prci: Add support for the FU740 PRCI
Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31022
2021-08-07 19:27:28 +01:00
Jessica Clarke
12b115ec57 fu540_prci: Rename to sifive_prci and use ocd_data for FU540 specificity
The FU740 has a very similar controller and will reuse most of the
driver. This also drops the dependency on the device-tree include for
the binding indices; the header doesn't namespace its contents (and nor
does the FU740 one) so using both would require seperate translation
units which would be unnecessarily complicated just to avoid defining
local copies of the small number of constants.

Whilst here, add the missing l to gemgxlclk's name and drop the prci_
prefix from tlclk's name as we don't prefix any of the others and it's
entirely unnecessary.

Reviewed by:	kp, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31021
2021-08-07 19:27:27 +01:00
Konstantin Belousov
041b7317f7 Add pmap_vm_page_alloc_check()
which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Jessica Clarke
4a23504908 riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
This repeats amd64's cfcbf8c6fd (r180498) and i386's cf3508519c
(r202894) but for riscv; pmap_kextract must be lock-free and so it can
race with superpage promotion and demotion, thus the L2 entry must only
be loaded once to avoid using inconsistent state.

PR:	250866
Reviewed by:	markj, mhorne
Tested by:	David Gilbert <dgilbert@daveg.ca>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31253
2021-07-22 20:02:14 +01:00
Jessica Clarke
8c439847f0 riscv: Include spibus and spigen in GENERIC
We already attempt to enable the SiFive SPI controller, but since spibus
isn't enabled it isn't actually built.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31027
2021-07-21 06:46:09 +01:00
Jessica Clarke
ade2ea3c45 riscv: Fix pindex level confusion
The pindex values are assigned from the L3 leaves upwards, meaning there
are NUL2E L3 tables and then NUL1E L2 tables (with a futher NUL0E L1
tables in future when we implement Sv48 support). Therefore anything
below NUL2E is an L3 table's page and anything above or equal to NUL2E
is an L2 table's page (with the threshold of NUL2E + NUL1E marking the
start of the L1 tables' pages in Sv48). Thus all the comparisons and
arithmetic operations must use NUL2E to handle the L3/L2 allocation (and
thus L2/L1 entry) transition point, not NUL1E as all but pmap_alloc_l2
were doing.

To make matters confusing, the NUL1E and NUL2E definitions in the RISC-V
pmap are based on a 4-level page hierarchy but we currently use the
3-level Sv39 format (as that's the only required one, and hardware
support for the 4-level Sv48 is not widespread). This means that, in
effect, the above bug cancels out with the bloated NULxE definitions
such that things "work" (but are still technically wrong, and thus would
break when adding Sv48 support), with one exception. pmap_enter_l2 is
currently the only function to use the correct constant, but since
_pmap_alloc_l3 uses the incorrect constant, it will do complete nonsense
when it needs to allocate a new L2 table (which is rather rare). In this
instance, _pmap_alloc_l3, whilst it would correctly determine the pindex
was for an L2 table, would only subtract NUL1E when computing l1index
and thus go way out of bounds (by 511*512*512 bytes, or 127.75 GiB) of
its own L1 table and, thanks to pmap_distribute_l1, of every other
pmap's L1 table in the whole system. This has likely never been hit as
it would presumably instantly fault and panic.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31087
2021-07-21 02:51:26 +01:00
Jessica Clarke
a1f9cdb1ab sifive_uart: Fix input character dropping in ddb and at a mountroot prompt
These use the raw console interface and poll. Unfortunately, the SiFive
UART puts the FIFO empty bit inside the FIFO data register, which means
that the act of checking whether a character is available also dequeues
any character from the FIFO, requiring the user to press each key twice.
However, since we configure the watermark to be 0 and, when the UART has
been grabbed for the console, we have interrupts off, we can abuse the
interrupt pending register to act as a substitute for the FIFO empty
bit.

This perhaps suggests that the console interface should move from having
rxready and getc to having getc_nonblock and getc (or make getc take a
bool), as all the places that call rxready do so to avoid blocking on
getc when there is no character available.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31025
2021-07-21 02:51:25 +01:00
Jessica Clarke
d9e85f2c6f riscv: Implement missing nexus methods
This is required for the SiFive FU740's PCIe controller. Copied from
arm64 with the only difference being changing pmap_mapdev_attr to
pmap_mapdev as riscv only has the latter.

Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31032
2021-07-21 02:51:25 +01:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00