Commit Graph

13935 Commits

Author SHA1 Message Date
Mark Johnston
f06f1d1fdb x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33205
2021-12-06 10:39:08 -05:00
Mitchell Horne
03b3d7bbec x86: remove unused T_USER flag
It stopped being used in 3c256f5395, when trap() was reorganized to
have separate switch statements for user and kernel traps. Remove the
two leftover references and the flag itself.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33253
2021-12-05 11:12:40 -04:00
Mitchell Horne
8bc792b384 i386: take pcb and fpu area into account in GET_STACK_USAGE
On this platform, the pcb and FPU save area are allocated from the top
of each kernel stack, so they should be excluded from the calculation of
the total and used stack sizes.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32581
2021-11-30 11:03:46 -04:00
Ed Maste
28dcccc129 x86 GENERIC/MINIMAL: group sc(4) devices together
The vga and splash devices are part of the sc(4) system console. vt(4)
uses the vt_vga driver instead, and has some limited splash support
directly in vt_core.c.  Leave the sc(4) options in GENERIC/MINIMAL (for
now) but group them together under an sc(4) comment.

Sponsored by:	The FreeBSD Foundation
2021-11-28 14:38:41 -05:00
Ed Maste
777526ed83 Remove options VESA from x86 MINIMAL
Followup to b8cf1c5c30, remove from MINIMAL in addition to GENERIC.

options VESA / vesa.ko provides VESA Bios Extensions (VBE) support for
the legacy sc(4) console.  It is not used by the default console, vt(4).

PR:		253733
Fixes:		b8cf1c5c30 ("Remove options VESA from x86 GENERIC")
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2021-11-28 14:37:46 -05:00
Ed Maste
b8cf1c5c30 Remove options VESA from x86 GENERIC
options VESA / vesa.ko provides VESA Bios Extensions (VBE) support for
the legacy sc(4) console.  It is not used by the default console, vt(4).

There is a report[1] of an incompatibility between VESA and the Nvidia
driver breaking suspend/resume.  Since VESA is not used by the default
configuration anyway, just remove options VESA from GENERIC.  The kernel
module is still available and may be loaded by sc(4) users who want to
select a VBE mode.

(Note that vt(4) does not support selecting a VBE mode.  The loader can
set a VBE mode and vt(4) will use it via the vt_vbefb driver.)

[1] https://lists.freebsd.org/archives/freebsd-hackers/2021-November/000469.html

PR:		253733
Reported by:	Stefan Blachmann [1]
Reviewed by:	imp, manu, tsoome
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33141
2021-11-28 11:29:17 -05:00
Ed Maste
228e020a3b Correct syscons description in i386 and amd64 configs
Commit 2d6f6d6373 switched to vt(4) as the default console.

Sponsored by:	The FreeBSD Foundation
2021-11-27 16:22:42 -05:00
Mateusz Guzik
af4051d250 linux: remove the always curthread argument from lconvpath 2021-11-25 22:50:42 +00:00
Warner Losh
8722e05ae1 twa: Remove
Belatedly remove twa(4). It was supposed to go before 13.0, but was
overlooked.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33114
2021-11-25 00:45:13 -07:00
Warner Losh
0d5935af8f esp: Remove
Belatedly remove esp(4). It was tagged as gone in 13, but was overlooked
until now.

Sponsored by:		Netflix
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33115
2021-11-25 00:45:12 -07:00
Warner Losh
60de2867c9 amr: remove
Belatedly remove amr(4). It was slated to depart before 13.0 but was
overlooked until now.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33113
2021-11-25 00:45:12 -07:00
Warner Losh
399188a2c6 iir: Remove
Belatedly remove iir(4). It was slated to go before 13, but was
overlooked.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33112
2021-11-25 00:45:12 -07:00
Warner Losh
a9620045a5 mly: Remove.
We'd said this was going away in 13, but was overlooked. Belatedly
remove.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33111
2021-11-25 00:45:12 -07:00
Mark Johnston
ecbbe83144 netinet: Deduplicate most in_cksum() implementations
in_cksum() and related routines are implemented separately for each
platform, but only i386 and arm have optimized versions.  Other
platforms' copies of in_cksum.c are identical except for style
differences and support for big-endian CPUs.

Deduplicate the implementations for the rest of the platforms.  This
will make it easier to implement in_cksum() for unmapped mbufs.  On arm
and i386, define HAVE_MD_IN_CKSUM to mean that the MI implementation is
not to be compiled.

No functional change intended.

Reviewed by:	kp, glebius
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33095
2021-11-24 13:31:16 -05:00
Mark Johnston
09100f936b netinet: Remove in_cksum_update()
It was never implemented on powerpc or riscv and appears to have been
unused since it was added in 1998.  No functional change intended.

Reviewed by:	kp, glebius, cy
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33093
2021-11-24 13:31:15 -05:00
Brooks Davis
6b7c23a026 syscalls: regen 2021-11-22 22:36:57 +00:00
Brooks Davis
f0cfbffc36 syscalls: regen 2021-11-22 22:36:56 +00:00
Brooks Davis
717e7fb27a syscalls: struct ucontext4 -> struct freebsd4_ucontext
This aligns with struct freebsd4_ucontext32 in freebsd32.

Reviewed by:	kib
2021-11-22 22:36:54 +00:00
Mitchell Horne
10fe6f80a6 minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992
2021-11-19 15:05:52 -04:00
Mitchell Horne
1d2d1418b4 minidump: Use provided msgbuf pointer
Don't assume we are dumping the global message buffer, but use the one
provided by the state argument. While here, drop superfluous
cast to char *.

Reviewed by:	markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31991
2021-11-19 15:05:52 -04:00
Mitchell Horne
681bd71047 minidump: reduce the amount direct accesses to page tables
During a live dump, we may race with updates to the kernel page tables.
This is generally okay; we accept that the state of the system while
dumping may be somewhat inconsistent with its state when the dump was
invoked. However, when walking the kernel page tables, it is important
that we load each PDE/PTE only once while operating on it. Otherwise, it
is possible to have the relevant PTE change underneath us. For example,
after checking the valid bit, but before reading the physical address.

Convert the loads to atomics, and add some validation around the
physical addresses, to ensure that we do not try to dump a non-existent
or non-canonical physical address.

Similarly, don't read kernel_vm_end more than once, on the off chance
that pmap_growkernel() is called between the two page table walks.

Reviewed by:	kib, markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31990
2021-11-19 15:05:52 -04:00
Mitchell Horne
1adebe3cd6 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989
2021-11-19 15:05:52 -04:00
Kristof Provost
4e85b64890 Add a COMPAT_FREEBSD13 kernel option
Use it wherever COMPAT_FREEBSD11 is currently specified.

Reviewed by:	jhb (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33005
2021-11-17 03:08:40 +01:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overriden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Revired by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Warner Losh
eb6b6622fe Update MINIMAL to have CC options
The MINIMAL configs were overlooked. They are compiled as part of
universe, so this broke universe builds. Add the same defafults as for
GENERIC.

Sponsored by:		Netflix
2021-11-12 09:33:18 -07:00
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Warner Losh
072d5b98c4 sysbeep: Adjust interface to take a duration as a sbt
Change the 'period' argument to 'duration' and change its type to
sbintime_t so we can more easily express different durations.

Reviewed by:	tsoome, glebius
Differential Revision:	https://reviews.freebsd.org/D32619
2021-11-03 16:03:51 -06:00
Edward Tomasz Napierala
4dfd612286 linux: mv sys/i386/linux/linux_ptrace{,_machdep}.c
In preparation for machine-independent sys/compat/linux/linux_ptrace.c,
rename the i386-specific Linux ptrace(2) implementation.  No functional
changes.

Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D32757
2021-11-03 08:50:17 +00:00
Gleb Smirnoff
6aae3517ed Retire synchronous PPP kernel driver sppp(4).
The last two drivers that required sppp are cp(4) and ce(4).

These devices are still produced and can be purchased
at Cronyx <http://cronyx.ru/hardware/wan.html>.

Since Roman Kurakin <rik@FreeBSD.org> has quit them, they no
longer support FreeBSD officially.  Later they have dropped
support for Linux drivers to.  As of mid-2020 they don't even
have a developer to maintain their Windows driver.  However,
their support verbally told me that they could provide aid to
a FreeBSD developer with documentaion in case if there appears
a new customer for their devices.

These drivers have a feature to not use sppp(4) and create an
interface, but instead expose the device as netgraph(4) node.
Then, you can attach ng_ppp(4) with help of ports/net/mpd5 on
top of the node and get your synchronous PPP.  Alternatively
you can attach ng_frame_relay(4) or ng_cisco(4) for HDLC.
Actually, last time I used cp(4) back in 2004, using netgraph(4)
instead of sppp(4) was already the right way to do.

Thus, remove the sppp(4) related part of the drivers and enable
by default the negraph(4) part.  Further maintenance of these
drivers in the tree shouldn't be a big deal.

While doing that, remove some cruft and enable cp(4) compilation
on amd64.  The ce(4) for some unknown reason marks its internal
DDK functions with __attribute__ fastcall, which most likely is
safe to remove, but without hardware I'm not going to do that, so
ce(4) remains i386-only.

Reviewed by:		emaste, imp, donner
Differential Revision:	https://reviews.freebsd.org/D32590
See also:		https://reviews.freebsd.org/D23928
2021-10-22 11:41:36 -07:00
Mark Johnston
ff93447d8e Use the vm_radix_init() helper when initializing pmaps
No functional change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32527
2021-10-19 21:22:56 -04:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Mark Johnston
de8554295b cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

This is a re-application of commit
9068f6ea69.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-10-18 09:56:58 -04:00
Konstantin Belousov
4c5bf59152 i386: move signal delivery code to exec_machdep.c
also move ptrace-related helpers to ptrace_machdep.c
Apply some style. Use ANSI C function definitions.
Remove MPSAFE annotations.

Reviewed by:	emaste, imp
Discussed with:	jrtc27
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32310
2021-10-08 03:20:42 +03:00
Mitchell Horne
ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Mitchell Horne
31991a5a45 minidump: De-duplicate is_dumpable()
The function is identical in each minidump implementation, so move it to
vm_phys.c. The only slight exception is powerpc where the function was
public, for use in moea64_scan_pmap().

Reviewed by:	kib, markj, imp (earlier version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31884
2021-09-29 16:41:52 -03:00
Warner Losh
0cfc8b10ed f00f: We don't need giant to create IDT for workaround.
We don't need to assert we have Giant here. All machines that require
the F00F workaround are UP and interrupts are disabled. Since we are
single threaded, it's safe to allocate the IDT area with pmap_trm_alloc,
interact with the current idt table and replace the IDT table without
any Giant locking.

Sponsored by:		Netflix
Reviewed by:		kib, markj
Differential Revision:	https://reviews.freebsd.org/D31839
2021-09-29 10:35:21 -06:00
Konstantin Belousov
cf0ee8738e Drop cloudabi
According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by:	ed (private mail)
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Mark Johnston
bcdc599dc2 Revert "cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it"
This reverts commit 9068f6ea69.

The underlying macro needs to be reworked to avoid problems with control
flow statements.

Reported by:	rlibby
2021-09-21 13:51:42 -04:00
Mark Johnston
9068f6ea69 cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-09-21 12:07:47 -04:00
Konstantin Belousov
2b6eec531a x86: duplicate acpi_wakeup.c per i386 and amd64
The file as is is the maze of #ifdef passages, all slightly different.
Divorcing i386 and amd64 version actually makes changing the code
easier, also no changes for i386 are planned.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31931
2021-09-14 00:23:14 +03:00
Alexander Motin
7af4475a6e vmd(4): Major driver refactoring
- Re-implement pcib interface to use standard pci bus driver on top of
vmd(4) instead of custom one.
 - Re-implement memory/bus resource allocation to properly handle even
complicated configurations.
 - Re-implement interrupt handling to evenly distribute children's MSI/
MSI-X interrupts between available vmd(4) MSI-X vectors and setup them
to be handled by standard OS mechanisms with minimal overhead, except
sharing when unavoidable.

Successfully tested on Dell XPS 13 laptop with Core i7-1185G7 CPU (VMD
device ID 0x9a0b) and single NVMe SSD, dual-booting with Windows 10.

Successfully tested on Supermicro X11DPI-NT motherboard with Xeon(R)
Gold 6242R CPUs (VMD device ID 0x201d), simultaneously handling NVMe
SSD on one PCIe port and PLX bridge with 3 NVMe and 1 AHCI SSDs on
another.  Handles SSD hot-plug (except Optane 905p for some reason,
which are not detected until manual bus rescan) and enabled IOMMU
(directly connected SSDs work, but ones connected to the PLX fail
without errors from IOMMU).

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential revision:	https://reviews.freebsd.org/D31762
2021-09-02 20:58:02 -04:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Konstantin Belousov
31607861e2 Style
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-08-28 23:39:16 +03:00
Gordon Bergling
5bdf58e196 Fix some common typos in source code comments
- s/priviledged/privileged/
- s/funtion/function/
- s/doens't/doesn't/
- s/sychronization/synchronization/

MFC after:	3 days
2021-08-28 18:57:23 +02:00
Mateusz Guzik
38941f5993 i386: retire bcopy
Unused since ba96f37758 ("Use __builtin for various mem* and b* (e.g. bzero)
routines.")

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-24 11:24:07 +00:00
Mateusz Guzik
e0545190ef i386: retire bzero
Unused since ba96f37758 ("Use __builtin for various mem* and b* (e.g. bzero)
routines.")

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-23 18:38:05 +00:00
Mateusz Guzik
8e4f67f17f i386: retire bcmp
Unused since ba96f37758 ("Use __builtin for various mem* and b* (e.g. bzero)
routines.")

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-23 18:38:04 +00:00
Adam Fenn
6c69c6bb4c kvm_clock: KVM paravirtual clock support
Add support for the KVM paravirtual clock device.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29733
2021-08-14 15:57:54 +03:00
Adam Fenn
652ae7b114 x86: cpufunc: Add rdtsc_ordered()
Add a variant of 'rdtsc()' that performs the ordered version of 'rdtsc'
appropriate for the invoking x86 variant.

Also, expose the 'lfence'-ed and 'mfence'-ed 'rdtsc()' variants needed
by 'rdtsc_ordered()' for general use.

Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D31416
2021-08-14 15:57:53 +03:00
Adam Fenn
908e277230 x86: cpufunc: Add rdtscp_aux()
Add a variant of 'rdtscp()' that retains and returns the 'IA32_TSC_AUX'
value read by 'rdtscp'.

Sponsored By:	Juniper Networks, Inc.
Sponsored By:	Klara, Inc.
Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D31415
2021-08-14 15:57:53 +03:00
Gordon Bergling
a1581cd735 Fix a common typo in source code comments
- s/aligment/alignment/

MFC after:	5 days
2021-08-14 14:17:48 +02:00
NagaChaitanya Vellanki
2a9b4076dc
Merge common parts of i386 and amd64's ieeefp.h into x86/x86_ieeefp.h
MFC after:	1 week
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D26292
2021-08-12 18:45:22 +08:00
Dmitry Chagin
b356030e67 linux(4): Regen for clone3 system call.
MFC after:		2 weeks
2021-08-12 11:50:22 +03:00
Dmitry Chagin
17913b0b6b linux(4): Implement clone3 system call.
clone3 system call is used by glibc-2.34.

Differential revision:	https://reviews.freebsd.org/D31475
MFC after:		2 weeks
2021-08-12 11:49:36 +03:00
Dmitry Chagin
0a4b664ae8 linux(4): Add struct clone_args for future clone3 system call.
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.

Differential revision:	https://reviews.freebsd.org/D31474
MFC after:		2 weeks
2021-08-12 11:49:01 +03:00
Dmitry Chagin
0c08f34f4d linux(4): Regen for clone syscall.
MFC after:		2 weeks
2021-08-12 11:47:31 +03:00
Dmitry Chagin
f1c450492f linux(4): Change clone syscall definition to match Linux actual one.
Differential revision:	https://reviews.freebsd.org/D31473
MFC after:		2 weeks
2021-08-12 11:46:36 +03:00
Dmitry Chagin
de8374df28 fork: Allow ABI to specify fork return values for child.
At least Linux x86 ABI's does not use carry bit and expects that the dx register
is preserved. For this add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31472
MFC after:		2 weeks
2021-08-12 11:45:25 +03:00
Dmitry Chagin
bee191e46f linux(4): Regen for faccessat2 system call.
MFC after:		2 weeks
2021-08-12 11:41:35 +03:00
Dmitry Chagin
13d79be995 linux(4): Implement faccessat2 system call.
It's used by bash on arm64 with glibc-2.32.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D31345
MFC after:		2 weeks
2021-08-12 11:40:42 +03:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Roger Pau Monné
82bf6a2566 xen/timer: fix amd64 LINT kernel build
On amd64 XENHVM depends on the xentimer device for PVH early startup,
so both should be added or removed together (like the current
dependency with xenpci). Fix this by adding xentimer to NOTES and
updating the comments on the config files. Note that on i386 there's
no such dependency between xentimer and XENHVM, since there's no PVH
support.

While there also fix the MINIMAL i386 build to include the xentimer,
so it keeps the same functionality as before xentimer was split from
XENHVM.

Reported by: lwhsu
PR: 257549
Fixes: ae59812748 ('xen/timer: make xen timer optional')
2021-08-02 10:33:35 +02:00
Konstantin Belousov
041b7317f7 Add pmap_vm_page_alloc_check()
which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Konstantin Belousov
b27fe1c3ba amd64: stop doing special allocation for the AP startup trampoline
There is no reason now why do we need to allocate trampoline page very
early in the boot process.  The only requirement for the page is that
it is below 1M to be usable by the real mode during init.  This can be
handled by vm_alloc_contig() when we do the startup.

Also assert that startup trampoline fits into single page.  In principle
we can do multi-page allocation if needed, but it is not.

Move the alloc_ap_trampoline() function and the boot_address variable to
i386/mp_machdep.c.  Keep existing mechanism of early alloc on i386.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31343
2021-07-30 01:20:45 +03:00
Dmitry Chagin
741f80df53 linux(4): Eliminating an accidental comment.
MFC after:		2 weeks
2021-07-29 12:51:56 +03:00
Dmitry Chagin
0dc38e3303 linux(4): Reimplement futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31236
MFC after:		2 weeks
2021-07-29 12:43:48 +03:00
Dmitry Chagin
f337940144 linux(4): Fix gcc buld.
gcc failed as it didn't inlined the builtins and generates calls to
the libgcc, ld can't find libgcc as cross-toolchain libgcc is not installed.
To avoid this add internal vDSO ffs functions without optimized builtins.

Reported by:		jhb
MFC after:		2 weeks
2021-07-29 09:52:33 +03:00
Julien Grall
ae59812748 xen/timer: make xen timer optional
The timer is not used on ARM.

Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D29041
2021-07-28 17:27:03 +02:00
Edward Tomasz Napierala
30c6d98219 linux: implement sigaltstack(2) on arm64
... by making it machine-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31286
2021-07-27 13:34:49 +00:00
Edward Tomasz Napierala
72f7ddb587 linux: implement rt_sigsuspend(2) on arm64
... by making it architecture-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31259
2021-07-23 20:13:00 +00:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
09cffde975 linux(4): Fixup the vDSO initialization order.
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();

As the exec_sysvec_init() called with SI_ORDER_ANY (last) at SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D30902
MFC after		2 weeks
2021-07-20 10:02:34 +03:00
Dmitry Chagin
a543556c81 linux(4): Constify vdso install/deinstall.
In order to reduce diff between arches constify vdso install/deinstall
functions like arm64.

Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30901
MFC after:		2 weeks
2021-07-20 10:01:47 +03:00
Dmitry Chagin
9931033bbf linux(4); Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, has no process private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid contex-switch overhead frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

System calls implemenation based on a native timekeeping code with some
limitations:
- ifunc can't be used, as vDSO r/o mapped to the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case on any error vDSO system calls fallback to the kernel system
calls. For unimplemented vDSO system calls added prototypes which call
corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
Dmitry Chagin
5fd9cd53d2 linux(4): Modify sv_onexec hook to return an error.
Temporary add stubs to the Linux emulation layer which calls the existing hook.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30911
MFC after:		2 weeks
2021-07-20 09:56:25 +03:00
Jessica Clarke
439097486b acpi: Fix a repeated comment typo 2021-07-19 17:19:23 +01:00
Jessica Clarke
52fba9a943 acpi: Fix a repeated vm_offset_t that should be a vm_size_t
The underlying types for both are the same so arguably this doesn't
really matter, but using the wrong type is still confusing and
technically incorrect.
2021-07-19 17:19:23 +01:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
Mark Johnston
b092c58c00 Assert that valid PTEs are not overwritten when installing a new PTP
amd64 and 32-bit ARM already had assertions to this effect.  Add them to
other pmaps.

Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31171
2021-07-15 12:17:33 -04:00
Warner Losh
c0c703342d pccard: remove pccard device from all kernels
All the PC Card drivers have been removed from the tree. Remove the
pccard drivers from all the kernels.

Sponsored by:		Netflix
2021-07-13 20:39:31 -06:00
Peter Grehan
517904de5c igc(4): Introduce new driver for the Intel I225 Ethernet controller.
This controller supports 2.5G/1G/100MB/10MB speeds, and allows
tx/rx checksum offload, TSO, LRO, and multi-queue operation.

The driver was derived from code contributed by Intel, and modified
by Netgate to fit into the iflib framework.

Thanks to Mike Karels for testing and feedback on the driver.

Reviewed by:	bcr (manpages), kbowling, scottl, erj
MFC after:	1 month
Relnotes:	yes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D30668
2021-07-12 14:57:18 +10:00
David Chisnall
d2b558281a Revert "Pass the syscall number to capsicum permission-denied signals"
This broke the i386 build.

This reverts commit 3a522ba1bc.
2021-07-10 20:26:01 +01:00
David Chisnall
3a522ba1bc Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-10 17:19:52 +01:00
Konstantin Belousov
55e63ed307 x86: use ANSI C definition style for trap_fatal
PR:	257062
Submitted by:	Vijay Sharma <vijaysh312@gmail.com>
MFC after:	1 week
2021-07-10 14:46:53 +03:00
Konstantin Belousov
28a66fc3da Do not call FreeBSD-ABI specific code for all ABIs
Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI.  Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by:	dchagin
Reviewed by:	dchagin, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:07 +03:00
Edward Tomasz Napierala
435754a59e Add infrastructure required for Linux coredump support
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values.  It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication.  It will be used
for Linux coredumps.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30921
2021-06-29 08:49:12 +01:00
Dmitry Chagin
79617645c6 linux(4): Retire unused declaration.
MFC after:	2 weeks
2021-06-22 08:41:33 +03:00
Dmitry Chagin
4efdf5820e linux(4): Retire a now unused include.
MFC after:	2 weeks
2021-06-22 08:39:47 +03:00
Dmitry Chagin
bfe2903798 linux(4): Do not specify shared page for aout binaries.
In Linux vDSO is a small shared ELF library, so it is not intended
for aout binaries. This was added on 64-bit Linuxulator import by mistake.

MFC after:	2 weeks
2021-06-22 08:38:45 +03:00
Dmitry Chagin
c1da89fec2 linux(4): Retire linux_kplatform.
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().

I have added linux_kplatform for consistency with linux_platform which is
placed in to vdso to avoid excess copyout it on stack for AT_PLATFORM at
exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:		trasz, imp
Differential Revision:	https://reviews.freebsd.org/D30774
MFC after:		2 weeks
2021-06-22 08:36:21 +03:00
Dmitry Chagin
e013e36939 linux(4): Get rid of Linuxulator kernel build options.
Stop confusing people, retire COMPAT_LINUX and COMPAT_LINUX32 kernel
build options. Since we have 32 and 64 bit Linux emulators, we can't build both
emulators together into the kernel. I don't think it matters, Linux emulation
depends on loadable modules (via rc).

Cut LINPROCFS and LINSYSFS for consistency.

PR:			215061
Reviewed by:		bcr (manpages), trasz
Differential Revision:	https://reviews.freebsd.org/D30751
MFC after:		2 weeks
2021-06-22 08:32:39 +03:00
Dmitry Chagin
8fe8bb7cb5 linux(4): Regen for linux_poll system call.
MFC after:	2 weeks
2021-06-22 08:09:55 +03:00
Dmitry Chagin
2eff670fde linux(4): Implement poll system call via linux_common_ppol()
for the sake of converting events to/from native.

MFC after:	2 weeks
2021-06-22 08:07:46 +03:00
Dmitry Chagin
26795a0378 linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses in kernel ppoll(2) without
conversion of user supplied fd 'events', and does not convert the
kernel supplied fd 'revents'.

At least POLLRDHUP is handled by FreeBSD differently than by
Linux. Seems that Linux silencly ignores POLLRDHUP on non socket fd's
unlike FreeBSD, which does more strictly check and fails.

Rework the Linux ppoll, using kern_poll and converting 'events'
and 'revents' values.
While here, move poll events defines to the MI part of code as they
mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks
2021-06-22 08:06:05 +03:00
Konstantin Belousov
870e197d52 Add quirks for Linux ABI signals handling
Require queueing of the signals with default action, and disable
dequeueing SIGCHLD on wait for live process.

Reported and tested by:	dchagin
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30675
2021-06-16 02:01:35 +03:00
Olivier Houchard
30b915d7b2 an: Remove driver
Now that an(4) is gone, remove it from GENERIC kernel config files.

Reported by:	flo
2021-06-12 01:08:54 +02:00
Dmitry Chagin
89f15b79b1 linux(4): Regen for ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:19:12 +03:00
Dmitry Chagin
ed61e0ce1d linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:18:46 +03:00
Dmitry Chagin
981a60f112 linux(4): Regen for pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:04:37 +03:00
Dmitry Chagin
f6d075ecd7 linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:03:30 +03:00
Dmitry Chagin
c002529000 linux(4): Regen for rt_sigtimedwait_time64 system call.
MFC after:	2 weeks
2021-06-10 14:52:43 +03:00
Dmitry Chagin
db4a1f331b linux(4): Implement rt_sigtimedwait_time64 system call.
It still does not work as intended, awaits D30675.

MFC after:	2 weeks
2021-06-10 14:51:30 +03:00
Dmitry Chagin
985978806e linux(4): Regen for futex_time64 system call.
MFC after:	2 weeks
2021-06-10 14:28:25 +03:00
Dmitry Chagin
2e46d0c3d9 linux(4): Implement futex_time64 system call.
MFC after:	2 weeks
2021-06-10 14:27:06 +03:00
Dmitry Chagin
ee64d98204 linux(4): Regen for futex system call.
MFC after:	2 weeks
2021-06-10 14:16:40 +03:00
Dmitry Chagin
3c1de151e3 linux(4): Change Linux futex syscall definition to match Linux actual one.
MFC after:	2 weeks
2021-06-10 14:00:00 +03:00
Mark Johnston
334335cb14 i386: Add "options HYPERV" to NOTES
This unbreaks the LINT build.

Fixes:		97993d1ebf
Reported by:	mjg
MFC after:	13 days
2021-06-09 09:01:22 -04:00
Mark Johnston
97993d1ebf hyperv: Fix vmbus after the i386 4/4 split
The vmbus ISR needs to live in a trampoline.  Dynamically allocating a
trampoline at driver initialization time poses some difficulties due to
the fact that the KENTER macro assumes that the offset relative to
tramp_idleptd is fixed at static link time.  Another problem is that
native_lapic_ipi_alloc() uses setidt(), which assumes a fixed trampoline
offset.

Rather than fight this, move the Hyper-V ISR to i386/exception.s.  Add a
new HYPERV kernel option to make this optional, and configure it by
default on i386.  This is sufficient to make use of vmbus(4) after the
4/4 split.  Note that vmbus cannot be loaded dynamically and both the
HYPERV option and device must be configured together.  I think this is
not too onerous a requirement, since vmbus(4) was previously
non-functional.

Reported by:	Harry Schmalzbauer <freebsd@omnilan.de>
Tested by:	Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by:	whu, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30577
2021-06-08 09:40:30 -04:00
Konstantin Belousov
598f6fb49c linuxolator: Add compat.linux.setid_allowed knob
PR:	21463
Reported by:	kris
Reviewed by:	dchagin
Tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28154
2021-06-06 21:43:00 +03:00
Dmitry Chagin
f4e801085b linux(4): optimize ksiginfo to siginfo conversion.
Retire ksiginfo_to_lsiginfo function, use siginfo_to_lsiginfo instead.
Convert rt_sigtimedwait siginfo variables to well known names.

MFC after:	2 weeks
2021-06-07 06:06:17 +03:00
Dmitry Chagin
e29ea22f70 Regen for ('0f8dab45404f347752470579feccc6d2739b9570') Linux
rt_sigtimedwait system call.

MFC after:	2 weeks
2021-06-07 05:39:29 +03:00
Dmitry Chagin
0f8dab4540 linux(4): Fix timeout parameter of rt_sigtimedwait syscall, which is
timespec not a timeval.

MFC after:	2 weeks
2021-06-07 05:35:35 +03:00
Dmitry Chagin
56b187005c Regen for ('6501370a7dfb358daf07555136742bc064e68cb7') Linux
clock_nanosleep_time64 system call.

MFC after:	2 weeks
2021-06-07 05:29:27 +03:00
Dmitry Chagin
6501370a7d linux(4): Implement clock_nanosleep_time64 system call.
MFC after:	2 weeks
2021-06-07 05:26:48 +03:00
Dmitry Chagin
773d9153c3 Regen for ('187715a420237e1ed94dd5aef158eada7dcdc559') Linux
clock_getres_time64 system call.

MFC after:	2 weeks
2021-06-07 05:21:48 +03:00
Dmitry Chagin
187715a420 linux(4): Implement clock_getres_time64 system call.
MFC after:	2 weeks
2021-06-07 05:21:32 +03:00
Dmitry Chagin
82e3848654 Regen for ('19f9a0e4df54f8d1e99234146024422bdcfa09ce') Linux
clock_settime64 system call.

MFC after:	2 weeks
2021-06-07 05:14:04 +03:00
Dmitry Chagin
19f9a0e4df linux(4): Implement clock_settime64 system call.
MFC after:	2 weeks
2021-06-07 05:11:25 +03:00
Dmitry Chagin
9e07ae7a09 Regen for ('99b6f430698fa00a33184dd61591d8b6518ed9d3') Linux
clock_gettime64 system call.

MFC after:	2 weeks
2021-06-07 05:08:11 +03:00
Dmitry Chagin
99b6f43069 linux(4): Implement clock_gettime64 system call.
MFC after:	2 weeks
2021-06-07 05:04:42 +03:00
Dmitry Chagin
ea7fa5583c Regen for ('e4bffb80bbc6a2e4b3be89aefcbd5bb2c2fc0ba0') Linux
utimensat_time64 syscall.

MFC after:	2 weeks
2021-06-07 04:56:58 +03:00
Dmitry Chagin
e4bffb80bb linux(4): Implement utimensat_time64 system call.
MFC after:	2 weeks
2021-06-07 04:54:30 +03:00
Dmitry Chagin
bfcce1a9f6 linux(4): add struct timespec64 definition and conversion routine for
future use.

MFC after:		2 weeks
2021-06-07 04:47:12 +03:00
Mark Johnston
cbe59a6475 i386: Make setidt_disp a size_t instead of uintptr_t
setidt_disp is the offset of the ISR trampoline relative to the address
of the routines in exception.s, so uintptr_t is not quite right.

Also remove a bogus declaration I added in commit 18f55c67f7, it is not
required after all.

Reported by:	jrtc27
Reviewed by:	jrtc27, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30590
2021-06-01 19:37:50 -04:00
Dmitry Chagin
19593f775c linux(4); Retire unnecessary __packed attribute from some struct's
definition.

Differential Revision:	https://reviews.freebsd.org/D30482
MFC after:		2 weeks
2021-05-31 21:56:34 +03:00
Konstantin Belousov
c56de177d2 x86: initialize initial FPU state earlier
Make it under SI_SUB_CPU sysinit, instead of much later SI_SUB_DRIVERS.
The SI_SUB_DRIVERS survived from times when FPU used real ISA attachment,
now it is only pnp stub claiming id.

PR:	255997
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30512
2021-05-28 21:38:32 +03:00
Edward Tomasz Napierala
c0f171736a Regen after 6d926e850d.
Sponsored By:	EPSRC
2021-05-28 09:04:17 +01:00
Edward Tomasz Napierala
6d926e850d linux: add new syscall numbers
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30193
2021-05-28 09:02:16 +01:00
Ceri Davies
c1a148873d sys/*/conf/*, docs: fix links to handbook
While here, fix all links to older en_US.ISO8859-1 documentation
in the src/ tree.

PR:             255026
Reported by:    Michael Büker <freebsd@michael-bueker.de>
Reviewed by:    dbaio
Approved by:    blackend (mentor), re (gjb)
MFC after:      10 days
Differential Revision: https://reviews.freebsd.org/D30265
2021-05-20 09:27:10 +01:00
Roger Pau Monné
ac3ede5371 x86/xen: remove PVHv1 code
PVHv1 was officially removed from Xen in 4.9, so just axe the related
code from FreeBSD.

Note FreeBSD supports PVHv2, which is the replacement for PVHv1.

Sponsored by: Citrix Systems R&D
Reviewed by: kib, Elliott Mitchell
Differential Revision: https://reviews.freebsd.org/D30228
2021-05-17 11:41:21 +02:00
Ed Maste
2c9764f36b regen syscall files after d51198d63b63 2021-05-13 14:09:58 -04:00
Andrew Turner
5d2d599d3f Create VM_MEMATTR_DEVICE on all architectures
This is intended to be used with memory mapped IO, e.g. from
bus_space_map with no flags, or pmap_mapdev.

Use this new memory type in the map request configured by
resource_init_map_request, and in pciconf.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29692
2021-04-12 06:15:31 +00:00
Konstantin Belousov
290b0d123a x86: use x86_clear_dbregs() on fork
instead of manual zeroing of the debug registers file in pcb.
This centralizes the cleaning code, but the practical difference is
that PCB_DBREGS flag is cleared, saving some operations on context
switching.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:02 +03:00
Konstantin Belousov
a8b75a57c9 x86: add x86_clear_dbregs() helper
Move the code from exec_setregs() to reset debug registers state on exec,
to the x86_clear_dbregs() helper

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:01 +03:00
Konstantin Belousov
aa3ea612be x86: remove gcov kernel support
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29529
2021-04-02 15:41:51 +03:00
Konstantin Belousov
8223717ce6 x86: clear %db registers in new process
Reported by:	 Michał Górny <mgorny@gentoo.org>
PR:	254661
Reviewed by:	emaste, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29496
2021-03-31 02:07:35 +03:00
Mitchell Horne
7446b0888d gdb: report specific stop reason for watchpoints
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.

[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html

Reviewed by:	jhb
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	51
Differential Revision:	https://reviews.freebsd.org/D29174
2021-03-30 11:36:41 -03:00
Mitchell Horne
9d81dd5404 ddb: replace watchpoint set/clear functions
Use the new kdb variants. Print more specific error messages.

Reviewed by:	jhb, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29156
2021-03-29 12:05:44 -03:00
Mitchell Horne
15dc1d4452 x86: implement kdb watchpoint functions
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.

Reviewed by:	jhb, kib, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29155
2021-03-29 12:05:43 -03:00
Jason A. Harmening
d22883d715 Remove PCPU_INC
e4b8deb222 removed the last in-tree uses of PCPU_INC().  Its
potential benefit is also practically nonexistent.  Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction.  The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.

[0]: https://www.agner.org/optimize/instruction_tables.pdf

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29308
2021-03-20 19:23:59 -07:00
Mitchell Horne
c02c04f113 x86: consolidate hw watchpoint logic into new file
This is a prerequisite to using these functions outside of ddb, but also
provides some cleanup and minor refactoring. This code is almost
entirely duplicated between the two implementations, the only
significant difference being the lack of dbreg synchronization on i386.

Cleanups are:
 - demote some internal functions to static
 - use the constant NDBREGS instead of a '4' literal
 - remove K&R definitions
 - some added comments

Reviewed by:	kib, jhb
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29153
2021-03-19 16:51:52 -03:00
John Baldwin
3b57ddb029 Rename linux_set_upcall_kse() to linux_set_upcall().
This matches the rename of cpu_set_upcall_kse() in
5c2cf81845.

Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29295
2021-03-18 12:14:34 -07:00
John Baldwin
a7883464fc x86: Reduce code duplication in cpu_fork() and cpu_copy_thread().
Add copy_thread() to hold shared code.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29228
2021-03-18 12:13:17 -07:00
Gordon Bergling
564a3ac63a i386: Fix a few typos
- wheter -> whether
- while here, fix some whitespace issues

MFC after:	1 week
2021-03-13 16:10:01 +01:00
John Baldwin
40d593d17e x86: Update some stale comments in cpu_fork() and cpu_copy_thread().
Neither of these routines allocate stacks.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29227
2021-03-12 09:48:49 -08:00
John Baldwin
c7b0213523 x86: Always use clean FPU and segment base state for new kthreads.
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29208
2021-03-12 09:48:36 -08:00
John Baldwin
755efb8d8f x86: Copy the FPU/XSAVE state from the creating thread to new threads.
POSIX states that new threads created via pthread_create() should
inherit the "floating point environment" from the creating thread.

Discussed with:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29204
2021-03-12 09:47:41 -08:00
Mark Johnston
732b69c9f9 acpi: Make nexus_acpi quiet on amd64 and i386
Otherwise during attach newbus prints "nexus0", which is not very
useful.

The generic nexus device is already quiet, as is nexus_acpi on arm64.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-03-05 12:54:00 -05:00
Allan Jude
d0673fe160 smbios: Move smbios driver out from x86 machdep code
Add it to the x86 GENERIC and MINIMAL kernels

Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.
Reviewed by:	rpokala
Differential Revision:	https://reviews.freebsd.org/D28738
2021-02-23 21:17:09 +00:00
John Baldwin
67932460c7 Add a VA_IS_CLEANMAP() macro.
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.

In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions.  Abstracting this check reduces the diff relative
to FreeBSD.  It is perhaps slightly more readable as well.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D28710
2021-02-17 16:32:11 -08:00
Mark Johnston
aa5fef60bf linux: Update the i386/linux vdso deinitialization routine
This was missed in commit 0fc8a79672 ("linux: Unmap the VDSO page when
unloading").

Reported by:	Mark Millard
MFC with:	0fc8a79672
2021-02-16 17:07:56 -05:00
Roger Pau Monné
a2495c3667 xen/boot: allow specifying boot method when booted from Xen
Allow setting the bootmethod variable from the Xen PVH entry point, in
order to be able to correctly set the underlying firmware mode when
booted as a dom0.

Move the bootmethod variable to be defined in x86/cpu_machdep.c
instead so it can be shared by both i386 and amd64.

Sponsored by:		Citrix Systems R&D
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D28619
2021-02-16 15:26:11 +01:00
Cy Schubert
0e01ea872e Fix a typo.
MFC after:	3 days
2021-01-27 21:52:41 -08:00
Mateusz Guzik
b42a2ea558 Remove ndis(4) remnants from kernel configs
Unbreaks LINT kernels.
2021-01-26 00:04:13 +00:00
Jessica Clarke
5faeda9037 Rename i386's Linux ELF to Linux ELF32
This is what amd64 calls the i386 Linux ABI in order to distinguish it
from the amd64 Linux ABI, and matches the nomenclature used for the
FreeBSD ABIs where they always have the size suffix in the name.

Reviewed by:	trasz
Differential Revision:	https://reviews.freebsd.org/D27647
2021-01-21 01:54:12 +00:00
Marius Strobl
944041f936 wl(4): remove obsolete header
It's unused since 09b9789b28 and r304506
respectively and should have gone along with these.
2021-01-17 00:03:17 +01:00
Vladimir Kondratyev
b62f6dfaed hid: Replace USBHID_ENABLED kernel config option with loader tunable
usbhid(4) is disabled by default to avoid conflicts with existing USB HID
drivers. To enable it place following lines to /boot/loader.conf:

hw.usb.usbhid.enable=1
usbhid_load="YES"

Suggested by:	jhb
Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D28124
2021-01-14 23:04:47 +03:00
Andrew Turner
6eebda3bba Split out the NODEBUG options to a common file
This is the superset of the nooptions found in the -DEBUG kernels.

Reviewed by:	emaste, manu
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28152
2021-01-14 16:57:53 +00:00
John Baldwin
074a91f746 Enable accelerated AES-XTS software crypto in GENERIC.
In particular, using GELI on a root filesystem will only use
accelerated software crypto drivers if they are available before the
root filesystem is mounted.  While these modules can be loaded from
the loader, including them in GENERIC provides a better out-of-the-box
experience for users.

Both aesni(4) and armv8crypto(4) provide accelerated implementations
of the default cipher used by GELI (AES-XTS) in addition to other
ciphers.

Reviewed by:	mhorne, allanjude, markj
Differential Revision:	https://reviews.freebsd.org/D28100
2021-01-13 13:13:01 -08:00
Konstantin Belousov
0659df6fad vm_map_protect: allow to set prot and max_prot in one go.
This prevents a situation where other thread modifies map entries
permissions between setting max_prot, then relocking, then setting prot,
confusing the operation outcome.  E.g. you can get an error that is not
possible if operation is performed atomic.

Also enable setting rwx for max_prot even if map does not allow to set
effective rwx protection.

Reviewed by:	brooks, markj (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28117
2021-01-13 01:35:22 +02:00
Roger Pau Monne
ed78016d00 xen/privcmd: implement the dm op ioctl
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.

This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:33:27 +01:00
Vladimir Kondratyev
0f0379fa55 hid: Add recently imported drivers to NOTES
Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D28060
2021-01-10 22:17:20 +03:00
Konstantin Belousov
45974de8fb x86: Add rdtscp32() into cpufunc.h.
Suggested by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27986
2021-01-10 04:42:34 +02:00
Toomas Soome
7c6a71d16c i386 kernel is not built with vt_vbefb
Add vt_vbefb to GENERIC and NOTES
2021-01-08 20:58:23 +02:00
Warner Losh
a21def4d56 pccard: Remove wi(4) driver
Remove wi(4). pccard is going away, and wi only supports PC Card
devices, though it has a minor amount of glue to also support
PCI cards. However, removing the one without removing the other
is hard, so the whole driver is being removed.

Relnotes: Yes
2021-01-07 20:41:06 -07:00
Vladimir Kondratyev
01f2e864f7 hid: Import usbhid - USB transport backend for HID subsystem.
This change implements hid_if.m methods for HID-over-USB protocol [1].

Also, this change adds USBHID_ENABLED kernel option which changes
device_probe() priority and adds/removes PnP records to prefer usbhid
over ums, ukbd, wmt and other USB HID device drivers and vice-versa.

The module is based on uhid(4) driver.  It is disabled by default for
now due to conflicts with existing USB HID drivers.

[1] https://www.usb.org/sites/default/files/hid1_11.pdf

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D27893
2021-01-08 02:18:43 +03:00
Vladimir Kondratyev
b1f1b07f6d hid: Import iichid - I2C transport backend for HID subsystem
This implements hid_if.m methods for HID-over-I2C protocol [1].

Following kernel options are added:

IICHID_SAMPLING - Enable support for a sampling mode as interrupt
                  resource acquisition is not always possible in a case
                  of GPIO interrupts.
IICHID_DEBUG    - Enable debug output.

The module is based on prior Marc Priggemeyer work (D16698).

[1] http://download.microsoft.com/download/7/d/d/7dd44bb7-2a7a-4505-ac1c-7227d3d96d5b/hid-over-i2c-protocol-spec-v1-0.docx

Differential revision:	https://reviews.freebsd.org/D27892
2021-01-08 02:18:43 +03:00
Vladimir Kondratyev
1975878673 hid: Import functions and constants required by new subsystem
This does an import of quirk stubs, debugging macros from USB code and
numerous usage constants used by dependent drivers.

Besides, this change renames some functions to get a better matching
with userland library and NetBSD/OpenBSD HID code. Namely:

- Old hid_report_size() renamed to hid_report_size_max()
- New hid_report_size() calculates size of given report rather than
  maximum size of all reports.
- hid_get_data_unsigned() renamed to hid_get_udata()
- hid_put_data_unsigned() renamed to hid_put_udata()

Compat shim functions are provided in usbhid.h to make possible compile
of legacy code unmodified after this change.

Reviewed by:	manu, hselasky
Differential revision:	https://reviews.freebsd.org/D27887
2021-01-08 02:18:42 +03:00
Vladimir Kondratyev
67de2db262 Factor-out hardware-independent part of USB HID support to new module
It will be used by the upcoming HID-over-i2C implementation.  Should be
no-op, except hid.ko module dependency is to be added to affected drivers.

Reviewed by:	hselasky, manu
Differential revision:	https://reviews.freebsd.org/D27867
2021-01-08 02:18:42 +03:00
Alan Cox
7beeacb27b Honor the vm page's PG_NODUMP flag on arm and i386.
Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27949
2021-01-04 16:15:42 -06:00
Mitchell Horne
962c06c5a3 gdb(4) fix x86 signal reporting
The existing values correspond to x86 exception vector numbers, but the
trap numbers used in the kernel do not match these 1-to-1. Prefer the
definitions from x86/trap.h, as they are what actually get passed to
kdb_trap(). This is of little consequence, as gdb_cpu_signal() only
reports the trap reason (signal number) to the gdb client.

This is limited to the subset of trap values for which kdb_trap() is
reachable.

Reviewed by:	kib
Discussed with:	jhb
MFC after:	1 week
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27645
2020-12-23 15:40:14 -04:00
John Baldwin
1dce7d9e7e Skip the vm.pmap.kernel_maps sysctl by default.
This sysctl node can generate very verbose output, so don't trigger it
for sysctl -a or sysctl vm.pmap.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D27504
2020-12-18 20:41:23 +00:00
Alexander V. Chernikov
d5fe384b4d Enable ROUTE_MPATH support in GENERIC kernels.
Ability to load-balance traffic over multiple path is a must-have thing for routers.
It may be used by the servers to balance outgoing traffic over multiple default gateways.

The previous implementation, RADIX_MPATH stayed in the shadow for too long.
It was not well maintained, which lead us to a vicious circle - people were using
 non-contiguous mask or firewalls to achieve similar goals. As a result, some routing
 daemons implementation still don't have multipath support enabled for FreeBSD.

Turning on ROUTE_MPATH by default would fix it. It will allow to reduce networking
 feature gap to other operating systems. Linux and OpenBSD enabled similar support
 at least 5 years ago.

ROUTE_MPATH does not consume memory unless actually used. It enables around ~1k LOC.

It does not bring any behaviour changes for userland.
Additionally, feature is (temporarily) turned off by the net.route.multipath sysctl
 defaulting to 0.

Differential Revision:	https://reviews.freebsd.org/D27428
2020-12-14 22:23:08 +00:00
Brooks Davis
9ee99cec1f hme(4): Remove as previous announced
The hme (Happy Meal Ethernet) driver was the onboard NIC in most
supported sparc64 platforms. A few PCI NICs do exist, but we have seen
no evidence of use on non-sparc systems.

Reviewed by:	imp, emaste, bcr
Sponsored by:	DARPA
2020-12-11 21:40:38 +00:00
Conrad Meyer
78599c32ef Add CFI start/end proc directives to arm64, i386, and ppc
Follow-up to r353959 and r368070: do the same for other architectures.

arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing.  Likewise, MIPS
uses .frame directives.

Reviewed by:	arichardson
Differential Revision:	https://reviews.freebsd.org/D27387
2020-12-05 00:33:28 +00:00
Toomas Soome
a4a10b37d4 Add VT driver for VBE framebuffer device
Implement vt_vbefb to support Vesa Bios Extensions (VBE) framebuffer with VT.
vt_vbefb is built based on vt_efifb and is assuming similar data for
initialization, use MODINFOMD_VBE_FB to identify the structure vbe_fb
in kernel metadata.

struct vbe_fb, is populated by boot loader, and is passed to kernel via
metadata payload.

Differential Revision:	https://reviews.freebsd.org/D27373
2020-11-30 08:22:40 +00:00
Ruslan Bukin
f120ad6ba0 o Move options IOMMU from Debugging section back to the Bus section
where it originally was. The bug introduced in r366267.
o Remove options IOMMU from i386/MINIMAL as we don't have it in
  i386/GENERIC.

Reported by:	Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by:	kib
Sponsored by:	Innovate DSbD
Differential Revision:	https://reviews.freebsd.org/D27399
2020-11-27 21:37:48 +00:00
Jung-uk Kim
926ce35a7e Port rtsx(4) driver for Realtek SD card reader from OpenBSD.
This driver provides support for Realtek PCI SD card readers.  It attaches
mmc(4) bus on card insertion and detaches it on card removal.  It has been
tested with RTS5209, RTS5227, RTS5229, RTS522A, RTS525A and RTL8411B.  It
should also work with RTS5249, RTL8402 and RTL8411.

PR:			204521
Submitted by:		Henri Hennebert (hlh at restart dot be)
Reviewed by:		imp, jkim
Differential Revision:	https://reviews.freebsd.org/D26435
2020-11-24 21:28:44 +00:00
Konstantin Belousov
4815f175d0 Linuxolator: Replace use of eventhandlers by sysent hooks.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27309
2020-11-23 18:18:16 +00:00
Conrad Meyer
77eb984147 'make sysent' for r367773
X-MFC-With:	r367773
2020-11-17 19:53:59 +00:00
Conrad Meyer
de774e422e linux(4): Implement name_to_handle_at(), open_by_handle_at()
They are similar to our getfhat(2) and fhopen(2) syscalls.

Differential Revision:	https://reviews.freebsd.org/D27111
2020-11-17 19:51:47 +00:00
Conrad Meyer
e9b13c6612 linux(4): Deduplicate unimpl/dummy syscall handlers
No functional change.

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27099
2020-11-05 19:30:31 +00:00
Mateusz Guzik
82c174a3b4 malloc: delegate M_EXEC handling to dedicacted routines
It is almost never needed and adds an avoidable branch.

While here do minior clean ups in preparation for larger changes.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27019
2020-10-30 20:02:32 +00:00
Edward Tomasz Napierala
866b1f5147 Fix misnomer - linux_to_bsd_errno() does the exact opposite.
Reported by:	arichardson
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26965
2020-10-27 12:49:40 +00:00
Kyle Evans
d42a83b1a9 audit: also correctly audit linux_execve()
Linux execve() gets audited as AUE_EXECVE as well, we should also interpret
the return from this correctly for the same reasoning as in r367002.

MFC with:	r367002
2020-10-26 17:30:17 +00:00
Warner Losh
0b0447734a Remove support for intel compiler from i386 in_cksum
We no longer support building the kernel with the old intel
compiler. Remove support for it from in_cksum. Should there be
interest in reviving it, this is as likely to get in the way as to
help anyway.
2020-10-24 23:21:27 +00:00
Konstantin Belousov
c0b5fcf692 Improve FPU Tag Word reconstruction on i386 to indicate register states.
Improve the code reconstructing en_tw in struct fpreg32 from FXSAVE
results so that all register states are indicated correctly.  The
previous code unconditionally mapped non-empty register state to
'normalized value' constant.  The new code explicitly distinguishes
the 'zero value' and 'special value' constants as well.  This improves
consistency between real FSAVE and translation from FXSAVE, and
ensures that tests using PT_GETFPREGS can rely on a single correct
value independently of the underlying implementation.

PR:	250454
Sponsored by:	The FreeBSD Foundation
Obtained from:	Moritz Systems
Submitted by:	Michał Górny <mgorny@moritz.systems>
Discussed with:	emaste
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26856
2020-10-21 00:15:12 +00:00
John Baldwin
ba610be90a Add a kernel crypto driver using assembly routines from OpenSSL.
Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386.  It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.

Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.

Reviewed by:	gallatin, jkim, delphij, gnn
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26821
2020-10-20 17:50:18 +00:00
John Baldwin
eeb4c816d6 Properly clear PCB_KERNNPX in fpu_kern_leave().
PR:		250423
Reported by:	CI
Tested by:	lwhsu
2020-10-19 17:35:45 +00:00
Konstantin Belousov
e406235000 Fix for mis-interpretation of PCB_KERNFPU.
RIght now PCB_KERNFPU is used both as indication that kernel prepared
hardware FPU context to use and that the thread is fpu-kern
thread.  This also breaks fpu_kern_enter(FPU_KERN_NOCTX), since
fpu_kern_leave() then clears PCB_KERNFPU.

Introduce new flag PCB_KERNFPU_THR which indicates that the thread is
fpu-kern.  Do not clear PCB_KERNFPU if fpu-kern thread leaves noctx
fpu region.

Reported and tested by:	jhb (amd64)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25511
2020-10-14 23:01:41 +00:00
Konstantin Belousov
d3ba71b2b1 Limit workaround for errata E400 to appropriate AMD cpus.
From Linux sources and several datasheets I looked at, it seems that
the workaround is only needed on families 0xf and 0x10.  For instance,
Ryzens do not implement the accessed MSR at all, it is documented as
reserved.  Also, hypervisors should not allow guest to put CPU into
idle state, so activate workaround only when on bare hardware.

While there, style the code:
    move MSR defines to specialreg.h
    move identification to initcpu.c

Reported by:	whu
Reviewed by:	avg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26470
2020-10-14 22:57:50 +00:00
Konstantin Belousov
6f3b523c9a Avoid dump_avail[] redefinition.
Move dump_avail[] extern declaration and inlines into a new header
vm/vm_dumpset.h.  This fixes default gcc build for mips.

Reviewed by:	alc, scottph
Tested by:	kevans (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26741
2020-10-14 22:51:40 +00:00
Ruslan Bukin
6e9127d838 Add a per-each macro IOMMU_DOMAIN_UNLOAD_SLEEP which allows to sleep
during iommu guest address space entries unload.

Suggested by:	kib
Sponsored by:	Innovate DSbD
Differential Revision:	https://reviews.freebsd.org/D26722
2020-10-14 14:51:11 +00:00
John Baldwin
1775215f88 Add support for FPU_KERN_NOCTX.
This mirrors the implementation on amd64.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26754
2020-10-13 17:27:37 +00:00
John Baldwin
4ef6ea38fc Add a <machine/fpu.h> for i386 that includes <machine/npx.h>.
arm64 has a similar wrapper.  This permits defining <machine/fpu.h> as
the standard header for fpu_kern_*.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26753
2020-10-13 17:26:12 +00:00
Conrad Meyer
f8e8a06d23 random(4) FenestrasX: Push root seed version to arc4random(3)
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled.  Otherwise, there is no
functional change.  The mechanism can be disabled with
debug.fxrng_vdso_enable=0.

arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time.  Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed.  On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.

This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator.  That change is
left for future discussion/work.

Reviewed by:	kib (previous version)
Approved by:	csprng (me -- only touching FXRNG here)
Differential Revision:	https://reviews.freebsd.org/D22839
2020-10-10 21:52:00 +00:00
Warner Losh
7e46dafa58 Create in-tree LINT files
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.

Note: This may cause conflicts with updating in some cases.
	find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
2020-10-09 01:48:14 +00:00
Warner Losh
8e82f10172 timer_restore is now unused, remove it
apm was the only consumer of timer_restore. Now that it's gone, this
can be removed.
2020-10-08 20:56:11 +00:00
Warner Losh
8c576a279e Remove APM BIOS support
APM BIOS was relevant only to early laptops (approximately P166 or
P200 and slower). These have not been relevant for a long time, and
this code has been untested for a long time (as far as I can
tell). The APM compat code in ACPI and the apm(8) command is not being
retired. Both of these items are still in use (apm(8) is more
scriptable than the replacement acpiconf, for the most part). This has
been commented out of i386 GENERIC since 2002. This code is not
relevant to any other port.

Discussed on: arch@
2020-10-08 20:56:06 +00:00
Warner Losh
28942db891 Remove apm screen saver.
APM BIOS support is about to be removed. Remove the apm screen saver
and its module. They are about to be irrelevant.
2020-10-08 20:56:00 +00:00
Mitchell Horne
8481aab1ac Print symbol index for unsupported relocation types
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.

While here, fix some small style issues.

Reviewed by:	markj, kib (amd64 part, in D26701)
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2020-10-07 18:48:10 +00:00
Emmanuel Vadot
90b8c0ea10 Fix LINT: Add backlight to NOTES 2020-10-02 20:52:09 +00:00
Ruslan Bukin
6186bfbd18 Rename kernel option ACPI_DMAR to IOMMU.
This is mostly needed for a common arm64/amd64 iommu code.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D26587
2020-09-29 20:29:07 +00:00
Edward Tomasz Napierala
1e2521ffae Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26458
2020-09-27 18:47:06 +00:00
Edward Tomasz Napierala
0c5bd5f993 Regen after r366145.
Sponsored by:	DARPA
2020-09-25 10:05:38 +00:00
Mark Johnston
78257765f2 Add a vmparam.h constant indicating pmap support for large pages.
Enable SHM_LARGEPAGE support on arm64.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26467
2020-09-23 19:34:21 +00:00
Warner Losh
f9ba2bbe3a Use envvar rather than nonstandard hint. lines
The NOTES files have a bunch of hint lines that are removed when
generating LINT. However, we can achieve the same effect by prepending
each of the lines with 'envvar' so the NOTES files become standard
config(8) files. No functional changes as the sed script to generate
the LINT files filters these either way.

Suggested by: kevans
2020-09-23 19:18:53 +00:00
D Scott Phillips
00e6614750 Sparsify the vm_page_dump bitmap
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).

Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.

Reviewed by:	jhb
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26131
2020-09-21 22:21:59 +00:00
D Scott Phillips
ab041f713a Move vm_page_dump bitset array definition to MI code
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.

Reviewed by:	kib, markj
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26129
2020-09-21 22:20:37 +00:00
Edward Tomasz Napierala
70890254b3 Get rid of sv_errtbl and SV_ABI_ERRNO().
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26388
2020-09-17 11:39:33 +00:00
Edward Tomasz Napierala
c26391f4dd Move SV_ABI_ERRNO translation into linux-specific code, to simplify
the syscall path and declutter it a bit.  No functional changes intended.

Reviewed by:	kib (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26378
2020-09-15 16:41:21 +00:00
Mark Johnston
847ab36bf2 Include the psind in data returned by mincore(2).
Currently we use a single bit to indicate whether the virtual page is
part of a superpage.  To support a forthcoming implementation of
non-transparent 1GB superpages, it is useful to provide more detailed
information about large page sizes.

The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind)
values, indicating a mapping of size psind, where psind is an index into
the pagesizes array returned by getpagesizes(3), which in turn comes
from the hw.pagesizes sysctl.  MINCORE_PSIND(1) is equal to the old
value of MINCORE_SUPER.

For now, two bits are used to record the page size, permitting values
of MAXPAGESIZES up to 4.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26238
2020-09-02 18:16:43 +00:00
Mark Johnston
2d838cd867 Add the MEM_EXTRACT_PADDR ioctl to /dev/mem.
This allows privileged userspace processes to find information about the
physical page backing a given mapping.  It is useful in applications
such as DPDK which perform some of their own memory management.

Reviewed by:	kib, jhb (previous version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D26237
2020-09-02 18:12:47 +00:00
Mateusz Guzik
ed83a56181 i386: clean up empty lines in .c and .h files 2020-09-01 21:19:39 +00:00
Warner Losh
8df7e154a2 Add deprecation notice for apm BIOS
Add deprecation notice for apm bios, aka the apm(4) device. The apm(8)
command will remain, at least for a while, since ACPI emulates the apm
ioctl interface.

Discussed on: arch@
Relnotes: yes
MFC After: 3 days
2020-08-31 21:04:00 +00:00
Peter Grehan
a333a508a2 cpu_auxmsr: assert caller is preventing CPU migration.
Submitted by:	Adam Fenn (adam at fenn dot io)
Requested by:	kib
Reviewed by:	kib, grehan
Approved by:	kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D26166
2020-08-24 11:49:49 +00:00
Mark Johnston
72c7f24c8d Use pmap_mapbios() to map ACPI tables on amd64 and i386.
The ACPI table-mapping code used pmap_kenter_temporary() to create
mappings, which in turn uses the fixed-size crashdump map.  Moreover,
the code was not verifying that the table fits in this map, so when
mapping large tables we could clobber adjacent mappings.  This use of
pmap_kenter_temporary() appears to predate support in pmap_mapbios() for
creating early mappings, but that restriction no longer applies.

PR:		248746
Reviewed by:	kib, mav
Tested by:	gallatin, Curtis Villamizar <curtis@ipv6.occnc.com>
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26125
2020-08-20 00:52:53 +00:00
Alexander Motin
8f6355b51d Remove some noisy ACPI tables messages from verbose dmesg.
Those messages were printed hundreds of times during boot, often multiple
times for each table.  We already print information about the tables in
more organized form once to not duplicate it when random ACPI drivers are
attaching.

MFC after:	1 week
2020-08-19 16:09:36 +00:00
Mateusz Guzik
a125ed50a6 linux: add sysctl compat.linux.use_emul_path
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
2020-08-18 22:04:22 +00:00
Mateusz Guzik
d5e3895ea4 linux: consistently use LFREEPATH instead of open-coding it 2020-08-18 22:03:55 +00:00
Peter Grehan
3a3f1e9dfa Export a routine to provide the TSC_AUX MSR value and use this in vmm.
Also, drop an unnecessary set of braces.

Requested by:	kib
Reviewed by:	kib
MFC after:	3 weeks
2020-08-18 11:36:38 +00:00
Ruslan Bukin
c4cd699010 o Add machine/iommu.h and include MD iommu headers from it,
so we don't ifdef for every arch in busdma_iommu.c;
o No need to include specialreg.h for x86, remove it.

Requested by:	andrew
Reviewed by:	kib
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D25957
2020-08-05 19:11:31 +00:00
Alexander Motin
aba10e131f Allow swi_sched() to be called from NMI context.
For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs.  On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY.  To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25754
2020-07-25 15:19:38 +00:00
Alex Richardson
b798ef6490 Include TMPFS in all the GENERIC kernel configs
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
2020-07-24 08:40:04 +00:00
Alexander Motin
ce53f590ca Untie nmi_handle_intr() from DEV_ISA.
The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is
already wrapped.  Entering debugger on NMI does not really depend on ISA.

MFC after:	2 weeks
2020-07-22 20:15:21 +00:00
Alexander Motin
f2b6da18de Avoid code duplicaiton by using ipi_selected().
MFC after:	2 weeks
2020-07-21 17:18:38 +00:00
Edward Tomasz Napierala
3e9a214260 Regen after r363304.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-07-18 11:31:31 +00:00
Edward Tomasz Napierala
8d1d017175 Add a trivial linux(4) splice(2) implementation, which simply
returns EINVAL.  Fixes grep (grep-3.1-2build1).

PR:		kern/218699
Reported by:	avos
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25636
2020-07-18 11:28:40 +00:00
Conrad Meyer
4ae224c663 Revert r240317 to prevent leaking pmap entries
Subsequent to r240317, kmem_free() was replaced with kva_free() (r254025).
kva_free() releases the KVA allocation for the mapped region, but no longer
clears the pmap (pagetable) entries.

An affected pmap_unmapdev operation would leave the still-pmap'd VA space
free for allocation by other KVA consumers.  However, this bug easily
avoided notice for ~7 years because most devices (1) never call
pmap_unmapdev and (2) on amd64, mostly fit within the DMAP and do not need
KVA allocations.  Other affected arch are less popular: i386, MIPS, and
PowerPC.  Arm64, arm32, and riscv are not affected.

Reported by:	Don Morris <dgmorris AT earthlink.net>
Submitted by:	Don Morris (amd64 part)
Reviewed by:	kib, markj, Don (!amd64 parts)
MFC after:	I don't intend to, but you might want to
Sponsored by:	Dell Isilon
Differential Revision:	https://reviews.freebsd.org/D25689
2020-07-16 23:29:26 +00:00
Mark Johnston
e64080e79c Switch from SCTP to SCTP_SUPPORT in GENERIC configs.
This removes SCTP from in-tree kernel configuration files.  Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.

Reviewed by:	tuexen
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25611
2020-07-16 15:09:04 +00:00
Konstantin Belousov
c123ab20ec Improve description of the vector argument for i386 smp_targeted_tlb_shootdown().
Submitted by:	alc
MFC after:	20 days
2020-07-15 16:12:00 +00:00
Konstantin Belousov
dc43978aa5 amd64: allow parallel shootdown IPIs
Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu.  Now each CPU can initiate
shootdown IPI independently from other CPUs.  Initiator enters
critical section, then fills its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu,
my_cpuid) for each target cpu.  After that IPI is sent to all targets
which scan for zeroed scoreboard generation words.  Upon finding such
word the shootdown data is read from corresponding cpu' pcpu, and
generation is set.  Meantime initiator loops waiting for all zeroed
generations in scoreboard to update.

Initiator does not disable interrupts, which should allow
non-invalidation IPIs from deadlocking, it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.

The generation is set before the actual invalidation is performed in
handler. It is safe because target CPU cannot return to userspace
before handler finishes. In principle only NMI can preempt the
handler, but NMI would see the kernel handler frame and not touch
not-invalidated user page table.

Handlers loop until they do not see zeroed scoreboard generations.
This, together with hardware keeping one pending IPI in LAPIC IRR
should prevent lost shootdowns.

Notes.
1. The code does protect writes to LAPIC ICR with exclusion. I believe
   this is fine because we in fact do not send IPIs from interrupt
   handlers. More for !x2APIC mode where ICR access for write requires
   two registers write, we disable interrupts around it. If considered
   incorrect, I can add per-cpu spinlock around ipi_send().
2. Scoreboard lines owned by given target CPU can be padded to the
   cache line, to reduce ping-pong.

Reviewed by:	markj (previous version)
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D25510
2020-07-14 20:37:50 +00:00
Conrad Meyer
64612d4e44 geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.

Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234."  However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.

Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0.  This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered.  That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.

(This change does not add support for the 'bootcode' verb to EBR.)

PR:		232463
Reported by:	Manish Jain <bourne.identity AT hotmail.com>
Discussed with:	ae (no objection)
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D24939
2020-07-01 02:16:36 +00:00
Kyle Evans
5403f186a7 linuxolator: implement memfd_create syscall
This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
  (?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR:		240874
Reviewed by:	kib, trasz
Differential Revision:	https://reviews.freebsd.org/D21845
2020-06-29 03:09:14 +00:00
Edward Tomasz Napierala
a39cdcd7e7 Regen.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-27 14:43:29 +00:00
Edward Tomasz Napierala
308e194cbf Add proper types for linux message queue syscalls; mostly taken
from 32-bit Linuxulator.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25386
2020-06-27 14:42:08 +00:00
Edward Tomasz Napierala
36507f85dc Add syscall definitions for linux xattr syscalls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25387
2020-06-27 14:39:44 +00:00
Edward Tomasz Napierala
5ac2674278 Adapt linuxulator syscalls.master files to the new layout.
No functional changes.

Reviewed by:	brooks
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25381
2020-06-21 10:09:34 +00:00
Brandon Bergren
40b664f64b [PowerPC] More relocation fixes
It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.

This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.

Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.

This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)

 * Remove the rest of the stoffs hack.
 * Remove my half-baked displace_symbol_table() function.
 * Extend ddb initialization to cope with having a relocation offset on the
   kernel symbol table.
 * Fix my kernel-as-initrd hack to work with booke64 by using a temporary
   mapping to access the data.
 * Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
 * Change the behavior or X_db_symbol_values to apply the relocation base
   when updating valp, to match link_elf_symbol_values() behavior.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25223
2020-06-21 03:39:26 +00:00
Edward Tomasz Napierala
bafd96b8dd Regen after r362440.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-20 18:31:02 +00:00
Edward Tomasz Napierala
52c81be11a Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly.  While some of the flag values match,
most don't.

PR:		kern/230160
Reported by:	markj
Reviewed by:	markj
Discussed with:	brooks, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25272
2020-06-20 18:29:22 +00:00
Eric van Gyzen
6fba90f201 FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:

    Use of a destination operand not aligned to 64-byte boundary
    (in either 64-bit or 32-bit modes) results in a general-protection
    (#GP) exception.

This alignment happens naturally when all malloc buckets are powers
of two.  However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.

Reviewed by:	cem; kib; earlier version by jhb
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:17:56 +00:00
Eric van Gyzen
701acc2fd8 FPU: make xsave_area_desc static
...because it can be.

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:12:26 +00:00
Eric van Gyzen
674cbe7908 FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention.  IPIs tend not to work very well
when interrupts are disabled.  Who knew, right?

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:10:45 +00:00
Konstantin Belousov
3b23ffe271 amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.
Right now code first flushes all local TLB entries that needs to be
flushed, then signals IPI to remote cores, and then waits for
acknowledgements while spinning idle.  In the VMWare article 'Don’t
shoot down TLB shootdowns!' it was noted that the time spent spinning
is lost, and can be more usefully used doing local TLB invalidation.

We could use the same invalidation handler for local TLB as for
remote, but typically for pmap == curpmap we can use INVLPG for locals
instead of INVPCID on remotes, since we cannot control context
switches on them.  Due to that, keep the local code and provide the
callbacks to be called from smp_targeted_tlb_shootdown() after IPIs
are fired but before spin wait starts.

Reviewed by:	alc, cem, markj, Anton Rang <rang at acm.org>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D25188
2020-06-10 22:07:57 +00:00
Mark Johnston
81302f1d77 Fix boot on systems where NUMA domain 0 is unpopulated.
- Add vm_phys_early_add_seg(), complementing vm_phys_early_alloc(), to
  ensure that segments registered during hammer_time() are placed in the
  right domain.  Otherwise, since the SRAT is not parsed at that point,
  we just add them to domain 0, which may be incorrect and results in a
  domain with only several MB worth of memory.
- Fix uma_startup1() to try allocating memory for zones from any domain.
  If domain 0 is unpopulated, the allocation will simply fail, resulting
  in a page fault slightly later during boot.
- Change _vm_phys_domain() to return -1 for addresses not covered by the
  affinity table, and change vm_phys_early_alloc() to handle wildcard
  domains.  This is necessary on amd64, where the page array is dense
  and pmap_page_array_startup() may allocate page table pages for
  non-existent page frames.

Reported and tested by:	Rafael Kitover <rkitover@gmail.com>
Reviewed by:	cem (earlier version), kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25001
2020-05-28 19:41:00 +00:00
Conrad Meyer
852c303b61 copystr(9): Move to deprecate (attempt #2)
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro.  Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.

Prior commit message:

Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:		jhb (earlier version)
Discussed with:		brooks (thanks!)
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-25 16:40:48 +00:00
Mark Johnston
e115748932 Fix the build after r361033 when ACPI is disabled.
Reported by:	Herbert J. Skuhra <herbert@gojira.at>
2020-05-22 01:18:55 +00:00
Konstantin Belousov
ea6020830c amd64: Add a knob to flush RSB on context switches if machine has SMEP.
The flush is needed to prevent cross-process ret2spec, which is not handled
on kernel entry if IBPB is enabled but SMEP is present.
While there, add i386 RSB flush.

Reported by:	Anthony Steinhauser <asteinhauser@google.com>
Reviewed by:	markj, Anthony Steinhauser
Discussed with:	philip
admbugs:	961
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-05-20 22:00:31 +00:00
Mark Johnston
b19149bc56 Fix the i386 build after r361033.
Reported by:	Jenkins
2020-05-14 17:56:44 +00:00