Commit Graph

13702 Commits

Author SHA1 Message Date
Konstantin Belousov
a8b75a57c9 x86: add x86_clear_dbregs() helper
Move the code from exec_setregs() to reset debug registers state on exec,
to the x86_clear_dbregs() helper

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29687
2021-04-10 04:25:01 +03:00
Konstantin Belousov
aa3ea612be x86: remove gcov kernel support
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29529
2021-04-02 15:41:51 +03:00
Konstantin Belousov
8223717ce6 x86: clear %db registers in new process
Reported by:	 Michał Górny <mgorny@gentoo.org>
PR:	254661
Reviewed by:	emaste, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D29496
2021-03-31 02:07:35 +03:00
Mitchell Horne
7446b0888d gdb: report specific stop reason for watchpoints
The remote protocol allows for implementations to report more specific
reasons for the break in execution back to the client [1]. This is
entirely optional, so it is only implemented for amd64, arm64, and i386
at the moment.

[1] https://sourceware.org/gdb/current/onlinedocs/gdb/Stop-Reply-Packets.html

Reviewed by:	jhb
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
NetApp PR:	51
Differential Revision:	https://reviews.freebsd.org/D29174
2021-03-30 11:36:41 -03:00
Mitchell Horne
9d81dd5404 ddb: replace watchpoint set/clear functions
Use the new kdb variants. Print more specific error messages.

Reviewed by:	jhb, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29156
2021-03-29 12:05:44 -03:00
Mitchell Horne
15dc1d4452 x86: implement kdb watchpoint functions
Add wrappers around the dbreg interface that can be consumed by MI
kernel debugger code. The dbreg functions themselves are updated to
return error codes, not just -1. dbreg_set_watchpoint() is extended to
accept access bits as an argument.

Reviewed by:	jhb, kib, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29155
2021-03-29 12:05:43 -03:00
Jason A. Harmening
d22883d715 Remove PCPU_INC
e4b8deb222 removed the last in-tree uses of PCPU_INC().  Its
potential benefit is also practically nonexistent.  Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction.  The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.

[0]: https://www.agner.org/optimize/instruction_tables.pdf

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29308
2021-03-20 19:23:59 -07:00
Mitchell Horne
c02c04f113 x86: consolidate hw watchpoint logic into new file
This is a prerequisite to using these functions outside of ddb, but also
provides some cleanup and minor refactoring. This code is almost
entirely duplicated between the two implementations, the only
significant difference being the lack of dbreg synchronization on i386.

Cleanups are:
 - demote some internal functions to static
 - use the constant NDBREGS instead of a '4' literal
 - remove K&R definitions
 - some added comments

Reviewed by:	kib, jhb
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D29153
2021-03-19 16:51:52 -03:00
John Baldwin
3b57ddb029 Rename linux_set_upcall_kse() to linux_set_upcall().
This matches the rename of cpu_set_upcall_kse() in
5c2cf81845.

Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29295
2021-03-18 12:14:34 -07:00
John Baldwin
a7883464fc x86: Reduce code duplication in cpu_fork() and cpu_copy_thread().
Add copy_thread() to hold shared code.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29228
2021-03-18 12:13:17 -07:00
Gordon Bergling
564a3ac63a i386: Fix a few typos
- wheter -> whether
- while here, fix some whitespace issues

MFC after:	1 week
2021-03-13 16:10:01 +01:00
John Baldwin
40d593d17e x86: Update some stale comments in cpu_fork() and cpu_copy_thread().
Neither of these routines allocate stacks.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29227
2021-03-12 09:48:49 -08:00
John Baldwin
c7b0213523 x86: Always use clean FPU and segment base state for new kthreads.
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29208
2021-03-12 09:48:36 -08:00
John Baldwin
755efb8d8f x86: Copy the FPU/XSAVE state from the creating thread to new threads.
POSIX states that new threads created via pthread_create() should
inherit the "floating point environment" from the creating thread.

Discussed with:	kib
MFC after:	1 week
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D29204
2021-03-12 09:47:41 -08:00
Mark Johnston
732b69c9f9 acpi: Make nexus_acpi quiet on amd64 and i386
Otherwise during attach newbus prints "nexus0", which is not very
useful.

The generic nexus device is already quiet, as is nexus_acpi on arm64.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-03-05 12:54:00 -05:00
Allan Jude
d0673fe160 smbios: Move smbios driver out from x86 machdep code
Add it to the x86 GENERIC and MINIMAL kernels

Sponsored by:	Ampere Computing LLC
Submitted by:	Klara Inc.
Reviewed by:	rpokala
Differential Revision:	https://reviews.freebsd.org/D28738
2021-02-23 21:17:09 +00:00
John Baldwin
67932460c7 Add a VA_IS_CLEANMAP() macro.
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.

In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions.  Abstracting this check reduces the diff relative
to FreeBSD.  It is perhaps slightly more readable as well.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D28710
2021-02-17 16:32:11 -08:00
Mark Johnston
aa5fef60bf linux: Update the i386/linux vdso deinitialization routine
This was missed in commit 0fc8a79672 ("linux: Unmap the VDSO page when
unloading").

Reported by:	Mark Millard
MFC with:	0fc8a79672
2021-02-16 17:07:56 -05:00
Roger Pau Monné
a2495c3667 xen/boot: allow specifying boot method when booted from Xen
Allow setting the bootmethod variable from the Xen PVH entry point, in
order to be able to correctly set the underlying firmware mode when
booted as a dom0.

Move the bootmethod variable to be defined in x86/cpu_machdep.c
instead so it can be shared by both i386 and amd64.

Sponsored by:		Citrix Systems R&D
Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D28619
2021-02-16 15:26:11 +01:00
Cy Schubert
0e01ea872e Fix a typo.
MFC after:	3 days
2021-01-27 21:52:41 -08:00
Mateusz Guzik
b42a2ea558 Remove ndis(4) remnants from kernel configs
Unbreaks LINT kernels.
2021-01-26 00:04:13 +00:00
Jessica Clarke
5faeda9037 Rename i386's Linux ELF to Linux ELF32
This is what amd64 calls the i386 Linux ABI in order to distinguish it
from the amd64 Linux ABI, and matches the nomenclature used for the
FreeBSD ABIs where they always have the size suffix in the name.

Reviewed by:	trasz
Differential Revision:	https://reviews.freebsd.org/D27647
2021-01-21 01:54:12 +00:00
Marius Strobl
944041f936 wl(4): remove obsolete header
It's unused since 09b9789b28 and r304506
respectively and should have gone along with these.
2021-01-17 00:03:17 +01:00
Vladimir Kondratyev
b62f6dfaed hid: Replace USBHID_ENABLED kernel config option with loader tunable
usbhid(4) is disabled by default to avoid conflicts with existing USB HID
drivers. To enable it place following lines to /boot/loader.conf:

hw.usb.usbhid.enable=1
usbhid_load="YES"

Suggested by:	jhb
Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D28124
2021-01-14 23:04:47 +03:00
Andrew Turner
6eebda3bba Split out the NODEBUG options to a common file
This is the superset of the nooptions found in the -DEBUG kernels.

Reviewed by:	emaste, manu
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28152
2021-01-14 16:57:53 +00:00
John Baldwin
074a91f746 Enable accelerated AES-XTS software crypto in GENERIC.
In particular, using GELI on a root filesystem will only use
accelerated software crypto drivers if they are available before the
root filesystem is mounted.  While these modules can be loaded from
the loader, including them in GENERIC provides a better out-of-the-box
experience for users.

Both aesni(4) and armv8crypto(4) provide accelerated implementations
of the default cipher used by GELI (AES-XTS) in addition to other
ciphers.

Reviewed by:	mhorne, allanjude, markj
Differential Revision:	https://reviews.freebsd.org/D28100
2021-01-13 13:13:01 -08:00
Konstantin Belousov
0659df6fad vm_map_protect: allow to set prot and max_prot in one go.
This prevents a situation where other thread modifies map entries
permissions between setting max_prot, then relocking, then setting prot,
confusing the operation outcome.  E.g. you can get an error that is not
possible if operation is performed atomic.

Also enable setting rwx for max_prot even if map does not allow to set
effective rwx protection.

Reviewed by:	brooks, markj (previous version)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28117
2021-01-13 01:35:22 +02:00
Roger Pau Monne
ed78016d00 xen/privcmd: implement the dm op ioctl
Use an interface compatible with the Linux one so that the user-space
libraries already using the Linux interface can be used without much
modifications.

This allows user-space to make use of the dm_op family of hypercalls,
which are used by device models.

Sponsored by:	Citrix Systems R&D
2021-01-11 16:33:27 +01:00
Vladimir Kondratyev
0f0379fa55 hid: Add recently imported drivers to NOTES
Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D28060
2021-01-10 22:17:20 +03:00
Konstantin Belousov
45974de8fb x86: Add rdtscp32() into cpufunc.h.
Suggested by:	markj
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27986
2021-01-10 04:42:34 +02:00
Toomas Soome
7c6a71d16c i386 kernel is not built with vt_vbefb
Add vt_vbefb to GENERIC and NOTES
2021-01-08 20:58:23 +02:00
Warner Losh
a21def4d56 pccard: Remove wi(4) driver
Remove wi(4). pccard is going away, and wi only supports PC Card
devices, though it has a minor amount of glue to also support
PCI cards. However, removing the one without removing the other
is hard, so the whole driver is being removed.

Relnotes: Yes
2021-01-07 20:41:06 -07:00
Vladimir Kondratyev
01f2e864f7 hid: Import usbhid - USB transport backend for HID subsystem.
This change implements hid_if.m methods for HID-over-USB protocol [1].

Also, this change adds USBHID_ENABLED kernel option which changes
device_probe() priority and adds/removes PnP records to prefer usbhid
over ums, ukbd, wmt and other USB HID device drivers and vice-versa.

The module is based on uhid(4) driver.  It is disabled by default for
now due to conflicts with existing USB HID drivers.

[1] https://www.usb.org/sites/default/files/hid1_11.pdf

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D27893
2021-01-08 02:18:43 +03:00
Vladimir Kondratyev
b1f1b07f6d hid: Import iichid - I2C transport backend for HID subsystem
This implements hid_if.m methods for HID-over-I2C protocol [1].

Following kernel options are added:

IICHID_SAMPLING - Enable support for a sampling mode as interrupt
                  resource acquisition is not always possible in a case
                  of GPIO interrupts.
IICHID_DEBUG    - Enable debug output.

The module is based on prior Marc Priggemeyer work (D16698).

[1] http://download.microsoft.com/download/7/d/d/7dd44bb7-2a7a-4505-ac1c-7227d3d96d5b/hid-over-i2c-protocol-spec-v1-0.docx

Differential revision:	https://reviews.freebsd.org/D27892
2021-01-08 02:18:43 +03:00
Vladimir Kondratyev
1975878673 hid: Import functions and constants required by new subsystem
This does an import of quirk stubs, debugging macros from USB code and
numerous usage constants used by dependent drivers.

Besides, this change renames some functions to get a better matching
with userland library and NetBSD/OpenBSD HID code. Namely:

- Old hid_report_size() renamed to hid_report_size_max()
- New hid_report_size() calculates size of given report rather than
  maximum size of all reports.
- hid_get_data_unsigned() renamed to hid_get_udata()
- hid_put_data_unsigned() renamed to hid_put_udata()

Compat shim functions are provided in usbhid.h to make possible compile
of legacy code unmodified after this change.

Reviewed by:	manu, hselasky
Differential revision:	https://reviews.freebsd.org/D27887
2021-01-08 02:18:42 +03:00
Vladimir Kondratyev
67de2db262 Factor-out hardware-independent part of USB HID support to new module
It will be used by the upcoming HID-over-i2C implementation.  Should be
no-op, except hid.ko module dependency is to be added to affected drivers.

Reviewed by:	hselasky, manu
Differential revision:	https://reviews.freebsd.org/D27867
2021-01-08 02:18:42 +03:00
Alan Cox
7beeacb27b Honor the vm page's PG_NODUMP flag on arm and i386.
Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27949
2021-01-04 16:15:42 -06:00
Mitchell Horne
962c06c5a3 gdb(4) fix x86 signal reporting
The existing values correspond to x86 exception vector numbers, but the
trap numbers used in the kernel do not match these 1-to-1. Prefer the
definitions from x86/trap.h, as they are what actually get passed to
kdb_trap(). This is of little consequence, as gdb_cpu_signal() only
reports the trap reason (signal number) to the gdb client.

This is limited to the subset of trap values for which kdb_trap() is
reachable.

Reviewed by:	kib
Discussed with:	jhb
MFC after:	1 week
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27645
2020-12-23 15:40:14 -04:00
John Baldwin
1dce7d9e7e Skip the vm.pmap.kernel_maps sysctl by default.
This sysctl node can generate very verbose output, so don't trigger it
for sysctl -a or sysctl vm.pmap.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D27504
2020-12-18 20:41:23 +00:00
Alexander V. Chernikov
d5fe384b4d Enable ROUTE_MPATH support in GENERIC kernels.
Ability to load-balance traffic over multiple path is a must-have thing for routers.
It may be used by the servers to balance outgoing traffic over multiple default gateways.

The previous implementation, RADIX_MPATH stayed in the shadow for too long.
It was not well maintained, which lead us to a vicious circle - people were using
 non-contiguous mask or firewalls to achieve similar goals. As a result, some routing
 daemons implementation still don't have multipath support enabled for FreeBSD.

Turning on ROUTE_MPATH by default would fix it. It will allow to reduce networking
 feature gap to other operating systems. Linux and OpenBSD enabled similar support
 at least 5 years ago.

ROUTE_MPATH does not consume memory unless actually used. It enables around ~1k LOC.

It does not bring any behaviour changes for userland.
Additionally, feature is (temporarily) turned off by the net.route.multipath sysctl
 defaulting to 0.

Differential Revision:	https://reviews.freebsd.org/D27428
2020-12-14 22:23:08 +00:00
Brooks Davis
9ee99cec1f hme(4): Remove as previous announced
The hme (Happy Meal Ethernet) driver was the onboard NIC in most
supported sparc64 platforms. A few PCI NICs do exist, but we have seen
no evidence of use on non-sparc systems.

Reviewed by:	imp, emaste, bcr
Sponsored by:	DARPA
2020-12-11 21:40:38 +00:00
Conrad Meyer
78599c32ef Add CFI start/end proc directives to arm64, i386, and ppc
Follow-up to r353959 and r368070: do the same for other architectures.

arm32 already seems to use its own .fnstart/.fnend directives, which
appear to be ARM-specific variants of the same thing.  Likewise, MIPS
uses .frame directives.

Reviewed by:	arichardson
Differential Revision:	https://reviews.freebsd.org/D27387
2020-12-05 00:33:28 +00:00
Toomas Soome
a4a10b37d4 Add VT driver for VBE framebuffer device
Implement vt_vbefb to support Vesa Bios Extensions (VBE) framebuffer with VT.
vt_vbefb is built based on vt_efifb and is assuming similar data for
initialization, use MODINFOMD_VBE_FB to identify the structure vbe_fb
in kernel metadata.

struct vbe_fb, is populated by boot loader, and is passed to kernel via
metadata payload.

Differential Revision:	https://reviews.freebsd.org/D27373
2020-11-30 08:22:40 +00:00
Ruslan Bukin
f120ad6ba0 o Move options IOMMU from Debugging section back to the Bus section
where it originally was. The bug introduced in r366267.
o Remove options IOMMU from i386/MINIMAL as we don't have it in
  i386/GENERIC.

Reported by:	Harry Schmalzbauer <freebsd@omnilan.de>
Reviewed by:	kib
Sponsored by:	Innovate DSbD
Differential Revision:	https://reviews.freebsd.org/D27399
2020-11-27 21:37:48 +00:00
Jung-uk Kim
926ce35a7e Port rtsx(4) driver for Realtek SD card reader from OpenBSD.
This driver provides support for Realtek PCI SD card readers.  It attaches
mmc(4) bus on card insertion and detaches it on card removal.  It has been
tested with RTS5209, RTS5227, RTS5229, RTS522A, RTS525A and RTL8411B.  It
should also work with RTS5249, RTL8402 and RTL8411.

PR:			204521
Submitted by:		Henri Hennebert (hlh at restart dot be)
Reviewed by:		imp, jkim
Differential Revision:	https://reviews.freebsd.org/D26435
2020-11-24 21:28:44 +00:00
Konstantin Belousov
4815f175d0 Linuxolator: Replace use of eventhandlers by sysent hooks.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27309
2020-11-23 18:18:16 +00:00
Conrad Meyer
77eb984147 'make sysent' for r367773
X-MFC-With:	r367773
2020-11-17 19:53:59 +00:00
Conrad Meyer
de774e422e linux(4): Implement name_to_handle_at(), open_by_handle_at()
They are similar to our getfhat(2) and fhopen(2) syscalls.

Differential Revision:	https://reviews.freebsd.org/D27111
2020-11-17 19:51:47 +00:00
Conrad Meyer
e9b13c6612 linux(4): Deduplicate unimpl/dummy syscall handlers
No functional change.

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27099
2020-11-05 19:30:31 +00:00
Mateusz Guzik
82c174a3b4 malloc: delegate M_EXEC handling to dedicacted routines
It is almost never needed and adds an avoidable branch.

While here do minior clean ups in preparation for larger changes.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27019
2020-10-30 20:02:32 +00:00
Edward Tomasz Napierala
866b1f5147 Fix misnomer - linux_to_bsd_errno() does the exact opposite.
Reported by:	arichardson
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26965
2020-10-27 12:49:40 +00:00
Kyle Evans
d42a83b1a9 audit: also correctly audit linux_execve()
Linux execve() gets audited as AUE_EXECVE as well, we should also interpret
the return from this correctly for the same reasoning as in r367002.

MFC with:	r367002
2020-10-26 17:30:17 +00:00
Warner Losh
0b0447734a Remove support for intel compiler from i386 in_cksum
We no longer support building the kernel with the old intel
compiler. Remove support for it from in_cksum. Should there be
interest in reviving it, this is as likely to get in the way as to
help anyway.
2020-10-24 23:21:27 +00:00
Konstantin Belousov
c0b5fcf692 Improve FPU Tag Word reconstruction on i386 to indicate register states.
Improve the code reconstructing en_tw in struct fpreg32 from FXSAVE
results so that all register states are indicated correctly.  The
previous code unconditionally mapped non-empty register state to
'normalized value' constant.  The new code explicitly distinguishes
the 'zero value' and 'special value' constants as well.  This improves
consistency between real FSAVE and translation from FXSAVE, and
ensures that tests using PT_GETFPREGS can rely on a single correct
value independently of the underlying implementation.

PR:	250454
Sponsored by:	The FreeBSD Foundation
Obtained from:	Moritz Systems
Submitted by:	Michał Górny <mgorny@moritz.systems>
Discussed with:	emaste
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26856
2020-10-21 00:15:12 +00:00
John Baldwin
ba610be90a Add a kernel crypto driver using assembly routines from OpenSSL.
Currently, this supports SHA1 and SHA2-{224,256,384,512} both as plain
hashes and in HMAC mode on both amd64 and i386.  It uses the SHA
intrinsics when present similar to aesni(4), but uses SSE/AVX
instructions when they are not.

Note that some files from OpenSSL that normally wrap the assembly
routines have been adapted to export methods usable by 'struct
auth_xform' as is used by existing software crypto routines.

Reviewed by:	gallatin, jkim, delphij, gnn
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26821
2020-10-20 17:50:18 +00:00
John Baldwin
eeb4c816d6 Properly clear PCB_KERNNPX in fpu_kern_leave().
PR:		250423
Reported by:	CI
Tested by:	lwhsu
2020-10-19 17:35:45 +00:00
Konstantin Belousov
e406235000 Fix for mis-interpretation of PCB_KERNFPU.
RIght now PCB_KERNFPU is used both as indication that kernel prepared
hardware FPU context to use and that the thread is fpu-kern
thread.  This also breaks fpu_kern_enter(FPU_KERN_NOCTX), since
fpu_kern_leave() then clears PCB_KERNFPU.

Introduce new flag PCB_KERNFPU_THR which indicates that the thread is
fpu-kern.  Do not clear PCB_KERNFPU if fpu-kern thread leaves noctx
fpu region.

Reported and tested by:	jhb (amd64)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25511
2020-10-14 23:01:41 +00:00
Konstantin Belousov
d3ba71b2b1 Limit workaround for errata E400 to appropriate AMD cpus.
From Linux sources and several datasheets I looked at, it seems that
the workaround is only needed on families 0xf and 0x10.  For instance,
Ryzens do not implement the accessed MSR at all, it is documented as
reserved.  Also, hypervisors should not allow guest to put CPU into
idle state, so activate workaround only when on bare hardware.

While there, style the code:
    move MSR defines to specialreg.h
    move identification to initcpu.c

Reported by:	whu
Reviewed by:	avg
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D26470
2020-10-14 22:57:50 +00:00
Konstantin Belousov
6f3b523c9a Avoid dump_avail[] redefinition.
Move dump_avail[] extern declaration and inlines into a new header
vm/vm_dumpset.h.  This fixes default gcc build for mips.

Reviewed by:	alc, scottph
Tested by:	kevans (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26741
2020-10-14 22:51:40 +00:00
Ruslan Bukin
6e9127d838 Add a per-each macro IOMMU_DOMAIN_UNLOAD_SLEEP which allows to sleep
during iommu guest address space entries unload.

Suggested by:	kib
Sponsored by:	Innovate DSbD
Differential Revision:	https://reviews.freebsd.org/D26722
2020-10-14 14:51:11 +00:00
John Baldwin
1775215f88 Add support for FPU_KERN_NOCTX.
This mirrors the implementation on amd64.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26754
2020-10-13 17:27:37 +00:00
John Baldwin
4ef6ea38fc Add a <machine/fpu.h> for i386 that includes <machine/npx.h>.
arm64 has a similar wrapper.  This permits defining <machine/fpu.h> as
the standard header for fpu_kern_*.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D26753
2020-10-13 17:26:12 +00:00
Conrad Meyer
f8e8a06d23 random(4) FenestrasX: Push root seed version to arc4random(3)
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled.  Otherwise, there is no
functional change.  The mechanism can be disabled with
debug.fxrng_vdso_enable=0.

arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time.  Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed.  On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.

This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator.  That change is
left for future discussion/work.

Reviewed by:	kib (previous version)
Approved by:	csprng (me -- only touching FXRNG here)
Differential Revision:	https://reviews.freebsd.org/D22839
2020-10-10 21:52:00 +00:00
Warner Losh
7e46dafa58 Create in-tree LINT files
Now that config(8) has supported include for 19 years, transition to
including the NOTES files. include support didn't exist at the time,
nor did the envvar stuff recently added. Now that it does, eliminate
the building of LINT files by just including everything you need.

Note: This may cause conflicts with updating in some cases.
	find sys -name LINT\* -rm
is suggested across this commit to remove the generated LINT
files.

Reviewed by: kevans
Differential Revision: https://reviews.freebsd.org/D26540
2020-10-09 01:48:14 +00:00
Warner Losh
8e82f10172 timer_restore is now unused, remove it
apm was the only consumer of timer_restore. Now that it's gone, this
can be removed.
2020-10-08 20:56:11 +00:00
Warner Losh
8c576a279e Remove APM BIOS support
APM BIOS was relevant only to early laptops (approximately P166 or
P200 and slower). These have not been relevant for a long time, and
this code has been untested for a long time (as far as I can
tell). The APM compat code in ACPI and the apm(8) command is not being
retired. Both of these items are still in use (apm(8) is more
scriptable than the replacement acpiconf, for the most part). This has
been commented out of i386 GENERIC since 2002. This code is not
relevant to any other port.

Discussed on: arch@
2020-10-08 20:56:06 +00:00
Warner Losh
28942db891 Remove apm screen saver.
APM BIOS support is about to be removed. Remove the apm screen saver
and its module. They are about to be irrelevant.
2020-10-08 20:56:00 +00:00
Mitchell Horne
8481aab1ac Print symbol index for unsupported relocation types
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.

While here, fix some small style issues.

Reviewed by:	markj, kib (amd64 part, in D26701)
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2020-10-07 18:48:10 +00:00
Emmanuel Vadot
90b8c0ea10 Fix LINT: Add backlight to NOTES 2020-10-02 20:52:09 +00:00
Ruslan Bukin
6186bfbd18 Rename kernel option ACPI_DMAR to IOMMU.
This is mostly needed for a common arm64/amd64 iommu code.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D26587
2020-09-29 20:29:07 +00:00
Edward Tomasz Napierala
1e2521ffae Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26458
2020-09-27 18:47:06 +00:00
Edward Tomasz Napierala
0c5bd5f993 Regen after r366145.
Sponsored by:	DARPA
2020-09-25 10:05:38 +00:00
Mark Johnston
78257765f2 Add a vmparam.h constant indicating pmap support for large pages.
Enable SHM_LARGEPAGE support on arm64.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26467
2020-09-23 19:34:21 +00:00
Warner Losh
f9ba2bbe3a Use envvar rather than nonstandard hint. lines
The NOTES files have a bunch of hint lines that are removed when
generating LINT. However, we can achieve the same effect by prepending
each of the lines with 'envvar' so the NOTES files become standard
config(8) files. No functional changes as the sed script to generate
the LINT files filters these either way.

Suggested by: kevans
2020-09-23 19:18:53 +00:00
D Scott Phillips
00e6614750 Sparsify the vm_page_dump bitmap
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).

Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.

Reviewed by:	jhb
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26131
2020-09-21 22:21:59 +00:00
D Scott Phillips
ab041f713a Move vm_page_dump bitset array definition to MI code
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.

Reviewed by:	kib, markj
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26129
2020-09-21 22:20:37 +00:00
Edward Tomasz Napierala
70890254b3 Get rid of sv_errtbl and SV_ABI_ERRNO().
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26388
2020-09-17 11:39:33 +00:00
Edward Tomasz Napierala
c26391f4dd Move SV_ABI_ERRNO translation into linux-specific code, to simplify
the syscall path and declutter it a bit.  No functional changes intended.

Reviewed by:	kib (earlier version)
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26378
2020-09-15 16:41:21 +00:00
Mark Johnston
847ab36bf2 Include the psind in data returned by mincore(2).
Currently we use a single bit to indicate whether the virtual page is
part of a superpage.  To support a forthcoming implementation of
non-transparent 1GB superpages, it is useful to provide more detailed
information about large page sizes.

The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind)
values, indicating a mapping of size psind, where psind is an index into
the pagesizes array returned by getpagesizes(3), which in turn comes
from the hw.pagesizes sysctl.  MINCORE_PSIND(1) is equal to the old
value of MINCORE_SUPER.

For now, two bits are used to record the page size, permitting values
of MAXPAGESIZES up to 4.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26238
2020-09-02 18:16:43 +00:00
Mark Johnston
2d838cd867 Add the MEM_EXTRACT_PADDR ioctl to /dev/mem.
This allows privileged userspace processes to find information about the
physical page backing a given mapping.  It is useful in applications
such as DPDK which perform some of their own memory management.

Reviewed by:	kib, jhb (previous version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D26237
2020-09-02 18:12:47 +00:00
Mateusz Guzik
ed83a56181 i386: clean up empty lines in .c and .h files 2020-09-01 21:19:39 +00:00
Warner Losh
8df7e154a2 Add deprecation notice for apm BIOS
Add deprecation notice for apm bios, aka the apm(4) device. The apm(8)
command will remain, at least for a while, since ACPI emulates the apm
ioctl interface.

Discussed on: arch@
Relnotes: yes
MFC After: 3 days
2020-08-31 21:04:00 +00:00
Peter Grehan
a333a508a2 cpu_auxmsr: assert caller is preventing CPU migration.
Submitted by:	Adam Fenn (adam at fenn dot io)
Requested by:	kib
Reviewed by:	kib, grehan
Approved by:	kib
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D26166
2020-08-24 11:49:49 +00:00
Mark Johnston
72c7f24c8d Use pmap_mapbios() to map ACPI tables on amd64 and i386.
The ACPI table-mapping code used pmap_kenter_temporary() to create
mappings, which in turn uses the fixed-size crashdump map.  Moreover,
the code was not verifying that the table fits in this map, so when
mapping large tables we could clobber adjacent mappings.  This use of
pmap_kenter_temporary() appears to predate support in pmap_mapbios() for
creating early mappings, but that restriction no longer applies.

PR:		248746
Reviewed by:	kib, mav
Tested by:	gallatin, Curtis Villamizar <curtis@ipv6.occnc.com>
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26125
2020-08-20 00:52:53 +00:00
Alexander Motin
8f6355b51d Remove some noisy ACPI tables messages from verbose dmesg.
Those messages were printed hundreds of times during boot, often multiple
times for each table.  We already print information about the tables in
more organized form once to not duplicate it when random ACPI drivers are
attaching.

MFC after:	1 week
2020-08-19 16:09:36 +00:00
Mateusz Guzik
a125ed50a6 linux: add sysctl compat.linux.use_emul_path
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
2020-08-18 22:04:22 +00:00
Mateusz Guzik
d5e3895ea4 linux: consistently use LFREEPATH instead of open-coding it 2020-08-18 22:03:55 +00:00
Peter Grehan
3a3f1e9dfa Export a routine to provide the TSC_AUX MSR value and use this in vmm.
Also, drop an unnecessary set of braces.

Requested by:	kib
Reviewed by:	kib
MFC after:	3 weeks
2020-08-18 11:36:38 +00:00
Ruslan Bukin
c4cd699010 o Add machine/iommu.h and include MD iommu headers from it,
so we don't ifdef for every arch in busdma_iommu.c;
o No need to include specialreg.h for x86, remove it.

Requested by:	andrew
Reviewed by:	kib
Sponsored by:	DARPA/AFRL
Differential Revision:	https://reviews.freebsd.org/D25957
2020-08-05 19:11:31 +00:00
Alexander Motin
aba10e131f Allow swi_sched() to be called from NMI context.
For purposes of handling hardware error reported via NMIs I need a way to
escape NMI context, being too restrictive to do something significant.

To do it this change introduces new swi_sched() flag SWI_FROMNMI, making
it careful about used KPIs.  On platforms allowing IPI sending from NMI
context (x86 for now) it immediately wakes clk_intr_event via new IPI_SWI,
otherwise it works just like SWI_DELAY.  To handle the delayed SWIs this
patch calls clk_intr_event on every hardclock() tick.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D25754
2020-07-25 15:19:38 +00:00
Alex Richardson
b798ef6490 Include TMPFS in all the GENERIC kernel configs
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
2020-07-24 08:40:04 +00:00
Alexander Motin
ce53f590ca Untie nmi_handle_intr() from DEV_ISA.
The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is
already wrapped.  Entering debugger on NMI does not really depend on ISA.

MFC after:	2 weeks
2020-07-22 20:15:21 +00:00
Alexander Motin
f2b6da18de Avoid code duplicaiton by using ipi_selected().
MFC after:	2 weeks
2020-07-21 17:18:38 +00:00
Edward Tomasz Napierala
3e9a214260 Regen after r363304.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-07-18 11:31:31 +00:00
Edward Tomasz Napierala
8d1d017175 Add a trivial linux(4) splice(2) implementation, which simply
returns EINVAL.  Fixes grep (grep-3.1-2build1).

PR:		kern/218699
Reported by:	avos
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25636
2020-07-18 11:28:40 +00:00
Conrad Meyer
4ae224c663 Revert r240317 to prevent leaking pmap entries
Subsequent to r240317, kmem_free() was replaced with kva_free() (r254025).
kva_free() releases the KVA allocation for the mapped region, but no longer
clears the pmap (pagetable) entries.

An affected pmap_unmapdev operation would leave the still-pmap'd VA space
free for allocation by other KVA consumers.  However, this bug easily
avoided notice for ~7 years because most devices (1) never call
pmap_unmapdev and (2) on amd64, mostly fit within the DMAP and do not need
KVA allocations.  Other affected arch are less popular: i386, MIPS, and
PowerPC.  Arm64, arm32, and riscv are not affected.

Reported by:	Don Morris <dgmorris AT earthlink.net>
Submitted by:	Don Morris (amd64 part)
Reviewed by:	kib, markj, Don (!amd64 parts)
MFC after:	I don't intend to, but you might want to
Sponsored by:	Dell Isilon
Differential Revision:	https://reviews.freebsd.org/D25689
2020-07-16 23:29:26 +00:00
Mark Johnston
e64080e79c Switch from SCTP to SCTP_SUPPORT in GENERIC configs.
This removes SCTP from in-tree kernel configuration files.  Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.

Reviewed by:	tuexen
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25611
2020-07-16 15:09:04 +00:00
Konstantin Belousov
c123ab20ec Improve description of the vector argument for i386 smp_targeted_tlb_shootdown().
Submitted by:	alc
MFC after:	20 days
2020-07-15 16:12:00 +00:00
Konstantin Belousov
dc43978aa5 amd64: allow parallel shootdown IPIs
Stop using smp_ipi_mtx to protect global shootdown state, and
move/multiply the global state into pcpu.  Now each CPU can initiate
shootdown IPI independently from other CPUs.  Initiator enters
critical section, then fills its local PCPU shootdown info
(pc_smp_tlb_XXX), then clears scoreboard generation at location (cpu,
my_cpuid) for each target cpu.  After that IPI is sent to all targets
which scan for zeroed scoreboard generation words.  Upon finding such
word the shootdown data is read from corresponding cpu' pcpu, and
generation is set.  Meantime initiator loops waiting for all zeroed
generations in scoreboard to update.

Initiator does not disable interrupts, which should allow
non-invalidation IPIs from deadlocking, it only needs to disable
preemption to pin itself to the instance of the pcpu smp_tlb data.

The generation is set before the actual invalidation is performed in
handler. It is safe because target CPU cannot return to userspace
before handler finishes. In principle only NMI can preempt the
handler, but NMI would see the kernel handler frame and not touch
not-invalidated user page table.

Handlers loop until they do not see zeroed scoreboard generations.
This, together with hardware keeping one pending IPI in LAPIC IRR
should prevent lost shootdowns.

Notes.
1. The code does protect writes to LAPIC ICR with exclusion. I believe
   this is fine because we in fact do not send IPIs from interrupt
   handlers. More for !x2APIC mode where ICR access for write requires
   two registers write, we disable interrupts around it. If considered
   incorrect, I can add per-cpu spinlock around ipi_send().
2. Scoreboard lines owned by given target CPU can be padded to the
   cache line, to reduce ping-pong.

Reviewed by:	markj (previous version)
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D25510
2020-07-14 20:37:50 +00:00
Conrad Meyer
64612d4e44 geom(4): Kill GEOM_PART_EBR_COMPAT option
Take advantage of Warner's nice new real GEOM aliasing system and use it for
aliased partition names that actually work.

Our canonical EBR partition name is the weird, not-default-on-x86-prior-to-
this-revision "da1p4+00001234."  However, if compatibility mode (tunable
kern.geom.part.ebr.compat_aliases) is enabled (1, default), we continue to
provide the alias names like "da1p5" in addition to the weird canonical
names.

Naming partition providers was just one aspect of the COMPAT knob; in
addition it limited mutability, in part because it did not preserve existing
EBR header content aside from that of LBA 0.  This change saves the EBR
header for LBA 0, as well as for every EBR partition encountered.  That way,
when we write out the EBR partition table on modification, we can restore
any bootloader or other metadata in both LBA0 (the first data-containing EBR
may start after 0) as well as every logical EBR we read from the disk, and
only update the geometry metadata and linked list pointers that describe the
actual partitioning.

(This change does not add support for the 'bootcode' verb to EBR.)

PR:		232463
Reported by:	Manish Jain <bourne.identity AT hotmail.com>
Discussed with:	ae (no objection)
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D24939
2020-07-01 02:16:36 +00:00
Kyle Evans
5403f186a7 linuxolator: implement memfd_create syscall
This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
  (?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR:		240874
Reviewed by:	kib, trasz
Differential Revision:	https://reviews.freebsd.org/D21845
2020-06-29 03:09:14 +00:00
Edward Tomasz Napierala
a39cdcd7e7 Regen.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-27 14:43:29 +00:00
Edward Tomasz Napierala
308e194cbf Add proper types for linux message queue syscalls; mostly taken
from 32-bit Linuxulator.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25386
2020-06-27 14:42:08 +00:00
Edward Tomasz Napierala
36507f85dc Add syscall definitions for linux xattr syscalls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25387
2020-06-27 14:39:44 +00:00
Edward Tomasz Napierala
5ac2674278 Adapt linuxulator syscalls.master files to the new layout.
No functional changes.

Reviewed by:	brooks
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25381
2020-06-21 10:09:34 +00:00
Brandon Bergren
40b664f64b [PowerPC] More relocation fixes
It turns out relocating the symbol table itself can cause issues, like fbt
crashing because it applies the offsets to the kernel twice.

This had been previously brought up in rS333447 when the stoffs hack was
added, but I had been unaware of this and reimplemented symtab relocation.

Instead of relocating the symbol table, keep track of the relocation base
in ddb, so the ddb symbols behave like the kernel linker-provided symbols.

This is intended to be NFC on platforms other than PowerPC, which do not
use fully relocatable kernels. (The relbase will always be 0)

 * Remove the rest of the stoffs hack.
 * Remove my half-baked displace_symbol_table() function.
 * Extend ddb initialization to cope with having a relocation offset on the
   kernel symbol table.
 * Fix my kernel-as-initrd hack to work with booke64 by using a temporary
   mapping to access the data.
 * Fix another instance of __powerpc__ that is actually RELOCATABLE_KERNEL.
 * Change the behavior or X_db_symbol_values to apply the relocation base
   when updating valp, to match link_elf_symbol_values() behavior.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D25223
2020-06-21 03:39:26 +00:00
Edward Tomasz Napierala
bafd96b8dd Regen after r362440.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-20 18:31:02 +00:00
Edward Tomasz Napierala
52c81be11a Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly.  While some of the flag values match,
most don't.

PR:		kern/230160
Reported by:	markj
Reviewed by:	markj
Discussed with:	brooks, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25272
2020-06-20 18:29:22 +00:00
Eric van Gyzen
6fba90f201 FPU init: allocate initial state from UMA to ensure alignment
The Intel Instruction Set Reference says this about the XSAVE instruction:

    Use of a destination operand not aligned to 64-byte boundary
    (in either 64-bit or 32-bit modes) results in a general-protection
    (#GP) exception.

This alignment happens naturally when all malloc buckets are powers
of two.  However, this change is necessary on some systems when
certain non-power-of-two (and non-multiple of 64) malloc buckets
are defined.

Reviewed by:	cem; kib; earlier version by jhb
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:17:56 +00:00
Eric van Gyzen
701acc2fd8 FPU: make xsave_area_desc static
...because it can be.

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:12:26 +00:00
Eric van Gyzen
674cbe7908 FPU init: Do potentially blocking operations before disabling interrupts
In particular, uma_zcreate creates sysctl oids, which locks an sx lock,
which uses IPIs under contention.  IPIs tend not to work very well
when interrupts are disabled.  Who knew, right?

Reviewed by:	cem kib
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D25098
2020-06-12 21:10:45 +00:00
Konstantin Belousov
3b23ffe271 amd64 pmap: reorder IPI send and local TLB flush in TLB invalidations.
Right now code first flushes all local TLB entries that needs to be
flushed, then signals IPI to remote cores, and then waits for
acknowledgements while spinning idle.  In the VMWare article 'Don’t
shoot down TLB shootdowns!' it was noted that the time spent spinning
is lost, and can be more usefully used doing local TLB invalidation.

We could use the same invalidation handler for local TLB as for
remote, but typically for pmap == curpmap we can use INVLPG for locals
instead of INVPCID on remotes, since we cannot control context
switches on them.  Due to that, keep the local code and provide the
callbacks to be called from smp_targeted_tlb_shootdown() after IPIs
are fired but before spin wait starts.

Reviewed by:	alc, cem, markj, Anton Rang <rang at acm.org>
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D25188
2020-06-10 22:07:57 +00:00
Mark Johnston
81302f1d77 Fix boot on systems where NUMA domain 0 is unpopulated.
- Add vm_phys_early_add_seg(), complementing vm_phys_early_alloc(), to
  ensure that segments registered during hammer_time() are placed in the
  right domain.  Otherwise, since the SRAT is not parsed at that point,
  we just add them to domain 0, which may be incorrect and results in a
  domain with only several MB worth of memory.
- Fix uma_startup1() to try allocating memory for zones from any domain.
  If domain 0 is unpopulated, the allocation will simply fail, resulting
  in a page fault slightly later during boot.
- Change _vm_phys_domain() to return -1 for addresses not covered by the
  affinity table, and change vm_phys_early_alloc() to handle wildcard
  domains.  This is necessary on amd64, where the page array is dense
  and pmap_page_array_startup() may allocate page table pages for
  non-existent page frames.

Reported and tested by:	Rafael Kitover <rkitover@gmail.com>
Reviewed by:	cem (earlier version), kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25001
2020-05-28 19:41:00 +00:00
Conrad Meyer
852c303b61 copystr(9): Move to deprecate (attempt #2)
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro.  Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.

Prior commit message:

Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:		jhb (earlier version)
Discussed with:		brooks (thanks!)
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-25 16:40:48 +00:00
Mark Johnston
e115748932 Fix the build after r361033 when ACPI is disabled.
Reported by:	Herbert J. Skuhra <herbert@gojira.at>
2020-05-22 01:18:55 +00:00
Konstantin Belousov
ea6020830c amd64: Add a knob to flush RSB on context switches if machine has SMEP.
The flush is needed to prevent cross-process ret2spec, which is not handled
on kernel entry if IBPB is enabled but SMEP is present.
While there, add i386 RSB flush.

Reported by:	Anthony Steinhauser <asteinhauser@google.com>
Reviewed by:	markj, Anthony Steinhauser
Discussed with:	philip
admbugs:	961
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-05-20 22:00:31 +00:00
Mark Johnston
b19149bc56 Fix the i386 build after r361033.
Reported by:	Jenkins
2020-05-14 17:56:44 +00:00
Mark Johnston
e76aab6ae2 Call acpi_pxm_set_proximity_info() slightly earlier on x86.
This function is responsible for setting pc_domain in each pcpu
structure.  Call it from the main function that starts APs, rather than
a separate SYSINIT.  This makes it easier to close the window where
UMA's per-CPU slab allocator may be called while pc_domain is
uninitialized.  In particular, the allocator uses pc_domain to allocate
domain-local pages, so allocations before this point end up using domain
0 for everything.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24757
2020-05-14 16:07:27 +00:00
Conrad Meyer
051fc58cb3 Revert r360944 and r360946 until reported issues can be resolved
Reported by:	cy
2020-05-12 04:34:26 +00:00
Conrad Meyer
580744621f copystr(9): Move to deprecate [2/2]
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy.

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-11 22:57:21 +00:00
Mark Johnston
1d6638472b Remove an obsolete TODO comment from several minidump implementations.
The comment referenced a non-existent function, and these minidump
implementations already buffer discontiguous physical data pages by
mapping them into a single VA range that gets passed to the dump device,
so there is no real advantage in batching calls to blk_write().

The RISC-V and MIPS minidump implementations still write a page at a
time and so would benefit from some form of batching.

MFC after:	2 weeks
Sponsored by:	Juniper Networks, Klara Inc.
2020-04-24 18:47:42 +00:00
Brooks Davis
b24e6ac8b7 Convert canary, execpathp, and pagesizes to pointers.
Use AUXARGS_ENTRY_PTR to export these pointers.  This is a followup to
r359987 and r359988.

Reviewed by:	jhb
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24446
2020-04-16 21:53:17 +00:00
Scott Long
0d9abbcf51 Fix ps_strings type change for i386 2020-04-16 05:27:13 +00:00
Brooks Davis
562894f0dc Centralize compatability translation macros.
Copy the CP, PTRIN, etc macros from freebsd32.h into a sys/abi_compat.h
and replace existing definitation with includes where required. This
eliminates duplicate code and allows Linux and FreeBSD compatability
headers to be included in the same files.

Input from:	cem, jhb
Obtained from:	CheriBSD
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24275
2020-04-14 20:30:48 +00:00
John Baldwin
59838c1a19 Retire procfs-based process debugging.
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging.  ptrace() has a supserset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code.  This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR:		244939 (exp-run)
Reviewed by:	kib, mjg (earlier version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23837
2020-04-01 19:22:09 +00:00
Conrad Meyer
ca0ec73c11 Expand generic subword atomic primitives
The goal of this change is to make the atomic_load_acq_{8,16},
atomic_testandset{,_acq}_long, and atomic_testandclear_long primitives
available in MI-namespace.

The second goal is to get this draft out of my local tree, as anything that
requires a full tinderbox is a big burden out of tree.  MD specifics can be
refined individually afterwards.

The generic implementations may not be ideal for your architecture; feel
free to implement better versions.  If no subword_atomic definitions are
needed, the include can be removed from your arch's machine/atomic.h.
Generic definitions are guarded by defined macros of the same name.  To
avoid picking up conflicting generic definitions, some macro defines are
added to various MD machine/atomic.h to register an existing implementation.

Include _atomic_subword.h in arm and arm64 machine/atomic.h.

For some odd reason, KCSAN only generates some versions of primitives.
Generate the _acq variants of atomic_load.*_8, atomic_load.*_16, and
atomic_testandset.*_long.  There are other questionably disabled primitives,
but I didn't run into them, so I left them alone.  KCSAN is only built for
amd64 in tinderbox for now.

Add atomic_subword implementations of atomic_load_acq_{8,16} implemented
using masking and atomic_load_acq_32.

Add generic atomic_subword implementations of atomic_testandset_long(),
atomic_testandclear_long(), and atomic_testandset_acq_long(), using
atomic_fcmpset_long() and atomic_fcmpset_acq_long().

On x86, add atomic_testandset_acq_long as an alias for
atomic_testandset_long.

Reviewed by:	kevans, rlibby (previous versions both)
Differential Revision:	https://reviews.freebsd.org/D22963
2020-03-25 23:12:43 +00:00
Ed Maste
2733d8c96c retire cx,ctau drivers
The devices supported by these drivers are obsolete ISA cards, and the
sync serial protocols they supported are essentially obsolete too.

Sponsored by:	The FreeBSD Foundation
2020-03-20 16:50:19 +00:00
Brandon Bergren
3069380898 [PowerPC][Book-E] Fix missing load base in elf_cpu_parse_dynamic().
When I implemented MD DYNAMIC parsing, I was originally passing a
linker_file_t so that the MD code could relocate pointers.

However, it turns out this isn't even filled in until later, so it was
always 0.

Just pass the load base (ef->address) directly, as that's really the only
thing we were interested in in the first place.

This fixes a crash on RB800 where it was trying to write to an unmapped
address when updating the GOT.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24105
2020-03-18 02:58:18 +00:00
Warner Losh
daba5ace03 Finish removal of bktr
Remove the old ioctl .h files
Remove copying/linking ioctl .h files in instasllworld
Remove bktr from lint
Add now-removed files with ObsoleteFiles
2020-03-01 20:37:42 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Konstantin Belousov
a324b7f71d Fix IBRS for machines with IBRS_ALL capability.
When turning IBRS mitigation using sysctl, as opposed to loader tunable,
send IPI to tweak MSR on all cores.  Right now code only performed MSR write
onr the CPU where sysctl was run.

Properly report hw.ibrs_active for IBRS_ALL.  Split hw_ibrs_ibpb_active out
from ibrs_active, to keep the current semantic of guiding kernel entry and
exit handlers.

Reported and tested by:	mav
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-02-25 17:26:10 +00:00
Mateusz Guzik
4ef55e371a i386: remove no longer needed atomic_load_ptr casts 2020-02-14 23:17:37 +00:00
Mark Johnston
c3d326fd44 Define MAXCPU consistently between the kernel and KLDs.
This reverts r177661.  The change is no longer very useful since
out-of-tree KLDs will be built to target SMP kernels anyway.  Moveover
it breaks the KBI in !SMP builds since cpuset_t's layout depends on the
value of MAXCPU, and several kernel interfaces, notably
smp_rendezvous_cpus(), take a cpuset_t as a parameter.

PR:		243711
Reviewed by:	jhb, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23512
2020-02-05 19:08:21 +00:00
Ed Maste
690a8a6acd regen linuxulator sysent after r357577 2020-02-05 16:54:16 +00:00
Ed Maste
fc7510aef7 linuxulator: implement sendfile
Submitted by:	Bora Özarslan <borako.ozarslan@gmail.com>
Submitted by:	Yang Wang <2333@outlook.jp>
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19917
2020-02-05 16:53:02 +00:00
Ryan Libby
10c8fb47d9 uma: convert mbuf_jumbo_alloc to UMA_ZONE_CONTIG & tag others
Remove mbuf_jumbo_alloc and let large mbuf zones use the new uma default
contig allocator (a copy of mbuf_jumbo_alloc).  Tag other zones which
require contiguous objects, even if they don't use the new default
contig allocator, so that uma knows about their constraints.

Reviewed by:	jeff, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D23238
2020-02-04 22:40:23 +00:00
Mark Johnston
1c29da0279 Reimplement stack capture of running threads on i386 and amd64.
After r355784 the td_oncpu field is no longer synchronized by the thread
lock, so the stack capture interrupt cannot be delievered precisely.
Fix this using a loop which drops the thread lock and restarts if the
wrong thread was sampled from the stack capture interrupt handler.

Change the implementation to use a regular interrupt instead of an NMI.
Now that we drop the thread lock, there is no advantage to the latter.

Simplify the KPIs.  Remove stack_save_td_running() and add a return
value to stack_save_td().  On platforms that do not support stack
capture of running threads, stack_save_td() returns EOPNOTSUPP.  If the
target thread is running in user mode, stack_save_td() returns EBUSY.

Reviewed by:	kib
Reported by:	mjg, pho
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23355
2020-01-31 15:43:33 +00:00
Mark Johnston
149afbf3ba Fix 64-bit syscall argument fetching in 32-bit Linux syscall handlers.
The Linux32 system call argument fetcher places each argument (passed in
registers in the Linux x86 system call convention) into an entry in the
generic system call args array.  Each member of this array is 8 bytes
wide, so this approach is broken for system calls that take off_t
arguments.

Fix the problem by splitting l_loff_t arguments in the 32-bit system
call descriptions, the same as we do for FreeBSD32.  Change entry points
to handle this using the PAIR32TO64 macro.

Move linux_ftruncate64() into compat/linux.

PR:		243155
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	kib (previous version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23210
2020-01-21 17:28:22 +00:00
Konstantin Belousov
2ee49fac82 Add support for Hygon Dhyana Family 18h processor.
As a new x86 CPU vendor, Chengdu Haiguang IC Design Co., Ltd (Hygon)
is a joint venture between AMD and Haiguang Information Technology Co.,
Ltd., aims at providing x86 processors for China server market.

The first generation Hygon processor(Dhyana) shares most architecture
with AMD's family 17h, but with different CPU vendor ID("HygonGenuine")
and PCI vendor ID(0x1d94) and family series number 18h(Hygon negotiated
with AMD to confirm that only Hygon use family 18h).

To enable Hygon Dhyana support in FreeBSD, add new definitions
HYGON_VENDOR_ID("HygonGenuine") and X86_VENDOR_HYGON(0x1d94) to identify
Hygon Dhyana CPU.

Initialize the CPU features(topology, local APIC ext, MSI, TSC, hwpstate,
MCA, DEBUG_CTL, etc) for amd64 and i386 mode by sharing the code path of
AMD family 17h.

The changes have been applied on FreeBSD 13.0-CURRENT and tested
successfully on Hygon Dhyana processor.

References:
[1] Linux kernel patches for Hygon Dhyana, merged in 4.20:

https://git.kernel.org/tip/c9661c1e80b609cd038db7c908e061f0535804ef

[2] MSR and CPUID definition:

https://www.amd.com/system/files/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf

Submitted by:	Pu Wen <puwen@hygon.cn>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D23163
2020-01-21 13:22:35 +00:00
Kyle Evans
05d7dd739c sysent targets: further cleanup and deduplication
r355473 vastly improved the readability and cleanliness of these Makefiles.
Every single one of them follows the same pattern and duplicates the exact
same logic.

Now that we have GENERATED/SRCS, split SRCS up into the two parameters we'll
use for ${MAKESYSCALLS} rather than assuming a specific ordering of SRCS and
include a common sysent.mk to handle the rest. This makes it less tedious to
make sweeping changes.

Some default values are provided for GENERATED/SYSENT_*; almost all of these
just use a 'syscalls.master' and 'syscalls.conf' in cwd, and they all use
effectively the same filenames with an arbitrary prefix. Most ABIs will be
able to get away with just setting GENERATED_PREFIX and including
^/sys/conf/sysent.mk, while others only need light additions. kern/Makefile
is the notable exception, as it doesn't take a SYSENT_CONF and the generated
files are spread out between ^/sys/kern and ^/sys/sys, but it otherwise fits
the pattern enough to use the common version.

Reviewed by:	brooks, imp
Nice!:		emaste
Differential Revision:	https://reviews.freebsd.org/D23197
2020-01-18 20:37:45 +00:00
Kyle Evans
1171c633fb Set .ORDER for makesyscalls generated files
When either makesyscalls.lua or syscalls.master changes, all of the
${GENERATED} targets are now out-of-date. With make jobs > 1, this means we
will run the makesyscalls script in parallel for the same ABI, generating
the same set of output files.

Prior to r356603 , there is a large window for interlacing output for some
of the generated files that we were generating in-place rather than staging
in a temp dir. After that, we still should't need to run the script more
than once per-ABI as the first invocation should update all of them. Add
.ORDER to do so cleanly.

Reviewed by:	brooks
Discussed with:	sjg
Differential Revision:	https://reviews.freebsd.org/D23099
2020-01-10 18:24:17 +00:00
Mark Johnston
f3e982e764 Define a unified pmap structure for i386.
The overloading of struct pmap for PAE and non-PAE pmaps results in
three distinct layouts for the structure, which is embedded in
struct vmspace.  This causes a large number of duplicate structure
definitions in the i386 kernel's CTF type graph.

Since most pmap fields are the same in the two pmaps, simply provide
side-by-side variants of the fields that are distinct, using fixed-size
types.

PR:		242689
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22896
2020-01-07 15:59:31 +00:00
Mark Johnston
e8bbca1b44 Consistently use pmap_t instead of struct pmap *.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2020-01-07 15:59:02 +00:00
Alan Cox
1c3a241032 When a copy-on-write fault occurs, pmap_enter() is called on to replace the
mapping to the old read-only page with a mapping to the new read-write page.
To destroy the old mapping, pmap_enter() must destroy its page table and PV
entries and invalidate its TLB entry.  This change simply invalidates that
TLB entry a little earlier, specifically, on amd64 and arm64, before the PV
list lock is held.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23027
2020-01-04 19:50:25 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Edward Tomasz Napierala
cc50333011 Add basic getcpu(2) support to linuxulator. The purpose of this
syscall is to query the CPU number and the NUMA domain the calling
thread is currently running on.  The third argument is ignored.
It doesn't do anything regarding scheduling - it's literally
just a way to query the current state, without any guarantees
you won't get rescheduled an opcode later.

This unbreaks Java from CentOS 8
(java-11-openjdk-11.0.5.10-0.el8_0.x86_64).

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22972
2019-12-31 22:01:08 +00:00
Edward Tomasz Napierala
da7627d797 Regen after r356229.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-12-31 16:01:37 +00:00
Edward Tomasz Napierala
a8bfc7a85c Fix definitions for Linux getcpu(2).
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-12-31 15:57:29 +00:00
Pawel Biernacki
54666dffa8 linux(4): implement copy_file_range(2)
copy_file_range(2) is implemented natively since r350315, make it available
for Linux binaries too.

Reviewed by:	kib (mentor), trasz (previous version)
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D22959
2019-12-30 18:11:06 +00:00
Edward Tomasz Napierala
ee0fe82ee2 Implement Linux syslog(2) syscall; just enough to make Linux dmesg(8)
utility work.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22465
2019-12-29 15:53:55 +00:00