Commit Graph

8879 Commits

Author SHA1 Message Date
Dmitry Chagin
f19c4e2341 linux(4): Change semtimedop syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-06 20:01:43 +03:00
Dmitry Chagin
f48a68874b linux(4): Retire linux_semop implementation.
In i386 Linux semop called via ipc() multiplexor, so use kern_semop
directly from multiplexor.

MFC after:		2 weeks
2022-05-06 20:00:13 +03:00
Dmitry Chagin
cd875998dc linux(4): Regen for semop syscall.
MFC after:		2 weeks
2022-05-06 19:59:33 +03:00
Dmitry Chagin
f686092664 linux(4): Call semop directly.
As the Linux semop syscall is not defined in i386, and as it is equal
to the native semop syscall, call it directly.
Fix semop definition to match Linux actual one - nsops is size_t type.

MFC after:		2 weeks
2022-05-06 19:58:53 +03:00
Dan Carpenter
e99c0c8b79 xen: Prevent buffer overflow in privcmd ioctl
The "call" variable comes from the user in privcmd_ioctl_hypercall().
It's an offset into the hypercall_page[] which has (PAGE_SIZE / 32)
elements.  We need to put an upper bound on it to prevent an out of
bounds access.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>

Obtained from: Linux
Linux commit: 42d8644bd77dd2d747e004e367cb0c895a606f39
Fixes: bf7313e3b7 ("xen: implement the privcmd user-space device")
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by: royger
2022-05-06 09:31:32 +02:00
Dmitry Chagin
1744f14e26 linux(4): Implement recvmmsg_time64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:53 +03:00
Dmitry Chagin
79695e9585 linux(4): Regen for recvmmsg_time64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:52 +03:00
Dmitry Chagin
17ccda0039 linux(4): Change recvmmsg_time64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:52 +03:00
Dmitry Chagin
ce9f8d6ab0 linux(4): Implement timerfd_gettime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:52 +03:00
Dmitry Chagin
ac80ae9313 linux(4): Regen for timerfd_gettime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:51 +03:00
Dmitry Chagin
16aefe5ba3 linux(4): Change timerfd_gettime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:51 +03:00
Dmitry Chagin
b1f0b08d93 linux(4): Implement timerfd_settime64 syscall.
MFC after:		2weeks
2022-05-04 13:06:50 +03:00
Dmitry Chagin
f4228fbb4e linux(4): Regen for timerfd_settime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:50 +03:00
Dmitry Chagin
8545bcff31 linux(4): Change timerfd_settime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:50 +03:00
Dmitry Chagin
a1fd2911dd linux(4): Implement timer_settime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:49 +03:00
Dmitry Chagin
9038a0b74c linux(4): Regen for timer_settime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:49 +03:00
Dmitry Chagin
1508b1b6a0 linux(4): Change timer_settime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
783c1bd8cb linux(4): Implement timer_gettime64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
1cccef6dff linux(4): Regen for timer_gettime64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:48 +03:00
Dmitry Chagin
ccec96033c linux(4): Change timer_gettime64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
8c84ca657b linux(4): Implement sched_rr_get_interval_time64 syscall.
MFC after:		2 weeks
2022-05-04 13:06:47 +03:00
Dmitry Chagin
cdddbb77c3 linux(4): Regen for sched_rr_get_interval_time64 syscall.
MFC after:	2 weeks
2022-05-04 13:06:46 +03:00
Dmitry Chagin
7b520c0b3c linux(4): Change sched_rr_get_interval_time64 syscall definition to match Linux actual one.
MFC after:		2 weeks
2022-05-04 13:06:45 +03:00
Dmitry Chagin
fe2c9f83a6 Remove dead code.
is_physical_memory() dead since 235a54de.

Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D35056
MFC after:		2 weeks
2022-04-26 19:40:59 +03:00
Dmitry Chagin
df377f1fb8 linux(4): Regen for epoll_pwait2 syscall. 2022-04-26 19:35:58 +03:00
Dmitry Chagin
81b0b7dc0c linux(4): Change epoll_pwait2 syscall definition to match Linux actual one.
MFC after:	2 weeks
2022-04-26 19:35:57 +03:00
Dmitry Chagin
75e409495f linux(4): Regen for rseq syscall. 2022-04-26 19:35:55 +03:00
Dmitry Chagin
f202f35db0 linux(4): Change rseq syscall definition to match Linux actual one.
MFC after:	2 weeks
2022-04-26 19:35:54 +03:00
John Baldwin
f2d166d532 amd64 NOTES: Add entries for qlxgb, glxgbe, and glxge. 2022-04-22 15:18:06 -07:00
John Baldwin
5bf623bbcd amd64 NOTES: Sort the axp entry. 2022-04-22 15:18:06 -07:00
John Baldwin
0b377a49fa FB_INSTALL_CDEV: Remove this option and related code.
This option was never enabled in GENERIC and does not appear to work
(the cdevsw is stored in a global array but never passed to make_dev
to be associated with a character device).

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D35008
2022-04-21 10:29:14 -07:00
Brooks Davis
c2f6aae007 machine/in_cksum.h: don't include sys/cdefs.h
All consumers already do it and it was required on amd64 and i386
until recently (1c1bf5bd7c).

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D34932
2022-04-18 21:02:19 +01:00
John Baldwin
3d6f4411e4 Remove checks for <sys/cdefs.h> being included.
These files no longer depend on the macros required when these checks
were added.

PR:		263102 (exp-run)
Reviewed by:	brooks, imp, emaste
Differential Revision:	https://reviews.freebsd.org/D34804
2022-04-12 10:06:18 -07:00
John Baldwin
5f9c9ae2f2 Remove checks for __CC_SUPPORTS_WARNING assuming it is always true.
All supported compilers (modern versions of GCC and clang) support
this.

PR:		263102 (exp-run)
Reviewed by:	brooks, imp
Differential Revision:	https://reviews.freebsd.org/D34803
2022-04-12 10:06:13 -07:00
John Baldwin
f08613087a Remove checks for __CC_SUPPORTS__INLINE assuming it is always true.
All supported compilers (modern versions of GCC and clang) support
this.

PR:		263102 (exp-run)
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D34802
2022-04-12 10:06:09 -07:00
John Baldwin
5ab33279ad Remove checks for __GNUCLIKE___TYPEOF assuming it is always true.
All supported compilers (modern versions of GCC and clang) support
this.

PR:		263102 (exp-run)
Reviewed by:	brooks, imp, emaste
Differential Revision:	https://reviews.freebsd.org/D34798
2022-04-12 10:05:50 -07:00
John Baldwin
56f5947a71 Remove checks for __GNUCLIKE_ASM assuming it is always true.
All supported compilers (modern versions of GCC and clang) support
this.

Many places didn't have an #else so would just silently do the wrong
thing.  Ancient versions of icc (the original motivation for this) are
no longer a compiler FreeBSD supports.

PR:		263102 (exp-run)
Reviewed by:	brooks, imp
Differential Revision:	https://reviews.freebsd.org/D34797
2022-04-12 10:05:45 -07:00
John Baldwin
1c1bf5bd7c x86: Remove silly checks for <sys/cdefs.h>.
These headers #include <sys/cdefs.h> right after checking if it has
already been #included.  The nested #include already existed when the
check for _SYS_CDEFS_H_ was added, so the check shouldn't have been
added in the first place.

PR:		263102 (exp-run)
Reported by:	brooks
Reviewed by:	brooks, imp, emaste
Differential Revision:	https://reviews.freebsd.org/D34796
2022-04-12 10:05:39 -07:00
Robert Wing
d4e8207317 vmm_instruction_emul.c: fix bhyve build
The __diagused macro was used to cure a "set but not used" warning. This
broke the build for bhyve since __diagused is only defined in the
kernel. Define __diagused when not building the kernel.

Fixes:          5241577a22 ("vmm: fix set but not used warning")
Reported by:    Jenkins
2022-04-10 13:37:24 -08:00
Robert Wing
5a17f489d5 vmm: fix set but not used warning 2022-04-10 10:30:19 -08:00
Robert Wing
5241577a22 vmm: fix set but not used warning 2022-04-10 10:30:16 -08:00
Robert Wing
3587bfa797 vmm: fix set but not used warning 2022-04-10 10:30:14 -08:00
Robert Wing
5c272efaba vmm: fix set but not used warnings 2022-04-10 10:30:11 -08:00
Robert Wing
f877977a03 vmm: fix set but not used warnings 2022-04-10 10:30:08 -08:00
Robert Wing
893a3dd697 vmm: fix set but not used warning 2022-04-10 10:30:05 -08:00
Gordon Bergling
ffedb32dd3 pax(1): Remove a few double words in source code comments
- s/an an/an/

MFC after:	3 days
2022-04-09 14:27:39 +02:00
Gordon Bergling
efd0fdfe28 NOTES: Remove a double word in comments
- s/for for/for/

MFC after:	3 days
2022-04-09 10:31:49 +02:00
John Baldwin
6a33ecdc2f Fix a typo in previous commit.
Reported by:	npn
Pointy hat to:	jhb
Fixes:		572edd3dae vmm: Re-quiet set but unused warnings.
2022-04-08 12:01:33 -07:00
John Baldwin
a7d876f701 vmm amdvi: Move ctrl under #ifdef AMDVI_DEBUG_CMD. 2022-04-07 17:01:29 -07:00
John Baldwin
572edd3dae vmm: Re-quiet set but unused warnings.
__diagused is no longer used for KTR, so instead use #ifdef KTR or
expand KTR-only variables into their values in traces.
2022-04-07 17:01:29 -07:00
John Baldwin
3797f43e91 sgx: Remove unused variable. 2022-04-07 17:01:28 -07:00
Dmitry Chagin
d5dc757e84 linux(4): Add compat.linux32.emulate_i386 knob.
Historically 32-bit Linuxulator under amd64 emulated the real i386
behavior. Since 3d8dd983 the old i386 Linux world can't be used under
amd64 Linuxulator as it don't know anything about amd64 machine (which
is returned now by newuname() syscall). So, add a knob to allow to swith
the behavior and use i386 Linux binaries on amd64.
Set knob to the new behavior as I think this is common to the modern
Linux distros.

Reviewed by:		Pau Amma (doc), emaste
Differential revision:	https://reviews.freebsd.org/D34708
MFC after:		2 weeks
2022-03-31 21:01:09 +03:00
Brooks Davis
8601fca789 sysent: regen for syscallarg_t 2022-03-28 19:43:03 +01:00
Brooks Davis
b1ad6a9000 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-03-28 19:43:03 +01:00
John Baldwin
931983ee08 x86: Add a NT_X86_SEGBASES register set.
This register set contains the values of the fsbase and gsbase
registers.  Note that these registers can already be controlled
individually via ptrace(2) via MD operations, so the main reason for
adding this is to include these register values in core dumps.  In
particular, this will enable looking up the value of TLS variables
from core dumps in gdb.

The value of NT_X86_SEGBASES was chosen to match the value of
NT_386_TLS on Linux.  The notes serve similar purposes, but FreeBSD
will never dump a note equivalent to NT_386_TLS (which dumps a single
segment descriptor rather than a pair of addresses) and picking a
currently-unused value in the NT_X86_* range could result in a future
conflict.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D34650
2022-03-24 11:36:19 -07:00
Bjoern A. Zeeb
246c398145 bhyve: Do not remove guest physical addresses from IOMMU host domain
This permits I/O devices on the host to directly access wired memory
dedicated to guests using passthru devices.  Note that wired memory
belonging to guests that do not use passthru devices has always been
accessible by I/O devices on the host.

bhyve maps guest physical addresses into the user address space of
the bhyve process by mmap'ing /dev/vmm/<vmname>.  Device models pass
pointers derived from this mapping directly to system calls such as
preadv() to minimize copies when emulating DMA.  If the backing store
for a device model is a raw host device (e.g. when exporting a raw disk
device such as /dev/ada<n> as a drive in the guest), the host device
driver (e.g. ahci for /dev/ada<n>) can itself use DMA on the host
directly to the guest's memory.  However, if the guest's memory is
not present in the host IOMMU domain, these DMA requests by the host
device will fail without raising an error visible to the host device
driver or to the guest resulting in non-working I/O in the guest.

It is unclear why guest addresses were removed from the IOMMU host domain
initially, especially only for VM's with a passthru device as the
host IOMMU domain does not affect the permissions of passthru devices,
only devices on the host.

A considered alternative was using bounce buffers instead (D34535
is a proof of concept), but that adds additional overhead for unclear
benefit.

This solves a long-standing problem when using passthru devices and
physical disks in the same VM.

Thanks to:	grehan (patience and help)
Thanks to:	jhb (for improving the commit message)
PR:		260178
Reviewed by:	grehan, jhb
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D34607
2022-03-24 15:21:24 +00:00
Roger Pau Monné
1ca34862dc x86/tsc: fetch frequency from CPUID when running on Xen
Introduce a helper to fetch the TSC frequency from CPUID when running
under Xen.

Since the TSC can also be initialized early when running as a Xen
guest pull out the call to tsc_init() from the
early_clock_source_init() handlers and place it in clock_init(), as
otherwise all handlers would call tsc_init() anyway.

Reviewed by: markj
Sponsored by: Citrix Systems R&D
Differential revision: https://reviews.freebsd.org/D34581
2022-03-18 10:21:04 +01:00
Corvin Köhne
e47fe3183e bhyve: add ROM emulation
Some PCI devices especially GPUs require a ROM to work properly.
The ROM is executed by boot firmware to initialize the device.
To add a ROM to a device use the new ROM option for passthru device
(e.g. -s passthru,0/2/0,rom=<path>/<to>/<rom>).

It's necessary that the ROM is executed by the boot firmware.
It won't be executed by any OS.
Additionally, the boot firmware should be configured to execute the
ROM file.
For that reason, it's only possible to use a ROM when using
OVMF with enabled bus enumeration.

Differential Revision:	https://reviews.freebsd.org/D33129
Sponsored by:   Beckhoff Automation GmbH & Co. KG
MFC after:      1 month
2022-03-10 12:30:37 +01:00
Andrew Turner
9cf15aefb9 Fix the spelling of EFI_PAGE_SIZE
We assume EFI_PAGE_SIZE is the same as PAGE_SIZE, however this may not
be the case. Use the former when working with a list of pages from the
UEFI firmware so the correct size is used.

This will be needed on arm64 where PAGE_SIZE could be 16k or 64k in the
future. The other architectures have been updated to be consistent.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34510
2022-03-10 10:43:54 +00:00
John Baldwin
f1d450ddee bhyve: Remove VM_MAXCPU from the userspace API/ABI.
Reviewed by:	grehan
Differential Revision:	https://reviews.freebsd.org/D34494
2022-03-09 15:39:28 -08:00
Mark Johnston
f7a6dccf42 amd64: Call clock_init() after finishidentcpu()
As in commit c3d830cf7c, we should finalize CPU identification before
probing the TSC frequency.

Fixes:		84369dd523 ("x86: Probe the TSC frequency earlier")
Reported by:	khng
2022-03-04 19:32:39 -05:00
Mark Johnston
84369dd523 x86: Probe the TSC frequency earlier
This lets us use the TSC to implement early DELAY, limiting the use of
the sometimes-unreliable 8254 PIT.

PR:		262155
Reviewed by:	emaste
Tested by:	emaste, mike tancsa <mike@sentex.net>, Stefan Hegnauer <stefan.hegnauer@gmx.ch>
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34367
2022-03-01 09:39:35 -05:00
Robert Wing
2062ce996d vmm: fix "set but not used" warnings 2022-02-28 15:09:32 -09:00
Robert Wing
39d87a0235 vmm: fix "set but not used" warnings 2022-02-28 14:55:37 -09:00
Robert Wing
73505a1076 vmm: fix "set but not used" warnings 2022-02-28 14:46:08 -09:00
Mateusz Guzik
b53133a778 proc: load/store p_cowgen using atomic primitives 2022-02-13 13:07:08 +00:00
Warner Losh
690601f0b4 amd64: Add static asssert for context size
Add a static assert for the siginfo_t, mcontext_t and ucontext_t
sizes. These are de-facto ABI options and cannot change size ever.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D34212
2022-02-10 14:32:20 -07:00
John Baldwin
6426978617 Extend the VMM stats interface to support a dynamic count of statistics.
- Add a starting index to 'struct vmstats' and change the
  VM_STATS ioctl to fetch the 64 stats starting at that index.
  A compat shim for <= 13 continues to fetch only the first 64
  stats.

- Extend vm_get_stats() in libvmmapi to use a loop and a static
  thread local buffer which grows to hold the stats needed.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27463
2022-02-07 14:11:10 -08:00
Elliott Mitchell
ad7dd51499 xen: switch to use headers in contrib
These headers originate with the Xen project and shouldn't be mixed with
the main portion of the FreeBSD kernel. Notably they shouldn't be the
target of clean-up commits.

Switch to use the headers in sys/contrib/xen.

Reviewed by: royger
2022-02-07 10:11:56 +01:00
John Baldwin
6bea696af2 linux_copyout_strings: Use PROC_PS_STRINGS().
Reviewed by:	markj
Obtained from:	CheriBSD
Differential Revision:	https://reviews.freebsd.org/D34173
2022-02-04 15:57:57 -08:00
Konstantin Belousov
9596b349bb x86 atomic.h: remove obsoleted comment
Modules no longer call kernel functions for atomic ops, and since the
previous commit, we always use lock prefix.

Submitted by:	Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by:	jhb, markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34153
2022-02-04 14:01:39 +02:00
Konstantin Belousov
9c0b759bf9 x86 atomics: use lock prefix unconditionally
Atomics have significant other use besides providing in-system
primitives for safe memory updates.  They are used for implementing
communication with out of system software or hardware following some
protocols.

For instance, even UP kernel might require a protocol using atomics to
communicate with the software-emulated device on SMP hypervisor.  Or
real hardware might need atomic accesses as part of the proper
management protocol.

Another point is that UP configurations on x86 are extinct, so slight
performance hit by unconditionally use proper atomics is not important.
It is compensated by less code clutter, which in fact improves the
UP/i386 lifetime expectations.

Requested by:	Elliott Mitchell <ehem+freebsd@m5p.com>
Reviewed by:	Elliott Mitchell, imp, jhb, markj, royger
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34153
2022-02-04 14:01:39 +02:00
Konstantin Belousov
cbf999e75d x86 atomic.h: cleanup comments for preprocessor directives
Reviewed by:	Elliott Mitchell, imp, jhb, markj, royger
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34153
2022-02-04 14:01:39 +02:00
Konstantin Belousov
e7c5442162 amd64: micro-optimize vptopte()/vtopde() further
Eliminate shlq $3,address shift after masking of the va is done, which
is needed to convert pt_entry_t[] array index into byte offset.
Do it by preshifting the mask, and compensating the right shift of va.

Suggested by:	alc
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33786
2022-02-02 11:40:04 +02:00
Andrew Turner
548a2ec49b Add PT_GETREGSET
This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.

The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.

Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19831
2022-01-27 11:40:34 +00:00
Mark Johnston
758d98debe exec: Remove the stack gap implementation
ASLR stack randomization will reappear in a forthcoming commit.  Rather
than inserting a random gap into the stack mapping, the entire stack
mapping itself will be randomized in the same way that other mappings
are when ASLR is enabled.

No functional change intended, as the stack gap implementation is
currently disabled by default.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston
706f4a81a8 exec: Introduce the PROC_PS_STRINGS() macro
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro.  With stack address randomization the
ps_strings address is no longer fixed.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston
3fc21fdd5f sysent: Add a sv_psstringssz field to struct sysentvec
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:42:07 -05:00
Corvin Köhne
6171e026be bhyve: add support for MTRR
Some guests or driver might depend on MTRR to work properly. E.g. the
nvidia gpu driver won't work without MTRR.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Beckhoff Automation GmbH & Co. KG
Differential Revision:	https://reviews.freebsd.org/D33333
2022-01-14 12:41:44 +01:00
John Baldwin
bd7630ef61 ia32: Sync signal context type names with i386.
- Use ia32_freebsd4_* instead of ia32_*4.
- Use ia32_o* instead of ia32_*3.

Reviewed by:	brooks, imp, kib
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33882
2022-01-13 17:17:21 -08:00
Brooks Davis
0910a41ef3 Revert "syscallarg_t: Add a type for system call arguments"
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.

This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
2022-01-12 23:29:20 +00:00
Brooks Davis
3889fb8af0 sysent: regen for syscallarg_t 2022-01-12 22:51:25 +00:00
Brooks Davis
1544e0f5d1 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-01-12 22:51:25 +00:00
Vitaliy Gusev
c72e914cf1 vmm: vlapic resume can eat 100% CPU by vlapic_callout_handler
Suspend/Resume of Win10 leads that CPU0 is busy on handling interrupts.

Win10 does not use LAPIC timer to often and in most cases, and I see it
is disabled by writing 0 to Initial Count Register (for Timer).

During resume, restart timer only for enabled LAPIC and enabled timer
for that LAPIC.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33448
2022-01-11 09:27:45 -05:00
Konstantin Belousov
15964f1cb3 amd64 pmap: preset A and M bits for pmap_qenter() and pmap_kenter() mappings
This removes one or two atomic update of the pte on access. Also it
makes the pte constant during its lifecycle.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33778
2022-01-08 06:34:18 +02:00
Konstantin Belousov
720a892ac6 amd64 pmap: simplify vtopte() and vtopde()
Pre-calculate masks and page table/page directory bases, for LA48 and
LA57 cases, trading a conditional for the memory accesses.

This shaves 672 bytes of .text, by the cost of 32 .data bytes.  Note
that eliminated code contained one conditional, and several costly
movabs instructions, which are traded by two memory fetches per
vtop{t,d}e() inlines.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33776
2022-01-08 06:34:18 +02:00
Warner Losh
9a498f9a14 lio: remove from NOIP
It doesn't build here. This was masked earlier by other build breakage.

Sponsored by:		Netflix
2022-01-05 14:19:34 -07:00
Edward Tomasz Napierala
f7b04c53de linux(4): Reduce diffs between linux_rt_sendsig() and sendsig()
No functional changes (except for the uprintf).

Discussed With:	kib
Sponsored By:	EPSRC
2022-01-04 14:39:19 +00:00
Edward Tomasz Napierala
562bc0a943 Unstaticize {get,set}_fpcontext() on amd64
This will be used to fix Linux signal delivery.

Discussed With:	kib
Sponsored By:	EPSRC
2022-01-04 13:25:12 +00:00
Konstantin Belousov
642f77be1d amd64 sigtramp: comment-out annotations for registers with DWARF number >= 32
Sponsored by:	The FreeBSD Foundation
2022-01-02 21:00:52 +02:00
Konstantin Belousov
76ef4f6348 amd64 get_mcontext(): third argument to get_fpcontext() is a pointer
Use NULL instead of raw 0

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2022-01-01 05:58:21 +02:00
Mark Johnston
f04a096049 exec: Simplify sv_copyout_strings implementations a bit
Simplify control flow around handling of the execpath length and signal
trampoline.  Cache the sysentvec pointer in a local variable.

No functional change intended.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33703
2021-12-31 12:50:15 -05:00
Stefan Eßer
e2650af157 Make CPU_SET macros compliant with other implementations
The introduction of <sched.h> improved compatibility with some 3rd
party software, but caused the configure scripts of some ports to
assume that they were run in a GLIBC compatible environment.

Parts of sched.h were made conditional on -D_WITH_CPU_SET_T being
added to ports, but there still were compatibility issues due to
invalid assumptions made in autoconfigure scripts.

The differences between the FreeBSD version of macros like CPU_AND,
CPU_OR, etc. and the GLIBC versions was in the number of arguments:
FreeBSD used a 2-address scheme (one source argument is also used as
the destination of the operation), while GLIBC uses a 3-adderess
scheme (2 source operands and a separately passed destination).

The GLIBC scheme provides a super-set of the functionality of the
FreeBSD macros, since it does not prevent passing the same variable
as source and destination arguments. In code that wanted to preserve
both source arguments, the FreeBSD macros required a temporary copy of
one of the source arguments.

This patch set allows to unconditionally provide functions and macros
expected by 3rd party software written for GLIBC based systems, but
breaks builds of externally maintained sources that use any of the
following macros: CPU_AND, CPU_ANDNOT, CPU_OR, CPU_XOR.

One contributed driver (contrib/ofed/libmlx5) has been patched to
support both the old and the new CPU_OR signatures. If this commit
is merged to -STABLE, the version test will have to be extended to
cover more ranges.

Ports that have added -D_WITH_CPU_SET_T to build on -CURRENT do
no longer require that option.

The FreeBSD version has been bumped to 1400046 to reflect this
incompatible change.

Reviewed by:	kib
MFC after:	2 weeks
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D33451
2021-12-30 12:20:32 +01:00
John Baldwin
254e4e5b77 Simplify swi for bus_dma.
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages.  For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm).  However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi.  I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux.  However, in the current scheme there's no need for the mux.

Instead, remove swi_vm and vm_ih.  Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly.  This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.

One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.

Reviewed by:	imp, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33447
2021-12-28 13:51:25 -08:00
Alan Cox
e161dfa918 Fix pmap_is_prefaultable() on arm64 and riscv
The current implementations never correctly return TRUE. In all cases,
when they currently return TRUE, they should have returned FALSE.  And,
in some cases, when they currently return FALSE, they should have
returned TRUE.  Except for its effects on performance, specifically,
additional page faults and pointless calls to pmap_enter_quick() that
abort, this error is harmless.  That is why it has gone unnoticed.

Add a comment to the amd64, arm64, and riscv implementations
describing how their return values are computed.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33659
2021-12-27 19:17:14 -06:00
Kyle Evans
e6f760f0e8 sysent: regenerate 2021-12-16 20:56:28 -06:00
Kyle Evans
8494666658 sysent: move away from allowing all compat options for other ABIs
Notably, the current compat_options only makes sense for native and
freebsd32 ABIs.  For the others, it just adds cruft. Switch to having
sets of compat options, and default to the native set.  Setup the other
ABIs where it doesn't make sense to opt-out of the native set.

This removes some redundant COMPAT_FREEBSD* stuff from Linuxolator bits.

line_expr in makesyscalls.lua is fixed to allow empty strings to be
specified, since they're harmless.

Reviewed by:	brooks, kib (both earlier version)
Differential Revision:	https://reviews.freebsd.org/D33356
2021-12-16 20:56:28 -06:00
Warner Losh
b4fba31b63 Remove references to PCMCIA
Remove more references to PCMCIA in kernel config files. We no longer
support PC Card devices.

Sponsored by:		Netflix
2021-12-14 15:27:47 -07:00
Konstantin Belousov
0e6b06d5c8 x86: add a comment providing source for numbers in legacy XSAVE area layout
Suggested by:	Michael Pratt <mpratt@google.com>
Reviewed by:	Michael Pratt <mpratt@google.com>, emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D33423
2021-12-14 19:14:40 +02:00
Konstantin Belousov
73b357be92 amd64: correct size of the SSE area in the xsave layout
PR:	https://github.com/golang/go/issues/46272
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-12-12 20:20:31 +02:00
Warner Losh
21e22be91a ed: Remove options
ed(4) was removed some time ago, but these options relevant to only it
weren't GC'd at the time. Remove them.

Sponsored by:		Netflix
2021-12-09 17:41:39 -07:00
John Baldwin
1a62e9bc00 Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:

- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)

Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).

Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr.  libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).

A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.

Reviewed by:	kib (older version of x86/tls.h), jrtc27
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33351
2021-12-09 13:17:13 -08:00
Mateusz Guzik
8c3149be54 amd64: plug set-but-not-used vars in pmap
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-12-09 16:37:26 +00:00
Konstantin Belousov
b7c55487ff Regen 2021-12-09 02:49:10 +02:00
Brooks Davis
547566526f Make struct syscall_args machine independent
After a round of cleanups in late 2020, all definitions are
functionally identical.

This removes a rotted __aligned(8) on arm. It was added in
b7112ead32 and was intended to align the
args member so that 64-bit types (off_t, etc) could be safely read on
armeb compiled with clang. With the removal of armev, this is no
longer needed (armv7 requires that 32-bit aligned reads of 64-bit
values be supported and we enable such support on armv6).  As further
evidence this is unnecessary, cleanups to struct syscall_args have
resulted in args being 32-bit aligned on 32-bit systems.  The sole
effect is to bloat the struct by 4 bytes.

Reviewed by:	kib, jhb, imp
Differential Revision:	https://reviews.freebsd.org/D33308
2021-12-08 18:45:33 +00:00
Konstantin Belousov
8a4bd7f818 amd64 ia32 vdso: add unwind annotations to the signal trampoline
Reviewed by:	emaste
Discussed with:	jhb, jrtc27
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D32960
2021-12-06 20:47:24 +02:00
Konstantin Belousov
5b8918fac6 amd64 native vdso: add unwind annotations to the signal trampoline
Reviewed by:	emaste
Discussed with:	jhb, jrtc27
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D32960
2021-12-06 20:47:17 +02:00
Konstantin Belousov
98c8b62524 vdso for ia32 on amd64
Reviewed by:	emaste
Discussed with:	jrtc27
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D32960
2021-12-06 20:46:49 +02:00
Konstantin Belousov
ab4524b3d7 amd64: wrap 64bit sigtramp into vdso
Reviewed by:	emaste
Discussed with:	jrtc27
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D32960
2021-12-06 20:46:49 +02:00
Mark Johnston
f06f1d1fdb x86: Deduplicate clock.h
The headers were mostly identical on amd64 and i386.

No functional change intended.

Reviewed by:	cperciva, mav, imp, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33205
2021-12-06 10:39:08 -05:00
Alan Cox
4c57d6d555 amd64/pmap: fix user page table page accounting
When a superpage mapping is destroyed and the original page table page
containing 4KB mappings that was being held in reserve is deallocated,
the recently introduced user page table page count was not being
decremented.  Consequentially, the count was wrong and would grow over
time.  For example, after multiple iterations of "buildworld", I was
seeing implausible counts, like the following:

vm.pmap.kernel_pt_page_count: 2184
vm.pmap.user_pt_page_count: 2280849
vm.pmap.pv_page_count: 106

With this change, I now see:

vm.pmap.kernel_pt_page_count: 2183
vm.pmap.user_pt_page_count: 344
vm.pmap.pv_page_count: 105

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D33276
2021-12-05 19:13:43 -06:00
Mitchell Horne
03b3d7bbec x86: remove unused T_USER flag
It stopped being used in 3c256f5395, when trap() was reorganized to
have separate switch statements for user and kernel traps. Remove the
two leftover references and the flag itself.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D33253
2021-12-05 11:12:40 -04:00
Ed Maste
28dcccc129 x86 GENERIC/MINIMAL: group sc(4) devices together
The vga and splash devices are part of the sc(4) system console. vt(4)
uses the vt_vga driver instead, and has some limited splash support
directly in vt_core.c.  Leave the sc(4) options in GENERIC/MINIMAL (for
now) but group them together under an sc(4) comment.

Sponsored by:	The FreeBSD Foundation
2021-11-28 14:38:41 -05:00
Ed Maste
777526ed83 Remove options VESA from x86 MINIMAL
Followup to b8cf1c5c30, remove from MINIMAL in addition to GENERIC.

options VESA / vesa.ko provides VESA Bios Extensions (VBE) support for
the legacy sc(4) console.  It is not used by the default console, vt(4).

PR:		253733
Fixes:		b8cf1c5c30 ("Remove options VESA from x86 GENERIC")
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
2021-11-28 14:37:46 -05:00
Ed Maste
b8cf1c5c30 Remove options VESA from x86 GENERIC
options VESA / vesa.ko provides VESA Bios Extensions (VBE) support for
the legacy sc(4) console.  It is not used by the default console, vt(4).

There is a report[1] of an incompatibility between VESA and the Nvidia
driver breaking suspend/resume.  Since VESA is not used by the default
configuration anyway, just remove options VESA from GENERIC.  The kernel
module is still available and may be loaded by sc(4) users who want to
select a VBE mode.

(Note that vt(4) does not support selecting a VBE mode.  The loader can
set a VBE mode and vt(4) will use it via the vt_vbefb driver.)

[1] https://lists.freebsd.org/archives/freebsd-hackers/2021-November/000469.html

PR:		253733
Reported by:	Stefan Blachmann [1]
Reviewed by:	imp, manu, tsoome
Relnotes:	Yes
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33141
2021-11-28 11:29:17 -05:00
Ed Maste
228e020a3b Correct syscons description in i386 and amd64 configs
Commit 2d6f6d6373 switched to vt(4) as the default console.

Sponsored by:	The FreeBSD Foundation
2021-11-27 16:22:42 -05:00
Mateusz Guzik
af4051d250 linux: remove the always curthread argument from lconvpath 2021-11-25 22:50:42 +00:00
Warner Losh
8722e05ae1 twa: Remove
Belatedly remove twa(4). It was supposed to go before 13.0, but was
overlooked.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33114
2021-11-25 00:45:13 -07:00
Warner Losh
0d5935af8f esp: Remove
Belatedly remove esp(4). It was tagged as gone in 13, but was overlooked
until now.

Sponsored by:		Netflix
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33115
2021-11-25 00:45:12 -07:00
Warner Losh
60de2867c9 amr: remove
Belatedly remove amr(4). It was slated to depart before 13.0 but was
overlooked until now.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33113
2021-11-25 00:45:12 -07:00
Warner Losh
399188a2c6 iir: Remove
Belatedly remove iir(4). It was slated to go before 13, but was
overlooked.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33112
2021-11-25 00:45:12 -07:00
Warner Losh
a9620045a5 mly: Remove.
We'd said this was going away in 13, but was overlooked. Belatedly
remove.

Sponsored by:		Netflix
Relnotes:		yes
Reviewed by:		scottl
Differential Revision:	https://reviews.freebsd.org/D33111
2021-11-25 00:45:12 -07:00
Mark Johnston
ecbbe83144 netinet: Deduplicate most in_cksum() implementations
in_cksum() and related routines are implemented separately for each
platform, but only i386 and arm have optimized versions.  Other
platforms' copies of in_cksum.c are identical except for style
differences and support for big-endian CPUs.

Deduplicate the implementations for the rest of the platforms.  This
will make it easier to implement in_cksum() for unmapped mbufs.  On arm
and i386, define HAVE_MD_IN_CKSUM to mean that the MI implementation is
not to be compiled.

No functional change intended.

Reviewed by:	kp, glebius
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33095
2021-11-24 13:31:16 -05:00
Mark Johnston
09100f936b netinet: Remove in_cksum_update()
It was never implemented on powerpc or riscv and appears to have been
unused since it was added in 1998.  No functional change intended.

Reviewed by:	kp, glebius, cy
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33093
2021-11-24 13:31:15 -05:00
Brooks Davis
6b7c23a026 syscalls: regen 2021-11-22 22:36:57 +00:00
Brooks Davis
f0cfbffc36 syscalls: regen 2021-11-22 22:36:56 +00:00
Mitchell Horne
10fe6f80a6 minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992
2021-11-19 15:05:52 -04:00
Mitchell Horne
1d2d1418b4 minidump: Use provided msgbuf pointer
Don't assume we are dumping the global message buffer, but use the one
provided by the state argument. While here, drop superfluous
cast to char *.

Reviewed by:	markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31991
2021-11-19 15:05:52 -04:00
Mitchell Horne
681bd71047 minidump: reduce the amount direct accesses to page tables
During a live dump, we may race with updates to the kernel page tables.
This is generally okay; we accept that the state of the system while
dumping may be somewhat inconsistent with its state when the dump was
invoked. However, when walking the kernel page tables, it is important
that we load each PDE/PTE only once while operating on it. Otherwise, it
is possible to have the relevant PTE change underneath us. For example,
after checking the valid bit, but before reading the physical address.

Convert the loads to atomics, and add some validation around the
physical addresses, to ensure that we do not try to dump a non-existent
or non-canonical physical address.

Similarly, don't read kernel_vm_end more than once, on the off chance
that pmap_growkernel() is called between the two page table walks.

Reviewed by:	kib, markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31990
2021-11-19 15:05:52 -04:00
Mitchell Horne
90d4da6225 amd64: provide PHYS_IN_DMAP() and VIRT_IN_DMAP()
It is useful for quickly checking an address against the DMAP region.
These definitions exist already on arm64 and riscv.

Reviewed by:	kib, markj
MFC after:	3 days
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D32962
2021-11-19 15:05:52 -04:00
Mitchell Horne
1adebe3cd6 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989
2021-11-19 15:05:52 -04:00
Brooks Davis
ad58266704 freebsd32: remove redundant no-arg syscalls
pipe requires no special handling.

ofreebsd32_sigpending did differ from osigpending in that it acted
on the siglist rather than the sigqueue, but this appears to be an
oversight in 3fbdb3c215.

ogetpagesize could theoretically have ABI-dependent results, but in
practice does not. If it does it would be easy handle in the central
implementation and be the least of the problems in changing the value of
PAGE_SIZE.

Reviewed by:	kevans
2021-11-17 20:12:24 +00:00
Kristof Provost
4e85b64890 Add a COMPAT_FREEBSD13 kernel option
Use it wherever COMPAT_FREEBSD11 is currently specified.

Reviewed by:	jhb (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33005
2021-11-17 03:08:40 +01:00
Mark Johnston
ab12e8db29 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:31 -05:00
Mark Johnston
71e6e9da22 amd64: Initialize kernel_pmap's active CPU set to all_cpus
This is in preference to simply filling the cpuset, and allows the
conditional in pmap_invalidate_cpu_mask() to be elided.

Also export pmap_invalidate_cpu_mask() outside of pmap.c for use in a
subsequent commit.

Suggested by:	kib
Reviewed by:	alc, kib
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:30 -05:00
Mark Johnston
42c2cd1ffb amd64: Annotate an unlikely condition in smp_targeted_tlb_shootdown()
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-15 13:01:30 -05:00
Konstantin Belousov
c5658876b4 amd64/ia32/ia32_signal.c: Use ANSI C functions definitions
Remove MPSAFE annotations.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-11-13 19:31:53 +02:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overriden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Revired by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Warner Losh
eb6b6622fe Update MINIMAL to have CC options
The MINIMAL configs were overlooked. They are compiled as part of
universe, so this broke universe builds. Add the same defafults as for
GENERIC.

Sponsored by:		Netflix
2021-11-12 09:33:18 -07:00
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Edward Tomasz Napierala
0bf8d5d5f4 linux: Replace ifdefs in ptrace with per-architecture callbacks
It's a cleanup; no (intended) functional changes.

Sponsored By:	EPSRC
Reviewed By:	kib
Differential Revision:	https://reviews.freebsd.org/D32888
2021-11-09 11:59:17 +00:00
Edward Tomasz Napierala
a90ff3c4bc linux: Add ptrace(2) support on arm64
This moves linux_ptrace.c from sys/amd64/linux/ to sys/compat/linux/,
making it possible to use it on architectures other than amd64.
It also enables Linux ptrace(2) on arm64.

Relnotes:	yes
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32868
2021-11-07 08:39:24 +00:00
Edward Tomasz Napierala
3be6e606d7 linux: Fix another amd64-specific piece of linux_ptrace.c
This was missed in c91d0e59be.  No functional changes.

Sponsored By:	EPSRC
2021-11-06 08:28:11 +00:00
Mark Johnston
175d3380a3 amd64: Deduplicate routines for expanding KASAN/KMSAN shadow maps
When working on the ports these functions were slightly different, but
now there's no reason for them to be separate.

No functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-03 12:36:02 -04:00
Edward Tomasz Napierala
c91d0e59be linux: Make linux_ptrace.c portable
Make sys/amd64/linux/linux_ptrace.c machine-independent,
in preparation for moving it into sys/compat/linux/.
No functional changes.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32756
2021-11-03 08:54:35 +00:00
Edward Tomasz Napierala
f0d9a6a781 linux: make PTRACE_SETREGS use a correct struct
Note that this is largely untested at this point, as was
the previous version; I'm committing this mostly to get
rid of `struct linux_pt_reg`.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32735
2021-10-30 10:13:37 +01:00
Edward Tomasz Napierala
ad0379660d linux: make PTRACE_GETREGS return correct struct
Previously it returned a shorter struct.  I can't find any
modern software that uses it, but tests/ptrace from strace(1)
repo complained.

Differential Revision: https://reviews.freebsd.org/D32601
2021-10-29 16:18:28 +01:00
Edward Tomasz Napierala
f939dccfd7 linux: Make PTRACE_GETREGSET return proper buffer size
This fixes Chrome warning:

[1022/152319.328632:ERROR:ptracer.cc(476)] Unexpected registers size 0 != 216, 68

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D32616
2021-10-29 15:31:33 +01:00
Edward Tomasz Napierala
6547153e46 linux: Fix ptrace panic with ERESTART
Translate ERESTART into Linux "internal" errno ERESTARTSYS.
This fixes the erestartsys.gen.test from strace(1).

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32623
2021-10-29 14:55:59 +01:00
Konstantin Belousov
0b3bc72889 amd64 pmap: adjust the empty pmap optimization in pmap_remove()
to match the added accounting of the top-level page table pages.

Reviewed by:	markj
Tested by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32569
2021-10-28 22:01:58 +03:00