Commit Graph

8646 Commits

Author SHA1 Message Date
Mark Johnston
ab12e8db29 amd64: Reduce the amount of cpuset copying done for TLB shootdowns
We use pmap_invalidate_cpu_mask() to get the set of active CPUs.  This
(32-byte) set is copied by value through multiple frames until we get to
smp_targeted_tlb_shootdown(), where it is copied yet again.

Avoid this copying by having smp_targeted_tlb_shootdown() make a local
copy of the active CPUs for the pmap, and drop the cpuset parameter,
simplifying callers.  Also leverage the use of the non-destructive
CPU_FOREACH_ISSET to avoid unneeded copying within
smp_targeted_tlb_shootdown().

Reviewed by:	alc, kib
Tested by:	pho
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:31 -05:00
Mark Johnston
71e6e9da22 amd64: Initialize kernel_pmap's active CPU set to all_cpus
This is in preference to simply filling the cpuset, and allows the
conditional in pmap_invalidate_cpu_mask() to be elided.

Also export pmap_invalidate_cpu_mask() outside of pmap.c for use in a
subsequent commit.

Suggested by:	kib
Reviewed by:	alc, kib
Tested by:	pho
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32792
2021-11-15 13:01:30 -05:00
Mark Johnston
42c2cd1ffb amd64: Annotate an unlikely condition in smp_targeted_tlb_shootdown()
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-15 13:01:30 -05:00
Konstantin Belousov
c5658876b4 amd64/ia32/ia32_signal.c: Use ANSI C functions definitions
Remove MPSAFE annotations.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-11-13 19:31:53 +02:00
Warner Losh
7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overriden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Revired by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Warner Losh
eb6b6622fe Update MINIMAL to have CC options
The MINIMAL configs were overlooked. They are compiled as part of
universe, so this broke universe builds. Add the same defafults as for
GENERIC.

Sponsored by:		Netflix
2021-11-12 09:33:18 -07:00
Randall Stewart
b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Edward Tomasz Napierala
0bf8d5d5f4 linux: Replace ifdefs in ptrace with per-architecture callbacks
It's a cleanup; no (intended) functional changes.

Sponsored By:	EPSRC
Reviewed By:	kib
Differential Revision:	https://reviews.freebsd.org/D32888
2021-11-09 11:59:17 +00:00
Edward Tomasz Napierala
a90ff3c4bc linux: Add ptrace(2) support on arm64
This moves linux_ptrace.c from sys/amd64/linux/ to sys/compat/linux/,
making it possible to use it on architectures other than amd64.
It also enables Linux ptrace(2) on arm64.

Relnotes:	yes
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32868
2021-11-07 08:39:24 +00:00
Edward Tomasz Napierala
3be6e606d7 linux: Fix another amd64-specific piece of linux_ptrace.c
This was missed in c91d0e59be.  No functional changes.

Sponsored By:	EPSRC
2021-11-06 08:28:11 +00:00
Mark Johnston
175d3380a3 amd64: Deduplicate routines for expanding KASAN/KMSAN shadow maps
When working on the ports these functions were slightly different, but
now there's no reason for them to be separate.

No functional change intended.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-11-03 12:36:02 -04:00
Edward Tomasz Napierala
c91d0e59be linux: Make linux_ptrace.c portable
Make sys/amd64/linux/linux_ptrace.c machine-independent,
in preparation for moving it into sys/compat/linux/.
No functional changes.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32756
2021-11-03 08:54:35 +00:00
Edward Tomasz Napierala
f0d9a6a781 linux: make PTRACE_SETREGS use a correct struct
Note that this is largely untested at this point, as was
the previous version; I'm committing this mostly to get
rid of `struct linux_pt_reg`.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32735
2021-10-30 10:13:37 +01:00
Edward Tomasz Napierala
ad0379660d linux: make PTRACE_GETREGS return correct struct
Previously it returned a shorter struct.  I can't find any
modern software that uses it, but tests/ptrace from strace(1)
repo complained.

Differential Revision: https://reviews.freebsd.org/D32601
2021-10-29 16:18:28 +01:00
Edward Tomasz Napierala
f939dccfd7 linux: Make PTRACE_GETREGSET return proper buffer size
This fixes Chrome warning:

[1022/152319.328632:ERROR:ptracer.cc(476)] Unexpected registers size 0 != 216, 68

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D32616
2021-10-29 15:31:33 +01:00
Edward Tomasz Napierala
6547153e46 linux: Fix ptrace panic with ERESTART
Translate ERESTART into Linux "internal" errno ERESTARTSYS.
This fixes the erestartsys.gen.test from strace(1).

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32623
2021-10-29 14:55:59 +01:00
Konstantin Belousov
0b3bc72889 amd64 pmap: adjust the empty pmap optimization in pmap_remove()
to match the added accounting of the top-level page table pages.

Reviewed by:	markj
Tested by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32569
2021-10-28 22:01:58 +03:00
Konstantin Belousov
e93b5adb6b amd64 pmap: account for the top-level pages
both for kernel and user page tables, the later exist in the PTI case.

Reviewed by:	markj
Tested by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32569
2021-10-28 22:01:58 +03:00
Edward Tomasz Napierala
2ec26ae402 linux: Improve debug for PTRACE_GETEVENTMSG
No functional changes.

Sponsored By:	EPSRC
2021-10-23 19:53:12 +01:00
Edward Tomasz Napierala
6e66030c4c linux: implement PTRACE_EVENT_EXEC
This fixes strace(1) from Ubuntu Focal.

Reviewed By:	jhb
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32367
2021-10-23 19:46:26 +01:00
Edward Tomasz Napierala
2558bb8e91 linux: Make PTRACE_GET_SYSCALL_INFO handle EJUSTRETURN
This fixes panic when trying to run strace(8) from Focal.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32355
2021-10-23 18:56:39 +01:00
Edward Tomasz Napierala
e3a83df119 linux: Improve debug for PTRACE_GETREGSET
No functional changes.

Sponsored By:	EPSRC
2021-10-23 09:30:06 +01:00
Edward Tomasz Napierala
3417c29851 linux: Constify bsd_to_linux_regset()
No functional changes.

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32599
2021-10-23 08:33:58 +01:00
Mark Johnston
4c812fe61b vlapic: Schedule callouts on the local CPU
The virtual LAPIC driver uses callouts to implement the LAPIC timer.
Callouts are armed using callout_reset_sbt(), which currently puts
everything on CPU 0.  On systems running many bhyve VMs this results in
a large amount of contention for CPU 0's callout lock.

Modify vlapic to schedule callouts on the local CPU instead.  This
allows timer interrupts to be scheduled more evenly among CPUs where
bhyve is running.

Reviewed by:	grehan, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32559
2021-10-19 21:22:57 -04:00
Mark Johnston
34fac29e98 amd64: Add comments to pmap_pinit_type()
... explaining why we don't pass the pmap pointer to
pmap_alloc_pt_page().

Reported by:	alc
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32528
2021-10-19 21:22:57 -04:00
Mark Johnston
ff93447d8e Use the vm_radix_init() helper when initializing pmaps
No functional change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32527
2021-10-19 21:22:56 -04:00
Mark Johnston
84c3922243 Convert consumers to vm_page_alloc_noobj_contig()
Remove now-unneeded page zeroing.  No functional change intended.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32006
2021-10-19 21:22:56 -04:00
Mark Johnston
a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Mark Johnston
de8554295b cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it
This implementation is faster and doesn't modify the cpuset, so it lets
us avoid some unnecessary copying as well.  No functional change
intended.

This is a re-application of commit
9068f6ea69.

Reviewed by:	cem, kib, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32029
2021-10-18 09:56:58 -04:00
Mark Johnston
b0423d0f5e amd64: Zero the PML5 PTI page when initializing a pmap
The root page is not zeroed at allocation time since with 4-level tables
each entry is copied from a template.  However, with 5-level tables only
a single entry is filled, so the rest need to be cleared.

Reported by:	alc
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32525
2021-10-18 09:50:42 -04:00
Edward Tomasz Napierala
a03d4d73e4 linux: Improve debugging for PTRACE_GETREGSET
It's triggered by gdb(1).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32456
2021-10-17 12:53:16 +01:00
Edward Tomasz Napierala
f9246e1484 linux: Implement some bits of PTRACE_PEEKUSER
This makes Linux gdb from Bionic a little less broken.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32455
2021-10-17 12:20:21 +01:00
Edward Tomasz Napierala
75a9d95b4d linux: Adjust PTRACE_GET_SYSCALL_INFO buffer size semantics
The tests/ptrace_syscall_info test from strace(1) complained
about this.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32368
2021-10-17 11:49:46 +01:00
Konstantin Belousov
e81e77c5a0 Enable PPS_SYNC on amd64, arm64 and armv7
Remove the option from NOTES/LINT, and add to NOTES for powerpc and
riscv.

PR:	259036
Requested by:	John Hay <john@sanren.ac.za>
Discussed with:	ian, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-10 22:34:40 +03:00
Konstantin Belousov
ce21d4bff1 amd64 efirt: do not flush cache for runtime pages
We actually do not know is it safe or not to flush cache for random
BAR/register page existing in the system.  It is well-known that for
instance LAPICs cannot tolerate cache flush.  As report indicates,
there are more such devices.

This issue typically affects AMD machines which do not report self-snoop,
causing real CLFLUSH invocation on the mapped pages.  Intels do self-snoop,
so this change should be nop for them, and unsafe devices, if any, are
already ignored.

Reported and tested by:	manu
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32318
2021-10-06 05:53:20 +03:00
Konstantin Belousov
33c17670af amd64: add pmap_page_set_memattr_noflush()
Similar to pmap_page_set_memattr() by setting MD page cache attribute
to the argument.  Unlike pmap_page_set_memattr(), does not flush cache
for the direct mapping of the page.

Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32318
2021-10-06 05:53:12 +03:00
Mitchell Horne
ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Mitchell Horne
31991a5a45 minidump: De-duplicate is_dumpable()
The function is identical in each minidump implementation, so move it to
vm_phys.c. The only slight exception is powerpc where the function was
public, for use in moea64_scan_pmap().

Reviewed by:	kib, markj, imp (earlier version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31884
2021-09-29 16:41:52 -03:00
Kirk McKusick
1c8d670cb6 Bring the tags and links entries for amd64 up to date.
MFC after:    1 week
Sponsored by: Netflix
2021-09-27 20:04:51 -07:00
Konstantin Belousov
b1e2f063ae amd64 sendsig: fix context corruption
Drop fpstate only after copying out xfpustate from the thread usermode
save area. Otherwise a context switch between get_fpcontext(), which now
returns the pointer directly into user save area, and copyout, would
cause reinit of the save area, loosing user registers.

Reported, reviewed, and tested by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential revision:	https://reviews.freebsd.org/D32159
2021-09-27 20:12:46 +03:00
Mark Johnston
f766826fe3 amd64: Remove proc0_tf, the bootstrap trapframe
It no longer serves any purpose as thread0's td_frame field is now
initialized during fpuinitstate().  No functional change intended.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32057
2021-09-25 10:18:52 -04:00
Mark Johnston
ca1e447b10 amd64: Avoid copying td_frame from kernel procs
When creating a new thread, we unconditionally copy td_frame from the
creating thread.  For threads which never return to user mode, this is
unnecessary since td_frame just points to the base of the stack or a
random interrupt frame.

If KASAN is configured this copying may also trigger false positives
since the td_frame region may contain poisoned stack regions.  It was
not noticed before since thread0 used a dummy proc0_tf trapframe, and
kernel procs are generally created by thread0.  Since commit
df8dd6025a, though, we call
cpu_thread_alloc(&thread0) when initializing FPU state, which
reinitializes thread0.td_frame.

Work around the problem by not copying the frame unless the copying
thread came from user mode.  While here, de-duplicate the copying and
remove redundant re(initialization) of td_frame.

Reported by:	syzbot+2ec89312bffbf38d9aec@syzkaller.appspotmail.com
Reviewed by:	kib
Fixes:		df8dd6025a
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32057
2021-09-25 10:18:30 -04:00
Konstantin Belousov
e36d0e86e3 Revert "linux32: add a hack to avoid redefining the type of the savefpu tag"
This reverts commit 0f6829488e.
Also it changes the type of md_usr_fpu_save struct mdthread member
to void *, which is what uncovered this trouble.  Now the save area
is untyped, but since it is hidden behind accessors, it is not too
significant.  Since apparently there are consumers affected outside
the tree, this hack is better than one from the reverted revision.

PR:	258678
Reported by:	cy
Reviewed by:	cy, kevans, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32060
2021-09-22 23:17:47 +03:00
Konstantin Belousov
cf0ee8738e Drop cloudabi
According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by:	ed (private mail)
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Konstantin Belousov
c2ee4dfd04 ia32_get_fpcontext(): xfpusave can be legitimately NULL
Reported by:	cy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Fixes:	bd9e0f5df6
2021-09-22 00:17:06 +03:00
Mark Johnston
bcdc599dc2 Revert "cpuset(9): Add CPU_FOREACH_IS(SET|CLR) and modify consumers to use it"
This reverts commit 9068f6ea69.

The underlying macro needs to be reworked to avoid problems with control
flow statements.

Reported by:	rlibby
2021-09-21 13:51:42 -04:00
Konstantin Belousov
2e79a21632 amd64: consistently use uprintf() to report weird situations in sigreturn
Reviewed by:	jhb
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
bd9e0f5df6 amd64: eliminate td_md.md_fpu_scratch
For signal send, copyout from the user FPU save area directly.

For sigreturn, we are in sleepable context and can do temporal
allocation of the transient save area.  We cannot copying from userspace
directly to user save area because XSAVE state needs to be validated,
also partial copyins can corrupt it.

Requested by:	jhb
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
df8dd6025a amd64: stop using top of the thread' kernel stack for FPU user save area
Instead do one more allocation at the thread creation time.  This frees
a lot of space on the stack.

Also do not use alloca() for temporal storage in signal delivery sendsig()
function and signal return syscall sys_sigreturn().  This saves equal
amount of space, again by the cost of one more allocation at the thread
creation time.

A useful experiment now would be to reduce KSTACK_PAGES.

Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00
Konstantin Belousov
9151abe323 exec_machdep.c: some style, use ANSI C definition for sys_sigreturn()
Reviewed by:	jhb, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31954
2021-09-21 20:20:15 +03:00