1015 Commits

Author SHA1 Message Date
Mark Johnston
1811c1e957 exec: Reimplement stack address randomization
The approach taken by the stack gap implementation was to insert a
random gap between the top of the fixed stack mapping and the true top
of the main process stack.  This approach was chosen so as to avoid
randomizing the previously fixed address of certain process metadata
stored at the top of the stack, but had some shortcomings.  In
particular, mlockall(2) calls would wire the gap, bloating the process'
memory usage, and RLIMIT_STACK included the size of the gap so small
(< several MB) limits could not be used.

There is little value in storing each process' ps_strings at a fixed
location, as only very old programs hard-code this address; consumers
were converted decades ago to use a sysctl-based interface for this
purpose.  Thus, this change re-implements stack address randomization by
simply breaking the convention of storing ps_strings at a fixed
location, and randomizing the location of the entire stack mapping.
This implementation is simpler and avoids the problems mentioned above,
while being unlikely to break compatibility anywhere the default ASLR
settings are used.

The kern.elfN.aslr.stack_gap sysctl is renamed to kern.elfN.aslr.stack,
and is re-enabled by default.

PR:		260303
Reviewed by:	kib
Discussed with:	emaste, mw
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:12:36 -05:00
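A toy model may help contrast the two schemes described in commit 1811c1e957. It is only a sketch: the address, the sizes, and the function names are hypothetical and are not taken from the FreeBSD sources. It assumes, as the message states, that under the old scheme the gap counted against RLIMIT_STACK, while under the new scheme the whole mapping is shifted.

```python
# Toy model only: the address and sizes below are hypothetical, not FreeBSD's.
MB = 1 << 20
FIXED_TOP = 0x7FFFFFFFF000  # assumed fixed top of the stack mapping


def stack_gap_layout(rlimit_stack: int, gap: int) -> dict:
    """Old scheme: the mapping top stays fixed and a random gap sits below it.

    The gap is charged against RLIMIT_STACK, so a small limit can leave
    no usable stack at all.
    """
    return {
        "metadata_top": FIXED_TOP,        # ps_strings stays at a fixed address
        "stack_top": FIXED_TOP - gap,     # true top of the main stack
        "usable_stack": max(0, rlimit_stack - gap),
    }


def randomized_mapping_layout(rlimit_stack: int, offset: int) -> dict:
    """New scheme: the entire mapping moves, so metadata moves with it."""
    top = FIXED_TOP - offset
    return {
        "metadata_top": top,              # ps_strings is no longer fixed
        "stack_top": top,
        "usable_stack": rlimit_stack,     # nothing is charged against the limit
    }
```

With a 1 MB limit and a 2 MB random displacement, the first function reports no usable stack, while the second leaves the full 1 MB available.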
Mark Johnston
706f4a81a8 exec: Introduce the PROC_PS_STRINGS() macro
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro.  With stack address randomization the
ps_strings address is no longer fixed.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston
3fc21fdd5f sysent: Add a sv_psstringssz field to struct sysentvec
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:42:07 -05:00
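The address computation that commits 706f4a81a8 and 3fc21fdd5f describe can be sketched as follows. This is a Python model, not the kernel macro: the class, the function name, the example addresses, and the structure sizes are assumptions for illustration. The only idea taken from the messages is that the ps_strings address is derived from the top of the stack and a per-ABI structure size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sysentvec:
    """Minimal stand-in for struct sysentvec (hypothetical model)."""
    name: str
    psstrings_size: int  # models the sv_psstringssz field


def ps_strings_addr(stack_top: int, sv: Sysentvec) -> int:
    """Locate ps_strings relative to the (possibly randomized) stack top."""
    return stack_top - sv.psstrings_size


# Illustrative sizes: two pointers plus two counters, with 64-bit padding.
NATIVE64 = Sysentvec("native 64-bit", psstrings_size=32)
COMPAT32 = Sysentvec("32-bit compat", psstrings_size=16)
```

For example, `ps_strings_addr(0x7FFFFFFFF000, NATIVE64)` yields `0x7FFFFFFFEFE0`, and the same call with a different stack top follows the randomized mapping without any per-ABI constant address.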
Brooks Davis
0910a41ef3 Revert "syscallarg_t: Add a type for system call arguments"
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.

This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
2022-01-12 23:29:20 +00:00
Brooks Davis
3889fb8af0 sysent: regen for syscallarg_t 2022-01-12 22:51:25 +00:00
Mark Johnston
f04a096049 exec: Simplify sv_copyout_strings implementations a bit
Simplify control flow around handling of the execpath length and signal
trampoline.  Cache the sysentvec pointer in a local variable.

No functional change intended.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33703
2021-12-31 12:50:15 -05:00
Kyle Evans
e6f760f0e8 sysent: regenerate 2021-12-16 20:56:28 -06:00
Kyle Evans
8494666658 sysent: move away from allowing all compat options for other ABIs
Notably, the current compat_options only makes sense for native and
freebsd32 ABIs.  For the others, it just adds cruft. Switch to having
sets of compat options, and default to the native set.  Setup the other
ABIs where it doesn't make sense to opt-out of the native set.

This removes some redundant COMPAT_FREEBSD* stuff from Linuxolator bits.

line_expr in makesyscalls.lua is fixed to allow empty strings to be
specified, since they're harmless.

Reviewed by:	brooks, kib (both earlier version)
Differential Revision:	https://reviews.freebsd.org/D33356
2021-12-16 20:56:28 -06:00
Konstantin Belousov
b7c55487ff Regen 2021-12-09 02:49:10 +02:00
Mateusz Guzik
af4051d250 linux: remove the always curthread argument from lconvpath 2021-11-25 22:50:42 +00:00
Brooks Davis
6b7c23a026 syscalls: regen 2021-11-22 22:36:57 +00:00
Brooks Davis
f0cfbffc36 syscalls: regen 2021-11-22 22:36:56 +00:00
Edward Tomasz Napierala
4dfd612286 linux: mv sys/i386/linux/linux_ptrace{,_machdep}.c
In preparation for machine-independent sys/compat/linux/linux_ptrace.c,
rename the i386-specific Linux ptrace(2) implementation.  No functional
changes.

Sponsored By:	EPSRC
Differential Revision: https://reviews.freebsd.org/D32757
2021-11-03 08:50:17 +00:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preparation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Dmitry Chagin
b356030e67 linux(4): Regen for clone3 system call.
MFC after:		2 weeks
2021-08-12 11:50:22 +03:00
Dmitry Chagin
17913b0b6b linux(4): Implement clone3 system call.
clone3 system call is used by glibc-2.34.

Differential revision:	https://reviews.freebsd.org/D31475
MFC after:		2 weeks
2021-08-12 11:49:36 +03:00
Dmitry Chagin
0a4b664ae8 linux(4): Add struct clone_args for future clone3 system call.
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.

Differential revision:	https://reviews.freebsd.org/D31474
MFC after:		2 weeks
2021-08-12 11:49:01 +03:00
Dmitry Chagin
0c08f34f4d linux(4): Regen for clone syscall.
MFC after:		2 weeks
2021-08-12 11:47:31 +03:00
Dmitry Chagin
f1c450492f linux(4): Change clone syscall definition to match the actual Linux one.
Differential revision:	https://reviews.freebsd.org/D31473
MFC after:		2 weeks
2021-08-12 11:46:36 +03:00
Dmitry Chagin
de8374df28 fork: Allow ABI to specify fork return values for child.
At least the Linux x86 ABI does not use the carry bit and expects the dx register
to be preserved. To support this, add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31472
MFC after:		2 weeks
2021-08-12 11:45:25 +03:00
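The per-ABI hook from commit de8374df28 can be modelled roughly as below. This is a sketch in Python rather than kernel C. The register dictionary, the hook table, and the exact native behaviour (zeroing ax, setting dx, clearing carry) are assumptions made for illustration. Only the Linux side, which leaves dx and the carry bit alone, follows directly from the message.

```python
def native_set_fork_retval(regs: dict) -> None:
    """Assumed native x86 behaviour: child sees 0, dx is set, carry cleared."""
    regs["ax"] = 0
    regs["dx"] = 1
    regs["carry"] = False


def linux_set_fork_retval(regs: dict) -> None:
    """Linux x86 ABI: return 0 in the child, leave dx and carry untouched."""
    regs["ax"] = 0


# Hypothetical table standing in for the sv_set_fork_retval field per ABI.
FORK_RETVAL_HOOKS = {
    "native": native_set_fork_retval,
    "linux": linux_set_fork_retval,
}


def child_registers(parent_regs: dict, abi: str) -> dict:
    """Copy the parent's registers, then let the ABI hook fix up the child."""
    regs = dict(parent_regs)
    FORK_RETVAL_HOOKS[abi](regs)
    return regs
```

Calling `child_registers({"ax": 57, "dx": 0xDEAD, "carry": True}, "linux")` keeps `dx` at `0xDEAD`, whereas the `"native"` hook overwrites it.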
Dmitry Chagin
bee191e46f linux(4): Regen for faccessat2 system call.
MFC after:		2 weeks
2021-08-12 11:41:35 +03:00
Dmitry Chagin
13d79be995 linux(4): Implement faccessat2 system call.
It's used by bash on arm64 with glibc-2.32.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D31345
MFC after:		2 weeks
2021-08-12 11:40:42 +03:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Dmitry Chagin
741f80df53 linux(4): Eliminate an accidental comment.
MFC after:		2 weeks
2021-07-29 12:51:56 +03:00
Dmitry Chagin
0dc38e3303 linux(4): Reimplement futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31236
MFC after:		2 weeks
2021-07-29 12:43:48 +03:00
Dmitry Chagin
f337940144 linux(4): Fix gcc build.
gcc failed because it did not inline the builtins and instead generated calls to
libgcc, which ld could not find as the cross-toolchain libgcc is not installed.
To avoid this, add internal vDSO ffs functions that do not use optimized builtins.

Reported by:		jhb
MFC after:		2 weeks
2021-07-29 09:52:33 +03:00
Edward Tomasz Napierala
30c6d98219 linux: implement sigaltstack(2) on arm64
... by making it machine-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31286
2021-07-27 13:34:49 +00:00
Edward Tomasz Napierala
72f7ddb587 linux: implement rt_sigsuspend(2) on arm64
... by making it architecture-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31259
2021-07-23 20:13:00 +00:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
09cffde975 linux(4): Fixup the vDSO initialization order.
The vDSO initialisation order should be as follows:
- native abi init via exec_sysvec_init();
- vDSO symbols queued to the linux_vdso_syms list;
- linux_vdso_install();
- linux_exec_sysvec_init();

As exec_sysvec_init() is called with SI_ORDER_ANY (last) at the SI_SUB_EXEC
order, move linux_vdso_install() and linux_exec_sysvec_init() to the
SI_SUB_EXEC+1 order.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D30902
MFC after:		2 weeks
2021-07-20 10:02:34 +03:00
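The ordering argument in commit 09cffde975 can be illustrated with a small model of subsystem/order sorting. This is a sketch only: the numeric values are hypothetical, and the within-subsystem orders given to the three Linux steps are assumptions chosen to reproduce the sequence listed in the message. The commit itself only states that the two Linux functions move to SI_SUB_EXEC+1.

```python
# Hypothetical numeric values; only their relative order matters here.
SI_SUB_EXEC = 100
SI_ORDER_FIRST = 0
SI_ORDER_SECOND = 1
SI_ORDER_ANY = 10**9  # "last" within a subsystem

# (subsystem, order, name) triples, deliberately listed out of order.
INITS = [
    (SI_SUB_EXEC + 1, SI_ORDER_ANY, "linux_exec_sysvec_init"),
    (SI_SUB_EXEC + 1, SI_ORDER_SECOND, "linux_vdso_install"),
    (SI_SUB_EXEC, SI_ORDER_ANY, "exec_sysvec_init"),
    (SI_SUB_EXEC + 1, SI_ORDER_FIRST, "queue vDSO symbols"),
]


def run_order(inits):
    """Return initializer names sorted by subsystem, then by order."""
    return [name for _, _, name in sorted(inits, key=lambda t: (t[0], t[1]))]
```

Because the native initializer already runs last within SI_SUB_EXEC, anything that must follow it needs a later subsystem value, which is why the model places the Linux steps at `SI_SUB_EXEC + 1`.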
Dmitry Chagin
a543556c81 linux(4): Constify vdso install/deinstall.
In order to reduce the diff between arches, constify the vdso install/deinstall
functions as on arm64.

Reviewed by:		emaste
Differential revision:	https://reviews.freebsd.org/D30901
MFC after:		2 weeks
2021-07-20 10:01:47 +03:00
Dmitry Chagin
9931033bbf linux(4): Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, and has no process-private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid context-switch overhead, frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

The system call implementation is based on the native timekeeping code, with
some limitations:
- ifunc can't be used, as the vDSO is mapped r/o into the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case of any error, vDSO system calls fall back to the kernel system
calls. For unimplemented vDSO system calls, prototypes are added which call
the corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
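The fallback behaviour described in commit 9931033bbf follows a common fast-path/slow-path pattern, sketched here in Python. Nothing below is the Linuxulator's code: the timekeeper dictionary, the `"tsc"` and `"hpet"` source names, and all function names are hypothetical, and `time.monotonic_ns()` merely stands in for a real kernel system call.

```python
import time


class Unsupported(Exception):
    """Raised when the fast path cannot serve the request."""


def fast_clock_gettime(timekeeper: dict) -> int:
    """Userspace fast path: compute the time from shared timekeeping data."""
    if timekeeper.get("source") != "tsc":
        # Models the 'reading HPET memory is not implemented' limitation.
        raise Unsupported(timekeeper.get("source"))
    ticks = timekeeper["read_counter"]()
    return timekeeper["base_ns"] + ticks * timekeeper["ns_per_tick"]


def kernel_clock_gettime() -> int:
    """Stand-in for the real system call (slow path)."""
    return time.monotonic_ns()


def vdso_clock_gettime(timekeeper: dict, slow_path=kernel_clock_gettime) -> int:
    """Try the fast path first; on any unsupported case, fall back."""
    try:
        return fast_clock_gettime(timekeeper)
    except Unsupported:
        return slow_path()
```

The caller never sees the difference: an unsupported time source simply costs a system call instead of failing.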
Dmitry Chagin
5fd9cd53d2 linux(4): Modify sv_onexec hook to return an error.
Temporarily add stubs to the Linux emulation layer which call the existing hook.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D30911
MFC after:		2 weeks
2021-07-20 09:56:25 +03:00
David Chisnall
cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
David Chisnall
d2b558281a Revert "Pass the syscall number to capsicum permission-denied signals"
This broke the i386 build.

This reverts commit 3a522ba1bc.
2021-07-10 20:26:01 +01:00
David Chisnall
3a522ba1bc Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-10 17:19:52 +01:00
Edward Tomasz Napierala
435754a59e Add infrastructure required for Linux coredump support
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values.  It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication.  It will be used
for Linux coredumps.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30921
2021-06-29 08:49:12 +01:00
Dmitry Chagin
79617645c6 linux(4): Retire unused declaration.
MFC after:	2 weeks
2021-06-22 08:41:33 +03:00
Dmitry Chagin
4efdf5820e linux(4): Retire a now unused include.
MFC after:	2 weeks
2021-06-22 08:39:47 +03:00
Dmitry Chagin
bfe2903798 linux(4): Do not specify shared page for aout binaries.
In Linux the vDSO is a small shared ELF library, so it is not intended
for aout binaries. This was added by mistake during the 64-bit Linuxulator import.

MFC after:	2 weeks
2021-06-22 08:38:45 +03:00
Dmitry Chagin
c1da89fec2 linux(4): Retire linux_kplatform.
Assuming we can't run on i486 or i586 class CPUs, retire the linux_kplatform
variable and use a hardcoded 'machine' value in linux_newuname().

I had added linux_kplatform for consistency with linux_platform, which is
placed into the vDSO to avoid an extra copyout to the stack for AT_PLATFORM at
exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:		trasz, imp
Differential Revision:	https://reviews.freebsd.org/D30774
MFC after:		2 weeks
2021-06-22 08:36:21 +03:00
Dmitry Chagin
8fe8bb7cb5 linux(4): Regen for linux_poll system call.
MFC after:	2 weeks
2021-06-22 08:09:55 +03:00
Dmitry Chagin
2eff670fde linux(4): Implement poll system call via linux_common_ppol()
for the sake of converting events to/from native.

MFC after:	2 weeks
2021-06-22 08:07:46 +03:00
Dmitry Chagin
26795a0378 linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses the in-kernel ppoll(2) without
converting the user-supplied fd 'events', and does not convert the
kernel-supplied fd 'revents'.

At least POLLRDHUP is handled differently by FreeBSD than by
Linux. It seems that Linux silently ignores POLLRDHUP on non-socket fds,
unlike FreeBSD, which checks more strictly and fails.

Rework the Linux ppoll, using kern_poll and converting the 'events'
and 'revents' values.
While here, move the poll event defines to the MI part of the code, as they
are mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks
2021-06-22 08:06:05 +03:00
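The 'events'/'revents' conversion that commit 26795a0378 introduces can be pictured as a bit-by-bit translation between two flag namespaces. The sketch below is a Python model with hypothetical bit values and function names. It does not reproduce the constants of either system; it assumes only that the same named event may use different bits on each side.

```python
# Hypothetical bit assignments for illustration; not the real ABI values.
LINUX_BITS = {"POLLIN": 0x0001, "POLLOUT": 0x0004, "POLLRDHUP": 0x2000}
NATIVE_BITS = {"POLLIN": 0x0001, "POLLOUT": 0x0004, "POLLRDHUP": 0x4000}


def convert(mask: int, src: dict, dst: dict) -> int:
    """Translate each known flag by name; unknown bits are dropped."""
    out = 0
    for name, bit in src.items():
        if mask & bit:
            out |= dst[name]
    return out


def linux_to_native(events: int) -> int:
    """Convert user-supplied Linux 'events' before polling natively."""
    return convert(events, LINUX_BITS, NATIVE_BITS)


def native_to_linux(revents: int) -> int:
    """Convert kernel-supplied native 'revents' back for the Linux caller."""
    return convert(revents, NATIVE_BITS, LINUX_BITS)
```

Translating by name rather than passing the mask through unchanged is what keeps a flag such as POLLRDHUP from being misread when its bit differs between the two sides.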
Konstantin Belousov
870e197d52 Add quirks for Linux ABI signals handling
Require queueing of the signals with default action, and disable
dequeueing SIGCHLD on wait for live process.

Reported and tested by:	dchagin
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30675
2021-06-16 02:01:35 +03:00
Dmitry Chagin
89f15b79b1 linux(4): Regen for ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:19:12 +03:00
Dmitry Chagin
ed61e0ce1d linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:18:46 +03:00
Dmitry Chagin
981a60f112 linux(4): Regen for pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:04:37 +03:00
Dmitry Chagin
f6d075ecd7 linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:03:30 +03:00
Dmitry Chagin
c002529000 linux(4): Regen for rt_sigtimedwait_time64 system call.
MFC after:	2 weeks
2021-06-10 14:52:43 +03:00