Commit Graph

3957 Commits

Author SHA1 Message Date
Bjoern A. Zeeb
c39eefe715 LinuxKPI: implement dma_set_coherent_mask()
Coherent is lower 32bit only by default in Linux and our only default
dma mask is 64bit currently which violates expectations unless
dma_set_coherent_mask() was called explicitly with a different mask.

Implement coherent by creating a second tag, and storing the tags in the
objects and use the tag from the object wherever possible.
This currently does not update the scatterlist or pool (both could be
converted but S/G cannot be MFCed as easily).

There is a 2nd change embedded in the updated logic of
linux_dma_alloc_coherent() to always zero the allocation as
otherwise some drivers get cranky on uninialised garbage.

Sponsored by:	The FreeBSD Foundation
MFC after:	7 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D32164
2021-09-29 12:41:28 +00:00
Bjoern A. Zeeb
72c89ce97b LinuxKPI: dma-mapping.h unify "mask" and "dma_mask"
In some places we are using "mask" and others "dma_mask" for the
same thing.  Harmonize the various places to "dma_mask" as used in
linux_pci.c.  For the declaration remove the argument names to
avoid the entire problem.

This is in preparation for an upcoming change.
No functional changes intended.

Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2021-09-27 20:53:06 +00:00
Bjoern A. Zeeb
93b14194ac LinuxKPI: disable device_release_driver()
As reported by multiple people testing iwlwifi, device_release_driver()
can lead to a panic on secondary errors (usually during attach).
Disable device_release_driver() for the short-term to prevent the panic
but leave it in place so it can be re-worked and fixed properly for
the long-term more easily.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-09-27 17:45:06 +00:00
Konstantin Belousov
cf0ee8738e Drop cloudabi
According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by:	ed (private mail)
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Mark Johnston
fea1a98ead freebsd32: Fix a double copyin in sendmsg() and recvmsg()
freebsd32_sendmsg() and freebsd32_recvmsg() both copyin the message
header twice, once directly and once in freebsd32_copyinmsghdr().  The
iovec length from the former is used when copying in msg_iov, but the
rest of the kernel uses the iovec length from the latter.  When
kern_sendit() and kern_recvit() iterate over the iovec to compute the
residual for I/O, they can therefore end up walking past the end of the
copied in iovec, either resulting in a system call error, userspace
memory corruption from uiomove() with invalid iovecs, or a kernel page
fault if the copied-in iovec is followed by an unmapped KVA region.

Reported by:	syzbot+7cc64cd0c49605acd421@syzkaller.appspotmail.com
Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32010
2021-09-19 13:54:16 -04:00
Mark Johnston
4bda16ff18 freebsd32: Provide an ANSI definition for freebsd32_recvmsg()
Fix style in the freebsd32_sendmsg() definition.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-19 13:53:57 -04:00
Konstantin Belousov
796a8e1ad1 procctl(2): Add PROC_WXMAP_CTL/STATUS
It allows to override kern.elf{32,64}.allow_wx on per-process basis.
In particular, it makes it possible to run binaries without PT_GNU_STACK
and without elfctl note while allow_wx = 0.

Reviewed by:	brooks, emaste, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31779
2021-09-17 15:42:01 +03:00
Konstantin Belousov
f575573ca5 Remove PT_GET_SC_ARGS_ALL
Reimplement bdf0f24bb1 by checking for the caller' ABI in
the implementation of PT_GET_SC_ARGS, and copying out everything if
it is Linuxolator.

Also fix a minor information leak: if PT_GET_SC_ARGS_ALL is done on the
thread reused after other process, it allows to read some number of that
thread last syscall arguments. Clear td_sa.args in thread_alloc().

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31968
2021-09-16 20:11:27 +03:00
John Baldwin
9553c6af88 <linux/overflow.h>: Don't use __has_builtin().
GCC only added support for __has_builtin in GCC 10.  However, all
supported versions of GCC and clang include these builtins so just use
them unconditionally.

This fixes the build with GCC 9.

Reviewed by:	manu, hselasky, imp
Differential Revision:	https://reviews.freebsd.org/D31942
2021-09-15 09:03:17 -07:00
Edward Tomasz Napierala
bdf0f24bb1 linux: implement PTRACE_GET_SYSCALL_INFO
This is one of the pieces required to make modern (ie Focal)
strace(1) work.

Reviewed By:	jhb (earlier version)
Sponsored by:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D28212
2021-09-14 20:19:55 +00:00
Brooks Davis
df501bac69 syscalls.master: switch to CAPENABLED flags
Switch the main syscall table to use CAPENABLED flags rather than
capabilities.conf.  This avoid synchronization issues between
syscalls.master and capabilities.conf (e.g. when renaming a syscall
during development).

For now, move capabilities.conf to sys/compat/freebsd32 and use it
there.  Use of sys/compat/freebsd32/syscalls.master should be replaced
by makesyscalls.lua enhancements to allow the main one to be used.

This change results in no changes to generated files after running
`make sysent`.

Reviewed by:	kevans, emaste
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D31350
2021-09-01 21:58:16 +01:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Dmitry Chagin
d4da692862 linux(4): Improve comment.
Reported by:	pfg
MFC after:	2 weeks
2021-08-13 11:36:42 +03:00
Dmitry Chagin
aecd31a8a3 linux(4): Remove clone3 and faccessat2 from dummy.
MFC after:		2 weeks
2021-08-12 16:07:21 +03:00
Dmitry Chagin
1af0780b5f linux(4): Move ff variable initialization from declaration.
Modern style(9) allows variables initialization where they are declared,
but in this case initialization obfuscate the code.

MFC after:		2 weeks
2021-08-12 11:57:16 +03:00
Dmitry Chagin
c2cc5345b8 linux(4): Verify that higher 32bits of exit_signal in clone3 are unset.
MFC after:		2 weeks
2021-08-12 11:56:51 +03:00
Dmitry Chagin
4385147547 linux(4): Return ENOSYS for unsupported clone3 option bits.
Differential Revision:	https://reviews.freebsd.org/D31483
MFC after:		2 weeks
2021-08-12 11:56:36 +03:00
Dmitry Chagin
0d77f6c0c3 linux(4): Add LINUX_RATELIMIT_MSG macro for future use.
Differential Revision:	https://reviews.freebsd.org/D31488
MFC after:		2 weeks
2021-08-12 11:55:55 +03:00
Dmitry Chagin
c5fc9fe7f3 linux(4): Implement CLONE_CLEAR_SIGHAND option bit.
CLONE_CLEAR_SIGHAND is designed to reset all signal handlers of the child
not set to SIG_IGN to SIG_DFL.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31481
MFC after:		2 weeks
2021-08-12 11:55:35 +03:00
Dmitry Chagin
a796845d6d linux(4): Add CLONE_PIDFD option bit.
Differential revision:	https://reviews.freebsd.org/D31478
MFC after:		2 weeks
2021-08-12 11:55:24 +03:00
Dmitry Chagin
17913b0b6b linux(4): Implement clone3 system call.
clone3 system call is used by glibc-2.34.

Differential revision:	https://reviews.freebsd.org/D31475
MFC after:		2 weeks
2021-08-12 11:49:36 +03:00
Dmitry Chagin
0a4b664ae8 linux(4): Add struct clone_args for future clone3 system call.
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.

Differential revision:	https://reviews.freebsd.org/D31474
MFC after:		2 weeks
2021-08-12 11:49:01 +03:00
Dmitry Chagin
f1c450492f linux(4): Change clone syscall definition to match Linux actual one.
Differential revision:	https://reviews.freebsd.org/D31473
MFC after:		2 weeks
2021-08-12 11:46:36 +03:00
Dmitry Chagin
de8374df28 fork: Allow ABI to specify fork return values for child.
At least Linux x86 ABI's does not use carry bit and expects that the dx register
is preserved. For this add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31472
MFC after:		2 weeks
2021-08-12 11:45:25 +03:00
Dmitry Chagin
fc37be2460 linux(4): Plug in aarch64 fcntl flags.
Fixes opendir() libc function.

Differential Revision:	https://reviews.freebsd.org/D31357
MFC after:		2 weeks
2021-08-12 11:42:50 +03:00
Dmitry Chagin
13d79be995 linux(4): Implement faccessat2 system call.
It's used by bash on arm64 with glibc-2.32.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D31345
MFC after:		2 weeks
2021-08-12 11:40:42 +03:00
Dmitry Chagin
6e31bed646 linux(4): Fix futex copyrights.
As no more NetBSD code in futexes exists replace NetBSD copyrights by
standard FreeBSD 2 clause license.
Add Roman Divacky's copyrights as an author of the robust futexes.

Differential revision:	https://reviews.freebsd.org/D31347
MFC after:		2 weeks
2021-08-12 11:36:24 +03:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Ka Ho Ng
da9fe3529b Regen after 0dc332bff2 2021-08-05 23:22:02 +08:00
Ka Ho Ng
0dc332bff2 Add fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9).
fspacectl(2) is a system call to provide space management support to
userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the
deallocation. vn_deallocate(9) is a public KPI for kmods' use.

The purpose of proposing a new system call, a KPI and a VOP call is to
allow bhyve or other hypervisor monitors to emulate the behavior of SCSI
UNMAP/NVMe DEALLOCATE on a plain file.

fspacectl(2) comprises of cmd and flags parameters to specify the
space management operation to be performed. Currently cmd has to be
SPACECTL_DEALLOC, and flags has to be 0.

fo_fspacectl is added to fileops.
VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation
of VOP_DEALLOCATE(9) is provided.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D28347
2021-08-05 23:20:42 +08:00
Bjoern A. Zeeb
22e20d852f LinuxKPI: fix bug in le32p_replace_bits()
Fix a bug that slipped in in 90707c4e44
using the correct field in le32p_replace_bits().

MFC after:	3 days
Reviewed by:	hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31352
2021-07-31 22:15:35 +00:00
Hans Petter Selasky
469884cf04 LinuxKPI: Make FPU sections thread-safe and use the NOCTX flag.
Reviewed by:	kib
Submitted by:	greg@unrelenting.technology
Differential Revision:	https://reviews.freebsd.org/D29921
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-31 15:36:48 +02:00
Bjoern A. Zeeb
4c8af633d1 LinuxKPI: bitfield.h cleanup
Add a missing tab and remove an unnecessary return.
No functional changes.

MFC after:	3 days
2021-07-29 21:24:35 +00:00
Dmitry Chagin
2411ac0b89 linux(4): Eliminate a now unused includes after futexes refactoring.
MFC after:		2 weeks
2021-07-29 12:56:39 +03:00
Dmitry Chagin
d90df8ac13 linux(4): Add a comment about wait/requeue pi operations.
MFC after:		2 weeks
2021-07-29 12:55:59 +03:00
Dmitry Chagin
626cbd4648 linux(4): Handle incorrect FUTEX_CLOCK_REALTIME option bit.
Return ENOSYS if the FUTEX_CLOCK_REALTIME option bit is specified for an
inappropriate futex operation.

MFC after:		2 weeks
2021-07-29 12:55:33 +03:00
Dmitry Chagin
a9bb1b1c18 linux(4): Handle FUTEX_LOCK_PI2 oeration.
FUTEX_LOCK_PI2 was added to support clock selection as FUTEX_LOCK_PI uses a
CLOCK_REALTIME based absolute value since it was implemented, but it does not
require that the FUTEX_CLOCK_REALTIME bit is set, because that was introduced
later.

MFC after:		2 weeks
2021-07-29 12:55:02 +03:00
Dmitry Chagin
bd25bf092a linux(4): Use variable name not type for sizeof() to calculate storage size.
MFC after:		2 weeks
2021-07-29 12:54:32 +03:00
Dmitry Chagin
49a5c0409b linux(4): Move len variable initialization to the appropriate place.
MFC after:		2 weeks
2021-07-29 12:54:16 +03:00
Dmitry Chagin
c8e9d2b7eb linux(4): Use linux_tdfind() in get_robust_list.
In the Linux emulation layer linux_tdfind() has a special purpose to
handle glibc specific TID mangling and we should use it instead of tdfind().

MFC after:		2 weeks
2021-07-29 12:53:59 +03:00
Dmitry Chagin
f88d3c522f linux(4): Eliminate unnecessary error initialization.
MFC after:		2 weeks
2021-07-29 12:53:41 +03:00
Dmitry Chagin
6b68e8af1f linux(4): Eliminate unnecessary head initialization.
MFC after:		2 weeks
2021-07-29 12:53:25 +03:00
Dmitry Chagin
971b53fa04 linux(4): style, wrap too long line.
MFC after:		2 weeks
2021-07-29 12:53:07 +03:00
Dmitry Chagin
edd44176aa linux(4): Eliminating remnants of futex sdt.
MFC after:		2 weeks
2021-07-29 12:52:36 +03:00
Dmitry Chagin
b59cf25eac linux(4): Handle special case for regular futex in handle_futex_death().
Handle some races in handle_futex_death() which can prevents a wakeup of
potential waiters which can cause these waiters to block forever.

Differential Revision:	https://reviews.freebsd.org/D31280
MFC after:		2 weeks
2021-07-29 12:51:39 +03:00
Dmitry Chagin
dad1077056 linux(4): Futex address must be 32-bit aligned.
Linux futex documentation explicitly states that EINVAL is returned if
the futex is not 4-byte aligned. Check futex alignment as a Linux do
and return EINVAL.

Differential Revision:	https://reviews.freebsd.org/D31279
MFC after:		2 weeks
2021-07-29 12:50:58 +03:00
Dmitry Chagin
b33e469027 linux(4): Finish cf8d74e3fe.
Add forgotten val3_compare initialization in case of time64 futex.

MFC after:		2 weeks
2021-07-29 12:50:43 +03:00
Dmitry Chagin
4f34dc6453 linux(4): Replace casuword32 by casueword32.
Follow the r349951 (30b3018d), add check to react to stops and requests
to terminate between retries.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31254
MFC after:		2 weeks
2021-07-29 12:50:11 +03:00
Dmitry Chagin
7a718f293a linux(4): Implement pi futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31240
MFC after:		2 weeks
2021-07-29 12:49:42 +03:00
Dmitry Chagin
cb01cc4a10 linux(4): Replace copyin() by fueword32() in handle_futex_death().
According to fetch(9) fueword facility designed to fetch atomically
small amount of data from user space.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31239
MFC after:		2 weeks
2021-07-29 12:48:59 +03:00
Dmitry Chagin
b9c89fa39e linux(4): Eliminate unused includes.
MFC after:		2 weeks
2021-07-29 12:46:35 +03:00
Dmitry Chagin
0dc38e3303 linux(4): Reimplement futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31236
MFC after:		2 weeks
2021-07-29 12:43:48 +03:00
Dmitry Chagin
af29f39958 umtx: Split umtx.h on two counterparts.
To prevent umtx.h polluting by future changes split it on two headers:
umtx.h - ABI header for userspace;
umtxvar.h - the kernel staff.

While here fix umtx_key_match style.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31248
MFC after:		2 weeks
2021-07-29 12:41:29 +03:00
Dmitry Chagin
7cf06e075d freebsd32: Remove the unnecessary spaces.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31247
MFC after:		2 weeks
2021-07-29 12:40:36 +03:00
Dmitry Chagin
3c886cb691 freebsd32: Remove unused umtx.h include.
Differential Revision:	https://reviews.freebsd.org/D31246
MFC after:		2 weeks
2021-07-29 12:40:08 +03:00
Dmitry Chagin
32a18e9abd freebsd32: Eliminate spaces at end of line.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31245
MFC after:		2 weeks
2021-07-29 12:39:30 +03:00
Dmitry Chagin
f337940144 linux(4): Fix gcc buld.
gcc failed as it didn't inlined the builtins and generates calls to
the libgcc, ld can't find libgcc as cross-toolchain libgcc is not installed.
To avoid this add internal vDSO ffs functions without optimized builtins.

Reported by:		jhb
MFC after:		2 weeks
2021-07-29 09:52:33 +03:00
Bjoern A. Zeeb
fed248a6ac LinuxKPI: add read_poll_timeout()
Add an implementation of read_poll_timeout() and the atomic variant
which I did at some point last year for rtw88 and now updated based
on feedback.

MFC after:	10 days
Reviewed by:	hsealsky
Differential Revision: https://reviews.freebsd.org/D30980
2021-07-28 16:21:12 +00:00
Bjoern A. Zeeb
cc2723370b LinuxKPI: add fsleep()
Add fsleep() function now required by rtw88.  This seems to be
making a decision depending on time to sleep on how to sleep.
Given our compat framework already is lenient on how long to sleep,
this is a cut down version.

MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D31322
2021-07-28 13:35:34 +00:00
Bjoern A. Zeeb
ac134e762e LinuxKPI: dmi.h do not rely on implicit includes
Add sys/types.h to dmi.h and do not rely on other files to include
all needed headers in Linux land.  I ran into compile problems with
rtw88 otherwise.

MFC after:	3 days
2021-07-28 13:28:48 +00:00
Konstantin Belousov
273728b125 Regen 2021-07-28 13:21:22 +03:00
Konstantin Belousov
9b6b793bd7 Revert most of ce42e79310
to restore ABI compatibility for pre-10.x binaries.

It restores _umtx_lock() and _umtx_unlock() syscalls, and UMTX_OP_LOCK/
UMTX_OP_UNLOCK umtx_op(2) operations. UMUTEX_ERROR_CHECK flag is left
out for now, I do not think it makes a difference.

PR:	218571
Reviewed by:	brooks (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31220
2021-07-28 13:21:12 +03:00
Konstantin Belousov
d96f55bc71 linuxkpi: remove global atomic counter of the task allocations
Use thread_reap_barrier() to ensure that no threads are kept in the
zombies list which could have the linuxkpi task allocated.

Also fix order of initialization and teardown for current task
allocation hooks and resources. Register current task allocator after
zones are initialized. Deregister allocator before cycling over threads
and zeroing task pointer.

Reviewed by:	hselasky, markj
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30468
2021-07-27 20:01:19 +03:00
Bjoern A. Zeeb
ea4dea8394 LinuxKPI: add sign_extend32()
Add sign_extend32() replicating the 64 version.  This is needed by
the rtw88 driver.

MFC after:	10 days
Reviewed by:	imp, emaste, hselasky
Differential Revision: https://reviews.freebsd.org/D30979
2021-07-27 15:03:38 +00:00
Bjoern A. Zeeb
b9d984e2c5 LinuxKPI: add nexthdr definitions for IPv6
Add the nexthdr definitions for IPv6 which are used by wireless
drivers and were previously placed in an 80211 header file by
accident.

Obtained from:	bz_iwlwifi
Sponsored by:	The FreeBSD Foundation
Reviewed by:	hselasky
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D31321
2021-07-27 15:00:21 +00:00
Bjoern A. Zeeb
366d68f283 LinuxKPI: add module_pci_driver() and pci_alloc_irq_vectors()
Add the two new functions needed by rtw88 to register the driver and
handle the module bits as well as a version of pci_alloc_irq_vectors()
for what is needed.

Reviewed by:	hselasky
MFC after:	10 days
Differential Revision: https://reviews.freebsd.org/D30981
2021-07-27 14:57:23 +00:00
Edward Tomasz Napierala
30c6d98219 linux: implement sigaltstack(2) on arm64
... by making it machine-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31286
2021-07-27 13:34:49 +00:00
Edward Tomasz Napierala
72f7ddb587 linux: implement rt_sigsuspend(2) on arm64
... by making it architecture-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31259
2021-07-23 20:13:00 +00:00
Dmitry Chagin
75cb2382b8 linux(4): Factor out the futex_wait() op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:40:24 +03:00
Dmitry Chagin
ef4251e271 linux(4): Prevent an endless loop.
In the futex_atomic_op() the encoded_op is a user-supplied parameter.
If the user specifies an incorrect value for this parameter paired with a valid
*uaddr parameter the caller will go into the endless loop. To prevent this check
futex_atomic_op() result and break the loop in case of ENOSYS.

MFC after:		2 weeks
2021-07-20 14:40:08 +03:00
Dmitry Chagin
80b8d6b144 linux(4): Eliminate bogus comment.
For the caller is no need for access checking here, as the caller must take care
of EFAULT handling. Moreover, this check would be superfluous, since EFAULT is
extremily rare, and we prefer the fast path.

MFC after:		2 weeks
2021-07-20 14:39:56 +03:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
4c361d7a5a linux(4): Factor out the FUTEX_WAKE_OP op into linux_futex_wakeop().
MFC after:		2 weeks
2021-07-20 14:38:44 +03:00
Dmitry Chagin
bb62a91944 linux(4): Factor out the FUTEX_CMP_REQUEUE op into linux_futex_requeue().
MFC after:		2 weeks
2021-07-20 14:38:27 +03:00
Dmitry Chagin
19f7e2c2fb linux(4): Factor out the FUTEX_WAKE op into linux_futex_wake().
MFC after:		2 weeks
2021-07-20 14:38:05 +03:00
Dmitry Chagin
f6b0d275eb linux(4): Factor out the FUTEX_WAIT op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:37:51 +03:00
Dmitry Chagin
1866eef484 linux(4): Refactor the struct linux_futex_args.
Move flags and rtclock to the struct linux_futex_args. This will be used when
I split linux_futex() into separate futex op functions.

MFC after:		2 weeks
2021-07-20 14:37:37 +03:00
Dmitry Chagin
2b38186330 Drop rdivacky@ "All rights reserved" from linux_event.
I got explicit permission from Roman.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30913
MFC after:		2 weeks
2021-07-20 10:06:16 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
fe7409530c linprocfs: Fixup vDSO name in the procmaps after 9931033bbf.
As the sv_shared_page_base now pointed out to the native sharedpage and
the process VA layout has changed as follows:
VDSOPAGE	(2 * PAGE_SIZE)
SHAREDPAGE	(PAGE_SIZE)
USRSTACK
fixup the vDSO name by calculating the start of page relative to the
native sharedpage.

Differential revision:	https://reviews.freebsd.org/D30903
MFC after:		2 weeks
2021-07-20 10:04:20 +03:00
Dmitry Chagin
9931033bbf linux(4); Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, has no process private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid contex-switch overhead frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

System calls implemenation based on a native timekeeping code with some
limitations:
- ifunc can't be used, as vDSO r/o mapped to the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case on any error vDSO system calls fallback to the kernel system
calls. For unimplemented vDSO system calls added prototypes which call
corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
Neel Chauhan
086cfe4df8 linuxkpi: Add spin_trylock_irqsave() macro
This is needed by the drm-kmod 5.6 update.

Reviewed by:		hselasky
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30706
2021-07-15 07:52:42 -07:00
Edward Tomasz Napierala
3eaf271d3c linux(4): Improve comment about SA_RESTORER
No functional changes.

Sponsored By:	EPSRC
2021-07-13 11:13:17 +01:00
Hans Petter Selasky
05f56ac92f LinuxKPI: Force the usleep_range() function to sleep instead of spinning on the timer.
This allows other threads to execute, typically during hardware waiting loops.
This also maches how the function works in Linux.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-10 21:59:31 +02:00
Konstantin Belousov
747a6b7ace cloudabi and linux ABIs: do not call umtx_thread_cleanup() from thr_exit syscall
These ABIs do not use umtx at all, so there is nothing to clean.
Cloudabi references to umtx keys do not require any cleanups anyway.

Requested by:	dchagin
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:14 +03:00
Konstantin Belousov
28a66fc3da Do not call FreeBSD-ABI specific code for all ABIs
Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI.  Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by:	dchagin
Reviewed by:	dchagin, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:07 +03:00
Vladimir Kondratyev
8b33cb8303 LinuxKPI: Implement sequence counters and sequential locks
as a thin wrapper around native version found in sys/seqc.h.
This replaces out-of-base GPLv2-licensed code used by drm-kmod.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D31006
2021-07-05 03:20:55 +03:00
Vladimir Kondratyev
019391bf85 LinuxKPI: Implement strscpy
strscpy copies the src string, or as much of it as fits, into the dst
buffer.  The dst buffer is always NUL terminated, unless it's zero-sized.
strscpy returns the number of characters copied (not including the
trailing NUL) or -E2BIG if len is 0 or src was truncated.

Currently drm-kmod replaces strscpy with strncpy that is not quite
correct as strncpy does not NUL-terminate truncated strings and returns
different values on exit.

Reviewed by:	hselasky, imp, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31005
2021-07-05 03:20:42 +03:00
Vladimir Kondratyev
98a6984a9e LinuxKPI: Use macro for implementation of some dma_map_* functions
This allows to remove unimplemented attrs parameter which type differs
between Linux kernel versions and to compile both drm-kmod and ofed
callers unmodified.
Also convert it to 'unsigned long' type to match modern Linuxes.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30932
2021-07-05 03:20:23 +03:00
Vladimir Kondratyev
864b11007a LinuxKPI: Implement irq_work_sync() routine.
irq_work_sync() performs draining of irq_work task.
Required by drm-kmod.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30818
2021-07-05 03:20:06 +03:00
Vladimir Kondratyev
1ab61a1932 LinuxKPI: Do not wait for a grace period in rcu_barrier()
Linux docs explicitly state that this is not required [1]:

"Important note: The rcu_barrier() function is not, repeat, not,
obligated to wait for a grace period.  It is instead only required to
wait for RCU callbacks that have already been posted.  Therefore, if
there are no RCU callbacks posted anywhere in the system, rcu_barrier()
is within its rights to return immediately.  Even if there are
callbacks posted, rcu_barrier() does not necessarily need to wait for
a grace period."

[1] https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html

Reviewed by:	emaste, hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30809
2021-07-05 03:19:50 +03:00
Vladimir Kondratyev
c0862b2b1f LinuxKPI: Add compiler barriers to list_for_each_entry_lockless macro
so this list-traversal primitive may safely run concurrently with the
_rcu list-mutation primitives such as list_add_rcu() as long as the
traversal is guarded by rcu_read_lock().

Do it by reusing the "list_for_each_entry_rcu" macro which does the same.
On Linux it implements some additional lockdep stuff which we skip.

Also move the macro to linux/rculist.h where it resides on Linux.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30795
2021-07-05 03:19:35 +03:00
Vladimir Kondratyev
c77ec79b57 LinuxKPI: Change flags parameter type of atomic_dec_and_lock_irqsave
On Linux atomic_dec_and_lock_irqsave is a wrapper macro which provides
a reference to third parameter rather than parameter value itself to
implementation routine called _atomic_dec_and_lock_irqsave [1].

While here, implement a fast path.

[1] https://github.com/torvalds/linux/blob/master/include/linux/spinlock.h#L476

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30781
2021-07-05 03:19:01 +03:00
Vladimir Kondratyev
78a02d8b33 LinuxKPI: Add #defines required by drm-kmod v5.5
Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30767
2021-07-05 03:18:47 +03:00
Vladimir Kondratyev
a2b83b59db LinuxKPI: Allow kmem_cache_free() to be called from critical sections
as it is required by i915kms driver from Linux kernel v 5.5.
This is done with asynchronous freeing of requested memory areas from
taskqueue thread. As memory to be freed is reused to store linked list
entry, backing UMA zone item size is rounded up to pointer size.

While here, make struct linux_kmem_cache private to LKPI to reduce amount
of BSD headers included by linux/slab.h and switch RCU code to usage of
LKPI's linux_irq_work_tq taskqueue to avoid injection of current into
system-wide taskqueue_fast thread context.

Submitted by:	nc (initial version for drm-kmod)
Reviewed by:	manu, nc
Differential revision:	https://reviews.freebsd.org/D30760
2021-07-05 03:18:14 +03:00
Edward Tomasz Napierala
2f514e6f13 linux(4): implement PR_SET_NO_NEW_PRIVS
This makes prctl(2) support PR_SET_NO_NEW_PRIVS, by mapping it
to the native PROC_NO_NEW_PRIVS_CTL procctl(2).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30973
2021-07-03 08:42:37 +01:00
Edward Tomasz Napierala
db8d680ebe procctl(2): add PROC_NO_NEW_PRIVS_CTL, PROC_NO_NEW_PRIVS_STATUS
This introduces a new, per-process flag, "NO_NEW_PRIVS", which
is inherited, preserved on exec, and cannot be cleared.  The flag,
when set, makes subsequent execs ignore any SUID and SGID bits,
instead executing those binaries as if they not set.

The main purpose of the flag is implementation of Linux
PROC_SET_NO_NEW_PRIVS prctl(2), and possibly also unpriviledged
chroot.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30939
2021-07-01 09:42:07 +01:00
Edward Tomasz Napierala
447636e43c linux(4): implement coredump support
Implement dumping core for Linux binaries on amd64, for both
32- and 64-bit executables.  Some bits are still missing.

This is based on a prototype by chuck@.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30019
2021-06-30 22:45:06 +01:00
Dmitry Chagin
5ca9d41700 LinuxKPI: Rename a short description of the kmalloc type.
To avoid duplication in the vmstat -m output rename the kmalloc type short
description to 'lkpikmalloc' as the Linux emulation layer historically names
its linux malloc type as 'linux'.

Reviewed by:		hselasky, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D30928
MFC after:		2 weeks
2021-06-29 20:20:01 +03:00
Dmitry Chagin
1fd26da926 LinuxKPI: Put compat code under appropriate condition.
Reviewed by:		hselasky, emaste, kib
Differential Revision:	https://reviews.freebsd.org/D30927
MFC after:		2 weeks
2021-06-29 20:19:17 +03:00