Commit Graph

4235 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
99f563ed76 linux: recognize TCP_INFO and ratelimit the warning
This ratelimits the "unsupported getsockopt level 6 optname 11"
warnings that happen all the time when watching Netflix.

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32454
2021-10-17 13:19:10 +01:00
Edward Tomasz Napierala
7e7859e7c2 linux: Partially implement TCSBRK
This fixes tcflush(3), unbreaking cheribuild.py under arm64 Focal.

Reviewed By:	imp
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D32291
2021-10-17 11:19:56 +01:00
Mateusz Guzik
2b68eb8e1d vfs: remove thread argument from VOP_STAT
and fo_stat.
2021-10-11 13:22:32 +00:00
Alex Richardson
d98f2712c7 linuxkpi: implement ida_alloc()
Needed for the virtio-gpu driver.

Reviewed By:	#linuxkpi, manu, bz, hselasky
Differential Revision: https://reviews.freebsd.org/D32366
2021-10-11 11:51:44 +01:00
Alex Richardson
6d15ccde4d linuxkpi: Allow BUILD_BUG_ON in if statements without braces
I got a compilation failure in virtio-gpu without this change.

Reviewed By:	#linuxkpi, manu, bz, hselasky
Differential Revision: https://reviews.freebsd.org/D32366
2021-10-11 11:51:44 +01:00
Alex Richardson
ff479cc6c9 linuxkpi: add PAGE_ALIGNED macro
Needed for the virtio-gpu driver.

Reviewed By:	#linuxkpi, manu, bz, hselasky
Differential Revision: https://reviews.freebsd.org/D32366
2021-10-11 11:51:43 +01:00
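A minimal sketch of what a page-alignment test like PAGE_ALIGNED checks. The SKETCH_ names and the 4 KiB page size are assumptions made for illustration; the macro actually added to linuxkpi may be written differently.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; a real kernel takes this from its PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096UL

/* Non-zero when addr lies exactly on a page boundary (low bits all clear). */
#define SKETCH_PAGE_ALIGNED(addr) \
    ((((uintptr_t)(addr)) & (SKETCH_PAGE_SIZE - 1)) == 0)
```

The mask form relies on the page size being a power of two.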
Alex Richardson
2686b10db4 linuxkpi: Add sg_init_one
Needed for the virtio-gpu driver.

Reviewed By:	#linuxkpi, manu, bz, hselasky
Differential Revision: https://reviews.freebsd.org/D32366
2021-10-11 11:51:43 +01:00
Mark Johnston
a76de17715 linuxkpi: Handle a NULL cache pointer in kmem_cache_destroy()
This is compatible with Linux, and some driver error paths depend on it.

Reviewed by:	bz, emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32337
2021-10-06 14:49:39 -04:00
Jessica Clarke
8167c92f65 LinuxKPI: Add more #ifdef VM_MEMATTR_WRITE_COMBINING guards
One of the three uses is already guarded; this guards the remaining ones
to support architectures like riscv that do not provide write-combining,
and is needed to build drm-kmod on riscv.

Reviewed by:	hselasky, manu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31999
2021-10-03 19:34:40 +01:00
Mateusz Guzik
69ab528386 linprocfs: fix cwd and root handling
The code would incorrectly use curthread instead of the target proc to
resolve vnodes.

Fixes:	8d03b99b9d ("fd: move vnodes out of filedesc into a dedicated structure")
PR:	258729
Noted by:	 Damjan Jovanovic <damjan.jov@gmail.com>
2021-09-30 12:59:58 +02:00
Vladimir Kondratyev
062f15004f LinuxKPI: Remove vma argument from fault method of vm_operations_struct
It was removed from Linux in 4.11.
In FreeBSD it results in several #ifdefs in drm-kmod.

Reviewed by:	emaste, hselasky, manu
Differential revision:	https://reviews.freebsd.org/D32169
2021-09-29 23:26:32 +03:00
Vladimir Kondratyev
5ca1f3f5e3 LinuxKPI: Hide some internal symbols in linux_interrupt.c
Reviewed by:	hselasky, manu
Differential revision:	https://reviews.freebsd.org/D32168
2021-09-29 23:26:14 +03:00
Vladimir Kondratyev
c072f6e856 LinuxKPI: Import linux_page.c and some dependent code from drm-kmod
No functional changes intended

Reviewed by:	hselasky, manu, markj
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32167
2021-09-29 23:15:37 +03:00
Vladimir Kondratyev
88531adbfb LinuxKPI: Update pte_fn_t definition to match Linux 5.3
Reviewed by:	emaste, hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32166
2021-09-29 23:15:27 +03:00
Vladimir Kondratyev
b52e363840 LinuxKPI: Implement backlight_enable and backlight_disable functions
For now, disable backlight if brightness level is set to 0.
In the future we may implement separate knob in backlight(8).

Required by drm-kmod v5.6

Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32165
2021-09-29 23:15:12 +03:00
Vladimir Kondratyev
3d86f8f1d7 LinuxKPI: Add dummy pgprot_decrypted() implementation
to reduce number of #ifdefs in drm-kmod

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32094
2021-09-29 23:14:58 +03:00
Vladimir Kondratyev
37eba5b77a LinuxKPI: Cast offset_in_page() parameter to unsigned long
to reduce number of patches in drm-kmod

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32093
2021-09-29 23:14:47 +03:00
Vladimir Kondratyev
6efabdeede LinuxKPI: Import linux/poison.h header
Required by drm-kmod 5.6

Reviewed by:	hselasky, imp, manu
MFC after:	2 weeks
Obtained from:	OpenBSD
Differential revision:	https://reviews.freebsd.org/D32092
2021-09-29 23:14:34 +03:00
Vladimir Kondratyev
b59ffedae8 LinuxKPI: Add helper functions to store integers to linux/xarray.h
Required by drm-kmod.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32091
2021-09-29 23:14:23 +03:00
Vladimir Kondratyev
62ff0566c9 LinuxKPI: Allow cdev_pager prefault handler to steal pages
from other vm_objects. This works around the "Page already inserted" panic
in the vm_page_insert routine, triggered on an attempt to mmap a file created
with a shmem_file_setup call. After the introduction of "GTT mmap
interface v4" a.k.a. MMAP_OFFSET, vm_objects allocated by these calls
may try to own intersecting sets of pages, which leads to the assertion.

Reviewed by:	kib
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32090
2021-09-29 23:14:05 +03:00
Vladimir Kondratyev
bd6d55adb4 LinuxKPI: stub anon_inode_getfile
Although drm-kmod contains better implementation which is able to
allocate real entries on pseudofs, this feature has never been used.

Starting from drm-kmod v5.6 old implementation began to leak entries
on each drm device close(). Now just drop pseudofs support instead of
fixing it in drm-kmod and provide stub in base.

Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32069
2021-09-29 23:13:53 +03:00
Vladimir Kondratyev
f6823dac8f LinuxKPI: Factor out vmf_insert_pfn_prot() routine
from the GEM and TTM page fault handlers and move it into the base system. This
code is too tightly integrated with LKPI mmap support to belong in drm-kmod.

As this routine requires associated vm_object to be locked, it got
additional _locked suffix.

Reviewed by:	hselasky, markj
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32068
2021-09-29 23:13:41 +03:00
Vladimir Kondratyev
7d92d48358 LinuxKPI: Invoke release handler when file is destroyed by fput()
Required by drm_kmod 5.6

Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32067
2021-09-29 23:13:27 +03:00
Vladimir Kondratyev
2fe9ea5d3a LinuxKPI: allocate current before taking shrinkers lock
This fixes following warnings when shrinkers are invoked first time:

uma_zalloc_debug: zone "lkpicurr" with the following non-sleepable
locks held: exclusive sleep mutex lkpi-shrinker (lkpi-shrinker)

uma_zalloc_debug: zone "lkpimm" with the following non-sleepable locks
held: exclusive sleep mutex lkpi-shrinker (lkpi-shrinker)

Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D32066
2021-09-29 23:12:58 +03:00
Vladimir Kondratyev
b58c916f11 LinuxKPI: implement _IOC_TYPE and _IOC_NR macros in linux/ioctl.h
They are used by drm-kmod

Reviewed by:	emaste, hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31674
2021-09-29 23:12:47 +03:00
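As a rough illustration of what _IOC_TYPE and _IOC_NR extract, the sketch below assumes the Linux-style command layout, with the command number in bits 0-7 and the type (group) character in bits 8-15. The SKETCH_ names are hypothetical, and the LinuxKPI header may define the real macros in terms of FreeBSD's own ioctl helpers instead.

```c
/* Assumption: Linux-style layout, number in bits 0-7, type in bits 8-15. */
#define SKETCH_IOC_NRBITS    8
#define SKETCH_IOC_TYPEBITS  8
#define SKETCH_IOC_NRSHIFT   0
#define SKETCH_IOC_TYPESHIFT (SKETCH_IOC_NRSHIFT + SKETCH_IOC_NRBITS)
#define SKETCH_IOC_NRMASK    ((1U << SKETCH_IOC_NRBITS) - 1)
#define SKETCH_IOC_TYPEMASK  ((1U << SKETCH_IOC_TYPEBITS) - 1)

/* Extract the per-driver command number. */
#define SKETCH_IOC_NR(cmd) \
    (((cmd) >> SKETCH_IOC_NRSHIFT) & SKETCH_IOC_NRMASK)

/* Extract the type (group) byte that identifies the driver family. */
#define SKETCH_IOC_TYPE(cmd) \
    (((cmd) >> SKETCH_IOC_TYPESHIFT) & SKETCH_IOC_TYPEMASK)
```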
Vladimir Kondratyev
66ea390652 LinuxKPI: Remove FreeBSD struct resource from all LKPI headers
except linux/pci.h to avoid conflicts with Linux version.
This allows to #define resource in drm-kmod globally and strip some #ifdef-s

Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31673
2021-09-29 23:12:36 +03:00
Vladimir Kondratyev
a81b36c6d3 LinuxKPI: Implement get_file_rcu()
get_file_rcu() grabs a file if the file->f_count is not zero.

Required by drm-kmod 5.6

Reviewed by:	hselasky, manu (previous version)
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31672
2021-09-29 23:12:25 +03:00
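The "grab only if the count is not zero" behaviour described above can be sketched with C11 atomics. This is an illustration of the general pattern, not the LinuxKPI code; the function name and the bare counter argument are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Take a reference only if the object is still live (count != 0).
 * Returns true when the reference was taken.
 */
static bool
sketch_ref_get_unless_zero(atomic_long *count)
{
    long old = atomic_load(count);

    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(count, &old, old + 1))
            return true;
    }
    return false;
}
```

A plain increment would be unsafe here, because it could resurrect an object whose last reference has already been dropped.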
Bjoern A. Zeeb
1269873159 LinuxKPI: fix build
Add a missing "static" for non-{i386,amd64,arm64} which was missed in
c39eefe715.  This should fix the builds.

Sponsored by:	The FreeBSD Foundation
MFC after:	7 days
X-MFC with:	c39eefe715
2021-09-29 13:50:12 +00:00
Bjoern A. Zeeb
c39eefe715 LinuxKPI: implement dma_set_coherent_mask()
Coherent is lower 32bit only by default in Linux and our only default
dma mask is 64bit currently which violates expectations unless
dma_set_coherent_mask() was called explicitly with a different mask.

Implement coherent by creating a second tag, and storing the tags in the
objects and use the tag from the object wherever possible.
This currently does not update the scatterlist or pool (both could be
converted but S/G cannot be MFCed as easily).

There is a 2nd change embedded in the updated logic of
linux_dma_alloc_coherent() to always zero the allocation as
otherwise some drivers get cranky on uninitialised garbage.

Sponsored by:	The FreeBSD Foundation
MFC after:	7 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D32164
2021-09-29 12:41:28 +00:00
Bjoern A. Zeeb
72c89ce97b LinuxKPI: dma-mapping.h unify "mask" and "dma_mask"
In some places we are using "mask" and others "dma_mask" for the
same thing.  Harmonize the various places to "dma_mask" as used in
linux_pci.c.  For the declaration remove the argument names to
avoid the entire problem.

This is in preparation for an upcoming change.
No functional changes intended.

Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
2021-09-27 20:53:06 +00:00
Bjoern A. Zeeb
93b14194ac LinuxKPI: disable device_release_driver()
As reported by multiple people testing iwlwifi, device_release_driver()
can lead to a panic on secondary errors (usually during attach).
Disable device_release_driver() for the short-term to prevent the panic
but leave it in place so it can be re-worked and fixed properly for
the long-term more easily.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2021-09-27 17:45:06 +00:00
Konstantin Belousov
cf0ee8738e Drop cloudabi
According to https://github.com/NuxiNL/cloudlibc:
CloudABI is no longer being maintained. It was an awesome experiment,
but it never got enough traction to be sustainable.

There is no reason to keep it in FreeBSD.

Approved by:	ed (private mail)
Reviewed by:	emaste
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31923
2021-09-22 00:18:44 +03:00
Mark Johnston
fea1a98ead freebsd32: Fix a double copyin in sendmsg() and recvmsg()
freebsd32_sendmsg() and freebsd32_recvmsg() both copyin the message
header twice, once directly and once in freebsd32_copyinmsghdr().  The
iovec length from the former is used when copying in msg_iov, but the
rest of the kernel uses the iovec length from the latter.  When
kern_sendit() and kern_recvit() iterate over the iovec to compute the
residual for I/O, they can therefore end up walking past the end of the
copied in iovec, either resulting in a system call error, userspace
memory corruption from uiomove() with invalid iovecs, or a kernel page
fault if the copied-in iovec is followed by an unmapped KVA region.

Reported by:	syzbot+7cc64cd0c49605acd421@syzkaller.appspotmail.com
Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32010
2021-09-19 13:54:16 -04:00
Mark Johnston
4bda16ff18 freebsd32: Provide an ANSI definition for freebsd32_recvmsg()
Fix style in the freebsd32_sendmsg() definition.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-09-19 13:53:57 -04:00
Konstantin Belousov
796a8e1ad1 procctl(2): Add PROC_WXMAP_CTL/STATUS
It allows overriding kern.elf{32,64}.allow_wx on a per-process basis.
In particular, it makes it possible to run binaries without PT_GNU_STACK
and without elfctl note while allow_wx = 0.

Reviewed by:	brooks, emaste, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31779
2021-09-17 15:42:01 +03:00
Konstantin Belousov
f575573ca5 Remove PT_GET_SC_ARGS_ALL
Reimplement bdf0f24bb1 by checking for the caller's ABI in
the implementation of PT_GET_SC_ARGS, and copying out everything if
it is Linuxolator.

Also fix a minor information leak: if PT_GET_SC_ARGS_ALL is done on a
thread reused after another process, it allows reading some of that
thread's last syscall arguments. Clear td_sa.args in thread_alloc().

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D31968
2021-09-16 20:11:27 +03:00
John Baldwin
9553c6af88 <linux/overflow.h>: Don't use __has_builtin().
GCC only added support for __has_builtin in GCC 10.  However, all
supported versions of GCC and clang include these builtins so just use
them unconditionally.

This fixes the build with GCC 9.

Reviewed by:	manu, hselasky, imp
Differential Revision:	https://reviews.freebsd.org/D31942
2021-09-15 09:03:17 -07:00
Edward Tomasz Napierala
bdf0f24bb1 linux: implement PTRACE_GET_SYSCALL_INFO
This is one of the pieces required to make modern (ie Focal)
strace(1) work.

Reviewed By:	jhb (earlier version)
Sponsored by:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D28212
2021-09-14 20:19:55 +00:00
Brooks Davis
df501bac69 syscalls.master: switch to CAPENABLED flags
Switch the main syscall table to use CAPENABLED flags rather than
capabilities.conf.  This avoids synchronization issues between
syscalls.master and capabilities.conf (e.g. when renaming a syscall
during development).

For now, move capabilities.conf to sys/compat/freebsd32 and use it
there.  Use of sys/compat/freebsd32/syscalls.master should be replaced
by makesyscalls.lua enhancements to allow the main one to be used.

This change results in no changes to generated files after running
`make sysent`.

Reviewed by:	kevans, emaste
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D31350
2021-09-01 21:58:16 +01:00
Andrew Turner
b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preparation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Dmitry Chagin
d4da692862 linux(4): Improve comment.
Reported by:	pfg
MFC after:	2 weeks
2021-08-13 11:36:42 +03:00
Dmitry Chagin
aecd31a8a3 linux(4): Remove clone3 and faccessat2 from dummy.
MFC after:		2 weeks
2021-08-12 16:07:21 +03:00
Dmitry Chagin
1af0780b5f linux(4): Move ff variable initialization from declaration.
Modern style(9) allows variables to be initialized where they are declared,
but in this case the initialization obfuscates the code.

MFC after:		2 weeks
2021-08-12 11:57:16 +03:00
Dmitry Chagin
c2cc5345b8 linux(4): Verify that higher 32bits of exit_signal in clone3 are unset.
MFC after:		2 weeks
2021-08-12 11:56:51 +03:00
Dmitry Chagin
4385147547 linux(4): Return ENOSYS for unsupported clone3 option bits.
Differential Revision:	https://reviews.freebsd.org/D31483
MFC after:		2 weeks
2021-08-12 11:56:36 +03:00
Dmitry Chagin
0d77f6c0c3 linux(4): Add LINUX_RATELIMIT_MSG macro for future use.
Differential Revision:	https://reviews.freebsd.org/D31488
MFC after:		2 weeks
2021-08-12 11:55:55 +03:00
Dmitry Chagin
c5fc9fe7f3 linux(4): Implement CLONE_CLEAR_SIGHAND option bit.
CLONE_CLEAR_SIGHAND is designed to reset all signal handlers of the child
not set to SIG_IGN to SIG_DFL.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31481
MFC after:		2 weeks
2021-08-12 11:55:35 +03:00
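The CLONE_CLEAR_SIGHAND rule stated above (every handler that is not SIG_IGN becomes SIG_DFL) can be sketched over a plain array of handlers. The array representation and all sketch_ names are assumptions for illustration; the kernel keeps signal dispositions in its own per-process structures.

```c
#include <signal.h>
#include <stddef.h>

typedef void (*sketch_handler_t)(int);

/* Hypothetical user-installed handler, used only to exercise the sketch. */
static void
sketch_user_handler(int sig)
{
    (void)sig;
}

/* Reset every disposition that is not SIG_IGN back to SIG_DFL. */
static void
sketch_clear_sighand(sketch_handler_t *handlers, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (handlers[i] != SIG_IGN)
            handlers[i] = SIG_DFL;
    }
}
```

Ignored signals stay ignored in the child; only caught signals lose their handlers.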
Dmitry Chagin
a796845d6d linux(4): Add CLONE_PIDFD option bit.
Differential revision:	https://reviews.freebsd.org/D31478
MFC after:		2 weeks
2021-08-12 11:55:24 +03:00
Dmitry Chagin
17913b0b6b linux(4): Implement clone3 system call.
clone3 system call is used by glibc-2.34.

Differential revision:	https://reviews.freebsd.org/D31475
MFC after:		2 weeks
2021-08-12 11:49:36 +03:00
Dmitry Chagin
0a4b664ae8 linux(4): Add struct clone_args for future clone3 system call.
In preparation for clone3 system call add struct clone_args and use it in
clone implementation.
Move all of clone related bits to the newly created linux_fork.h header.

Differential revision:	https://reviews.freebsd.org/D31474
MFC after:		2 weeks
2021-08-12 11:49:01 +03:00
Dmitry Chagin
f1c450492f linux(4): Change clone syscall definition to match Linux actual one.
Differential revision:	https://reviews.freebsd.org/D31473
MFC after:		2 weeks
2021-08-12 11:46:36 +03:00
Dmitry Chagin
de8374df28 fork: Allow ABI to specify fork return values for child.
At least the Linux x86 ABIs do not use the carry bit and expect that the dx register
is preserved. For this, add a new sv_set_fork_retval hook and call it from cpu_fork().

Add a short comment about touching dx in x86_set_fork_retval(), for more details
see phab comments from kib@ and imp@.

Reviewed by:		kib
Differential revision:	https://reviews.freebsd.org/D31472
MFC after:		2 weeks
2021-08-12 11:45:25 +03:00
Dmitry Chagin
fc37be2460 linux(4): Plug in aarch64 fcntl flags.
Fixes opendir() libc function.

Differential Revision:	https://reviews.freebsd.org/D31357
MFC after:		2 weeks
2021-08-12 11:42:50 +03:00
Dmitry Chagin
13d79be995 linux(4): Implement faccessat2 system call.
It's used by bash on arm64 with glibc-2.32.

Reviewed by:		trasz
Differential Revision:	https://reviews.freebsd.org/D31345
MFC after:		2 weeks
2021-08-12 11:40:42 +03:00
Dmitry Chagin
6e31bed646 linux(4): Fix futex copyrights.
As no more NetBSD code in futexes exists replace NetBSD copyrights by
standard FreeBSD 2 clause license.
Add Roman Divacky's copyrights as an author of the robust futexes.

Differential revision:	https://reviews.freebsd.org/D31347
MFC after:		2 weeks
2021-08-12 11:36:24 +03:00
Ed Maste
9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Ka Ho Ng
da9fe3529b Regen after 0dc332bff2
2021-08-05 23:22:02 +08:00
Ka Ho Ng
0dc332bff2 Add fspacectl(2), vn_deallocate(9) and VOP_DEALLOCATE(9).
fspacectl(2) is a system call to provide space management support to
userspace applications. VOP_DEALLOCATE(9) is a VOP call to perform the
deallocation. vn_deallocate(9) is a public KPI for kmods' use.

The purpose of proposing a new system call, a KPI and a VOP call is to
allow bhyve or other hypervisor monitors to emulate the behavior of SCSI
UNMAP/NVMe DEALLOCATE on a plain file.

fspacectl(2) takes cmd and flags parameters to specify the
space management operation to be performed. Currently cmd has to be
SPACECTL_DEALLOC, and flags has to be 0.

fo_fspacectl is added to fileops.
VOP_DEALLOCATE(9) is added as a new VOP call. A trivial implementation
of VOP_DEALLOCATE(9) is provided.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D28347
2021-08-05 23:20:42 +08:00
Bjoern A. Zeeb
22e20d852f LinuxKPI: fix bug in le32p_replace_bits()
Fix a bug that slipped in in 90707c4e44
using the correct field in le32p_replace_bits().

MFC after:	3 days
Reviewed by:	hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D31352
2021-07-31 22:15:35 +00:00
Hans Petter Selasky
469884cf04 LinuxKPI: Make FPU sections thread-safe and use the NOCTX flag.
Reviewed by:	kib
Submitted by:	greg@unrelenting.technology
Differential Revision:	https://reviews.freebsd.org/D29921
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-31 15:36:48 +02:00
Bjoern A. Zeeb
4c8af633d1 LinuxKPI: bitfield.h cleanup
Add a missing tab and remove an unnecessary return.
No functional changes.

MFC after:	3 days
2021-07-29 21:24:35 +00:00
Dmitry Chagin
2411ac0b89 linux(4): Eliminate now unused includes after futexes refactoring.
MFC after:		2 weeks
2021-07-29 12:56:39 +03:00
Dmitry Chagin
d90df8ac13 linux(4): Add a comment about wait/requeue pi operations.
MFC after:		2 weeks
2021-07-29 12:55:59 +03:00
Dmitry Chagin
626cbd4648 linux(4): Handle incorrect FUTEX_CLOCK_REALTIME option bit.
Return ENOSYS if the FUTEX_CLOCK_REALTIME option bit is specified for an
inappropriate futex operation.

MFC after:		2 weeks
2021-07-29 12:55:33 +03:00
Dmitry Chagin
a9bb1b1c18 linux(4): Handle FUTEX_LOCK_PI2 operation.
FUTEX_LOCK_PI2 was added to support clock selection as FUTEX_LOCK_PI uses a
CLOCK_REALTIME based absolute value since it was implemented, but it does not
require that the FUTEX_CLOCK_REALTIME bit is set, because that was introduced
later.

MFC after:		2 weeks
2021-07-29 12:55:02 +03:00
Dmitry Chagin
bd25bf092a linux(4): Use variable name not type for sizeof() to calculate storage size.
MFC after:		2 weeks
2021-07-29 12:54:32 +03:00
Dmitry Chagin
49a5c0409b linux(4): Move len variable initialization to the appropriate place.
MFC after:		2 weeks
2021-07-29 12:54:16 +03:00
Dmitry Chagin
c8e9d2b7eb linux(4): Use linux_tdfind() in get_robust_list.
In the Linux emulation layer linux_tdfind() has a special purpose to
handle glibc specific TID mangling and we should use it instead of tdfind().

MFC after:		2 weeks
2021-07-29 12:53:59 +03:00
Dmitry Chagin
f88d3c522f linux(4): Eliminate unnecessary error initialization.
MFC after:		2 weeks
2021-07-29 12:53:41 +03:00
Dmitry Chagin
6b68e8af1f linux(4): Eliminate unnecessary head initialization.
MFC after:		2 weeks
2021-07-29 12:53:25 +03:00
Dmitry Chagin
971b53fa04 linux(4): style, wrap too long line.
MFC after:		2 weeks
2021-07-29 12:53:07 +03:00
Dmitry Chagin
edd44176aa linux(4): Eliminating remnants of futex sdt.
MFC after:		2 weeks
2021-07-29 12:52:36 +03:00
Dmitry Chagin
b59cf25eac linux(4): Handle special case for regular futex in handle_futex_death().
Handle some races in handle_futex_death() which can prevent a wakeup of
potential waiters, causing these waiters to block forever.

Differential Revision:	https://reviews.freebsd.org/D31280
MFC after:		2 weeks
2021-07-29 12:51:39 +03:00
Dmitry Chagin
dad1077056 linux(4): Futex address must be 32-bit aligned.
Linux futex documentation explicitly states that EINVAL is returned if
the futex is not 4-byte aligned. Check futex alignment as Linux does
and return EINVAL.

Differential Revision:	https://reviews.freebsd.org/D31279
MFC after:		2 weeks
2021-07-29 12:50:58 +03:00
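The alignment rule can be sketched as a check on the user address before any futex operation runs. The function name is hypothetical and the real check in the Linux emulation layer may be expressed differently; only the "not 4-byte aligned gives EINVAL" behaviour comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Return EINVAL when the futex word address is not 32-bit aligned,
 * 0 otherwise.
 */
static int
sketch_futex_check_alignment(uintptr_t uaddr)
{
    if (uaddr % sizeof(uint32_t) != 0)
        return EINVAL;
    return 0;
}
```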
Dmitry Chagin
b33e469027 linux(4): Finish cf8d74e3fe.
Add forgotten val3_compare initialization in case of time64 futex.

MFC after:		2 weeks
2021-07-29 12:50:43 +03:00
Dmitry Chagin
4f34dc6453 linux(4): Replace casuword32 by casueword32.
Follow the r349951 (30b3018d), add check to react to stops and requests
to terminate between retries.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31254
MFC after:		2 weeks
2021-07-29 12:50:11 +03:00
Dmitry Chagin
7a718f293a linux(4): Implement pi futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31240
MFC after:		2 weeks
2021-07-29 12:49:42 +03:00
Dmitry Chagin
cb01cc4a10 linux(4): Replace copyin() by fueword32() in handle_futex_death().
According to fetch(9), the fueword facility is designed to atomically fetch
a small amount of data from user space.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31239
MFC after:		2 weeks
2021-07-29 12:48:59 +03:00
Dmitry Chagin
b9c89fa39e linux(4): Eliminate unused includes.
MFC after:		2 weeks
2021-07-29 12:46:35 +03:00
Dmitry Chagin
0dc38e3303 linux(4): Reimplement futexes using umtx.
Differential Revision:	https://reviews.freebsd.org/D31236
MFC after:		2 weeks
2021-07-29 12:43:48 +03:00
Dmitry Chagin
af29f39958 umtx: Split umtx.h into two counterparts.
To prevent future changes from polluting umtx.h, split it into two headers:
umtx.h - ABI header for userspace;
umtxvar.h - the kernel stuff.

While here fix umtx_key_match style.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31248
MFC after:		2 weeks
2021-07-29 12:41:29 +03:00
Dmitry Chagin
7cf06e075d freebsd32: Remove the unnecessary spaces.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31247
MFC after:		2 weeks
2021-07-29 12:40:36 +03:00
Dmitry Chagin
3c886cb691 freebsd32: Remove unused umtx.h include.
Differential Revision:	https://reviews.freebsd.org/D31246
MFC after:		2 weeks
2021-07-29 12:40:08 +03:00
Dmitry Chagin
32a18e9abd freebsd32: Eliminate spaces at end of line.
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D31245
MFC after:		2 weeks
2021-07-29 12:39:30 +03:00
Dmitry Chagin
f337940144 linux(4): Fix gcc build.
gcc failed as it didn't inline the builtins and generated calls to
libgcc; ld can't find libgcc as the cross-toolchain libgcc is not installed.
To avoid this, add internal vDSO ffs functions without optimized builtins.

Reported by:		jhb
MFC after:		2 weeks
2021-07-29 09:52:33 +03:00
Bjoern A. Zeeb
fed248a6ac LinuxKPI: add read_poll_timeout()
Add an implementation of read_poll_timeout() and the atomic variant
which I did at some point last year for rtw88 and now updated based
on feedback.

MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30980
2021-07-28 16:21:12 +00:00
Bjoern A. Zeeb
cc2723370b LinuxKPI: add fsleep()
Add fsleep() function now required by rtw88.  This seems to be
making a decision depending on time to sleep on how to sleep.
Given our compat framework already is lenient on how long to sleep,
this is a cut down version.

MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D31322
2021-07-28 13:35:34 +00:00
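The "decision depending on time to sleep" can be sketched as a classifier. The thresholds below (10 microseconds and 20 milliseconds) are what upstream Linux used for fsleep() around this time, as best recalled, and should be treated as an assumption; the cut-down LinuxKPI version described above may not distinguish all three cases. All sketch_ names are hypothetical.

```c
/* The three strategies upstream fsleep() is assumed to choose between. */
enum sketch_sleep_kind {
    SKETCH_BUSY_WAIT,   /* very short: spin, as udelay() would */
    SKETCH_RANGE_SLEEP, /* medium: sleep within a range, as usleep_range() would */
    SKETCH_MSEC_SLEEP   /* long: millisecond-granularity sleep, as msleep() would */
};

/* Pick a strategy from the requested delay in microseconds. */
static enum sketch_sleep_kind
sketch_fsleep_kind(unsigned long usecs)
{
    if (usecs <= 10)
        return SKETCH_BUSY_WAIT;
    if (usecs <= 20000)
        return SKETCH_RANGE_SLEEP;
    return SKETCH_MSEC_SLEEP;
}
```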
Bjoern A. Zeeb
ac134e762e LinuxKPI: dmi.h do not rely on implicit includes
Add sys/types.h to dmi.h and do not rely on other files to include
all needed headers in Linux land.  I ran into compile problems with
rtw88 otherwise.

MFC after:	3 days
2021-07-28 13:28:48 +00:00
Konstantin Belousov
273728b125 Regen
2021-07-28 13:21:22 +03:00
Konstantin Belousov
9b6b793bd7 Revert most of ce42e79310
to restore ABI compatibility for pre-10.x binaries.

It restores _umtx_lock() and _umtx_unlock() syscalls, and UMTX_OP_LOCK/
UMTX_OP_UNLOCK umtx_op(2) operations. UMUTEX_ERROR_CHECK flag is left
out for now, I do not think it makes a difference.

PR:	218571
Reviewed by:	brooks (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31220
2021-07-28 13:21:12 +03:00
Konstantin Belousov
d96f55bc71 linuxkpi: remove global atomic counter of the task allocations
Use thread_reap_barrier() to ensure that no threads are kept in the
zombies list which could have the linuxkpi task allocated.

Also fix order of initialization and teardown for current task
allocation hooks and resources. Register current task allocator after
zones are initialized. Deregister allocator before cycling over threads
and zeroing task pointer.

Reviewed by:	hselasky, markj
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30468
2021-07-27 20:01:19 +03:00
Bjoern A. Zeeb
ea4dea8394 LinuxKPI: add sign_extend32()
Add sign_extend32() replicating the 64 version.  This is needed by
the rtw88 driver.

MFC after:	10 days
Reviewed by:	imp, emaste, hselasky
Differential Revision: https://reviews.freebsd.org/D30979
2021-07-27 15:03:38 +00:00
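A minimal sketch of 32-bit sign extension, where index is the bit position of the sign bit (0 to 31) and any bits above it are ignored. This uses an xor-and-subtract formulation rather than the shift-based one typically seen in Linux headers, so it is not the committed code; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Treat bit `index` of value as the sign bit and extend it to 32 bits.
 * Bits above `index` are discarded first.
 */
static int32_t
sketch_sign_extend32(uint32_t value, int index)
{
    uint32_t sign = UINT32_C(1) << index;
    uint32_t mask = (sign << 1) - 1; /* wraps to all ones for index 31 */

    value &= mask;
    /* Flip the sign bit, then subtract it to propagate it upward. */
    return (int32_t)((value ^ sign) - sign);
}
```

For example, a 4-bit field holding 0xF with index 3 extends to -1, while 0x7 stays 7.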
Bjoern A. Zeeb
b9d984e2c5 LinuxKPI: add nexthdr definitions for IPv6
Add the nexthdr definitions for IPv6 which are used by wireless
drivers and were previously placed in an 80211 header file by
accident.

Obtained from:	bz_iwlwifi
Sponsored by:	The FreeBSD Foundation
Reviewed by:	hselasky
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D31321
2021-07-27 15:00:21 +00:00
Bjoern A. Zeeb
366d68f283 LinuxKPI: add module_pci_driver() and pci_alloc_irq_vectors()
Add the two new functions needed by rtw88 to register the driver and
handle the module bits as well as a version of pci_alloc_irq_vectors()
for what is needed.

Reviewed by:	hselasky
MFC after:	10 days
Differential Revision: https://reviews.freebsd.org/D30981
2021-07-27 14:57:23 +00:00
Edward Tomasz Napierala
30c6d98219 linux: implement sigaltstack(2) on arm64
... by making it machine-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31286
2021-07-27 13:34:49 +00:00
Edward Tomasz Napierala
72f7ddb587 linux: implement rt_sigsuspend(2) on arm64
... by making it architecture-independent.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D31259
2021-07-23 20:13:00 +00:00
Dmitry Chagin
75cb2382b8 linux(4): Factor out the futex_wait() op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:40:24 +03:00
Dmitry Chagin
ef4251e271 linux(4): Prevent an endless loop.
In the futex_atomic_op() the encoded_op is a user-supplied parameter.
If the user specifies an incorrect value for this parameter paired with a valid
*uaddr parameter the caller will go into the endless loop. To prevent this check
futex_atomic_op() result and break the loop in case of ENOSYS.

MFC after:		2 weeks
2021-07-20 14:40:08 +03:00
Dmitry Chagin
80b8d6b144 linux(4): Eliminate bogus comment.
There is no need for access checking here, as the caller must take care
of EFAULT handling. Moreover, this check would be superfluous, since EFAULT is
extremely rare, and we prefer the fast path.

MFC after:		2 weeks
2021-07-20 14:39:56 +03:00
Dmitry Chagin
cf8d74e3fe linux(4): Allow musl brand to use FUTEX_REQUEUE op.
Initial patch from submitter was adapted by me to prevent unconditional
FUTEX_REQUEUE use.

PR:			255947
Submitted by:		Philippe Michaud-Boudreault
Differential Revision:	https://reviews.freebsd.org/D30332
2021-07-20 14:39:20 +03:00
Dmitry Chagin
4c361d7a5a linux(4): Factor out the FUTEX_WAKE_OP op into linux_futex_wakeop().
MFC after:		2 weeks
2021-07-20 14:38:44 +03:00
Dmitry Chagin
bb62a91944 linux(4): Factor out the FUTEX_CMP_REQUEUE op into linux_futex_requeue().
MFC after:		2 weeks
2021-07-20 14:38:27 +03:00
Dmitry Chagin
19f7e2c2fb linux(4): Factor out the FUTEX_WAKE op into linux_futex_wake().
MFC after:		2 weeks
2021-07-20 14:38:05 +03:00
Dmitry Chagin
f6b0d275eb linux(4): Factor out the FUTEX_WAIT op into linux_futex_wait().
MFC after:		2 weeks
2021-07-20 14:37:51 +03:00
Dmitry Chagin
1866eef484 linux(4): Refactor the struct linux_futex_args.
Move flags and rtclock to the struct linux_futex_args. This will be used when
I split linux_futex() into separate futex op functions.

MFC after:		2 weeks
2021-07-20 14:37:37 +03:00
Dmitry Chagin
2b38186330 Drop rdivacky@ "All rights reserved" from linux_event.
I got explicit permission from Roman.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30913
MFC after:		2 weeks
2021-07-20 10:06:16 +03:00
Dmitry Chagin
1ca6b15bbd Drop "All rights reserved" from my copyright statements.
Add email and fixup years while here.

Reviewed by:		imp
Differential Revision:	https://reviews.freebsd.org/D30912
MFC after:		2 weeks
2021-07-20 10:05:50 +03:00
Dmitry Chagin
fe7409530c linprocfs: Fixup vDSO name in the procmaps after 9931033bbf.
As sv_shared_page_base now points to the native sharedpage and
the process VA layout has changed as follows:
VDSOPAGE	(2 * PAGE_SIZE)
SHAREDPAGE	(PAGE_SIZE)
USRSTACK
fixup the vDSO name by calculating the start of page relative to the
native sharedpage.

Differential revision:	https://reviews.freebsd.org/D30903
MFC after:		2 weeks
2021-07-20 10:04:20 +03:00
Dmitry Chagin
9931033bbf linux(4): Almost complete the vDSO.
The vDSO (virtual dynamic shared object) is a small shared library that the
kernel maps R/O into the address space of all Linux processes on image
activation. The vDSO is a fully formed ELF image, shared by all processes
with the same ABI, and has no process-private data.

The primary purpose of the vDSO:
- non-executable stack, signal trampolines not copied to the stack;
- signal trampolines unwind, mandatory for the NPTL;
- to avoid context-switch overhead, frequently used system calls can be
  implemented in the vDSO: for now gettimeofday, clock_gettime.

The first two have been implemented, so add the implementation of system
calls.

The system call implementation is based on the native timekeeping code, with
some limitations:
- ifunc can't be used, as the vDSO is mapped r/o into the process VA and rtld
  can't relocate symbols;
- reading HPET memory is not implemented for now (TODO).

In case of any error the vDSO system calls fall back to the kernel system
calls. For unimplemented vDSO system calls, prototypes were added which call
the corresponding kernel system call.

Tested by:		trasz (arm64)
Differential revision:  https://reviews.freebsd.org/D30900
MFC after:              2 weeks
2021-07-20 10:01:18 +03:00
Neel Chauhan
086cfe4df8 linuxkpi: Add spin_trylock_irqsave() macro
This is needed by the drm-kmod 5.6 update.

Reviewed by:		hselasky
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30706
2021-07-15 07:52:42 -07:00
Edward Tomasz Napierala
3eaf271d3c linux(4): Improve comment about SA_RESTORER
No functional changes.

Sponsored By:	EPSRC
2021-07-13 11:13:17 +01:00
Hans Petter Selasky
05f56ac92f LinuxKPI: Force the usleep_range() function to sleep instead of spinning on the timer.
This allows other threads to execute, typically during hardware waiting loops.
This also matches how the function works in Linux.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	NVIDIA Networking
2021-07-10 21:59:31 +02:00
Konstantin Belousov
747a6b7ace cloudabi and linux ABIs: do not call umtx_thread_cleanup() from thr_exit syscall
These ABIs do not use umtx at all, so there is nothing to clean.
Cloudabi references to umtx keys do not require any cleanups anyway.

Requested by:	dchagin
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:14 +03:00
Konstantin Belousov
28a66fc3da Do not call FreeBSD-ABI specific code for all ABIs
Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI.  Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by:	dchagin
Reviewed by:	dchagin, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:07 +03:00
Vladimir Kondratyev
8b33cb8303 LinuxKPI: Implement sequence counters and sequential locks
as a thin wrapper around native version found in sys/seqc.h.
This replaces out-of-base GPLv2-licensed code used by drm-kmod.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D31006
2021-07-05 03:20:55 +03:00
Vladimir Kondratyev
019391bf85 LinuxKPI: Implement strscpy
strscpy copies the src string, or as much of it as fits, into the dst
buffer.  The dst buffer is always NUL terminated, unless it's zero-sized.
strscpy returns the number of characters copied (not including the
trailing NUL) or -E2BIG if len is 0 or src was truncated.

Currently drm-kmod replaces strscpy with strncpy that is not quite
correct as strncpy does not NUL-terminate truncated strings and returns
different values on exit.

Reviewed by:	hselasky, imp, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D31005
2021-07-05 03:20:42 +03:00
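The return-value contract described in 019391bf85 can be illustrated with a small userland sketch. This is not the LinuxKPI code: `sketch_strscpy()` and `sketch_strscpy_demo()` are hypothetical names, and the body is only one way to satisfy the semantics stated in the commit message (always NUL-terminate a non-empty buffer, return the copied length, return -E2BIG on a zero length or truncation).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Userland sketch of the strscpy() contract from the commit message.
 * sketch_strscpy() is a hypothetical name, not the LinuxKPI function.
 */
ssize_t
sketch_strscpy(char *dst, const char *src, size_t len)
{
	size_t i;

	/* A zero-sized destination cannot even hold the NUL. */
	if (len == 0)
		return (-E2BIG);

	/* Copy at most len - 1 characters, leaving room for the NUL. */
	for (i = 0; i < len - 1 && src[i] != '\0'; i++)
		dst[i] = src[i];
	dst[i] = '\0';

	/* Anything left in src means the copy was truncated. */
	if (src[i] != '\0')
		return (-E2BIG);
	return ((ssize_t)i);
}

/* Contrast with strncpy(): a truncated copy is left unterminated. */
void
sketch_strscpy_demo(void)
{
	char a[4], b[4];

	assert(sketch_strscpy(a, "abcdef", sizeof(a)) == -E2BIG);
	assert(a[3] == '\0');

	strncpy(b, "abcdef", sizeof(b));
	assert(b[3] == 'd');
}
```

The demo function shows why a plain strncpy() substitution is not equivalent: the truncated strncpy() result has no terminator, and the caller gets no error indication.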
Vladimir Kondratyev
98a6984a9e LinuxKPI: Use macro for implementation of some dma_map_* functions
This allows removing the unimplemented attrs parameter, whose type differs
between Linux kernel versions, and compiling both drm-kmod and ofed
callers unmodified.
Also convert it to 'unsigned long' type to match modern Linuxes.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30932
2021-07-05 03:20:23 +03:00
Vladimir Kondratyev
864b11007a LinuxKPI: Implement irq_work_sync() routine.
irq_work_sync() performs draining of irq_work task.
Required by drm-kmod.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30818
2021-07-05 03:20:06 +03:00
Vladimir Kondratyev
1ab61a1932 LinuxKPI: Do not wait for a grace period in rcu_barrier()
Linux docs explicitly state that this is not required [1]:

"Important note: The rcu_barrier() function is not, repeat, not,
obligated to wait for a grace period.  It is instead only required to
wait for RCU callbacks that have already been posted.  Therefore, if
there are no RCU callbacks posted anywhere in the system, rcu_barrier()
is within its rights to return immediately.  Even if there are
callbacks posted, rcu_barrier() does not necessarily need to wait for
a grace period."

[1] https://www.kernel.org/doc/Documentation/RCU/Design/Requirements/Requirements.html

Reviewed by:	emaste, hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30809
2021-07-05 03:19:50 +03:00
Vladimir Kondratyev
c0862b2b1f LinuxKPI: Add compiler barriers to list_for_each_entry_lockless macro
so this list-traversal primitive may safely run concurrently with the
_rcu list-mutation primitives such as list_add_rcu() as long as the
traversal is guarded by rcu_read_lock().

Do it by reusing the "list_for_each_entry_rcu" macro which does the same.
On Linux it implements some additional lockdep stuff which we skip.

Also move the macro to linux/rculist.h where it resides on Linux.

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30795
2021-07-05 03:19:35 +03:00
Vladimir Kondratyev
c77ec79b57 LinuxKPI: Change flags parameter type of atomic_dec_and_lock_irqsave
On Linux atomic_dec_and_lock_irqsave is a wrapper macro which provides
a reference to third parameter rather than parameter value itself to
implementation routine called _atomic_dec_and_lock_irqsave [1].

While here, implement a fast path.

[1] https://github.com/torvalds/linux/blob/master/include/linux/spinlock.h#L476

Reviewed by:	hselasky
Differential revision:	https://reviews.freebsd.org/D30781
2021-07-05 03:19:01 +03:00
Vladimir Kondratyev
78a02d8b33 LinuxKPI: Add #defines required by drm-kmod v5.5
Reviewed by:	hselasky, manu
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30767
2021-07-05 03:18:47 +03:00
Vladimir Kondratyev
a2b83b59db LinuxKPI: Allow kmem_cache_free() to be called from critical sections
as it is required by i915kms driver from Linux kernel v 5.5.
This is done with asynchronous freeing of requested memory areas from
taskqueue thread. As memory to be freed is reused to store linked list
entry, backing UMA zone item size is rounded up to pointer size.

While here, make struct linux_kmem_cache private to LKPI to reduce amount
of BSD headers included by linux/slab.h and switch RCU code to usage of
LKPI's linux_irq_work_tq taskqueue to avoid injection of current into
system-wide taskqueue_fast thread context.

Submitted by:	nc (initial version for drm-kmod)
Reviewed by:	manu, nc
Differential revision:	https://reviews.freebsd.org/D30760
2021-07-05 03:18:14 +03:00
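The deferred-free idea in a2b83b59db (reuse the freed item itself as a list link, so the item must be at least pointer-sized) can be sketched in userland as follows. Everything here is an assumption for illustration: the `sketch_*` names are hypothetical, malloc(3) stands in for the UMA zone, a plain singly-linked list stands in for whatever list LinuxKPI uses, and the drain function is called directly instead of from a taskqueue thread. The sketch is single-threaded and assumes a non-zero item size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Link stored inside a freed item while it waits to be released. */
struct sketch_free_entry {
	struct sketch_free_entry *next;
};

struct sketch_cache {
	size_t item_size;			/* rounded item size */
	struct sketch_free_entry *pending;	/* deferred frees */
};

/* Round the item size up so that every item can hold a list link. */
void
sketch_cache_init(struct sketch_cache *c, size_t size)
{
	size_t align = sizeof(struct sketch_free_entry);

	c->item_size = (size + align - 1) / align * align;
	c->pending = NULL;
}

void *
sketch_cache_alloc(struct sketch_cache *c)
{
	return (malloc(c->item_size));
}

/* "Free" without releasing: only push the item onto the pending list. */
void
sketch_cache_free_deferred(struct sketch_cache *c, void *item)
{
	struct sketch_free_entry *e = item;

	assert(item != NULL);
	e->next = c->pending;
	c->pending = e;
}

/* What a deferred task would do later: release everything queued. */
size_t
sketch_cache_drain(struct sketch_cache *c)
{
	struct sketch_free_entry *e;
	size_t n = 0;

	while ((e = c->pending) != NULL) {
		c->pending = e->next;
		free(e);
		n++;
	}
	return (n);
}
```

Because the link lives inside the freed memory, the deferred path needs no allocation of its own, which is what makes it usable from a context that must not sleep.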
Edward Tomasz Napierala
2f514e6f13 linux(4): implement PR_SET_NO_NEW_PRIVS
This makes prctl(2) support PR_SET_NO_NEW_PRIVS, by mapping it
to the native PROC_NO_NEW_PRIVS_CTL procctl(2).

Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30973
2021-07-03 08:42:37 +01:00
Edward Tomasz Napierala
db8d680ebe procctl(2): add PROC_NO_NEW_PRIVS_CTL, PROC_NO_NEW_PRIVS_STATUS
This introduces a new, per-process flag, "NO_NEW_PRIVS", which
is inherited, preserved on exec, and cannot be cleared.  The flag,
when set, makes subsequent execs ignore any SUID and SGID bits,
instead executing those binaries as if they were not set.

The main purpose of the flag is implementation of Linux
PR_SET_NO_NEW_PRIVS prctl(2), and possibly also unprivileged
chroot.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30939
2021-07-01 09:42:07 +01:00
Edward Tomasz Napierala
447636e43c linux(4): implement coredump support
Implement dumping core for Linux binaries on amd64, for both
32- and 64-bit executables.  Some bits are still missing.

This is based on a prototype by chuck@.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30019
2021-06-30 22:45:06 +01:00
Dmitry Chagin
5ca9d41700 LinuxKPI: Rename a short description of the kmalloc type.
To avoid duplication in the vmstat -m output rename the kmalloc type short
description to 'lkpikmalloc' as the Linux emulation layer historically names
its linux malloc type as 'linux'.

Reviewed by:		hselasky, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D30928
MFC after:		2 weeks
2021-06-29 20:20:01 +03:00
Dmitry Chagin
1fd26da926 LinuxKPI: Put compat code under appropriate condition.
Reviewed by:		hselasky, emaste, kib
Differential Revision:	https://reviews.freebsd.org/D30927
MFC after:		2 weeks
2021-06-29 20:19:17 +03:00
Dmitry Chagin
945accf502 LinuxKPI: Use the proper API to determine the ABI of the running process.
Reviewed by:		markj, hselasky, kib
Differential Revision:	https://reviews.freebsd.org/D30924
MFC after:		2 weeks
2021-06-29 20:17:16 +03:00
Edward Tomasz Napierala
435754a59e Add infrastructure required for Linux coredump support
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values.  It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication.  It will be used
for Linux coredumps.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30921
2021-06-29 08:49:12 +01:00
Bjoern A. Zeeb
399da52fff LinuxKPI: firmware, implement deferred loading for "nowait"
Change linuxkpi_request_firmware_nowait() to deferred firmware loading
scheduling a task.  This changes behaviour in some cases that we
return from loading the driver before the driver is finished
initialising if the driver does not deal with it (wait).
This brings the behaviour one would expect from when this function is
called and I implemented it to see if it would help a specific case.

Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Reviewed by:	hselasky, imp (earlier version)
Differential Revision: https://reviews.freebsd.org/D30830
2021-06-28 12:13:43 +00:00
Bjoern A. Zeeb
539228d372 LinuxKPI: pci re-add pci_free_irq_vectors()
Re-add pci_free_irq_vectors() accidentally removed in
d4a4960c65 and now needed by drm-kmod v5.5.

Reported by:	wulf
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
X-MFC with:	d4a4960c65
2021-06-28 12:09:16 +00:00
Dmitry Chagin
c1da89fec2 linux(4): Retire linux_kplatform.
Assuming we can't run on i486, i586 class cpu, retire linux_kplatform var
and use hardcoded 'machine' value in linux_newuname().

I have added linux_kplatform for consistency with linux_platform, which is
placed into the vDSO to avoid an extra copyout of it onto the stack for
AT_PLATFORM at exec time.

This is the first stage of Linuxulator's vdso revision.

Reviewed by:		trasz, imp
Differential Revision:	https://reviews.freebsd.org/D30774
MFC after:		2 weeks
2021-06-22 08:36:21 +03:00
Dmitry Chagin
2eff670fde linux(4): Implement poll system call via linux_common_ppoll()
for the sake of converting events to/from native.

MFC after:	2 weeks
2021-06-22 08:07:46 +03:00
Dmitry Chagin
26795a0378 linux(4): Rework Linux ppoll system call.
For now the Linux emulation layer uses in kernel ppoll(2) without
conversion of user supplied fd 'events', and does not convert the
kernel supplied fd 'revents'.

At least POLLRDHUP is handled by FreeBSD differently than by
Linux. It seems that Linux silently ignores POLLRDHUP on non-socket fds,
unlike FreeBSD, which checks more strictly and fails.

Rework the Linux ppoll, using kern_poll and converting 'events'
and 'revents' values.
While here, move the poll events defines to the MI part of the code as they
are mostly identical on all arches except arm.

Differential Revision:	https://reviews.freebsd.org/D30716
MFC after:		2 weeks
2021-06-22 08:06:05 +03:00
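The 'events'/'revents' conversion that 26795a0378 introduces can be pictured as a table-driven bit translation in both directions. The sketch below is only an illustration under stated assumptions: the `SK_*` constants are hypothetical values chosen so that one flag differs between the two sides, and the function names are not those used in the Linuxulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values, chosen only so that the two sides differ. */
#define SK_LINUX_POLLIN		0x0001
#define SK_LINUX_POLLOUT	0x0004
#define SK_LINUX_POLLRDHUP	0x2000
#define SK_NATIVE_POLLIN	0x0001
#define SK_NATIVE_POLLOUT	0x0004
#define SK_NATIVE_POLLRDHUP	0x4000

static const struct {
	short linux_bit;
	short native_bit;
} sk_pollmap[] = {
	{ SK_LINUX_POLLIN,	SK_NATIVE_POLLIN },
	{ SK_LINUX_POLLOUT,	SK_NATIVE_POLLOUT },
	{ SK_LINUX_POLLRDHUP,	SK_NATIVE_POLLRDHUP },
};
#define SK_NPOLLMAP	(sizeof(sk_pollmap) / sizeof(sk_pollmap[0]))

/* Translate the user supplied 'events' before calling the native poll. */
short
sk_linux_to_native_events(short levents)
{
	short nevents = 0;
	size_t i;

	for (i = 0; i < SK_NPOLLMAP; i++)
		if (levents & sk_pollmap[i].linux_bit)
			nevents |= sk_pollmap[i].native_bit;
	return (nevents);
}

/* Translate the kernel supplied 'revents' back for the Linux caller. */
short
sk_native_to_linux_events(short nevents)
{
	short levents = 0;
	size_t i;

	for (i = 0; i < SK_NPOLLMAP; i++)
		if (nevents & sk_pollmap[i].native_bit)
			levents |= sk_pollmap[i].linux_bit;
	return (levents);
}

/* Round trip: a known Linux mask survives both conversions. */
void
sk_poll_demo(void)
{
	short l = SK_LINUX_POLLIN | SK_LINUX_POLLRDHUP;

	assert(sk_native_to_linux_events(sk_linux_to_native_events(l)) == l);
}
```

Bits that are not in the table are dropped rather than passed through, so a flag with a different numeric value on the other side cannot be misread as something else.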
Bjoern A. Zeeb
5f88df77a6 LinuxKPI: fix build after d4a4960c65 pci: "pcim" (managed) support
Fix a last minute change from d4a4960c65
based on review feedback in where a function now gets called before
it is declared which did not fully get merged back to my commit branch.

Noticed by:	CI, jkim
MFC after:	10 days
X-MFC with:	d4a4960c65
Sponsored-by:	The FreeBSD Foundation
2021-06-18 22:49:12 +00:00
Bjoern A. Zeeb
46ae23a402 LinuxKPI: avoid userret: Returning with with pinned thread
Some code manually calls local_bh_disable() and spin_lock() but
then calls spin_unlock_bh() (or vice versa).
Our code then calls local_bh_disable() again from spin_lock()
which means we have the thread pin count increased twice and that
means we get out of synch and are still pinned when returning to
user space.

Avoid this by adding the explicit local_bh_{enable,disable}() to
the spin_[un]lock_bh() versions.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30711
2021-06-18 21:20:10 +00:00
Bjoern A. Zeeb
edfcdffefc LinuxKPI: fix sg_pcopy_from_buffer()
sg_pcopy_from_buffer() has an error in that skip can underflow
and lead to bogus page arithmetic, which may lead to memory corruption
or, more likely, panics.  Once we have found an s/g page to copy into there
is nothing left to skip, so simply set skip to 0.

Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30676
2021-06-18 21:20:10 +00:00
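The skip handling fixed in edfcdffefc can be illustrated with a simplified userland model. This is a sketch under assumptions, not the LinuxKPI routine: `struct sketch_seg` is a hypothetical stand-in for a scatter/gather entry, and plain byte buffers replace pages. The point it shows is the one from the commit message: once the first segment to copy into has been found, skip is reset to 0 instead of being decremented further.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one scatter/gather segment. */
struct sketch_seg {
	char	*base;
	size_t	 len;
};

/*
 * Copy buflen bytes from buf into the segment list, starting skip bytes
 * into the list.  Returns the number of bytes copied.
 */
size_t
sketch_pcopy_from_buffer(struct sketch_seg *segs, size_t nsegs,
    const char *buf, size_t buflen, size_t skip)
{
	size_t copied = 0, i, n;

	assert(segs != NULL && buf != NULL);
	for (i = 0; i < nsegs && copied < buflen; i++) {
		/* Segment lies entirely inside the skipped prefix. */
		if (skip >= segs[i].len) {
			skip -= segs[i].len;
			continue;
		}
		n = segs[i].len - skip;
		if (n > buflen - copied)
			n = buflen - copied;
		memcpy(segs[i].base + skip, buf + copied, n);
		copied += n;
		/* First target found: nothing is left to skip. */
		skip = 0;
	}
	return (copied);
}
```

Since skip is an unsigned size, subtracting a segment length from it after the first copy would wrap around to a huge offset, which is the kind of underflow the commit describes.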
Bjoern A. Zeeb
d4a4960c65 LinuxKPI: pci: cleanup some code and add support for "pcim" (managed)
Restructure some code and add support for various "managed" versions
for PCI resource management.
This goes beyond what iwlwifi needs but some was found with other
wireless drivers and it mostly all goes together.
Add one FreeBSD specific feature returning the resource rather than
the handle to allow us to use bus_*() functions in drivers directly.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30558
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
c3518147ce LinuxKPI: fix pci device devres initialisation
Given we are manually setting up the "device" in PCI in some cases,
we need to initialise the list and lock for device devres here as well
as otherwise we will panic on the uninitialised lock.

Sponsored by:	The FreeBSD Foundation
MFC after:	5 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30681
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
d16b6cb178 LinuxKPI: enhance the irq KPI for managed and threaded operations.
Move request_irq() to an internal function which serves request_irq()
and the newly added request_threaded_irq() and devm_request_threaded_irq().
Likewise factor out parts of free_irq() to also be used with
devm_free_irq().  Add the storage and call to a thread_handler in case
of IRQ_WAKE_THREAD.
This is needed for the iwlwifi driver.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30549
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
8e106c5230 LinuxKPI: extend pci.h by various functions for wireless driver
Add dummy functions for dealing with "HotPlug" events which we currently
do not support.

Add pci_dev_get(), pci_find_ext_capability() and pci_pme_capable().

The added pcie_find_root_port() is a bit special as we need to create
another linux pci device;  for that make lkpinew_pci_dev() public
which is also helpful for other cases when we want to use the Linux
routines to check for device identifiers only and need a container
for the "bsddev" to use natively.  This has proven to avoid basic
checking code for the sake of rewriting it to native field names
elsewhere.  Given we cache the newly created "root" we also need to
make sure we clean it up.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30521
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
2afeed13b5 LinuxKPI: add dmam_pool_create() support
dmam_pool_create() is a "managed" version of dma_pool_create() which
will cleanup everything left when the device goes away using the
devres framework.  For that add an internal cleanup function to be
called from devres release.
This is used by at least one wireless driver.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30520
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
644b4f1176 LinuxKPI: add device_reprobe() and device_release_driver()
Add two new (though untested) functions to linux/device.h which are
dealing with manually managing the device/driver and are used by
at least one wireless driver.  We may have to refine them in the
future.
Move the devres declarations further up so they can be used earlier
in the file.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	imp
Differential Revision: https://reviews.freebsd.org/D30519
2021-06-18 21:20:09 +00:00
Bjoern A. Zeeb
801cf532e7 LinuxKPI: add KPI for netdev_notifier_info returning ifp
While currently the ifp gets cast to a net_device and then returned
and consumers are expecting an ifp again, allow parallel usage now and
in the future by extending and also passing the ifp directly back in
the netdev_notifier_info.  Add a function to return the ifp instead of
the net_device.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Suggested by:	hselasky
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30522
2021-06-18 17:55:30 +00:00
Edward Tomasz Napierala
9d167945e8 linux: improve reporting for unsupported syscall flags
Filter out the flags we do support; previously we would print
out the flag value verbatim.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30693
2021-06-15 10:18:18 +01:00
Neel Chauhan
b47f461c8e linuxkpi: Add list_for_each_entry_lockless() macro
This is needed by the drm-kmod 5.7 update.

Approved by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30708
2021-06-10 08:15:29 -07:00
Dmitry Chagin
ed61e0ce1d linux(4): Implement ppoll_time64 system call.
MFC after:	2 weeks
2021-06-10 15:18:46 +03:00
Dmitry Chagin
f6d075ecd7 linux(4): Implement pselect6_time64 system call.
MFC after:	2 weeks
2021-06-10 15:03:30 +03:00
Dmitry Chagin
db4a1f331b linux(4): Implement rt_sigtimedwait_time64 system call.
It still does not work as intended, awaits D30675.

MFC after:	2 weeks
2021-06-10 14:51:30 +03:00
Dmitry Chagin
2e46d0c3d9 linux(4): Implement futex_time64 system call.
MFC after:	2 weeks
2021-06-10 14:27:06 +03:00
Dmitry Chagin
25b09d6f39 linux(4): Prevent integer overflow in futex_requeue.
To prevent a signed integer overflow in futex_requeue add a sanity check
to catch negative values of nrwake or nrrequeue.

MFC after:	2 weeks
2021-06-10 14:23:11 +03:00
Greg V
597cc550e7 LinuxKPI: add fault_flag_allow_retry_first
Used by drm 5.7.

Reviewed by:	bz, hselasky, nc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30673
2021-06-09 19:11:41 -04:00
Neel Chauhan
8a1a42b2a7 linuxkpi: Add macros for might_lock_nested() and lockdep_(re/un/)pin_lock()
In Linux, these are macros applied to locks in the kernel for scheduling purposes.
But as with other macros in this header, we aren't doing anything with them
so we are doing `do {} while (0)` for now.

This is needed by the drm-kmod 5.7 update.

Approved by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30710
2021-06-09 14:41:44 -07:00
Neel Chauhan
fee0d486ef linuxkpi: Add _RET_IP_ macro in kernel.h
This is needed by the drm-kmod 5.7 update.

Approved by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30707
2021-06-09 14:41:44 -07:00
Philippe Michaud-Boudreault
2362ad457a linux: implement statx(2)
PR:		252106
Reviewed By:	dchagin
Differential Revision:	https://reviews.freebsd.org/D30466
2021-06-08 10:08:56 +01:00
Neel Chauhan
1b602f641a linuxkpi: Fix build from redefined pr_err_once()
2021-06-07 16:37:21 -07:00
Neel Chauhan
37d64dcdfa linuxkpi: Include pr_err_once() in printk.h
Approved by:		bz (src), hselasky (src)
Differential Revision:	https://reviews.freebsd.org/D30687
2021-06-07 15:53:24 -07:00
Neel Chauhan
096104e790 linuxkpi: Add rom and romlen to struct pci_dev
Approved by:		bz (src), hselasky (src)
Differential Revision:	https://reviews.freebsd.org/D30686
2021-06-07 15:53:24 -07:00
Greg V
05c2d94a08 LinuxKPI: add pr_err_once
Reviewed by:	hselasky, emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30672
2021-06-07 10:31:48 -04:00
Edward Tomasz Napierala
128a1db806 linux: improve FUSE support
This fixes a number of AppImages; tested with
scribus-1.5.6.1-linux-x86_64.AppImage.

Reported By:	@probonopd
Reviewed By:	asomers, emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30606
2021-06-07 10:43:28 +01:00
Konstantin Belousov
62b8258a7e Change the return type of sv__setid_allowed from bool to int
to please some userspace code using sys/sysent.h.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-06-06 23:38:48 +03:00
Konstantin Belousov
598f6fb49c linuxolator: Add compat.linux.setid_allowed knob
PR:	21463
Reported by:	kris
Reviewed by:	dchagin
Tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D28154
2021-06-06 21:43:00 +03:00
Dmitry Chagin
66e73ce737 linux(4): Fix clock_nanosleep return value for unsupported clockid.
The Linux clock_nanosleep() returns ENOTSUP for CLOCK_THREAD_CPUTIME_ID.
This silences one of the LTP clock_nanosleep tests.

MFC after:	2 weeks
2021-06-07 06:22:25 +03:00
Dmitry Chagin
f4e801085b linux(4): optimize ksiginfo to siginfo conversion.
Retire ksiginfo_to_lsiginfo function, use siginfo_to_lsiginfo instead.
Convert rt_sigtimedwait siginfo variables to well known names.

MFC after:	2 weeks
2021-06-07 06:06:17 +03:00
Dmitry Chagin
9c1045ff00 linux(4): Properly convert linux siginfo to native siginfo
add input validation.

MFC after:	2 weeks
2021-06-07 05:55:34 +03:00
Dmitry Chagin
0f8dab4540 linux(4): Fix timeout parameter of rt_sigtimedwait syscall, which is
timespec not a timeval.

MFC after:	2 weeks
2021-06-07 05:35:35 +03:00
Dmitry Chagin
6501370a7d linux(4): Implement clock_nanosleep_time64 system call.
MFC after:	2 weeks
2021-06-07 05:26:48 +03:00
Dmitry Chagin
187715a420 linux(4): Implement clock_getres_time64 system call.
MFC after:	2 weeks
2021-06-07 05:21:32 +03:00
Dmitry Chagin
19f9a0e4df linux(4): Implement clock_settime64 system call.
MFC after:	2 weeks
2021-06-07 05:11:25 +03:00
Dmitry Chagin
99b6f43069 linux(4): Implement clock_gettime64 system call.
MFC after:	2 weeks
2021-06-07 05:04:42 +03:00
Dmitry Chagin
e4bffb80bb linux(4): Implement utimensat_time64 system call.
MFC after:	2 weeks
2021-06-07 04:54:30 +03:00
Dmitry Chagin
bfcce1a9f6 linux(4): add struct timespec64 definition and conversion routine for
future use.

MFC after:		2 weeks
2021-06-07 04:47:12 +03:00
Bjoern A. Zeeb
b5d37e5a20 net80211/LinuxKPI: add more radiotap definitions
Add more radiotap definitions based on "names" found in actual drivers
and based on documentation from radiotap.org (where avail).

Leave one specific "duplicate" in the LinuxKPI implementation but
otherwise manage it all in net80211.

Sponsored by:	The FreeBSD Foundation
MFC after:	10 days
Reviewed by:	hselasky, adrian, sam
Differential Revision: https://reviews.freebsd.org/D30641
2021-06-05 16:21:49 +00:00
Dmitry Chagin
2a0fa277f6 linux(4): Microoptimize futimesat, utimes, utime.
While here wrap long line.

Differential Revision:	https://reviews.freebsd.org/D30488
MFC after:		2 weeks
2021-05-31 22:54:18 +03:00
Dmitry Chagin
b4f9b6eef2 linux(4): Handle AT_EMPTY_PATH in the utimensat syscall.
Differential Revision:	https://reviews.freebsd.org/D30518
MFC after:		2 weeks
2021-05-31 22:37:06 +03:00
Dmitry Chagin
8505eb5dd8 linux(4): Convert flags before use in utimensat.
Differential Revision:	https://reviews.freebsd.org/D30487
MFC after:		2 weeks
2021-05-31 22:30:37 +03:00
Dmitry Chagin
a06c12464b linux(4): Add F_GETPIPE_SZ fcntl operation which returns the capacity
of the pipe referred by fd.

Differential Revision:	https://reviews.freebsd.org/D30517
MFC after:		2 weeks
2021-05-31 22:15:02 +03:00
Edward Tomasz Napierala
83043a741d linux: deduplicate DUMMY() entries
No functional changes.

Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30524
2021-05-29 17:51:36 +00:00
Edward Tomasz Napierala
6d926e850d linux: add new syscall numbers
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30193
2021-05-28 09:02:16 +01:00
Bjoern A. Zeeb
4cc8a9da49 LinuxKPI: add HWEIGHT32()
Add HWEIGHT32() macro needed by iwlwifi and while here add the 8/16/64
variants likewise.

Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30501
2021-05-27 13:38:30 +00:00
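A compile-time population count of the kind 4cc8a9da49 adds can be built by summing individual bits and composing the wider variants from the narrower ones. The macros below are a sketch, not the LinuxKPI definitions: the `SKETCH_*` names are hypothetical, and the casts to fixed-width unsigned types are my own choice to keep the shifts well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Sum the low eight bits one by one; usable in constant expressions. */
#define SKETCH_HWEIGHT8(w)						\
	((unsigned)((((w) >> 0) & 1) + (((w) >> 1) & 1) +		\
	(((w) >> 2) & 1) + (((w) >> 3) & 1) +				\
	(((w) >> 4) & 1) + (((w) >> 5) & 1) +				\
	(((w) >> 6) & 1) + (((w) >> 7) & 1)))

/* Wider variants add the weights of the lower and upper halves. */
#define SKETCH_HWEIGHT16(w)						\
	(SKETCH_HWEIGHT8((uint16_t)(w)) +				\
	SKETCH_HWEIGHT8((uint16_t)(w) >> 8))
#define SKETCH_HWEIGHT32(w)						\
	(SKETCH_HWEIGHT16((uint32_t)(w)) +				\
	SKETCH_HWEIGHT16((uint32_t)(w) >> 16))
#define SKETCH_HWEIGHT64(w)						\
	(SKETCH_HWEIGHT32((uint64_t)(w)) +				\
	SKETCH_HWEIGHT32((uint64_t)(w) >> 32))

/* The result is an integer constant expression for constant input. */
_Static_assert(SKETCH_HWEIGHT32(0xf0f0f0f0u) == 16, "constant popcount");

/* Run-time reference: clear the lowest set bit until none remain. */
unsigned
sketch_hweight32_loop(uint32_t w)
{
	unsigned n = 0;

	for (; w != 0; w &= w - 1)
		n++;
	return (n);
}

void
sketch_hweight_demo(void)
{
	assert(SKETCH_HWEIGHT32(0x12345678u) ==
	    sketch_hweight32_loop(0x12345678u));
}
```

The macro form evaluates its argument many times, so it is only suitable for constants or side-effect-free expressions; for run-time values a function such as the loop above is the safer choice.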
Bjoern A. Zeeb
9b6835f3ab LinuxKPI: netdevice.h remove more ifnet operating macros
Now that mlx4 and ofed either are operating on ifnet functions
directly or have a private copy of these macros, we can remove them
from linux/netdevice.h.
With this only the #define for net_device to ifnet is left.

Sponsored by:	The FreeBSD Foundation
MFC after:	12 days
Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D30478
2021-05-27 12:26:01 +00:00
Bjoern A. Zeeb
c35034b338 LinuxKPI/OFED/mlx4: cleanup netdevice.h some more
This removes all unused bits from linux/netdevice.h and migrates two
inline functions into the mlx4 and ofed code respectively.

This gets the mlx4/ofed (struct ifnet) specific bits down to 7 lines
in netdevice.h.

Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
Reviewed by:	hselasky, kib
Differential Revision: https://reviews.freebsd.org/D30461
2021-05-26 12:30:02 +00:00
Dmitry Chagin
5184e2da41 linux_common: retire extra module version.
The second 'linuxcommon' line was added by c66f5b079d
but Linuxulator's modules depend on 'linux_common'.
To avoid such mistakes in the future rename moduledata name and module
name to 'linux_common' and retire 'linuxcommon' line.

Reviewed by:		emaste
Differential Revision:	https://reviews.freebsd.org/D30409
MFC after:		2 weeks
2021-05-26 08:34:32 +03:00
Bjoern A. Zeeb
095f018e49 LinuxKPI: add addrconf_addr_solict_mult()
Introduce net/addrconf.h with an implementation to
addrconf_addr_solict_mult() used by WiFi drivers.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30416
2021-05-25 18:01:49 +00:00
Bjoern A. Zeeb
32f753f270 LinuxKPI: add Exponentially Weighted Moving Average implementation
Add DECLARE_EWMA() which expands to a per-name EWMA implementation
as used by multiple wireless drivers.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky, cperciva, dwmalone
Differential Revision: https://reviews.freebsd.org/D30415
2021-05-25 18:01:48 +00:00
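What "expands to a per-name EWMA implementation" means in 32f753f270 can be sketched with a macro that generates a struct and three helpers per name. This is a simplified illustration, not the LinuxKPI macro: `SKETCH_DECLARE_EWMA` and the generated names are hypothetical, and the third parameter is taken directly as a shift (the new sample gets weight 1 / 2^weight_shift), which is an assumption made to keep the arithmetic short.

```c
#include <assert.h>

/*
 * SKETCH_DECLARE_EWMA(name, precision, weight_shift) generates a
 * fixed-point EWMA type and helpers.  'precision' is the number of
 * fractional bits kept in the internal state.
 */
#define SKETCH_DECLARE_EWMA(name, precision, weight_shift)		\
struct sketch_ewma_##name {						\
	unsigned long internal;						\
};									\
static inline void							\
sketch_ewma_##name##_init(struct sketch_ewma_##name *e)			\
{									\
	e->internal = 0;						\
}									\
static inline unsigned long						\
sketch_ewma_##name##_read(const struct sketch_ewma_##name *e)		\
{									\
	return (e->internal >> (precision));				\
}									\
static inline void							\
sketch_ewma_##name##_add(struct sketch_ewma_##name *e,			\
    unsigned long val)							\
{									\
	unsigned long cur = e->internal;				\
									\
	if (cur == 0)							\
		e->internal = val << (precision);			\
	else								\
		e->internal = ((cur << (weight_shift)) - cur +		\
		    (val << (precision))) >> (weight_shift);		\
}

/* One instance: 10 fractional bits, new samples weighted 1/8. */
SKETCH_DECLARE_EWMA(rssi, 10, 3)

unsigned long
sketch_ewma_demo(void)
{
	struct sketch_ewma_rssi avg;

	sketch_ewma_rssi_init(&avg);
	sketch_ewma_rssi_add(&avg, 100);
	/* The first sample seeds the average directly. */
	assert(sketch_ewma_rssi_read(&avg) == 100);
	sketch_ewma_rssi_add(&avg, 200);
	/* 100 * 7/8 + 200 * 1/8 = 112.5, truncated on read. */
	return (sketch_ewma_rssi_read(&avg));
}
```

Generating the type per name lets each user pick its own precision and weight at compile time while keeping the helpers type-safe.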
Bjoern A. Zeeb
f4a145b136 LinuxKPI: add linux/bsearch.h for sort(9)
Add linux/bsearch.h which only includes libkern.h as the sort(9)
functions seem to be compatible.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30417
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
5aeeab54b0 LinuxKPI: byteorder.h
Add a few more le<n>_{to,add}_cpu*() #defines/functions found in
wireless drivers.  While here fill most of the combinatorics gaps
and also add the remaining combinations [1].

Suggested by:	emaste [1] (for one part)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30418
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
e21652c13c LinuxKPI: cache.h add SMP_CACHE_BYTES
Add a definition for SMP_CACHE_BYTES and while here include sys/param.h
for CACHE_LINE_SIZE as otherwise code might not compile standalone.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30419
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
da717031c9 LinuxKPI: compiler.h add three more defines
Add fallthrough, ____cacheline_aligned_in_smp, and smp_mb() to
linux/compiler.h.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30420
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
5fce802722 LinuxKPI: add cpu.h for cpumask_*()
Add linux/cpu.h for cpumask_*() functions found in wireless drivers
and make sure cpu_online_mask is always initialised.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30421
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
29923fea03 LinuxKPI: add devcoredump.h
Add linux/devcoredump.h with stub implementation of dev_coredumpv()
and dev_coredumpsg() which only free the passed in SG table as needed
for iwlwifi.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30423
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
e7a0b68540 LinuxKPI: add dev_crit() to linux/device.h
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	emaste, hselasky
Differential Revision: https://reviews.freebsd.org/D30424
2021-05-25 18:01:48 +00:00
Bjoern A. Zeeb
834227ba6e LinuxKPI: add ether_addr_equal_unaligned()
Replace the implementation for ether_addr_equal() with
ether_addr_equal_unaligned() and add a define for ether_addr_equal()
pointing to the now ether_addr_equal_unaligned() implementation.
This way ether_addr_equal_unaligned() cannot be broken by accident [1].

Suggested by:	emaste [1]
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30425
2021-05-25 18:01:47 +00:00
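The arrangement described in 834227ba6e, one alignment-agnostic implementation with the other name defined on top of it, might look roughly like the sketch below. The `sketch_*` names are hypothetical and the byte-wise memcmp(3) comparison is an assumption about how an implementation that makes no alignment demands could be written; it is not taken from the LinuxKPI source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN	6

/* Byte-wise comparison: no alignment requirement on either address. */
bool
sketch_ether_addr_equal_unaligned(const uint8_t *a, const uint8_t *b)
{
	assert(a != NULL && b != NULL);
	return (memcmp(a, b, SKETCH_ETH_ALEN) == 0);
}

/* The "aligned" name simply points at the single implementation. */
#define sketch_ether_addr_equal(a, b)					\
	sketch_ether_addr_equal_unaligned((a), (b))
```

With only one function body there is no second, alignment-dependent code path that could later drift out of sync with the first.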
Bjoern A. Zeeb
ff09f9133f LinuxKPI: net/if_inet6.h add struct inet6_dev { }
Add a dummy struct inet6_dev {}; to net/if_inet6.h.  This is currently
not used for anything but in a declaration.  Just needs to be there.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30426
2021-05-25 18:01:47 +00:00
Bjoern A. Zeeb
602e4e433d LinuxKPI: add irq_set_affinity_hint()
Add an implementation for irq_set_affinity_hint() to linux/interrupt.h
and include linux/hardirq.h for synchronize_irq() as needed by
wireless drivers.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30427
2021-05-25 18:01:47 +00:00
Bjoern A. Zeeb
b26fb63f2b LinuxKPI: add linux/{ip,tcp,udp}.h
Add header files for struct and accessors for IPv4, UDP, and TCP.
Only parts of the fields of the structs have been seen while working
on wireless drivers.  The remaining field names are filled up with
the FreeBSD field names for now.  If you have insights into their
correct naming in Linux, feel free to adjust.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30428
2021-05-25 18:01:47 +00:00
Bjoern A. Zeeb
762efb2d6d LinuxKPI: ipv6.h add missing #include
Include linux/bitops.h for a definition of BITS_PER_LONG so that this
file can be used independently.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30429
2021-05-25 18:01:47 +00:00
Bjoern A. Zeeb
8620fe4c10 LinuxKPI: add time_is_after_jiffies() definition
This is used by wireless drivers.  Use the time_after() macro as
done for the "after_eq" version.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30430
2021-05-25 18:01:47 +00:00
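Note on 8620fe4c10: a rough userspace sketch of how a time_is_after_jiffies() definition can be layered on time_after(). The `jiffies` variable here is a hypothetical stand-in for the kernel tick counter, and the macro bodies follow the common wraparound-safe idiom rather than being copied from the commit.

```c
/* Hypothetical stand-in for the kernel tick counter. */
static unsigned long jiffies;

/*
 * True if a is later than b.  The subtraction is done on unsigned
 * values and only then interpreted as signed, so the comparison
 * keeps working when the counter wraps around.
 */
#define	time_after(a, b)	((long)((b) - (a)) < 0)

/* True while the deadline a still lies in the future. */
#define	time_is_after_jiffies(a)	time_after(a, jiffies)
```

A deadline computed as `jiffies + timeout` therefore stays valid even if the addition wraps past the maximum counter value.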
Bjoern A. Zeeb
1082490cd8 LinuxKPI: change BUILD_BUG_ON()
BUILD_BUG_ON() can be used inside functions where the definition to
CTASSERT() (_Static_assert()) seems to not work.
Go back to an old-style CTASSERT() implementation but also add a
variable declaration to avoid "unused typedef" errors and dummy-use
the variable to avoid "unused variable" errors.  Given it is all
self-contained in a block and not used outside this should be
optimised away.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30431
2021-05-25 18:01:47 +00:00
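Note on 1082490cd8: the old-style compile-time assertion described there can be sketched like this. The macro below is an illustration under assumptions (the identifier names and the do/while wrapper are chosen for the sketch and are not taken from the commit); it shows the negative-array-size typedef plus the dummy variable that silences the unused warnings.

```c
#include <stdint.h>

/*
 * If x is true the array type gets a negative size and compilation
 * fails.  The variable and its (void) use only exist to avoid
 * "unused typedef" and "unused variable" diagnostics; everything is
 * confined to the block.
 */
#define	BUILD_BUG_ON(x) do {						\
	typedef char build_bug_on_t[(x) ? -1 : 1];			\
	build_bug_on_t build_bug_on_v;					\
	(void)build_bug_on_v;						\
} while (0)

/* Hypothetical wire header used only to demonstrate the macro. */
struct wire_hdr {
	uint8_t		type;
	uint8_t		len;
	uint16_t	seq;
};

static int
wire_hdr_size_ok(void)
{
	/* Usable inside a function body, unlike a file-scope assertion. */
	BUILD_BUG_ON(sizeof(struct wire_hdr) != 4);
	return (1);
}
```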
Bjoern A. Zeeb
18d303b05f LinuxKPI: add ktime_get_boottime_ns() implementation to ktime.h
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30432
2021-05-25 18:01:47 +00:00
Bjoern A. Zeeb
c1661d59e6 LinuxKPI: add LINUXKPI_PARAM_charp()
Add yet another version of the various module_param_named() use cases.
This one deals with "charp".

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30433
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
fc1d840901 LinuxKPI: add more #defines to pci.h
Add more definitions for various PCI uses to linux/pci.h.  Almost all
are defined to their FreeBSD counterparts which are described there.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30434
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
10096cb606 LinuxKPI: add prandom_u32() as used by wireless drivers.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30435
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
fa58da02f7 LinuxKPI: add rcu_dereference_check()
Add a define for rcu_dereference_check() to rcu_dereference_protected()
which ignores the check argument.  Our lockdep compat implementation
for use cases found in iwlwifi would return 1 anyway.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30436
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
abcac97f82 LinuxKPI: add kfree_sensitive() using zfree().
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30437
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
43b4c00643 LinuxKPI: extract stringify() in their own header file
Add linux/stringify.h as directly included by drivers.  Remove the
definitions from compiler.h and include the new header in places
where the stringify macros are already used without linuxkpi.

I have adjusted the Copyright of the new file according to the commit
originally adding the macros (99e690772a).

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30440
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
5878c7c7b0 LinuxKPI: add kernel_ulong_t typedef in linux/kernel.h.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30438
2021-05-25 18:01:46 +00:00
Bjoern A. Zeeb
cae1683120 LinuxKPI: add guid_t for ACPI consumers.
Add a placeholder struct for guid_t which is needed by ACPI consumers
in at least one wireless driver.

Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Reviewed by:	hselasky
Differential Revision: https://reviews.freebsd.org/D30439
2021-05-25 18:01:46 +00:00
Hans Petter Selasky
b764a42653 There is a window where threads are removed from the process list and where
the thread destructor is invoked. Catch that window by waiting for all
task_struct allocations to be returned before freeing the UMA zone in the
LinuxKPI. Else UMA may fail to release the zone due to concurrent access
and panic:

panic() - Bad link element prev->next != elm
zone_release()
bucket_drain()
bucket_free()
zone_dtor()
zone_free_item()
uma_zdestroy()
linux_current_uninit()

This failure can be triggered by loading and unloading the LinuxKPI module
in a loop:

while true
do
kldload linuxkpi
kldunload linuxkpi
done

Discussed with:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 13:18:41 +02:00
Hans Petter Selasky
209d4919c5 Make sure all tasklets are drained before unloading the LinuxKPI.
Else use-after-free may happen.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-21 11:21:32 +02:00
Ed Maste
2c9764f36b regen syscall files after d51198d63b63 2021-05-13 14:09:58 -04:00
Hans Petter Selasky
b8f113cab9 Implement cdev_device_add() and cdev_device_del() in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:23 +02:00
Hans Petter Selasky
67807f5066 cdev_del() should only put its kernel object in the LinuxKPI.
The destructor takes care of the rest.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:23 +02:00
Hans Petter Selasky
904390b478 Implement read-only VM_SHARED flag in the LinuxKPI.
For use by mmap(2) callbacks.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-05-11 21:00:14 +02:00
Edward Tomasz Napierala
5e8caee259 linux: remove redundant SDT tracepoints
Remove all the 'entry' and 'return' probes; they clutter up the source
and are redundant to FBT.

Reviewed By:	dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30040
2021-05-05 13:59:00 +01:00
Edward Tomasz Napierala
ee384b229d linux(4): make linkat(2) handle AT_EMPTY_PATH
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29974
2021-05-04 13:09:46 +01:00
Konstantin Belousov
87a64872cd Add ptrace(PT_COREDUMP)
It writes the core of live stopped process to the file descriptor
provided as an argument.

Based on the initial version from https://reviews.freebsd.org/D29691,
submitted by Michał Górny <mgorny@gentoo.org>.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29955
2021-05-03 19:18:26 +03:00
Neel Chauhan
9781105bea linuxkpi: Introduce tasklet_disable_nosync()
This is needed for the drm-kmod 5.5 update.

Reviewed by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30024
2021-04-28 08:05:57 -07:00
Neel Chauhan
efe7f12cd3 linuxkpi: Implement rcu_replace_pointer() macro
This is needed for the drm-kmod 5.5 update.

Reviewed by:		hselasky (src)
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D30025
2021-04-28 08:04:52 -07:00
Neel Chauhan
e657f3de6d linuxkpi: Remove unneeded {} in atomic_dec_and_lock_irqsave() 2021-04-26 08:25:33 -07:00
Neel Chauhan
c8de6e2015 linuxkpi: Eliminate brackets on return in spinlock.h 2021-04-26 08:16:48 -07:00
Neel Chauhan
ce65353ac1 linuxkpi: Implement atomic_dec_and_lock_irqsave()
This is needed by the drm-kmod 5.5 update.

Reviewed by:		hselasky, manu
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D29988
2021-04-26 08:15:49 -07:00
Neel Chauhan
057f145aae linuxkpi: Implement the wait_event_interruptible macro
This is needed by the drm-kmod 5.5 update and is similar in logic to the
existing wait_event_killable macro.

Reviewed by:		hselasky, manu
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D29987
2021-04-26 08:12:18 -07:00
Edward Tomasz Napierala
5d1d844a77 kern_linkat: modify to accept AT_ flags instead of FOLLOW/NOFOLLOW
This makes this API match other kern_xxxat() functions.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29776
2021-04-25 14:13:12 +01:00
Konstantin Belousov
fad437ba61 linuxkpi: reduce number of stray mm_struct allocations
Only allocate mm_struct after we checked that other threads do not carry
useful mm_struct.  If they don't, drop process lock, allocate, and recheck.

Note that for M_NOWAIT allocations we could avoid dropping process lock,
but I do not think that this increased complexity is useful.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:08 +03:00
Konstantin Belousov
165ba13fb8 linuxkpi: guarantee allocations of task and mm for interrupt threads
Create and use zones for task and mm.  Reserve items in zones based on the
estimation of the max number of interrupts in the system.  Use M_USE_RESERVE
to allow taking reserved items when allocation occurs from the interrupt
thread context.

Of course, this would only work first time we allocate the task for
interrupt thread. If interrupt is deallocated and allocated anew,
creating a new thread, it might be that zone is depleted. It still
should be good enough for practical uses.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:08 +03:00
Konstantin Belousov
4ce1f6162e linuxkpi: some style, wrap too long lines
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-19 21:34:07 +03:00
Edward Tomasz Napierala
156da725d3 linux(4): bump osrelease to 4.4.0.
This is required for the current Arch Linux binaries to work.

PR:		254112
Reviewed By:	emaste
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29218
2021-04-19 11:37:58 +01:00
Edward Tomasz Napierala
e47823b831 linux: support AT_EMPTY_PATH flag in fchownat(2)
This fixes rsyslog package installation scripts in Bionic.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29108
2021-04-16 16:27:20 +01:00
Edward Tomasz Napierala
4b45c2bb83 linux: make fstatat(2) handle AT_EMPTY_PATH
Without it, Qt5 apps from Focal fail to start, being unable to load
their plugins.  It's also necessary for glibc 2.33, as found in recent
Arch snapshots.

PR:		254112
Reviewed By:	kib
Sponsored by:	The FreeBSD Foundation, EPSRC
Differential Revision:	https://reviews.freebsd.org/D28192
2021-04-16 08:56:19 +01:00
Edward Tomasz Napierala
1663120ae4 linux: implement O_PATH
Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29773
2021-04-15 15:30:59 +01:00
Edward Tomasz Napierala
1b11173c00 linux: extend the LINUX_O_ constants to make room for O_PATH
No functional changes.

Sponsored By:	EPSRC
2021-04-15 15:04:44 +01:00
Edward Tomasz Napierala
ca6e1fa3ce linux: adjust ordering of Linux auxv and add dummy AT_HWCAP2
This should be a no-op; the purpose of this is to reduce
a spurious difference between Linuxulator and Linux, to make
debugging core dumps slightly easier.

Note that AT_HWCAP2 we pass to Linux binaries is always 0,
instead of being equal to 'cpu_feature2'.  This matches what
I've observed under Ubuntu Focal VM.

Reviewed By:	chuck, dchagin
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D29609
2021-04-13 13:14:30 +01:00
Mark Johnston
3f322b22e0 linuxkpi: Fix pcie_set_readrq()
We were passing a LinuxKPI struct device * to a pci(4) function that
expects a device_t.

Reviewed by:	manu, hselasky, bz
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29675
2021-04-12 09:32:21 -04:00
Konstantin Belousov
5b3b19db73 linuxkpi: remove erroneously committed diff save file
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:42:13 +03:00
Konstantin Belousov
8011fb795b linuxkpi: drop single-use variable
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:38:29 +03:00
Konstantin Belousov
f6b108837e linuxkpi: avoid counting per-thread use for the embedded linux cdevs
The counter is not used to control destroy.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:38:29 +03:00
Konstantin Belousov
7f9867f8c6 linuxkpi: do not destroy/free embedded linux cdevs
They have their own lifetime managed by the containing objects.
Premature and unexpected free causes corruption.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:38:29 +03:00
Konstantin Belousov
28b482e2ba linuxkpi: rename cdev to ldev
the variables hold pointers to a linux_cdev, not to a FreeBSD cdev.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:38:28 +03:00
Konstantin Belousov
7b0125cbec linuxkpi: copy ldev into local to test and free the same pointer
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
2021-04-06 03:38:28 +03:00
Bjoern A. Zeeb
37c3241a43 LinuxKPI: treat firmware file names more leniently
A lot of firmware files have a "-" in the name.  That "-" is a problem
when dealing with shell variables or loader (e.g., auto-loading .ko).
It may thus often be convenient to generate firmware kernel object files
with s/-/_/g in the name.  In order to automatically find them from
drivers using LinuxKPI also substitute the '-' for a '_' like we do
for '/' and '.' already.

Reviewed-by:	hselasky, manu (ok)
MFC-after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D29514
2021-04-02 10:03:39 +00:00
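Note on 37c3241a43: the name mapping described there ('-', '/' and '.' all becoming '_') can be sketched as a small helper. The function name `fw_image_name` and its signature are hypothetical and exist only for this illustration; the LinuxKPI code tries several name variants and is not reproduced here.

```c
#include <stddef.h>

/*
 * Copy fw_name into out, replacing every '/', '.' and '-' with '_'.
 * The result is always NUL-terminated and truncated to fit outlen.
 */
static void
fw_image_name(const char *fw_name, char *out, size_t outlen)
{
	size_t i;

	if (outlen == 0)
		return;
	for (i = 0; fw_name[i] != '\0' && i + 1 < outlen; i++) {
		char c = fw_name[i];

		out[i] = (c == '/' || c == '.' || c == '-') ? '_' : c;
	}
	out[i] = '\0';
}
```

With this mapping a request for "iwlwifi-cc-a0-50.ucode" would look for an image named "iwlwifi_cc_a0_50_ucode", which is also usable as a shell or loader variable name.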
Bjoern A. Zeeb
7069b4c6a4 LinuxKPI/OFED: (re)move inetdevice.h implementation
The two functions in linux/inetdevice.h are highly FreeBSD/ifnet
specific.  This is a result of struct net_device being mapped to
struct ifnet.

The only known consumers of these functions are two files in the
ofed/infiniband code.

As a first step of cleaning up copy linux/inetdevice.h to
rdma/ib_addr_freebsd.h. (It stayed a separate file to preserve
copyright and license of the original file; otherwise it could be
merged into ib_addr.h where more EPOCH/vnet/.. are already used).

Slightly rename the function to not conflict with LinuxKPI
in the future.

Remove the three last, now unneeded includes of inetdevice.h and
zap linux/inetdevice.h to an empty header file with only the forward
include to netdevice.h remaining.

Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky, kib
X-D-R:		D29366 (extracted as further cleanup)
Differential Revision:	https://reviews.freebsd.org/D29434
2021-03-30 14:40:46 +00:00
Hans Petter Selasky
1777720880 Reduce chance of RCU deadlock in the LinuxKPI by implementing the section
feature of the concurrency kit, CK.

Differential Revision:	https://reviews.freebsd.org/D29467
Reviewed by:	kib@ and markj@
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-03-29 10:55:14 +02:00
Bjoern A. Zeeb
fdcfe8a298 LinuxKPI: netdevice notifier callback argument
Introduce struct netdev_notifier_info as a container to pass
net_device to the callback functions.
Adjust netdev_notifier_info_to_dev() to return the net_device field.

Add explicit casts from ifp to ni->dev even though currently
struct net_device is defined to struct ifnet.  This is needed in
preparation for untangling this and improving the net_device compat
code.

Obtained-from:	bz_iwlwifi
Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D29365
2021-03-26 13:00:23 +00:00
Bjoern A. Zeeb
bc042266b2 LinuxKPI: add net_ratelimit()
Add a net_ratelimit() compat implementation based on ppsratecheck().
Add a sysctl to allow tuning of the number of messages.

Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D29399
2021-03-26 12:05:48 +00:00
Bjoern A. Zeeb
3b1ecc9fa1 LinuxKPI: remove < 5.0 version support
We are not aware of any out-of-tree consumers anymore
which would need KPI support for before Linux version 5.
Update the two in-tree consumers to use the new KPI.
This allows us to remove the extra version check and
will also give access to {lower,upper}_32_bits() unconditionally.

Sponsored-by:	The FreeBSD Foundation
Reviewed-by:	hselasky, rlibby, rstone
MFC-after:	2 weeks
X-MFC:		to 13 only
Differential Revision: https://reviews.freebsd.org/D29391
2021-03-24 23:00:03 +00:00
Bjoern A. Zeeb
f1069375d9 LinuxKPI: add lockdep_map
Add stubs for struct lockdep_map and three accessor functions
used by iwlwifi.

Obtained-from:	bz_iwlwifi
Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky, emaste
Differential Revision:	https://reviews.freebsd.org/D29398
2021-03-24 22:50:55 +00:00
Bjoern A. Zeeb
5a402a3ae3 LinuxKPI: add pci_ids.h
brcm80211 includes pci_ids.h directly while historically we were tracking
IDs in pci.h.  Move the current set of IDs from pci.h to pci_ids.h and
while here add IDs for Realtek and Broadcom as well as a network class
as needed by their wireless drivers.

We still include pci_ids.h from pci.h so this should not change anything.

MFC-after:	2 weeks
Reviewed-by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D29400
2021-03-24 22:35:18 +00:00
Bjoern A. Zeeb
3cce818c46 LinuxKPI: if_ether additions
Add various protocol IDs found in various wireless drivers.
Also add ETH_FRAME_LEN and struct ethhdr.

Obtained-from:	bz_iwlwifi
Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D29397
2021-03-24 22:33:03 +00:00
Bjoern A. Zeeb
4b0632cfc5 LinuxKPI: add more linux-specific errno
Add ERFKILL and EBADE found in iwlwifi and brcmfmac wireless drivers.
While here add a comment above the block of error numbers above 500 to
document expectations.

Obtained-from:	bz_iwlwifi
Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	hselasky, emaste
Differential Revision:	https://reviews.freebsd.org/D29396
2021-03-24 22:31:37 +00:00
Bjoern A. Zeeb
de8a7cc703 linuxkpi: add ieee80211_node.h to headers to include before LIST_HEAD
ieee80211_node.h uses LIST_HEAD() which LinuxKPI redefines and this
can lead to problems (see comment there).  Make sure the net80211
header file is handled correctly by adding it to the list of files
to include before re-defining the macro.
Also add header files needed as dependencies.

Sponsored-by:	The FreeBSD Foundation
MFC-after:	2 weeks
Reviewed-by:	philip, hselasky
Differential Revision:	https://reviews.freebsd.org/D29336
2021-03-24 22:19:34 +00:00
John Baldwin
3b57ddb029 Rename linux_set_upcall_kse() to linux_set_upcall().
This matches the rename of cpu_set_upcall_kse() in
5c2cf81845.

Reviewed by:	kib, emaste
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29295
2021-03-18 12:14:34 -07:00
Bjoern A. Zeeb
0c7b75f128 LinuxKPI: add support for crc32_le()
Add support for crc32_le() as a wrapper around crc32_raw().

Sponsored-by:	The FreeBSD Foundation
Obtained-from:	bz_iwlwifi
MFC-after:	2 weeks
Reviewed-by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D29187
2021-03-18 10:56:22 +00:00
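Note on 0c7b75f128: a userspace sketch of the wrapper shape described there. `crc32_raw_sketch()` is a hypothetical bit-at-a-time stand-in for the kernel's crc32_raw() (which is table-driven); it is assumed to share the same contract, namely the reflected CRC-32 polynomial with no implicit inversion of the seed or the result.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for crc32_raw(): reflected CRC-32 (polynomial 0xEDB88320),
 * processed one bit at a time.  No pre- or post-inversion is applied;
 * that is left to the caller.
 */
static uint32_t
crc32_raw_sketch(const void *buf, size_t size, uint32_t crc)
{
	const uint8_t *p = buf;

	while (size--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return (crc);
}

/* Linux argument order: seed first, then buffer and length. */
static uint32_t
crc32_le(uint32_t crc, const void *data, size_t len)
{
	return (crc32_raw_sketch(data, len, crc));
}
```

Seeding with 0xFFFFFFFF and inverting the result, as in `~crc32_le(~0U, "123456789", 9)`, yields the standard CRC-32 check value 0xCBF43926.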
Edward Tomasz Napierala
0dfbdd9fc2 linux(4): make getcwd(2) return ERANGE instead of ENOMEM
For native FreeBSD binaries, the return value from __getcwd(2)
doesn't really matter, as the libc wrapper takes over and returns
the proper errno.

PR:		kern/254120
Reported By:	Alex S <iwtcex@gmail.com>
Reviewed By:	kib
Sponsored By:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29217
2021-03-12 15:31:45 +00:00
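Note on 0dfbdd9fc2: the errno that callers expect when the buffer is too small can be observed from userspace with a short probe. The helper name is hypothetical; it goes through the C library's getcwd(3), not the raw system call discussed in the commit.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Ask for the current directory with a buffer that cannot hold even
 * "/" plus its terminating NUL, and report the resulting errno
 * (0 if the call unexpectedly succeeds).
 */
static int
getcwd_errno_for_small_buffer(void)
{
	char buf[1];

	errno = 0;
	if (getcwd(buf, sizeof(buf)) != NULL)
		return (0);
	return (errno);
}
```

POSIX specifies ERANGE for a non-zero size that is too small, and programs commonly react to it by retrying with a larger buffer.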
Edward Tomasz Napierala
dc0119c281 linsysfs: create /sys/bus/ and /sys/subsystem/
This looks like a no-op, but it prevents udevadm(8) from failing
loudly, which in turn unbreaks installation of libfprint-2-2, which
in Focal is a dependency for make-4.2.1-1.2.

One might wonder why installing a build utility involves messing
with device handling...

Sponsored By:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D29133
2021-03-11 15:50:51 +00:00
Hans Petter Selasky
dfb33cb0ef Allocating the LinuxKPI current structure from a software interrupt thread
must be done using the M_NOWAIT flag after 1ae20f7c70.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-03-10 13:27:40 +01:00
Hans Petter Selasky
d1cbe79089 Allocating the LinuxKPI current structure from an interrupt thread must be
done using the M_NOWAIT flag after 1ae20f7c70.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-03-10 10:51:04 +01:00
Hans Petter Selasky
ebe5cf355d Implement basic support for allocating memory from a specific numa node
in the LinuxKPI.

Differential Revision:	https://reviews.freebsd.org/D29077
Reviewed by:	markj@ and kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2021-03-09 21:01:47 +01:00
Edward Tomasz Napierala
cd84c82c6a linux: add support for SO_PEERGROUPS
The su(8) and sudo(8) from Ubuntu Bionic use it.

Sponsored By:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28165
2021-03-06 19:48:58 +00:00
Alex Richardson
fa2528ac64 Use atomic loads/stores when updating td->td_state
KCSAN complains about racy accesses in the locking code. Those races are
fine since they are inside a TD_SET_RUNNING() loop that expects the value
to be changed by another CPU.

Use relaxed atomic stores/loads to indicate that this variable can be
written/read by multiple CPUs at the same time. This will also prevent
the compiler from doing unexpected re-ordering.

Reported by:	GENERIC-KCSAN
Test Plan:	KCSAN no longer complains, kernel still runs fine.
Reviewed By:	markj, mjg (earlier version)
Differential Revision: https://reviews.freebsd.org/D28569
2021-02-18 14:02:48 +00:00
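Note on fa2528ac64: in portable C11 the same idea (marking a field as intentionally shared by using relaxed atomic accesses) looks roughly like the sketch below. The struct, enum and accessor names are hypothetical, and the kernel uses its own atomic(9) primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* Hypothetical subset of thread states, for illustration only. */
enum td_state_sketch { TDS_SKETCH_INACTIVE, TDS_SKETCH_RUNNING };

struct thread_sketch {
	_Atomic int td_state;	/* may be read and written by several CPUs */
};

/* Relaxed store: atomic with respect to readers, no ordering implied. */
static void
td_set_state(struct thread_sketch *td, int state)
{
	atomic_store_explicit(&td->td_state, state, memory_order_relaxed);
}

/* Relaxed load: each call observes some previously stored value. */
static int
td_get_state(struct thread_sketch *td)
{
	return (atomic_load_explicit(&td->td_state, memory_order_relaxed));
}
```

A loop that spins on td_get_state() waiting for another CPU to change the value is then a well-defined access pattern rather than a data race.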
Mark Johnston
0fc8a79672 linux: Unmap the VDSO page when unloading
linux_shared_page_init() creates an object and grabs and maps a single
page to back the VDSO.  When destroying the VDSO object, we failed to
destroy the mapping and free KVA.  Fix this.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28696
2021-02-16 09:40:02 -05:00
Vladimir Kondratyev
b3c6fe663b epoll: Store epoll_event udata member in ext member of kevent.
Current epoll implementation stores udata fields of epoll_event
structure in special dynamically-sized table rather than in udata field
of backing kevent structure because of 2 reasons:
1. Kevent's udata size is smaller than epoll's on 32-bit archs.
2. Kevent's udata can be clobbered on execution of EPOLL_CTL_ADD, as kqueue
   modifies the existing event while epoll returns an error in this case.

After r320043 has introduced four new 64bit user data members (ext[]),
we can store epoll udata in one of them and drop aforementioned table.
According to the kqueue_register() source code, ext members are not updated
when an existing kevent is modified, which fixes point 2.

As a side effect the patch fixes PR/252582.

Reviewed by:	trasz
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D28169
2021-02-08 02:46:14 +03:00
Edward Tomasz Napierala
e44a78ce6f linux: add support for SO_PEERSEC getsockopt
It returns "unconfined", like Linux without SELinux would.

Sponsored By:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D28164
2021-02-07 20:42:04 +00:00
Edward Tomasz Napierala
f6e8256a96 linux: fix handling of flags for 32 bit send(2) syscall
Previously the flags were passed as-is, which could result
in spurious EAGAIN returned for non-blocking sockets, which
broke some Steam games.

PR:		248065
Reported By:	Alex S <iwtcex@gmail.com>
Tested By:	Alex S <iwtcex@gmail.com>
Reviewed By:	emaste
MFC After:	3 days
Sponsored By:	The FreeBSD Foundation
2021-02-06 23:21:27 +00:00
Ryan Stone
b58cf1cb35 Fix race condition in linuxkpi workqueue
Consider the following scenario:

1. A delayed_work struct in the WORK_ST_TIMER state.
2. Thread A calls mod_delayed_work()
3. Thread B (a callout thread) simultaneously calls
linux_delayed_work_timer_fn()

The following sequence of events is possible:

A: Call linux_cancel_delayed_work()
A: Change state from TIMER to CANCEL
B: Change state from CANCEL to TASK
B: taskqueue_enqueue() the task
A: taskqueue_cancel() the task
A: Call linux_queue_delayed_work_on().  This is a no-op because the
state is WORK_ST_TASK.

As a result, the delayed_work struct will never be invoked.  This is
causing address resolution in ib_addr.c to stop permanently, as it
never tries to reschedule a task that it thinks is already scheduled.

Fix this by introducing locking into the cancel path (which
corresponds with the lock held while the callout runs).  This will
prevent the callout from changing the state of the task until the
cancel is complete, preventing the race.

Differential Revision:	https://reviews.freebsd.org/D28420
Reviewed by: hselasky
MFC after: 2 months
2021-02-04 13:54:53 -05:00
shu
14c40d2c29 linux: remove locks around callout_drain in timerfd_close()
The lock around callout_drain() is unnecessary and may cause
deadlock when one closes a timer descriptor during timer execution.

Reviewed By:	delphij
Submitted By:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28148
2021-02-03 19:47:38 +00:00
shu
ae71b794cb linux: make timerfd_settime(2) set expirations count to zero
On Linux, read(2) from a timerfd file descriptor returns an unsigned
8-byte integer (uint64_t) containing the number of expirations
that have occurred, if the timer has already expired one or more
times since its settings were last modified using timerfd_settime(),
or since the last successful read(2).  That's to say, once we do
a read or call timerfd_settime(), timer fd's expiration count should
be zero.  Some Linux applications create a timerfd and add it to epoll
in LT mode; when an event comes, they call timerfd_settime() instead
of read(2) to stop the event source from triggering.  On FreeBSD,
timerfd_settime(2) didn't set the count to zero, which caused high
CPU utilization.

Submitted by:	ankohuu_outlook.com (Shunchao Hu)
Differential Revision: https://reviews.freebsd.org/D28231
2021-02-03 19:08:40 +00:00
Bjoern A. Zeeb
4a26380ba6 LinuxKPI: add module dependency on firmware(9)
In a6c2507d1b support for LinuxKPI
firmware loading was added.  Record the dependency on firmware(9)
as otherwise (if built as module) linuxkpi will no longer load.

Reported-by:	tijl
MFC after:	1 day
X-MFC-with:	a6c2507d1b
Sponsored-by:	The FreeBSD Foundation
2021-01-30 17:50:26 +00:00
Bjoern A. Zeeb
fa765ca73e LinuxKPI: implement devres() framework parts and two examples
This code implements a version of the devres framework found
working for various iwlwifi use cases and also providing functions
for ttm_page_alloc_dma.c from DRM.

Part of the framework replicates the consumed KPI, while others
are internal helper functions.

In addition the simple devm_k*malloc() consumers were implemented
and kvasprintf() was enhanced to also work for the devm_kasprintf()
case.
Admittedly lkpi_devm_kmalloc_release() could be avoided but for
the overall understanding of the code and possible memory tracing
it may still be helpful.

Further devres consumers are implemented for iwlwifi but will follow
later as the main reason for this change is to sort out overlap with
DRM.

Sponsored-by:	The FreeBSD Foundation
Obtained-from:	bz_iwlwifi
MFC After:	3 days
Reviewed-by:	hselasky, manu
Differential Revision:	https://reviews.freebsd.org/D28189
2021-01-28 16:32:43 +00:00
Bjoern A. Zeeb
1fac2cb4d6 LinuxKPI: enhance PCI bits for DRM
In pci_domain_nr() directly return the domain which got set in
lkpifill_pci_dev() in all cases.  This was missed between D27550
and 105a37cac7.

In order to implement pci_dev_put() harmonize further code
(which was started in the aforementioned commit) and add kobj
related bits (through the now common lkpifill_pci_dev() code)
to the DRM specific calls without adding the DRM allocated
pci devices to the pci_devices list.
Add a release for the lkpinew_pci_dev() (DRM) case so freeing
will work.
This allows the DRM created devices to use the normal kobj/refcount
logic and work with, e.g., pci_dev_put().
(For a slightly more detailed code walk see the review).

Sponsored-by:	The FreeBSD Foundation
Obtained-from:	bz_iwlwifi (partially)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28188
2021-01-28 16:23:19 +00:00
Bjoern A. Zeeb
4abbf816bf LinuxKPI: upstream a collection of drm-kmod conflicting changes
The upcoming in-kernel implementations for LinuxKPI based on work on
iwlwifi (and other wireless drivers) conflicts in a few places with
the drm-kmod graphics work outside the base system.

In order to transition smoothly extract the conflicting bits.
This included "unaligned" accessor functions, sg_pcopy_from_buffer(),
IS_*() macros (to be further restricted in the future), power management
bits (possibly no longer conflicting with DRM), and other minor changes.

Obtained-from:  bz_iwlwifi
Sponsored-by:   The FreeBSD Foundation
MFC after:	3 days
Reviewed by:	kib, hselasky, manu, bdragon (looked at earlier versions)
Differential Revision: https://reviews.freebsd.org/D26598
2021-01-28 16:15:12 +00:00
Bjoern A. Zeeb
a6c2507d1b LinuxKPI: add firmware loading support
Implement linux firmware KPI compat code.
This includes: request_firmware(), request_firmware_nowait(),
request_firmware_direct(), firmware_request_nowarn(),
and release_firmware().

Given we will try to map requested names from natively ported
or full-linuxkpi-using drivers to a firmware(9) auto-loading
name format (.ko file name and image name matching),
we quieten firmware(9) and print success or failure (unless
the _nowarn() version was called) in the linuxkpi implementation.
At the moment we try up to 4 different naming combinations,
with path stripped, original name, and requested name with '/'
or '.' replaced.

We do not currently defer loading in the "nowait" case.

Sponsored-by:	The FreeBSD Foundation
Sponsored-by:	Rubicon Communications, LLC ("Netgate")
		(firmware(9) nowarn update from D27413)
MFC after:	3 days
Reviewed by:	kib, manu (looked at older versions)
Differential Revision:	https://reviews.freebsd.org/D27414
2021-01-28 16:05:32 +00:00
Brooks Davis
7a1591c1b6 Rename kern_mmap_req to kern_mmap
Replace all uses of kern_mmap with kern_mmap_req, remove the old kern_mmap,
and rename kern_mmap_req to kern_mmap.

The helper saved some code churn initially, but having multiple
interfaces is sub-optimal.

Obtained from:	CheriBSD
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D28292
2021-01-25 21:50:37 +00:00
Brooks Davis
bfc99943b0 ndis(4): remove as previously announced
ndis(4) was a clever idea in the early 2000's when the market was
flooded with 10/100 NICs with Windows-only drivers, but that hasn't been
the case for ages and the driver has had no meaningful maintenance in
ages. It only supports Windows-XP era drivers.

Also remove:
 - ndis support from wpa_supplicant
 - ndiscvt(8)

Reviewed By:	emaste, bcr (manpages)
Differential Revision:	https://reviews.freebsd.org/D27609
2021-01-25 21:45:03 +00:00
Edward Tomasz Napierala
7d3310c4fc linux: remove spurious newline.
Sponsored by:	The FreeBSD Foundation
2021-01-19 09:56:45 +00:00
Mark Johnston
4af9323542 linuxkpi: Fix the shrinker scan target
Use the number of items scanned to control the duration of the shrink
loop.  Otherwise, if a consumer like TTM is not able to free the number
of items requested for some reason, the shrinker keeps looping forever.

Reviewed by:	manu
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D28224
2021-01-18 17:07:55 -05:00
Vladimir Kondratyev
ec25b6fa5f LinuxKPI: Reimplement irq_work queue on top of fast taskqueue
Summary:
Linux's irq_work queue was created for asynchronous execution of code from contexts where spin_locks are not available, such as "hardware interrupt context". FreeBSD's fast taskqueues were created for the same purpose.

Drm-kmod 5.4 uses irq_work_queue() in at least one place to schedule execution of a task/work from a critical section, which triggers the following INVARIANTS-induced panic:

```
panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) linuxkpi_short_wq @ /usr/src/sys/kern/subr_taskqueue.c:281
cpuid = 6
time = 1605048416
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe006b538c90
vpanic() at vpanic+0x182/frame 0xfffffe006b538ce0
panic() at panic+0x43/frame 0xfffffe006b538d40
witness_checkorder() at witness_checkorder+0xf3e/frame 0xfffffe006b538f00
__mtx_lock_flags() at __mtx_lock_flags+0x94/frame 0xfffffe006b538f50
taskqueue_enqueue() at taskqueue_enqueue+0x42/frame 0xfffffe006b538f70
linux_queue_work_on() at linux_queue_work_on+0xe9/frame 0xfffffe006b538fb0
irq_work_queue() at irq_work_queue+0x21/frame 0xfffffe006b538fd0
semaphore_notify() at semaphore_notify+0xb2/frame 0xfffffe006b539020
__i915_sw_fence_notify() at __i915_sw_fence_notify+0x2e/frame 0xfffffe006b539050
__i915_sw_fence_complete() at __i915_sw_fence_complete+0x63/frame 0xfffffe006b539080
i915_sw_fence_complete() at i915_sw_fence_complete+0x8e/frame 0xfffffe006b5390c0
dma_i915_sw_fence_wake() at dma_i915_sw_fence_wake+0x4f/frame 0xfffffe006b539100
dma_fence_signal_locked() at dma_fence_signal_locked+0x105/frame 0xfffffe006b539180
dma_fence_signal() at dma_fence_signal+0x72/frame 0xfffffe006b5391c0
dma_fence_is_signaled() at dma_fence_is_signaled+0x80/frame 0xfffffe006b539200
dma_resv_add_shared_fence() at dma_resv_add_shared_fence+0xb3/frame 0xfffffe006b539270
i915_vma_move_to_active() at i915_vma_move_to_active+0x18a/frame 0xfffffe006b5392b0
eb_move_to_gpu() at eb_move_to_gpu+0x3ad/frame 0xfffffe006b539320
eb_submit() at eb_submit+0x15/frame 0xfffffe006b539350
i915_gem_do_execbuffer() at i915_gem_do_execbuffer+0x7d4/frame 0xfffffe006b539570
i915_gem_execbuffer2_ioctl() at i915_gem_execbuffer2_ioctl+0x1c1/frame 0xfffffe006b539600
drm_ioctl_kernel() at drm_ioctl_kernel+0xd9/frame 0xfffffe006b539670
drm_ioctl() at drm_ioctl+0x5cd/frame 0xfffffe006b539820
linux_file_ioctl() at linux_file_ioctl+0x323/frame 0xfffffe006b539880
kern_ioctl() at kern_ioctl+0x1f4/frame 0xfffffe006b5398f0
sys_ioctl() at sys_ioctl+0x12a/frame 0xfffffe006b5399c0
amd64_syscall() at amd64_syscall+0x121/frame 0xfffffe006b539af0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe006b539af0
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x800a6f09a, rsp = 0x7fffffffe588, rbp = 0x7fffffffe640 ---
KDB: enter: panic
```
Here, dma_resv_add_shared_fence() performs a critical_enter(), and the following call to schedule_work() from semaphore_notify() triggers the 'acquiring blockable sleep lock with spinlock or critical section held' panic.

Switching irq_work implementation to fast taskqueue fixes the panic for me.

Another report of a similar bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=247166

Reviewed By: hselasky
Differential Revision: https://reviews.freebsd.org/D27171
2021-01-17 12:47:28 +01:00
Edward Tomasz Napierala
feb96ee9c8 linux: mute "unsupported socket(AF_NETLINK, 3, NETLINK_AUDIT)" warnings
They are way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-14 09:16:28 +00:00
Edward Tomasz Napierala
ec2700e015 linux: mute the "unsupported prctl option 23" warnings
Make the PR_CAPBSET_READ prctl(2) return EINVAL without logging
any warnings; this is way too noisy with Focal.

Sponsored by:	The FreeBSD Foundation
2021-01-13 10:31:56 +00:00
Edward Tomasz Napierala
a339b4223a linux: bump the default version from 3.10.0 to 3.17.0
This is required for Qt5, as found in Ubuntu Focal.  The library contains
the minimum kernel version encoded in an ELF note; this makes rtld ignore
it altogether, with a confusing error message.  Without it, things fail
like this:

$ konsole: error while loading shared libraries: libQt5Core.so.5: cannot
open shared object file: No such file or directory

For reference, the Qt kernel version requirements can be found at:
https://github.com/qt/qtbase/blob/dev/src/corelib/global/minimum-linux_p.h

Sponsored by:	The FreeBSD Foundation
Reviewed By:	emaste
Differential Revision:	https://reviews.freebsd.org/D28105
2021-01-13 10:02:16 +00:00
Mateusz Guzik
6b3a9a0f3d Convert remaining cap_rights_init users to cap_rights_init_one
semantic patch:

@@

expression rights, r;

@@

- cap_rights_init(&rights, r)
+ cap_rights_init_one(&rights, r)
2021-01-12 13:16:10 +00:00
Emmanuel Vadot
11d62b6f31 linuxkpi: add kernel_fpu_begin/kernel_fpu_end
With newer AMD GPUs (>=Navi,Renoir) there is FPU context usage in the
amdgpu driver.
The `kernel_fpu_begin/end` implementations in drm did not even allow nested
begin-end blocks.

Submitted by: Greg V
Reviewed By: manu, hselasky
Differential Revision: https://reviews.freebsd.org/D28061
2021-01-12 12:31:00 +01:00
Emmanuel Vadot
2c95fb753f linuxkpi: Add shrinker support
A driver can register a shrinker that will be called when the kernel
wants to free some memory.
Add support for that in linuxkpi and call the registered shrinkers
when the lowmem event is triggered.

Reviewed by:	bz
Differential Revision:	 https://reviews.freebsd.org/D27728
2021-01-12 12:31:00 +01:00
Emmanuel Vadot
105a37cac7 linuxkpi: Add more pci functions needed by DRM
 - pci_get_class : This function searches for a matching pci device based on
   the class/subclass and returns a newly created pci_dev.
 - pci_{save,restore}_state : This is analogous to ours with the same name
 - pci_is_root_bus : Return true if this is the root bus
 - pci_get_domain_bus_and_slot : This function searches for a matching pci
   device based on domain, bus and slot/function concatenated into a single
   unsigned int (devfn) and returns a newly created pci_dev
 - pci_bus_{read,write}_config* : Read/Write to the config space.

While here add some helper functions to allocate and fill the pci_dev struct.

Reviewed by:   hselasky, bz (older version)
Differential Revision:	   https://reviews.freebsd.org/D27550
2021-01-12 12:31:00 +01:00
Neel Chauhan
408c514f73 linuxkpi: Fix the "error: unknown type name 'u32'" compilation issue when
building the Intel QAT/QuickAssist driver.

Approved by:		hselasky, kib
Differential Revision:	https://reviews.freebsd.org/D28055
2021-01-09 15:27:04 -08:00
Konstantin Belousov
de27805fee linuxkpi: handle ARI
Stop trying to manually calculate RID, which cannot be done correctly
by PCI_DEVFN().  Use PCI_GET_RID() method instead.

Do not use pci_find_dbsf() to go from the linux pci_dev to freebsd
device_t.  First, device is readily available as dev.bsddev.  Second,
using pci_find_dbsf() fails for ARI-enabled functions with large
function numbers, because PCI_SLOT()/PCI_FUNC() are for non-ARI.
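
A minimal sketch of why slot/function packing loses information under ARI. The macros mirror the conventional Linux definitions of PCI_DEVFN(), PCI_SLOT() and PCI_FUNC(); the round_trip_func() and ari_self_check() helpers are hypothetical and are not part of this change:

```c
#include <assert.h>

/* Conventional (non-ARI) encoding: 5-bit slot, 3-bit function. */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)		((devfn) & 0x07)

/* Hypothetical helper: pack a function number into devfn and read it back. */
static unsigned int
round_trip_func(unsigned int slot, unsigned int func)
{
	return (PCI_FUNC(PCI_DEVFN(slot, func)));
}

/* Hypothetical self-check showing where the 3-bit field stops working. */
static void
ari_self_check(void)
{
	/* Function 5 fits in 3 bits and survives the round trip. */
	assert(round_trip_func(0, 5) == 5);
	/* An ARI function number of 9 is masked to 9 & 0x07. */
	assert(round_trip_func(0, 9) == 1);
}
```

With ARI the function number can occupy the whole 8-bit devfn field, so any value above 7 is silently truncated by these macros.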

Reviewed by:	bz, hselasky, manu
Tested by:	manu (drm)
Sponsored by:	Mellanox Technologies/NVidia Networking
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27960
2021-01-08 23:17:21 +02:00
Alan Somers
20321e6225 Regenerate syscall files after reallocation of aio_writev/aio_readv 2021-01-07 19:50:32 -07:00
Alan Somers
b3286afae3 Reallocate syscall numbers for aio_writev and aio_readv
The originally chosen numbers interfere with downstream projects'
syscalls.  Move them to the end of the syscall table instead.

Reported by:	jrtc27
Reviewed by:	brooks
MFC-With:	022ca2fc7f
Differential Revision:	022ca2fc7f
2021-01-07 19:49:27 -07:00
Alan Somers
1868a91fac Regenerate syscall files after addition of aio_writev/aio_readv 2021-01-02 19:57:58 -07:00
Alan Somers
022ca2fc7f Add aio_writev and aio_readv
POSIX AIO is great, but it lacks vectored I/O functions. This commit
fixes that shortcoming by adding aio_writev and aio_readv. They aren't
part of the standard, but they're an obvious extension. They work just
like their synchronous equivalents pwritev and preadv.

It isn't yet possible to use vectored aiocbs with lio_listio, but that
could be added in the future.

Reviewed by:    jhb, kib, bcr
Relnotes:       yes
Differential Revision: https://reviews.freebsd.org/D27743
2021-01-02 19:57:58 -07:00
Konstantin Belousov
9dd48b87e6 Regen. 2020-12-27 12:57:27 +02:00
Konstantin Belousov
7a202823aa Expose eventfd in the native API/ABI using a new __specialfd syscall
eventfd is a Linux system call that produces special file descriptors
for event notification. When porting Linux software, it is currently
usually emulated by epoll-shim on top of kqueues.  Unfortunately, kqueues
are not passable between processes.  And, as noted by the author of
epoll-shim, even if they were, the library state would also have to be
passed somehow.  This came up when debugging strange HW video decode
failures in Firefox.  A native implementation would avoid these problems
and help with porting Linux software.

Since we now already have an eventfd implementation in the kernel (for
the Linuxulator), it's pretty easy to expose it natively, which is what
this patch does.
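
A minimal usage sketch, assuming a libc eventfd() wrapper declared in <sys/eventfd.h> with the same interface as on Linux (the commit message itself only names the __specialfd syscall). The eventfd_sum() helper is hypothetical:

```c
#include <sys/eventfd.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Hypothetical helper: post two counts to an eventfd and read back the
 * accumulated counter.  Returns 0 on any failure.
 */
static uint64_t
eventfd_sum(uint64_t a, uint64_t b)
{
	uint64_t total = 0;
	int fd;

	/* Initial counter value 0, no flags (blocking, non-semaphore mode). */
	fd = eventfd(0, 0);
	if (fd == -1)
		return (0);
	/* Each 8-byte write adds to the counter; a read returns and resets it. */
	if (write(fd, &a, sizeof(a)) != (ssize_t)sizeof(a) ||
	    write(fd, &b, sizeof(b)) != (ssize_t)sizeof(b) ||
	    read(fd, &total, sizeof(total)) != (ssize_t)sizeof(total))
		total = 0;
	close(fd);
	return (total);
}
```

Because the descriptor is an ordinary file descriptor, it can be inherited or passed over a unix(4) socket, which is the property the kqueue-based emulation lacks.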

Submitted by:   greg@unrelenting.technology
Reviewed by:    markj (previous version)
MFC after:      2 weeks
Differential Revision:  https://reviews.freebsd.org/D26668
2020-12-27 12:57:26 +02:00
Konstantin Belousov
7cb901bf22 Remove useless ARGUSED annotations.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Konstantin Belousov
11c9f2ff1a Add SPDX tag.
Submitted by:	greg@unrelenting.technology
2020-12-27 12:57:26 +02:00
Konstantin Belousov
673e2dd652 Add ELF flag to disable ASLR stack gap.
Also centralize and unify checks to enable ASLR stack gap in a new
helper exec_stackgap().

PR:	239873
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2020-12-18 23:14:39 +00:00
John Baldwin
de66c9a118 Cleanups to *ERR* compat shims.
- Use [u]intptr_t casts to convert pointers to integers.

- Change IS_ERR* to return bool instead of long.
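
A minimal userland sketch of the two cleanups, not the LinuxKPI header itself. MAX_ERRNO is assumed to be 4095, following the Linux convention, and the function bodies are illustrative:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO	4095	/* assumed, following the Linux convention */

/* Encode a negative errno value in a pointer via an intptr_t cast. */
static inline void *
ERR_PTR(long error)
{
	return ((void *)(intptr_t)error);
}

/* Recover the errno value from an error pointer. */
static inline long
PTR_ERR(const void *ptr)
{
	return ((long)(intptr_t)ptr);
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline bool
IS_ERR(const void *ptr)
{
	return ((uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO);
}
```

Going through [u]intptr_t keeps the pointer-to-integer conversions explicit, and a bool return type states that IS_ERR() is a predicate.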

Reviewed by:	manu
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27577
2020-12-17 20:28:53 +00:00
John Baldwin
ce8395ecfd Use the 't' modifier to print a ptrdiff_t.
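
For illustration, a small sketch of the 't' length modifier from standard C; the format_ptrdiff() helper is hypothetical and is not the code touched by this commit:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the distance between two pointers. */
static int
format_ptrdiff(char *buf, size_t len, const char *start, const char *end)
{
	ptrdiff_t delta = end - start;

	/* %td matches ptrdiff_t on every ABI, unlike %d or %ld. */
	return (snprintf(buf, len, "%td", delta));
}
```
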
Reviewed by:	imp
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27576
2020-12-16 00:11:30 +00:00
Hans Petter Selasky
b8b3f4fdc3 Improve handling of alternate settings in the USB stack.
Allow setting the alternate interface number to fail when there is only
one alternate setting present, to comply with the USB specification.

Refactor how iface->num_altsetting is computed.

Bump the __FreeBSD_version due to change of core USB structure.

PR:		251856
MFC after:	1 week
Submitted by:	Ma, Horse <Shichun.Ma@dell.com>
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-15 12:05:07 +00:00
Bryan Drewery
f6e7d67a43 linux_dma: Ensure proper flags pass to allocators.
Possibly fixes the wrong flags being passed to the kernel
allocators in linux_dma_alloc_coherent() and linux_dma_pool_alloc().

Reviewed by:	hps
MFC after:	2 weeks
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D27508
2020-12-10 20:45:08 +00:00
Hans Petter Selasky
a399cf139b Prefer using the MIN() function macro over the min() inline function
in the LinuxKPI. Linux defines min() to be a macro, while in FreeBSD
min() is a static inline function clamping its arguments to
"unsigned int".
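
A minimal sketch of the difference, assuming 64-bit operands. min_uint() is a hypothetical stand-in for an inline min() taking unsigned int, renamed to avoid clashing with any system definition:

```c
#include <stdint.h>

/* Function-like macro: compares in the operands' own (promoted) type. */
#define MIN(a, b)	(((a) < (b)) ? (a) : (b))

/* Stand-in for an inline min() whose parameters are unsigned int. */
static inline unsigned int
min_uint(unsigned int a, unsigned int b)
{
	return (a < b ? a : b);
}

/* The macro keeps the full 64-bit values. */
static uint64_t
smaller_via_macro(uint64_t a, uint64_t b)
{
	return (MIN(a, b));
}

/* The inline function truncates both arguments to 32 bits first. */
static uint64_t
smaller_via_inline(uint64_t a, uint64_t b)
{
	return (min_uint(a, b));
}
```

With a = 0x100000005 and b = 7, the macro version yields 7, while the inline version sees a truncated to 5 and returns 5.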

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-07 09:48:06 +00:00
Tijl Coosemans
77fb6b6644 Move V4L feature declarations and DTrace provider definitions from
linux_common.c to linux_util.c so they become available on i386.

linux_common.c defines the linux_common kernel module but this module does
not exist on i386 and linux_common.c is not included in the linux module.
linux_util.c is included in the linux_common module on amd64 and the linux
module on i386.

Remove linux_common.c from files.i386 again.  It was added recently in
r367433 when the DTrace provider definitions were moved.

The V4L feature declarations were moved to linux_common in r283423.
2020-12-06 10:58:55 +00:00
Konstantin Belousov
d7d95c3ff8 Regen 2020-12-04 18:58:27 +00:00
Konstantin Belousov
31df9c26c5 Fix compat32 for ntp_adjtime(2).
struct timex is not 32-bit safe, it uses longs for members.
Provide translation.

Reviewed by:	brooks, cy
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D27471
2020-12-04 18:57:58 +00:00
Hans Petter Selasky
ff15f3f133 Allow the rbtree header file in the LinuxKPI to be used in standalone code.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-04 15:50:44 +00:00
Hans Petter Selasky
01fdacdbc7 Allow the list header file in the LinuxKPI to be used in standalone code.
Some style and spelling nits while at it.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-12-04 15:46:48 +00:00
Hans Petter Selasky
ed2b70e8af Use function macro for sema_init() in the LinuxKPI to limit macro expansion scope.
MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-11-30 09:47:53 +00:00
Konstantin Belousov
cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allows several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except the place which initializes maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embedded structures,
get a private MAXPHYS-like constant; their conversion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, were either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Konstantin Belousov
4815f175d0 Linuxolator: Replace use of eventhandlers by sysent hooks.
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27309
2020-11-23 18:18:16 +00:00
Kyle Evans
60e60e73fd freebsd32: take the _umtx_op struct definitions back
Providing these in freebsd32.h facilitates local testing/measuring of the
structs rather than forcing one to locally recreate them. Sanity checking
offsets/sizes remains in kern_umtx.c where these are typically used.
2020-11-23 00:58:14 +00:00
Kyle Evans
15eaec6a5c _umtx_op: move compat32 definitions back in
These are reasonably compact, and a future commit will blur the compat32
lines by supporting 32-bit operations with the native _umtx_op.
2020-11-22 05:34:51 +00:00
Hans Petter Selasky
99f20bdc47 Allow LinuxKPI types to be used in bootloaders, by checking for the
_STANDALONE definition.

No functional change intended.

MFC after:	1 week
Sponsored by:	Mellanox Technologies // NVIDIA Networking
2020-11-18 13:47:11 +00:00
Conrad Meyer
f8f74aaa84 linux(4) clone(2): Correctly handle CLONE_FS and CLONE_FILES
The two flags are distinct and it is impossible to correctly handle clone(2)
without the assistance of fork1().  This change depends on the pwddesc split
introduced in r367777.

I've added a fork_req flag, FR2_SHARE_PATHS, which indicates that p_pd
should be treated the opposite way p_fd is (based on RFFDG flag).  This is a
little ugly, but the benefit is that existing RFFDG API is preserved.
Holding FR2_SHARE_PATHS disabled, RFFDG indicates both p_fd and p_pd are
copied, while !RFFDG indicates both should be cloned.

In Chrome, clone(2) is used with CLONE_FS, without CLONE_FILES, and expects
independent fd tables.

The previous conflation of CLONE_FS and CLONE_FILES was introduced in
r163371 (2006).

Discussed with:	markj, trasz (earlier version)
Differential Revision:	https://reviews.freebsd.org/D27016
2020-11-17 21:20:11 +00:00
Conrad Meyer
85078b8573 Split out cwd/root/jail, cmask state from filedesc table
No functional change intended.

Tracking these structures separately for each proc enables future work to
correctly emulate clone(2) in linux(4).

__FreeBSD_version is bumped (to 1300130) for consumption by, e.g., lsof.

Reviewed by:	kib
Discussed with:	markj, mjg
Differential Revision:	https://reviews.freebsd.org/D27037
2020-11-17 21:14:13 +00:00
Conrad Meyer
ede4af47ae unix(4): Enhance LOCAL_CREDS_PERSISTENT ABI
As this ABI is still fresh (r367287), let's correct some mistakes now:

- Version the structure to allow for future changes
- Include sender's pid in control message structure
- Use a distinct control message type from the cmsgcred / sockcred mess

Discussed with:	kib, markj, trasz
Differential Revision:	https://reviews.freebsd.org/D27084
2020-11-17 20:01:21 +00:00
Conrad Meyer
b1976ea14c linprocfs(5): Add rudimentary /proc/<pid>/mountinfo
This is used by some Linux programs using filehandles (r367773) to locate
the mountpoint for a given fsid.

Differential Revision:	https://reviews.freebsd.org/D27136
2020-11-17 19:56:47 +00:00
Conrad Meyer
de774e422e linux(4): Implement name_to_handle_at(), open_by_handle_at()
They are similar to our getfhat(2) and fhopen(2) syscalls.

Differential Revision:	https://reviews.freebsd.org/D27111
2020-11-17 19:51:47 +00:00
Kyle Evans
63ecb272a0 umtx_op: reduce redundancy required for compat32
All of the compat32 variants are substantially the same, save for
copyin/copyout (mostly). Apply the same kind of technique used with kevent
here by having the syscall routines supply a umtx_copyops describing the
operations needed.

umtx_copyops carries the bare minimum needed: the sizes of timespec and
_umtx_time are used for determining if copyout is needed in the sem2_wait
case.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D27222
2020-11-17 03:36:58 +00:00
Vladimir Kondratyev
146e176df7 LinuxKPI: Exclude linux/acpi.h content on non-ACPI archs.
LinuxKPI ACPI support is based on FreeBSD import of ACPICA which can be
compiled only on aarch64, amd64 and i386. Ifdef-out broken parts on our
side to avoid patching of vendor code.

This fixes drm-devel-kmod build on powerpc64(le).

Reported by:	pkubaj
2020-11-14 10:34:18 +00:00
Emmanuel Vadot
dab39c11af LinuxKPI: Implement ACPI bits required by drm-kmod in base system
It includes:

ACPI_HANDLE() implementation.
AC and VIDEO ACPI events notification support.
Replacement of hand-rolled GPLed _DSM method evaluation helpers
with in-base ones.

Submitted by:	wulf
Differential Revision:	https://reviews.freebsd.org/D26603
2020-11-09 13:20:14 +00:00
Edward Tomasz Napierala
e3b1c847a4 Make it possible to mount a fuse filesystem, such as squashfuse,
from a Linux binary.  Should come in handy for AppImages.

Reviewed by:	asomers
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26959
2020-11-09 08:53:15 +00:00
Mateusz Guzik
e90afaa015 kqueue: save space by using only one func pointer for assertions 2020-11-09 00:04:35 +00:00
Alexander Leidinger
8ec6c4a38b - add more linux socket options (sorted by value)
 - map those IPv4 / IPv6 socket options which exist in FreeBSD
   + most of them visually verified to have the same type/layout of arguments
   + not tested with linux programs to behave as intended
 - be more human readable for known options which are not handled
 - be more verbose for unhandled socket message flags we know about
 - print the jail ID in linux_msg if run in a jail
 - add possibility to print debug message about known missing parts only once
 - add multiple levels of sysctl linux.debug:
   1: print debug messages, tell about unimplemented stuff (only once)
   2: like 1, but also print messages about implemented but not tested
      stuff (only once)
   3+: like 2, but no rate limiting of messages
 - increase default linux debug level from 1 to 3

We are a lot more verbose than we need to be (e.g. for some of the IP socket
options which are the same, share the same memory layout, and are
believed to work). The reason is that we have no good testsuite to test those
linux-bits. The LTP or other test suites, like the python one, are not fully
up to the task we need. Hence the excessive messages about emulated but not
tested socket options.
IMO any MFC (possible, but most probably not by me) should set the default
debug level to 1.

Discussed with:	trasz
2020-11-08 09:50:58 +00:00
Conrad Meyer
76b2bfeda4 linux(4): Fix loadable modules after r367395
Move dtrace SDT definitions into linux_common module code.  Also, build
linux_dummy.c into the linux_common kld -- we don't need separate
versions of these stubs for 32- and 64-bit emulation.

Reported by:	several
PR:		250897
Discussed with:	emaste, trasz
Tested by:	John Kennedy, Yasuhiro KIMURA, Oleg Sidorkin
X-MFC-With:	r367395
Differential Revision:	https://reviews.freebsd.org/D27124
2020-11-06 22:04:57 +00:00
Conrad Meyer
e9b13c6612 linux(4): Deduplicate unimpl/dummy syscall handlers
No functional change.

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27099
2020-11-05 19:30:31 +00:00
Conrad Meyer
20172854ab Add sbuf streaming mode to pseudofs(9), use in linprocfs(5)
Add a pseudofs node flag 'PFS_AUTODRAIN', which automatically emits sbuf
contents to the caller when the sbuf buffer fills.  This is only
permissible if the corresponding PFS node fill function can sleep
whenever it appends to the sbuf.

linprocfs' /proc/self/maps node happens to meet this requirement.
Streaming out the file as it is composed avoids truncating the output
and also avoids preallocating a very large buffer.

Reviewed by:	markj; earlier version: emaste, kib, trasz
Differential Revision:	https://reviews.freebsd.org/D27047
2020-11-05 06:48:51 +00:00
Edward Tomasz Napierala
cdf6e4e922 Unbreak buildworld after r367339.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-11-04 21:39:04 +00:00
Edward Tomasz Napierala
2f927d87f9 Add linux_to_bsd_errtbl[], mapping Linux errnos to their BSD counterparts.
This will be used by fuse(4).

Reviewed by:	asomers
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26974
2020-11-04 19:54:18 +00:00
Conrad Meyer
9e47480e94 linux(4): Improve netlink diagnostics
Add some missing netlink_family definitions and produce vaguely
human-readable error messages for those definitions, like we used to do for
just ROUTE and KOBJECT_UEVENTS.

Additionally, if we know it's a netfilter socket but didn't find it in the
table, fall back to printing that instead of the generic handler ("socket
domain 16, ...").

No change to the emulator correctness, just mildly improved diagnostics for
gaps.
2020-11-03 19:50:42 +00:00
Edward Tomasz Napierala
7abf30d339 Make linux_errtbl[] static.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D27004
2020-11-03 19:12:33 +00:00
Edward Tomasz Napierala
939e5de8d4 Fix rookie mistake - it's nitems(), not sizeof().
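
A minimal sketch of the class of mistake, using a hypothetical table rather than the code fixed here. The nitems() definition follows FreeBSD's <sys/param.h>:

```c
#include <stddef.h>

/* Element count of a true array. */
#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))

/* Hypothetical table; the values are placeholders. */
static const int example_tbl[] = { 0, 1, 2, 5, 6 };

/* Correct bounds check: compares against the element count. */
static int
index_in_range(size_t i)
{
	return (i < nitems(example_tbl));
}

/* The mistake: sizeof() yields bytes, so this bound is too large. */
static int
index_in_range_buggy(size_t i)
{
	return (i < sizeof(example_tbl));
}
```

For index 5 the correct check rejects the access, while the buggy check accepts it and would read past the end of the array.
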
Reported by:	xtouqh_icloud.com
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-11-03 14:44:33 +00:00
Conrad Meyer
eaa5afcefa linux(4) prctl(2): Implement PR_[GS]ET_DUMPABLE
Proxy the flag to the roughly analogous FreeBSD procctl 'TRACE'.

TRACE-disabled processes are not coredumped, and Linux !DUMPABLE processes
can not be ptraced.  There are some additional semantics around ownership of
files in the /proc/[pid] pseudo-filesystem, which we do not attempt to
emulate correctly at this time.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D27015
2020-11-03 02:10:54 +00:00
Conrad Meyer
443d8a07df linux(4): Emulate Linux SOL_SOCKET:SO_PASSCRED
This is required by some major linux applications, such as Chrome and
Firefox.  (As well as Electron-using applications, which are essentially
a bundled version of Chrome.)

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27012
2020-11-03 01:19:13 +00:00
Conrad Meyer
a98f03786e linux(4): style: Eliminate dead 'break' after 'return'
No functional change.
2020-11-03 01:10:27 +00:00
Conrad Meyer
7731194090 linux(4): Quiesce unrecognized ioctl warning for F2FS query
On Linux, sqlite probes for underlying F2FS filesystems that support
certain kinds of atomic update with this ioctl.  The expected result on
non-F2FS filesystem (i.e., all FreeBSD filesystems) is any error value.

Minimally implement the ioctl and avoid the warning message.

(This shows up in Linux Chrome, which embeds sqlite.)

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27050
2020-11-02 18:45:43 +00:00
Conrad Meyer
53efdb55a8 linux(4): Deduplicate ioctl range construction with a helper macro
No functional change.

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27049
2020-11-02 18:45:15 +00:00
Conrad Meyer
63ed2e3642 linux(4): Disambiguate identical ioctl errors in distinct paths
And stop truncating the full ioctl number in the error message.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D27048
2020-11-02 06:16:11 +00:00
Conrad Meyer
76dfd556f1 linux(4): Add missing clone(2) flags 2020-10-31 01:12:35 +00:00
Conrad Meyer
ae9cafd919 linux(4): Quiesce warning about madvise(..., -1)
This API misuse is intended to produce an error value to detect certain
bogus stub implementations of MADV_WIPEONFORK.  We don't need to log a
warning about it.

Example:
https://boringssl.googlesource.com/boringssl/+/ad5582985cc6b89d0e7caf0d9cc7e301de61cf66%5E%21/

Reviewed by:	emaste, trasz
Differential Revision:	https://reviews.freebsd.org/D27017
2020-10-30 19:02:59 +00:00
Edward Tomasz Napierala
ad7b26ecdc Make linprocfs(4) print a warning when there's not enough room to fill
/proc/self/maps.

Submitted by:	dchagin (earlier version)
Reviewed by:	emaste (earlier version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20576
2020-10-29 15:44:44 +00:00
Edward Tomasz Napierala
b60b81e643 Fix typo.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-10-29 14:42:51 +00:00
Edward Tomasz Napierala
1a8577fa68 Add defines for Linux errno values and use them to make linux_errtbl[]
more readable.  While here, add linux_check_errtbl() function to make
sure we don't leave holes.
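
A minimal sketch of the idea, assuming a table indexed by BSD errno that holds negated Linux errno values. All names prefixed with example_ or EX_ are hypothetical, and only a four-entry subset is shown:

```c
#include <stddef.h>

#define nitems(x)	(sizeof((x)) / sizeof((x)[0]))

/* BSD errno values used as indices (illustrative subset). */
#define EX_BSD_EPERM	1
#define EX_BSD_ENOENT	2
#define EX_BSD_ESRCH	3
#define EX_BSD_EINTR	4

/* Matching Linux errno values (identical for these four). */
#define EX_LINUX_EPERM	1
#define EX_LINUX_ENOENT	2
#define EX_LINUX_ESRCH	3
#define EX_LINUX_EINTR	4

/* Named initializers make each mapping readable on its own line. */
static const int example_dense_tbl[] = {
	[EX_BSD_EPERM] = -EX_LINUX_EPERM,
	[EX_BSD_ENOENT] = -EX_LINUX_ENOENT,
	[EX_BSD_ESRCH] = -EX_LINUX_ESRCH,
	[EX_BSD_EINTR] = -EX_LINUX_EINTR,
};

/* Same table with the ESRCH entry forgotten, leaving a zero-filled hole. */
static const int example_holey_tbl[] = {
	[EX_BSD_EPERM] = -EX_LINUX_EPERM,
	[EX_BSD_ENOENT] = -EX_LINUX_ENOENT,
	[EX_BSD_EINTR] = -EX_LINUX_EINTR,
};

/* Return the first index above 0 with no mapping, or 0 if there is none. */
static size_t
example_check_errtbl(const int *tbl, size_t n)
{
	for (size_t i = 1; i < n; i++)
		if (tbl[i] == 0)
			return (i);
	return (0);
}
```

Unlisted indices in a designated-initializer array are zero-filled, which is why a check for zero entries can detect a forgotten mapping.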

No objections:	emaste (earlier version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26972
2020-10-29 14:23:52 +00:00
Edward Tomasz Napierala
1701c69b6e Make linux_errtbl a bit more readable by using named initializers.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26970
2020-10-28 14:16:08 +00:00
Edward Tomasz Napierala
866b1f5147 Fix misnomer - linux_to_bsd_errno() does the exact opposite.
Reported by:	arichardson
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26965
2020-10-27 12:49:40 +00:00
Mateusz Guzik
fe76bef462 linux: silence renameat2 flags warning
Hogs the console while building the Linux kernel in an Ubuntu Focal jail.
2020-10-26 18:03:50 +00:00
Mateusz Guzik
1024de70f9 linux: add missing conversions for compat.linux.use_emul_path handling 2020-10-26 18:02:52 +00:00
Kyle Evans
275c821d3d audit: correct reporting of *execve(2) success
r326145 corrected do_execve() to return EJUSTRETURN upon success so that
important registers are not clobbered. This had the side effect of tapping
out 'failures' for all *execve(2) audit records, which is less than useful
for auditing purposes.

Audit exec returns earlier, where we can know for sure that EJUSTRETURN
translates to success. Note that this unsets TDP_AUDITREC as we commit the
audit record, so the usual audit in the syscall return path will do nothing.

PR:		249179
Reported by:	Eirik Oeverby <ltning-freebsd anduin net>
Reviewed by:	csjp, kib
MFC after:	1 week
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26922
2020-10-24 14:39:17 +00:00
Edward Tomasz Napierala
b3be0b4d0c Tweak linux(4) socket(2) debug messages.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26900
2020-10-24 14:25:38 +00:00
Edward Tomasz Napierala
62b1382ff3 Further improve prctl(2) debug.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26916
2020-10-24 14:23:44 +00:00
Hans Petter Selasky
ab79c9061c Implement xa_init() in the LinuxKPI as a wrapper for xa_init_flags().
MFC after:		1 week
Sponsored by:		Mellanox Technologies // NVIDIA Networking
2020-10-24 13:16:10 +00:00