Commit Graph

3656 Commits

Author SHA1 Message Date
Hans Petter Selasky
d96e599643 Implement extensible arrays API using the existing radix tree implementation
in the LinuxKPI.

Differential Revision:	https://reviews.freebsd.org/D25101
Reviewed by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-08-27 10:28:12 +00:00
Mateusz Guzik
feabaaf995 cache: drop the always curthread argument from reverse lookup routines
Note VOP_VPTOCNP keeps getting it as temporary compatibility for zfs.

Tested by:	pho
2020-08-24 08:57:02 +00:00
Mateusz Guzik
7ad2a82da2 vfs: drop the error parameter from vn_isdisk, introduce vn_isdisk_error
Most consumers pass NULL.
2020-08-19 02:51:17 +00:00
Mateusz Guzik
a125ed50a6 linux: add sysctl compat.linux.use_emul_path
This is a step towards facilitating jails with only Linux binaries.
Supporting emul_path adds path lookups which are completely spurious
if the binary at hand runs in a Linux-based root directory.

It defaults to on (== current behavior).

make -C /root/linux-5.3-rc8 -s -j 1 bzImage:

use_emul_path=1: 101.65s user 68.68s system 100% cpu 2:49.62 total
use_emul_path=0: 101.41s user 64.32s system 100% cpu 2:45.02 total
2020-08-18 22:04:22 +00:00
Mark Johnston
a7044c60a5 Fix handling of ancillary data on non-AF_UNIX Linux sockets.
After r340674, the "continue" would restart the loop without having
updated clen, resulting in an infinite loop.  Restore the old behaviour
of simply ignoring all control messages on such sockets, since we
currently only implement handling for AF_UNIX-specific messages.

Reported by:	syzkaller
Reviewed by:	tijl
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26093
2020-08-18 14:17:14 +00:00
Mark Johnston
d9565182fd Remove "emulation" of clone(CLONE_PARENT | CLONE_THREAD).
On Linux this is supposed to result in EINVAL.

Reported by:	syzkaller
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-08-17 21:30:49 +00:00
Mark Johnston
74a796e0fc Fix a lock leak when emulating futex(FUTEX_WAIT_BITSET).
Reported by:	syzkaller
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-08-17 21:30:15 +00:00
Mark Johnston
30dcce2709 Skip Linux madvise(MADV_DONTNEED) on unmanaged objects.
vm_object_madvise() is a no-op for unmanaged objects, but we should also
limit the scope of mappings on which pmap_remove() is called.  In
particular, with the WIP largepage shm objects patch the kernel must
remove mappings of such objects along superpage boundaries, and without
this check Linux madvise(MADV_DONTNEED) could violate that requirement.

Reviewed by:	alc, kib
MFC with:	r362631
Sponsored by:	Juniper Networks, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D26084
2020-08-17 17:14:56 +00:00
Mateusz Guzik
a92a971bbb vfs: remove the thread argument from vget
It was already asserted to be curthread.

Semantic patch:

@@

expression arg1, arg2, arg3;

@@

- vget(arg1, arg2, arg3)
+ vget(arg1, arg2)
2020-08-16 17:18:54 +00:00
Emmanuel Vadot
0e123c13fe linuxkpi: Add a few wait_bit functions
The linux function does a lot more than that as multiple waitqueue could be fetch
from a static table based on the hash of the argument but since in DRM it's only used
in one place just add a single variable.
We will probably need to change that in the futur but it's ok with DRM even with current
linux.

Reviewed by:	hselasky
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26054
2020-08-14 08:48:17 +00:00
Mark Johnston
9eb0cd08ae linprocfs: Fix some inaccuracies in meminfo.
- Fill out MemFree correctly.  Delete an ancient comment suggesting that
  we don't want to advertise the true quantity of free memory.
- Populate the Buffers field by reading vfs.bufspace.
- The page cache consists of all pages in page queues, not just the
  inactive queue.

PR:		248463
Reported and tested by:	danfe
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-08-12 16:08:44 +00:00
Mark Johnston
aa5dbc8953 Remove sys/compat/netbsd.
It contained only a header used by ncv(4), which was mainly used on pc98
systems.  Both ncv(4) and pc98 support have long been removed.
2020-08-11 16:40:09 +00:00
Hans Petter Selasky
74d3a63559 Use atomic_clear_rel_long() to implement clear_bit_unlock() in the LinuxKPI
after r363842.

Suggested by:	alc@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-08-11 12:41:40 +00:00
Hans Petter Selasky
6ae240797f Need to clone the task struct fields related to RCU aswell in the
LinuxKPI after r359727. This fixes a minor regression issue. Else the
priority tracking won't work properly when both sleepable and
non-sleepable RCU is in use on the same thread.

Bump the __FreeBSD_version to force recompilation of external kernel
modules.

PR:		242272
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-08-11 12:17:46 +00:00
Mateusz Guzik
51ea7bea91 vfs: add VOP_STAT
The current scheme of calling VOP_GETATTR adds avoidable overhead.

An example with tmpfs doing fstat (ops/s):
before: 7488958
after:  7913833

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D25910
2020-08-07 23:06:40 +00:00
Hans Petter Selasky
6b839ff47b Implement radix_tree_store() in the LinuxKPI for use with the coming
extensible arrays implementation.

While at it add some more comments explaining the current
radix_tree_insert() function and make sure to clean the root node when
the radix tree reaches the maximum height. This can happen if the
index passed is too big when the tree is empty.

The radix_tree_store() function is basically a copy of the
radix_tree_insert() function with some added functionality.

The radix_tree_store() function is local to FreeBSD and does not yet
exist in Linux.

Reviewed by:		kib
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-08-07 16:15:44 +00:00
Mark Johnston
1b1428dcc8 Fix a TOCTOU vulnerability in freebsd32_copyin_control().
PR:		248257
Reported by:	m00nbsd working with Trend Micro Zero Day Initiative
Reviewed by:	kib
Security:	SA-20:23.sendmsg
Security:	CVE-2020-7460
Security:	ZDI-CAN-11543
2020-08-05 17:06:14 +00:00
Emmanuel Vadot
dfb4ecb38b linuxkpi: Add time_after32 and time_before32
This compare two 32 bits times

Sponsored by: The FreeBSD Foundation
Reviewed by:	kib, hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25700
2020-08-04 15:27:32 +00:00
Emmanuel Vadot
334680ab07 linuxkpi: Add clear_bit_unlock
This calls clear_bit and adds a memory barrier.

Sponsored by: The FreeBSD Foundation

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25943
2020-08-04 15:25:22 +00:00
Emmanuel Vadot
38ba9c8bac Re-apply r363564.
We now have linux/sizes.h in the tree.
2020-08-04 14:53:41 +00:00
Emmanuel Vadot
2d946b2e12 linuxkpi: Add nested variant of mutex_lock_interruptible
We don't do anything with the _nesteds variant so just call mutex_lock_interruptible

Sponsoredby: The FreeBSD Foundation
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25944
2020-08-04 14:45:22 +00:00
Emmanuel Vadot
7237a74f3b linuxkpi: Add kref_put_lock
Same as kref_put but in addition to calling the rel function it will
acquire the lock first.

Sponsored by: The FreeBSD Foundation
Reviewed by:	hselasky, emaste
Differential Revision:	https://reviews.freebsd.org/D25942
2020-08-04 14:44:16 +00:00
Emmanuel Vadot
16fdd8b7ad linuxkpi: Add linux/sizes.h
This file contain some defines for common sizes.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky, emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25941
2020-08-04 14:42:38 +00:00
Emmanuel Vadot
85d787b2fe Fix r363565
lockdep.h needs sys/lock.h for LOCK_CLASS
2020-07-26 18:33:29 +00:00
Emmanuel Vadot
cdb6eebe08 Revert r363564
linux/sizes.h doesn't exists in base ... sorry.
2020-07-26 17:21:24 +00:00
Emmanuel Vadot
0e4e9e8f34 linuxkpi: Add taint* defines
This isn't used for us but allow us to port drivers more easily.

Reviewed by:	hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25703
2020-07-26 16:31:49 +00:00
Emmanuel Vadot
f12af2b387 linuxkpi: Include hardirq.h in preempt.h and lockdep.h in hardirq.h
Linux does the same, this avoids ifdef or extra includes in ported drivers.

Reviewed by:	emaste, hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25702
2020-07-26 16:30:59 +00:00
Emmanuel Vadot
820272c408 linuxkpi: Include linux/sizes.h in dma-mapping.h
Linux does the same, this avoids ifdef or extra includes in ported drivers.

Reviewed by:	emaste, hselasky
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25701
2020-07-26 16:30:01 +00:00
Mark Johnston
94140f4781 usb(4): Stop checking for failures from malloc(M_WAITOK).
Handle the fact that parts of usb(4) can be compiled into the boot
loader, where M_WAITOK does not guarantee a successful allocation.

PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org> (original version)
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25706
2020-07-22 14:32:47 +00:00
Edward Tomasz Napierala
aa75412146 Make linux(4) support the BLKPBSZGET ioctl. Oracle uses it.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25694
2020-07-19 12:25:03 +00:00
Edward Tomasz Napierala
d5c5b4b382 Make linux fallocate(2) return EOPNOTSUPP, not ENOSYS, on unsupported mode,
as documented in the man page.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-07-18 12:21:08 +00:00
Edward Tomasz Napierala
eb6ae7576d Bump the default linux version from 3.2.0 to 3.10.0, which corresponds
to RHEL 7.  Required for DB2.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25656
2020-07-18 11:37:30 +00:00
Edward Tomasz Napierala
8d1d017175 Add a trivial linux(4) splice(2) implementation, which simply
returns EINVAL.  Fixes grep (grep-3.1-2build1).

PR:		kern/218699
Reported by:	avos
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25636
2020-07-18 11:28:40 +00:00
Edward Tomasz Napierala
978ffef22f Add missing SysV IPC stats to linprocfs(4). Fixes 'ipcs -l',
and also helps Oracle.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25669
2020-07-18 10:56:04 +00:00
Edward Tomasz Napierala
7ce051e799 Fix bogomips calculation. Previously it was off by half. This was
verified under VMWare Fusion, comparing to what's reported under CentOS,
and by comparing numbers reported by linuxulator on T420 with a googled
up Linux cpuinfo (https://lkml.org/lkml/2011/11/29/116).

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20693
2020-07-18 10:53:56 +00:00
Edward Tomasz Napierala
8ba7dddd6f Fix two typos in flag names in /proc/cpuinfo.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25695
2020-07-18 10:49:17 +00:00
Vladimir Kondratyev
34c2f79d83 linuxkpi: Ignore NULL pointers passed to string parameter of kstr(n)dup
That follows Linux and fixes related drm-kmod-5.3 panic.

Reviewed by:	imp, hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25657
2020-07-14 21:56:59 +00:00
Alexander Leidinger
e37db348c1 Fix r363125 (Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28)),
by realy using the MONOTONIC version and not the REALTIME version.

Noticed by:	myfreeweb at github
2020-07-12 14:57:29 +00:00
Alexander Leidinger
8c2602f30f Implement CLOCK_MONOTONIC_RAW (linux >= 2.6.28).
It is documented as a raw hardware-based clock not subject to NTP or
incremental adjustments. With this "not as precise as CLOCK_MONOTONIC"
description in mind, map it to our CLOCK_MONOTNIC_FAST (the same
mapping as for the linux CLOCK_MONOTONIC_COARSE).

This is needed for the webcomponent of steam (chromium) and some
other steam component or game.

The linux-steam-utils port contains a LD_PRELOAD based fix for this.
There this is mapped to CLOCK_MONOTONIC.
As an untrained ear/eye (= the majority of people) is normaly not
noticing a difference of jitter in the 10-20 ms range, specially
if you don't pay attention like for example in a browser session
while watching a video stream, the mapping to CLOCK_MONOTONIC_FAST
seems more appropriate than to CLOCK_MONOTONIC.
2020-07-12 09:51:09 +00:00
Edward Tomasz Napierala
ce28fd95e0 Make linprocfs(5) report correct tty number in /proc/<PID>/stat.
Fixes sudo (sudo-1.8.21p2-3ubuntu1.2); previously would fail
with "sudo: no tty present and no askpass program specified".

Reviewed by:	kib, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25588
2020-07-11 13:11:54 +00:00
Edward Tomasz Napierala
17f701a3fb Make linux stat(2) return the same st_dev for every devfs instance.
The reason for this is to work around an idiosyncrasy of glibc
getttynam(3) implementation: it checks whether st_dev returned for
fd 0 is the same as st_dev returned for the target of /proc/self/fd/0
symlink, and with linux chroots having their own devfs instance,
the check will fail if you chrooted into it.

PR:		kern/240767
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25559
2020-07-11 13:08:16 +00:00
Edward Tomasz Napierala
09c4e43d18 Don't emit warnings on MADV_HUGEPAGE; Firefox uses it a lot.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-07-10 21:41:09 +00:00
Hans Petter Selasky
127d8cfafb Implement the bitmap_subset() function in the LinuxKPI. This function
checks if the bitmap pointed to by the first argument is a subset of
the bitmap pointed to by the second argument. The function returns one
on success and zero on failure.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-10 12:06:18 +00:00
Hans Petter Selasky
d2890eeea1 Implement the array_size() function in the LinuxKPI. This function
basically multiplies its two arguments and returns SIZE_MAX if the
result overflows the size_t type.  Else the product of the two
arguments is returned.

Bump the FreeBSD_version to mitigate issues with existing
implementation of array_size() in drm-devel-kmod.

Discussed with:		manu@
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2020-07-10 11:27:54 +00:00
Kyle Evans
423a033ba7 memfd_create: turn on SHM_GROW_ON_WRITE
memfd_create fds will no longer require an ftruncate(2) to set the size;
they'll grow (to the extent that it's possible) upon write(2)-like syscalls.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25502
2020-07-10 00:45:16 +00:00
Mark Johnston
866a5d1298 Regenerate.
Sponsored by:	The FreeBSD Foundation
2020-07-06 16:34:49 +00:00
Hans Petter Selasky
588fbadffb Fix include file order in io.h in the LinuxKPI.
Make sure sys/types.h is included before machine/vm.h.

PR:		247775
Submitted by:	pkubaj@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-05 19:38:36 +00:00
Edward Tomasz Napierala
4d2b7be54a Fix Linux recvmsg(2) when msg_namelen returned is 0. Previously
it would fail with EINVAL, breaking some of the Python regression
tests.

While here, cap the user-controlled message length.

Note that the code doesn't seem to be copying out the new length
in either (success or failure) case. This will be addressed separately.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25392
2020-07-05 10:57:28 +00:00
Edward Tomasz Napierala
7fcc9f7ee5 Add /proc/sys/kernel/tainted to linprocfs(5). Helps LTP.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25556
2020-07-04 11:26:03 +00:00
Edward Tomasz Napierala
8b99a63fd8 Make linprocfs(5) create /proc/bus/pci/devices/, and linsysfs(5)
create /sys/class/power_supply/.  This silences some warnings
from biology/linux-foldingathome.

Reported by:	0mp
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25557
2020-07-04 11:22:35 +00:00
Mateusz Guzik
d6d9ddd41f linux: fix ioctl performance for termios
TCGETS et al are frequently issued by Linux binaries while the previous code
avoidably ping-pongs a global sx lock and serializes on Giant.

Note that even with the fix the common case will serialize on a per-tty lock.
2020-07-04 06:25:41 +00:00
Konstantin Belousov
f334f212d9 linuxkpi: improvements for linux_pid_task() and linux_get_pid_task().
Unify functions bodies.
Do not call tdfind() if pid is passed, and do not call pfind() if tid
is supplied.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25534
2020-07-02 10:42:58 +00:00
Edward Tomasz Napierala
6d76adbb6d Rework linux accept(2). This makes the code flow easier to follow,
and fixes a bug where calling accept(2) could result in closing fd 0.

Note that the code still contains a number of problems: it makes
assumptions about l_sockaddr_in being the same as sockaddr_in,
the EFAULT-related code looks like it doesn't work at all, and the
socket type check is racy.  Those will be addressed later on;
I'm trying to work in small steps to avoid breaking one thing while
fixing another.

It fixes Redis, among other things.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25461
2020-07-01 10:37:08 +00:00
Hans Petter Selasky
9a4e535b39 The "pid" field in the LinuxKPI task struct is typically set to the thread ID
and not the process ID. Make sure the linux_task_exiting() function uses tdfind()
to lookup the BSD procedure structure pointer by the "pid" field, and only
fallback to pfind() when no match is found! This makes linux_task_exiting()
in line with the rest of the code.

Differential Revision: https://reviews.freebsd.org/D25509
Submitted by:	Greg V <greg@unrelenting.technology>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-07-01 08:23:57 +00:00
Edward Tomasz Napierala
c2da36fecd Make linprocfs(5) create the /proc/<PID>/task/ directores.
This is to silence down some Chromium assertions.

PR:		kern/240991
Analyzed by:	Alex S <iwtcex@gmail.com>
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25256
2020-06-30 16:24:28 +00:00
Edward Tomasz Napierala
9bc42c18cb Make linux(4) ignore SA_INTERRUPT. The zsh(1) binary from Bionic uses it.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25499
2020-06-30 16:18:09 +00:00
Hans Petter Selasky
d326a6c7c1 Document the is_signed(), type_max() and type_min() function macros in the
LinuxKPI. Try to make the function argument more readable.

Suggested by:	several
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-30 08:41:33 +00:00
Kyle Evans
97ce5033a8 linux: reposition the comment for bsd_to_linux_bits/linux_to_bsd_bits
rpokala notes that splitting the definitions like this is kind of silly,
since the comment applies to both.  Move the comment up (or the definition
down, depending on your perspective on life) accordingly.

Reported by:	rpokala
2020-06-29 17:47:00 +00:00
Hans Petter Selasky
d0eed838e3 Implement is_signed(), type_max() and type_min() function macros in the
LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-29 13:08:40 +00:00
Kyle Evans
5403f186a7 linuxolator: implement memfd_create syscall
This effectively mirrors our libc implementation, but with minor fudging --
name needs to be copied in from userspace, so we just copy it straight into
stack-allocated memfd_name into the correct position rather than allocating
memory that needs to be cleaned up.

The sealing-related fcntl(2) commands, F_GET_SEALS and F_ADD_SEALS, have
also been implemented now that we support them.

Note that this implementation is still not quite at feature parity w.r.t.
the actual Linux version; some caveats, from my foggy memory:

- Need to implement SHM_GROW_ON_WRITE, default for memfd (in progress)
- LTP wants the memfd name exposed to fdescfs
- Linux allows open() of an fdescfs fd with O_TRUNC to truncate after dup.
  (?)

Interested parties can install and run LTP from ports (devel/linux-ltp) to
confirm any fixes.

PR:		240874
Reviewed by:	kib, trasz
Differential Revision:	https://reviews.freebsd.org/D21845
2020-06-29 03:09:14 +00:00
Mark Johnston
3507b8d467 Remove some redundant assignments and computations.
Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25400
2020-06-28 21:34:38 +00:00
Edward Tomasz Napierala
4fe5361cbe Make linux(4) support SO_PROTOCOL. Running Python test suite
with python3.8 from Focal triggers those.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25491
2020-06-28 18:56:32 +00:00
Edward Tomasz Napierala
d5629eb216 Make linux(4) warn about unsupported SA_ flags.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25453
2020-06-27 15:50:35 +00:00
Mark Johnston
f4134e3d87 Implement an approximation of Linux MADV_DONTNEED semantics.
Linux MADV_DONTNEED is not advisory: it has side effects for anonymous
memory, and some system software depends on that.  In particular,
MADV_DONTNEED causes anonymous pages to be discarded.  If the mapping is
a private mapping of a named object then subsequent faults are to
repopulate the range from that object, otherwise pages will be
zero-filled.  For mappings of non-anonymous objects, Linux MADV_DONTNEED
can be implemented in the same way as our MADV_DONTNEED.

This implementation differs from Linux semantics in its handling of
private mappings, inherited through fork(), of non-anonymous objects.
After applying MADV_DONTNEED, subsequent faults will repopulate the
mapping from the parent object rather than the root of the shadow chain.

PR:		230160
Reviewed by:	alc, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25330
2020-06-25 20:30:30 +00:00
Doug Moore
158c55a584 In r362552, RB_SET_PARENT is defined, and use in parens in
RB_CLEAR_NODE.  But it is not an expression, and ought not to be
enclosed in parens.  Remove them.

Approved by:	markj
Differential Revision:	https://reviews.freebsd.org/D25421
2020-06-23 22:47:54 +00:00
Doug Moore
4d56980017 Define RB_SET_PARENT to do all assignments to rb parent
pointers. Define RB_SWAP_CHILD to replace the child of a parent with
its twin, and use it in 4 places. Use RB_SET in rb_link_node to remove
the only linuxkpi reference to color, and then drop color- and
parent-related definitions that are defined and used only in rbtree.h.

This is intended to be entirely cosmetic, with no impact on program
behavior, and leave RB_PARENT and RB_SET_PARENT as the only ways to
read and write rb parent pointers.

Reviewed by:	markj, kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25264
2020-06-23 20:02:55 +00:00
Thomas Munro
f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
Edward Tomasz Napierala
52c81be11a Add linux_madvise(2) instead of having Linux apps call the native
FreeBSD madvise(2) directly.  While some of the flag values match,
most don't.

PR:		kern/230160
Reported by:	markj
Reviewed by:	markj
Discussed with:	brooks, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25272
2020-06-20 18:29:22 +00:00
Edward Tomasz Napierala
4afe4fae1b Add warnings for unsupported Linux clockids.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25322
2020-06-19 19:33:06 +00:00
Mark Johnston
0f1e6ec591 Add a helper function for validating VA ranges.
Functions which take untrusted user ranges must validate against the
bounds of the map, and also check for wraparound.  Instead of having the
same logic duplicated in a number of places, add a function to check.

Reviewed by:	dougm, kib
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25328
2020-06-19 03:32:04 +00:00
Konstantin Belousov
8a15ac8378 Fix execution of linux binary from multithreaded non-Linux process.
If multithreaded non-Linux process execs Linux binary, then non-Linux
threads different from the one that execing are cleared by
single-threading at boundary, and then terminating them in
post_execve(). Since at that time the process is already switched to
linux ABI, linuxolator is involved in the thread clearing on boundary,
but cannot find the emul data.

Handle it by pre-creating emuldata for all threads in the execing process.

Also remove a code in linux_proc_exec() handler that cleared emul data
for other threads when execing from multithreaded Linux process. It is
excessive.

PR:	247020
Reported by:	Martin FIlla <freebsd@sysctl.cz>
Reported by:	Henrique L. Amorim, Independent Security Researcher
Reported by:	Rodrigo Rubira Branco (BSDaemon), Amazon Web Services
Reviewed by:	markj
Tested by:	trasz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D25293
2020-06-18 20:49:56 +00:00
Edward Tomasz Napierala
3d8dd98381 Make Linux uname(2) return x86_64 to 32-bit apps. This helps Steam.
PR:		kern/240432
Analyzed by by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25248
2020-06-15 20:12:10 +00:00
Edward Tomasz Napierala
889cd28520 Make linux(4) warn about unsupported CMSG level/type.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25255
2020-06-14 14:38:40 +00:00
Doug Moore
9f1041dc2e Linuxkpi uses the rb-tree structures without using their interfaces,
making them break when the representation changes. Revert changes that
eliminated the color field from rb-trees, leaving everything as it was
before.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25250
2020-06-13 01:54:09 +00:00
Doug Moore
13dca1937f Revert r362108, as it breaks compilation. 2020-06-12 17:48:12 +00:00
Doug Moore
3159ceca97 The linuxkpi code accesses left/right rb tree pointers without using
RB_LEFT or RB_RIGHT, so they aren't stripping off the color bit
encoded there. Strip off that bit for linuxkpi.

Reported by:	dch
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25245
2020-06-12 16:51:55 +00:00
Edward Tomasz Napierala
462171d9aa Add compat.linux.debug sysctl, to make it possible to silence down
the debug messages. While here, clean up some variable naming.

Reviewed by:	bcr (manpages), emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25230
2020-06-12 14:37:50 +00:00
Edward Tomasz Napierala
599dadca55 Fix naming clash.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-12 14:31:19 +00:00
Edward Tomasz Napierala
34ff0c0e6a Make linux(4) warn about unsupported fcntls.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25231
2020-06-12 14:25:32 +00:00
Edward Tomasz Napierala
4beacc3b1d Minor code cleanup; no functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25232
2020-06-12 14:23:10 +00:00
Edward Tomasz Napierala
86e794eb65 Don't use newlines with linux_msg(). No functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-11 14:57:30 +00:00
Edward Tomasz Napierala
bc8e281082 Replace LINUX_FASYNC with LINUX_O_ASYNC; no functional changes.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25218
2020-06-11 14:09:43 +00:00
Edward Tomasz Napierala
433d61a573 Improve the warnings.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-06-11 12:35:00 +00:00
Edward Tomasz Napierala
3bc69ad9b3 Make linux(4) handle SO_REUSEPORT.
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25216
2020-06-11 12:25:49 +00:00
Mark Johnston
479f70ef24 Fix a couple of nits in Linux sysinfo(2) emulation.
- Use the same definition of free memory as Linux.
- Rename the totalbig and freebig fields to match the corresponding
  names on Linux.

Discussed with:	alc
MFC after:	1 week
2020-06-10 23:52:50 +00:00
Mark Johnston
27e4374dd4 Add a comment reflecting the commit log for r361945.
Suggested by:	alc
Reviewed by:	alc
MFC with:	r361945
2020-06-10 23:52:39 +00:00
Edward Tomasz Napierala
8c5059e9ea Make linux(4) set the openfiles soft resource limit to 1024 for Linux
applications, which often depend on this being the case.  There's a new
sysctl, compat.linux.default_openfiles, to control this behaviour.

Reviewed by:	kevans, emaste, bcr (manpages)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25177
2020-06-10 18:50:46 +00:00
Edward Tomasz Napierala
c31a6a6612 Support SO_SNDBUFFORCE/SO_RCVBUFFORCE by aliasing them to the
standard SO_SNDBUF/SO_RCVBUF.  Mostly cosmetics, to get rid
of the warning during 'apt upgrade'.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25173
2020-06-10 18:43:43 +00:00
Doug Moore
36ba4b393f To reduce the size of an rb_node, drop the color field. Set the least
significant bit in the pointer to the node from its parent to indicate
that the node is red. Have the tree rotation macros leave the
old-parent/new-child node red and the new-parent/old-child node black.

This change makes RB_LEFT and RB_RIGHT no longer assignable, and
RB_COLOR no longer defined. Any code that modifies the tree or
examines a node color would have to be modified after this change.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D25105
2020-06-09 20:19:11 +00:00
John Baldwin
58b552dcec Refactor ptrace() ABI compatibility.
Add a freebsd32_ptrace() and move as many freebsd32 shims as possible
to freebsd32_ptrace().  Aside from register sets, freebsd32 passes
pointers to native structures to kern_ptrace() and converts to/from
native/32-bit structure formats in freebsd32_ptrace() outside of
kern_ptrace().

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25195
2020-06-09 16:43:23 +00:00
Mark Johnston
3e5fae34fc Stop computing a "sharedram" value when emulating Linux sysinfo(2).
The previous code was computing an incorrect value in a very expensive
manner.  "sharedram" is supposed to be the amount of memory used by
named swap objects, which on FreeBSD basically corresponds to memory
usage by shared memory objects (including, for example, GEM objects) and
tmpfs.  We currently have no cheap way to count such pages.  The
previous code tried to determine the number of copy-on-write pages
shared between processes.

Just replace the computed value with 0.  illumos reportedly does the
same thing.  Linux itself did not populate this field until a 2014
commit, "mm: export NR_SHMEM via sysinfo(2) / si_meminfo() interfaces".

Reported by:	mjg
MFC after:	1 week
2020-06-08 22:29:52 +00:00
Hans Petter Selasky
c51613866f Ensure pci_channel_offline() actually queries the PCI register space,
and not only the software cache of that register.  Else
pci_channel_offline() won't detect that the PCI device is gone when
using the LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-05 08:12:08 +00:00
Hans Petter Selasky
d053391cd7 Implement __is_constexpr() function macro in the LinuxKPI.
Bump the FreeBSD version.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 12:23:04 +00:00
Hans Petter Selasky
ef5f8c18b5 Implement struct_size() function macro in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 10:19:45 +00:00
Hans Petter Selasky
c185f13b92 Implement BUILD_BUG_ON_ZERO() in the LinuxKPI.
Tested using gcc and clang.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-06-02 09:45:43 +00:00
Rick Macklem
c01cd3f558 Update the files created from the new syscalls.master from r361599.
Reviewed by:	brooks
Differential Revision:	https://reviews.freebsd.org/D24949
2020-05-28 21:23:02 +00:00
Rick Macklem
d9021e389a Add a syscall for the nfs-over-tls daemons to use.
The nfs-over-tls daemons need a system call to perform operations such as
associate a file descriptor with a krpc socket.
The daemons will not be in head for some time, but it will make it
easier for testers of nfs-over-tls to do testing if the system call
is in head (basically the stub for libc which will be commited soon).

Reviewed by:	brooks
Differential Revision:	https://reviews.freebsd.org/D24949
2020-05-28 21:06:10 +00:00
Emmanuel Vadot
8e52f22b25 linuxkpi: Add kstrtou16
This function convert a char * to a u16.
Simply use strtoul and cast to compare for ERANGE

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24996
2020-05-27 11:42:09 +00:00
Emmanuel Vadot
9b703f3082 linuxkpi: Add rcu_swap_protected
This macros swap an rcu pointer with a normal pointer.
The condition only seems to be used for debug/warning under linux, ignore
for now.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24954
2020-05-27 10:01:30 +00:00
Emmanuel Vadot
8287045dde linuxkpi: Add overflow.h
Only add check_add_overflow and check_mul_overflow as those are the only
two needed function by DRM v5.3.
Both gcc and clang have builtin to do this check so use them directly
but throw an error if the compiler/code checker doesn't support this builtin.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselsasky
Differential Revision:	https://reviews.freebsd.org/D25015
2020-05-27 09:31:50 +00:00
Emmanuel Vadot
42f0f394f7 linuxkpi: Fix mod_timer and del_timer_sync
mod_timer is supposed to return 1 if the modified timer was pending, which
is exactly what callout_reset does so return the value after checking
that it's a correct one in case the api change.
del_timer_sync returns int so add a function and handle that.

Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24983
2020-05-25 12:46:05 +00:00
Emmanuel Vadot
4efd5dd70d linuxkpi: Add refcount.h
Implement some refcount functions needed by drm.
Just use the atomic_t struct and functions from linuxkpi for simplicity.

Sponsored-by: The FreeBSD Foundation

Reviewed by:	hselsasky
Differential Revision:	https://reviews.freebsd.org/D24985
2020-05-25 12:44:07 +00:00
Emmanuel Vadot
93d70cd3fe linuxkpi: Add __same_type and __must_be_array macros
The same_type macro simply wraps around builtin_types_compatible_p which
exist for both GCC and CLANG, which returns 1 if both types are the same.
The __must_be_array macros returns 1 if the argument is an array.

This is needed for DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24953
2020-05-25 12:42:55 +00:00
Emmanuel Vadot
af300929f9 linuxkpi: Add prandom_u32_max
This is just a wrapper around arc4random_uniform
Needed by DRM v5.3

Sponsored-by: The FreeBSD Foundation
Reviewed by:	cem, hselasky
Differential Revision:	https://reviews.freebsd.org/D24961
2020-05-23 17:52:25 +00:00
Emmanuel Vadot
2491b25c3a linuxkpi: Add rcu_work functions
The rcu_work function helps to queue some work after waiting for a grace
period.
This is needed by DRM drivers.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24942
2020-05-21 20:18:38 +00:00
Emmanuel Vadot
eda697d2ef linuxkpi: Add irq_work.h
Since handlers are call in a thread context we can simply use a workqueue
to emulate those functions.
The DRM code was patched to do that already, having it in linuxkpi allows us
to not patch the upstream code.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24859
2020-05-19 09:04:35 +00:00
Emmanuel Vadot
5e30a739c7 linuxkpi: add pci_dev_present
pci_dev_present shows if a set of pci ids are present in the system.
It just wraps pci_find_device.
Needed by DRMv5.2

Submitted by:	Austing Shafer (ashafer@badland.io)
Differential Revision:	https://reviews.freebsd.org/D24796
2020-05-19 08:44:33 +00:00
Emmanuel Vadot
ff443195bf linuxkpi: Add __init_waitqueue_head
The only difference with init_waitqueue_head is that the name and the
lock class key are provided but we don't use those so use init_waitqueue_head
directly.

Sponsored-by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24861
2020-05-19 08:43:17 +00:00
Emmanuel Vadot
355711ea76 linuxkpi: Add offsetofend macro
This calculate the offset of the end of the member in the given struct.
Needed by DRM in Linux v5.3

Sponsored-by: The FreeBSD Foudation
Differential Revision:	https://reviews.freebsd.org/D24849
2020-05-17 20:14:49 +00:00
Emmanuel Vadot
d003cc4318 linuxkpi: Add __mutex_init
Same as mutex_init, the lock_class_key argument seems to be only used for
debug in Linux, simply ignore it for now.
Needed by DRM in Linux v5.3

Sponsored-by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24848
2020-05-17 20:12:16 +00:00
Emmanuel Vadot
7708d3d765 linuxkpi: Add atomic_dec_and_mutex_lock
This function decrement the counter and if the result is 0 it acquires
the mutex and returns 1, if not it simply returns 0.
Needed by DRM from Linux v5.3

Sponsored-by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24847
2020-05-17 20:09:11 +00:00
Hans Petter Selasky
cf9f2ca3ef Implement synchronize_srcu_expedited() in the LinuxKPI.
Differential Revision:	https://reviews.freebsd.org/D24798
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-16 14:27:50 +00:00
Emmanuel Vadot
cfa985350d linuxkpi: Add EBADRQC to errno.h
This is used in the amdgpu driver from Linux 5.2

Sponsored-by: The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24807
2020-05-13 07:49:12 +00:00
Andriy Gapon
a164a32b4d linuxkpi: print stack trace in WARN_ON macros
Reviewed by:	hselasky, kib
MFC after:	1 week
Differential Revision: https://reviews.freebsd.org/D24779
2020-05-13 07:47:56 +00:00
Emmanuel Vadot
3d84874da0 linuxkpi: Really add bitmap_alloc and bitmap_zalloc
This was missing in r360870

Sponsored-by: The FreeBSD Foundation
2020-05-10 13:12:05 +00:00
Emmanuel Vadot
ce03b3013f linuxkpi: Add bitmap_alloc and bitmap_free
This is a simple call to kmallock_array/kfree, therefore include linux/slab.h as
this is where the kmalloc_array/kfree definition is.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselsasky
Differential Revision:	https://reviews.freebsd.org/D24794
2020-05-10 13:07:00 +00:00
Emmanuel Vadot
26a578697c linuxkpi: Add bitmap_copy and bitmap_andnot
bitmap_copy simply copy the bitmaps, no idea why it exists.
bitmap_andnot is similar to bitmap_and but uses !src2.

Sponsored-by: The FreeBSD Foundation
Reviewed by:	hselasky
Differential Revision:	https://reviews.freebsd.org/D24782
2020-05-09 17:52:50 +00:00
Emmanuel Vadot
4c27484934 linuxkpi: Add pci_iomap and pci_iounmap
Those function are use to map/unmap io region of a pci device.
Different resource can be mapped depending on the bar so use a
tailq to store them all.

Sponsored-by: The FreeBSD Foundation

Reviewed by:	emaste, hselasky
Differential Revision:	https://reviews.freebsd.org/D24696
2020-05-07 17:00:51 +00:00
Hans Petter Selasky
5e6233ccab Optimise use of sg_page_count() in __sg_page_iter_next() in the LinuxKPI.
No need to compute value twice.

No functional change intended.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-04 10:10:07 +00:00
Hans Petter Selasky
fe4b041a14 Implement more scatter and gather functions in the LinuxKPI.
Differential Revision:	https://reviews.freebsd.org/D24611
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-04 09:58:45 +00:00
Hans Petter Selasky
42f8ef4bf5 Fix warning about sleeping with non-sleepable lock when allocating
"current" from linux_cdev_pager_populate() in the LinuxKPI:

Backtrace:
witness_debugger()
witness_warn()
uma_zalloc_arg()
malloc()
linux_alloc_current()
linux_cdev_pager_populate()
vm_fault()
vm_fault_trap()
trap_pfault()
trap()
calltrap()

Suggested by:	avg@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-04 08:05:01 +00:00
Hans Petter Selasky
b4edb17c82 Implement more PCI-express bandwidth functions in the LinuxKPI.
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-01 10:32:42 +00:00
Hans Petter Selasky
1bbbe083a1 Implement mutex_lock_killable() in the LinuxKPI.
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-01 10:28:21 +00:00
Hans Petter Selasky
3ff7ec1cc1 Implement DIV64_U64_ROUND_UP() in the LinuxKPI.
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-01 10:25:07 +00:00
Hans Petter Selasky
922106bf00 Implement more lockdep macros in the LinuxKPI.
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-01 10:18:07 +00:00
Hans Petter Selasky
61f7fe6b2d Implement kstrtou64() in the LinuxKPI.
Submitted by:	ashafer_badland.io (Austin Shafer)
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-05-01 10:14:45 +00:00
Kyle Evans
2c9c433e17 sysent: re-roll after 360236 (AUE_CLOSERANGE used) 2020-04-24 01:30:33 +00:00
Kyle Evans
3e6b82913d close_range(2): use newly assigned AUE_CLOSERANGE 2020-04-24 01:30:00 +00:00
Hans Petter Selasky
253dbe7487 Factor code in LinuxKPI to allow attach and detach using any BSD device.
This allows non-LinuxKPI based infiniband device drivers to attach
correctly to ibcore.

No functional change intended.

Reviewed by:	np @
Differential Revision:	https://reviews.freebsd.org/D24514
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-22 14:33:25 +00:00
Hans Petter Selasky
b9bf16adfb Implement the atomic fetch add unless functions for the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-20 16:21:37 +00:00
Hans Petter Selasky
fdbfa4f19e Implement aligned LinuxKPI types for u16, u32 and u64.
Makes a difference for 32-bit platforms mostly.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-20 14:03:05 +00:00
Hans Petter Selasky
07fdea3672 Allow test_bit() in the LinuxKPI to accept a const pointer.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-20 13:47:15 +00:00
Hans Petter Selasky
47c0672b08 Allow the ERR_CAST() function in the LinuxKPI to take a const void pointer.
No functional change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-20 13:36:01 +00:00
Mark Johnston
546c117f86 Remove a vestigal reference to kmem_object.
kmem_object has been an alias of kernel_object for a while.

MFC after:	1 week
2020-04-17 19:12:52 +00:00
Brooks Davis
b24e6ac8b7 Convert canary, execpathp, and pagesizes to pointers.
Use AUXARGS_ENTRY_PTR to export these pointers.  This is a followup to
r359987 and r359988.

Reviewed by:	jhb
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24446
2020-04-16 21:53:17 +00:00
Brooks Davis
9df1c38bbc Export argc, argv, envc, envv, and ps_strings in auxargs.
This simplifies discovery of these values, potentially with reducing the
number of syscalls we need to make at runtime.  Longer term, we wish to
convert the startup process to pass an auxargs pointer to _start() and
use that rather than walking off the end of envv.  This is cleaner,
more C-friendly, and for systems with strong bounds (e.g. CHERI)
necessary.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24407
2020-04-15 20:23:55 +00:00
Brooks Davis
397df744f9 Make ps_strings in struct image_params into a pointer.
This is a prepratory commit for D24407.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
2020-04-15 20:21:30 +00:00
Brooks Davis
618a20d4f9 Remove bogus use of useracc() in (clock_)nanosleep.
There's no point in pre-checking that we can access the user's rmtp
pointer before we do it in copyout().

While here, improve style(9) compliance.

Reviewed by:	imp
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24409
2020-04-14 20:53:12 +00:00
Brooks Davis
562894f0dc Centralize compatability translation macros.
Copy the CP, PTRIN, etc macros from freebsd32.h into a sys/abi_compat.h
and replace existing definitation with includes where required. This
eliminates duplicate code and allows Linux and FreeBSD compatability
headers to be included in the same files.

Input from:	cem, jhb
Obtained from:	CheriBSD
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24275
2020-04-14 20:30:48 +00:00
Kyle Evans
e19b97f7a0 sysent: re-roll after r359930 2020-04-14 18:11:26 +00:00
Kyle Evans
7d03e08112 Mark closefrom(2) COMPAT12, reimplement in libc to wrap close_range
Include a temporarily compatibility shim as well for kernels predating
close_range, since closefrom is used in some critical areas.

Reviewed by:	markj (previous version), kib
Differential Revision:	https://reviews.freebsd.org/D24399
2020-04-14 18:07:42 +00:00
Kyle Evans
3d224fc909 sysent: re-roll after introduction of close_range in r359836 2020-04-12 21:23:51 +00:00
Kyle Evans
472ced39ef Implement a close_range(2) syscall
close_range(min, max, flags) allows for a range of descriptors to be
closed. The Python folk have indicated that they would much prefer this
interface to closefrom(2), as the case may be that they/someone have special
fds dup'd to higher in the range and they can't necessarily closefrom(min)
because they don't want to hit the upper range, but relocating them to lower
isn't necessarily feasible.

sys_closefrom has been rewritten to use kern_close_range() using ~0U to
indicate closing to the end of the range. This was chosen rather than
requiring callers of kern_close_range() to hold FILEDESC_SLOCK across the
call to kern_close_range for simplicity.

The flags argument of close_range(2) is currently unused, so any flags set
is currently EINVAL. It was added to the interface in Linux so that future
flags could be added for, e.g., "halt on first error" and things of this
nature.

This patch is based on a syscall of the same design that is expected to be
merged into Linux.

Reviewed by:	kib, markj, vangyzen (all slightly earlier revisions)
Differential Revision:	https://reviews.freebsd.org/D21627
2020-04-12 21:23:19 +00:00
Hans Petter Selasky
eae5868ce9 Clone the RCU interface into a sleepable and a non-sleepable part
in the LinuxKPI.

This allows synchronize RCU to be used inside a SRCU read section.
No functional change intended.

Bump the __FreeBSD_version to force recompilation of external kernel modules.

PR:		242272
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-08 17:09:45 +00:00
Hans Petter Selasky
61d82b0794 Some fixes for SRCU in the LinuxKPI.
- Make sure to use READ_ONCE() when deferring variables.
- Remove superfluous zero initializer.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-04-08 16:07:57 +00:00
John Baldwin
59838c1a19 Retire procfs-based process debugging.
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging.  ptrace() has a supserset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code.  This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR:		244939 (exp-run)
Reviewed by:	kib, mjg (earlier version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23837
2020-04-01 19:22:09 +00:00
Mark Johnston
4596ac234e compat/linux/linux.h depends on queue.h since r353725.
Sponsored by:	The FreeBSD Foundation
2020-03-26 17:12:55 +00:00
Warner Losh
9275cd0dc5 Implement a workaround for kms-drm modules
pci_iov_if.h was added to pci.h, but none of the kms-drm branches have
that. Rather than play whack a mole with the branches, move its inclusion to
linux_pci.c which is the only part of the code that needs it now.

Longer term, other solutions will be needed, but this gives us time to get those
deployed on all the supported versions.
2020-03-20 15:07:25 +00:00
Konstantin Belousov
2928e60e55 linuxkpi: Add infrastructure to pass FreeBSD IOV method calls into
pci_driver methods.

Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	2 weeks
2020-03-18 22:10:49 +00:00
Hans Petter Selasky
d845d3dc9a Add support for the device statistics IOCTL, needed by the coming
linux_libusb upgrade.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2020-03-10 15:56:49 +00:00