Commit Graph

1213 Commits

Author SHA1 Message Date
Edward Tomasz Napierala
618b55c2e2 Make linux(4) handle MAP_32BIT.
This unbreaks Mono (mono-devel-4.6.2.7+dfsg-1ubuntu1 from Ubuntu Bionic);
previously it would crash on the "amd64_is_imm32" assert.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23306
2020-01-24 12:08:23 +00:00
Edward Tomasz Napierala
b3fb13eb55 Add kern_unmount() and use in Linuxulator. No functional changes.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22646
2020-01-24 11:57:55 +00:00
Edward Tomasz Napierala
10f2d3f857 Revert r356948; breaks build somehow.
2020-01-21 20:32:49 +00:00
Edward Tomasz Napierala
c5f4e26e7d Make linux(4) handle MAP_32BIT.
This unbreaks Mono (mono-devel-4.6.2.7+dfsg-1ubuntu1 from Ubuntu Bionic);
previously it would crash on the "amd64_is_imm32" assert.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2020-01-21 19:19:02 +00:00
Mark Johnston
149afbf3ba Fix 64-bit syscall argument fetching in 32-bit Linux syscall handlers.
The Linux32 system call argument fetcher places each argument (passed in
registers in the Linux x86 system call convention) into an entry in the
generic system call args array.  Each member of this array is 8 bytes
wide, so this approach is broken for system calls that take off_t
arguments.

Fix the problem by splitting l_loff_t arguments in the 32-bit system
call descriptions, the same as we do for FreeBSD32.  Change entry points
to handle this using the PAIR32TO64 macro.

Move linux_ftruncate64() into compat/linux.

PR:		243155
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	kib (previous version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23210
2020-01-21 17:28:22 +00:00
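
The argument splitting described in 149afbf3ba can be illustrated with a minimal sketch. The helper name join_halves() is hypothetical and is not the kernel's PAIR32TO64 macro; it only shows how a 64-bit offset is rebuilt when a 32-bit caller passes it as two 32-bit halves in separate argument slots. Which half arrives first depends on the ABI, so the parameter order here is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the kernel's PAIR32TO64 macro): rebuild a
 * 64-bit offset from the two 32-bit halves that a 32-bit caller passes
 * in separate argument slots.
 */
static int64_t
join_halves(uint32_t lo, uint32_t hi)
{
	/* Widen before shifting so the high half is not truncated. */
	return ((int64_t)(((uint64_t)hi << 32) | (uint64_t)lo));
}
```

For example, join_halves(0x9abcdef0, 0x12345678) yields 0x123456789abcdef0, whereas reading a single 32-bit slot would have lost the upper half.
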
Edward Tomasz Napierala
66632fe7bb Properly translate MNT_FORCE flag to Linux umount2(2). Previously
it worked by accident.

MFC after:	2 weeks
Sponsored by:	DARPA
2020-01-20 12:16:32 +00:00
Edward Tomasz Napierala
9c6eb0f92f Make linux(4) use kern_setsockopt(9) instead of going through
sys_setsockopt.  Just a cleanup; no functional changes.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22812
2020-01-14 11:33:07 +00:00
Edward Tomasz Napierala
dfd060c0b6 Make linux(4) use kern_getsockopt(9) instead of going through
sys_getsockopt().  It's a cleanup; no functional changes.

Reviewed by:	kib (earlier version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22813
2020-01-14 11:30:30 +00:00
Edward Tomasz Napierala
46209ceae5 Make linux getcpu(2) report the domain.
Submitted by:	markj
Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23144
2020-01-14 11:24:06 +00:00
Edward Tomasz Napierala
ca603bb1ee Add kern_getpriority(), make Linuxulator use it.
Reviewed by:	kib, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22842
2020-01-12 14:25:44 +00:00
Edward Tomasz Napierala
7a0ef283e6 Add kern_setpriority(), use it in Linuxulator.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22841
2020-01-12 13:38:51 +00:00
Mateusz Guzik
c8b3463dd0 vfs: reimplement deferred inactive to use a dedicated flag (VI_DEFINACT)
The previous behavior of leaving VI_OWEINACT vnodes on the active list without
a hold count is eliminated. Hold count is kept and inactive processing gets
explicitly deferred by setting the VI_DEFINACT flag. The syncer is then
responsible for vdrop.

Reviewed by:	kib (previous version)
Tested by:	pho (in a larger patch, previous version)
Differential Revision:	https://reviews.freebsd.org/D23036
2020-01-07 15:56:24 +00:00
Kyle Evans
18348a2369 kern_mmap: add a variant that allows caller to inspect fp
Linux mmap rejects mmap() on a write-only file with EACCES.
linux_mmap_common currently does a fun dance to grab the fp associated with
the passed in fd, validates it, then drops the reference and calls into
kern_mmap(). Doing so is perhaps both fragile and premature; there's still
plenty of chance for the request to get rejected with a more appropriate
error, and it's prone to a race where the file we ultimately mmap has
changed after it drops its reference.

This change alleviates the need to do this by providing a kern_mmap variant
that allows the caller to inspect the fp just before calling into the fileop
layer. The callback takes flags, prot, and maxprot as one could imagine
scenarios where any of these, in conjunction with the file itself, may
influence a caller's decision.

The file type check in the linux compat layer has been removed; EINVAL is
seemingly not an appropriate response to the file not being a vnode or
device. The fileop layer will reject the operation with ENODEV if it's not
supported, which more closely matches the common linux description of
mmap(2) return values.

If we discover that we're allowing an mmap() on a file type that Linux
normally wouldn't, we should restrict those explicitly.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22977
2020-01-04 23:39:58 +00:00
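
The callback approach described in 18348a2369 can be sketched roughly as follows. Every name here (ex_file, ex_check_fp_fn, ex_mmap_fpcheck, ex_linux_check, the EX_ flag bits) is hypothetical and chosen for illustration; the real kernel interface takes more arguments and may order them differently. The point is only that the caller's check runs against the same file reference that is about to be mapped.

```c
#include <errno.h>
#include <stddef.h>

#define	EX_FREAD	0x1	/* illustrative "opened for reading" bit */
#define	EX_FWRITE	0x2	/* illustrative "opened for writing" bit */

/* Hypothetical stand-in for the kernel's file structure. */
struct ex_file {
	int	f_flag;
};

/* Callback type: inspect the file and the request, return 0 or an errno. */
typedef int ex_check_fp_fn(const struct ex_file *fp, int flags, int prot,
    int maxprot);

/*
 * Hypothetical mmap path: the caller's check runs against the file
 * reference that is about to be mapped, so there is no window in which
 * the descriptor can be swapped between validation and use.
 */
static int
ex_mmap_fpcheck(const struct ex_file *fp, int flags, int prot, int maxprot,
    ex_check_fp_fn *check)
{
	int error;

	if (check != NULL) {
		error = check(fp, flags, prot, maxprot);
		if (error != 0)
			return (error);
	}
	/* The real mapping work (the fileop layer) would happen here. */
	return (0);
}

/* Linux-style policy: refuse to map a file that is not open for reading. */
static int
ex_linux_check(const struct ex_file *fp, int flags, int prot, int maxprot)
{
	(void)flags;
	(void)prot;
	(void)maxprot;
	if ((fp->f_flag & EX_FREAD) == 0)
		return (EACCES);
	return (0);
}
```

Passing NULL as the check gives the unrestricted behaviour, so native callers would be unaffected in this sketch.
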
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Edward Tomasz Napierala
cc50333011 Add basic getcpu(2) support to linuxulator. The purpose of this
syscall is to query the CPU number and the NUMA domain the calling
thread is currently running on.  The third argument is ignored.
It doesn't do anything regarding scheduling - it's literally
just a way to query the current state, without any guarantees
you won't get rescheduled an opcode later.

This unbreaks Java from CentOS 8
(java-11-openjdk-11.0.5.10-0.el8_0.x86_64).

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22972
2019-12-31 22:01:08 +00:00
Pawel Biernacki
54666dffa8 linux(4): implement copy_file_range(2)
copy_file_range(2) is implemented natively since r350315, make it available
for Linux binaries too.

Reviewed by:	kib (mentor), trasz (previous version)
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D22959
2019-12-30 18:11:06 +00:00
Edward Tomasz Napierala
ee0fe82ee2 Implement Linux syslog(2) syscall; just enough to make Linux dmesg(8)
utility work.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22465
2019-12-29 15:53:55 +00:00
Edward Tomasz Napierala
273ce4ae1b Make Linux stat(2) et al distinguish between block and character
devices.  It's required for LTP, among other things.  It's not
complete, but good enough for now.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22950
2019-12-29 13:54:02 +00:00
Edward Tomasz Napierala
b9594cd92d Implement Linux BLKGETSIZE64 ioctl.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-12-29 12:22:11 +00:00
Edward Tomasz Napierala
135b723803 Make linux mount(2) tolerate NULL 'from' argument, and fix flag
handling.

This should unbreak access04, acct01, chmod06, creat06,
and fchmod06 LTP tests.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-12-28 13:35:54 +00:00
Edward Tomasz Napierala
2cf9eb6c4e Add a hack to make ^T work for Linux binaries, enabled with
'compat.linux.preserve_vstatus=1' sysctl.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21967
2019-12-16 20:15:44 +00:00
Edward Tomasz Napierala
b5f20658ee Add compat.linux.emul_path, so it can be set to something other
than "/compat/linux".  Useful when you have several compat directories
with different Linux versions and you don't want to clash with files
installed by linux-c7 packages.

Reviewed by:	bcr (manpages)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22574
2019-12-16 20:07:04 +00:00
Edward Tomasz Napierala
518cce0274 Don't use K&R definitions. No functional changes.
Reported by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-12-16 17:45:15 +00:00
Jeff Roberson
61a74c5ccd schedlock 1/4
Eliminate recursion from most thread_lock consumers.  Return from
sched_add() without the thread_lock held.  This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks.  This will eventually allow for lockless remote adds.

Discussed with:	kib
Reviewed by:	jhb
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22626
2019-12-15 21:11:15 +00:00
Edward Tomasz Napierala
cf69fe66d4 Add sync_file_range(2) implementation to linux(4); it's a thin wrapper
over the usual fsync(2).

This silences some warnings when running "apt-get upgrade".

Reviewed by:	brooks, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22371
2019-12-14 13:37:17 +00:00
Edward Tomasz Napierala
34ad5ac242 Add kern_kill() and use it in Linuxulator. It's just a cleanup,
no functional changes.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22645
2019-12-13 18:44:02 +00:00
Edward Tomasz Napierala
be2cfdbc86 Add kern_getsid() and use it in Linuxulator; no functional changes.
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22647
2019-12-13 18:39:36 +00:00
Jeff Roberson
4504268a1b Fix the last few cases that grab without busy or valid. The grab functions must
return the page in some held state for consistency elsewhere.

Reviewed by:	alc, kib, markj
Differential Revision:	https://reviews.freebsd.org/D22610
2019-12-02 22:38:25 +00:00
Vladimir Kondratyev
71b8e362c5 Linux epoll: Allow passing of any negative timeout value to epoll_wait
Linux epoll allows passing any negative timeout value to epoll_wait()
to cause unbounded blocking.

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22517
2019-11-24 20:51:09 +00:00
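
A minimal sketch of the timeout handling described in 71b8e362c5, assuming the wait is ultimately expressed as a timespec pointer where NULL means "block indefinitely", as kevent(2) does. The helper name epoll_timeout_to_ts() is hypothetical and is not taken from the commit.

```c
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: convert an epoll_wait()-style timeout in
 * milliseconds to the timespec pointer a kevent()-style wait expects.
 * Any negative value, not only -1, means "block indefinitely" and is
 * expressed as NULL.
 */
static const struct timespec *
epoll_timeout_to_ts(int timeout_ms, struct timespec *ts)
{
	if (timeout_ms < 0)
		return (NULL);
	ts->tv_sec = timeout_ms / 1000;
	ts->tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
	return (ts);
}
```
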
Vladimir Kondratyev
335fe0afb8 Linux epoll: Register events with zero event mask
Such events are legal and should be interpreted as EPOLLERR | EPOLLHUP.
Register a disabled kqueue event in that case as we do not support EPOLLHUP yet.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22516
2019-11-24 20:47:40 +00:00
Vladimir Kondratyev
461120b834 Linux epoll: Check both read and write kqueue events existence in EPOLL_CTL_ADD
Linux epoll EPOLL_CTL_ADD op handler should always check registration
of both EVFILT_READ and EVFILT_WRITE kevents to decide whether the supplied
file descriptor fd is already registered with the epoll instance.

Reviewed by:	emaste
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22515
2019-11-24 20:44:14 +00:00
Vladimir Kondratyev
896a4c279d Linux epoll: Don't deregister file descriptor after EPOLLONESHOT is fired
Linux epoll does not remove descriptor after one-shot event has been triggered.
Set EV_DISPATCH kqueue flag rather than EV_ONESHOT to get the same behavior.

Required by Linux Steam client.

PR:		240590
Reported by:	Alex S <iwtcex@gmail.com>
Reviewed by:	emaste, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22513
2019-11-24 20:41:47 +00:00
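
The flag choice described in 896a4c279d can be sketched as below. The EX_ constants are defined locally for illustration and are assumed to mirror the Linux epoll and FreeBSD kqueue header values; epoll_to_kev_flags() is a hypothetical name, not the function touched by the commit.

```c
#include <stdint.h>

/* Illustrative constants, assumed to match the system headers. */
#define	EX_EPOLLONESHOT	(1u << 30)
#define	EX_EV_ADD	0x0001
#define	EX_EV_ENABLE	0x0004
#define	EX_EV_DISPATCH	0x0080

/*
 * Hypothetical helper: choose kqueue flags for an epoll registration.
 * EV_DISPATCH disables the event after it fires but keeps it registered,
 * which matches EPOLLONESHOT; EV_ONESHOT would delete the registration.
 */
static uint16_t
epoll_to_kev_flags(uint32_t events)
{
	uint16_t flags = EX_EV_ADD | EX_EV_ENABLE;

	if ((events & EX_EPOLLONESHOT) != 0)
		flags |= EX_EV_DISPATCH;
	return (flags);
}
```

Because the registration survives, a later EPOLL_CTL_MOD can re-arm the descriptor without adding it again.
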
Mateusz Guzik
4de1818baf linux: avoid overhead of P_CONTROLT checks if possible
Sponsored by:	The FreeBSD Foundation
2019-11-20 12:06:29 +00:00
Edward Tomasz Napierala
dfe91e5e34 Make linux(4) open(2)/openat(2) return ELOOP instead of EMLINK,
when being passed O_NOFOLLOW.  This fixes LTP testcase openat02:5.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22384
2019-11-18 10:19:16 +00:00
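
A minimal sketch of the error fixup described in dfe91e5e34. The helper name fixup_open_error() is hypothetical; it works on native errno values and leaves any later translation to Linux error numbers out of scope.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: FreeBSD reports EMLINK when O_NOFOLLOW hits a
 * symlink, while Linux callers expect ELOOP. Every other error passes
 * through unchanged.
 */
static int
fixup_open_error(int error, bool nofollow)
{
	if (error == EMLINK && nofollow)
		return (ELOOP);
	return (error);
}
```
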
Edward Tomasz Napierala
299cb52a80 Support O_CLOEXEC in linux(4) open(2) and openat(2).
Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21966
2019-11-15 16:21:46 +00:00
Ed Maste
01b9ee4c50 linux_renameat2: improve flag checks
In the cases where Linux returns an error (e.g. passing in an undefined
flag) there's no need for us to emit a message.  (The target of this
message is a developer working on the linuxulator, not the author of
presumably broken Linux software).

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21606
2019-11-07 15:51:44 +00:00
Edward Tomasz Napierala
044ab55e41 Make linux(4) create /dev/shm. Linux applications often expect
a tmpfs to be mounted there, and because they like to verify it's
actually a mountpoint, a symlink won't do.

Reviewed by:	dchagin (earlier version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20333
2019-11-06 20:53:33 +00:00
Yuri Pankov
a161fba992 linux: futex_mtx should follow futex_list
Move futex_mtx to linux_common.ko for amd64 and aarch64 along
with respective list/mutex init/destroy.

PR:		240989
Reported by:	Alex S <iwtcex@gmail.com>
2019-10-18 12:25:33 +00:00
Yuri Pankov
b9d3556a34 linux: provide just one instance of futex_list
Move futex_list definition to linux.c which is included once
in linux.ko (i386) and in linux_common.ko (amd64 and aarch64)
allowing 32/64 bit linux programs to access the same futexes
in the latter case.

PR:		240989
Reviewed by:	dchagin
Differential Revision:	https://reviews.freebsd.org/D22073
2019-10-18 10:28:08 +00:00
Pawel Biernacki
ea2609a490 linux_renameat2: don't add extra \n on error.
linux_msg() already adds \n at the end of all messages.

Reported by:	emaste, kib (mentor), mjg (mentor)
Reviewed by:	kib (mentor), mjg (mentor)
Differential Revision:	https://reviews.freebsd.org/D21852
2019-09-30 19:05:14 +00:00
Ed Maste
2eb6ef203a linux: add trivial renameat2 implementation
Just return EINVAL if flags != 0.  The Linux man page documents one
case of EINVAL as "The filesystem does not support one of the flags in
flags."

After r351723 userland binaries will try using new system calls.

Reported by:	mjg
Reviewed by:	mjg, trasz
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21590
2019-09-11 13:01:59 +00:00
Edward Tomasz Napierala
1d3a302b4a Bump Linux version to 3.2.0. Without it, binaries linked against
glibc 2.24 and up (eg Ubuntu 19.04) fail with "FATAL: kernel too old".

This alone is not enough to make newer binaries actually work;
fix/hack/workaround is pending review at https://reviews.freebsd.org/D20687.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20757
2019-09-02 18:10:35 +00:00
Edward Tomasz Napierala
7a8cbc5288 Relax compat.linux.osrelease checks. This way one can do eg
'compat.linux.osrelease=3.10.0-957.12.1.el7.x86_64', which
corresponds to CentOS 7.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20685
2019-09-02 16:57:42 +00:00
Konstantin Belousov
bb9e2184f0 Change locking requirements for VOP_UNSET_TEXT().
Require the vnode to be locked for the VOP_UNSET_TEXT() call.  This
will be used by the following bug fix for a tmpfs issue.

Tested by:	sbruno, pho (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-08-18 20:24:52 +00:00
Konstantin Belousov
62375ca79c compat/linux: Remove obsoleted and somewhat confusing comments related to COMPAT_43.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21200
2019-08-11 19:17:29 +00:00
Andriy Gapon
c66f5b079d linuxcommon: add module version
MFC after:	2 weeks
2019-07-10 13:47:10 +00:00
Tijl Coosemans
e2fba140a8 Let linuxulator mprotect mask unsupported bits before calling kern_mprotect.
After r349240 kern_mprotect returns EINVAL for unsupported bits in the prot
argument.  Linux rtld uses PROT_GROWSDOWN and PROT_GROWSUP when marking the
stack executable.  Mask these bits like kern_mprotect used to do.  For other
unsupported bits EINVAL is returned like Linux does.

Reviewed by:	trasz, brooks
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D20864
2019-07-10 08:19:33 +00:00
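
The masking described in e2fba140a8 can be sketched as follows. The EX_ constants are defined locally and are assumed to match the Linux header values; filter_prot() is a hypothetical name, not the function changed by the commit.

```c
#include <errno.h>

/* Illustrative constants, assumed to match Linux's <sys/mman.h>. */
#define	EX_PROT_READ		0x1
#define	EX_PROT_WRITE		0x2
#define	EX_PROT_EXEC		0x4
#define	EX_PROT_GROWSDOWN	0x01000000
#define	EX_PROT_GROWSUP		0x02000000

/*
 * Hypothetical helper: silently drop the two stack-growth hints, then
 * reject anything else the native mprotect path would not understand.
 */
static int
filter_prot(int prot, int *out)
{
	prot &= ~(EX_PROT_GROWSDOWN | EX_PROT_GROWSUP);
	if ((prot & ~(EX_PROT_READ | EX_PROT_WRITE | EX_PROT_EXEC)) != 0)
		return (EINVAL);
	*out = prot;
	return (0);
}
```
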
Ed Maste
b97ebbbf72 Update Linux compat version to 2.6.36
New system calls between 2.6.32 and 2.6.36 are already implemented.

This should be mostly NFC as far as contemporary Linux applications are
concerned though, as Linux kernel 3.2 is the oldest supported by a
number of popular distros today; work is in progress by others to enable
support for those applications.

Discussed with:	trasz
MFC after:	1 month
2019-07-04 20:42:08 +00:00
Edward Tomasz Napierala
0fabd7b5cc Return ENOTSUP for Linux FS_IOC_FIEMAP ioctl.
Linux man(1) calls it for no good reason; this avoids the console spam
(eg '(man): ioctl fd=4, cmd=0x660b ('f',11) is not implemented').

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20690
2019-07-04 20:16:04 +00:00
Edward Tomasz Napierala
2478d444d1 Fix linuxulator prlimit64(2) with pid == 0. This makes 'ulimit -a'
return something reasonable, and helps linux binaries which attempt
to close all the files, eg apt(8).

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D20692
2019-07-04 19:40:01 +00:00