2759 Commits

Author SHA1 Message Date
kib
bff8060ac5 Amend the layout of kevent32 on powerpc where uint64_t has 8-byte
alignment.

Reported,tested and assertion updates by:	andreast
Sponsored by:	The FreeBSD Foundation
2017-06-30 16:12:57 +00:00
jhb
131ea5a7f4 Store a 32-bit PT_LWPINFO struct for 32-bit process core dumps.
Process core notes for a 32-bit process running on a 64-bit host need to
use 32-bit structures so that the note layout matches the layout of notes
of a core dump of a 32-bit process under a 32-bit kernel.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D11407
2017-06-29 21:31:13 +00:00
jhibbits
342f85f1d4 Update comments and simplify conditionals for compat32
Only amd64 (because of i386) needs 32-bit time_t compat now, everything else is
64-bit time_t.  Rather than checking on all 64-bit time_t archs, only check the
oddball amd64/i386.

Reviewed By: emaste, kib, andrew
Differential Revision: https://reviews.freebsd.org/D11364
2017-06-27 01:29:10 +00:00
markj
ff2787fe48 Implement parts of the hrtimer API in the LinuxKPI.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11359
2017-06-26 16:28:46 +00:00
avg
e647d591f8 linux_getdents, linux_readdir: fix mismatch between malloc and free tags
MFC after:	3 days
2017-06-26 09:13:25 +00:00
jhibbits
11d34fd62a Solve the y2038 problem for powerpc
AKA Make time_t 64 bits on powerpc(32).

PowerPC currently (until now) was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386).  This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.

Tested by:	andreast, others
MFC after:	Never
Relnotes:	Yes
2017-06-26 02:25:19 +00:00
markj
3a714c6e83 Add u64_to_user_ptr() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:30:20 +00:00
markj
cc928d211a Add ns_to_ktime() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:28:01 +00:00
markj
7ab36912bc Add a couple of macros to lockdep.h in the LinuxKPI.
MFC after:	1 week
2017-06-25 19:23:14 +00:00
markj
9575bf4b0e Add the thaw_early method to struct dev_pm_ops in the LinuxKPI.
MFC after:	1 week
2017-06-25 19:21:59 +00:00
markj
7ca4ed867f Add noop_lseek() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:20:12 +00:00
mmokhi
889f8689d9 Fix caveat in new implementation of linprocfs_docpuinfo():
Prevent kernel panic in case that extended-cpuid isn't supported by CPU

Reviewed by:	kib, ngie, trasz
Approved by:	trasz
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11294
2017-06-23 10:36:27 +00:00
markj
881fc6be99 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
markj
4fffa01fff Add missing lock destructor invocations to the LinuxKPI unload handler.
MFC after:	1 week
2017-06-21 18:17:32 +00:00
markj
a49c47f0df Include kmod.h from the LinuxKPI's module.h.
MFC after:	1 week
2017-06-21 18:15:47 +00:00
markj
0959c78162 Add a lockdep macro to the LinuxKPI.
Also fix some nearby style issues.

MFC after:	1 week
2017-06-21 18:08:36 +00:00
hselasky
41ece8e7db Allow the VM fault handler to be NULL in the LinuxKPI when handling a
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-06-21 14:38:52 +00:00
markj
d656eefc9c Add kthread parking support to the LinuxKPI.
Submitted by:	kmacy (original version)
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11264
2017-06-18 19:22:05 +00:00
markj
c8d732dce1 Avoid including list.h in LinuxKPI headers.
list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11249
2017-06-18 16:43:57 +00:00
emaste
70be6f4a29 Add ZFS to Linux statfs ftype
PR:		220086
Reviewed by:	cem
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D11252
2017-06-18 11:51:03 +00:00
markj
2c739e0e3f Remove prototypes for unimplemented LinuxKPI functions.
MFC after:	1 week
2017-06-17 22:52:23 +00:00
kib
f7db07a715 Regen. 2017-06-17 00:58:19 +00:00
kib
d7f022a3ab Add abstime kqueue(2) timers and expand struct kevent members.
This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which
specifies that the data field contains absolute time to fire the
event.

To make this useful, data member of the struct kevent must be extended
to 64bit.  Using the opportunity, I also added ext members.  This
changes struct kevent almost to Apple struct kevent64, except I did
not changed type of ident and udata, the later would cause serious API
incompatibilities.

The type of ident was kept uintptr_t since EVFILT_AIO returns a
pointer in this field, and e.g. CHERI is sensitive to the type
(discussed with brooks, jhb).

Unlike Apple kevent64, symbol versioning allows us to claim ABI
compatibility and still name the new syscall kevent(2).  Compat shims
are provided for both host native and compat32.

Requested by:	bapt
Reviewed by:	bapt, brooks, ngie (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D11025
2017-06-17 00:57:26 +00:00
kib
e2a14c603f Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
dchagin
fa3d0ad781 Remove the outdated definition.
MFC after:	1 week
2017-06-12 07:48:51 +00:00
dchagin
703192d28e Since r318735 (ino64 project) the size of the native struct dirent is
equal or greater than the size of Linux struct dirent or struct dirent64.
So, remove LINUX_RECLEN_RATIO magic as useless.
2017-06-12 07:35:59 +00:00
markj
e10054b3aa Implement pci_disable_device() in the LinuxKPI.
Submitted by:	kmacy
MFC after:	2 weeks
2017-06-09 19:57:27 +00:00
markj
d2500bbf92 Augment wait queue support in the LinuxKPI.
In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
  code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
  wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
  implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
  calling thread to sleep.

This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10986
2017-06-09 19:41:12 +00:00
kib
f703462277 Enhance vfs.ino64_trunc_error sysctl.
Provide a new mode "2" which returns a special overflow indicator in
the non-representable field instead of the silent truncation (mode
"0") or EOVERFLOW (mode "1").

In particular, the typical use of st_ino to detect hard links with
mode "2" reports false positives, which might be more suitable for
some uses.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
2017-06-09 11:17:08 +00:00
jhibbits
29092d1ae9 Remove ARM and MIPS from linuxkpi ioremap_attr definition
ARM and MIPS fail universe builds.

ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING

Pointy-hat to:	jhibbits
2017-06-08 02:44:34 +00:00
jhibbits
d485495fa5 Add more #ifdef arch checks to the linuxkpi
arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks.  powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well.  Not tested except by
compiling powerpc.

Reviewed by:	markj
2017-06-07 18:08:11 +00:00
hselasky
9b5b52b6c4 Fix init order in the LinuxKPI for IDR support after recent changes.
CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-06-06 10:12:58 +00:00
kib
6a43deb7a3 Add sysctl vfs.ino64_trunc_error controlling action on truncating
inode number or link count for the ABI compat binaries.

Right now, and by default after the change, too large 64bit values are
silently truncated to 32 bits.  Enabling the knob causes the system to
return EOVERFLOW for stat(2) family of compat syscalls when some
values cannot be completely represented by the old structures.  For
getdirentries(2), knob skips the dirents which would cause non-trivial
truncation of d_ino.

EOVERFLOW error is specified by the X/Open 1996 LFS document
('Adding Support for Arbitrary File Sizes to the Single UNIX
Specification').

Based on the discussion with:	bde
Sponsored by:	The FreeBSD Foundation
2017-06-05 11:40:30 +00:00
dchagin
1a97fb3a1a On success, getrandom() Linux system call returns the number of bytes that
were copied to the buffer supplied by the user.

PR:           219464
Submitted by: Maciej Pasternacki
Reported by:  Maciej Pasternacki
MFC after:    1 week
2017-06-04 18:35:30 +00:00
dchagin
ff032734d8 Revert r319053 due to lack of sence. As pointed out by kib@ opt_global.h
contains such fundamental settings as e.g. SMP option and fake
opt_global.h almost never match real configured kernels.

Reported by:	kib@
2017-06-04 18:24:41 +00:00
hselasky
ab63ba157f Improve kqueue() support in the LinuxKPI. Some applications using the
kqueue() does not set non-blocking I/O mode for event driven read of
file descriptors. This means the LinuxKPI internal kqueue read and
write event flags must be updated before the next read and/or write
system call. Else the read and/or write system call may block. This
can happen when there is no more data to read following a previous
read event. Then the application also gets blocked from processing
other events. This situation can also be solved by the applications
setting and using non-blocking I/O mode.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-02 16:52:18 +00:00
hselasky
702f89e7fe Add support for setting the non-blocking I/O flag for LinuxKPI
character devices. In Linux the FIONBIO IOCTL is handled by the kernel
and not the drivers. Also need return success for the FIOASYNC ioctl
due to existing logic in kern_fcntl() even though it is not supported
currently.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-02 16:30:40 +00:00
hselasky
68f2e5bd35 Make sure the selrecord() function is only called from within system
polling contexts in the LinuxKPI.

After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 16:49:48 +00:00
hselasky
d8c3341b93 Translate the ERESTARTSYS error code into ERESTART in the LinuxKPI
ioctl(), read() and write() system call handlers. This error code is
internal to the kernel and should not be seen by user-space programs
according to Linux.

Submitted by:		Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 09:53:55 +00:00
hselasky
63be850583 Add generic kqueue() and kevent() support to the LinuxKPI character
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 09:34:51 +00:00
hselasky
798a6026c2 Implement print_hex_dump(), print_hex_dump_bytes() and
printk_ratelimited() in the LinuxKPI.

While at it fix the inclusion guard of printk.h to be similar to the
rest of the LinuxKPI header files.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 16:24:02 +00:00
hselasky
255c5da0b8 Properly implement idr_preload() and idr_preload_end() in the
LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 16:08:30 +00:00
hselasky
fc09dd7cd5 Implement in_atomic() function in the LinuxKPI.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 15:05:44 +00:00
hselasky
81d82e5e26 Properly set the .d_name field in the cdevsw structure for the
LinuxKPI.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:11:06 +00:00
hselasky
52f245c3fd Make sure the VMAP's "vm_file" field is referenced in a Linux
compatible way by the linux_dev_mmap_single() function in the
LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:07:05 +00:00
hselasky
d59cbd826c Remove the VMA handle from its list before calling the LinuxKPI VMA
close operation to prevent other threads from reusing the VM object
handle pointer.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:05:54 +00:00
hselasky
5ad34d08ba Don't acquire a reference on the VM-space when allocating the LinuxKPI
task structure to avoid deadlock when tearing down the VM object
during a process exit.

Found by:		markj @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:01:27 +00:00
hselasky
42ce45834f Fix a reference count leak in the LinuxKPI due to calling VM open when
it shouldn't be called.

Background:
The Linux VM open operation is called when a new VMA is
created on top of the current VMA. This is done through either mremap
flow or split_vma, usually due to mlock, madvise, munmap and so
on. This is currently not supported by the LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 12:08:25 +00:00
hselasky
f87686da6d Fixes for refcounting "struct linux_file" in the LinuxKPI.
- Allow "struct linux_file" to be refcounted when its "_file" member
  is NULL by using its "f_count" field. The reference counts are
  transferred to the file structure when the file descriptor is
  installed.

- Add missing vdrop() calls for error cases during open().

- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 12:02:59 +00:00
hselasky
b7e7cccc2e Make sure the thread's priority is restored for all three cases inside
linux_synchronize_rcu_cb() in the LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 10:01:15 +00:00