Commit Graph

2842 Commits

Author SHA1 Message Date
Hans Petter Selasky
f6800be3ce Use integer type to pass around jiffies and/or ticks values in the
LinuxKPI because in FreeBSD ticks are 32-bit.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:05:40 +00:00
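
Because a 32-bit tick counter wraps around, tick values are normally compared through their difference rather than directly. A minimal sketch, assuming a 32-bit "int" and a hypothetical helper named ticks_after() that is not part of the LinuxKPI:

```c
/*
 * Hypothetical helper, for illustration only: returns 1 if tick value
 * "a" is later than tick value "b", tolerating wrap-around of a 32-bit
 * counter. The subtraction is done in unsigned arithmetic so that it
 * cannot overflow a signed integer.
 */
static int
ticks_after(int a, int b)
{
	return ((int)((unsigned int)b - (unsigned int)a) < 0);
}
```

The comparison stays correct across the wrap as long as the two values are less than half the counter range apart.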
Hans Petter Selasky
4ef8a6301f Fixes for wait event in the LinuxKPI. These are regression issues
after r319757.

1) Correct the return value from __wait_event_common() from 1 to 0 in
case the timeout is specified as MAX_SCHEDULE_TIMEOUT. In the other
case __ret is zero and will be substituted in the last part of the
macro with the appropriate value before return.

2) Make sure the "timeout" argument is cast to "int" before
evaluating negativity. Else the signedness of a "long" might be
checked instead of the signedness of an integer.

3) The wait_event() function should not have a return value.

Found by:	KrishnamRaju ErapaRaju <Krishna2@chelsio.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 13:00:10 +00:00
Hans Petter Selasky
8ea4441598 Make sure the linux_wait_event_common() function in the LinuxKPI properly
handles a timeout value of MAX_SCHEDULE_TIMEOUT which basically means there
is no timeout. This is a regression issue after r319757.

While at it change the type of returned variable from "long" to "int" to
match the actual return type.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-10 12:51:04 +00:00
Alexander Motin
3a150601e1 Fix a few issues with the LinuxKPI workqueue.
LinuxKPI workqueue wrappers reported "successful" cancellation for works
already completed in the normal way.  This change brings the reported status
and the actual cancellation result into sync.  This is required for drm-next
operation.

Reviewed by:	hselasky (earlier version)
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D11904
2017-08-08 19:36:34 +00:00
Mark Johnston
c0589825fd Add round_jiffies_up(), local_clock() and __setup_timer() to the LinuxKPI.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11871
2017-08-08 04:34:02 +00:00
Mark Johnston
48dac28d63 Add macros for defining attribute groups and for WO and RW attributes.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11872
2017-08-08 04:30:22 +00:00
Alexander Motin
e1cf70fbab Fix hrtimer_active() in case of cancellation.
While there, switch to FreeBSD internal callout active status.

Reviewed by:	markj, hselasky
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D11900
2017-08-07 14:34:05 +00:00
Ruslan Bukin
ca20f8ec29 o Replace __riscv__ with __riscv
o Replace __riscv64 with (__riscv && __riscv_xlen == 64)

This is required to support new GCC 7.1 compiler.
This is compatible with current GCC 6.1 compiler.

RISC-V is an extensible ISA and the idea here is to have a built-in define
for each extension, so together with __riscv we will have some subset
of these as well (depending on the -march string passed to the compiler):

__riscv_compressed
__riscv_atomic
__riscv_mul
__riscv_div
__riscv_muldiv
__riscv_fdiv
__riscv_fsqrt
__riscv_float_abi_soft
__riscv_float_abi_single
__riscv_float_abi_double
__riscv_cmodel_medlow
__riscv_cmodel_medany
__riscv_cmodel_pic
__riscv_xlen

Reviewed by:	ngie
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D11901
2017-08-07 14:09:57 +00:00
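
The macro replacement described in this commit could look as follows in consumer code. This is only a sketch; the function name riscv_target_desc() is made up for the example:

```c
/*
 * Illustrative only: select a code path with the newer built-in
 * defines (__riscv and __riscv_xlen) instead of the older
 * __riscv__ and __riscv64 macros.
 */
static const char *
riscv_target_desc(void)
{
#if defined(__riscv) && __riscv_xlen == 64
	return ("riscv, 64-bit");
#elif defined(__riscv)
	return ("riscv, other xlen");
#else
	return ("not riscv");
#endif
}
```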
Mark Johnston
f2ec04a394 Add subsystem vendor and device ID fields to struct pci_dev.
MFC after:	1 week
2017-08-03 21:14:46 +00:00
Hans Petter Selasky
2b79a966ab Fix LinuxKPI regression after r321920. The mda_unit and si_drv0 fields are not
wide enough to hold the full 64-bit dev_t. Instead use the "dev" field in
the "linux_cdev" structure to store and lookup this value.

While at it remove superfluous use of parentheses inside the
MAJOR(), MINOR() and MKDEV() macros in the LinuxKPI.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-08-02 14:27:27 +00:00
Hans Petter Selasky
0991f0af6d Remove cycle_t type from the LinuxKPI similar to Linux upstream.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-07-31 09:17:54 +00:00
Dmitry Chagin
c151945c86 Avoid using [LINUX_]SHAREDPAGE constant directly in the vdso code.
This is needed for https://reviews.freebsd.org/D11780.

Reported by:	kib@
2017-07-30 21:24:20 +00:00
Ian Lepore
d35f6548e6 Add inline functions to convert between sbintime_t and decimal time units.
Use them in some existing code that is vulnerable to roundoff errors.

The existing constant SBT_1NS is a honeypot, luring unsuspecting folks into
writing code such as long_timeout_ns*SBT_1NS to generate the argument for a
sleep call.  The actual value of 1ns in sbt units is ~4.3, leading to a
large roundoff error giving a shorter sleep than expected when multiplying
by the truncated value of 4 in SBT_1NS.  (The evil honeypot aspect becomes
clear after you waste a whole day figuring out why your sleeps return early.)
2017-07-29 17:00:23 +00:00
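
The roundoff problem described above can be reproduced in a few lines. The sketch below redefines sbintime_t, SBT_1S and SBT_1NS locally, following the 32.32 fixed-point convention; the two conversion functions are hypothetical names for this example and are not the inline functions added by the commit:

```c
#include <stdint.h>

typedef int64_t sbintime_t;		/* 32.32 fixed point */
#define	SBT_1S	((sbintime_t)1 << 32)
#define	SBT_1NS	(SBT_1S / 1000000000)	/* truncates ~4.29 down to 4 */

/* The tempting but lossy conversion: multiply by the truncated constant. */
static sbintime_t
naive_ns_to_sbt(int64_t ns)
{
	return (ns * SBT_1NS);
}

/*
 * A more careful conversion (assumed approach): handle whole seconds
 * exactly, then scale the sub-second remainder. The remainder is below
 * 1e9, so remainder * 2^32 stays within the int64_t range.
 */
static sbintime_t
careful_ns_to_sbt(int64_t ns)
{
	sbintime_t sbt;

	sbt = (ns / 1000000000) * SBT_1S;
	sbt += ((ns % 1000000000) * SBT_1S) / 1000000000;
	return (sbt);
}
```

For one second expressed in nanoseconds, the naive version yields 4000000000 while the exact value is 4294967296, so a sleep computed that way would be roughly 7% too short.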
Ed Schouten
cea9310d4e Upgrade to the latest sources generated from the CloudABI specification.
The CloudABI specification has had some minor changes over the last half
year. No substantial features have been added, but some features that
are deemed unnecessary in retrospect have been removed:

- mlock()/munlock():

  These calls tend to be used for two different purposes: real-time
  support and handling of sensitive (cryptographic) material that
  shouldn't end up in swap. The former use case is out of scope for
  CloudABI. The latter may also be handled by encrypting swap.

  Removing this has the advantage that we no longer need to worry about
  having resource limits put in place.

- SOCK_SEQPACKET:

  Support for SOCK_SEQPACKET is rather inconsistent across various
  operating systems. Some operating systems supported by CloudABI (e.g.,
  macOS) don't support it at all. Considering that they are rarely used,
  remove support for the time being.

- getsockname(), getpeername(), etc.:

  A shortcoming of the sockets API is that it doesn't allow you to
  create socket(pair)s, having fake socket addresses associated with
  them. This makes it harder to test applications or transparently
  forward (proxy) connections to them.

  With CloudABI, we're slowly moving networking connectivity into a
  separate daemon called Flower. In addition to passing around socket
  file descriptors, this daemon provides address information in the form
  of arbitrary string labels. There is thus no longer any need for
  requesting socket address information from the kernel itself.

This change also updates consumers of the generated code accordingly.
Even though system calls end up getting renumbered, this won't cause any
problems in practice. CloudABI programs always call into the kernel
through a kernel-supplied vDSO that has the numbers updated as well.

Obtained from:	https://github.com/NuxiNL/cloudabi
2017-07-26 06:57:15 +00:00
Ryan Libby
4e64c62564 linuxkpi compiler.h: avoid gcc -Wunused-value in dummy expressions
It looks like the __acquire and __release macros are for the consumption
of static analysis tools and have no semantic effect.  Transform the
definitions from constant expressions to empty statements in order to
avoid -Wunused-value from gcc.

Likewise avoid future warnings for __chk_{user,io}_ptr, but with a cast
to void, because it looks like some linux kernel code may use those in
expression contexts.

Reviewed by:	hselasky, markj
Approved by:	markj (mentor)
Sponsored by:	Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D11695
2017-07-22 21:29:44 +00:00
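
The difference between the two macro styles mentioned above can be sketched as follows. The macro names are placeholders chosen for this example, and the exact definitions in compiler.h may differ:

```c
/* Old style (assumed): a constant expression; as a statement it can
 * trigger gcc's -Wunused-value. */
#define	ACQUIRE_OLD(x)	(0)

/* New style: an empty statement, nothing for the compiler to warn about. */
#define	ACQUIRE_NEW(x)	do { } while (0)

/* Cast to void: has no value, yet remains usable inside an expression. */
#define	CHK_PTR(x)	((void)0)

static int
read_checked(const int *p)
{
	ACQUIRE_NEW(p);			/* statement context */
	return (CHK_PTR(p), *p);	/* expression context, comma operator */
}
```

The do/while form cannot appear inside an expression, which is why the pointer-check macros use the void cast instead.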
Dmitry Chagin
8e9e07e682 Style(9) whitespace fix.
MFC after:	1 week
2017-07-22 09:03:40 +00:00
Konstantin Belousov
5cead59181 Correct sysent flags for dynamically loaded syscalls.
Using the https://github.com/google/capsicum-test/ suite, the
PosixMqueue.CapModeForked test was failing due to an ECAPMODE after
calling kmq_notify(). On further inspection, the dynamically
loaded syscall entry was initialized with sy_flags zeroed out, since
SYSCALL_INIT_HELPER() left sysent.sy_flags with the default value.

Add a new helper SYSCALL{,32}_INIT_HELPER_F() which takes an
additional argument to specify the sy_flags value.

Submitted by:	Siva Mahadevan <smahadevan@freebsdfoundation.org>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D11576
2017-07-14 09:34:44 +00:00
Mark Johnston
8d92040b75 Add some functions to jiffies.h.
Also add some checks for overflow to existing functions.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11533
2017-07-13 18:27:22 +00:00
Mark Johnston
70bb2cdb04 Add some functions to math64.h in the LinuxKPI, and fix nearby style.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11535
2017-07-09 23:14:51 +00:00
Mark Johnston
7a2553d9d7 Add a few functions to ktime.h in the LinuxKPI, and fix nearby style.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11534
2017-07-09 23:13:08 +00:00
Mark Johnston
abf5c031bb Free existing per-thread task structs when unloading linuxkpi.ko.
They are otherwise leaked.

Reported and tested by:	ae
MFC after:		1 week
2017-07-09 22:57:00 +00:00
Mark Johnston
dac6b88a20 Add some helper definitions to fs.h in the LinuxKPI.
Add a field to struct linux_file to allow the creation of anonymous
shmem objects.

MFC after:	1 week
2017-07-08 20:11:06 +00:00
Mark Johnston
e51dd47b08 Fix the definitions of pgprot_{noncached,writecombine} after r316562.
MFC after:	1 week
2017-07-08 19:22:29 +00:00
Mark Johnston
aa2b6b4957 Add device_is_registered() to the LinuxKPI.
MFC after:	1 week
2017-07-08 18:53:02 +00:00
Mark Johnston
8cd823ecf7 Add TASK_COMM_LEN to the LinuxKPI.
MFC after:	1 week
2017-07-08 18:52:29 +00:00
Hans Petter Selasky
611572285a Complete r320189 which allows a NULL VM fault handler in the LinuxKPI.
Instead of mapping a dummy page upon a page fault, map the page
pointed to by the physical address given by IDX_TO_OFF(vmap->vm_pfn).
To simplify the implementation use OBJT_DEVICE to implement our own
linux_cdev_pager_fault() instead of using the existing
linux_cdev_pager_populate().

Some minor code factoring while at it.

Reviewed by:	markj @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-07-07 13:44:14 +00:00
Hans Petter Selasky
ea16525413 Fix a bug in synchronize RCU when the calling thread is bound to a CPU.
Set "td_pinned" to zero after "sched_unbind()" to prevent "td_pinned"
from temporarily becoming negative during "sched_bind()". This can
happen if "sched_bind()" uses "sched_pin()" and "sched_unpin()".

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-07-07 13:15:00 +00:00
Mark Johnston
d34188a0e1 Invoke suspend/resume methods from the driver pmops if available.
Obtained from:	kmacy (original version)
MFC after:	1 week
2017-07-04 18:44:14 +00:00
Mark Johnston
88156ba581 Add some auxiliary types for device driver support.
MFC after:	1 week
2017-07-04 01:23:36 +00:00
Mark Johnston
6373e95eb6 Add a field for the class code to struct pci_driver.
Fill out some previously uninitialized fields as well.

MFC after:	1 week
2017-07-04 01:05:20 +00:00
Mark Johnston
ecf29cf148 Add some PCI class definitions.
MFC after:	1 week
2017-07-04 00:48:50 +00:00
Mark Johnston
b38dc0a16d Rename the "driver" field to "bsddriver" to avoid a name collision.
MFC after:	1 week
2017-07-04 00:30:48 +00:00
Mark Johnston
0a930cf078 Hold the PCI device list lock when removing an element.
MFC after:	1 week
2017-07-04 00:02:06 +00:00
Mark Johnston
4600d349be Let io_mapping_init_wc() fall back to an uncacheable mapping.
This allows usage of the function on architectures that don't support
write-combining.

Reported by:	bz, emaste
X-MFC With:	r320196
2017-07-03 02:01:16 +00:00
Konstantin Belousov
aef2a6a75d Port PowerPC kqueue(2) compat32 fix in r320500 to MIPS.
All 32-bit MIPS ABIs align uint64_t on an 8-byte boundary.  Since struct
kevent32 is defined using 32-bit types to avoid extra alignment on amd64/i386,
the layout of the structure needs padding on PowerPC and apparently MIPS.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D11434
2017-07-01 22:52:17 +00:00
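
The alignment issue can be illustrated with two simplified layouts. These structures are hypothetical and much smaller than the real struct kevent32; the order of the two halves is an assumption made for the example:

```c
#include <stddef.h>
#include <stdint.h>

/* A 64-bit member: its offset is 4 or 8 depending on the ABI's
 * alignment requirement for uint64_t. */
struct ev_native_like {
	uint32_t	flags;
	uint64_t	data;
};

/* The same value split into 32-bit halves: the layout is identical on
 * every ABI, so any padding must be added explicitly where needed. */
struct ev_split_like {
	uint32_t	flags;
	uint32_t	data1;		/* assumed: low 32 bits */
	uint32_t	data2;		/* assumed: high 32 bits */
};

/* Reassemble the 64-bit value from its two halves. */
static uint64_t
ev_split_data(const struct ev_split_like *ev)
{
	return ((uint64_t)ev->data1 | ((uint64_t)ev->data2 << 32));
}
```

On an ABI that aligns uint64_t to 8 bytes, a 32-bit process sees padding before the 64-bit member, and a compat structure built only from 32-bit types has to reproduce that padding by hand.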
Konstantin Belousov
cfb2d93ba6 Amend the layout of kevent32 on powerpc where uint64_t has 8-byte
alignment.

Reported,tested and assertion updates by:	andreast
Sponsored by:	The FreeBSD Foundation
2017-06-30 16:12:57 +00:00
John Baldwin
51645e836d Store a 32-bit PT_LWPINFO struct for 32-bit process core dumps.
Process core notes for a 32-bit process running on a 64-bit host need to
use 32-bit structures so that the note layout matches the layout of notes
of a core dump of a 32-bit process under a 32-bit kernel.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D11407
2017-06-29 21:31:13 +00:00
Justin Hibbits
b436609213 Update comments and simplify conditionals for compat32
Only amd64 (because of i386) needs 32-bit time_t compat now, everything else is
64-bit time_t.  Rather than checking on all 64-bit time_t archs, only check the
oddball amd64/i386.

Reviewed By: emaste, kib, andrew
Differential Revision: https://reviews.freebsd.org/D11364
2017-06-27 01:29:10 +00:00
Mark Johnston
9ea3e14182 Implement parts of the hrtimer API in the LinuxKPI.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11359
2017-06-26 16:28:46 +00:00
Andriy Gapon
16454bee3a linux_getdents, linux_readdir: fix mismatch between malloc and free tags
MFC after:	3 days
2017-06-26 09:13:25 +00:00
Justin Hibbits
fbcf7bcdf4 Solve the y2038 problem for powerpc
AKA Make time_t 64 bits on powerpc(32).

Until now, PowerPC was one of two architectures with a 32-bit time_t
on 32-bit archs (the other being i386).  This is an ABI breakage, so all ports,
and all local binaries, *must* be recompiled.

Tested by:	andreast, others
MFC after:	Never
Relnotes:	Yes
2017-06-26 02:25:19 +00:00
Mark Johnston
ee7c3198cd Add u64_to_user_ptr() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:30:20 +00:00
Mark Johnston
1fde37964d Add ns_to_ktime() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:28:01 +00:00
Mark Johnston
934277c59c Add a couple of macros to lockdep.h in the LinuxKPI.
MFC after:	1 week
2017-06-25 19:23:14 +00:00
Mark Johnston
0bfde0a7c7 Add the thaw_early method to struct dev_pm_ops in the LinuxKPI.
MFC after:	1 week
2017-06-25 19:21:59 +00:00
Mark Johnston
4eb1bcfc62 Add noop_lseek() to the LinuxKPI.
MFC after:	1 week
2017-06-25 19:20:12 +00:00
Mahdi Mokhtari
4b36080668 Fix a caveat in the new implementation of linprocfs_docpuinfo():
Prevent a kernel panic in case extended-cpuid isn't supported by the CPU

Reviewed by:	kib, ngie, trasz
Approved by:	trasz
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11294
2017-06-23 10:36:27 +00:00
Mark Johnston
c73cdca2c4 Update io-mapping.h in the LinuxKPI.
Add io_mapping_init_wc() and add a third (unused) parameter to
io_mapping_map_wc().

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11286
2017-06-21 18:20:17 +00:00
Mark Johnston
47d8a7d4d1 Add missing lock destructor invocations to the LinuxKPI unload handler.
MFC after:	1 week
2017-06-21 18:17:32 +00:00
Mark Johnston
9b6197df69 Include kmod.h from the LinuxKPI's module.h.
MFC after:	1 week
2017-06-21 18:15:47 +00:00
Mark Johnston
33baed9452 Add a lockdep macro to the LinuxKPI.
Also fix some nearby style issues.

MFC after:	1 week
2017-06-21 18:08:36 +00:00
Hans Petter Selasky
cde3f930bc Allow the VM fault handler to be NULL in the LinuxKPI when handling a
memory map request. When the VM fault handler is NULL a return code of
VM_PAGER_BAD is returned from the character device's pager populate
handler. This fixes compatibility with Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-06-21 14:38:52 +00:00
Mark Johnston
8504aa9852 Add kthread parking support to the LinuxKPI.
Submitted by:	kmacy (original version)
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11264
2017-06-18 19:22:05 +00:00
Mark Johnston
4eb18346d1 Avoid including list.h in LinuxKPI headers.
list.h includes a number of FreeBSD headers as a workaround for the
LIST_HEAD name collision. To reduce pollution, avoid including list.h
in commonly used headers when it is not explicitly needed.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D11249
2017-06-18 16:43:57 +00:00
Ed Maste
dbaa9ebf1b Add ZFS to Linux statfs ftype
PR:		220086
Reviewed by:	cem
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D11252
2017-06-18 11:51:03 +00:00
Mark Johnston
8239734079 Remove prototypes for unimplemented LinuxKPI functions.
MFC after:	1 week
2017-06-17 22:52:23 +00:00
Konstantin Belousov
eb84ca643c Regen.
2017-06-17 00:58:19 +00:00
Konstantin Belousov
2b34e84335 Add abstime kqueue(2) timers and expand struct kevent members.
This change implements NOTE_ABSTIME flag for EVFILT_TIMER, which
specifies that the data field contains absolute time to fire the
event.

To make this useful, data member of the struct kevent must be extended
to 64bit.  Using the opportunity, I also added ext members.  This
changes struct kevent almost to Apple struct kevent64, except I did
not change the type of ident and udata; the latter would cause serious API
incompatibilities.

The type of ident was kept uintptr_t since EVFILT_AIO returns a
pointer in this field, and e.g. CHERI is sensitive to the type
(discussed with brooks, jhb).

Unlike Apple kevent64, symbol versioning allows us to claim ABI
compatibility and still name the new syscall kevent(2).  Compat shims
are provided for both host native and compat32.

Requested by:	bapt
Reviewed by:	bapt, brooks, ngie (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D11025
2017-06-17 00:57:26 +00:00
Konstantin Belousov
2d88da2f06 Move struct syscall_args syscall arguments parameters container into
struct thread.

For all architectures, the syscall trap handlers have to allocate the
structure on the stack.  The structure takes 88 bytes on 64bit arches
which is not negligible.  Also, it cannot be easily found by other
code, which e.g. caused duplication of some members of the structure
to struct thread already.  The change removes td_dbg_sc_code and
td_dbg_sc_nargs which were directly copied from syscall_args.

The structure is put into the copied on fork part of the struct thread
to make the syscall arguments information correct in the child after
fork.

This move will also allow several more uses shortly.

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
X-Differential revision:	https://reviews.freebsd.org/D11080
2017-06-12 21:03:23 +00:00
Dmitry Chagin
12bbbbb254 Remove the outdated definition.
MFC after:	1 week
2017-06-12 07:48:51 +00:00
Dmitry Chagin
ac1082e590 Since r318735 (ino64 project) the size of the native struct dirent is
equal or greater than the size of Linux struct dirent or struct dirent64.
So, remove LINUX_RECLEN_RATIO magic as useless.
2017-06-12 07:35:59 +00:00
Mark Johnston
f67b5de754 Implement pci_disable_device() in the LinuxKPI.
Submitted by:	kmacy
MFC after:	2 weeks
2017-06-09 19:57:27 +00:00
Mark Johnston
465659643b Augment wait queue support in the LinuxKPI.
In particular:
- Don't evaluate event conditions with a sleepqueue lock held, since such
  code may attempt to acquire arbitrary locks.
- Fix the return value for wait_event_interruptible() in the case that the
  wait is interrupted by a signal.
- Implement wait_on_bit_timeout() and wait_on_atomic_t().
- Implement some functions used to test for pending signals.
- Implement a number of wait_event_*() variants and unify the existing
  implementations.
- Unify the mechanism used by wait_event_*() and schedule() to put the
  calling thread to sleep.

This is required to support updated DRM drivers. Thanks to hselasky for
finding and fixing a number of bugs in the original revision.

Reviewed by:	hselasky
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10986
2017-06-09 19:41:12 +00:00
Konstantin Belousov
7abe0df223 Enhance vfs.ino64_trunc_error sysctl.
Provide a new mode "2" which returns a special overflow indicator in
the non-representable field instead of the silent truncation (mode
"0") or EOVERFLOW (mode "1").

In particular, the typical use of st_ino to detect hard links with
mode "2" reports false positives, which might be more suitable for
some uses.

Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
2017-06-09 11:17:08 +00:00
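
The three modes of the sysctl can be summarized in a short sketch. The function name and the choice of UINT32_MAX as the overflow indicator are assumptions made for illustration; the commit message does not state the actual indicator value:

```c
#include <errno.h>
#include <stdint.h>

#define	INO32_OVERFLOW	UINT32_MAX	/* assumed indicator value */

/*
 * Hypothetical helper: narrow a 64-bit inode number to 32 bits.
 * mode 0: silent truncation, mode 1: fail with EOVERFLOW,
 * mode 2: store a special overflow indicator.
 */
static int
ino64_to_ino32(uint64_t ino, int mode, uint32_t *out)
{
	*out = (uint32_t)ino;
	if (*out == ino)
		return (0);		/* value is representable */
	switch (mode) {
	case 1:
		return (EOVERFLOW);
	case 2:
		*out = INO32_OVERFLOW;
		break;
	default:
		break;			/* keep the truncated value */
	}
	return (0);
}
```

With mode 2, every non-representable inode number maps to the same indicator, which is why hard link detection based on st_ino reports false positives.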
Justin Hibbits
864092bcaa Remove ARM and MIPS from linuxkpi ioremap_attr definition
ARM and MIPS fail universe builds.

ARM and MIPS are missing the following:
* VM_MEMATTR_WRITE_THROUGH
* VM_MEMATTR_WRITE_COMBINING

Pointy-hat to:	jhibbits
2017-06-08 02:44:34 +00:00
Justin Hibbits
287e7a861a Add more #ifdef arch checks to the linuxkpi
arm, mips, and powerpc all implement pmap_mapdev_attr() and pmap_unmapdev(),
so add those archs to the checks.  powerpc also includes the atomic_swap_*()
functions, so add that to the supported list as well.  Not tested except by
compiling powerpc.

Reviewed by:	markj
2017-06-07 18:08:11 +00:00
Hans Petter Selasky
25b3ef2c99 Fix init order in the LinuxKPI for IDR support after recent changes.
CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-06-06 10:12:58 +00:00
Konstantin Belousov
3df7ebc4ed Add sysctl vfs.ino64_trunc_error controlling action on truncating
inode number or link count for the ABI compat binaries.

Right now, and by default after the change, too large 64bit values are
silently truncated to 32 bits.  Enabling the knob causes the system to
return EOVERFLOW for stat(2) family of compat syscalls when some
values cannot be completely represented by the old structures.  For
getdirentries(2), knob skips the dirents which would cause non-trivial
truncation of d_ino.

EOVERFLOW error is specified by the X/Open 1996 LFS document
('Adding Support for Arbitrary File Sizes to the Single UNIX
Specification').

Based on the discussion with:	bde
Sponsored by:	The FreeBSD Foundation
2017-06-05 11:40:30 +00:00
Dmitry Chagin
51d93426d9 On success, getrandom() Linux system call returns the number of bytes that
were copied to the buffer supplied by the user.

PR:           219464
Submitted by: Maciej Pasternacki
Reported by:  Maciej Pasternacki
MFC after:    1 week
2017-06-04 18:35:30 +00:00
Dmitry Chagin
e2e6a2a1b6 Revert r319053 due to lack of sense. As pointed out by kib@ opt_global.h
contains such fundamental settings as e.g. the SMP option, and a fake
opt_global.h almost never matches real configured kernels.

Reported by:	kib@
2017-06-04 18:24:41 +00:00
Hans Petter Selasky
67e984c8f2 Improve kqueue() support in the LinuxKPI. Some applications using
kqueue() do not set non-blocking I/O mode for event driven read of
file descriptors. This means the LinuxKPI internal kqueue read and
write event flags must be updated before the next read and/or write
system call. Else the read and/or write system call may block. This
can happen when there is no more data to read following a previous
read event. Then the application also gets blocked from processing
other events. This situation can also be solved by the applications
setting and using non-blocking I/O mode.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-02 16:52:18 +00:00
Hans Petter Selasky
639af71ab1 Add support for setting the non-blocking I/O flag for LinuxKPI
character devices. In Linux the FIONBIO IOCTL is handled by the kernel
and not the drivers. Also return success for the FIOASYNC ioctl,
due to existing logic in kern_fcntl(), even though it is not supported
currently.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-02 16:30:40 +00:00
Hans Petter Selasky
8600ba1aa9 Make sure the selrecord() function is only called from within system
polling contexts in the LinuxKPI.

After the kqueue() support was added to the LinuxKPI in r319409 the
Linux poll file operation will be used outside the system file polling
callback function, which can cause a NULL-pointer panic inside
selrecord() because curthread->td_sel is set to NULL. This patch moves
the selrecord() call away from poll_wait() and to the system file poll
callback function in the LinuxKPI, which essentially wraps the Linux
one. This is similar to what the cuse(3) module is currently doing.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 16:49:48 +00:00
Hans Petter Selasky
328c75d621 Translate the ERESTARTSYS error code into ERESTART in the LinuxKPI
ioctl(), read() and write() system call handlers. This error code is
internal to the kernel and should not be seen by user-space programs
according to Linux.

Submitted by:		Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 09:53:55 +00:00
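
The translation amounts to a single comparison. In the sketch below the constants are defined locally with the values used by Linux (ERESTARTSYS is 512) and FreeBSD (ERESTART is -1); the helper name is hypothetical and the real handlers may be structured differently:

```c
#define	LINUX_ERESTARTSYS	512	/* Linux kernel-internal restart code */
#define	BSD_ERESTART		(-1)	/* FreeBSD's ERESTART */

/*
 * Hypothetical helper: map the positive error code returned by a
 * Linux-style handler so that the kernel-internal ERESTARTSYS never
 * reaches user space. All other values pass through unchanged.
 */
static int
translate_linux_error(int error)
{
	return (error == LINUX_ERESTARTSYS ? BSD_ERESTART : error);
}
```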
Hans Petter Selasky
a6b28ee02a Add generic kqueue() and kevent() support to the LinuxKPI character
devices. The implementation allows read and write filters to be
created and piggybacks on the poll() file operation to determine when
a filter should trigger. The piggyback mechanism is simply to check
for the EWOULDBLOCK or EAGAIN return code from read(), write() or
ioctl() system calls and then update the kqueue() polling state bits.
The implementation is similar to the one found in the cuse(3) module.
Refer to sys/fs/cuse/*.[ch] for more details.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-06-01 09:34:51 +00:00
Hans Petter Selasky
c2676069cb Implement print_hex_dump(), print_hex_dump_bytes() and
printk_ratelimited() in the LinuxKPI.

While at it fix the inclusion guard of printk.h to be similar to the
rest of the LinuxKPI header files.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 16:24:02 +00:00
Hans Petter Selasky
427cefde27 Properly implement idr_preload() and idr_preload_end() in the
LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 16:08:30 +00:00
Hans Petter Selasky
dff36e69a1 Implement in_atomic() function in the LinuxKPI.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 15:05:44 +00:00
Hans Petter Selasky
90b30e6560 Properly set the .d_name field in the cdevsw structure for the
LinuxKPI.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:11:06 +00:00
Hans Petter Selasky
d56f1ed887 Make sure the VMAP's "vm_file" field is referenced in a Linux
compatible way by the linux_dev_mmap_single() function in the
LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:07:05 +00:00
Hans Petter Selasky
cca15f28c5 Remove the VMA handle from its list before calling the LinuxKPI VMA
close operation to prevent other threads from reusing the VM object
handle pointer.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:05:54 +00:00
Hans Petter Selasky
68b9f2f00c Don't acquire a reference on the VM-space when allocating the LinuxKPI
task structure to avoid deadlock when tearing down the VM object
during a process exit.

Found by:		markj @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 13:01:27 +00:00
Hans Petter Selasky
ea67550be0 Fix a reference count leak in the LinuxKPI due to calling VM open when
it shouldn't be called.

Background:
The Linux VM open operation is called when a new VMA is
created on top of the current VMA. This is done through either mremap
flow or split_vma, usually due to mlock, madvise, munmap and so
on. This is currently not supported by the LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 12:08:25 +00:00
Hans Petter Selasky
f5a9867b7d Fixes for refcounting "struct linux_file" in the LinuxKPI.
- Allow "struct linux_file" to be refcounted when its "_file" member
  is NULL by using its "f_count" field. The reference counts are
  transferred to the file structure when the file descriptor is
  installed.

- Add missing vdrop() calls for error cases during open().

- Set the "_file" member of "struct linux_file" during open. This
allows use of refcounting through get_file() and fput() with LinuxKPI
character devices.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 12:02:59 +00:00
Hans Petter Selasky
3f743d782a Make sure the thread's priority is restored for all three cases inside
linux_synchronize_rcu_cb() in the LinuxKPI.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-31 10:01:15 +00:00
Mark Johnston
cb564d2436 Add some miscellaneous definitions to support DRM drivers.
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D10985
2017-05-30 17:16:08 +00:00
Dmitry Chagin
9ecc1abca3 On success, getrandom() Linux system call returns the number of bytes that
were copied to the buffer supplied by the user.

Also fix getrandom() if Linuxulator modules are built without the kernel.

PR:		219464
Submitted by:	Maciej Pasternacki
Reported by:	Maciej Pasternacki
MFC after:	1 week
2017-05-28 07:40:09 +00:00
Allan Jude
c20feae640 Followup to r318765 (capsicumize cpuset_*affinity)
Update *sysent files
2017-05-24 01:01:57 +00:00
Allan Jude
f299c47b52 Allow cpuset_{get,set}affinity in capabilities mode
bhyve was recently sandboxed with capsicum, and needs to be able to
control the CPU sets of its vcpu threads

Reviewed by:	emaste, oshogbo, rwatson
MFC after:	2 weeks
Sponsored by:	ScaleEngine Inc.
Differential Revision:	https://reviews.freebsd.org/D10170
2017-05-24 00:58:30 +00:00
Konstantin Belousov
ec95c622ff Regen.
2017-05-23 09:30:42 +00:00
Konstantin Belousov
6992112349 Commit the 64-bit inode project.
Extend the ino_t, dev_t, nlink_t types to 64-bit ints.  Modify
struct dirent layout to add d_off, increase the size of d_fileno
to 64-bits, increase the size of d_namlen to 16-bits, and change
the required alignment.  Increase struct statfs f_mntfromname[] and
f_mntonname[] array length MNAMELEN to 1024.

ABI breakage is mitigated by providing compatibility using versioned
symbols, ingenious use of the existing padding in structures, and
by employing other tricks.  Unfortunately, not everything can be
fixed, especially outside the base system.  For instance, third-party
APIs which pass struct stat around are broken in backward and
forward incompatible ways.

Kinfo sysctl MIBs ABI is changed in backward-compatible way, but
there is no general mechanism to handle other sysctl MIBS which
return structures where the layout has changed. It was considered
that the breakage is either in the management interfaces, where we
usually allow ABI slip, or is not important.

Struct xvnode changed layout, no compat shims are provided.

For struct xtty, dev_t tty device member was reduced to uint32_t.
It was decided that keeping ABI compat in this case is more useful
than reporting 64-bit dev_t, for the sake of pstat.

Update note: strictly follow the instructions in UPDATING.  Build
and install the new kernel with COMPAT_FREEBSD11 option enabled,
then reboot, and only then install new world.

Credits: The 64-bit inode project, also known as ino64, started life
many years ago as a project by Gleb Kurtsou (gleb).  Kirk McKusick
(mckusick) then picked up and updated the patch, and acted as a
flag-waver.  Feedback, suggestions, and discussions were carried
by Ed Maste (emaste), John Baldwin (jhb), Jilles Tjoelker (jilles),
and Rick Macklem (rmacklem).  Kris Moore (kris) performed an initial
ports investigation followed by an exp-run by Antoine Brodin (antoine).
Essential and all-embracing testing was done by Peter Holm (pho).
The heavy lifting of coordinating all these efforts and bringing the
project to completion was done by Konstantin Belousov (kib).

Sponsored by:	The FreeBSD Foundation (emaste, kib)
Differential revision:	https://reviews.freebsd.org/D10439
2017-05-23 09:29:05 +00:00
Gleb Smirnoff
33c6ba0c65 Fix regression in ndis(4) after r286410. This adds a bunch of checks for
whether this is a Ethernet or 802.11 device and does proper dereferencing.

PR:		213237
Submitted by:	<ota j.email.ne.jp>
MFC after:	2 weeks
2017-05-22 20:00:01 +00:00
Ed Maste
bd309b323a Regen sysent after r318634, no open(2) in capability mode
Sponsored by:	The FreeBSD Foundation
2017-05-22 11:45:45 +00:00
Ed Maste
68fc8f3934 disallow open(2) in capability mode
Previously open(2) was allowed in capability mode, with a comment that
suggested this was likely the case to facilitate debugging. The system
call would still fail later on, but it's better to disallow the syscall
altogether.

We now have the kern.trap_enotcap sysctl or PROC_TRAPCAP_CTL proccontrol
to aid in debugging.

In any case libc has translated open() to the openat syscall since
r277032.

Reviewed by:	kib, rwatson
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D10850
2017-05-22 11:43:19 +00:00
Mark Johnston
d6c8335623 Add get_cpu() and put_cpu().
MFC after:	1 week
2017-05-21 00:06:36 +00:00
Mark Johnston
02fb845bbf Fix a few uses of kern_yield() in the TTM and the LinuxKPI.
kern_yield(0) effectively causes the calling thread to be rescheduled
immediately since it resets the thread's priority to the highest possible
value. This can cause livelocks when the pattern
"while (!trylock()) kern_yield(0);" is used since the thread holding the
lock may linger on the runqueue for the CPU on which the looping thread is
running.

MFC after:	1 week
2017-05-18 18:35:14 +00:00
Hans Petter Selasky
d8e073a985 Fix init order in the LinuxKPI for RCU support.
CPU_FOREACH() is not available until SI_SUB_CPU at SI_ORDER_ANY
when the LinuxKPI is loaded as part of the kernel.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-05-09 12:51:42 +00:00
Mahdi Mokhtari
906ba87284 Fix linprocfs_docpuinfo() output to match what newer Linux apps expect
Reviewed by:	trasz
Approved by:	trasz
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D10274
2017-05-06 17:37:01 +00:00
Brooks Davis
e9f32d1dc4 Regen post r317845.
MFC after:	1 week
MFC with:	r317845
Sponsored by:	DARPA, AFRL
2017-05-05 18:50:22 +00:00
Brooks Davis
f19351aad8 Provide a freebsd32 implementation of sigqueue()
The previous misuse of sys_sigqueue() was sending random register or
stack garbage to 64-bit targets.  The freebsd32 implementation preserves
the sival_int member of value when signaling a 64-bit process.

Document the mixed ABI implementation of union sigval and the
incompatibility of sival_ptr with pointer integrity schemes.

Reviewed by:	kib, wblock
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D10605
2017-05-05 18:49:39 +00:00