Commit Graph

2777 Commits

Author SHA1 Message Date
Hans Petter Selasky
3803a97f84 Use __LP64__ to detect presence of suword64() to fix linking and
loading of the LinuxKPI on 32-bit platforms.

Reported by:		lwhsu @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 20:39:31 +00:00
Hans Petter Selasky
810ea5b270 The LinuxKPI pagefault disable and enable functions can only be used
pairwise to support the FreeBSD way of pushing and popping the page
fault flags. Ensure this by requiring every occurrence of pagefault
disable function call to have a corresponding pagefault enable call.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 16:53:22 +00:00
Hans Petter Selasky
0e05589b39 Implement more userspace memory access functions in the LinuxKPI.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 16:49:27 +00:00
Hans Petter Selasky
b8a8ed7c96 Define some more LinuxKPI task related macros.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 12:33:34 +00:00
Hans Petter Selasky
8300fb13dd Add helper function similar to ip_dev_find() to the LinuxKPI to lookup
a network device by its IPv6 address in the given VNET.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 10:02:45 +00:00
Hans Petter Selasky
404027276b Add basic support for VIMAGE to the LinuxKPI and ibcore.
Support is implemented by mapping Linux's "struct net" into FreeBSD's
"struct vnet". Currently only vnet0 is supported by ibcore.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-16 09:59:35 +00:00
Dmitry Chagin
5bb676adf7 Fix usage of the same 'i' variable in the external and nested loops.
Submitted by: Svyatoslav <razmyslov at viva64.com>
Sponsored by: PVS-Studio

MFC after:	1 week
2017-03-14 18:29:23 +00:00
Hans Petter Selasky
5f50a414ba Set "current" pointer for LinuxKPI interrupts and timer callbacks.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-14 14:02:47 +00:00
Konstantin Belousov
01feb4c3d4 Use designated initializers for kevent_copyops.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-03-14 09:25:01 +00:00
Hans Petter Selasky
f73286645f Fix implementation of the DECLARE_WORK() macro in the LinuxKPI to fully
initialize the declared work structure and not only the function callback
pointer.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-09 18:37:17 +00:00
Hans Petter Selasky
9760ac0a3e Implement support for mutexes with deadlock avoidance in the LinuxKPI.
When locking a mutex and deadlock is detected the first mutex lock
call that sees the deadlock will return -EDEADLK .

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-09 18:33:40 +00:00
Hans Petter Selasky
4f688d19fb Cleanup the LinuxKPI mutex wrappers.
Add support for using mutexes during KDB and shutdown. This is also
required for doing mode-switching during panic for drm-next.

Add new mutex functions mutex_init_witness() and mutex_destroy()
allowing LinuxKPI mutexes to be tracked by witness.

Declare mutex_is_locked() and mutex_is_owned() like inline functions
to get cleaner warnings. These functions are used inside WARN_ON()
statements which might look a bit odd if these functions get fully
expanded.

Give mutexes better debug names through the mutex_name() macro when
WITNESS_ALL is defined. The mutex_name() macro can prefix parts of the
filename and line number before the mutex name.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-09 17:01:00 +00:00
Hans Petter Selasky
c23b6e238f Don't create any threads before SI_SUB_INIT_IF in the LinuxKPI. Else
kthread_add() will assert it is called too soon. This fixes a startup
issue when COMPAT_LINUXKPI is in enabled the kernel configuration
file.

Reported by:		Michael Butler <imb@protected-networks.net>
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-09 09:17:43 +00:00
Hans Petter Selasky
43ee32f7df Fix compilation warning for powerpc64 by not using const keyword in
return types:

Type qualifiers ignored on function return type [-Wreturn-type]

Reported by:		andreast @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-08 21:28:53 +00:00
Hans Petter Selasky
14c5024db8 Cleanup the LinuxKPI slab implementation.
Put large functions into linux_slab.c instead of declaring them static
inline.

Add support for more memory allocation wrappers like kmalloc_array()
and __vmalloc().

Make sure either the M_WAITOK or the M_NOWAIT flag is set and mask
away unused memory allocation flags before calling FreeBSD's malloc()
routine.

Move kmalloc_node() definition to slab.h where it belongs.

Implement support for the SLAB_DESTROY_BY_RCU feature when creating a
kmem_cache which basically means kmem_cache memory is freed using
call_rcu().

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-08 11:09:27 +00:00
Hans Petter Selasky
0e3bbe9197 Implement eth_zero_addr() in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-08 09:53:20 +00:00
Hans Petter Selasky
a767c1883d Add support for constant pointer constructs to READ_ONCE() in the
LinuxKPI. When the type of the argument is constant the temporary
variable cannot be assigned after the barrier. Instead assign the
temporary variable by initialization.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 20:24:34 +00:00
Dmitry Chagin
b8bec5a415 Linux semop system call return EINVAL in case when the invalid nsops
or semid values specified.

MFC after:	1 month
2017-03-07 17:12:22 +00:00
Dmitry Chagin
c10e04f5a0 Linux kernel does not export to the user space ipc_perm.mode values
other than S_IRWXUGO (0777).

MFC after:	1 month
2017-03-07 17:09:12 +00:00
Dmitry Chagin
ab60bc8488 Reduce code duplication between MD Linux code by moving SYSV IPC 64-bit
related struct definitions out into the MI path.

Invert the native ipc structs to the Linux ipc structs convesion logic.
Since 64-bit variant of ipc structs has more precision convert native ipc
structs to the 64-bit Linux ipc structs and then truncate 64-bit values
into the non 64-bit if needed. Unlike Linux, return EOVERFLOW if the
values do not fit.

Fix SYSV IPC for 64-bit Linuxulator which never sets IPC_64 bit.

MFC after:	1 month
2017-03-07 17:07:16 +00:00
Hans Petter Selasky
249a42207b Implement time_is_after_eq_jiffies() function in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 15:37:51 +00:00
Hans Petter Selasky
661a318c83 Fix implementation of the DECLARE_RWSEM() macro in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 15:34:49 +00:00
Hans Petter Selasky
dc0d19dd4a Make sure jiffies value is cast to an integer in the LinuxKPI before
doing millisecond conversion. Under FreeBSD jiffies are 32-bit.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 15:33:38 +00:00
Hans Petter Selasky
4cd34a41c9 Use grouptaskqueue for tasklets in the LinuxKPI.
This avoids creating own per-CPU threads and also ensures the tasklet
execution happens on the same CPU core invoking the tasklet.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 13:51:14 +00:00
Hans Petter Selasky
ca2ad6bd77 LinuxKPI workqueue cleanup.
This change makes the workqueue implementation behave more like in
Linux, both functionality wise and structure wise.

All workqueue code has been moved to linux_work.c

Add an atomic based statemachine to the work_struct to ensure proper
operation. Prior to this change struct_work was directly mapped to a
FreeBSD task. When a taskqueue has multiple threads the same task may
end up being executed on more than one worker thread simultaneously.
This might cause problems with code coming from Linux, which expects
serial behaviour, similar to Linux tasklets.

Move all global workqueue function names into the linux_xxx domain to
avoid symbol name clashes in the future.

Implement a few more workqueue related functions and macros.

Create two multithreaded taskqueues for the LinuxKPI during module
load, one for time-consuming callbacks and one for non-time consuming
callbacks.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-07 12:09:14 +00:00
Mahdi Mokhtari
8049c6bfb8 Add UNIMPLEMENTED() placeholder macro for
the syscalls that are not implemented in Linux kernel itself.
Cleanup DUMMY() macros.

Reviewed by:	dchagin, trasz
Approved by:	dchagin
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D9804
2017-03-06 18:11:38 +00:00
Hans Petter Selasky
def277d3ef Implement add_timer_on() function in the LinuxKPI.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-06 14:56:57 +00:00
Hans Petter Selasky
19bf8ef562 Implement DECLARE_RWSEM() macro in the LinuxKPI to initialize a
Read-Write semaphore during module init time.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-06 12:22:05 +00:00
Hans Petter Selasky
684bcfec89 Give LinuxKPI Read-Write semaphores better debug names when
WITNESS_ALL is defined. The lock name is based on the filename and
line number where the initialisation happens.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-06 12:20:56 +00:00
Hans Petter Selasky
e0db0ddb39 Remove duplicate prototype in the LinuxKPI to fix compilation warning.
Reported by:		emaste @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-04 20:06:47 +00:00
Dmitry Chagin
8042c504f5 Style(9).
MFC after:	1 month
2017-03-04 08:59:21 +00:00
Dmitry Chagin
e45e3698ae Remove attribute __packed from some IPC struct definition since
Linuxulator is x86 only.
The only notable differences in algnment for an LP64 64-bit system
when compared to a 32-bit system is an eight or large byte types
alignment.

MFC after:	1 month
2017-03-04 08:57:39 +00:00
Dmitry Chagin
c1c8a12139 Hide Linux socketcall constants under corresponding #ifdef since
they are used only in i386 Linuxulator.

MFC after:	1 week
2017-03-04 06:54:05 +00:00
Hans Petter Selasky
1f827dab9e Update the LinuxKPI RCU and SRCU wrappers for the concurrency kit, CK.
- Optimise the RCU implementation to not allocate and free
ck_epoch_records during runtime. Instead allocate two sets of
ck_epoch_records per CPU for general purpose use. The first set is
only used for reader locks and the second set is only used for
synchronization and barriers and is protected with a regular mutex to
prevent simultaneous issues.

- Move the task structure away from the rcu_head structure and into
the per-CPU structures. This allows the size of the rcu_head structure
to be reduced down to the size of two pointers.

- Fix a bug where the linux_rcu_barrier() function only waited for one
per-CPU epoch record to be completed instead of all.

- Use a critical section or a mutex to protect ck_epoch_begin() and
ck_epoch_end() depending on RCU or SRCU type. All the ck_epoch_xxx()
functions, except ck_epoch_register(), ck_epoch_unregister() and
ck_epoch_recycle() are not re-entrant and needs a critical section or
a mutex to operate in the LinuxKPI, after inspecting the CK
implementation of the above mentioned functions. The simultaneous
issues arise from per-CPU epoch records being shared between multiple
threads depending on the amount of taskswitching and how many threads
are involved with the RCU and SRCU operations.

- Properly free all epoch records by using safe list traversal at
LinuxKPI module unload. It turns out the ck_epoch_recycle() always
have the records on an internal list and use a flag in the epoch
record to track allocated and free entries. This would lead to use
after free during module unload.

- Remove redundant synchronize_rcu() call from the
linux_compat_uninit() function. Let the linux_rcu_runtime_uninit()
function do the final rcu_barrier() instead.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-03-03 16:28:03 +00:00
Konstantin Belousov
ef130ff64c With the removal of IA64, the only arch which uses ia32 compat is amd64.
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
2017-03-01 11:39:29 +00:00
Dmitry Chagin
0633df29be Linux epoll return EEXIST on case when op is EPOLL_CTL_ADD, and the supplied
file descriptor fd is already registered with this epoll instance.

MFC after:	1 month
2017-02-28 19:55:16 +00:00
Dmitry Chagin
8bc4edfd75 Linux epoll return ENOENT error in case when op is EPOLL_CTL_MOD or
EPOLL_CTL_DEL, and fd is not registered with this epoll instance.

MFC after:	1 month
2017-02-28 19:54:22 +00:00
Dmitry Chagin
29b1ecbe01 FreeBSD does not have analgue for epill EPOLLPRI event type.
So, do not set EPOLLPRI event acidently.
Also, do not set EPOLLWRNORM and EPOLLRDNORM events as epoll
do not set this events.

MFC after:	1 month
2017-02-28 19:49:21 +00:00
Gleb Smirnoff
efe3b0de14 Remove SVR4 (System V Release 4) binary compatibility support.
UNIX System V Release 4 is operating system released in 1988. It ceased
to exist in early 2000-s.
2017-02-28 05:14:42 +00:00
Dmitry Chagin
281afc595f Return EINVAL when an invalid file descriptor specified.
MFC after:	1 month
2017-02-27 16:55:09 +00:00
Dmitry Chagin
ed211c40f4 Unify eventfd ioctl method and use it for other similar interfaces.
MFC after:	1 month
2017-02-27 16:53:52 +00:00
Hans Petter Selasky
3cfeca84b4 Implement more bit operation functions in the LinuxKPI.
Some minor whitespace nits while at it.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-27 14:38:17 +00:00
Hans Petter Selasky
522f4b2c75 Define __sum16 type in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-27 13:59:02 +00:00
Dmitry Chagin
2fa6d2fe84 Return EINVAL in case when an invalid size of signal mask specified.
MFC after:	1 month
2017-02-26 20:01:58 +00:00
Dmitry Chagin
80c7315d17 Restore signal mask in epoll_pwait.
MFC after:	1 month
2017-02-26 19:54:17 +00:00
Dmitry Chagin
e0a254f6df Return EINVAL when an invalid file descriptor is specified.
MFC after:	1 month
2017-02-26 19:51:44 +00:00
Dmitry Chagin
dd93b628e9 Implement timerfd family syscalls.
MFC after:	1 month
2017-02-26 09:48:18 +00:00
Dmitry Chagin
01d4a346c2 Nostly style(9) changes, replace unused eventfd_truncate()
by default invfo_truncate() method.

MFC after:	1 month
2017-02-26 09:42:34 +00:00
Dmitry Chagin
0670e972e3 Return EOVERFLOW error in case then the size of tv_sec field of struct timespec
in COMPAT_LINUX32 Linuxulator's not equal to the size of native tv_sec.

MFC after:	1 month
2017-02-26 09:40:42 +00:00
Edward Tomasz Napierala
e801ac7852 Fix linux_fstatfs() to return proper value for f_frsize. Without it,
linux df(1) binary from Xenial shows garbage.

Reviewed by:	dchagin
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9692
2017-02-25 20:32:37 +00:00
Mahdi Mokhtari
bd911530b7 Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.
Reviewed by:	dchagin
Approved by:	dchagin, trasz (src committers)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9722
2017-02-24 20:04:02 +00:00
Dmitry Chagin
8665c4d9cd Revert r314217. Commit is not match that I have approved. 2017-02-24 19:47:27 +00:00
Mahdi Mokhtari
21d23e3249 Add linux_preadv() and linux_pwritev() syscalls to Linuxulator.
Reviewed by:	dchagin
Approved by:	dchagin, trasz (src committers)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D9722
2017-02-24 19:22:17 +00:00
Hans Petter Selasky
0f32531a56 Implement more string functions in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 17:36:55 +00:00
Hans Petter Selasky
5a5a8c8a17 Prototype device structure to ensure LinuxKPI header file can be
included standalone.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 17:03:14 +00:00
Hans Petter Selasky
959d6165a2 Implement srcu_dereference() macro in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 14:40:15 +00:00
Hans Petter Selasky
797046eebb Implement BIT_ULL() macro in the LinuxKPI.
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-24 14:23:46 +00:00
Hans Petter Selasky
cffaf933d7 Implement __test_and_clear_bit() and __test_and_set_bit() in the LinuxKPI.
The clang compiler will optimise these functions down to three AMD64
instructions if the bit argument is a constant during compilation.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-23 09:53:54 +00:00
Dmitry Chagin
c1f156d447 Right clock defines specified in linux_timer.h.
Get rid of spirious clock defines from linux_misc.h.

MFC after:	1 week
2017-02-23 08:17:42 +00:00
Hans Petter Selasky
72ebbe00b3 Convert magic values into macros in the LinuxKPI scatterlist
implementation.

Suggested by:		cem @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-22 20:24:09 +00:00
Hans Petter Selasky
1cdefd084d Optimise unmapped LinuxKPI page allocations.
When allocating unmapped pages, take advantage of the direct map on
AMD64 to get the virtual address corresponding to a page. Else all
pages allocated must be mapped because sometimes the virtual address
of a page is requested.

Move all page allocation and deallocation code into an own C-file.

Add support for GFP_DMA32, GFP_KERNEL, GFP_ATOMIC and __GFP_ZERO
allocation flags.

Make a clear separation between mapped and unmapped allocations.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-22 19:39:54 +00:00
Hans Petter Selasky
8306998f5b Improve LinuxKPI scatter list support.
The i915kms driver in Linux 4.9 reimplement parts of the scatter list
functions with regards to performance. In other words there is not so
much room for changing structure layouts and functionality if the
i915kms should be built AS-IS. This patch aligns the scatter list
support to what is expected by the i915kms driver. Remove some
comments not needed while at it.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-22 19:31:02 +00:00
Hans Petter Selasky
1a01b4e566 Replace dummy implementation of RCU in the LinuxKPI with one based on
the in-kernel concurrency kit's ck_epoch API. Factor RCU hlist_xxx()
functions into own rculist.h header file.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-21 18:04:21 +00:00
Edward Tomasz Napierala
3b51ec0886 Get rid of foo_sys() in linuxulator code. It was commented out, and it
would be useless anyway - there is no point in pretending to have block
devices; our "block" devices are in fact character ones, and can only
be accessed as such.

Discussed with:	dchagin
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-02-21 15:57:01 +00:00
Hans Petter Selasky
e560eab72c Streamline the LinuxKPI spinlock wrappers.
1) Add better spinlock debug names when WITNESS_ALL is defined.

2) Make sure that the calling thread gets bound to the current CPU
while a spinlock is locked. Some Linux kernel code depends on that the
CPU ID doesn't change while a spinlock is locked.

3) Add support for using LinuxKPI spinlocks during a panic().

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-21 14:22:14 +00:00
Hans Petter Selasky
ef23481a79 Add support for LinuxKPI tasklets.
Tasklets are implemented using a taskqueue and a small statemachine on
top. The additional statemachine is required to ensure all LinuxKPI
tasklets get serialized. FreeBSD taskqueues do not guarantee
serialisation of its tasks, except when there is only one worker
thread configured.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-21 13:23:53 +00:00
Hans Petter Selasky
1e3db1de0c Make the LinuxKPI task struct persistent accross system calls.
A set of helper functions have been added to manage the life of the
LinuxKPI task struct. When an external system call or task is invoked,
a check is made to create the task struct by demand. A thread
destructor callback is registered to free the task struct when a
thread exits to avoid memory leaks.

This change lays the ground for emulating the Linux kernel more
closely which is a dependency by the code using the LinuxKPI APIs.

Add new dedicated td_lkpi_task field has been added to struct thread
instead of abusing td_retval[1].

Fix some header file inclusions to make LINT kernel build properly
after this change.

Bump the __FreeBSD_version to force a rebuild of all kernel modules.

MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-21 12:43:02 +00:00
Edward Tomasz Napierala
15cbcf4bab Add /proc/self/mounts to linprocfs; some linux binaries need it.
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-02-20 17:33:25 +00:00
Edward Tomasz Napierala
c6d57d3073 There are some Linux binaries that expect the system to obey the "addr"
parameter to mmap(2), even if MAP_FIXED is not explicitly specified.
Android ART is one example.  Implement bug compatibility for this case
in linuxulator.

Reviewed by:	dchagin@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9373
2017-02-19 17:17:06 +00:00
Dmitry Chagin
486a06bdf0 Implement rt_tgsigqueueinfo system call used by glibc for pthread_sigqueue(3).
MFC after:	2 week
2017-02-19 07:38:11 +00:00
Dmitry Chagin
dddb7e7f25 Style(9), some XXX comments fix. No functional changes.
MFC after:	1 week
2017-02-18 10:01:17 +00:00
Dmitry Chagin
fa580e65c4 Initialize cap_rights before use.
MFC after:	1 week
2017-02-18 09:39:20 +00:00
Dmitry Chagin
56fba8e66b Finich r313684.
Convert linux_recv(), linux_send() and linux_accept() system call arguments
to the register_t type too.

PR:		217161
MFC after:	3 days
xMFC with:	r313284,r313285,r313684
2017-02-18 07:21:50 +00:00
Hans Petter Selasky
269d8c86e9 Implement GFP_DMA32 flag in the LinuxKPI.
Define all FreeBSD native GFP bits as GFP_NATIVE_MASK.

Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-17 13:31:11 +00:00
Hans Petter Selasky
ddad2785bc Allow container_of() to be used with constant data pointers.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-16 14:13:36 +00:00
Hans Petter Selasky
622f2291e8 Implement more LinuxKPI atomic functions and macros.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-16 12:56:10 +00:00
Hans Petter Selasky
28a04a26b2 Allow passing a constant atomic_t to atomic_read().
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-16 12:20:57 +00:00
Hans Petter Selasky
13459eb4a3 Whitespace fix.
Obtained from:		kmacy @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2017-02-16 12:08:52 +00:00
Edward Tomasz Napierala
8081c6ce6b Improve debugging output.
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
2017-02-16 10:36:00 +00:00
Dmitry Chagin
3a66cf0372 Replace Linuxulator implementation of readdir(), getdents() and
getdents64() with wrapper over kern_getdirentries().

The patch was originally written by emaste@ and then adapted by trasz@
and me.

Note:
1. I divided linux_getdents() and linux_readdir() as in case when the
getdents() called with count = 1 (readdir() case) it can overwrite
user stack (by writing to user buffer pointer more than 1 byte).

2. Linux returns EINVAL in case when user supplied buffer is not enough
to contain fetched dirent.

3. Linux returns ENOTDIR in case when fd points to not a directory.

Reviewed by:		trasz@
MFC after:		1 month
Differential Revision:	https://reviews.freebsd.org/D2210
2017-02-14 19:13:27 +00:00
Konstantin Belousov
496ab0532d Rework r313352.
Rename kern_vm_* functions to kern_*.  Move the prototypes to
syscallsubr.h.  Also change Mach VM types to uintptr_t/size_t as
needed, to avoid headers pollution.

Requested by:	alc, jhb
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D9535
2017-02-13 09:04:38 +00:00
Konstantin Belousov
995b8f4fb8 Style: wrap long line.
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2017-02-13 00:39:43 +00:00
Dmitry Chagin
abf20e9392 Fix r313284.
Members of the syscall argument structures are padded to a word size. So,
for COMPAT_LINUX32 we should convert user supplied system call arguments
which is 32-bit in that case to the array of register_t.

Reported by:	Oleg V. Nauman
MFC after:	1 week
2017-02-12 15:22:50 +00:00
John Baldwin
bb9b710477 Regenerate all the system call tables to drop "created from" lines.
One of the ibcs2 files contains some actual changes (new headers) as
it hasn't been regenerated after older changes to makesyscalls.sh.
2017-02-10 19:45:02 +00:00
Edward Tomasz Napierala
69cdfcef2e Add kern_vm_mmap2(), kern_vm_mprotect(), kern_vm_msync(), kern_vm_munlock(),
kern_vm_munmap(), and kern_vm_madvise(), and use them in various compats
instead of their sys_*() counterparts.

Reviewed by:	ed, dchagin, kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9378
2017-02-06 20:57:12 +00:00
Dmitry Chagin
8b756d40a7 Update syscall.master to 4.10-rc6. Also fix comments, a typo,
and wrong numbering for a few unimplemented syscalls.

For 32-bit Linuxulator, socketcall() syscall was historically
the entry point for the sockets API. Starting in Linux 4.3, direct
syscalls are provided for the sockets API. Enable it.

The initial version of patch was provided by trasz@ and extended by me.

Submitted by:	trasz
MFC after:	2 week
Differential Revision:	https://reviews.freebsd.org/D9381
2017-02-05 14:17:09 +00:00
Edward Tomasz Napierala
85dbb41686 Fix linux_pipe() and linux_pipe2() to close file descriptors on copyout
error.

Reviewed by:	dchagin
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9425
2017-02-05 14:03:25 +00:00
Edward Tomasz Napierala
96ee43103d Add kern_cpuset_getaffinity() and kern_cpuset_getaffinity(),
and use it in compats instead of their sys_*() counterparts.

Reviewed by:	kib, jhb, dchagin
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9383
2017-02-05 13:24:54 +00:00
Edward Tomasz Napierala
b38b22b0b2 Add kern_pread() and kern_pwrite(), and use it in compats instead
of their sys_*() counterparts. The svr4 is left unchanged.

Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9379
2017-01-31 15:35:18 +00:00
Edward Tomasz Napierala
5db72ef2e4 Fix linux_getppid() to debug the actual parent, even it was reparented
by debugger.

Reviewed by:	dchagin@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9361
2017-01-31 15:22:51 +00:00
Edward Tomasz Napierala
fc8bde8ffe Replace calls to sys_truncate() with kern_truncate().
Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9371
2017-01-31 15:19:44 +00:00
Edward Tomasz Napierala
ea2ebdc19e Add kern_cpuset_getid() and kern_cpuset_setid(), and use them
in compat32 instead of their sub_*() counterparts.

Reviewed by:	jhb@, kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9382
2017-01-31 15:11:23 +00:00
Edward Tomasz Napierala
d293f35c09 Add kern_listen(), kern_shutdown(), and kern_socket(), and use them
instead of their sys_*() counterparts in various compats. The svr4
is left untouched, because there's no point.

Reviewed by:	ed@, kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9367
2017-01-30 12:57:22 +00:00
Edward Tomasz Napierala
f67d6b5f12 Add kern_lseek() and use it instead of sys_lseek() in various compats.
I didn't touch svr4/, there's no point.

Reviewed by:	ed@, kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9366
2017-01-30 12:24:47 +00:00
Edward Tomasz Napierala
ae6b6ef6cb Replace sys_ftruncate() with kern_ftruncate() in various compats.
Reviewed by:	kib@
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9368
2017-01-30 11:50:54 +00:00
Baptiste Daroussin
b4b4b5304b Revert crap accidentally committed 2017-01-28 16:31:23 +00:00
Baptiste Daroussin
814aaaa7da Revert r312923 a better approach will be taken later 2017-01-28 16:30:14 +00:00
Mateusz Guzik
21b737495b Introduce __read_mostly and __exclusive_cache_line macros.
The intended use is to annotate frequently used globals which either rarely
change (and thus can be grouped in the same cacheline) or are an atomic counter
(which means it may benefit from being the only variable in the cacheline).

Linker script support is provided only for amd64. Architectures without it risk
having other variables put in, i.e. as if they were not annotated. This is
harmless from correctness point of view.

Reviewed by:	bde (previous version)
MFC after:	1 month
2017-01-27 14:53:09 +00:00
Ed Schouten
4423244072 Catch up with changes to structure member names.
Pointer/length pairs are now always named ${name} and ${name}_len.
2017-01-17 22:05:52 +00:00
Ed Schouten
c20537def1 Regenerate sources based on the system call tables. 2017-01-17 22:05:01 +00:00
Gleb Smirnoff
9565357905 Use getsock_cap() instead of fgetsock().
Reviewed by:	dchagin
2017-01-06 04:38:38 +00:00
Konstantin Belousov
2f304845e2 Do not allocate struct statfs on kernel stack.
Right now size of the structure is 472 bytes on amd64, which is
already large and stack allocations are indesirable.  With the ino64
work, MNAMELEN is increased to 1024, which will make it impossible to have
struct statfs on the stack.

Extracted from:	ino64 work by gleb
Discussed with:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-05 17:19:26 +00:00
Konstantin Belousov
607fa849d2 Some style fixes for getfstat(2)-related code.
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2017-01-05 17:03:35 +00:00
John Baldwin
1fabda45c3 Regen after r310638.
Differential Revision:	https://reviews.freebsd.org/D8854
2016-12-27 20:22:17 +00:00
John Baldwin
34ed0c63c8 Rename the 'flags' argument to getfsstat() to 'mode' and validate it.
This argument is not a bitmask of flags, but only accepts a single value.
Fail with EINVAL if an invalid value is passed to 'flag'.  Rename the
'flags' argument to getmntinfo(3) to 'mode' as well to match.

This is a followup to r308088.

Reviewed by:	kib
MFC after:	1 month
2016-12-27 20:21:11 +00:00
Hans Petter Selasky
a11bac7379 Implement more list header file functions.
Add definition guard for the list_head structure.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-26 10:41:51 +00:00
Hans Petter Selasky
70a3cc597a Fix LINT build.
Found by:	mmel @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-26 10:03:33 +00:00
Hans Petter Selasky
1125dbc049 Implement register and unregister chrdev in the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-26 01:18:07 +00:00
Hans Petter Selasky
a1410999f4 Use correct integer type when computing the maximum physical address
for kmem_alloc_contig().

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-25 21:41:40 +00:00
Hans Petter Selasky
03adb29e0d Improve LinuxKPI device support. Only delete own BSD devices and not
the ones obtained through devclass_get_device(). Some minor code
cleanups while at it.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-25 19:49:09 +00:00
Conrad Meyer
2f02a9e15a linuxkpi: Fix not-found case of linux_pci_find_irq_dev
Linux list_for_each_entry() does not neccessarily end with the iterator
NULL (it may be an offset from NULL if the list member is not the first
element of the member struct).

Reported by:	Coverity
CID:		1366940
Reviewed by:	hselasky@
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8780
2016-12-13 19:58:21 +00:00
Ed Schouten
d6d7df8aa0 Remove the only user of sysctl_add_oid().
My plan is to change this function's prototype at some point in the
future to add a new label argument, which can be used to export all of
sysctl as metrics that can be scraped by Prometheus. Switch over this
caller to use the macro wrapper counterpart.
2016-12-13 07:58:30 +00:00
Hans Petter Selasky
12af734d32 Add more LinuxKPI PCI definitions.
Obtained from:	kmacy @
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-12-09 15:05:09 +00:00
Hans Petter Selasky
1724ded49c Prefer function macros over regular macros in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-12-09 15:01:37 +00:00
Hans Petter Selasky
be48ab92ac Avoid malloc() warnings when using the LinuxKPI by zero-checking
the allocation flags.

Obtained from:		kmacy @
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-09 14:06:22 +00:00
Hans Petter Selasky
0a61267a99 MSIX can support more than 256 IRQs. Make sure the invalid IRQ number
set in the LinuxKPI is big enough.

Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-09 13:53:31 +00:00
Hans Petter Selasky
e996b07c72 Prefix some _pci_xxx() functions in the Linux KPI with linux_ and make
sure the IRQ number used by these functions is unsigned.

Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-09 13:47:50 +00:00
Hans Petter Selasky
7fdce5c42b Prefix the Linux KPI's kmem_xxx() functions with linux_ to avoid
conflict with the opensolaris kernel module.

This patch solves a problem where the kernel linker will incorrectly
resolve opensolaris kmem_xxx() functions as linuxkpi ones, which leads
to a panic when these functions are used.

Submitted by:		gallatin @
Sponsored by:           Mellanox Technologies
MFC after:		1 week
2016-12-09 13:41:26 +00:00
Eric van Gyzen
3d32d4a7c9 Export the whole thread name in kinfo_proc
kinfo_proc::ki_tdname is three characters shorter than
thread::td_name.  Add a ki_moretdname field for these three
extra characters.  Add the new field to kinfo_proc32, as well.
Update all in-tree consumers to read the new field and assemble
the full name, except for lldb's HostThreadFreeBSD.cpp, which
I will handle separately.  Bump __FreeBSD_version.

Reviewed by:	kib
MFC after:	1 week
Relnotes:	yes
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D8722
2016-12-07 15:04:22 +00:00
Alan Cox
bba39b9ae3 Remove PG_CACHED-related fields from struct vmmeter, because they are no
longer used.  More precisely, they are always zero because the code that
decremented and incremented them no longer exists.

Bump __FreeBSD_version to mark this change.

Reviewed by:	kib, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8583
2016-11-22 18:13:46 +00:00
Ed Maste
5503201d98 Tidy up ia32_sysvec sv_flags setting
Use the same approach as sys/arm/arm/elf_machdep.c to avoid an odd-
looking , on a separate line.
2016-10-20 20:29:54 +00:00
Sepherosa Ziehau
6faefea03b linuxkpi: Fix PCI BAR lazy allocation support.
FreeBSD supports lazy allocation of PCI BAR, that is, when a device
driver's attach method is invoked, even if the device's PCI BAR
address wasn't initialized, the invocation of bus_alloc_resource_any()
(the call chain: pci_alloc_resource() -> pci_alloc_multi_resource() ->
pci_reserve_map() -> pci_write_bar()) would allocate a proper address
for the PCI BAR and write this 'lazy allocated' address into the PCI
BAR.

This model works fine for native FreeBSD device drivers, but _not_ for
device drivers shared with Linux (e.g. dev/mlx5/mlx5_core/mlx5_main.c
and ofed/drivers/net/mlx4/main.c.  Both of them use
pci_request_regions(), which doesn't work properly with the PCI BAR
lazy allocation, because pci_resource_type() -> _pci_get_rle() always
returns NULL, so pci_request_regions() doesn't have the opportunity to
invoke bus_alloc_resource_any().  We now use pci_find_bar() in
pci_resource_type(), which is able to locate all available PCI BARs
even if some of them will be lazy allocated.

Submitted by:	Dexuan Cui <decui microsoft com>
Reviewed by:	hps
MFC after:	1 week
Sponsored by:	Microsoft
Differential Revision:	https://reviews.freebsd.org/D8071
2016-09-30 05:51:11 +00:00
Hans Petter Selasky
bdff61f849 The IORESOURCE_XXX defines should resemble a bitmask while SYS_RES_XXX
are not bitmasks. Fix return value of pci_resource_flags() to reflect
this change.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-09-29 14:35:32 +00:00
Mateusz Guzik
02274c93c3 cloudabi: use fget_cap instead of hand-rolling capability read
This has a side effect of unbreaking the build after r306272.

Discussed with:		ed
2016-09-23 23:08:23 +00:00
Mariusz Zaborski
85b0f9de11 capsicum: propagate rights on accept(2)
Descriptor returned by accept(2) should inherits capabilities rights from
the listening socket.

PR:		201052
Reviewed by:	emaste, jonathan
Discussed with:	many
Differential Revision:	https://reviews.freebsd.org/D7724
2016-09-22 09:58:46 +00:00
Mark Johnston
bdaf6d6913 Regenerate syscall provider argument strings. 2016-09-22 04:50:03 +00:00
Konstantin Belousov
643f6f47fd Add PROC_TRAPCAP procctl(2) controls and global sysctl kern.trap_enocap.
Both can be used to cause processes in capability mode to receive
SIGTRAP when ENOTCAPABLE or ECAPMODE errors are returned from
syscalls.

Idea by:	emaste
Reviewed by:	oshogbo (previous version), emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D7965
2016-09-21 08:23:33 +00:00
Ed Maste
df4336ddfa Catch up to sys/capability.h rename to sys/capsicum.h in r263232
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2016-09-19 18:44:43 +00:00
Konstantin Belousov
94ffabd5ce Regen. 2016-09-18 22:03:26 +00:00
Konstantin Belousov
84a30d33a3 Add compat32 support for capsicum.
Reviewed by:	bapt, emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7942
2016-09-18 22:03:07 +00:00
Dmitry Chagin
ede2869c4c Implement BLKSSZGET ioctl for the Linuxulator.
PR:		212700
Submitted by:	Erik Cederstrand
Reported by:	Erik Cederstrand
MFC after:	1 week
2016-09-17 08:10:01 +00:00
Mateusz Guzik
dbfe4b2673 linprocfs: garbage collect meminfo fields not present in linux
In particular memshared not only does not exist in linux, it was
extremely expensive to calculate.
2016-09-16 03:36:43 +00:00
John Baldwin
38605d7312 Remove 'cpu' and 'cpu_class' on amd64.
The 'cpu' and 'cpu_class' variables were always set to the same value
on amd64 and are legacy holdovers from i386.  Remove them entirely on
amd64.

Reviewed by:	imp, kib (older version)
Differential Revision:	https://reviews.freebsd.org/D7888
2016-09-15 17:05:54 +00:00
Brooks Davis
ef13681631 Remove a pointless translation of struct ioc_toc_header.
struct ioc_toc_header will be the same size (and thus IOREADTOCHEADER
will have the same value on all supported platforms).

Sponsored by:	DARPA, AFRL
2016-09-08 00:38:50 +00:00
Ed Schouten
102754b3cf Add missing header dependency.
This header depends on sigaltstack32 being declared.
2016-08-24 09:57:19 +00:00
Ed Schouten
79ad79d6a4 Add source files generated from the 32-bit system call table. 2016-08-21 16:02:25 +00:00
Ed Schouten
240f8c2d51 Add CPU independent code for running 32-bits CloudABI executables.
Essentially, this is a literal copy of the code in sys/compat/cloudabi64,
except that it now makes use of 32-bits datatypes and limits. In
sys/conf/files, we now need to take care to build the code in
sys/compat/cloudabi if either COMPAT_CLOUDABI32 or COMPAT_CLOUDABI64 is
turned on.

This change does not yet include any of the CPU dependent bits. Right
now I have implementations for running i386 binaries both on i386 and
x86-64, which I will send out for review separately.
2016-08-21 16:01:30 +00:00
Ed Schouten
90f4145f82 Don't forget to define __ELF_WORD_SIZE.
Without it, we only obtain the ELF types native to the system. In this
we explicitly want the 64-bit versions.
2016-08-21 15:37:49 +00:00
Ed Schouten
47cb4d7bd0 Add a utility macro for converting 64-bit pointers to native pointers.
Right now we're casting uint64_t's to native pointers. This isn't
causing any problems right now, but if we want to provide a 32-bit
compatibility layer that works on 64-bit systems as well, this will
cause problems. Casting a uint32_t to a 64-bit pointer throws a compiler
error.

Introduce a TO_PTR() macro that casts the value to uintptr_t before
casting it to a pointer.
2016-08-21 15:36:18 +00:00
Ed Schouten
4fbc90654c Move the linker script from cloudabi64/ to cloudabi/.
It turns out that it works perfectly fine for generating 32-bits vDSOs
as well. While there, get rid of the extraneous .s file extension.
2016-08-21 15:14:06 +00:00
Ed Schouten
a953f555e1 Use the right _MAX constant.
Though uio_resid is of type ssize_t, we need to take into account that
this source file contains an implementation specific to a certain
userspace pointer size. If this file provided 32-bit implementations,
this should have used INT32_MAX, even when running a 64-bit kernel.

This change has no effect, but is simply in preparation for adding
support for running 32-bit CloudABI executables.
2016-08-21 09:32:20 +00:00
Ed Schouten
384ef4841a Use memcpy() to copy 64-bit timestamps into the syscall return values.
On 32-bit platforms, our 64-bit timestamps need to be split up across
two registers. A simple assignment to td_retval[0] will cause the top 32
bits to get lost. By using memcpy(), we will automatically either use 1
or 2 registers depending on the size of register_t.
2016-08-21 07:41:11 +00:00
Ed Schouten
7a3558bed5 Regenerate system call table after r304483. 2016-08-19 17:54:06 +00:00
Ed Schouten
a5c3c5b14f Import the new automatically generated system call table for CloudABI.
Now that we've switched over to using the vDSO on CloudABI, it becomes a
lot easier for us to phase out old features. System call numbering is no
longer something that's part of the ABI. It's fully based on names. As
long as the numbering used by the kernel and the vDSO is consistent
(which it always is), it's all right.

Let's put this to the test by removing a system call (thread_tcb_set())
that's already unused for quite some time now, but was only left intact
to serve as a placeholder. Sync in the new system call table that uses
alphabetic sorting of system calls.

Obtained from:	https://github.com/NuxiNL/cloudabi
2016-08-19 17:49:35 +00:00
George V. Neville-Neil
3e7e23332f Remove the obsolete and unused openbsd_poll system call. (Phase 2)
Reported by:	brooks
Reviewed by:	brooks, jhb
Differential Revision:	https://reviews.freebsd.org/D7548
2016-08-18 10:54:39 +00:00
George V. Neville-Neil
5cba398b0c Remove unusedd and obsolete openbsd_poll system call. (Phase 1)
Reported by:	brooks
Reviewed by:	brooks,jhb
Differential Revision:	https://reviews.freebsd.org/D7548
2016-08-18 10:50:40 +00:00
Ed Schouten
93d9ebd82e Eliminate use of sys_fsync() and sys_fdatasync().
Make the kern_fsync() function public, so that it can be used by other
parts of the kernel. Fix up existing consumers to make use of it.

Requested by:	kib
2016-08-15 20:11:52 +00:00
Ed Schouten
2256eed3eb Let CloudABI use fdatasync() as well.
Now that FreeBSD supports fdatasync() natively, we can tidy up
CloudABI's equivalent system call to use that instead.
2016-08-15 19:42:21 +00:00
Konstantin Belousov
1d2537a26a Regen after r304176, fdatasync(2) addition. 2016-08-15 19:15:46 +00:00
Konstantin Belousov
295af703a0 Add an implementation of fdatasync(2).
The syscall is a trivial wrapper around new VOP_FDATASYNC(), sharing
code with fsync(2).  For all filesystems, this commit provides the
implementation which delegates the work of VOP_FDATASYNC() to
VOP_FSYNC().  This is functionally correct but not efficient.

This is not yet POSIX-compliant implementation, because it does not
ensure that queued AIO requests are completed before returning.

Reviewed by:	mckusick
Discussed with:	avg (ZFS), jhb (AIO part)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D7471
2016-08-15 19:08:51 +00:00
Ed Schouten
13b4b4df98 Provide the CloudABI vDSO to its executables.
CloudABI executables already provide support for passing in vDSOs. This
functionality is used by the emulator for OS X to inject system call
handlers. On FreeBSD, we could use it to optimize calls to
gettimeofday(), etc.

Though I don't have any plans to optimize any system calls right now,
let's go ahead and already pass in a vDSO. This will allow us to
simplify the executables, as the traditional "syscall" shims can be
removed entirely. It also means that we gain more flexibility with
regards to adding and removing system calls.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D7438
2016-08-10 21:02:41 +00:00
Bryan Drewery
c1fa440409 Regenerate after r303755.
MFC after:	3 days
X-MFC-With:	r303755
Sponsored by:	EMC / Isilon Storage Division
2016-08-04 19:15:51 +00:00
Ed Schouten
e938ebbc0c Regenerate system call tables for r303699 and r303700. 2016-08-03 06:36:45 +00:00
Ed Schouten
a813fdc6c3 mprotect(): Change prototype to comply to POSIX.
Our mprotect() function seems to take a "const void *" address to the
pages whose permissions need to be adjusted. POSIX uses "void *". Simply
stick to the POSIX one to prevent us from writing unportable code.

PR:		211423 (exp-run)
Tested by:	antoine@ (Thanks!)
2016-08-03 06:33:04 +00:00
Brooks Davis
40018b91dd Don't create pointless backups of generated files in "make sysent".
Any sensible workflow will include a revision control system from which
to restore the old files if required.  In normal usage, developers just
have to clean up the mess.

Reviewed by:	jhb
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D7353
2016-07-28 21:29:04 +00:00
Konstantin Belousov
584b675ed6 Hide the boottime and bootimebin globals, provide the getboottime(9)
and getboottimebin(9) KPI. Change consumers of boottime to use the
KPI.  The variables were renamed to avoid shadowing issues with local
variables of the same name.

Issue is that boottime* should be adjusted from tc_windup(), which
requires them to be members of the timehands structure.  As a
preparation, this commit only introduces the interface.

Some uses of boottime were found doubtful, e.g. NLM uses boottime to
identify the system boot instance.  Arguably the identity should not
change on the leap second adjustment, but the commit is about the
timekeeping code and the consumers were kept bug-to-bug compatible.

Tested by:	pho (as part of the bigger patch)
Reviewed by:	jhb (same)
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 month
X-Differential revision:	https://reviews.freebsd.org/D7302
2016-07-27 11:08:59 +00:00
Ed Schouten
f63cd251b2 Add shmatt_t.
It looks like our "struct shmid_ds::shm_nattch" deviates from the
standard in the sense that it is a signed integer, whereas POSIX
requires that it is unsigned, having a special type shmatt_t.

Patch up our native and 32-bit copies to use a new shmatt_t that is an
unsigned integer. As it's unsigned, we can relax the comparisons that
are performed on it. Leave the Linux, iBCS2, etc. copies of the
structure alone.

Reviewed by:	ngie
Differential Revision:	https://reviews.freebsd.org/D6655
2016-07-26 17:23:49 +00:00
Gleb Smirnoff
32e0ade6c4 Partially revert r257696/r257713, which have an issue with writing to user
controlled address. Restore the old code that emulated OSIOCGIFCONF in if.c.

Noticed by:	C Turt
2016-07-24 10:10:09 +00:00
Dmitry Chagin
97d06da692 Fix a copy/paste bug introduced during X86_64 Linuxulator work.
FreeBSD support NX bit on X86_64 processors out of the box, for i386 emulation
use READ_IMPLIES_EXEC flag, introduced in r302515.

While here move common part of mmap() and mprotect() code to the files in compat/linux
to reduce code dupcliation between Linuxulator's.

Reported by:    Johannes Jost Meixner, Shawn Webb

MFC after:	1 week
XMFC with:	r302515, r302516
2016-07-10 08:22:04 +00:00
Dmitry Chagin
23e8912c60 Implement Linux personality() system call mainly due to READ_IMPLIES_EXEC flag.
In Linux if this flag is set, PROT_READ implies PROT_EXEC for mmap().
Linux/i386 set this flag automatically if the binary requires executable stack.

READ_IMPLIES_EXEC flag will be used in the next Linux mmap() commit.
2016-07-10 08:15:50 +00:00
Dmitry Chagin
3a49978f45 Fix a bug introduced in r283433.
[1] Remove unneeded sockaddr conversion before kern_recvit() call as the from
argument is used to record result (the source address of the received message) only.

[2] In Linux the type of msg_namelen member of struct msghdr is signed but native
msg_namelen has a unsigned type (socklen_t). So use the proper storage to fetch fromlen
from userspace and than check the user supplied value and return EINVAL if it is less
than 0 as a Linux do.

Reported by:	Thomas Mueller <tmueller at sysgo dot com> [1]
Reviewed by:	kib@
Approved by:	re (gjb, kib)
MFC after:	3 days
2016-06-26 16:59:59 +00:00
Brooks Davis
722bc9c00b Regen post r302096 and implement svr4_pipe().
Approved by:	re (implict, fixing build)
Sponsored by:	DARPA, AFRL
2016-06-23 00:30:09 +00:00
Brooks Davis
2b4a6471f4 Declare a svr4 version of pipe() now that sys_pipe() is no more.
Approved by:	re (implicit, fixing build)
Sponsored by:	DARPA, AFRL
2016-06-23 00:29:03 +00:00
Brooks Davis
a72c64b0b6 Generate syscall tables and update pipe() implementation after r302094.
Mark the pipe() system call as COMPAT10.

As of r302092 libc uses pipe2() with a zero flags value instead of pipe().

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6816
2016-06-22 21:18:19 +00:00
Brooks Davis
e16e64098c Mark the pipe() system call as COMPAT10.
As of r302092 libc uses pipe2() with a zero flags value instead of pipe().

Commit with regenerated files and implementation to follow.

Approved by:	re (gjb)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D6816
2016-06-22 21:15:59 +00:00
Konstantin Belousov
5c2cf81845 Update comments for the MD functions managing contexts for new
threads, to make it less confusing and using modern kernel terms.

Rename the functions to reflect current use of the functions, instead
of the historic KSE conventions:
  cpu_set_fork_handler -> cpu_fork_kthread_handler (for kthreads)
  cpu_set_upcall -> cpu_copy_thread (for forks)
  cpu_set_upcall_kse -> cpu_set_upcall (for new threads creation)

Reviewed by:	jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Approved by:	re (hrs)
Differential revision:	https://reviews.freebsd.org/D6731
2016-06-16 12:05:44 +00:00
Mark Johnston
8de3effe5e Add a missing error check for a malloc() call in idr_get().
Submitted by:	Matt Joras <mjoras@isilon.com>
Approved by:	re (gjb)
MFC after:	1 week
Sponsored by:	EMC / Isilon Storage Division
2016-06-14 03:57:00 +00:00
Konstantin Belousov
eb00e5b783 swap_dev_info() does not require Giant, so Giant locking around
the loop in linprocfs_doswaps() is useless.
List of the registered filesystems is protected by vfsconf_sx,
not by the Giant.  Adjust linprocfs_dofilesystems() correspondingly.

Approved by:	re (delphij), des (linprocfs maintainer)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2016-06-12 11:13:38 +00:00
Hans Petter Selasky
d8571d3ea3 Fallback to arc4rand() in the LinuxKPI when read_random() returns
zero. This can happen for virtual machines.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-06-07 13:10:13 +00:00
Gleb Smirnoff
34e05ebe72 Fix kernel stack disclosures in the Linux and 4.3BSD compat layers.
Submitted by:	CTurt
Security:	SA-16:20
Security:	SA-16:21
2016-05-31 16:56:30 +00:00
Hans Petter Selasky
8eeb3e1773 The SCHEDULER_STOPPED() macro already contains a predict false statement.
Remove superfluous unlikely() wrapper.

Suggested by:	glebius
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-27 07:33:49 +00:00
Hans Petter Selasky
2bb46d5516 Define ATOMIC_LONG_INIT() in the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-26 10:03:22 +00:00
Hans Petter Selasky
06ca64ecf3 Add support for runtime modifiable module parameters in the LinuxKPI.
Linux module parameters have a permissions value. If any write bits
are set we are allowed to modify the module parameter runtime. Reflect
this when creating the static SYSCTL nodes.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-26 09:04:14 +00:00
Hans Petter Selasky
707324edee Add more module parameter macros to the LinuxKPI.
Obtained from:	kmacy @
Sponsored by:	Mellanox Technologies
2016-05-26 08:47:06 +00:00
Hans Petter Selasky
0bb3dd300b Add support for boolean module parameters in the LinuxKPI.
Requested by:	kmacy @
Sponsored by:	Mellanox Technologies
2016-05-26 08:44:11 +00:00
Hans Petter Selasky
1d9b99e5e3 Implement Linux module parameters as read-only tunable SYSCTLs.
Bool module parameters are no longer supported, because there is no
equivalent in FreeBSD.

There are two macros available which control the behaviour of the
LinuxKPI module parameters:

- LINUXKPI_PARAM_PARENT allows the consumer to set the SYSCTL parent
where the modules parameters will be created.

- LINUXKPI_PARAM_PREFIX defines a parameter name prefix, which is
  added to all created module parameters.

Sponsored by:	Mellanox Technologies
MFC after:	1 week
2016-05-25 12:12:14 +00:00
Hans Petter Selasky
8571421886 Add checks for SCHEDULER_STOPPED() so that code using the LinuxKPI can
run after a panic(). This for example allows a LinuxKPI based graphics
stack to receive prints during a panic.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-25 09:04:06 +00:00
Kevin Lo
8636496407 Add __iowrite32_copy() to the Linux kernel compatibility layer.
Reviewed by:	hselasky
2016-05-24 09:23:04 +00:00
Hans Petter Selasky
9183a497e7 Use the DROP_GIANT() and PICKUP_GIANT() macros instead of making
assumptions about how the Giant mutex is locked.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-24 07:52:53 +00:00
Hans Petter Selasky
3ce1263063 Set "current" for all PCI enumeration callbacks.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-24 07:46:20 +00:00
Hans Petter Selasky
5a6748b2cf Use make_dev_s() instead of make_dev() to avoid race setting
"si_drv1". Convert panic() into regular error while at it.

Suggested by:	jhb @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-24 07:06:04 +00:00
Dmitry Chagin
ab610366b5 Don't leak fp in case where fo_ioctl() returns an error.
Reported by:	C Turt <ecturt@gmail.com>
MFC after:	1 week
2016-05-24 05:29:41 +00:00
Hans Petter Selasky
44701cf732 Implement "atomic_long_add_unless()" in the LinuxKPI and fix the
implementation of "atomic_long_inc_not_zero()".

Found by:	ngie @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 16:19:51 +00:00
Hans Petter Selasky
8b68f2509f A missing definition needed by ktime_to_ms().
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 13:19:20 +00:00
Hans Petter Selasky
425da8eb61 Fix some data types and add "inline" keyword for __reg_op() function.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 13:18:15 +00:00
Hans Petter Selasky
83cfd83419 Implement ror32() in the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:53:17 +00:00
Hans Petter Selasky
fb2faed84e Define more copy to/from userspace functions in the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:52:22 +00:00
Hans Petter Selasky
aef2a67b83 Add more printf() related functions to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:35:07 +00:00
Hans Petter Selasky
3092e0c54c Set an invalid IRQ number when no PCI IRQ is available in the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:13:16 +00:00
Hans Petter Selasky
7dfa8b2c4e Add more ktime related functions to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:10:28 +00:00
Hans Petter Selasky
08a5e6ec7f Implement "kref_put_mutex()" for the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:06:34 +00:00
Hans Petter Selasky
aad02fb444 Add more list_xxx() functions to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 12:03:40 +00:00
Hans Petter Selasky
0f8f7f554b Make header file standalone by including definitions for needed
linux_wait_xxx() functions.

Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:57:23 +00:00
Hans Petter Selasky
94a201be43 Implement "_outb()" to the LinuxKPI for i386 and amd64 only.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:53:00 +00:00
Hans Petter Selasky
ed5f781270 Add support for "cdev_add_ext()" to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:50:05 +00:00
Hans Petter Selasky
299f29203a Add more GFP related defines to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:47:54 +00:00
Hans Petter Selasky
4a0f827906 Add support for atomic_long_inc_not_zero() to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:44:46 +00:00
Hans Petter Selasky
b5c541821a Add support for atomic_long_inc_not_zero() to the LinuxKPI.
Obtained from:	kmacy @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2016-05-23 11:41:35 +00:00
Dmitry Chagin
df964aa45d Convert proto family in both directions. The linux and native values for
local and inet are identical, but for inet6 values differ.

PR:		155040
Reported by:	Simon Walton
MFC after:	2 week
2016-05-22 19:08:29 +00:00
Pedro F. Giffuni
91460148b2 ndis(4): Undo unneeded workarounds in ndis' rand().
- Revert the change for seed(0) in r300384. I misunderstood the standard
and while our random() implementation in libkern may be improved, it
handles the seed(0) case fine.

Pointed out by:	bde, ache
2016-05-22 14:13:20 +00:00