Commit Graph

3207 Commits

Author SHA1 Message Date
Hans Petter Selasky
f698bc4d76 Define __poll_t type in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-06 08:35:16 +00:00
Hans Petter Selasky
62baacef3f Implement ktime_add_ms() and ktime_before() in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-08-03 09:02:57 +00:00
Hans Petter Selasky
a76c73aadc Don't refer to non-existing atomic functions, even though not compiled,
in the LinuxKPI.

Found by:		rpolka @
MFC after:		1 week
Sponsored by:		Mellanox Technologies
2018-08-01 19:10:46 +00:00
Konstantin Belousov
87842989f8 Add ioctl to conveniently mmap a PCI device BAR into userspace.
Add the ioctl PCIOCBARMMAP on /dev/pci to conveniently create
userspace mapping of a PCI device BAR.  This is enormously superior to
read the BAR value with PCIOCREAD and then try to mmap /dev/mem, and
should allow to automatically activate the mapped BARs when needed in
future.

Current implementation creates new sg pager for each user mmap
request.  If the pointer (and reference) to a managed device pager is
stored in pci_map, we would be able to revoke all mappings on the BAR
deactivation or relocation.  This is related to the unimplemented BAR
activation on mmap, and is postponed for the future.

Discussed with:	imp, jhb
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D15583
2018-08-01 18:58:24 +00:00
Konstantin Belousov
f2f5c2152c Regenerate after r336980. 2018-07-31 17:16:06 +00:00
Konstantin Belousov
3de1e9aa1e Provide compat32 shims for sched_rr_get_interval(2).
The interface uses struct timespec, which needs a translation.

Reported and reviewed by:	asomers
PR:	230175
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D16525
2018-07-31 17:15:31 +00:00
Alan Somers
6040822c4e Make timespecadd(3) and friends public
The timespecadd(3) family of macros were imported from NetBSD back in
r35029. However, they were initially guarded by #ifdef _KERNEL. In the
meantime, we have grown at least 28 syscalls that use timespecs in some
way, leading many programs both inside and outside of the base system to
redefine those macros. It's better just to make the definitions public.

Our kernel currently defines two-argument versions of timespecadd and
timespecsub.  NetBSD, OpenBSD, and FreeDesktop.org's libbsd, however, define
three-argument versions.  Solaris also defines a three-argument version, but
only in its kernel.  This revision changes our definition to match the
common three-argument version.

Bump _FreeBSD_version due to the breaking KPI change.

Discussed with:	cem, jilles, ian, bde
Differential Revision:	https://reviews.freebsd.org/D14725
2018-07-30 15:46:40 +00:00
Alan Somers
9dea3ac8c2 freebsd32_getrusage(2): skip freebsd32_rusage_out on error
PR:		230153
Reported by:	kib
MFC after:	2 weeks
X-MFC-With:	336871
Differential Revision:	https://reviews.freebsd.org/D16500
2018-07-29 19:20:13 +00:00
Alan Somers
5cf35a100c getrusage(2): fix return value under 32-bit emulation
According to the man page, getrusage(2) should return EFAULT if the rusage
argument lies outside of the process's address space. But due to an
oversight in r100384, that's never been the case during 32-bit emulation.
Fix it.

PR:		230153
Reported by:	tests(7)
Reviewed by:	cem
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D16500
2018-07-29 18:22:26 +00:00
Brooks Davis
942ae5c8b8 Regen after r336171. 2018-07-10 14:04:52 +00:00
Brooks Davis
7cc923f8a8 Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary very called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

This is a respin of r335983 with a workaround for the ancient BFD linker
in the libc stubs.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D16193
2018-07-10 13:32:04 +00:00
Warner Losh
971b5f7632 Create PCI_MATCH and pci_match_device
Create a covenience function to match PCI device IDs. It's about 15
years overdue.

Differential Revision: https://reviews.freebsd.org/D15999
2018-07-07 15:25:11 +00:00
Andrew Turner
2bf9501287 Create a new macro for static DPCPU data.
On arm64 (and possible other architectures) we are unable to use static
DPCPU data in kernel modules. This is because the compiler will generate
PC-relative accesses, however the runtime-linker expects to be able to
relocate these.

In preparation to fix this create two macros depending on if the data is
global or static.

Reviewed by:	bz, emaste, markj
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D16140
2018-07-05 17:13:37 +00:00
Brooks Davis
714c03c81e Revert r335983.
The bfd linker in tree doesn't support multiple names for the same
symbol (at least with current flags).
2018-07-05 16:03:03 +00:00
Brooks Davis
5b04a71dae Get rid of netbsd_lchown and netbsd_msync syscall entries.
No valid FreeBSD binary ever called them (they would call lchown and
msync directly) and we haven't supported NetBSD binaries in ages.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15814
2018-07-05 14:12:56 +00:00
Mariusz Zaborski
0a111358cb Regen after 335900.
PR:		228671
2018-07-03 18:18:29 +00:00
Mariusz Zaborski
2c4a6a5d12 capsicum: add getdirentries to the freebsd32 compact
There is a getdirentries syscall in freebsd32 and it's
capability ready so allow calling it in the capability mode.

PR:		228671
2018-07-03 18:17:19 +00:00
Ed Maste
e8a1ec3e05 Split kern_break from sys_break and use it in linuxulator
Previously the linuxulator's linux_brk invoked the FreeBSD sys_break
syscall implementation directly.  Instead, move the bulk of the existing
implementation to kern_break, and call that from both sys_break and
linux_brk.

This also addresses a minor bug in linux_brk in that we now return the
actual (rounded up) break address, rather than the requested value.

Reviewed by:	brooks (earlier version)
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D16019
2018-06-27 14:45:13 +00:00
Ed Maste
9c42fa94a6 Quiet unused fn warning for linuxulator w/o legacy syscalls
Sponsored by:	Turing Robotic Industries
2018-06-25 19:24:50 +00:00
Chuck Tuffli
4c4317193b Fix output of linprocfs stat entry
The Linux /proc/stat entry has grown over time

 v2.5.41 <
   user, nice, system, idle
 v2.5.41
   user, nice, system, idle, iowait, irq
 v2.6.11
   user, nice, system, idle, iowait, irq, softirq, steal
 v2.6.24
   user, nice, system, idle, iowait, irq, softirq, steal, guest
 v2.6.32 >
   user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice

Some applications (e.g. nodejs) depend on the correct number of entries
and will abort otherwise.

Fix is to print the correct number of entries based on the value of
osrelease set either in sysctl or the jail settings. Change is similar
to approach used by illumos.

Reviewed by: emaste, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15858
2018-06-22 00:02:05 +00:00
Chuck Tuffli
3575504976 Fix the Linux kernel version number calculation
The Linux compatibility code was converting the version number (e.g.
2.6.32) in two different ways and then comparing the results.

The linux_map_osrel() function converted MAJOR.MINOR.PATCH similar to
what FreeBSD does natively. I.e. where major=v0, minor=v1, and patch=v2
    v = v0 * 1000000 + v1 * 1000 + v2;

The LINUX_KERNVER() macro, on the other hand, converted the value with
bit shifts. I.e. where major=a, minor=b, and patch=c
    v = (((a) << 16) + ((b) << 8) + (c))

The Linux kernel uses the later format via the KERNEL_VERSION() macro in
include/generated/uapi/linux/version.h

Fix is to use the LINUX_KERNVER() macro in linux_map_osrel() as well as
in the .trans_osrel functions.

PR: 229209
Reviewed by: emaste, cem, imp (mentor)
Approved by: imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15952
2018-06-22 00:02:03 +00:00
Konstantin Belousov
31665c1abb linux_clone_thread: mark new thread as TDB_BORN.
So that the ptrace code will catch it and report it to attached
debugger.  Enables debugging of threaded Linux binaries with FreeBSD
debugger.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D15880
2018-06-21 21:15:04 +00:00
Ed Maste
aec1e6d390 linuxulator: handle V3 capget/capset
Linux 2.6.26 introduced 64-bit capability sets.  Extend our stub
implementation to handle both 32- and 64-bit.  (We still report no
capabilities in capget, and disallow any in capset.)

Reviewed by:	chuck
Sponsored by:	Turing Robotic Industries Inc.
Differential Revision:	https://reviews.freebsd.org/D15887
2018-06-19 21:26:23 +00:00
Ed Maste
645f3d4345 linuxulator: add debugging for invalid capget/capset version
Sponsored by:	Turing Robotic Industries Inc.
2018-06-18 18:43:45 +00:00
Ed Maste
bbc8294729 linsysfs: depend on linux_common module on arm64, as on amd64
Sponsored by:	Turing Robotic Industries
2018-06-18 13:26:45 +00:00
Dimitry Andric
5494dcfa1c Fix build of ndis with base gcc on i386
Casting from rman_res_t to a pointer results in "cast to pointer from
integer of different size" warnings with base gcc on i386, so use an
intermediate cast to uintptr_t to suppress it.  In this case, the I/O
port range is effectively limited to the range of 0..65535.

Reviewed by:	imp
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D15746
2018-06-17 19:24:40 +00:00
Chuck Tuffli
d1fa346fac Add linprocfs support for min_free_kbytes
This adds linprocfs support for proc/sys/vm/min_free_kbytes which the
free program requires for correct operation. The approach mirrors the
approach used in illumos.

Reviewed by:	imp (mentor), emaste
Approved by:	imp (mentor)
Differential Revision: https://reviews.freebsd.org/D15563
2018-06-15 15:22:27 +00:00
Ed Maste
931e2a1a6e linuxulator: do not include legacy syscalls on arm64
Existing linuxulator platforms (i386, amd64) support legacy syscalls,
such as non-*at ones like open, but arm64 and other new platforms do
not.

Wrap these in #ifdef LINUX_LEGACY_SYSCALLS, #defined in the MD linux.h
files.  We may need finer grained control in the future but this is
sufficient for now.

Reviewed by:	andrew
Sponsored by:	Turing Robotic Industries
Differential Revision:	https://reviews.freebsd.org/D15237
2018-06-15 14:41:51 +00:00
Ed Maste
4b842782da Correct debug control for linuxulator faccessat
The Linuxulator provides per-syscall debug control via the
compat.linux.debug sysctl.  There's generally a 1:1 mapping between
sysctl setting and syscall, but faccessat was controlled by the access
setting, perhaps due to copy-paste.

Sponsored by:	Turing Robotic Industries
2018-06-15 14:29:41 +00:00
Konstantin Belousov
1f3bc15763 linprocfs: add TracerPid to /proc/pid/status.
Also fix the value of parent pid if the process is traced.

Submitted by:	Yanko Yankulov <yanko.yankulov@gmail.com>
MFC after:	1 week
2018-06-15 13:56:58 +00:00
Ed Maste
6a1d1e0ca3 Add stubbed arm64 linuxulator /proc/cpuinfo handler
Sponsored by:	Turing Robotic Industries
2018-06-15 13:53:37 +00:00
Brooks Davis
7d87c005da Regen after 335177 (rename sys_obreak to sys_break). 2018-06-14 21:29:31 +00:00
Brooks Davis
9da5364ed9 Name the implementation of brk and sbrk sys_break().
The break() system call was renamed (several times) starting in v3
AT&T UNIX when C was invented and break was a language keyword. The
last vestage of a need for it to be called something else (eg obreak)
was removed in r225617 which consistantly prefixed all syscall
implementations.

Reviewed by:	emaste, kib (older version)
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15638
2018-06-14 21:27:25 +00:00
Bruce Evans
407a812657 Oops, r335053 had an old version of the comment about 16-bit linux dev_t
translation.
2018-06-13 12:44:45 +00:00
Bruce Evans
ab35e1c71b Fix the encoding of major and minor numbers in 64-bit dev_t by restoring
the old encodings for the lower 16 and 32 bits and only using the
higher 32 bits for unusually large major and minor numbers.  This
change breaks compatibility with the previous encoding (which was only
used in -current).

Fix truncation to (essentially) 16-bit dev_t in newnfs v3.

Any encoding of device numbers gives an ABI, so it can't be changed
without translations for compatibility.  Extra bits give the much
larger complication that the translations need to compress into fewer
bits.  Fortunately, more than 32 bits are rarely needed, so
compression is rarely needed except for 16-bit linux dev_t where it
was always needed but never done.

The previous encoding moved the major number into the top 32 bits.
Almost no translation code handled this, so the major number was blindly
truncated away in most 32-bit encodings.  E.g., for ffs, mknod(8) with
major = 1 and minor = 2 gave dev_t = 0x10000002; ffs cannot represent
this and blindly truncated it to 2.  But if this mknod was run on any
released version of FreeBSD, it gives dev_t = 0x102.  ffs can represent
this, but in the previous encoding it was not decoded, giving major = 0,
minor = 0x102.

The presence of bugs was most obvious for exporting dev_t's from an
old system to -current, since bugs in newnfs augment them.  I fixed
oldnfs to support 32-bit dev_t in 1996 (r16634), but this regressed
to 16-bit dev_t in newnfs, first to the old 16-bit encoding and then
further in -current.  E.g., old ad0 with major = 234, minor = 0x10002
had the correct (major, minor) number on the wire, but newnfs truncated
this to (234, 2) and then the previous encoding shifted the major
number into oblivion as seen by ffs or old applications.

I first tried to fix this by translating on every ABI/API boundary, but
there are too many boundaries and too many sloppy translations by blind
truncation.  So use the old encoding for the low 32 bits so that sloppy
translations work no worse than before provided the high 32 bits are
not set.  Add some error checking for when bits are lost.  Keep not
doing any error checking for translations for almost everything in
compat/linux.

compat/freebsd32/freebsd32_misc.c:
Optionally check for losing bits after possibly-truncating assignments as
before.

compat/linux/linux_stats.c:
Depend on the representation being compatible with Linux's (or just with
itself for local use) and spell some of the translations as assignments in
a macro that hides the details.

fs/nfsclient/nfs_clcomsubs.c:
Essentially the same fix as in 1996, except there is now no possible
truncation in makedev() itself.  Also fix nearby style bugs.

kern/vfs_syscalls.c:
As for freebsd32.  Also update the sysctl description to include file
numbers, and change it to describe device ids as device numbers.

sys/types.h:
Use inline functions (wrapped by macros) since the expressions are now
a bit too complicated for plain macros.  Describe the encoding and
some of the reasons for it.  16-bit compatibility didn't leave many
reasonable choices for the 32-bit encoding, and 32-bit compatibility
doesn't leave many reasonable choices for the 64-bit encoding.  My
choice is to put the 8 new minor bits in the low 8 bits of the top 32
bits.  This minimizes discontiguities.

Reviewed by:	kib (except for rewrite of the comment in linux_stats.c)
2018-06-13 12:22:00 +00:00
Bruce Evans
372639f944 Fix some bugs found while fixing the representation and translation
of 64-bit dev_t's (but not ones involving dev_t's).

st_size was supposed to be clamped in cvtstat() and linux's copy_stat(),
but the clamping code wasn't aware that st_size is signed, and also had
an obfuscated off-by-1 value for the unsigned limit, so its effect was
to produce a bizarre negative size instead of clamping.

Change freebsd32's copy_ostat() to be no worse than cvtstat().  It was
missing clamping and bzero()ing of padding.

Reviewed by:	kib (except a final fix of the clamp to the signed maximum)
2018-06-13 08:50:43 +00:00
Hans Petter Selasky
986e3bed8b Implement the ip_eth_mc_map() function in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-06-12 08:43:49 +00:00
Hans Petter Selasky
35555d474b Implement the kstrtobool() and kstrtobool_from_user() functions
in the LinuxKPI.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-11 16:26:33 +00:00
Hans Petter Selasky
93ed9ab20b Implement the user_access_begin(), user_access_end(), usafe_get_user() and
unsafe_put_user() function macros in the LinuxKPI.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-11 15:42:29 +00:00
Hans Petter Selasky
71ee95ddf7 Define ARCH_KMALLOC_MINALIGN in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-07 11:44:11 +00:00
Hans Petter Selasky
d150e15285 Wrap timespec64 into timespec in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-07 11:41:42 +00:00
Hans Petter Selasky
a041e75a34 Move the EXPORT_SYMBOL_XXX() function macros into own header file.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-07 11:34:59 +00:00
Hans Petter Selasky
422d8af4df Implement the dev_pm_set_driver_flags() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-07 11:29:07 +00:00
Hans Petter Selasky
40ddfc7604 Make some list functions RCU safe in the LinuxKPI.
While at it rename hlist_add_after() into hlist_add_behind().

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 15:49:01 +00:00
Hans Petter Selasky
17e2a84e9a Rewrite code using atomic_fcmpset_int() in the LinuxKPI.
Suggested by:	mjg@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 15:31:47 +00:00
Hans Petter Selasky
e6e028d01f Implement the __add_wait_queue_entry_tail() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 15:19:30 +00:00
Hans Petter Selasky
7e95e98db8 Implement the might_sleep_if() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 15:10:11 +00:00
Hans Petter Selasky
ab98f1e8d8 Rename two structure field members while keeping backwards compatibility in
the LinuxKPI. Add a comment saying in which Linux version this change was made.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 15:06:21 +00:00
Hans Petter Selasky
1b092623b2 Implement the init_wait_entry() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 14:59:23 +00:00
Hans Petter Selasky
23dcf4359e Implement the atomic_dec_if_positive() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 13:59:51 +00:00
Hans Petter Selasky
9e067b2256 Implement the ktime_compare() and ktime_after() functions in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 13:37:31 +00:00
Hans Petter Selasky
a16387c1d4 Implement the rdmsrl_safe() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-06 13:29:52 +00:00
Hans Petter Selasky
7a13eeba18 Declare and set the global "system_highpri_wq" workqueue structure pointer
in the LinuxKPI.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:49:35 +00:00
Hans Petter Selasky
c6d920309c Implement the INIT_DELAYED_WORK_ONSTACK() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:46:16 +00:00
Hans Petter Selasky
7f346854b8 Define the __kernel_size_t type in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:42:35 +00:00
Hans Petter Selasky
5a1d03bb7c Implement the task_pid_vnr() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:40:09 +00:00
Hans Petter Selasky
747d9a8165 Add "access" function pointer to the "vm_operations_struct" structure
in the LinuxKPI. While at it document when to use the "virtual_address" or
the "address" field in the "vm_fault" structure.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:37:28 +00:00
Hans Petter Selasky
0d2dce0b78 Implement mul_u32_u32() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:30:36 +00:00
Hans Petter Selasky
f446b7cab4 Implement timer_setup() and from_timer() function macros in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-05 15:20:20 +00:00
Mark Johnston
97bc9a9384 Regen after r334626. 2018-06-04 19:36:47 +00:00
Mark Johnston
9f9c9b22ec Reimplement brk() and sbrk() to avoid the use of _end.
Previously, libc.so would initialize its notion of the break address
using _end, a special symbol emitted by the static linker following
the bss section.  Compatibility issues between lld and ld.bfd could
cause the wrong definition of _end (libc.so's definition rather than
that of the executable) to be used, breaking the brk()/sbrk()
interface.

Avoid this problem and future interoperability issues by simply not
relying on _end.  Instead, modify the break() system call to return
the kernel's view of the current break address, and have libc
initialize its state using an extra syscall upon the first use of the
interface.  As a side effect, this appears to fix brk()/sbrk() usage
in executables run with rtld direct exec, since the kernel and libc.so
no longer maintain separate views of the process' break address.

PR:		228574
Reviewed by:	kib (previous version)
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D15663
2018-06-04 19:35:15 +00:00
Hans Petter Selasky
57a865f808 Implement the __sg_alloc_table_from_pages() function based on the existing
sg_alloc_table_from_pages() function in the LinuxKPI.

This basically allow segments to have a limit, max_segment.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-01 12:09:07 +00:00
Hans Petter Selasky
6fad8d171a Implement radix_tree_iter_delete() in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-01 11:42:09 +00:00
Hans Petter Selasky
0a85496223 Improve high resolution timer support in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-01 11:33:14 +00:00
Hans Petter Selasky
f03ae7e802 Add more GFP macro definitions in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-06-01 11:14:59 +00:00
Hans Petter Selasky
13a5c70b91 Implement support for the PCI_BUS_NUM() function macro in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 13:17:34 +00:00
Hans Petter Selasky
f1aa567bfe Implement support for the kvmalloc_array() function in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 13:13:08 +00:00
Hans Petter Selasky
f6d4552417 Correct macroname in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 12:55:38 +00:00
Hans Petter Selasky
cbea4f294f Define __initconst in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 12:50:42 +00:00
Hans Petter Selasky
7ce7605ece Implement bitmap_complement() in the LinuxKPI.
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 12:48:24 +00:00
Hans Petter Selasky
69d6653ba4 Implement idr_is_empty() in the LinuxKPI and make idr_remove() API compatible
with upstream Linux by returning the pointer to the removed element.

Submitted by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-31 12:35:21 +00:00
Brooks Davis
64b378f1e1 Remove alternative names that are identical to the default.
Verified by make sysent producing no changes.
2018-05-30 22:22:58 +00:00
Hans Petter Selasky
666cda9e0c The schedule_timeout_killable() function should listen for signals
in the LinuxKPI.

Found by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-28 11:26:40 +00:00
Hans Petter Selasky
bd40dea7e2 Implement wait_event_killable() in the LinuxKPI.
Requested by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-28 10:54:24 +00:00
Hans Petter Selasky
2a3ec12831 Allow TASK_PARKED bit being set when going to sleep in the LinuxKPI.
Found by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-05-28 10:51:39 +00:00
Brooks Davis
659a2e9243 Regen after r334223: make vadvise compat freebsd11. 2018-05-25 20:41:26 +00:00
Brooks Davis
7351a8bdb5 Make vadvise compat freebsd11.
The vadvise syscall (aka ovadvise) is undocumented and has always been
implmented as returning EINVAL.  Put the syscall under COMPAT11 and
provide a userspace implementation.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15557
2018-05-25 20:40:23 +00:00
Matt Macy
4f6c66cc9c UDP: further performance improvements on tx
Cumulative throughput while running 64
  netperf -H $DUT -t UDP_STREAM -- -m 1
on a 2x8x2 SKL went from 1.1Mpps to 2.5Mpps

Single stream throughput increases from 910kpps to 1.18Mpps

Baseline:
https://people.freebsd.org/~mmacy/2018.05.11/udpsender2.svg

- Protect read access to global ifnet list with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender3.svg

- Protect short lived ifaddr references with epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender4.svg

- Convert if_afdata read lock path to epoch
https://people.freebsd.org/~mmacy/2018.05.11/udpsender5.svg

A fix for the inpcbhash contention is pending sufficient time
on a canary at LLNW.

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15409
2018-05-23 21:02:14 +00:00
Matt Macy
d7c5a620e2 ifnet: Replace if_addr_lock rwlock with epoch + mutex
Run on LLNW canaries and tested by pho@

gallatin:
Using a 14-core, 28-HTT single socket E5-2697 v3 with a 40GbE MLX5
based ConnectX 4-LX NIC, I see an almost 12% improvement in received
packet rate, and a larger improvement in bytes delivered all the way
to userspace.

When the host receiving 64 streams of netperf -H $DUT -t UDP_STREAM -- -m 1,
I see, using nstat -I mce0 1 before the patch:

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
4.98   0.00   4.42   0.00 4235592     33   83.80 4720653 2149771   1235 247.32
4.73   0.00   4.20   0.00 4025260     33   82.99 4724900 2139833   1204 247.32
4.72   0.00   4.20   0.00 4035252     33   82.14 4719162 2132023   1264 247.32
4.71   0.00   4.21   0.00 4073206     33   83.68 4744973 2123317   1347 247.32
4.72   0.00   4.21   0.00 4061118     33   80.82 4713615 2188091   1490 247.32
4.72   0.00   4.21   0.00 4051675     33   85.29 4727399 2109011   1205 247.32
4.73   0.00   4.21   0.00 4039056     33   84.65 4724735 2102603   1053 247.32

After the patch

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
5.43   0.00   4.20   0.00 3313143     33   84.96 5434214 1900162   2656 245.51
5.43   0.00   4.20   0.00 3308527     33   85.24 5439695 1809382   2521 245.51
5.42   0.00   4.19   0.00 3316778     33   87.54 5416028 1805835   2256 245.51
5.42   0.00   4.19   0.00 3317673     33   90.44 5426044 1763056   2332 245.51
5.42   0.00   4.19   0.00 3314839     33   88.11 5435732 1792218   2499 245.52
5.44   0.00   4.19   0.00 3293228     33   91.84 5426301 1668597   2121 245.52

Similarly, netperf reports 230Mb/s before the patch, and 270Mb/s after the patch

Reviewed by:	gallatin
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D15366
2018-05-18 20:13:34 +00:00
Brooks Davis
e15f0023ed Allow freebsd32 __sysctl(2) to return ENOMEM.
This is required by programs like sockstat that read variably sized
sysctls such as kern.file.  The normal path has no such restriction and
the restriction was added without comment along with initial support for
freebsd32 in 2002 (r100384).

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D15438
2018-05-15 16:24:58 +00:00
Mark Johnston
e3d5c4ade1 Remove "All rights reserved" from my files.
See r333391 for the rationale.

MFC after:	1 week
2018-05-09 20:57:18 +00:00
Matt Macy
cbd92ce62e Eliminate the overhead of gratuitous repeated reinitialization of cap_rights
- Add macros to allow preinitialization of cap_rights_t.

- Convert most commonly used code paths to use preinitialized cap_rights_t.
  A 3.6% speedup in fstat was measured with this change.

Reported by:	mjg
Reviewed by:	oshogbo
Approved by:	sbruno
MFC after:	1 month
2018-05-09 18:47:24 +00:00
Hans Petter Selasky
c20feee43b Add myself to copyright in the LinuxKPI RCU support layer.
Suggested by:	mmacy@
Sponsored by:	Mellanox Technologies
2018-05-09 08:50:42 +00:00
Jamie Gritton
0e5c6bd436 Make it easier for filesystems to count themselves as jail-enabled,
by doing most of the work in a new function prison_add_vfs in kern_jail.c
Now a jail-enabled filesystem need only mark itself with VFCF_JAIL, and
the rest is taken care of.  This includes adding a jail parameter like
allow.mount.foofs, and a sysctl like security.jail.mount_foofs_allowed.
Both of these used to be a static list of known filesystems, with
predefined permission bits.

Reviewed by:	kib
Differential Revision:	D14681
2018-05-04 20:54:27 +00:00
Hans Petter Selasky
45aca3b1a1 Define USEC_PER_MSEC and USEC_PER_SEC in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-04-30 09:31:09 +00:00
Eitan Adler
e07db02261 [procfs] Split procfs_attr into multiple functions
Reviewed by:	des, kib
Discussed with:	mmacy
Differential Revision:	https://reviews.freebsd.org/D15150
2018-04-24 14:49:09 +00:00
Konstantin Belousov
0cde66af78 Fix futexes on i386 after the 4/4G split.
Use proper method to access userspace.  For now, only the slow copyout
path is implemented.

Reported and tested by:	tijl (previous version)
Sponsored by:	The FreeBSD Foundation
2018-04-24 12:50:21 +00:00
Ed Maste
38be312706 Map FreeBSD EDOOFUS to Linux EINVAL
Previously EDOOFUS mapped to EBUSY.  EINVAL seems more appropriate.

Discussed with:	cem
MFC after:	1 week
Sponsored by:	Turing Robotic Industries Inc.
2018-04-23 18:33:26 +00:00
Konstantin Belousov
1302eea7bb Rename PROC_PDEATHSIG_SET -> PROC_PDEATHSIG_CTL and PROC_PDEATHSIG_GET
-> PROC_PDEATHSIG_STATUS for consistency with other procctl(2)
operations names.

Requested by:	emaste
Sponsored by:	The FreeBSD Foundation
MFC after:	13 days
2018-04-20 15:19:27 +00:00
John Baldwin
73c8686e91 Simplify the code to allocate stack for auxv, argv[], and environment vectors.
Remove auxarg_size as it was only used once right after a confusing
assignment in each of the variants of exec_copyout_strings().

Reviewed by:	emaste
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D15123
2018-04-19 16:00:34 +00:00
Konstantin Belousov
b940886338 Add PROC_PDEATHSIG_SET to procctl interface.
Allow processes to request the delivery of a signal upon death of
their parent process.  Supposed consumer of the feature is PostgreSQL.

Submitted by:	Thomas Munro
Reviewed by:	jilles, mjg
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D15106
2018-04-18 21:31:13 +00:00
Ed Maste
b267239d4b linuxulator: deduplicate linux_exec_imgact_try
Previously linuxulator had three identical copies of
linux_exec_imgact_try.  Deduplicate before adding another arch to
linuxulator.

Sponsored by:	Turing Robotic Industries Inc
Differential Revision:	https://reviews.freebsd.org/D14856
2018-04-09 17:24:01 +00:00
Brooks Davis
6469bdcdb6 Move most of the contents of opt_compat.h to opt_global.h.
opt_compat.h is mentioned in nearly 180 files. In-progress network
driver compabibility improvements may add over 100 more so this is
closer to "just about everywhere" than "only some files" per the
guidance in sys/conf/options.

Keep COMPAT_LINUX32 in opt_compat.h as it is confined to a subset of
sys/compat/linux/*.c.  A fake _COMPAT_LINUX option ensure opt_compat.h
is created on all architectures.

Move COMPAT_LINUXKPI to opt_dontuse.h as it is only used to control the
set of compiled files.

Reviewed by:	kib, cem, jhb, jtl
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14941
2018-04-06 17:35:35 +00:00
Mark Johnston
e4ba1a50af Fix the definitions of get_cpu() and put_cpu().
They are supposed to disable preemption.

Reported by:	rstone
MFC after:	5 days
2018-04-05 17:26:03 +00:00
Ed Maste
19406511e5 Fix kernel memory disclosure in linux_ioctl_socket
strlcpy is used to copy a string into a buffer to be copied to userland,
previously leaving uninitialized data after the terminating NUL.  Zero
the buffer first to avoid a kernel memory disclosure.

admbugs:	765, 811
MFC after:	1 day
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reported by:	Vlad Tsyrklevich
Sponsored by:	The FreeBSD Foundation
2018-04-04 19:58:25 +00:00
Ed Maste
d851b216eb linux_ioctl_hdio: fix kernel memory disclosure
Stack-allocated struct linux_hd_big_geometry has undeclared padding
copied to userland.

admbugs:	765
Reported by:	Vlad Tsyrklevich
MFC after:	1 day
Security:	Kernel memory disclosure
Sponsored by:	The FreeBSD Foundation
2018-04-04 14:41:48 +00:00
Mark Johnston
2a1067a9f0 Wrap long lines.
MFC after:	3 days
2018-04-03 18:41:27 +00:00
Hans Petter Selasky
4b70609941 Optimise use of Giant in the LinuxKPI.
- Make sure Giant is locked when calling PCI device methods.
Newbus currently requires this.

- Avoid unlocking Giant right before aquiring the sleepqueue lock.
This can save a task switch.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-30 20:11:12 +00:00
Hans Petter Selasky
fa0d4f31a7 Swap two instances of regular macros with function macros in the LinuxKPI,
to narrow down the substitution scope.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-28 17:54:34 +00:00
Konstantin Belousov
fb441a8829 Fix several leaks of kernel stack data through paddings.
It is random collection of fixes for issues not yet corrected,
reported at https://tsyrklevi.ch/clang_analyzer/freebsd_013017/. Many
issues from that list were already corrected. Most of them are for
compat32, old compat32 or affect both primary host ABI and compat32.

The freebsd32_kldstat(), for instance, was already fixed by using
malloc(M_ZERO).  Patch includes correction to report the supplied
version back, which is just pedantic.

Reviewed by:	brooks, emaste (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14868
2018-03-27 18:05:51 +00:00
Brooks Davis
026cac8213 Move 32-bit compat for md(4) ioctls into the md code.
This is more correct in that ioctl commands have no meaning until they
hit the handler associated with the file descriptor.

Add support for MDIOCRESIZE_32 which was missed when it was added.

Reviewed by:	cem, kib, markj (various versions)
Obtained from:	CheriBSD
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14714
2018-03-27 16:07:54 +00:00
Brooks Davis
34a77b9741 Move uio enums to sys/_uio.h.
Include _uio.h instead of uio.h in several headers to reduce header
polution.

Fix a few places that relied on header polution to get the uio.h header.

I have not moved struct uio as many more things that use it rely on
header polution to get other definitions from uio.h.

Reviewed by:	cem, kib, markj
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14811
2018-03-27 15:20:03 +00:00
Ed Maste
8363051739 linuxkpi whitespace cleanup
Reviewed by:	hselasky, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14807
2018-03-23 15:50:01 +00:00
Ed Maste
b1ed95fd9d Rationalize license text on Linuxolator files
Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text.  To avoid license proliferation switch to
the standard 2-Clause FreeBSD license for those files where I have
permission from each of the listed copyright holders.

Approved by:	rdivacky, marcel
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-03-23 14:39:34 +00:00
Hans Petter Selasky
555deb3c96 The pci_disable_device() function is also expected to clear the PCI
busmaster. This fixes LinuxKPI compliancy with Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 13:30:35 +00:00
Ed Maste
1ac2776bbb Share Linux errno table with libsysdecode
Requested by:	jhb
Reviewed by:	jhb
Sponsored by:	Turing Robotic Industries Inc.
2018-03-22 12:58:49 +00:00
Hans Petter Selasky
e1992aa142 Clear old MSIX IRQ numbers in the LinuxKPI.
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 12:26:27 +00:00
Conrad Meyer
4948f7bf11 Regenerate sysent files after r331279. 2018-03-21 01:17:01 +00:00
Conrad Meyer
e9ac27430c Implement getrandom(2) and getentropy(3)
The general idea here is to provide userspace programs with well-defined
sources of entropy, in a fashion that doesn't require opening a new file
descriptor (ulimits) or accessing paths (/dev/urandom may be restricted
by chroot or capsicum).

getrandom(2) is the more general API, and comes from the Linux world.
Since our urandom and random devices are identical, the GRND_RANDOM flag
is ignored.

getentropy(3) is added as a compatibility shim for the OpenBSD API.

truss(1) support is included.

Tests for both system calls are provided.  Coverage is believed to be at
least as comprehensive as LTP getrandom(2) test coverage.  Additionally,
instructions for running the LTP tests directly against FreeBSD are provided
in the "Test Plan" section of the Differential revision linked below.  (They
pass, of course.)

PR:		194204
Reported by:	David CARLIER <david.carlier AT hardenedbsd.org>
Discussed with:	cperciva, delphij, jhb, markj
Relnotes:	maybe
Differential Revision:	https://reviews.freebsd.org/D14500
2018-03-21 01:15:45 +00:00
Ed Maste
d595c5c0d6 linux_errno.c: add newer errno values
Also introduce a static assert to ensure the list is kept up to date.

Sponsored by:	Turing Robotic Industries Inc.
2018-03-16 14:51:47 +00:00
Ed Maste
6e481f83f7 Share a single bsd-linux errno table across MD consumers
Three copies of the linuxulator linux_sysvec.c contained identical
BSD to Linux errno translation tables, and future work to support other
architectures will also use the same table.  Move the table to a common
file to be used by all.  Make it 'const int' to place it in .rodata.

(Some existing Linux architectures use MD errno values, but x86 and Arm
share the generic set.)

This change should introduce no functional change; a followup will add
missing errno values.

MFC after:	3 weeks
Sponsored by:	Turing Robotic Industries Inc.
Differential Revision:	https://reviews.freebsd.org/D14665
2018-03-16 14:46:38 +00:00
Hans Petter Selasky
cbfc3c73ce Fix compliancy of the kstrtoXXX() functions in the LinuxKPI, by skipping
one newline character at the end, if any.

Found by:	greg@unrelenting.technology
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-14 19:51:28 +00:00
Ed Maste
340f4a8d3e Linuxulator: apply style(9) to return
Sponsored by:	Turing Robotic Industries Inc.
2018-03-12 15:35:24 +00:00
Hans Petter Selasky
be15e1332d Implement proper support for complete_all() in the LinuxKPI.
When complete_all() is called there might be multiple waiters. The
current implementation could only handle one waiter. Make sure the
completion is sticky when complete_all() is called to be compatible
with Linux.

Found by:	Johannes Lundberg <johalun0@gmail.com>
MFC after:	1 week
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-09 12:16:55 +00:00
Eitan Adler
e124e27ba2 sys/cloudabi: Avoid relying on GNU specific extensions
An empty initializer list is not technically valid C grammar.

MFC After:	1 week
2018-03-07 14:47:43 +00:00
Andrey V. Elsukov
c2a5dc6cd7 Add mapping for several ethernet types used by Linux to FreeBSD
ethernet types.

Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14594
2018-03-06 12:58:00 +00:00
Brooks Davis
aec37bad99 Regen after r330517. 2018-03-05 17:02:50 +00:00
Brooks Davis
1c1b4c66b6 Remove remenants of 1990s efforts to let us run Net/OpenBSD binaries.
No functional change (comments change in some generated files.)

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14571
2018-03-05 17:02:16 +00:00
Hans Petter Selasky
e9e4ec118f Properly wrap the BUILD_BUG() function macro in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-04 19:42:50 +00:00
Hans Petter Selasky
20789a72e0 Stub kernel_param_lock() and kernel_param_unlock() in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 19:10:30 +00:00
Hans Petter Selasky
c3bfe0de4c Implement wait_event_lock_irq() macro function in the LinuxKPI.
MFC after:	1 week
Requested by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-03-04 19:07:10 +00:00
Hans Petter Selasky
ce930365d1 Keep the old SLAB_DESTROY_BY_RCU macro definition around in the LinuxKPI
to avoid compilation breakage in external kernel modules.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-04 18:53:41 +00:00
Hans Petter Selasky
8f368d485d Implement DEFINE_WAIT_FUNC() function macro and default_wake_function()
in the LinuxKPI.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:51:43 +00:00
Hans Petter Selasky
ab7da72090 Implement pr_err_ratelimited() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:27:50 +00:00
Hans Petter Selasky
8961c48323 Implement __MODULE_STRING() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:21:21 +00:00
Hans Petter Selasky
20c8d8270c Implement BUILD_BUG() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:19:44 +00:00
Hans Petter Selasky
6c51dfb060 Implement writel_relaxed() in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:17:54 +00:00
Hans Petter Selasky
dc354b1551 Define noinline and __maybe_unused macros in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:13:31 +00:00
Hans Petter Selasky
9bce524efa Implement for_each_clear_bit() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:10:18 +00:00
Hans Petter Selasky
5d503e30ad Implement GENMASK_ULL() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-04 18:08:21 +00:00
Hans Petter Selasky
782a90d16a Rename the SLAB_DESTROY_BY_RCU flag into SLAB_TYPESAFE_BY_RCU in the LinuxKPI
to be compatible with Linux.

MFC after:	1 week
Requested by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-03-04 18:04:37 +00:00
Eitan Adler
026227b6ae sys/linux: Fix a few potential infoleaks in cloudabi
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC After:	1 month
Sponsored by:	DARPA/AFRL
2018-03-03 21:50:55 +00:00
Eitan Adler
6eee883a17 sys/linux: Fix a few potential infoleaks in Linux IPC
Submitted by:	Domagoj Stolfa <domagoj.stolfa@gmail.com>
MFC After:	1 month
2018-03-03 21:14:55 +00:00
Hans Petter Selasky
ccae7bb851 Use mstosbt() instead of SBT_1MS in the LinuxKPI to get the last few bits
of precision.

MFC after:	1 week
Suggested by:	ian@
Sponsored by:	Mellanox Technologies
2018-03-03 19:26:40 +00:00
Hans Petter Selasky
7cf1c51588 Implement msleep_interruptible() in the LinuxKPI. While at it use pause_sbt()
instead of pause() in the msleep() function to avoid rounding errors when
converting delay values forth and back. Add a guard for a delay value
of zero milliseconds which is undefined.

MFC after:	1 week
Requested by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-03-03 18:54:16 +00:00
Brooks Davis
93e48a303a Rename kernel-only members of semid_ds and msgid_ds.
This deliberately breaks the API in preperation for future syscall
revisions which will remove these nonstandard members.

In an exp-run a single port (devel/qemu-user-static) was found to
use them which it did becuase it emulates system calls.  This has
been fixed in the ports tree.

PR:		224443 (exp-run)
Reviewed by:	kib, jhb (previous version)
Exp-run by:	antoine
Sponsored by:	DARPA, AFRP
Differential Revision:	https://reviews.freebsd.org/D14490
2018-03-02 22:10:48 +00:00
Hans Petter Selasky
86ba49a722 Implement more lockdep stubs in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-02 08:59:53 +00:00
Hans Petter Selasky
8554bc585b Implement ktime_get_raw() function in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-02 08:58:32 +00:00
Hans Petter Selasky
d901abf167 Implement wait_on_bit() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-02 08:56:15 +00:00
Hans Petter Selasky
9555cfd2b2 Rename callout member in struct timer_list to match the one in struct
delayed_work in the LinuxKPI. This allows the timer_pending() function
macro to be used with delayed work structures.

No functional nor structural change.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-03-02 08:52:27 +00:00
Ed Maste
023b850b62 Rationalize license text on Linuxolator files
Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text.  To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders.  Additional files
still waiting on permission from others are listed in review D14210.

Approved by:    dchagin, rdivacky, sos
MFC after:	1 week
MFC with:	r329370
Sponsored by:	The FreeBSD Foundation
2018-03-01 13:52:18 +00:00
Hans Petter Selasky
b44247b1a9 Correct the return value from flush_work() and flush_delayed_work() in the
LinuxKPI to comply more with Linux. This fixes an issue when these functions
are used in waiting loops.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-01 10:31:51 +00:00
Ed Maste
315fbaeca2 Correct pseudo misspelling in sys/ comments
contrib code and #define in intel_ata.h unchanged.
2018-02-23 18:15:50 +00:00
Hans Petter Selasky
949440623b Return correct error code to user-space when a system call receives a
signal in the LinuxKPI.

The read(), write() and mmap() system calls can return either EINTR or
ERESTART upon receiving a signal. Add code to figure out the correct
return value by temporarily storing the return code from the relevant
FreeBSD kernel APIs in the Linux task structure.

MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-02-22 15:29:19 +00:00
Ed Maste
eae594f7d5 Correct proper nouns in the Linuxulator
- Capitalize Linux
- Spell FreeBSD out in full
- Address some style(9) on changed lines

Sponsored by:	Turing Robotic Industries Inc.
2018-02-22 02:24:17 +00:00
Hans Petter Selasky
07c757ec25 Allow LinuxKPI character devices to receive mmap() calls from the Linux
binary mode user-space emulation layer. This is a regression issue after
r328436, when LinuxKPI character devices started to use DTYPE_DEV in
the "f_type" field of the associated file structure(s).

MFC after:	3 days
Found by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-21 10:13:17 +00:00
Brooks Davis
b81e88d296 Reduce duplication in dynamic syscall registration code.
Remove the unused syscall_(de)register() functions in favor of the
better documented and easier to use syscall_helper_(un)register(9)
functions.

The default and freebsd32 versions differed in which array of struct
sysents they used and a few missing updates to the 32-bit code as
features were added to the main code.

Reviewed by:	cem
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14337
2018-02-20 18:08:57 +00:00
Konstantin Belousov
2c0f13aa59 vm_wait() rework.
Make vm_wait() take the vm_object argument which specifies the domain
set to wait for the min condition pass.  If there is no object
associated with the wait, use curthread' policy domainset.  The
mechanics of the wait in vm_wait() and vm_wait_domain() is supplied by
the new helper vm_wait_doms(), which directly takes the bitmask of the
domains to wait for passing min condition.

Eliminate pagedaemon_wait().  vm_domain_clear() handles the same
operations.

Eliminate VM_WAIT and VM_WAITPFAULT macros, the direct functions calls
are enough.

Eliminate several control state variables from vm_domain, unneeded
after the vm_wait() conversion.

Scetched and reviewed by:	jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation, Mellanox Technologies
Differential revision:	https://reviews.freebsd.org/D14384
2018-02-20 10:13:13 +00:00
Hans Petter Selasky
e44fa94c09 Implement list_safe_reset_next() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-19 16:31:19 +00:00
Hans Petter Selasky
0f839f3a6d When stepping the radix tree in the LinuxKPI make sure we
clear the least significant bits, so that no entries are
skipped.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-19 06:11:58 +00:00
Hans Petter Selasky
8f294983e9 Optimise xchg() to use atomic_swap_32() and atomic_swap_64().
Suggested by:	kib@
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-18 18:46:56 +00:00
Hans Petter Selasky
644680491e Fix implementation of xchg() function macro in the LinuxKPI.
The exchange operation must be atomic.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-18 17:37:23 +00:00
Hans Petter Selasky
ead15282ae Implement support for radix_tree_for_each_slot() and radix_tree_exception()
in the LinuxKPI and use unsigned long type for the radix tree index.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-18 12:54:21 +00:00
Hans Petter Selasky
78d7441913 Implement the KMEM_CACHE() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 09:52:30 +00:00
Hans Petter Selasky
0628fc903e Make the vm_fault structure in the LinuxKPI compatible with
newer versions of the Linux kernel. No functional change.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 09:31:01 +00:00
Hans Petter Selasky
0597ffb0b5 Implement the rcu_dereference_raw() function macro.
Make sure all RCU dereferencing use the READ_ONCE() function macro.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 09:10:14 +00:00
Hans Petter Selasky
7c86047355 Implement __GFP_BITS_SHIFT and __GFP_BITS_MASK macros in the LinuxKPI.
Add compile time asserts to catch conflicts with native defines.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 08:58:20 +00:00
Hans Petter Selasky
15052dc861 Implement __list_del_entry() helper functions in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 08:47:15 +00:00
Hans Petter Selasky
d51be3591a Implement file_inode() and call_mmap() helper functions in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 08:40:07 +00:00
Hans Petter Selasky
b15a13af6b Refactor dentry structure into its own header file in the LinuxKPI similary
to Linux. No functional change. Implement d_inode() helper function.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 08:29:25 +00:00
Hans Petter Selasky
0424e413e7 Update the ktime type in the LinuxKPI to be a signed 64-bit integer similarly
to Linux, to avoid compilation issues. Implement ktime_get_real_seconds().

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
Sponsored by:	Limelight Networks
2018-02-18 08:05:40 +00:00
Hans Petter Selasky
9a323f25ab Implement spin_trylock_irq() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 22:45:15 +00:00
Hans Petter Selasky
1169b94c7b Stub more lockdep function macros in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 22:41:20 +00:00
Hans Petter Selasky
94b9710bc7 Implement get_task_pid() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 22:33:26 +00:00
Hans Petter Selasky
314d034088 Allow the put_user() function macro to put constant values by using the
existing __put_user() macro.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 21:47:15 +00:00
Hans Petter Selasky
2460cbb4a6 Implement BUILD_BUG_ON_INVALID() function macro in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 21:40:19 +00:00
Hans Petter Selasky
03f8ddedf0 Add support for printk_ratelimit() function macro and improve the existing
printk_ratelimited() function macro to return a boolean stating if there
was a printout, true, or not, false.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 21:25:19 +00:00
Hans Petter Selasky
e35dc5149d Add support for kref_read() function in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 20:56:35 +00:00
Hans Petter Selasky
13a27c3b43 Add support for mmgrab() function in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 20:52:54 +00:00
Hans Petter Selasky
2060ca654e Add support for __percpu and __weak macros in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 20:50:18 +00:00
Hans Petter Selasky
7353335d1c Move the IRQ_RETVAL() and irqreturn definitions to irqreturn.h in the
LinuxKPI to be compatible with Linux. No functional change.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 20:37:21 +00:00
Hans Petter Selasky
1249c589b6 Add checks for valid IRQ tag before setting up or tearing down an interrupt
handler in the LinuxKPI. This is needed when the interrupt handler is disabled
before freeing the interrupt.

MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-17 20:09:43 +00:00
Hans Petter Selasky
af4010be77 Compile fix for GCC in the LinuxKPI.
Older versions of GCC don't allow flexible array members in a union.
Use a zero length array instead.

MFC after:	1 week
Reported by:	jbeich@
Sponsored by:	Mellanox Technologies
2018-02-17 08:12:35 +00:00
Hans Petter Selasky
f4824a028d Implement mutex_trylock_recursive() in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-16 16:01:39 +00:00
Hans Petter Selasky
10ee3d3016 Implement memdup_user_nul() in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-16 15:52:28 +00:00
Hans Petter Selasky
f1f7e04a29 Implement tasklet_enable() and tasklet_disable() in the LinuxKPI.
MFC after:	1 week
Requested by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-16 15:41:16 +00:00
Hans Petter Selasky
219ff59ce2 Implement enable_irq() and disable_irq() in the LinuxKPI.
MFC after:	1 week
Submitted by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-16 15:37:33 +00:00
Hans Petter Selasky
2a7c2b914f Allow the cmpxchg() macro in the LinuxKPI to work on pointers without
generating compiler warnings, -Wint-conversion .

Requested by:	Johannes Lundberg <johalun0@gmail.com>
Sponsored by:	Mellanox Technologies
2018-02-16 15:20:21 +00:00
Ed Maste
0ba1b36553 Rationalize license text on Linuxolator files
Many licenses on Linuxolator files contained small variations from the
standard FreeBSD license text.  To avoid license proliferation switch to
the standard 2-clause FreeBSD license for those files where I have
permission from each of the listed copyright holders.  Additional files
waiting on permission from others are listed in review D14210.

Approved by:	kan, marcel, sos, rdivacky
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2018-02-16 15:00:14 +00:00
Brooks Davis
2feb5b8dc9 Regen after r329322. 2018-02-15 18:32:11 +00:00
Brooks Davis
a4dcd0ef22 Remove freebsd32_getdirentries(), it will be unused after the next
commit.
2018-02-15 18:31:43 +00:00
Brooks Davis
e4039d68fb Revert r329323. I missed something in my testing. 2018-02-15 17:58:51 +00:00
Brooks Davis
a170bb0387 Regen after r329322: Fix getdirentries(2) under 32-bit compat.
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14379
2018-02-15 17:27:19 +00:00
Brooks Davis
1f6023cf0b Fix getdirentries(2) under 32-bit compat.
The latest version of getdirentries (syscall 554) takes a pointer
an an off_t as the last argument. The old version which copies out
an int32_t was being used instead. Use the standard sys_getdirentries()
implementation instead.

Reviewed by:	kib
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14379
2018-02-15 17:26:30 +00:00
Konstantin Belousov
67dcd64ab8 linuxkpi: Do not leak pages on put.
When the owner of the wire reference releases the last reference, it
might be that the page was already attempted to be freed (but free
cannot be performed at that time due to wire).  Check that the page
was removed from the object as the indicator of the free attempt and
finish the free operation if so.

Reported and tested by:	Slava Shwartsman
Reviewed by:	hselasky
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2018-02-13 15:44:35 +00:00
Jeff Roberson
e958ad4cf3 Make v_wire_count a per-cpu counter(9) counter. This eliminates a
significant source of cache line contention from vm_page_alloc().  Use
accessors and vm_page_unwire_noq() so that the mechanism can be easily
changed in the future.

Reviewed by:	markj
Discussed with:	kib, glebius
Tested by:	pho (earlier version)
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14273
2018-02-12 22:53:00 +00:00
Hans Petter Selasky
5b7cc89266 Fix implementation of ktime_add_ns() and ktime_sub_ns() in the LinuxKPI to
actually return the computed result instead of the input value.

This is a regression issue after r289572.

Found by:	gcc6
MFC after:	3 days
Sponsored by:	Mellanox Technologies
2018-02-07 12:12:06 +00:00
Jeff Roberson
e2068d0bcd Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state.  Protect reservations with the free lock
from the domain that they belong to.  Refactor to make vm domains more
of a first class object.

Reviewed by:    markj, kib, gallatin
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14000
2018-02-06 22:10:07 +00:00
Ed Maste
132f90c660 Linuxolator whitespace cleanup
A version of each of the MD files by necessity exists for each CPU
architecture supported by the Linuxolator.  Clean these up so that new
architectures do not inherit whitespace issues.

Clean up shared Linuxolator files while here.

Sponsored by:	Turing Robotic Industries Inc.
2018-02-05 17:29:12 +00:00
Brooks Davis
0fd25723bc Add kern.ipc.{msqids,semsegs,sema} sysctls for FreeBSD32.
Stop leaking kernel pointers though theses sysctls and make sure that the
padding in the structures is zeroed on allocation to avoid other leaks.

Reviewed by:	gordon, kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D13459
2018-02-02 18:03:12 +00:00
Hans Petter Selasky
f71d0b0da7 Fix some recent regressions after r328436 in the LinuxKPI:
1) The OPW() function macro should have the same return type like the
function it executes.
2) The DEVFS I/O-limit should be enforced for all character device reads
and writes.
3) The character device file handle should be passable, same as for
DEVFS based file handles.

Reported by:	jbeich @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-01 19:57:21 +00:00
Hans Petter Selasky
3f3735db30 Make sure the LinuxKPI's internal ERESTARTSYS error code gets translated
into ERESTART for mmap and page fault calls aswell.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-02-01 17:32:45 +00:00
Hans Petter Selasky
cb57d1dd30 Properly implement the cond_resched() function macro in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-31 13:40:36 +00:00
Bryan Drewery
595109196a Don't use an .OBJDIR for 'make sysent'.
Reported by:	emaste, jhb
Sponsored by:	Dell EMC
2018-01-29 19:14:15 +00:00
Hans Petter Selasky
e23ae408c0 Decouple Linux files from the belonging character device right after open
in the LinuxKPI. This is done by calling finit() just before returning a magic
value of ENXIO in the "linux_dev_fdopen" function.

The Linux file structure should mimic the BSD file structure as much as
possible. This patch decouples the Linux file structure from the belonging
character device right after the "linux_dev_fdopen" function has returned.
This fixes an issue which allows a Linux file handle to exist after a
character device has been destroyed and removed from the directory index
of /dev. Only when the reference count of the BSD file handle reaches zero,
the Linux file handle is destroyed. This fixes use-after-free issues related
to accessing the Linux file structure after the character device has been
destroyed.

While at it add a missing NULL check for non-present file operation.
Calling a NULL pointer will result in a segmentation fault.

Reviewed by:	kib @
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-26 10:49:02 +00:00
Justin Hibbits
51bd6f9618 Minimal change to build linuxkpi on architectures with physical addresses larger
than virtual

Summary:
Some architectures have physical/bus addresses that are much larger
than virtual addresses.  This change just quiets a warning, as DMAP is not used
on those architectures, and on 64-bit platforms uintptr_t is the same size as
vm_paddr_t and void *.

Reviewed By:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D14043
2018-01-26 00:56:09 +00:00
Hans Petter Selasky
f885420d3c Properly implement the "id" callback argument in the "idr_for_each" function
in the LinuxKPI. The old implementation assumed only one IDR layer was present.
Take additional IDR layers into account when computing the "id" value.

MFC after:	1 week
Found by:	Karthik Palanichamy <karthikp@chelsio.com>
Tested by:	Karthik Palanichamy <karthikp@chelsio.com>
Sponsored by:	Mellanox Technologies
2018-01-24 13:37:07 +00:00
Pedro F. Giffuni
ac2fffa4b7 Revert r327828, r327949, r327953, r328016-r328026, r328041:
Uses of mallocarray(9).

The use of mallocarray(9) has rocketed the required swap to build FreeBSD.
This is likely caused by the allocation size attributes which put extra pressure
on the compiler.

Given that most of these checks are superfluous we have to choose better
where to use mallocarray(9). We still have more uses of mallocarray(9) but
hopefully this is enough to bring swap usage to a reasonable level.

Reported by:	wosch
PR:		225197
2018-01-21 15:42:36 +00:00
Nathan Whitehorn
ad6b97e7ca Define PHYS_TO_DMAP() and DMAP_TO_PHYS() as panics on the architectures
(i386 and arm) that never implement them. This allows the removal of
#ifdef PHYS_TO_DMAP on code otherwise protected by a runtime check on
PMAP_HAS_DMAP. It also fixes the build on ARM and i386 after I forgot an
#ifdef in r328168.

Reported by:	Milan Obuch
Pointy hat to:	me
2018-01-19 22:17:13 +00:00
Nathan Whitehorn
9a8196ce19 Remove SFBUF_OPTIONAL_DIRECT_MAP and such hacks, replacing them across the
kernel by PHYS_TO_DMAP() as previously present on amd64, arm64, riscv, and
powerpc64. This introduces a new MI macro (PMAP_HAS_DMAP) that can be
evaluated at runtime to determine if the architecture has a direct map;
if it does not (or does) unconditionally and PMAP_HAS_DMAP is either 0 or
1, the compiler can remove the conditional logic.

As part of this, implement PHYS_TO_DMAP() on sparc64 and mips64, which had
similar things but spelled differently. 32-bit MIPS has a partial direct-map
that maps poorly to this concept and is unchanged.

Reviewed by:		kib
Suggestions from:	marius, alc, kib
Runtime tested on:	amd64, powerpc64, powerpc, mips64
2018-01-19 17:46:31 +00:00