128778 Commits

Author SHA1 Message Date
nwhitehorn
0be5598124 Avoid emitting a PT_INTERP section for powerpc64 kernels and arrange for
the first instruction to be at the start of the text segment. This allows
the kernel to be booted correctly by stock kexec-lite.

MFC after:	2 weeks
2017-11-25 21:45:51 +00:00
nwhitehorn
839dd38bd3 Automatically use the ELFv2 ABI on powerpc64 if supported by the compiler.
This has the same effects on DDB working as -mcall=aixdesc, but also is
supported by clang and marginally improves kernel performance.

MFC after:	2 weeks
2017-11-25 21:44:23 +00:00
mjg
16d5d58622 Add the missing lockstat check for thread lock. 2017-11-25 20:49:27 +00:00
mjg
0057ac8761 Convert in-kernel thread_lock_flags calls to thread_lock when debug is disabled
The flags argument is not used in this case.
2017-11-25 20:37:13 +00:00
mjg
67f9bba460 rwlock: fix up compilation of the previous change
commmitted wrong version of the patch
2017-11-25 20:25:45 +00:00
mjg
c52edb8a9e rwlock: add __rw_try_{r,w}lock_int 2017-11-25 20:22:51 +00:00
mjg
2d0a05c617 sx: change sunlock to wake waiters up if it locked sleepq
sleepq is only locked if the curhtread is the last reader. By the time
the lock gets acquired new ones could have arrived. The previous code
would unlock and loop back. This results spurious relocking of sleepq.

This is a step towards xadd-based unlock routine.
2017-11-25 20:13:50 +00:00
mjg
c7ffe8c4e0 locks: retry turnstile/sleepq loops on failed cmpset
In order to go to sleep threads set waiter flags, but that can spuriously
fail e.g. when a new reader arrives. Instead of unlocking everything and
looping back, re-evaluate the new state while still holding the lock necessary
to go to sleep.
2017-11-25 20:10:33 +00:00
mjg
d5c5e62cc7 rwlock: stop re-reading the owner when going to sleep 2017-11-25 20:08:11 +00:00
kevans
f844b408fe Allwinner a83t: enable USB support
Originally a patch by Mark Millard, augmented with information from work
done on NetBSD by jmcneill@.

Submitted by:	Mark Millard (markmi@dsl-only.net)
Reviewed by:	emaste, manu
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D13240
2017-11-25 16:46:35 +00:00
kevans
de2be652e7 Add r-ccu support for the Allwinner a83t
The r-ccu on the a83t differs from the others only by what it names the
ar100 parents. Export the _CCU macros (now converted to an enu) so that
ccu_sun8i_r can differentiate between a83t r-ccu and the others, then add
the compat string for the a83t r-ccu.

Reviewed by:	manu
Approved by:	emaste (mentor, implicit)
Differential Revision:	https://reviews.freebsd.org/D13206
2017-11-25 15:14:40 +00:00
mav
ccc463ad9d Slightly fix bidirectional stream number allocation.
This logic is still imperfect, since it allows at most 15 bidirectional
streams out of 30 allowed by specification, but at least now those should
work better.  On the other side I don't remember I ever saw controller
supporting the bidirectional streams, so this is likely a nop change.

MFC after:	1 month
2017-11-25 09:42:14 +00:00
jhb
bac78aa2a4 Decode kevent structures logged via ktrace(2) in kdump.
- Add a new KTR_STRUCT_ARRAY ktrace record type which dumps an array of
  structures.

  The structure name in the record payload is preceded by a size_t
  containing the size of the individual structures.  Use this to
  replace the previous code that dumped the kevent arrays dumped for
  kevent().  kdump is now able to decode the kevent structures rather
  than dumping their contents via a hexdump.

  One change from before is that the 'changes' and 'events' arrays are
  not marked with separate 'read' and 'write' annotations in kdump
  output.  Instead, the first array is the 'changes' array, and the
  second array (only present if kevent doesn't fail with an error) is
  the 'events' array.  For kevent(), empty arrays are denoted by an
  entry with an array containing zero entries rather than no record.

- Move kevent decoding tables from truss to libsysdecode.

  This adds three new functions to decode members of struct kevent:
  sysdecode_kevent_filter, sysdecode_kevent_flags, and
  sysdecode_kevent_fflags.

  kdump uses these helper functions to pretty-print kevent fields.

- Move structure definitions for freebsd11 and freebsd32 kevent
  structures to <sys/event.h> so that they can be shared with userland.
  The 32-bit structures are only exposed if _WANT_KEVENT32 is defined.
  The freebsd11 structures are only exposed if _WANT_FREEBSD11_KEVENT is
  defined.  The 32-bit freebsd11 structure requires both.

- Decode freebsd11 kevent structures in truss for the compat11.kevent()
  system call.

- Log 32-bit kevent structures via ktrace for 32-bit compat kevent()
  system calls.

- While here, constify the 'void *data' argument to ktrstruct().

Reviewed by:	kib (earlier version)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D12470
2017-11-25 04:49:12 +00:00
tuexen
02d5109d72 Fix SPDX line as suggested by pfg 2017-11-24 19:38:59 +00:00
emaste
40ebf27a67 Temporarily disable VIMAGE on arm64
Loading a kernel module with a static VNET_DEFINE'd variable (e.g.
if_lagg) currently results in a kernel panic.

PR:		223670
2017-11-24 19:21:21 +00:00
markj
f82ed113d8 Don't redefine _KERNEL.
MFC after:	1 week
2017-11-24 19:08:54 +00:00
markj
d88166d005 Have lockstat:::sx-release fire only after the lock state has changed.
MFC after:	1 week
2017-11-24 19:04:31 +00:00
markj
e32cd825f4 Add a missing lockstat:::sx-downgrade probe.
We were returning without firing the probe when the lock had no shared
waiters.

MFC after:	1 week
2017-11-24 19:02:06 +00:00
landonf
c317aee78b bhnd(4): Add missing dependency on ofw_bus_if.h
Reported by:	wma
Approved by:	adrian (mentor, implicit)
2017-11-24 19:01:14 +00:00
nwhitehorn
2e6a611a14 Switch the default firmware for npe(4) from the QOS_VLAN one to the
plain-vanilla ETH microcode. The QOS_VLAN firmware added support in microcode
for handling IEEE 802.1q tags, but the npe(4) driver did not actually
support the relevant signalling. As a result, it was impossible to use
VLANs with npe(4). Switching to the more basic microcode (same license)
removes the on-NIC promisisng and makes vlan(4) work on both NPE interfaces.

Ref: https://lists.freebsd.org/pipermail/freebsd-arm/2012-August/003826.html
2017-11-24 15:48:17 +00:00
hselasky
f4a0996f11 RoCE/infiniband upgrade to Linux v4.9 for kernel and userspace.
This commit merges projects/bsd_rdma_4_9 to head.

List of kernel sources used:
============================

1) kernel sources were cloned from git://github.com/torvalds/linux.git
Top commit 69973b830859bc6529a7a0468ba0d80ee5117826 - tag: v4.9, linux-4.9

2) krping was cloned from https://github.com/larrystevenwise/krping
Top commit 292a2f1abf0348285e678a82264740d52e4dcfe4

List of userspace sources used:
===============================

1) rdma-core was cloned from https://github.com/linux-rdma/rdma-core.git
Top commit d65138ef93af30b3ea249f3a84aa6a24ba7f8a75

2) OpenSM was cloned from git://git.openfabrics.org/~halr/opensm.git
Top commit 85f841cf209f791c89a075048a907020e924528d

3) libibmad was cloned from git://git.openfabrics.org/~iraweiny/libibmad.git
Tag 1.3.13 with some additional patches from Mellanox.

4) infiniband-diags was cloned from git://git.openfabrics.org/~iraweiny/infiniband-diags.git
Tag 1.6.7 with some additional patches from Mellanox.

NOTES:
======

1) The mthca driver has been removed in kernel and in userspace.
2) All GPLv2 only sources have been removed and where applicable
   rewritten from scratch under a BSD license.
3) List of fully supported drivers in userspace and kernel:
   a) iw_cxgbe (Chelsio)
   b) mlx4ib (Mellanox)
   c) mlx5ib (Mellanox)
4) WITH_OFED=YES is still required by make in order to build
   OFED userspace and kernel code.
5) Full support has been added for routable RoCE, RoCE v2.

Sponsored by:	Mellanox Technologies
2017-11-24 14:50:28 +00:00
ed
87b257c9de Pick the right vDSO file/linker flags when building cloudabi32.ko on ARM64.
The recently imported cloudabi_vdso_armv6_on_64bit.S should be the vDSO
for 32-bit processes when being run on FreeBSD/arm64. This vDSO ensures
that all system call arguments are padded to 64 bits, so that they can
be used by the kernel to call into most of the native implementations
directly.
2017-11-24 14:02:32 +00:00
ed
f670022acb Set CP15BEN in SCTLR to make memory barriers work in 32-bit mode.
Binaries generated by Clang for ARMv6 may contain these instructions:

  MCR p15, 0, <Rd>, c7, c10, 5

These instructions are deprecated as of ARMv7, which is why modern
processors have a way of toggling support for them. On FreeBSD/arm64 we
currently disable support for these instructions, meaning that if 32-bit
executables with these instructions are run, they would crash with
SIGILL. This is likely not what we want.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D13145
2017-11-24 13:51:59 +00:00
ed
a73a92ed88 Add rudimentary support for building FreeBSD/arm64 with COMPAT_FREEBSD32.
Right now I'm using two Raspberry Pi's (2 and 3) to test CloudABI
support for armv6, armv7 and aarch64. It would be nice if I could
restrict this to just a single instance when testing smaller changes.
This is why I'd like to get COMPAT_CLOUDABI32 to work on arm64.

As COMPAT_CLOUDABI32 depends on COMPAT_FREEBSD32, at least for the ELF
loading, this change adds all of the bits necessary to at least build a
kernel with COMPAT_FREEBSD32. All of the machine dependent system calls
are still stubbed out, for the reason that implementations for these are
only useful if actual support for running FreeBSD binaries is added.
This is outside the scope of this work.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D13144
2017-11-24 13:50:53 +00:00
tuexen
05e0aa35ab Unbreak compilation when using SCTP_DETAILED_STR_STATS option.
MFC after:	1 week
2017-11-24 12:18:48 +00:00
hselasky
091ce9badd Merge ^/head r326132 through r326161. 2017-11-24 12:13:27 +00:00
hselasky
cd19f399a1 Implement atomic_fetchadd_64() for i386. This function is needed by the
atomic64 header file in the LinuxKPI for i386.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2017-11-24 12:10:42 +00:00
hselasky
5b03215ba8 Compile fixes for 32-bit architectures.
Sponsored by:	Mellanox Technologies
2017-11-24 12:08:50 +00:00
hselasky
9dfe7482b5 Compile fix for LINT-NOIP kernel target.
Sponsored by:	Mellanox Technologies
2017-11-24 12:05:49 +00:00
avg
993cc07930 amd-vi: a small whitespace cleanup
Reviewed by:	anish
2017-11-24 11:37:41 +00:00
avg
d56e6db67b amd-vi: use correct type for pci_rid, start_dev_rid, end_dev_rid sysctls
Previously, the values could look confusing because of unrelated bits from
adjacent memory.

Reviewed by:	anish
2017-11-24 11:36:35 +00:00
avg
bc48907278 amd-vi: small improvements to event printing
Ensure that an opening bracket always has a matching closing one.
Ensure that there is always a new-line at the end of a report line.
Also, add a space before the printed event flag.

Reviewed by:	anish
2017-11-24 11:35:43 +00:00
avg
17a7ee6279 amd-vi: print some additional details for INVALID_DEVICE_REQUEST event
Namely, the type of the hardware event and whether the transaction
was a translation request.

Reviewed by:	anish
2017-11-24 11:34:46 +00:00
tuexen
4505506cf7 Add SPDX line. 2017-11-24 11:25:53 +00:00
avg
88db7a2c1c amd-vi: fix up r326152, the new width requires a wider type
This is my brain-o from extending the width at the last moment.
2017-11-24 11:25:06 +00:00
avg
3cf07fb3d3 amd-vi: fix and extend definition of Command and Event Status Register (0x2020)
The defined bits are the lower bits, not the higher ones.

Also, the specification has been extended to define bits 0:18 and they
all could potentially be interesting to us, so extend the width of the
field accordingly.

Reviewed by:	anish
2017-11-24 11:20:10 +00:00
avg
58c07c5a95 vmm/amd: improve iteration over IVHD (type 10h) entries in IVRS table
Many 8-byte entries have zero at byte 4, so the second 4-byte part is
skipped as a 4-byte padding entry.  But not all 8-byte entries have that
property and they get misinterpreted.

A real example:
    48 00 00 00 ff 01 00 01
This an 8-byte ACPI_IVRS_TYPE_SPECIAL entry for IOAPIC with ID 255 (bogus).
It is reported as:
    ivhd0: Unknown dev entry:0xff
Fortunately, it was completely harmless.

Also, bail out early if we encounter an entry of a variable length type.
We do not have proper handling for those yet.

Reviewed by:	anish
2017-11-24 11:10:36 +00:00
hselasky
7c149b4959 Make sure all tasks are cancelled synchronously in ipoib to avoid
use after free.

Sponsored by:	Mellanox Technologies
2017-11-24 09:55:20 +00:00
hselasky
38de5905c9 Build fix for ipoib when CONFIG_INFINIBAND_IPOIB_CM is defined.
Sponsored by:	Mellanox Technologies
2017-11-24 09:52:56 +00:00
hselasky
70eddcb90e Build fix for kernel LINT target.
Sponsored by:	Mellanox Technologies
2017-11-24 09:12:13 +00:00
ed
9b06c6070c Don't let cpu_set_syscall_retval() clobber exec_setregs().
Upon successful completion, the execve() system call invokes
exec_setregs() to initialize the registers of the initial thread of the
newly executed process. What is weird is that when execve() returns, it
still goes through the normal system call return path, clobbering the
registers with the system call's return value (td->td_retval).

Though this doesn't seem to be problematic for x86 most of the times (as
the value of eax/rax doesn't matter upon startup), this can be pretty
frustrating for architectures where function argument and return
registers overlap (e.g., ARM). On these systems, exec_setregs() also
needs to initialize td_retval.

Even worse are architectures where cpu_set_syscall_retval() sets
registers to values not derived from td_retval. On these architectures,
there is no way cpu_set_syscall_retval() can set registers to the way it
wants them to be upon the start of execution.

To get rid of this madness, let sys_execve() return EJUSTRETURN. This
will cause cpu_set_syscall_retval() to leave registers intact. This
makes process execution easier to understand. It also eliminates the
difference between execution of the initial process and successive ones.
The initial call to sys_execve() is not performed through a system call
context.

Reviewed by:	kib, jhibbits
Differential Revision:	https://reviews.freebsd.org/D13180
2017-11-24 07:35:08 +00:00
kevans
bcc38930ac Add ccu compat string for Allwinner a83t
A ccu driver was added for the a83t in r326114. Add compat string to
aw_ccung and register the clocks for the a83t upon attach.

Reviewed by:	manu
Approved by:	emaste (mentor, implicit)
Differential Revision:	https://reviews.freebsd.org/D13205
2017-11-24 02:39:38 +00:00
andrew
851e08541e Ensure we check the program state set in the trap frame on arm and arm64.
This value may be set by userspace so we need to check it before using it.
If this is not done correctly on exception return the kernel may continue
in kernel mode with all registers set to a userspace controlled value. Fix
this by moving the check into set_mcontext, and also add the missing
sanitisation from the arm64 set_regs.

Discussed with:	security-officer@
MFC after:	3 days
Sponsored by:	DARPA, AFRL
2017-11-23 17:40:40 +00:00
markj
b7a8474133 Duplicate helpers after disabling inherited tracepoints during a fork.
We may create probes in the nascent child process, so we first need to
ensure that any inherited tracepoints are first removed. Otherwise the
probe sites will not be in the state expected by fasttrap, and it won't
be able to enable the probes.

MFC after:	2 weeks
2017-11-23 14:29:07 +00:00
hselasky
7b5126003a Merge ^/head r325999 through r326131. 2017-11-23 14:28:14 +00:00
markj
36a1a14004 Allow kern.geom.mirror.debug to be negative.
A negative value can be used to suppress all prints from the gmirror
kernel code, which can be useful when attempting to trigger race
conditions using stress tests.

MFC after:	1 week
2017-11-23 14:07:52 +00:00
hselasky
1088f71c35 Make sure the iSCSI I/O limits are set properly so that the ISCSIDSEND IOCTL
can be used prior to the ISCSIDHANDOFF IOCTL which set the negotiated values.
Else the login PDU will fail when passing the "-r" option to "iscsictl" which
means iSCSI over RDMA instead of TCP/IP.

Discussed with:	np@ and trasz@
Sponsored by:	Mellanox Technologies
MFC after:	1 week
2017-11-23 13:57:44 +00:00
hselasky
4946eec324 The __internal_mr is freed as part of the protection domain, pd.
There is no need to free this mr. This fixes an issue accessing
freed memory in ISER.

Sponsored by:	Mellanox Technologies
2017-11-23 12:25:11 +00:00
kib
873f304292 Remove lint support from system headers and MD x86 headers.
Reviewed by:	dim, jhb
Discussed with:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D13156
2017-11-23 11:40:16 +00:00
kib
11c77eaad8 Kill all descendants of the reaper, even if they are descendants of a
subordinate reaper.

Also, mark reapers when listing pids.

Reported by:	Michael Zuo <muh.muhten@gmail.com>
PR:	223745
Reviewed by:	bapt
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D13183
2017-11-23 11:25:11 +00:00