248218 Commits

Author SHA1 Message Date
jhb
6db893ef6a Remove limitation of 6 arguments for makecontext() on mips.
This implementation spills additional arguments on the stack so works
fine with more than 6 arguments.  I believe the check was just copied
over from sparc64 (which doesn't support spilling onto the stack)

Sponsored by:	DARPA / AFRL
2018-01-31 18:00:23 +00:00
jhb
b61009fc63 Remove bogus checks against NCARGS.
NCARGS isn't a limit on the number of arguments to pass to a function,
but the number of bytes that can be consumed by arguments to exec.  As
such, it is not suitable for a limit on the count of arguments passed
to makecontext().

Sponsored by:	DARPA / AFRL
2018-01-31 17:57:59 +00:00
jhb
ef323a885c Clarify that the additional arguments to makecontext() are of type int.
MFC after:	1 week
Sponsored by:	DARPA / AFRL
2018-01-31 17:56:36 +00:00
jhb
d4fdf34d06 Consistently use 16-byte alignment for MIPS N32 and N64.
- Add a new <machine/abi.h> header to hold constants shared between C
  and assembly such as CALLFRAME_SZ.
- Add a new STACK_ALIGN constant to <machine/abi.h> and use it to
  replace hardcoded constants in the kernel and makecontext().  As a
  result of this, ensure the stack pointer on N32 and N64 is 16-byte
  aligned for N32 and N64 after exec(), after pthread_create(), and
  when sending signals rather than 8-byte aligned.

Reviewed by:	jmallett
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D13875
2018-01-31 17:36:39 +00:00
kib
0eb8b964d2 When switching IBRS on, also enable STIBP (Single Thread Indirect
Branch Predictors) mitigation.

DOcument 336996-001 promises that CPUs which implement IBRS but not
STIBP silently ignore setting of the bit instead of trapping.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-31 16:56:02 +00:00
kib
637f4cf41d Expand IBRS TLA in sysctl help lines.
Requested by:	bz
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-31 16:54:05 +00:00
avg
22ad2342b1 zfs_rezget: drop cached pages before doing anything else
We did that in the case of success to prevent the use of stale cached
data, but it makes even less sense to keep the cached data when we fail.

Ideally, we should call vgone() on the vnode in the case of zfs_rezget
failure, but the current lock order prevents us from doing that.

The change also rearranges the order of unlinked check and the size
change check.

While there, add missing SET_ERROR in one of the error paths.

MFC after:	2 weeks
2018-01-31 14:44:51 +00:00
kib
01b52fdebb IBRS support, AKA Spectre hardware mitigation.
It is coded according to the Intel document 336996-001, reading of the
patches posted on lkml, and some additional consultations with Intel.

For existing processors, you need a microcode update which adds IBRS
CPU features, and to manually enable it by setting the tunable/sysctl
hw.ibrs_disable to 0.  Current status can be checked in sysctl
hw.ibrs_active.  The mitigation might be inactive if the CPU feature
is not patched in, or if CPU reports that IBRS use is not required, by
IA32_ARCH_CAP_IBRS_ALL bit.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14029
2018-01-31 14:36:27 +00:00
kib
9c0b8085dc Do not enable PTI when IA32_ARCH_CAP_RDCL_NO bit is set.
Intel document 336996-001 claims that this will be the way to inform
about Meltdown correction.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-01-31 14:25:42 +00:00
hselasky
e90574eea9 Properly implement the cond_resched() function macro in the LinuxKPI.
MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-31 13:40:36 +00:00
avg
bedef9326b vmm/svm: post LAPIC interrupts using event injection, not virtual interrupts
The virtual interrupt method uses V_IRQ, V_INTR_PRIO, and V_INTR_VECTOR
fields of VMCB to inject a virtual interrupt into a guest VM.  This
method has many advantages over the direct event injection as it
offloads all decisions of whether and when the interrupt can be
delivered to the guest.  But with a purely software emulated vAPIC the
advantage is also a problem.  The problem is that the hypervisor does
not have any precise control over when the interrupt is actually
delivered to the guest (or a notification about that).  Because of that
the hypervisor cannot update the interrupt vector in IRR and ISR in the
same way as real hardware would.  The hypervisor becomes aware that the
interrupt is being serviced only upon the first VMEXIT after the
interrupt is delivered.  This creates a window between the actual
interrupt delivery and the update of IRR and ISR.  That means that IRR
and ISR might not be correctly set up to the point of the
end-of-interrupt signal.

The described deviation has been observed to cause an interrupt loss in
the following scenario.  vCPU0 posts an inter-processor interrupt to
vCPU1.  The interrupt is injected as a virtual interrupt by the
hypervisor.  The interrupt is delivered to a guest and an interrupt
handler is invoked.  The handler performs a requested action and
acknowledges the request by modifying a global variable.  So far, there
is no VMEXIT and the hypervisor is unaware of the events.  Then, vCPU0
notices the acknowledgment and sends another IPI with the same vector.
The IPI gets collapsed into the previous IPI in the IRR of vCPU1.  Only
after that a VMEXIT of vCPU1 occurs.  At that time the vector is cleared
in the IRR and is set in the ISR.  vCPU1 has vAPIC state as if the
second IPI has never been sent.
The scenario is impossible on the real hardware because IRR and ISR are
updated just before the interrupt handler gets started.

I saw several possibilities of fixing the problem.  One is to intercept
the virtual interrupt delivery to update IRR and ISR at the right
moment.  The other is to deliver the LAPIC interrupts using the event
injection, same as legacy interrupts.  I opted to use the latter
approach for several reasons.  It's equivalent to what VMM/Intel does
(in !VMX case).  It appears to be what VirtualBox and KVM do.  The code
is already there (to support legacy interrupts).

Another possibility was to use a special intermediate state for a vector
after it is injected using a virtual interrupt and before it is known
whether it was accepted or is still pending.
That approach was implemented in https://reviews.freebsd.org/D13828
That method is more complex and does not have any clear advantage.

Please see sections 15.20 and 15.21.4 of "AMD64 Architecture
Programmer's Manual Volume 2: System Programming" (publication 24593,
revision 3.29) for comparison between event injection and virtual
interrupt injection.

PR:		215972
Reported by:	ajschot@hotmail.com, grehan
Tested by:	anish, grehan,  Nils Beyer <nbe@renzel.net>
Reviewed by:	anish, grehan
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D13780
2018-01-31 11:14:26 +00:00
adrian
12318046d5 [arswitch] Fix ATU programming on the AR8327 switch.
Doing a flush actually requires setting this bit.
2018-01-31 07:37:33 +00:00
adrian
d15af82b1a [arswitch] Fix ATU flushing on AR8216/AR8316 and most of the later chips.
The switch hardware requires this bit to be set in order to kick start the
actual ATU update.  This was being masked on some chips by the learning
programming (what to do when a MAC address moves, hash table collision, etc)
which is currently inconsistent between chips.

Tested:

* AR9344 SoC (AR7240 style switch internal)
2018-01-31 07:36:51 +00:00
adrian
b36494644e [arswitch] add a new debug section for upcoming address table management. 2018-01-31 07:20:34 +00:00
wma
3809160605 PowerNV: fix compilation on non-NV platforms
Submitted by:          Wojciech Macek <wma@semihalf.com>
Obtained from:         Semihalf
Sponsored by:          IBM, QCM Technologies
2018-01-31 06:42:01 +00:00
imp
086d3e5033 Update stand.h for changes for strto*l
Move prototypes to proper section now that we don't have modified
versions of strtol and strtoul in libsa. Add prototypes for new
strtoll and strtoull. Use prototypes copied from stdlib.h instead of
the old hand-rolled ones.

(I forgot to move this file form my lua branch in r328613)
2018-01-31 05:07:43 +00:00
imp
623ec8479b Move libstand.3 to libsa.3. Update libsa.3 to include functions
recently added. More are likely missing.
2018-01-31 04:29:05 +00:00
imp
87ace50e1e Kill copies of strtol and strtoul. Use the ones that are in libc,
since they suffice. Create xlocale_private.h which provides the most
minimal locale implementation we can get away with. Add strtoll and
strtoull from libc.
2018-01-31 04:29:00 +00:00
imp
d7c2501a9b Move strtold wrapper from strtol.c to its own strtold.c. This code
was written by theraven@ (David Chisnall) entirely, there's no
original Berkeley code left here so just copy his copyright over.
2018-01-31 03:05:14 +00:00
mav
90316ba76c Try to preallocate receive memory early.
We may not have enough contiguous memory later, when NTB connection get
established.  It is quite likely that NTB windows are symmetric and this
allocation remain, but even if not, we will just reallocate it later.

MFC after:	2 weeks
2018-01-31 01:04:36 +00:00
jhb
b5c926aafe Ensure 'name' is not NULL before passing to strcmp().
This avoids a nested page fault when obtaining a stack trace in DDB if
the address from the first frame does not resolve to a known symbol.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-01-30 23:29:27 +00:00
jhb
3df465f54f Export tcp_always_keepalive for use by the Chelsio TOM module.
This used to work by accident with ld.bfd even though always_keepalive
was marked as static. LLD honors static more correctly, so export this
variable properly (including moving it into the tcp_* namespace).

Reviewed by:	bz, emaste
MFC after:	2 weeks
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D14129
2018-01-30 23:01:37 +00:00
asomers
4a4bfa321f zfsd: Don't spare a vdev that's being replaced
If a zfs pool contains a replacing vdev (either created manually by "zpool
replace" or by zfsd(8) via autoreplace by physical path) and then new spares
get added to the pool, zfsd shouldn't use one to replace the drive that is
already being replaced.  That's a waste of resources that just slows down
the rebuild.

PR:		225547
MFC after:	3 weeks
Sponsored by:	Spectra Logic Corp
2018-01-30 21:25:43 +00:00
sbruno
24854d6272 Add missing non-POWERPC case to give the scr value something non-zero.
This fixes the instant reboot of netbooting after r328536 on x86 systems.

Reviewed by:	peter
Sponsored by:	Limelight Networks
2018-01-30 20:00:12 +00:00
emaste
187eb493d5 makesyscalls: permit a range of syscall numbers for UNIMPL
Some ABIs have large gaps in syscall numbers.  Allow gaps to be filled
as ranges of UNIMPL, with an entry like:

248-1023	AUE_NULL	UNIMPL	unimplemented

Reviewed by:	jhb, gnn
Sponsored by:	Turing Robotic Industries Inc.
Differential Revision:	https://reviews.freebsd.org/D14122
2018-01-30 18:29:38 +00:00
emaste
da20a161b2 Pull in r322131 from upstream llvm trunk (by Rafael Espíndola):
Use a MCExpr for the size of MCFillFragment.

  This allows the size to be found during ralaxation. This fixes
  [LLVM] pr35858.

Requested by:	royger
2018-01-30 16:43:20 +00:00
emaste
049894fe59 Pull in r322123 from upstream llvm trunk (by Rafael Espíndola):
Don't create MCFillFragment directly.

  Instead use higher level APIs that take care of most bookkeeping.
2018-01-30 16:42:08 +00:00
emaste
5784379e04 Pull in r322108 from upstream llvm trunk (by Rafael Espíndola):
Make one of the emitFill methods non virtual. NFC.

  This is just preparatory work to fix [LLVM] PR35858.
2018-01-30 16:41:38 +00:00
swills
6f73837b39 Change installer default to not install ports tree
Reviewed by:	gjb, dteske, allanjude, bdrewery, mat
Approved by:	gjb
Differential Revision:	https://reviews.freebsd.org/D14064
2018-01-30 16:34:56 +00:00
hselasky
e8b153d235 Move the mlx5 core device pointer first in the mlx5en priv. This help simplify
checks to recognize own network devices when using mlx5ib. This patch fixes
an issues where mlx5ib fails to recognize mceX network devices for use with
RoCE.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-01-30 12:38:06 +00:00
trasz
f673d5b3e6 Document the new hw.usb.template behaviour.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-01-30 10:10:02 +00:00
trasz
1bb843f344 Make the handler routine for the hw.usb.template sysctl trigger the USB
host to reprobe the bus by switching the USB pull up resistors off and
back on.  In other words - when FreeBSD is configured as a USB device,
changing the sysctl will be immediately noticed by the machine it's
connected to.

Reviewed by:	hselasky@
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2018-01-30 10:08:11 +00:00
manu
9a1d45c825 nfsstat: Add libxo output
Add libxo output support
Merge exp41_intpr and exp_intpr function. The only difference is to print
NFSV4.1 operations in exp41, add a third arguement to control that.
printtitle was set to 1 and don't have a switch, add a -q options to control it.

Reviewed by:	bapt
Sponsored by:	Gandi.net
Differential Revision:	https://reviews.freebsd.org/D14012
2018-01-30 09:59:52 +00:00
mmel
6c722cc3aa Use more verbose panic messages.
MFC after: 2 weeks
2018-01-30 04:06:30 +00:00
mmel
988cb65712 Revert r328511, it was committed with <patch>.diff instead of <patch>.txt as
commit log.
2018-01-30 04:05:03 +00:00
kevans
4ddcdb38c1 stand/fdt: Remove unused write-only new_fdtp, correct comment
MFC after:	3 days
2018-01-30 03:31:40 +00:00
pfg
ee2a2ea1d3 libedit: sort the Makefile in line with NetBSD's version.
NetBSD's libedit has been been cleaned-up considerably so the
non--widecharacter version is no longer an option. Re -sorting the
Makefile should make it easier for some brave soul trying to update it.

No functional change intended.

MFC after:	5 days
2018-01-29 22:38:23 +00:00
fsu
3b5710279c Fix mistake in case of zeroed inode check.
Reported by:	pho
MFC after:	6 months
2018-01-29 22:15:46 +00:00
fsu
0a2aa6292e Add flex_bg/meta_bg features RW support.
Reviewed by:    pfg
MFC after:      6 months

Differential Revision:    https://reviews.freebsd.org/D13964
2018-01-29 21:54:13 +00:00
bdrewery
59f27534cc Don't use an .OBJDIR for 'make sysent'.
Reported by:	emaste, jhb
Sponsored by:	Dell EMC
2018-01-29 19:14:15 +00:00
kevans
da9dad4a1f Remove t_grep:mmap_eof_not_eol test
The test was marked as an expected failure in r320414 after r319971's import
of a newer jemalloc removed an essential feature (opt.redzone) for
reproducing the behavior it was testing. Since then, no way has been found
or demonstrated to reliably test the behavior, so remove the test.

PR:		220309
2018-01-29 18:50:45 +00:00
imp
5f0b41aafe Do the book-keeping on release before we release the reference. The
periph was going away on final release, and then returning and we
started dancing in free memory.

Sponsored by: Netflix
2018-01-29 18:07:14 +00:00
benno
93cb25f62d Remove some duplicated sys/conf/files* entries.
net80211/ieee80211_ageq.c was present twice in sys/conf/files so leave the
correctly sorted one. dev/wpi/if_wpi.c was present in sys/conf/files as well
as sys/conf/files.amd64 and sys/conf/files.i386 so prefer the sys/conf/files
entry.

Reviewed by:	allanjude, rstone
2018-01-29 17:32:30 +00:00
vangyzen
08bd8b9104 ND6: Set the correct state for new neighbor cache entries
Restore state 6.  Many of the UNH tests end up exercising this
state, where we have a new neighbor cache entry and a new link-layer
entry is being created for it.  The link-layer address is currently
unknown so the initial state of the "llentry" should remain initialized
to ND6_LLINFO_NOSTATE so that the ND code will send a solicitation.
Setting this to ND6_LLINFO_STALE implies that the link-level entry
is valid and can be used (but needs to be refreshed via the Neighbor
Unreachability state machine).

https://forums.freebsd.org/threads/64287/

Submitted by:	Farrell Woods <Farrell_Woods@Dell.com>
Reviewed by:	mjoras, dab, ae
MFC after:	1 week
Sponsored by:	Dell EMC
Differential Revision:	https://reviews.freebsd.org/D14059
2018-01-29 16:12:26 +00:00
pfg
f6e629ecd6 pppctl88) Avoid strcpy() copies on overlapping string.
This may lead to unpredicatable behaviour on different platforms or C
library implementations. Use an intermediate variable.

Obtained from:	DragonFlyBSD (git a861a526)
2018-01-29 14:23:44 +00:00
kevans
f3370b7c78 awk(1): Don't install tests at all
Tests were disconnected so that running `make check` in usr.bin/awk did not
have any effect, but CI runs use installed tests. Fully disconnect tests/
from the build for the time being as a short term solutio

Reported by:	lwhsu
2018-01-29 14:15:44 +00:00
kevans
8fbee4f85b libregex: Mark gnuext test as an expected fail
The test was added prematurely as a goal to reach with the GNU extension
functionality, but the functionality has not yet been introduced. Mark it as
an expected fail until that point.
2018-01-29 14:00:33 +00:00
emaste
717143875c lld: Put the header in the first PT_LOAD even if that PT_LOAD has a LMAExpr
The root problem is that we were creating a PT_LOAD just for the header.
That was technically valid, but inconvenient: we should not be making
the ELF discontinuous.

The solution is to allow a section with LMAExpr to be added to a PT_LOAD
if that PT_LOAD doesn't already have a LMAExpr.

LLVM PR:	36017
Obtained from:	LLVM r323625 by Rafael Espindola
2018-01-29 13:55:50 +00:00
emaste
2f81b33ec9 lld: Move LMAOffset from the OutputSection to the PhdrEntry. NFC.
If two sections are in the same PT_LOAD, their relatives offsets,
virtual address and physical addresses are all the same.

[Rafael] initially wanted to have a single global LMAOffset, on the
assumption that every ELF file was in practiced loaded contiguously in
both physical and virtual memory.

Unfortunately that is not the case. The linux kernel has:

  LOAD           0x200000 0xffffffff81000000 0x0000000001000000 0xced000 0xced000 R E 0x200000
  LOAD           0x1000000 0xffffffff81e00000 0x0000000001e00000 0x15f000 0x15f000 RW  0x200000
  LOAD           0x1200000 0x0000000000000000 0x0000000001f5f000 0x01b198 0x01b198 RW  0x200000
  LOAD           0x137b000 0xffffffff81f7b000 0x0000000001f7b000 0x116000 0x1ec000 RWE 0x200000

The delta for all but the third PT_LOAD is the same:
0xffffffff80000000. [Rafael] thinks the 3rd one is a hack for implementing
per cpu data, but we can't break that.

Obtained from:	LLVM r323456 by Rafael Espindola
2018-01-29 13:54:51 +00:00
emaste
8c7b18046d lld: Improve LMARegion handling.
This fixes the crash reported at [LLVM] PR36083.

The issue is that we were trying to put all the sections in the same
PT_LOAD and crashing trying to write past the end of the file.

This also adds accounting for used space in LMARegion, without it all
3 PT_LOADs would have the same physical address.

Obtained from:	LLVM r323449 by Rafael Espindola
2018-01-29 13:52:42 +00:00