Commit Graph

249829 Commits

Author SHA1 Message Date
mav
3b4d2dc7e3 MFV r331407: 9213 zfs: sytem typo
illumos/illumos-gate@edc8ef7d92

Reviewed by: C Fraire <cfraire@me.com>
Reviewed by: Andy Fiddaman <omnios@citrus-it.co.uk>
Approved by: Joshua M. Clulow <josh@sysmgr.org>
Author: Toomas Soome <tsoome@me.com>
2018-03-23 02:30:29 +00:00
mav
1c678a8460 MFV r331405: 9084 spa_*_ashift must ignore spare devices
illumos/illumos-gate@b037f3dbd6

Reviewed by: Prashanth Sreenivasa <pks@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Prakash Surya <prakash.surya@delphix.com>
2018-03-23 02:24:52 +00:00
mav
a2335dfaf9 MFV r331400: 8484 Implement aggregate sum and use for arc counters
In pursuit of improving performance on multi-core systems, we should
implement fanned-out counters and use them to improve the performance of
some of the arc statistics. These stats are updated extremely frequently,
and can consume a significant amount of CPU time.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
2018-03-23 02:15:05 +00:00
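
The "fanned out" counter idea described in this commit can be illustrated with a minimal userland sketch. This is not the aggsum code from the commit: the names fc_slot, fanned_counter, fc_init, fc_add and fc_sum, the slot count, and the 64-byte cache-line size are all assumptions made for illustration. Each writer updates its own cache-line-sized slot, and a reader sums the slots only when a total is needed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FC_NSLOTS	8	/* hypothetical slot count, e.g. one per CPU */
#define FC_CACHE_LINE	64	/* assumed cache-line size in bytes */

/* One slot per cache line, so writers on different slots do not collide. */
struct fc_slot {
	_Alignas(FC_CACHE_LINE) atomic_int_fast64_t value;
};

struct fanned_counter {
	struct fc_slot slots[FC_NSLOTS];
};

void
fc_init(struct fanned_counter *c)
{
	assert(c != NULL);
	for (size_t i = 0; i < FC_NSLOTS; i++)
		atomic_init(&c->slots[i].value, 0);
}

/* Frequent path: touch only the slot selected by the caller's hint. */
void
fc_add(struct fanned_counter *c, unsigned int hint, int64_t delta)
{
	atomic_fetch_add_explicit(&c->slots[hint % FC_NSLOTS].value, delta,
	    memory_order_relaxed);
}

/* Rare path: walk every slot to build the total. */
int64_t
fc_sum(struct fanned_counter *c)
{
	int64_t total = 0;

	for (size_t i = 0; i < FC_NSLOTS; i++)
		total += atomic_load_explicit(&c->slots[i].value,
		    memory_order_relaxed);
	return (total);
}
```

The trade-off in this sketch is that updates are cheap and reads are comparatively expensive, which suits statistics that are written far more often than they are read.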
jhibbits
e969264bab Debug interrupts aren't instruction traps
The EXC_DEBUG type is akin to the MPC74xx "Instruction Breakpoint" trap.
Don't treat it as a trap instruction.
2018-03-23 00:40:08 +00:00
mav
555f9563c9 8484 Implement aggregate sum and use for arc counters
In pursuit of improving performance on multi-core systems, we should
implement fanned-out counters and use them to improve the performance of
some of the arc statistics. These stats are updated extremely frequently,
and can consume a significant amount of CPU time.

Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Approved by: Dan McDonald <danmcd@joyent.com>
Author: Paul Dagnelie <pcd@delphix.com>
2018-03-23 00:20:42 +00:00
sbruno
622ba6f45a Refactor ip6_getpcbopt() for better locking and memory management
Created GET_PKTOPT_EXT_HDR() and GET_PKTOPT_SOCKADDR() macros to
handle safely fetching options from in6p_outputopts, including
properly dealing with in6p locking and preparing memory for
sooptcopyout().

Changed the function signature of ip6_getpcbopt() to allow the
function to acquire and release locks on in6p as needed.

Submitted by:	Jason Eggleston <jason@eggnet.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14619
2018-03-22 23:34:48 +00:00
sbruno
57df63d5af Simple locking fixes in ip_ctloutput, ip6_ctloutput, rip_ctloutput.
Submitted by:	Jason Eggleston <jason@eggnet.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14624
2018-03-22 22:29:32 +00:00
landonf
67ac67c3b6 Add missing NULL checks when calling malloc(M_NOWAIT) in
bhnd_nv_strdup/bhnd_nv_strndup.

If malloc(9) failed during initial bhnd(4) attach, while allocating the root
NVRAM path string ("/"), the returned NULL pointer would be passed as the
destination to memcpy().

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
2018-03-22 22:13:46 +00:00
sbruno
b74ecf8d2a Handle locking and memory safety for IPV6_PATHMTU in ip6_ctloutput().
Submitted by:	Jason Eggleston <jason@eggnet.com>
Reviewed by:	ae
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14622
2018-03-22 21:18:34 +00:00
kib
3fd81cd03f Do not send signals to init directly from shutdown_nice(9), do it from
the task context.

shutdown_nice() is used from the fast interrupt handlers, mostly for
console drivers, where we cannot lock blockable locks.  Schedule the
task in the fast queue to send the signal from the proper context.

Reviewed by:	imp
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-03-22 20:47:25 +00:00
kib
d93ae96f8a Fixes for ptrace(PT_GETXSTATE_INFO) related to the padding in struct
ptrace_xstate_info.

struct ptrace_xstate_info has a 64bit member but ends with a 32bit
one. As a result, on amd64 there is 32bit padding at the end, but not
on i386.

We must clear the padding before doing the copyout. For compat32 case,
we must copyout the structure which does not have the padding at the
end.  The latter fixes 32bit gdb display of the YMM registers when
running on amd64 kernel.

Reported by:	Vlad Tsyrklevich
Reviewed by:	brooks (previous version)
Sponsored by:	The FreeBSD Foundation
admbugs:	765
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D14794
2018-03-22 20:44:27 +00:00
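
The padding problem described in this commit can be sketched in userland C. The struct xinfo below and its field names are hypothetical stand-ins, not the real struct ptrace_xstate_info, and memset() stands in for the kernel's zeroing before copyout. The point is only that a structure with a 64-bit member followed by a 32-bit member may carry tail padding, and that zeroing the whole object before filling it keeps stale bytes out of the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 64-bit member first, 32-bit member last. */
struct xinfo {
	uint64_t feature_mask;
	uint32_t state_size;
	/* On ABIs that align uint64_t to 8 bytes, 4 padding bytes follow. */
};

/* Zero the whole object first, then fill in the named members. */
void
xinfo_fill(struct xinfo *out, uint64_t mask, uint32_t size)
{
	assert(out != NULL);
	memset(out, 0, sizeof(*out));
	out->feature_mask = mask;
	out->state_size = size;
}

/* Number of tail padding bytes on the current ABI (may be 0). */
size_t
xinfo_tail_padding(void)
{
	return (sizeof(struct xinfo) -
	    (offsetof(struct xinfo, state_size) + sizeof(uint32_t)));
}
```

On a typical amd64 build xinfo_tail_padding() would be expected to report 4 and on a typical i386 build 0, which matches the ABI difference the commit message describes.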
sbruno
bcda73b875 Improve write locking in ip6_ctloutput() with macros.
Submitted by:	Jason Eggleston <jason@eggnet.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14620
2018-03-22 20:21:05 +00:00
jeff
78fd3a94bb Lock reservations with a dedicated lock in each reservation. Protect the
vmd_free_count with atomics.

This allows us to allocate and free from reservations without the free lock
except where a superpage is allocated from the physical layer, which is
roughly 1/512 of the operations on amd64.

Use the counter api to eliminate cache contention on counters.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14707
2018-03-22 19:21:11 +00:00
jeff
701e47b13b Start witness much earlier in boot so that we can shrink the pend list and
make it more immune to further change.

Reviewed by:	markj, imp (Part of D14707)
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-22 19:11:43 +00:00
jeff
0f9ff62c37 Use read_mostly and alignment tags to eliminate or limit false sharing.
Reviewed by:	markj (Part of D14707)
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-22 19:06:50 +00:00
dim
d50c586f85 Pull in r327101 from upstream llvm trunk (by Rafael Espindola):
Don't treat .symver as a regular alias definition.

  This patch starts simplifying the handling of .symver.

  For now it just moves the responsibility for creating an alias down to
  the streamer. With that the asm streamer can pass a .symver unchanged,
  which is nice since gas cannot parse "foo@bar = zed".

  In a followup I hope to move the handling down to the writer so that
  we don't need special hacks for avoiding breaking names with @@@ on
  windows.

Pull in r327160 from upstream llvm trunk (by Rafael Espindola):

  Delay creating an alias for @@@.

  With this we only create an alias for @@@ once we know if it should
  use @ or @@. This avoids last-minute renames and hacks to handle MS
  names.

  This only handles the ELF writer. LTO still has issues with @@@
  aliases.

Pull in r327928 from upstream llvm trunk (by Vitaly Buka):

  Object: Move attribute calculation into RecordStreamer. NFC

  Summary: Preparation for D44274

  Reviewers: pcc, espindola

  Subscribers: hiraditya

  Differential Revision: https://reviews.llvm.org/D44276

Pull in r327930 from upstream llvm trunk (by Vitaly Buka):

  Object: Fix handling of @@@ in .symver directive

  Summary:
  name@@@nodename is going to be replaced with name@@nodename if the symbol is
  defined in the assembled file, or name@nodename if undefined.
  https://sourceware.org/binutils/docs/as/Symver.html

  Fixes PR36623

  Reviewers: pcc, espindola

  Subscribers: mehdi_amini, hiraditya

  Differential Revision: https://reviews.llvm.org/D44274

Together, these changes fix handling of @@@ in .symver directives when
doing Link Time Optimization.

Reported by:	Shawn Webb <shawn.webb@hardenedbsd.org>
MFC after:	3 months
X-MFC-With:	r327952
2018-03-22 18:58:34 +00:00
kevans
cef0ea8f72 Re-work efidev ordering to fix efirt preloaded by loader on amd64
On amd64, efi_enter calls fpu_kern_enter(). This may not be called until
fpuinitstate has been invoked, resulting in a kernel panic with
efirt_load="YES" in loader.conf(5).

Move fpuinitstate a little earlier in SI_SUB_DRIVERS so that we can squeeze
efirt between it and efirtc at SI_SUB_DRIVERS, SI_ORDER_ANY. efidev must be
after efirt and doesn't really need to be at SI_SUB_DEVFS, so drop it at
SI_SUB_DRIVERS, SI_ORDER_ANY.

The not immediately obvious dependency of efirt on fpuinitstate has been
noted in both places.

Discussed with:	kib, andrew
Reported by:	Jakob Alvermark <jakob@alvermark.net>
X-MFC-With:	r330868
2018-03-22 18:24:00 +00:00
gjb
e62fa66429 Remove google_accounts_manager from VM_RC_LIST in the GCE configuration
file, no longer needed.

PR:		221714
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2018-03-22 17:49:27 +00:00
imp
fad66ba05b Drop any recursed taking of Giant once and for all at the top of
kern_reboot(). The shutdown path is now safe to run without Giant.

Discussed with: kib@
Sponsored by: Netflix
2018-03-22 15:34:37 +00:00
andrew
1e9dc78eef Enter into the EFI environment before dereferencing the runtime services
pointer. This may be within the EFI address space and not the FreeBSD
kernel address space.

X-MFC-With:	r330868
Sponsored by:	DARPA, AFRL
2018-03-22 15:32:57 +00:00
andrew
7a8b240cfd Increase the size of the endpoint buffers. They are double buffered so
need to be twice the size.

Sponsored by:	DARPA, AFRL
2018-03-22 15:24:26 +00:00
imp
b1d831d84d Revert r331298
Normally, shutdown_nice() just signals init. However, sometimes it
calls kern_reboot directly. For that case, r331298 dropped the Giant
lock before calling it. This turns out to be incorrect for the more
common case where init exists and we just signal it. Restore the old
behavior. The direct call to kern_reboot() doesn't sync buffers to the
disk, so should work with Giant held, so we don't need to drop locks
here for that.

Noticed by: bde@
Sponsored by: Netflix
2018-03-22 15:11:53 +00:00
asomers
65a0e81cdc tftpd: misc Coverity cleanup in the tests
A bunch of unchecked return values from open(2) and read(2)

Reported by:	Coverity
CID:		1386900, 1386911, 1386926, 1386928, 1386932, 1386942
CID:		1386961, 1386979
MFC after:	8 days
X-MFC-With:	330696
2018-03-22 14:51:05 +00:00
hselasky
ab822fada3 The pci_disable_device() function is also expected to clear the PCI
busmaster. This fixes LinuxKPI compliancy with Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 13:30:35 +00:00
emaste
289be8516c Share Linux errno table with libsysdecode
Requested by:	jhb
Reviewed by:	jhb
Sponsored by:	Turing Robotic Industries Inc.
2018-03-22 12:58:49 +00:00
hselasky
7aa86e0b99 Clear old MSIX IRQ numbers in the LinuxKPI.
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 12:26:27 +00:00
kevans
dae0d51cda Partially revert r328780
efi.4th was added to ObsoleteFiles and disconnected from the build, but not
removed from the repo. We've since found a mild use for it that makes some
amount of sense, so partially revert r328780 and bring it back to life.

Reported by:	many
X-MFC-With:	r331326
2018-03-22 11:57:59 +00:00
jtl
80fab35507 Bump netstat.1's .Dd after r331347.
2018-03-22 09:43:15 +00:00
jtl
a93bdf6963 Add the "TCP Blackbox Recorder" which we discussed at the developer
summits at BSDCan and BSDCam in 2017.

The TCP Blackbox Recorder allows you to capture events on a TCP connection
in a ring buffer. It stores metadata with the event. It optionally stores
the TCP header associated with an event (if the event is associated with a
packet) and also optionally stores information on the sockets.

It supports setting a log ID on a TCP connection and using this to correlate
multiple connections that share a common log ID.

You can log connections in different modes. If you are doing a coordinated
test with a particular connection, you may tell the system to put it in
mode 4 (continuous dump). Or, if you just want to monitor for errors, you
can put it in mode 1 (ring buffer) and dump all the ring buffers associated
with the connection ID when we receive an error signal for that connection
ID. You can set a default mode that will be applied to a particular ratio
of incoming connections. You can also manually set a mode using a socket
option.

This commit includes only basic probes. rrs@ has added quite an abundance
of probes in his TCP development work. He plans to commit those soon.

There are user-space programs which we plan to commit as ports. These read
the data from the log device and output pcapng files, and then let you
analyze the data (and metadata) in the pcapng files.

Reviewed by:	gnn (previous version)
Obtained from:	Netflix, Inc.
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D11085
2018-03-22 09:40:08 +00:00
lwhsu
be9ac11770 Fix build.
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D14793
2018-03-22 08:32:39 +00:00
rpokala
d58deaf0d2 jedec_dimm: Use correct string length when populating sc->slotid_str
Don't limit the copy to the size of the target string *pointer* (always
4 on 32-bit / 8 on 64-bit). Instead, just use strdup().

Reported by:	Coverity
CID:		1386912
Reviewed by:	cem, imp
MFC after:	1 week
2018-03-22 06:31:05 +00:00
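
The bug class fixed in this commit, bounding a copy by the size of a pointer instead of the length of the string, can be shown with a small userland sketch. The function names copy_pointer_sized and copy_whole_string are hypothetical, and the second function is a hand-written equivalent of a strdup-style copy rather than the kernel routine the commit actually calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Buggy pattern: the copy is bounded by the size of the destination
 * *pointer* (4 or 8 bytes), so longer strings are silently truncated.
 */
char *
copy_pointer_sized(const char *src)
{
	size_t len, limit, n;
	char *dst;

	assert(src != NULL);
	len = strlen(src);
	limit = sizeof(char *);		/* size of a pointer, not the string */
	n = len < limit ? len : limit;
	dst = calloc(len + 1, 1);	/* zero-filled, so always terminated */
	if (dst == NULL)
		return (NULL);
	memcpy(dst, src, n);
	return (dst);
}

/* Fixed pattern: size the allocation and the copy from the string itself. */
char *
copy_whole_string(const char *src)
{
	size_t len;
	char *dst;

	assert(src != NULL);
	len = strlen(src) + 1;		/* include the terminating NUL */
	dst = malloc(len);
	if (dst == NULL)
		return (NULL);
	memcpy(dst, src, len);
	return (dst);
}
```

With any source string longer than a pointer, the first function returns a truncated copy and the second returns the full string.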
glebius
b14bf58088 Redo r331328. We need to fix not only type but also format. While
here again notice that we are fixing regression from r331106.
2018-03-22 05:26:27 +00:00
glebius
3ef748bde0 Fix LINT-NOINET build by initializing local to false. This is
dead code, since for NOINET build isipv6 is always true,
but this dead code makes it compilable.

Reported by:	rpokala
2018-03-22 05:07:57 +00:00
np
c8d6e27e4b cxgbe(4): Do not read MFG diags information from custom boards.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 04:42:29 +00:00
kevans
ebe7a261c6 forthloader: Don't break BIOS boots...
I thought I tested this scenario, but clearly I failed to. =(

BIOS boots won't have efi-autoresizecons, so trying to use it as a forth
word fails during include. Use evaluate on "efi-autoresizecons" as a string
instead to move any potential errors to runtime, safely after we've already
checked that we're booting UEFI.

Pointy hat to:	me
Reported by:	cy
2018-03-22 04:16:14 +00:00
np
b14159f18c cxgbe(4): Tunnel congestion drops on a port should be cleared when the
stats for that port are cleared.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 02:04:57 +00:00
emaste
8c45d03988 Correct signedness bug in drm_modeset_ctl
drm_modeset_ctl() takes a signed int from userland, does a boundscheck,
and then uses it to index into a structure and write to it.  The
boundscheck only checks upper bound, and never checks for negative
values.  If the int coming from userland is negative [after conversion]
it will bypass the boundscheck, perform a negative index into an array
and write to it, causing memory corruption.

Note that this is in the "old" drm driver; this issue does not exist
in drm2.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:	cem
MFC after:	1 day
Sponsored by:	The FreeBSD Foundation
2018-03-22 01:00:55 +00:00
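
The signedness problem described in this commit can be sketched as follows. This is not the drm_modeset_ctl() code: index_ok_upper_only, index_ok and table_store are hypothetical names, and the table is a plain int array. The sketch only contrasts an upper-bound-only check, which a negative index passes, with a check of both bounds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Upper-bound-only check: a negative idx is accepted. */
int
index_ok_upper_only(int idx, int count)
{
	return (idx < count);
}

/* Check both bounds before idx is used as an array index. */
int
index_ok(int idx, int count)
{
	return (idx >= 0 && idx < count);
}

/* Store value at table[idx], or return EINVAL if idx is out of range. */
int
table_store(int *table, int count, int idx, int value)
{
	assert(table != NULL);
	if (!index_ok(idx, count))
		return (EINVAL);
	table[idx] = value;
	return (0);
}
```

An alternative design would be to take the index as an unsigned type, so that a single upper-bound comparison covers both cases.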
cem
a6cf821162 getentropy(3): Fallback to kern.arandom sysctl on older kernels
On older kernels, when a userspace program disables SIGSYS, catch ENOSYS and
emulate getrandom(2) syscall with the kern.arandom sysctl (via existing
arc4_sysctl wrapper).

Special care is taken to faithfully emulate EFAULT on NULL pointers, because
sysctl(3) as used by kern.arandom ignores NULL oldp.  (This was caught by
getentropy(3) ATF tests.)

Reported by:	kib
Reviewed by:	kib
Discussed with:	delphij
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14785
2018-03-21 23:52:37 +00:00
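
The fallback pattern described in this commit can be sketched with stand-in functions. Nothing below is the libc implementation: entropy_fn, stub_enosys, stub_fill and entropy_with_fallback are hypothetical names, and the two stubs merely imitate a syscall that is missing on an older kernel and a fallback source that, like the sysctl path in the commit message, does not itself reject a NULL buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*entropy_fn)(void *buf, size_t len);

/* Stand-in for a syscall that the running kernel does not provide. */
int
stub_enosys(void *buf, size_t len)
{
	(void)buf;
	(void)len;
	errno = ENOSYS;
	return (-1);
}

/* Stand-in for the fallback source; fills buf with a fixed byte. */
int
stub_fill(void *buf, size_t len)
{
	memset(buf, 0x5a, len);
	return (0);
}

/*
 * Try the primary source; on ENOSYS use the fallback instead. The
 * fallback is assumed not to report a NULL buffer, so EFAULT is
 * emulated here before it is called.
 */
int
entropy_with_fallback(entropy_fn primary, entropy_fn fallback, void *buf,
    size_t len)
{
	assert(primary != NULL && fallback != NULL);
	if (primary(buf, len) == 0)
		return (0);
	if (errno != ENOSYS)
		return (-1);
	if (buf == NULL && len != 0) {
		errno = EFAULT;
		return (-1);
	}
	return (fallback(buf, len));
}
```

Doing the NULL check in the wrapper keeps the error behaviour the same whichever path serves the request.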
emaste
819725ff5d Fix kernel memory disclosure in drm_infobufs
drm_infobufs() has a structure on the stack, fills it out and copies it
to userland.  There are 2 elements in the struct that are not filled out
and left uninitialized.  This will leak uninitialized kernel stack data
to userland.

Submitted by:	Domagoj Stolfa <ds815@cam.ac.uk>
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	1 day
Security:	Kernel memory disclosure (798)
2018-03-21 23:51:14 +00:00
jamie
9267ddfb50 If a jail parameter isn't found, try loading a related kernel module.
2018-03-21 23:50:46 +00:00
cem
a840a2cd0c Apply r228478 (CTASSERT => _Static_assert()) to stand bootstrap.h
Reported by:	GCC (it doesn't like the unused array)
Sponsored by:	Dell EMC Isilon
2018-03-21 23:46:26 +00:00
emaste
444301c25e Fix kernel memory disclosure in ibcs2_getdents
ibcs2_getdents() copies a dirent structure to userland.  The ibcs2
dirent structure contains a 2 byte pad element.  This element is never
initialized, but copied to userland nonetheless.

Note that ibcs2 has not built on HEAD since r302095.

Submitted by:	Domagoj Stolfa <ds815@cam.ac.uk>
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	3 days
Security:	Kernel memory disclosure (803)
2018-03-21 23:26:42 +00:00
glebius
293a5a55c1 Fix sysctl types broken in r329612.
2018-03-21 23:21:32 +00:00
emaste
af3be5b4fb Add ) missing from r330297
Sponsored by:	The FreeBSD Foundation
2018-03-21 23:17:26 +00:00
kevans
c4117cad3c Forth version of EFI autoresizing
r331321 delegated autoresizing to an efi-autoresizecons command that
currently is expected to be done in forth/lua prior to drawing anything
useful.

Add the Forth version of the lua addition in r331321, hook efi.4th up to be
installed.

efiboot? was written by dteske@; anything outside of that may be blamed on
me.
2018-03-21 22:01:51 +00:00
markj
40e57cc4d2 Elide the object lock in the common case in vfs_vmio_unwire().
The object lock was only needed when attempting to free B_DIRECT
buffer pages, and for testing for invalid pages (and freeing them
if so). Handle the latter by instead moving invalid pages near the head
of the inactive queue, where they will be reclaimed quickly.

Reviewed by:	alc, kib, jeff
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D14778
2018-03-21 21:15:43 +00:00
jhb
995b92da50 Ensure thread library is initialized in pthread_testcancel().
Call _thr_check_init() before reading curthread in pthread_testcancel().

If a constructor in a library creates a semaphore via sem_init() and
then waits for it via sem_wait(), the program can core dump in
_pthread_testcancel() called from sem_wait().  This is because the
semaphore implementation lives in libc, so the library's constructors
can be run before libthr's constructors.

Reported by:	arichardson
Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D14786
2018-03-21 21:13:26 +00:00
glebius
e5ec0a0e43 The net.inet.tcp.nolocaltimewait=1 optimization prevents local TCP connections
from entering the TIME_WAIT state. However, it omits sending the ACK for the
FIN, which results in RST. This becomes a bigger deal if the sysctl
net.inet.tcp.blackhole is 2. In this case RST isn't sent, so the other side of
the connection (also local) keeps retransmitting FINs.

To fix that in tcp_twstart() we will not call tcp_close() immediately. Instead
we will allocate a tcptw on stack and proceed to the end of the function all
the way to tcp_twrespond(), to generate the correct ACK, then we will drop the
last PCB reference.

While here, make a few tiny improvements:
- use bools for boolean variables
- staticize nolocaltimewait
- remove pointless acquisition of socket lock

Reported by:	jtl
Reviewed by:	jtl
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D14697
2018-03-21 20:59:30 +00:00
kevans
048178884e UEFI: Ditch console mode setting, choose optimal GOP mode later in boot
boot1 is too early to be deciding a good resolution. Console modes don't map
cleanly/predictably to actual screen resolutions, and GOP does not reflect
the actual screen resolution after a console mode change. Rip it out.

Add an efi-autoresizecons command to loader to choose an optimal screen
resolution based on the current environment. We'll explicitly execute this
later, preferably before we draw anything of value but after we load config
and pick up any tunables we may need to decide where we're going.

This method also allows us to actually pass the correct framebuffer
information on to the kernel.

UGA autoresizing is not implemented because it doesn't have the kind of mode
enumeration that GOP does. If an interested person with relevant hardware
could get in contact, we can take a look at implementing UGA autoresize.

This effectively "fixes" the breakage caused by r327058, but doesn't
actually set the resolution correctly until the interpreter calls
efi-autoresizecons. The lualoader version of this has been included for
reference; the forth equivalent will follow.

Reviewed by:	imp (with some hesitation), manu
Differential Revision:	https://reviews.freebsd.org/D14788
2018-03-21 20:36:57 +00:00
kevans
23b6d9fc0c lualoader: Use printc when we expect ANSI escape sequences
2018-03-21 18:02:56 +00:00