Commit Graph

249815 Commits

Author SHA1 Message Date
sbruno
bcda73b875 Improve write locking in ip6_ctloutput() with macros.
Submitted by:	Jason Eggleston <jason@eggnet.com>
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D14620
2018-03-22 20:21:05 +00:00
jeff
78fd3a94bb Lock reservations with a dedicated lock in each reservation. Protect the
vmd_free_count with atomics.

This allows us to allocate and free from reservations without the free lock
except where a superpage is allocated from the physical layer, which is
roughly 1/512 of the operations on amd64.

Use the counter api to eliminate cache conention on counters.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14707
2018-03-22 19:21:11 +00:00
jeff
701e47b13b Start witness much earlier in boot so that we can shrink the pend list and
make it more immune to further change.

Reviewed by:	markj, imp (Part of D14707)
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-22 19:11:43 +00:00
jeff
0f9ff62c37 Use read_mostly and alignment tags to eliminate or limit false sharing.
Reviewed by:	markj (Part of D14707)
Sponsored by:	Netflix, Dell/EMC Isilon
2018-03-22 19:06:50 +00:00
dim
d50c586f85 Pull in r327101 from upstream llvm trunk (by Rafael Espindola):
Don't treat .symver as a regular alias definition.

  This patch starts simplifying the handling of .symver.

  For now it just moves the responsibility for creating an alias down to
  the streamer. With that the asm streamer can pass a .symver unchanged,
  which is nice since gas cannot parse "foo@bar = zed".

  In a followup I hope to move the handling down to the writer so that
  we don't need special hacks for avoiding breaking names with @@@ on
  windows.

Pull in r327160 from upstream llvm trunk (by Rafael Espindola):

  Delay creating an alias for @@@.

  With this we only create an alias for @@@ once we know if it should
  use @ or @@. This avoids last minutes renames and hacks to handle MS
  names.

  This only handles the ELF writer. LTO still has issues with @@@
  aliases.

Pull in r327928 from upstream llvm trunk (by Vitaly Buka):

  Object: Move attribute calculation into RecordStreamer. NFC

  Summary: Preparation for D44274

  Reviewers: pcc, espindola

  Subscribers: hiraditya

  Differential Revision: https://reviews.llvm.org/D44276

Pull in r327930 from upstream llvm trunk (by Vitaly Buka):

  Object: Fix handling of @@@ in .symver directive

  Summary:
  name@@@nodename is going to be replaced with name@@nodename if symbols is
  defined in the assembled file, or name@nodename if undefined.
  https://sourceware.org/binutils/docs/as/Symver.html

  Fixes PR36623

  Reviewers: pcc, espindola

  Subscribers: mehdi_amini, hiraditya

  Differential Revision: https://reviews.llvm.org/D44274

Together, these changes fix handling of @@@ in .symver directives when
doing Link Time Optimization.

Reported by:	Shawn Webb <shawn.webb@hardenedbsd.org>
MFC after:	3 months
X-MFC-With:	r327952
2018-03-22 18:58:34 +00:00
kevans
cef0ea8f72 Re-work efidev ordering to fix efirt preloaded by loader on amd64
On amd64, efi_enter calls fpu_kern_enter(). This may not be called until
fpuinitstate has been invoked, resulting in a kernel panic with
efirt_load="YES" in loader.conf(5).

Move fpuinitstate a little earlier in SI_SUB_DRIVERS so that we can squeeze
efirt between it and efirtc at SI_SUB_DRIVERS, SI_ORDER_ANY. efidev must be
after efirt and doesn't really need to be at SI_SUB_DEVFS, so drop it at
SI_SUB_DRIVER, SI_ORDER_ANY.

The not immediately obvious dependency of fpuinitstate by efirt has been
noted in both places.

Discussed with:	kib, andrew
Reported by:	Jakob Alvermark <jakob@alvermark.net>
X-MFC-With:	r330868
2018-03-22 18:24:00 +00:00
gjb
e62fa66429 Remove google_accounts_manager from VM_RC_LIST in the GCE configuration
file, no longer needed.

PR:		221714
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2018-03-22 17:49:27 +00:00
imp
fad66ba05b Drop any recursed taking of Giant once and for all at the top of
kern_reboot(). The shutdown path is now safe to run without Giant.

Discussed with: kib@
Sponsored by: Netflix
2018-03-22 15:34:37 +00:00
andrew
1e9dc78eef Enter into the EFI environment before dereferencing the runtime services
pointer. This may be within the EFI address space and not the FreeBSD
kernel address space.

X-MFC-With:	r330868
Sponsored by:	DARPA, AFRL
2018-03-22 15:32:57 +00:00
andrew
7a8b240cfd Increase the size of the endpoint buffers. They are double buffered so
need to be twice the size.

Sponsored by:	DARPA, AFRL
2018-03-22 15:24:26 +00:00
imp
b1d831d84d Revert r331298
Normally, shutdown_nice() just signals init. However, sometimes it
calls kern_reboot directly. For that case, r331298 dropped the Giant
lock before calling it. This turns out to be incorrect for the more
common case where init exists and we just signal it. Restore the old
behavior. The direct call to kern_reboot() doesn't sync buffers to the
disk, so should work with Giant held, so we don't need to drop locks
here for that.

Noticed by: bde@
Sponsored by: Netflix
2018-03-22 15:11:53 +00:00
asomers
65a0e81cdc tftpd: misc Coverity cleanup in the tests
A bunch of unchecked return values from open(2) and read(2)

Reported by:	Coverity
CID:		1386900, 1386911, 1386926, 1386928, 1386932, 1386942
CID:		1386961, 1386979
MFC after:	8 days
X-MFC-With:	330696
2018-03-22 14:51:05 +00:00
hselasky
ab822fada3 The pci_disable_device() function is also expected to clear the PCI
busmaster. This fixes LinuxKPI compliancy with Linux.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 13:30:35 +00:00
emaste
289be8516c Share Linux errno table with libsysdecode
Requested by:	jhb
Reviewed by:	jhb
Sponsored by:	Turing Robotic Industries Inc.
2018-03-22 12:58:49 +00:00
hselasky
7aa86e0b99 Clear old MSIX IRQ numbers in the LinuxKPI.
When disabling the MSIX IRQ vectors for a PCI device through the
LinuxKPI, make sure any old MSIX IRQ numbers are no longer visible to
the linux_pci_find_irq_dev() function else IRQs can be requested from
the wrong PCI device.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2018-03-22 12:26:27 +00:00
kevans
dae0d51cda Partially revert r328780
efi.4th was added to ObsoleteFiles and disconnected from the build, but not
removed from hte repo. We've since found a mild use for it that makes some
amount of sense, so partially revert r328780 and bring it back to life.

Reported by:	many
X-MFC-With:	r331326
2018-03-22 11:57:59 +00:00
jtl
80fab35507 Bump netstat.1's .Dd after r331347. 2018-03-22 09:43:15 +00:00
jtl
a93bdf6963 Add the "TCP Blackbox Recorder" which we discussed at the developer
summits at BSDCan and BSDCam in 2017.

The TCP Blackbox Recorder allows you to capture events on a TCP connection
in a ring buffer. It stores metadata with the event. It optionally stores
the TCP header associated with an event (if the event is associated with a
packet) and also optionally stores information on the sockets.

It supports setting a log ID on a TCP connection and using this to correlate
multiple connections that share a common log ID.

You can log connections in different modes. If you are doing a coordinated
test with a particular connection, you may tell the system to put it in
mode 4 (continuous dump). Or, if you just want to monitor for errors, you
can put it in mode 1 (ring buffer) and dump all the ring buffers associated
with the connection ID when we receive an error signal for that connection
ID. You can set a default mode that will be applied to a particular ratio
of incoming connections. You can also manually set a mode using a socket
option.

This commit includes only basic probes. rrs@ has added quite an abundance
of probes in his TCP development work. He plans to commit those soon.

There are user-space programs which we plan to commit as ports. These read
the data from the log device and output pcapng files, and then let you
analyze the data (and metadata) in the pcapng files.

Reviewed by:	gnn (previous version)
Obtained from:	Netflix, Inc.
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D11085
2018-03-22 09:40:08 +00:00
lwhsu
be9ac11770 Fix build.
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D14793
2018-03-22 08:32:39 +00:00
rpokala
d58deaf0d2 jedec_dimm: Use correct string length when populating sc->slotid_str
Don't limit the copy to the size of the target string *pointer* (always
4 on 32-bit / 8 on 64-bit). Instead, just use strdup().

Reported by:	Coverity
CID:		1386912
Reviewed by:	cem, imp
MFC after:	1 week
2018-03-22 06:31:05 +00:00
glebius
b14bf58088 Redo r331328. We need to fix not only type but also format. While
here again notice that we are fixing regression from r331106.
2018-03-22 05:26:27 +00:00
glebius
3ef748bde0 Fix LINT-NOINET build initializing local to false. This is
a dead code, since for NOINET build isipv6 is always true,
but this dead code makes it compilable.

Reported by:	rpokala
2018-03-22 05:07:57 +00:00
np
c8d6e27e4b cxgbe(4): Do not read MFG diags information from custom boards.
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 04:42:29 +00:00
kevans
ebe7a261c6 forthloader: Don't break BIOS boots...
I thought I tested this scenario, but clearly I failed to. =(

BIOS boots won't have efi-autoresizecons, so trying to use it as a forth
word fails during include. Use evaluate on "efi-autoresizecons" as a string
instead to move any potential errors to runtime- safely after we've already
checked that we're booting UEFI.

Pointy hat to:	me
Reported by:	cy
2018-03-22 04:16:14 +00:00
np
b14159f18c cxgbe(4): Tunnel congestion drops on a port should be cleared when the
stats for that port are cleared.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-03-22 02:04:57 +00:00
emaste
8c45d03988 Correct signedness bug in drm_modeset_ctl
drm_modeset_ctl() takes a signed in from userland, does a boundscheck,
and then uses it to index into a structure and write to it.  The
boundscheck only checks upper bound, and never checks for nagative
values.  If the int coming from userland is negative [after conversion]
it will bypass the boundscheck, perform a negative index into an array
and write to it, causing memory corruption.

Note that this is in the "old" drm driver; this issue does not exist
in drm2.

Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed by:	cem
MFC after:	1 day
Sponsored by:	The FreeBSD Foundation
2018-03-22 01:00:55 +00:00
cem
a6cf821162 getentropy(3): Fallback to kern.arandom sysctl on older kernels
On older kernels, when userspace program disables SIGSYS, catch ENOSYS and
emulate getrandom(2) syscall with the kern.arandom sysctl (via existing
arc4_sysctl wrapper).

Special care is taken to faithfully emulate EFAULT on NULL pointers, because
sysctl(3) as used by kern.arandom ignores NULL oldp.  (This was caught by
getentropy(3) ATF tests.)

Reported by:	kib
Reviewed by:	kib
Discussed with:	delphij
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D14785
2018-03-21 23:52:37 +00:00
emaste
819725ff5d Fix kernel memory disclosure in drm_infobufs
drm_infobufs() has a structure on the stack, fills it out and copies it
to userland.  There are 2 elements in the struct that are not filled out
and left uninitialized.  This will leak uninitialized kernel stack data
to userland.

Submitted by:	Domagoj Stolfa <ds815@cam.ac.uk>
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	1 day
Security:	Kernel memory disclosure (798)
2018-03-21 23:51:14 +00:00
jamie
9267ddfb50 If a jail parameter isn't found, try loading a related kernel module. 2018-03-21 23:50:46 +00:00
cem
a840a2cd0c Apply r228478 (CTASSERT => _Static_assert()) to stand bootstrap.h
Reported by:	GCC (it doesn't like the unused array)
Sponsored by:	Dell EMC Isilon
2018-03-21 23:46:26 +00:00
emaste
444301c25e Fix kernel memory disclosure in ibcs2_getdents
ibcs2_getdents() copies a dirent structure to userland.  The ibcs2
dirent structure contains a 2 byte pad element.  This element is never
initialized, but copied to userland none-the-less.

Note that ibcs2 has not built on HEAD since r302095.

Submitted by:	Domagoj Stolfa <ds815@cam.ac.uk>
Reported by:	Ilja Van Sprundel <ivansprundel@ioactive.com>
MFC after:	3 days
Security:	Kernel memory disclosure (803)
2018-03-21 23:26:42 +00:00
glebius
293a5a55c1 Fix sysctl types broken in r329612. 2018-03-21 23:21:32 +00:00
emaste
af3be5b4fb Add ) missing from r330297
Sponsored by:	The FreeBSD Foundation
2018-03-21 23:17:26 +00:00
kevans
c4117cad3c Forth version of EFI autoresizing
r331321 delegated autoresizing to an efi-autoresizecons command that
currently is expected to be done in forth/lua prior to drawing anything
useful.

Add the Forth version of the lua addition in r331321, hook efi.4th up to be
installed.

efiboot? was written by dteske@; anything outside of that may be blamed on
me.
2018-03-21 22:01:51 +00:00
markj
40e57cc4d2 Elide the object lock in the common case in vfs_vmio_unwire().
The object lock was only needed when attempting to free B_DIRECT
buffer pages, and for testing for invalid pages (and freeing them
if so). Handle the latter by instead moving invalid pages near the head
of the inactive queue, where they will be reclaimed quickly.

Reviewed by:	alc, kib, jeff
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D14778
2018-03-21 21:15:43 +00:00
jhb
995b92da50 Ensure thread library is initialized in pthread_testcancel().
Call _thr_check_init() before reading curthread in pthread_testcancel().

If a constructor in a library creates a semaphore via sem_init() and
then waits for it via sem_wait(), the program can core dump in
_pthread_testcancel() called from sem_wait().  This is because the
semaphore implementation lives in libc, so the library's constructors
can be run before libthr's constructors.

Reported by:	arichardson
Reviewed by:	kib
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA / AFRL
Differential Revision:	https://reviews.freebsd.org/D14786
2018-03-21 21:13:26 +00:00
glebius
e5ec0a0e43 The net.inet.tcp.nolocaltimewait=1 optimization prevents local TCP connections
from entering the TIME_WAIT state. However, it omits sending the ACK for the
FIN, which results in RST. This becomes a bigger deal if the sysctl
net.inet.tcp.blackhole is 2. In this case RST isn't send, so the other side of
the connection (also local) keeps retransmitting FINs.

To fix that in tcp_twstart() we will not call tcp_close() immediately. Instead
we will allocate a tcptw on stack and proceed to the end of the function all
the way to tcp_twrespond(), to generate the correct ACK, then we will drop the
last PCB reference.

While here, make a few tiny improvements:
- use bools for boolean variable
- staticize nolocaltimewait
- remove pointless acquisiton of socket lock

Reported by:	jtl
Reviewed by:	jtl
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D14697
2018-03-21 20:59:30 +00:00
kevans
048178884e UEFI: Ditch console mode setting, choose optimal GOP mode later in boot
boot1 is too early to be deciding a good resolution. Console modes don't map
cleanly/predictably to actual screen resolutions, and GOP does not reflect
the actual screen resolution after a console mode change. Rip it out.

Add an efi-autoresizecons command to loader to choose an optimal screen
resolution based on the current environment. We'll explicitly execute this
later, preferably before we draw anything of value but after we load config
and pick up any tunables we may need to decide where we're going.

This method also allows us to actually pass the correct framebuffer
information on to the kernel.

UGA autoresizing is not implemented because it doesn't have the kind of mode
enumeration that GOP does. If an interested person with relevant hardware
could get in contact, we can take a look at implementing UGA autoresize.

This effectively "fixes" the breakage caused by r327058, but doesn't
actually set the resolution correctly until the interpreter calls
efi-autoresizcons. The lualoader version of this has been included for
reference; the forth equivalent will follow.

Reviewed by:	imp (with some hestitation), manu
Differential Revision:	https://reviews.freebsd.org/D14788
2018-03-21 20:36:57 +00:00
kevans
23b6d9fc0c lualoader: Use printc when we expect ANSI escape sequences 2018-03-21 18:02:56 +00:00
csjp
5726e5cc3d Document the limitations associated with using the audit syscalls
from jailed process.  These might get implemented in jails in the
future, but for now they are not supported.

Discussed on:   freebsd-security@
Reviewed by:    brueffer@
MFC after:      2 weeks
2018-03-21 17:22:42 +00:00
cem
f5c5ebb133 Import Blake2 algorithms (blake2b, blake2s) from libb2
The upstream repository is on github BLAKE2/libb2.  Files landed in
sys/contrib/libb2 are the unmodified upstream files, except for one
difference:  secure_zero_memory's contents have been replaced with
explicit_bzero() only because the previous implementation broke powerpc
link.  Preferential use of explicit_bzero() is in progress upstream, so
it is anticipated we will be able to drop this diff in the future.

sys/crypto/blake2 contains the source files needed to port libb2 to our
build system, a wrapped (limited) variant of the algorithm to match the API
of our auth_transform softcrypto abstraction, incorporation into the Open
Crypto Framework (OCF) cryptosoft(4) driver, as well as an x86 SSE/AVX
accelerated OCF driver, blake2(4).

Optimized variants of blake2 are compiled for a number of x86 machines
(anything from SSE2 to AVX + XOP).  On those machines, FPU context will need
to be explicitly saved before using blake2(4)-provided algorithms directly.
Use via cryptodev / OCF saves FPU state automatically, and use via the
auth_transform softcrypto abstraction does not use FPU.

The intent of the OCF driver is mostly to enable testing in userspace via
/dev/crypto.  ATF tests are added with published KAT test vectors to
validate correctness.

Reviewed by:	jhb, markj
Obtained from:	github BLAKE2/libb2
Differential Revision:	https://reviews.freebsd.org/D14662
2018-03-21 16:18:14 +00:00
cem
6564a13bd2 cryptosoft(4): Zero plain hash contexts, too
An OCF-naive user program could use these primitives to implement HMAC, for
example.  This would make the freed context sensitive data.

Probably other bzeros in this file should be explicit_bzeros as well.
Future work.

Reviewed by:	jhb, markj
Differential Revision:	https://reviews.freebsd.org/D14662 (minor part of a larger work)
2018-03-21 16:12:07 +00:00
shurd
7f8dee3093 Update copyright per Matthew Macy
"Under my tutelage Nicole did 85% of the work. At the time it seemed
simplest for a number of reasons to put my copyright on it. I now consider
that to have been a mistake."

Submitted by:	Matthew Macy <mmacy@mattmacy.io>
Reviewed by:	shurd
Approved by:	shurd
Differential Revision:	https://reviews.freebsd.org/D14766
2018-03-21 15:57:36 +00:00
jtl
ee029a5d0b If the INP lock is uncontested, avoid taking a reference and jumping
through the lock-switching hoops.

A few of the INP lookup operations that lock INPs after the lookup do
so using this mechanism (to maintain lock ordering):

1. Lock lookup structure.
2. Find INP.
3. Acquire reference on INP.
4. Drop lock on lookup structure.
5. Acquire INP lock.
6. Drop reference on INP.

This change provides a slightly shorter path for cases where the INP
lock is uncontested:

1. Lock lookup structure.
2. Find INP.
3. Try to acquire the INP lock.
4. If successful, drop lock on lookup structure.

Of course, if the INP lock is contested, the functions will need to
revert to the previous way of switching locks safely.

This saves a few atomic operations when the INP lock is uncontested.

Discussed with:	gallatin, rrs, rwatson
MFC after:	2 weeks
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D12911
2018-03-21 15:54:46 +00:00
andrew
77f4d6e932 Use a table to find the endpoint configuration
On the Allwinner SoCs we need to set a custom endpoint configuration. To
allow for this use a table to store the configuration so the attachment
can override it.

Reviewed by:	hselasky
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D14783
2018-03-21 15:17:54 +00:00
kevans
494d4310a6 lualoader: Clear up some possible naming confusion
In the original lualoader project, 'escapef' and 'escapeb' were chosen for
'escape fg' and 'escape bg'. We've carried on this naming convention, and as
our use of attributes grow the likeliness of 'escapeb'/'resetb' being
confused upon glance for 'escape bold'/'reset bold' increases.

Fix this by renaming these four functions to {escape,reset}{fg,bg} rather
than {escape,reset}{f,b} for clarity.

Reported by:	dteske
2018-03-21 15:09:47 +00:00
imp
89c879aab5 Mark psycho interrupts as MPSAFE. It's safe to do so now that we don't
need Giant to call shutdown_nice().
2018-03-21 14:47:17 +00:00
imp
17b7d8a1eb Unlock giant when calling shutdown_nice() 2018-03-21 14:47:12 +00:00
imp
95bc67def2 This is MPSAFE on this platform, so don't take Giant out while running
the callback.
2018-03-21 14:47:08 +00:00
imp
55d52e5cbf These interrupts call shutdown_nice() which should be called Giant
unlocked. Rather than dropping it in the interrupt handler, mark these
handlers as MPSAFE.
2018-03-21 14:47:03 +00:00