Commit Graph

124026 Commits

Author SHA1 Message Date
Gordon Tetlow
89f652d149 Clear stack allocated data structure to prevent kernel memory leak.
Reported by:	Thomas Barabosch, Fraunhofer FKIE
Reviewed by:	wes@
Approved by:	re (implicit)
Approved by:	so
Security:	FreeBSD-EN-18:12.mem
Security:	CVE-2018-17155
2018-09-27 18:39:54 +00:00
John Baldwin
83382d027f Don't clear DR6 for debug exceptions from userland.
This reverts part of r333368.  The attempt to clear DR6 was occuring
too soon as trapsignal() does not pause to let the debugger notice the
SIGTRAP and query DR6.  The signal exchange does not occur until much
later during ast().  As a result, GDB was no longer recognizing
hardware breakpoints and watchpoints on x86.

In addition, any userland programs that want to inspect DR6 in a
SIGTRAP handler don't have a way to do this if we clear DR6 in the
exception handler.

Instead of relying on the kernel to clear DR6, debuggers will have to
explicitly clear it after a trace trap (which they needed to do on
older kernels anyway).

Reviewed by:	kib
Approved by:	re (delphij)
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D17319
2018-09-27 17:33:59 +00:00
Mateusz Guzik
5910b87605 amd64: macroify and mostly depessimize copyinstr
See r338968 for details.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17288
2018-09-27 15:53:36 +00:00
Bjoern A. Zeeb
e15e0e3e4d In in6_pcbpurgeif0() called, e.g., from if_clone_destroy(),
once we have a lock, make sure the inp is not marked freed.
This can happen since the list traversal and locking was
converted to epoch(9).  If the inp is marked "freed", skip it.

This prevents a NULL pointer deref panic later on.

Reported by:	slavash (Mellanox)
Tested by:	slavash (Mellanox)
Reviewed by:	markj (no formal review but caught my unlock mistake)
Approved by:	re (kib)
2018-09-27 15:32:37 +00:00
Mateusz Guzik
3d95cc51bb amd64: mostly depessimize copystr
- remove a forward branch in the common case
- replace xchg + lodsb/stosb loop with simple movs

A simple test on Intel(R) Core(TM) i7-4600U CPU @ 2.10GH copying
/foo/bar/baz in a loop goes from 295715863 ops/s to 465807408.

Further changes are pending.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17281
2018-09-27 15:27:53 +00:00
Mateusz Guzik
0e59ecce47 amd64: clean up copyin/copyout
- move the PSL.AC comment to the fault handler
- stop testing for zero-sized ops. after several minutes of package
building there were no copyin calls with zero bytes and very few
copyout. the semantic of returning 0 in this case is preserved
- shorten exit paths by clearing %eax earlier
- replace xchg with 3 movs. this is what compilers do. a naive
benchmark on EPYC suggests about 1% increase in thoughput thanks to
this change.
- remove the useless movb %cl,%al from copyout. it looks like a
leftover from many years ago

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17286
2018-09-27 15:24:16 +00:00
Mateusz Guzik
a8e3f99ec1 amd64: implement memcmp in assembly
Both the in-kernel C variant and libc asm variant have very poor performance.
The former compiles to a single byte comparison loop, which breaks down even
for small sizes. The latter uses rep cmpsq/b which turn out to have very poor
throughput and are slower than a hand-coded 32-byte comparison loop.

Depending on size this is about 3-4 times faster than the current routines.

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17328
2018-09-27 14:05:44 +00:00
Andrew Turner
e3f284eee7 Export ID_AA64PFR0_EL1 to userland
Create a user view of the ID_AA64PFR0_EL1 register with values common
across all CPUs.

Approved by:	re (kib)
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D17301
2018-09-27 13:54:09 +00:00
Andrew Turner
c7637c4d19 Move the undefined instruction handler to identcpu.c so we have access
to the registers from boot.

Approved by:	re (kib)
Sponsored by:	ABT Systems Ltd
Differential Revision:	https://reviews.freebsd.org/D17301
2018-09-27 13:50:57 +00:00
Mateusz Piotrowski
8385e87c78 newvers.sh: Unbreak building in Git repositories.
Building the kernel in Git repositories when git-svn is not available and
the "help.autocorrect" Git parameter is enabled results in Git trying to
replace the "svn" command (it does not know) with "serve". As a result the
output of the "git server" command is appended to the value of the
environmental variable VERINFO, which causes the auto generated vers.c
file to contain invalid C syntax (missing newline escapes):

    #define "@(#)FreeBSD 12.0-ALPHA7  r000eversion 2
    0015agent=git/2.19.0
    000cls-refs
    0012fetch=shallow
    0012server-option
    0000=5e2272613fa(splash-vt)"
    #define VERSTR "FreeBSD 12.0-ALPHA7  r000eversion 2
    0015agent=git/2.19.0
    000cls-refs
    0012fetch=shallow
    0012server-option
    0000=5e2272613fa(splash-vt)\n"

Using `-c help.autocorrect=0` seems to be a good solution as it does not
modify user's environment. I am not sure, however, if we should use
programs (or Git commands), which we are not sure exist (we never check if
git-svn is available on the host), as there may be more unexpected
behaviors like this one.

Reviewed by:	eadler, emaste, krion
Approved by:	re (gjb), krion (mentor)
Sponsored by:	Bally Wulff Games & Entertainment GmbH
Differential Revision:	https://reviews.freebsd.org/D17271
2018-09-27 12:15:31 +00:00
Andrew Turner
27d2645787 Handle a guest executing a vm instruction by trapping and raising an
undefined instruction exception. Previously we would exit the guest,
however an unprivileged user could execute these.

Found with:	syzkaller
Reviewed by:	araujo, tychon (previous version)
Approved by:	re (kib)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D17192
2018-09-27 11:16:19 +00:00
Navdeep Parhar
ac4031d805 cxgbe(4): Enable support for per-connection rate limiting in the default
firmware configuration files.

Approved by:	re@ (gjb@)
Sponsored by:	Chelsio Communications
2018-09-26 21:16:07 +00:00
Sean Eric Fagan
a7fcb1afcb Add per-session locking to cryptosoft (swcr).
As part of ZFS Crypto, I started getting a series of panics when I did not
have AESNI loaded.  Adding locking fixed it, and I concluded that the
Reinit function altered the AES key schedule.  This locking is not as
fine-grained as it could be (AESNI uses per-cpu locking), but
it's minimally invasive.

Sponsored by: iXsystems Inc
Reviewed by: cem, mav
Approved by: re (gjb), mav (mentor)
Differential Revision: https://reviews.freebsd.org/D17307
2018-09-26 20:23:12 +00:00
Warner Losh
b28589287b Remove bogus spaces.
Spaces aren't allowed in these strings.

Approved by: re@ (glen)
2018-09-26 19:41:00 +00:00
Warner Losh
0dc34160f3 Add PNP info to PCI attachments of cbb, cxgb, ida, iwn, ixl, ixlv,
mfi, mps, mpr, mvs, my, oce, pcn, ral, rl. This only labels existing
pci device tables, and has no probe / attach code changes.

Reviewed by: imp, chuck
Submitted by: Lakhan Shiva Kamireddy <lakhanshiva@gmail.com>
Sponsored by: Google, Inc. (GSoC 2018)
Approved by: re (glen)
2018-09-26 17:12:30 +00:00
Warner Losh
329e817fcc Reapply, with minor tweaks, r338025, from the original commit:
Remove unused and easy to misuse PNP macro parameter

Inspired by r338025, just remove the element size parameter to the
MODULE_PNP_INFO macro entirely.  The 'table' parameter is now required to
have correct pointer (or array) type.  Since all invocations of the macro
already had this property and the emitted PNP data continues to include the
element size, there is no functional change.

Mostly done with the coccinelle 'spatch' tool:

  $ cat modpnpsize0.cocci
    @normaltables@
    identifier b,c;
    expression a,d,e;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,d,
    -sizeof(d[0]),
     e);

    @singletons@
    identifier b,c,d;
    expression a;
    declarer MODULE_PNP_INFO;
    @@
     MODULE_PNP_INFO(a,b,c,&d,
    -sizeof(d),
     1);

  $ rg -l MODULE_PNP_INFO -- sys | \
    xargs spatch --in-place --sp-file modpnpsize0.cocci

(Note that coccinelle invokes diff(1) via a PATH search and expects diff to
tolerate the -B flag, which BSD diff does not.  So I had to link gdiff into
PATH as diff to use spatch.)

Tinderbox'd (-DMAKE_JUST_KERNELS).
Approved by: re (glen)
2018-09-26 17:12:14 +00:00
Andrey V. Elsukov
0ddfd867ed Fix witness warning in xform_init().
Do not call crypto_newsession() while holding xforms_lock mutex.
Release mutex before invoking crypto_newsession(), and use
ipsec_kmod_enter()/ipsec_kmod_exit() functions to protect from doing
access to unloaded kernel module memory.

Move xform-releated functions into subr_ipsec.c to be able use
ipsec_kmod_* functions. Also unconditionally build ipsec_kmod_*
functions, since now they are always used by IPSec code.

Add xf_cntr field to struct xformsw, it is used by ipsec_kmod_*
functions. Also constify xf_name field, since it is not expected to be
modified.

Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17302
2018-09-26 14:47:51 +00:00
Slava Shwartsman
a4ea412d4f Add PCIV_INVALID definition
From PCI Spec rev 2.2, 6.2.1. Device Identification:
Vendor ID This field identifies the manufacturer of the device. Valid
vendor identifiers are allocated by the PCI SIG to ensure uniqueness.
0FFFFh is an invalid value for Vendor ID.

MFC after:      3 days
Approved by:    re (Glen), hselasky (mentor), kib (mentor)
Sponsored by:   Mellanox Technologies
2018-09-26 13:16:55 +00:00
Michael Tuexen
0277ec9c43 Whitespace changes and fixing a typo. No functional change.
Approved by:	re (kib@)
MFC after:	1 week
2018-09-26 10:24:50 +00:00
Navdeep Parhar
cb8dedc76e cxgbe(4): Treat base/end of firmware parameters as signed integers when
figuring out whether the range is valid or not.

Approved by:	re@ (rgrimes@)
MFC after:	1 week
Sponsored by:	Chelsio Communications
2018-09-26 02:27:37 +00:00
Konstantin Belousov
05e1cca97a Fix some uses of dmaplimit.
dmaplimit is the first byte after the end of DMAP.

Reported by:	"Johnson, Archna" <Archna.Johnson@netapp.com>
Reviewed by:	alc, markj
Approved by:	re (gjb)
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D17318
2018-09-25 20:07:58 +00:00
Konstantin Belousov
cbe100dfca Fix an issue in r338862.
For pmap_invalidate_all_pcid(), only reset pm_gen for non-kernel
pmaps, as it was done before the conversion to ifuncs.  The reset is
useless but innocent for kernel_pmap. Coverity reported that cpuid is
used uninitialized in this case.

Reported by:	cem
Reviewed by:	alc, cem, markj
CID:	 1395807
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17314
2018-09-25 18:24:25 +00:00
Mateusz Guzik
af534f8d99 zfs: depessimize zfs_root with rmlocks
Currently vfs calls the root method on each absolute lookup and when
crossing mount points.

zfs_root ends up looking up the inode internally as if it was not
instantianted which results in significant lock contention on systems
like EPYC.

Store the vnode in the mount point and protect the access with rmlocks.
This is a temporary hack for 12.0.

Sample result:

before:
make -s -j 128 buildkernel 2778.09s user 3319.45s system 8370% cpu 1:12.85 total

after:
make -s -j 128 buildkernel 3199.57s user 1772.78s system 8232% cpu 1:00.40 total

Tested by:	pho (zfs mount/unmount tests)
Reviewed by:	kib, mav, sef (different parts)
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17233
2018-09-25 17:58:06 +00:00
Navdeep Parhar
ea710848dc cxgbe(4): Link related changes.
- Switch to using 32b port/link capabilities in the driver.  The 32b
  format is used internally by firmwares > 1.16.45.0 and the driver will
  now interact with the firmware in its native format, whether it's 16b
  or 32b.  Note that the 16b format doesn't have room for 50G, 200G, or
  400G speeds.

- Add a bit in the pause_settings knobs to allow negotiated PAUSE
  settings to override manual settings.

- Ensure that manual link settings persist across an administrative
  down/up as well as transceiver unplug/replug.

- Remove unused is_*G_port() functions.

Approved by:	re@ (gjb@)
MFC after:	1 month
Sponsored by:	Chelsio Communications
2018-09-25 05:52:42 +00:00
Justin Hibbits
fd8cf3be5f powerpc: Blacklist the top 64kB range of the lower 4GB PA space
The PHB4 host bridge used by the POWER9 uses a 64kB range in 32-bit
space at the address 0xffff0000-0xffffffff.  Reserve this range so that
DMA memory cannot be allocated within this range.  This fixes seemingly
random crashes on a POWER9 system.  Ideally this range will have been
reserved by the firmware, but as of now this is not the case.

Submitted by:	git_bdragon.rtk0.net
Reviewed by:	nwhitehorn
Approved by:	re(kib)
Differential Revision:	https://reviews.freebsd.org/D17183
2018-09-25 02:34:28 +00:00
Colin Percival
189bd7a978 Recognize the Amazon PCI serial device found in i3.metal EC2 instances
as an NS8250 UART.

Reviewed by:	sbruno, imp
Approved by:	re (delphij)
Sponsored by:	https://www.patreon.com/cperciva
Differential Revision:	https://reviews.freebsd.org/D17250
2018-09-24 22:15:04 +00:00
Mark Johnston
463406ac4a Add more NUMA-specific low memory predicates.
Use these predicates instead of inline references to vm_min_domains.
Also add a global all_domains set, akin to all_cpus.

Reviewed by:	alc, jeff, kib
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17278
2018-09-24 19:24:17 +00:00
John Baldwin
b65a9568f8 Restore the API of the kf_sa_local and kf_sa_peer members.
In 11.x and earlier these were accessible as direct members of 'struct
kinfo_file'.  Existing code already knows about the new location of
these members as well, so wrapper macros did not work for these
fields.  Instead, define an anonymous struct containing the fields
from 'struct kinfo_file' in FreeBSD 11 that were not part of the
'kf_un' union.  This anonymous struct is then placed in an anonymous
union along with the new 'kf_un' union.  This preserves the API of
both structure layouts without requiring any wrapper macros.

PR:		231525
Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17262
2018-09-24 18:20:38 +00:00
John Baldwin
7a102e0463 Implement pmap_sync_icache().
This invokes "fence" on the hart performing the write followed by an IPI
to execute "fence.i" on all harts.

This is required to support userland debuggers setting breakpoints in
user processes.

Reviewed by:	br (earlier version), markj
Approved by:	re (gjb)
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D17139
2018-09-24 17:41:29 +00:00
Alexander Motin
cf4a52cf67 Fix use-after-free in RAID0 error reporting of GEOM_RAID.
PR:		231510
Submitted by:	yangx92@hotmail.com
Approved by:	re (gjb)
MFC after:	1 week
2018-09-24 16:58:55 +00:00
Alan Cox
f5fbe90de4 Passing UMA_ZONE_NOFREE to uma_zcreate() for swpctrie_zone and swblk_zone is
redundant, because uma_zone_reserve_kva() is performed on both zones and it
sets this same flag on the zone.  (Moreover, the implementation of the swap
pager does not itself require these zones to be UMA_ZONE_NOFREE.)

Reviewed by:	kib, markj
Approved by:	re (gjb)
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D17296
2018-09-24 16:49:02 +00:00
Mark Johnston
3d14a7bb43 Ensure that "domain" is initialized when vm_ndomains == 1.
Reported by:	alc
Approved by:	re (gjb)
2018-09-24 15:32:46 +00:00
Mateusz Guzik
9afff6b1c0 Eliminate false sharing in malloc due to statistic collection
Currently stats are collected in a MAXCPU-sized array which is not
aligned and suffers enormous false-sharing. Fix the problem by
utilizing per-cpu allocation.

The counter(9) API is not used here as it is too incomplete and does
not provide a win over per-cpu zone sized for malloc stats struct. In
particular stats are being reported for each cpu separately by just
copying what is supposed to be an array element for given cpu.

This eliminates significant false-sharing during malloc-heavy tests
e.g. on Skylake. See the review for details.

Reviewed by:	markj
Approved by:	re (kib)
Differential Revision:	https://reviews.freebsd.org/D17289
2018-09-23 19:00:06 +00:00
Michael Tuexen
078a49a077 Remove the unused parameter 'locked' from the function
syncache_respond(). There is no functional change. The
parameter became unused in r313330, but wasn't removed.

Approved by:		re (kib@)
MFC after:		1 month
Sponsored by:		Netflix, Inc.
2018-09-23 16:37:32 +00:00
Konstantin Belousov
b7befdf509 Correct panic messages.
Reviewed by:	mckusick
Sponsored by:	The FreeBSD Foundation
Approved by:	re (rgrimes)
MFC after:	1 week
2018-09-22 17:05:49 +00:00
Konstantin Belousov
108ff63e8a Further reorganize pmap_invalidate TLB code.
Split calculation of mask for shootdown IPI and local
invalidation. Reorder IPI before local.

Suggested by:	alc
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (rgrimes)
Differential revision:	https://reviews.freebsd.org/D17277
2018-09-22 17:04:39 +00:00
Mateusz Guzik
334107b91b vfs: __predict common case in VFS_EPILOGUE/PROLOGUE
NFS is the only in-tree filesystem using the feature, but all ops test
for it.

Currently the resulting sigdefer calls have to be jumped over in the
common case.

This is a bandaid, longer term fix will move this feature away.

Approved by:	re (kib)
2018-09-22 11:39:30 +00:00
Navdeep Parhar
061bbaf7e7 cxgbe(4): Reuse existing "switching" L2T entries when possible.
Approved by:	re@ (rgrimes@)
Sponsored by:	Chelsio Communications
2018-09-22 01:24:30 +00:00
Alexander Motin
ae1b0b825a MFV r338866: 9700 ZFS resilvered mirror does not balance reads
illumos/illumos-gate@82f63c3c2b

Reviewed by: Toomas Soome <tsoome@me.com>
Reviewed by: Sanjay Nadkarni <sanjay.nadkarni@nexenta.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Matthew Ahrens <mahrens@delphix.com>
Author:     Jerry Jelinek <jerry.jelinek@joyent.com>

Approved by:	re (delphij)
2018-09-21 21:56:00 +00:00
Mark Johnston
6cdde6fda2 Use the GNU as-compatible .endm instead of .endmacro.
Approved by:	re (gjb)
2018-09-21 20:20:03 +00:00
Konstantin Belousov
d995b5b1ea Convert x86 TLB top-level invalidation functions to ifuncs.
Note that shootdown IPI handlers are already per-mode.

Suggested by:	alc
Reviewed by:	alc, markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Approved by:	re (gjb)
Differential revision:	https://reviews.freebsd.org/D17184
2018-09-21 17:53:06 +00:00
Mateusz Guzik
68c7542edf amd64: even up copyin/copyout with memcpy + other cleanup
- _fault handlers for both primitives are identical, provide just one
- change the copying scheme to match memcpy (in particular jump
avoidance for the most common case of multiply of 8)
- stop re-reading pcb address on exit, just store it locally (in r9)

Reviewed by:	kib
Approved by:	re (gjb)
Differential Revision:	https://reviews.freebsd.org/D17265
2018-09-21 15:00:46 +00:00
Andrey V. Elsukov
50f5c94edb Fix possible NULL pointer dereference in ffec_alloc_mbufcl().
PR:		231514
Approved by:	re (kib)
MFC after:	1 week
2018-09-21 13:44:05 +00:00
Ed Maste
f789d9839a Include kernel ident in uname
In non-reproducible mode we have the kernel ident as a side effect of
including the build directory.  Explicitly add it to the ident string in
reproducible mode.

Reported by:	mjg
Approved by:	re (gjb)
Sponsored by:	The FreeBSD Foundation
2018-09-21 13:43:06 +00:00
Mateusz Guzik
d6fda03a64 select: stop doing zero-sized memsets
Approved by:	re (kib)
2018-09-21 13:20:41 +00:00
Ed Maste
18c00da806 remove double space between branch and version in kernel ident
Reported by:	dim
Approved by:	re (kib)
Sponsored by:	The FreeBSD Foundation
2018-09-21 13:02:25 +00:00
Mateusz Guzik
bb32268086 amd64: check for small size in memmove, memcpy and memset
If the size is 15 bytes or less avoid spinning up rep just to copy the 8
bytes. In my tests on EPYC and old Intel microarchs without ERMS (like
Westmere) it provided a nice win over the current version (e.g. for EPYC
memset with 15 bytes of size goes from 59712651 ops/s to 70600095) all
while almost not pessimizing the other cases.

Data collected during package building shows that < 16 sizes are pretty
common.

Verified with the glibc test suite.

Approved by:	re (kib)
2018-09-21 12:27:36 +00:00
Matt Macy
b08d611de8 fix vlan locking to permit sx acquisition in ioctl calls
- update vlan(9) to handle changes earlier this year in multicast locking

Tested by: np@, darkfiberu at gmail.com

PR:	230510
Reviewed by:	mjoras@, shurd@, sbruno@
Approved by:	re (gjb@)
Sponsored by:	Limelight Networks
Differential Revision:	https://reviews.freebsd.org/D16808
2018-09-21 01:37:08 +00:00
Glen Barber
a128aaea95 Update head from ALPHA6 to ALPHA7 as part of the 12.0-RELEASE
cycle.

Approved by:	re (implicit)
Sponsored by:	The FreeBSD Foundation
2018-09-20 23:59:42 +00:00
Mateusz Guzik
5254f065ab amd64: macroify copyin/copyout and provide erms variants, follow up
Fix a fat-fingered typo with a "funny" side-effect: when doing copyin on a
cpu without ERMS and with size being a multiply of 8 a page fault would be
triggered resulting in EFAULT.

Pointy hat: mjg
Approved by:	re (implicit)
2018-09-20 20:32:08 +00:00