130479 Commits

Author SHA1 Message Date
Ian Lepore
9a1d4b0012 Add #ifdef option-test wrappers around another call to an arm/unwind.c
function which is only compiled-in with certain options.

Why is it always the most trivial part of a big commit that takes 3 tries
to get right?
2020-01-07 21:13:34 +00:00
Mateusz Guzik
a9a047bc87 vfs: handle doomed vnodes in vdefer_inactive
vgone dooms the vnode while keeping VI_OWEINACT set and then drops the
interlock.

vputx can pick up the interlock and pass it to vdefer_inactive since the
flag is set.

The race is harmless, just don't defer anything as vgone will take care of it.

Reported by:	pho
2020-01-07 20:24:21 +00:00
Ed Maste
ee92463aca Do not define TCPOUTFLAGS in rack_bbr_common
tcp_outflags isn't used in this source file and compilation failed with
external GCC on sparc64.  I'm not sure why only that case failed (perhaps
inconsistent -Werror config) but it is a legitimate issue to fix.

Reviewed by:	tuexen
Differential Revision:	https://reviews.freebsd.org/D23068
2020-01-07 17:57:08 +00:00
Mark Johnston
958ff217e7 Decrease logging severity when adding a device or reading config table.
In PR 243056 a user reports some spam from smartpqi(4).  In particular,
the driver warns about an unrecognized PQI_CONF_TABLE_SECTION_SOFT_RESET
section (not yet defined in the driver, but handled in Linux), but this
doesn't cause any problems.  The Linux driver also does not warn about
unrecognized sections.

Also do not log a warning when a device is added, since this is routine.
Lower severity to DISC, to match pqisrc_remove_device().

PR:		243056
Reviewed by:	sbruno
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23023
2020-01-07 16:07:30 +00:00
Mark Johnston
f3e982e764 Define a unified pmap structure for i386.
The overloading of struct pmap for PAE and non-PAE pmaps results in
three distinct layouts for the structure, which is embedded in
struct vmspace.  This causes a large number of duplicate structure
definitions in the i386 kernel's CTF type graph.

Since most pmap fields are the same in the two pmaps, simply provide
side-by-side variants of the fields that are distinct, using fixed-size
types.

PR:		242689
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22896
2020-01-07 15:59:31 +00:00
Mark Johnston
e8bbca1b44 Consistently use pmap_t instead of struct pmap *.
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2020-01-07 15:59:02 +00:00
Mateusz Guzik
c8b3463dd0 vfs: reimplement deferred inactive to use a dedicated flag (VI_DEFINACT)
The previous behavior of leaving VI_OWEINACT vnodes on the active list without
a hold count is eliminated. Hold count is kept and inactive processing gets
explicitly deferred by setting the VI_DEFINACT flag. The syncer is then
responsible for vdrop.

Reviewed by:	kib (previous version)
Tested by:	pho (in a larger patch, previous version)
Differential Revision:	https://reviews.freebsd.org/D23036
2020-01-07 15:56:24 +00:00
Mateusz Guzik
b7cc9d1847 vfs: trylock in vfs_msync and refactor the func
- use LK_NOWAIT instead of calling VOP_ISLOCKED before deciding to lock
- evaluate flags before looping over vnodes

Reviewed by:	kib
Tested by:	pho (in a larger patch, previous version)
Differential Revision:	https://reviews.freebsd.org/D23035
2020-01-07 15:44:19 +00:00
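
Note on b7cc9d1847: the first bullet replaces a "check, then lock" sequence with a single non-blocking lock attempt. A minimal userland sketch of that pattern follows, assuming a hypothetical per-item flag in place of a vnode lock; struct item, item_trylock(), item_unlock() and sync_items() are illustrative names and not part of the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for an object protected by its own lock. */
struct item {
	atomic_flag	busy;	/* set while someone holds the item */
	int		synced;	/* how many times the item was processed */
};

/* Non-blocking acquire: the attempt itself is the check. */
static bool
item_trylock(struct item *it)
{
	return (!atomic_flag_test_and_set_explicit(&it->busy,
	    memory_order_acquire));
}

static void
item_unlock(struct item *it)
{
	atomic_flag_clear_explicit(&it->busy, memory_order_release);
}

/* Process every item that can be locked right now; skip the rest. */
static int
sync_items(struct item *items, int n)
{
	int done = 0;

	for (int i = 0; i < n; i++) {
		if (!item_trylock(&items[i]))
			continue;	/* held elsewhere, do not wait */
		items[i].synced++;
		item_unlock(&items[i]);
		done++;
	}
	return (done);
}
```

Because the acquire and the test are one operation, there is no window in which the item can change state between "is it locked?" and "lock it".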
Mateusz Guzik
c92fe112a7 vfs: use a dedicated counter for free vnode recycling
Otherwise vlrureclaim activity is mixed in and it is hard to tell which
vnodes got reclaimed.
2020-01-07 15:42:01 +00:00
Kristof Provost
76c6e771bc sifive: Fix incorrect tx/rx ctrl defines
Happily these were never used, but they should be correct anyway.

Reported by:	Nicholas O'Brien <nickisobrien_gmail.com>
Sponsored by:	Axiado
2020-01-07 09:02:14 +00:00
Mateusz Guzik
75ad73a8b9 zfs: plug a vnode reserve leak in zfs_make_xattrdir
2020-01-07 04:34:29 +00:00
Mateusz Guzik
cc2b586d69 vfs: prevent numvnodes and freevnodes re-reads when appropriate
Otherwise in code like this:
if (numvnodes > desiredvnodes)
	vnlru_free_locked(numvnodes - desiredvnodes, NULL);

numvnodes can drop below desiredvnodes prior to the call and if the
compiler generated another read the subtraction would get a negative
value.
2020-01-07 04:34:03 +00:00
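
Note on cc2b586d69: one way to rule out the second read is to load the shared counter once into a local and use only the local afterwards. The sketch below is a userland illustration under that assumption, not the committed change; numvnodes_sketch, desiredvnodes_sketch and excess_vnodes() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's global counters. */
static _Atomic unsigned long numvnodes_sketch;
static unsigned long desiredvnodes_sketch = 100;

/*
 * Return how far the counter is over the limit.  The shared value is
 * loaded exactly once, so the comparison and the subtraction are
 * guaranteed to operate on the same number.
 */
static unsigned long
excess_vnodes(void)
{
	unsigned long n;

	n = atomic_load_explicit(&numvnodes_sketch, memory_order_relaxed);
	if (n > desiredvnodes_sketch)
		return (n - desiredvnodes_sketch);
	return (0);
}
```

The snapshot may be stale by the time it is used, but it can no longer produce a difference that contradicts the comparison that guarded it.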
Mateusz Guzik
37fe521a6f vfs: annotate numvnodes and vnode_free_list_mtx with __exclusive_cache_line
2020-01-07 04:30:49 +00:00
Mateusz Guzik
478368ca41 vfs: eliminate v_tag from struct vnode
There was only one consumer and it was using it incorrectly.

It is given an equivalent hack.

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D23037
2020-01-07 04:29:34 +00:00
Mateusz Guzik
a91190c63e vfs: add a helper for allocating marker vnodes
2020-01-07 04:27:40 +00:00
Andrew Turner
1b02a76602 Add more Arm arm64 CPU identification values
- Add all the Cortex-A CPU ID register values I can find.
- Add the Neoverse-N1 ID register value [1]
- Sort macros by register value.

PR:		243065
Submitted by:	Ali Saidi <alisaidi AT amazon.com> [1]
Sponsored by:	DARPA, AFRL (other than [1])
2020-01-06 20:57:59 +00:00
Pawel Biernacki
91c4b68fa3 kern_sysctl: make sysctl.debug work as intended
r136999 introduced SYSCTL_DEBUG but apparently "opt_sysctl.h" was never
included, so the option was ignored.

r322954 introduced sysctl.reuse_test with OID number equal to 0, effectively
shadowing the very special sysctl.debug one. Use OID_AUTO as it doesn't need
any special treatment.

Reviewed by:	kib (mentor)
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D23056
2020-01-06 19:47:59 +00:00
John Baldwin
9f669cf0a2 Simplify arguments to signal handlers on mips.
- Use ksi_addr directly as si_addr in the siginfo instead of the
  'badvaddr' register.
- Remove a duplicate assignment of si_code.
- Use ksi_addr as the 4th argument to the old-style handler instead of
  'badvaddr'.

Reviewed by:	brooks, kevans
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23013
2020-01-06 18:02:02 +00:00
Randall Stewart
4ad2473790 This brings rack up to date with the recent changes to ECN and
also commonizes the functions that both the FreeBSD and
rack stacks use.

Sponsored by:	Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D23052
2020-01-06 15:29:14 +00:00
Randall Stewart
a9a08eced6 This change adds a small feature to the tcp logging code. Basically
a connection can now have a separate tag added to the id.

Obtained from:	Lawrence Stewart
Sponsored by:	Netflix Inc
Differential Revision:	https://reviews.freebsd.org/D22866
2020-01-06 12:48:06 +00:00
Pawel Biernacki
a1d7296784 sysctl: mark more nodes as MPSAFE
vm.kvm_size and vm.kvm_free are read only and marked as MPSAFE on i386
already. Mark them as that on amd64 and arm64 too to avoid locking Giant.

Reviewed by:	kib (mentor)
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D23039
2020-01-06 10:52:13 +00:00
Hans Petter Selasky
6c110e8611 Add own counter for cancelled USB transfers.
Do not count these as errors.

MFC after:	1 week
Sponsored by:	Mellanox Technologies
2020-01-06 09:49:20 +00:00
Jeff Roberson
79c9f9429a Fix uma boot pages calculations on NUMA machines that also don't have
MD_UMA_SMALL_ALLOC.  This is unusual but not impossible.  Fix the alignment
of zones while here.  This was already correct because uz_cpu strongly
aligned the zone structure but the specified alignment did not match
reality and involved redundant defines.

Reviewed by:	markj, rlibby
Differential Revision:	https://reviews.freebsd.org/D23046
2020-01-06 02:51:19 +00:00
Jeff Roberson
bfb6b7a121 The fix in r356353 was insufficient. Not every architecture returns 0 for
EARLY_COUNTER.  Only amd64 seems to.

Suggested by:	markj
Reported by:	lwhsu
Reviewed by:	markj
PR:		243117
2020-01-05 22:54:25 +00:00
Bjoern A. Zeeb
aeaef7d597 netgraph/ng_bridge: Reestablish old ABI
In order to be able to merge r353026 bring back support for the old
cookie API for a transition period in 12.x releases (and possibly 13)
before the old API can be removed again entirely.

Suggested by:	julian
Submitted by:	Lutz Donnerhacke (lutz donnerhacke.de)
PR:		240787
Reviewed by:	julian
MFC after:	2 weeks
X-MFC with:	r353026
Differential Revision:	https://reviews.freebsd.org/D21961
2020-01-05 19:14:16 +00:00
Michael Tuexen
97a8ab398e Don't mark the sendall iterator as being up if it could not be started.
MFC after:		1 week
2020-01-05 14:08:01 +00:00
Michael Tuexen
4b66d476b3 Return -1 consistently if an error occurs.
MFC after:	1 week
2020-01-05 14:06:40 +00:00
Michael Tuexen
397b1c945f Ensure that we don't miss a trigger for kicking off the SCTP iterator.
Reported by:		nwhitehorn@
MFC after:		1 week
2020-01-05 13:56:32 +00:00
Mateusz Guzik
2e77cad11d locks: add default delay struct
Use it for all primitives. This makes everything fit in 8 bytes.
2020-01-05 12:48:19 +00:00
Mateusz Guzik
6b8dd26e7c locks: convert delay times to u_short
int is just a waste of space for this purpose.
2020-01-05 12:47:29 +00:00
Mateusz Guzik
d6ae918835 Mark mtxpool_sleep as read mostly, not frequently.
The latter is not justified.
2020-01-05 12:46:35 +00:00
Kyle Evans
535b1df993 shm: correct KPI mistake introduced around memfd_create
When file sealing and shm_open2 were introduced, we should have grown a new
kern_shm_open2 helper that did the brunt of the work with the new interface
while kern_shm_open remains the same. Instead, more complexity was
introduced to kern_shm_open to handle the additional features and consumers
had to keep changing in somewhat awkward ways, and a kern_shm_open2 was
added to wrap kern_shm_open.

Backpedal on this and correct the situation: kern_shm_open returns to the
interface it had prior to file sealing being introduced, and neither
function needs an initial_seals argument anymore as it's handled in
kern_shm_open2 based on the shmflags.
2020-01-05 04:06:40 +00:00
Kyle Evans
58366f05c0 shmfd/mmap: restrict maxprot with MAP_SHARED + F_SEAL_WRITE
If a write seal is set on a shared mapping, we must exclude VM_PROT_WRITE as
the fd is effectively read-only. This was discovered by running
devel/linux-ltp, which mmap's with acceptable protections specified then
attempts to raise to PROT_READ|PROT_WRITE with mprotect(2), which we
allowed.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22978
2020-01-05 03:15:16 +00:00
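
Note on 58366f05c0: the rule described above can be sketched as a small mask computation. This is an illustration only, assuming hypothetical SK_* constants and helper names rather than the kernel's own definitions or code.

```c
#include <assert.h>

/* Hypothetical constants; the real values live in system headers. */
#define SK_PROT_READ	0x1
#define SK_PROT_WRITE	0x2
#define SK_MAP_SHARED	0x1
#define SK_SEAL_WRITE	0x8

/*
 * Compute the maximum protection for a new mapping.  A shared mapping
 * of a write-sealed object must never become writable, so the write
 * bit is removed from the ceiling up front.
 */
static int
sketch_maxprot(int maxprot, int mapflags, int seals)
{
	if ((mapflags & SK_MAP_SHARED) != 0 && (seals & SK_SEAL_WRITE) != 0)
		maxprot &= ~SK_PROT_WRITE;
	return (maxprot);
}

/* A later protection change is allowed only if it stays within maxprot. */
static int
sketch_mprotect_ok(int maxprot, int newprot)
{
	return ((newprot & ~maxprot) == 0);
}
```

With the ceiling lowered at mmap time, the later attempt to raise the mapping to read/write fails the second check instead of succeeding.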
Mateusz Guzik
7e2ea5772b vfs: factor out avoidable branches in _vn_lock
2020-01-05 01:00:11 +00:00
Mateusz Guzik
8dbc63520c vfs: drop thread argument from vinactive
2020-01-05 00:59:47 +00:00
Mateusz Guzik
867fd730c6 vfs: patch up vnode count assertions to report found value
2020-01-05 00:59:16 +00:00
Mateusz Guzik
1cde9e385a vfs: predict VN_IS_DOOMED as false
The macro is used everywhere.
2020-01-05 00:58:20 +00:00
Kyle Evans
2180f6c6f1 kern_mmap: restore character deleted in transit
Pointy hat to:	kevans
X-MFC-With:	r356359
2020-01-04 23:51:44 +00:00
Kyle Evans
18348a2369 kern_mmap: add a variant that allows caller to inspect fp
Linux mmap rejects mmap() on a write-only file with EACCES.
linux_mmap_common currently does a fun dance to grab the fp associated with
the passed in fd, validates it, then drops the reference and calls into
kern_mmap(). Doing so is perhaps both fragile and premature; there's still
plenty of chance for the request to get rejected with a more appropriate
error, and it's prone to a race where the file we ultimately mmap has
changed after it drops its reference.

This change alleviates the need to do this by providing a kern_mmap variant
that allows the caller to inspect the fp just before calling into the fileop
layer. The callback takes flags, prot, and maxprot as one could imagine
scenarios where any of these, in conjunction with the file itself, may
influence a caller's decision.

The file type check in the linux compat layer has been removed; EINVAL is
seemingly not an appropriate response to the file not being a vnode or
device. The fileop layer will reject the operation with ENODEV if it's not
supported, which more closely matches the common linux description of
mmap(2) return values.

If we discover that we're allowing an mmap() on a file type that Linux
normally wouldn't, we should restrict those explicitly.

Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D22977
2020-01-04 23:39:58 +00:00
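
Note on 18348a2369: the callback arrangement described above can be sketched in userland as follows. The types and names (struct sk_file, sk_check_fn, sk_mmap_fpcheck, sk_linux_check) are hypothetical and simplified; the sketch only shows a caller-supplied check running against the same file object that the mapping step would use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified file object. */
struct sk_file {
	int	readable;
	int	writable;
};

/* Caller-supplied policy hook; returns 0 or an errno value. */
typedef int sk_check_fn(const struct sk_file *fp, int flags, int prot,
    int maxprot);

/*
 * Map a file, letting the caller veto the request first.  The hook is
 * handed the very fp that would be mapped, so there is no gap in which
 * a different file could be substituted between check and use.
 */
static int
sk_mmap_fpcheck(const struct sk_file *fp, int flags, int prot, int maxprot,
    sk_check_fn *check)
{
	int error;

	if (check != NULL) {
		error = check(fp, flags, prot, maxprot);
		if (error != 0)
			return (error);
	}
	/* The actual mapping work would happen here. */
	return (0);
}

/* Example policy: refuse files that are not open for reading. */
static int
sk_linux_check(const struct sk_file *fp, int flags, int prot, int maxprot)
{
	(void)flags;
	(void)prot;
	(void)maxprot;
	return (fp->readable ? 0 : EACCES);
}
```

Passing NULL as the hook keeps the plain behavior, so callers that need no extra policy are unaffected.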
Michael Tuexen
ae7cc6c9f8 Make the message size limit used for SCTP_SENDALL configurable via
a sysctl variable instead of a compiled in constant.

This is based on a patch provided by nwhitehorn@.
2020-01-04 20:33:12 +00:00
Alan Cox
1c3a241032 When a copy-on-write fault occurs, pmap_enter() is called on to replace the
mapping to the old read-only page with a mapping to the new read-write page.
To destroy the old mapping, pmap_enter() must destroy its page table and PV
entries and invalidate its TLB entry.  This change simply invalidates that
TLB entry a little earlier, specifically, on amd64 and arm64, before the PV
list lock is held.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D23027
2020-01-04 19:50:25 +00:00
Jeff Roberson
31c251a046 Fix an assertion introduced in r356348. On architectures without
UMA_MD_SMALL_ALLOC vmem has a more complicated startup sequence that
violated the new assert.  Resolve this by rewriting the COLD asserts to
look at the per-cpu allocation counts for evidence of api activity.

Discussed with:	rlibby
Reviewed by:	markj
Reported by:	lwhsu
2020-01-04 19:29:25 +00:00
Jeff Roberson
dfe13344f5 UMA NUMA flag day. UMA_ZONE_NUMA was a source of confusion. Make the names
more consistent with other NUMA features as UMA_ZONE_FIRSTTOUCH and
UMA_ZONE_ROUNDROBIN.  The system will now select a default depending
on kernel configuration.  API users need only specify one if they want to
override the default.

Remove the UMA_XDOMAIN and UMA_FIRSTTOUCH kernel options and key only off
of NUMA.  XDOMAIN is now fast enough in all cases to enable whenever NUMA
is.

Reviewed by:	markj
Discussed with:	rlibby
Differential Revision:	https://reviews.freebsd.org/D22831
2020-01-04 18:48:13 +00:00
Jeff Roberson
91d947bfbe Sort cross-domain frees into per-domain buckets before inserting these
onto their respective bucket lists.  This reduces contention on the keg
lock under heavy free traffic by several orders of magnitude while
requiring only an additional bucket per-domain worth of memory.

Discussed with:		markj, rlibby
Differential Revision:	https://reviews.freebsd.org/D22830
2020-01-04 07:56:28 +00:00
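
Note on 91d947bfbe: the sorting step can be pictured as partitioning one mixed bucket into one bucket per domain, so that each destination list is touched once per batch rather than once per item. The sketch below is one reading of that description, not the UMA implementation; all sk_* names and sizes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SK_NDOMAINS	2
#define SK_BUCKET_SIZE	8

/* Hypothetical freed item that remembers its home domain. */
struct sk_item {
	int	domain;
};

struct sk_bucket {
	size_t		cnt;
	struct sk_item	*ents[SK_BUCKET_SIZE];
};

/*
 * Split a bucket of mixed-domain items into one bucket per domain.
 * The extra memory cost is one bucket per domain, and no per-domain
 * bucket can overflow because it never holds more than the input did.
 */
static void
sk_sort_by_domain(const struct sk_bucket *mixed,
    struct sk_bucket perdom[SK_NDOMAINS])
{
	for (int d = 0; d < SK_NDOMAINS; d++)
		perdom[d].cnt = 0;
	for (size_t i = 0; i < mixed->cnt; i++) {
		struct sk_item *it = mixed->ents[i];
		struct sk_bucket *b;

		assert(it->domain >= 0 && it->domain < SK_NDOMAINS);
		b = &perdom[it->domain];
		b->ents[b->cnt++] = it;
	}
}
```

After the split, a caller would take each domain's lock once and hand over the whole per-domain bucket.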
Jeff Roberson
8b987a7769 Use per-domain keg locks. This provides both a lock and separate space
accounting for each NUMA domain.  Independent keg domain locks are important
with cross-domain frees.  Hashed zones are non-numa and use a single keg
lock to protect the hash table.

Reviewed by:	markj, rlibby
Differential Revision:	https://reviews.freebsd.org/D22829
2020-01-04 03:30:08 +00:00
Jeff Roberson
727c691857 Use a separate lock for the zone and keg. This provides concurrency
between populating buckets from the slab layer and fetching full buckets
from the zone layer.  Eliminate some nonsense locking patterns where
we lock to fetch a single variable.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D22828
2020-01-04 03:15:34 +00:00
Jeff Roberson
4bd61e19a2 Use atomics for the zone limit and sleeper count. This relies on the
sleepq to serialize sleepers.  This patch retains the existing sleep/wakeup
paradigm to limit 'thundering herd' wakeups.  It resolves a missing wakeup
in one case but otherwise should be bug for bug compatible.  In particular,
there are still various races surrounding adjusting the limit via sysctl
that are now documented.

Discussed with:	markj
Reviewed by:	rlibby
Differential Revision:	https://reviews.freebsd.org/D22827
2020-01-04 03:04:46 +00:00
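
Note on 4bd61e19a2: a limit enforced with atomics instead of a lock is commonly written as a compare-and-swap loop. The sketch below shows only that counting part, assuming C11 atomics and hypothetical names (struct limited, limited_reserve, limited_release); the sleep and wakeup handling the commit mentions is not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter with an upper bound. */
struct limited {
	_Atomic unsigned long	items;	/* currently reserved */
	unsigned long		max;	/* limit, fixed in this sketch */
};

/*
 * Reserve one slot if the limit allows it.  On CAS failure the current
 * value is reloaded into 'old' and the limit is checked again, so the
 * count can never be pushed past max by racing callers.
 */
static bool
limited_reserve(struct limited *l)
{
	unsigned long old;

	old = atomic_load_explicit(&l->items, memory_order_relaxed);
	do {
		if (old >= l->max)
			return (false);
	} while (!atomic_compare_exchange_weak_explicit(&l->items, &old,
	    old + 1, memory_order_acquire, memory_order_relaxed));
	return (true);
}

/* Give a slot back. */
static void
limited_release(struct limited *l)
{
	atomic_fetch_sub_explicit(&l->items, 1, memory_order_release);
}
```

The limit is treated as constant here; changing it concurrently would reintroduce the kind of race the commit message says is now documented.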
Justin Hibbits
24e87ffae8 powerpc: Remove 'sec' device from QORIQ64 config
The SEC crypto engine, as implemented in this driver, does not exist on any
64-bit SoC, so don't bother compiling it in.
2020-01-04 01:13:00 +00:00
Mateusz Guzik
952f595351 vfs: remove CTASSERT from VOP_UNLOCK_FLAGS
gcc does not like it and it's not worth working around just for that
compiler.
2020-01-04 00:44:53 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00