Commit Graph

246684 Commits

Author SHA1 Message Date
Mateusz Guzik
61f67f32c7 vfs: allow tail call optimisation in vops in the common case
Most frequently used vops boil down to checking SDT probes, doing the call and
checking again. There is no vop_post/pre in their case but the check after the
call prevents tail call optimisation from taking place. Instead, check once
upfront. Kernels with debug or vops with non-empty vop_post still don't short
circuit.
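
As a rough illustration of the idea (not the actual VOP wrapper code), the
sketch below tests a single flag upfront so that the common path ends in a
plain call, which the compiler is then free to emit as a tail call. All names
(`ex_probes_active`, `ex_vop_call`, and so on) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation type standing in for a VOP implementation. */
typedef int (*ex_op_t)(int arg);

/* Hypothetical flag: true when any tracing probe is enabled. */
static bool ex_probes_active = false;
static int ex_probe_hits = 0;

static int
ex_double(int arg)
{
	return (arg * 2);
}

/* Slow path: work happens after the call, so it cannot be a tail call. */
static int
ex_vop_call_traced(ex_op_t op, int arg)
{
	int rc;

	ex_probe_hits++;		/* entry probe */
	rc = op(arg);
	ex_probe_hits++;		/* return probe */
	return (rc);
}

/* Check once upfront; the common case ends with a call in tail position. */
int
ex_vop_call(ex_op_t op, int arg)
{
	if (ex_probes_active)
		return (ex_vop_call_traced(op, arg));
	return (op(arg));
}
```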

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22739
2019-12-16 00:07:51 +00:00
Mateusz Guzik
6fa079fc3f vfs: flatten vop vectors
This eliminates the following loop from all VOP calls:

while(vop != NULL && \
    vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;
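
One way to picture the flattening, as a sketch under simplified assumptions:
inherited slots are resolved once when a vector is registered, so callers can
index the vector directly instead of walking the default chain on every call.
The structure and function names below are hypothetical and much simpler than
the real vop vectors.

```c
#include <stddef.h>

#define	EX_NOPS	3

typedef int (*ex_op_t)(void);

/* Hypothetical, much simplified stand-in for a vop vector. */
struct ex_vector {
	struct ex_vector *ex_default;	/* fallback vector, or NULL */
	ex_op_t ex_ops[EX_NOPS];	/* NULL means "inherit" */
};

/*
 * Fill every inherited slot from the nearest ancestor that provides it.
 * After this runs, lookups never need to follow ex_default.
 */
static void
ex_vector_flatten(struct ex_vector *v)
{
	struct ex_vector *d;
	int i;

	for (i = 0; i < EX_NOPS; i++) {
		for (d = v->ex_default; v->ex_ops[i] == NULL && d != NULL;
		    d = d->ex_default)
			v->ex_ops[i] = d->ex_ops[i];
	}
}

static int ex_base_op(void) { return (1); }
static int ex_mid_op(void) { return (2); }
static int ex_leaf_op(void) { return (3); }

static struct ex_vector ex_base = { NULL, { ex_base_op, ex_base_op, ex_base_op } };
static struct ex_vector ex_mid = { &ex_base, { NULL, ex_mid_op, NULL } };
static struct ex_vector ex_leaf = { &ex_mid, { NULL, NULL, ex_leaf_op } };
```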

Reviewed by:	jeff
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22738
2019-12-16 00:06:22 +00:00
Mateusz Guzik
3fd19ce7a5 mtx: eliminate recursion support from thread lock
Now that it is not used after schedlock changes got merged.

Note the unlock routine temporarily still checks for it on account of just using
regular spin unlock.

This is a prelude towards a general clean up.
2019-12-16 00:04:33 +00:00
Alexander Motin
5233eb9dc6 Properly detect ATA sanitize errors.
It seems I did not read the specifications carefully enough.  Some devices do
not set the successful completion bit, which made the previous code report a
false error.

MFC after:	1 week
2019-12-15 23:28:53 +00:00
Alan Cox
68ca966558 Apply a small optimization to pmap_remove_l3_range(). Specifically, hoist a
PHYS_TO_VM_PAGE() operation that always returns the same vm_page_t out of
the loop.  (Since arm64 is configured as VM_PHYSSEG_SPARSE, the
implementation of PHYS_TO_VM_PAGE() is more costly than that of
VM_PHYSSEG_DENSE platforms, like amd64.)
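
The optimization is ordinary loop-invariant hoisting. The sketch below shows
the shape of the change with a hypothetical lookup function standing in for
PHYS_TO_VM_PAGE(); none of these names come from the pmap code.

```c
#include <stddef.h>

struct ex_page {
	int ex_refs;
};

static struct ex_page ex_pages[4];
static int ex_lookups;		/* counts calls to the costly lookup */

/* Hypothetical stand-in for a costly physical-address-to-page lookup. */
static struct ex_page *
ex_phys_to_page(size_t pa)
{
	ex_lookups++;
	return (&ex_pages[(pa >> 12) % 4]);
}

/* Before: the same lookup is repeated on every iteration. */
static void
ex_release_naive(size_t table_pa, int nentries)
{
	int i;

	for (i = 0; i < nentries; i++)
		ex_phys_to_page(table_pa)->ex_refs--;
}

/* After: the invariant lookup is hoisted out of the loop. */
static void
ex_release_hoisted(size_t table_pa, int nentries)
{
	struct ex_page *m = ex_phys_to_page(table_pa);
	int i;

	for (i = 0; i < nentries; i++)
		m->ex_refs--;
}
```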

MFC after:	1 week
2019-12-15 22:41:57 +00:00
Toomas Soome
3c2db0ef43 loader: rewrite zfs vdev initialization
In some cases the pool discovery will get stuck in infinite loop while setting
up the vdev children.

To fix, we split the vdev setup into two parts, first we create vdevs based on
configuration we do get from pool label, then, we process pool config from MOS
and update the pool config if needed.

Testing done: confirm previously hung loader is not hung any more.

MFC after:	1 week
2019-12-15 21:52:40 +00:00
Jeff Roberson
686bcb5c14 schedlock 4/4
Don't hold the scheduler lock while doing context switches.  Instead we
unlock after selecting the new thread and switch within a spinlock
section leaving interrupts and preemption disabled to prevent local
concurrency.  This means that mi_switch() is entered with the thread
locked but returns without.  This dramatically simplifies scheduler
locking because we will not hold the schedlock while spinning on
blocked lock in switch.

This change has not been made to 4BSD but in principle it would be
more straightforward.

Discussed with:	markj
Reviewed by:	kib
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D22778
2019-12-15 21:26:50 +00:00
Justin Hibbits
1223b40eba powerpc/powernv: Set the PTCR for the Nest MMU
The Nest MMU manages address translation for accelerators on the POWER9.  To
do so, it needs a page table, so export the system page table to the Nest
MMU.  This will quietly fail on pre-POWER9 systems that do not have a NMMU.

The NMMU is currently unused, so this change is currently effectively a NOP,
but the NMMU and VAS will eventually be used.
2019-12-15 21:20:18 +00:00
Jeff Roberson
1c81a87efd schedlock 3/4
Eliminate lock recursion from turnstiles.  This was simply used to avoid
tracking the top-level turnstile lock.  Explicitly check for it before
picking up and dropping locks.

Reviewed by:	kib
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22746
2019-12-15 21:19:41 +00:00
Jeff Roberson
045b7c6084 schedlock 2/4
Do all sleepqueue post-processing in sleepq_remove_thread() so that we
do not require the thread lock after a context switch.

Reviewed by:	jhb, kib
Differential Revision:	https://reviews.freebsd.org/D22745
2019-12-15 21:18:07 +00:00
Ian Lepore
b19c9dea3e Rewrite arm kernel stack unwind code to work when unwinding through modules.
The arm kernel stack unwinder has apparently never been able to unwind when
the path of execution leads through a kernel module. There was code that
tried to handle modules by looking for the unwind data in them, but it did
so by trying to find symbols which have never existed in arm kernel
modules. That caused the unwind code to panic, and because part of panic
handling calls into the unwind code, that just created a recursion loop.

Locating the unwind data in a loaded module requires accessing the Elf
section headers to find the SHT_ARM_EXIDX section. For preloaded modules
those headers are present in a metadata blob. For dynamically loaded
modules, the headers are present only while the loading is in progress; the
memory is freed once the module is ready to use. For that reason, there is
new code in kern/link_elf.c, wrapped in #ifdef __arm__, to extract the
unwind info while the headers are loaded. The values are saved into new
fields in the linker_file structure which are also conditional on __arm__.

In arm/unwind.c there is new code to locally cache the per-module info
needed to find the unwind tables. The local cache is crafted for lockless
read access, because the unwind code often needs to run in context where
sleeping is not allowed.  A large comment block describes the local cache
list, so I won't repeat it all here.
2019-12-15 21:16:35 +00:00
Jeff Roberson
61a74c5ccd schedlock 1/4
Eliminate recursion from most thread_lock consumers.  Return from
sched_add() without the thread_lock held.  This eliminates unnecessary
atomics and lock word loads as well as reducing the hold time for
scheduler locks.  This will eventually allow for lockless remote adds.

Discussed with:	kib
Reviewed by:	jhb
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22626
2019-12-15 21:11:15 +00:00
Justin Hibbits
0548026500 powerpc/mpc85xx: Clean up Freescale SATA driver a little
* Remove unused ATA_IN/OUT macros, they just clutter up the file.
* Fix some RID management bits for the channel memory resource.
2019-12-15 21:08:40 +00:00
Ian Lepore
d937171727 Support --all-repeats in uniq(1) for compatibility with gnu coreutils.
This adds a new -D/--all-repeats option to uniq(1), which outputs each copy
of any repeated lines (as opposed to a single copy of a repeated line). You
can specify a separator option to output a blank line before or after each
group of repeated lines. This adds compatibility with the GNU coreutils
version of uniq(1).
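
The core behaviour can be sketched as follows. This is a minimal illustration
that ignores the separator option and operates on an in-memory array rather
than a stream; `ex_all_repeats` is a hypothetical helper, not code from
uniq(1).

```c
#include <stdio.h>
#include <string.h>

/*
 * Print every copy of each line that has an adjacent duplicate and
 * suppress lines that occur only once in a row.
 * Returns the number of lines written.
 */
static int
ex_all_repeats(const char *const *lines, int n, FILE *out)
{
	int i, j, k, written = 0;

	for (i = 0; i < n; i = j) {
		/* Find the end of the run of lines equal to lines[i]. */
		for (j = i + 1; j < n && strcmp(lines[i], lines[j]) == 0; j++)
			;
		if (j - i < 2)
			continue;	/* not repeated: suppress */
		for (k = i; k < j; k++) {
			fprintf(out, "%s\n", lines[k]);
			written++;
		}
	}
	return (written);
}
```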

This change also re-groups the -c, -d, -D, -u options in the usage display
and man page to indicate that they are mutually exclusive of each other. This
matches the posix/opengroup definition of uniq(1) command line args. Note
that this change does NOT actually enforce the mutual exclusion in the code,
for now, it simply documents that the arguments should be considered
exclusive with each other.

Differential Revision:	https://reviews.freebsd.org/D22262
2019-12-15 18:05:18 +00:00
Conrad Meyer
482f0c0255 Revert r355760, r355759
And remove the inline/deprecated attribute use entirely in stdlib.h, from
r355747.  The intent was to provide a buildable API transitionary period, but
clearly that was counter-productive.

Reported by:	delphij, imp, others
2019-12-15 17:33:26 +00:00
Kyle Evans
58b22b9df2 kbd: convert kbdd_* macros to inline functions
This reduces the noise when interested parties wish to de-Giant kbd; these
accesses to kbdsw will need to be properly locked.
2019-12-15 16:28:12 +00:00
Michal Meloun
0a4b14e8cc Properly synchronize completion DMA buffers.
Within command completion processing the callback function may access
DMAed data buffer. Synchronize it before use, not after.
This allows using NVMe disks on non-DMA-coherent arm64 systems.

MFC after:	3 weeks
2019-12-15 14:28:38 +00:00
Toomas Soome
2e6bb6553b loader: zfsimpl.c cstyle cleanup
No functional changes intended.

MFC after:	1 week
2019-12-15 14:09:49 +00:00
Jeff Roberson
d29f674f2e Fix a mistake in r355765. We need to activate the page if it is not yet
on a pagequeue.

Reported by:	pho
2019-12-15 06:26:47 +00:00
Kyle Evans
5ca70a4673 kbd: drop _KERNEL #ifdef in kbdreg.h
This #ifdef is misleading as there are actually no user-serviceable parts
inside and, as far as I can tell, there is no pollution leading from
userland to this header. Furthermore, it becomes a slight nuisance when
attempting to move things around in this header.
2019-12-15 04:22:50 +00:00
Jeff Roberson
4bf95d00ce Previously we did not support invalid pages in default objects. This means
that if fault fails to progress and needs to restart the loop it must free
the page it is working on and allocate again on restart.  Resolve the few
places that need to be modified to support this condition and simply
deactivate the page.  Presently, we only permit this when fault restarts
for busy contention.  This has an added benefit of removing some object
trylocking in this case.

While here consolidate some page cleanup logic into fault_page_free() and
fault_page_release() to reduce redundant code and automate some teardown.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D22653
2019-12-15 04:08:24 +00:00
Jeff Roberson
a808177864 Add a deferred free mechanism for freeing swap space that does not require
an exclusive object lock.

Previously swap space was freed on a best effort basis when a page that
had valid swap was dirtied, thus invalidating the swap copy.  This may be
done inconsistently and requires the object lock which is not always
convenient.

Instead, track when swap space is present.  The first dirty is responsible
for deleting space or setting PGA_SWAP_FREE which will trigger background
scans to free the swap space.

Simplify the locking in vm_fault_dirty() now that we can reliably identify
the first dirty.

Discussed with:	alc, kib, markj
Differential Revision:	https://reviews.freebsd.org/D22654
2019-12-15 03:15:06 +00:00
Jeff Roberson
d966c7615f Slightly optimize locking in vm_map_copy_swap_entry(). Anonymous objects
require the object lock to synchronize collapse.  Other swap objects such
as tmpfs do not.

Reported by:	mjg
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D22747
2019-12-15 02:02:27 +00:00
Jeff Roberson
af00971419 Handle pagein clustering in vm_page_grab_valid() so that it can be used by
exec_map_first_page().  This will also enable pagein clustering for other
interested consumers (tmpfs, md, etc).

Discussed with:	alc
Approved by:	kib
Differential Revision:	https://reviews.freebsd.org/D22731
2019-12-15 02:00:32 +00:00
Pedro F. Giffuni
d5dfb2fbc8 cdefs: use more accurate GCC version for the deprecated attribute.
The message argument in the "deprecated" attribute was introduced in GCC 4.5 *.
Use the accurate version number for consistency, as done already with other
attributes.

* https://gcc.gnu.org/onlinedocs/gcc-4.5.0/gcc/Function-Attributes.html
2019-12-15 01:56:56 +00:00
Kyle Evans
727b66b6d9 <unistd.h>: remove redundant __BSD_VISIBLE
This bit is already inside of a larger __BSD_VISIBLE block.

Reported by:	vangyzen
2019-12-15 01:26:57 +00:00
Conrad Meyer
9d4710591b linuxkpi: Drop incompatible __deprecated definition
Probably all of these linuxkpi stubs should be '#ifndef' guarded, but maybe
that would prevent people from noticing when they are defined.

Introduced in r355759.  For some reason I only ran a buildworld and not a
kernel.  Mea culpa.

Reported by:	Mark Millard
X-MFC-with:	r355759
2019-12-14 23:39:32 +00:00
Conrad Meyer
215332ffe7 cdefs: Add __deprecated(message) function attribute macro
The legacy version of GCC4 currently in base does not support the
parameterized form of this function attribute, as recently introduced in
stdlib.h (r355747).

As we have done for other function attributes with similar compatibility
problems, add a version-compatible definition in sys/cdefs.h.  Note that
Clang defines itself to be GCC 4, so one must check for __clang__ in
addition to __GNUC__ version.  On legacy GCC 4, the macro expands to just
the __deprecated__ attribute; on modern GCC or Clang, the macro expands to
the parameterized variant with the message.
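
A version-gated macro of this kind might look like the sketch below. The
macro name `EX_DEPRECATED` and the example functions are hypothetical, and the
real sys/cdefs.h definition may differ; the GCC 4.5 threshold follows the
later cdefs commit in this log.

```c
/*
 * Clang also defines __GNUC__ (reporting itself as GCC 4), so it is
 * checked separately before looking at the GCC version.
 */
#if defined(__clang__) || \
    (defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5)))
#define	EX_DEPRECATED(msg)	__attribute__((__deprecated__(msg)))
#elif defined(__GNUC__)
/* Legacy GCC: the attribute exists but takes no message argument. */
#define	EX_DEPRECATED(msg)	__attribute__((__deprecated__))
#else
#define	EX_DEPRECATED(msg)
#endif

int ex_old_api(void) EX_DEPRECATED("use ex_new_api() instead");
int ex_new_api(void);

int
ex_new_api(void)
{
	return (42);
}

/* Defining the function does not warn; only callers get the diagnostic. */
int
ex_old_api(void)
{
	return (ex_new_api());
}
```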

Ignoring legacy or unsupported compilers, the macro is also beneficial in
that it is a bit more ergonomic than the full
__attribute__((__deprecated__())) boilerplate.

Reported by:	CI (but not tinderbox); imp and others
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D22817
2019-12-14 21:52:49 +00:00
Rick Macklem
547a4dba9f Update the mount_nfs.8 man page to include NFSv4.2.
r355677 added NFSv4.2 support to the NFS client. This patch updates the
mount_nfs.8 man page to reflect that.
It also clarifies that the "nolockd" option does not apply to NFSv4 mounts.

This is a content change.
2019-12-14 21:49:47 +00:00
Doug Moore
9f70442a04 Simplify the processing of a leaf mask to find big-enough ranges of set
bits, by storing and modifying the complement of the original leaf
mask, and by avoiding some unnecessary intermediate variables in
computing the shift amounts. The logic is similar to what has recently
been committed to sys/sys/bitstring.h.
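
For readers unfamiliar with the problem, the sketch below finds the first run
of at least `count` set bits in a 64-bit mask using repeated shift-and-AND
steps. It is a generic technique, not the complement-based computation this
commit describes, and the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Return the lowest bit position that starts a run of at least count
 * consecutive set bits in mask, or -1 if there is none.
 * Requires 1 <= count <= 64.
 */
static int
ex_find_run(uint64_t mask, int count)
{
	int have, pos, shift;

	/*
	 * Invariant: bit i of mask is set iff bits i..i+have-1 of the
	 * original mask were all set.  Each step extends the known run
	 * length by shift, which never exceeds have, so no gap appears.
	 */
	for (have = 1; have < count && mask != 0; have += shift) {
		shift = (count - have < have) ? count - have : have;
		mask &= mask >> shift;
	}
	if (mask == 0)
		return (-1);
	for (pos = 0; (mask & 1) == 0; pos++)
		mask >>= 1;
	return (pos);
}
```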

Compute better hint updates for the case when the cursor starts in
mid-leaf, and eliminates some otherwise viable solutions. Assume the
worst case, that all the eliminated offsets could have been solutions,
and you can still compute a better hint than we use now.

Eliminate some unnecessary conditional control flow.

Approved by: alc
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D22666
2019-12-14 19:44:42 +00:00
Michal Meloun
dfd1d0fcab Add driver for Rockchip PCIe root complex found in RK3399 SOC.
Unfortunately, there are some limitations:
- The memory aperture of this controller is only 16MiB, so it is nearly
  unusable for graphics cards.
- Every attempt to generate a type 1 config cycle causes a trap.
  These config cycles are disabled for now, so cards with a PCIe switch
  are not supported.
- In some cases, an attempt to do a config cycle to a (probably) not-yet-ready
  card also causes a trap. This cannot be detected at runtime, but it seems
  to be a very rare issue.

MFC after:	3 weeks
Differential Revision:  https://reviews.freebsd.org/D22724
2019-12-14 14:56:34 +00:00
Edward Tomasz Napierala
cf69fe66d4 Add sync_file_range(2) implementation to linux(4); it's a thin wrapper
over the usual fsync(2).

This silences some warnings when running "apt-get upgrade".

Reviewed by:	brooks, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22371
2019-12-14 13:37:17 +00:00
Edward Tomasz Napierala
0cde2b3239 Regen after r355752.
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22371
2019-12-14 13:32:37 +00:00
Edward Tomasz Napierala
0610f417a4 Fix definitions for linuxulator's sync_file_range(2).
Reviewed by:	brooks, emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22371
2019-12-14 13:30:43 +00:00
Edward Tomasz Napierala
8ad16e5541 Add 'sesutil show' subcommand to show enclosure and its contents
in a user-friendly way.

Reviewed by:	allanjude, bcr (manpages)
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D22567
2019-12-14 10:58:06 +00:00
Edward Tomasz Napierala
1e62ecedf2 Add -M option to nc(1), which makes it print the TCP connection
statistics obtained with stats(3) in JSON format to standard error.

Reviewed by:	allanjude, thj, cem (earlier version)
Tested by:	thj
MFC after:	2 weeks
Relnotes:	yes
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D21324
2019-12-14 10:53:52 +00:00
Conrad Meyer
c62ff2800b Deprecate sranddev(3) API
It serves no useful purpose and wasn't as popular as its equally meritless
cousin, srandomdev(3).

Setting aside the problems with rand(3) in general, the problem with this
interface is that the seed isn't shared with the caller (other than by
attacking the output of the generator, which is trivial, but not a hallmark of
pleasant API design).  The (arguable) utility of rand(3) or random(3) is as a
semi-fast simulation generator which produces consistent results from a given
seed.  These are mutually at odds.  Furthermore, sometimes people got the
mistaken impression that a high quality random seed meant a weak generator like
rand(3) or random(3) could be used for things like cryptographic key
generation.  This is absolutely not so.

The API was never part of a standard and was not widely used in tree.  Existing
in-tree uses have all been removed.

Possible replacement in out of tree codebases:

	char buf[3];
	time_t t;

	time(&t);
	strftime(buf, sizeof(buf), "%S", gmtime(&t));
	srand(atoi(buf));

Relnotes:	yes
2019-12-14 08:28:10 +00:00
Ryan Libby
815db20425 uma dbg: flexible size for slab debug bitset too
Recently (r355315) the size of the struct uma_slab bitset field us_free
became dynamic instead of conservative.  Now, make the debug bitset
size dynamic too.  The debug bitset is INVARIANTS-only, so in fact we
don't care too much about the space savings that results from this, but
enabling minimally-sized slabs on INVARIANTS builds is still important
in order to be able to test new slab layouts effectively.

Reviewed by:	jeff (previous version), markj (previous version)
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D22759
2019-12-14 05:21:56 +00:00
Kristof Provost
cca2ea64e9 pf: Make request_maxcount runtime adjustable
There's no reason for this to be a tunable. It's perfectly safe to
change this at runtime.

Reviewed by:	Lutz Donnerhacke
Differential Revision:	https://reviews.freebsd.org/D22737
2019-12-14 02:06:07 +00:00
Kristof Provost
3c7fbb06a0 pfctl: Warn users when they run into kernel limits
Warn users when they try to add/delete/modify more items than the kernel will
allow.

Reviewed by:	allanjude (previous version), Lutz Donnerhacke
Differential Revision:	https://reviews.freebsd.org/D22733
2019-12-14 02:03:47 +00:00
Mateusz Guzik
6f836483ec Remove the useless return value from proc_set_cred
2019-12-14 00:43:17 +00:00
Scott Long
97faa4c470 Add accessors for the Vendor Specific Extended Capability (VSEC)
Parse out the VSEC.  If the user invokes a second -c command line option,
do a hex dump of the vendor data.

Reviewed by:	imp
MFC after:	3 days
Sponsored by:	Intel
Differential Revision:	http://reviews.freebsd.org/D22808
2019-12-13 23:46:59 +00:00
John Baldwin
93dafad57a Expand net epoch in the cxgbe TOE driver to satisfy assertions.
Reviewed by:	np
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D22483
2019-12-13 23:33:54 +00:00
Jung-uk Kim
f9a6772ec3 MFV: r355716
Merge ACPICA 20191213.
2019-12-13 23:28:52 +00:00
Ian Lepore
85a6a404fb Include ofw_bus_if.h in SRCS only on systems configured with the FDT option.
2019-12-13 23:22:49 +00:00
Warner Losh
c0834910ca Better copyright advice
Document the common practices around copyrights with "all rights reserved" in
them as new copyright notices get added.

It's an open question whether to point people at the fact that since the Berne
convention was ratified, All rights reserved is largely obsolete.
https://en.wikipedia.org/wiki/All_rights_reserved#Obsolescence has the
details. The committer's guide will be revised shortly, and it's likely that's a
better place for this discussion. If not, I'll add a blurb here.

Reviewed by: jhb@, brooks@
Differential Review: https://reviews.freebsd.org/D22800
2019-12-13 22:32:05 +00:00
Andriy Gapon
c527e92004 zfs boot: fix a crash in a rarely taken path in fzap_lookup
Instead of passing NULL to fzap_name_equal and crashing, just return
ENOENT.  This happened when higher bits of a hash of the searched key
(its hash prefix) matched a hash prefix of some key in the ZAP, but the
full hash value of the searched key did not match any key in the ZAP.
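
The shape of the fix can be sketched as below: when no entry matches the full
hash, the lookup reports ENOENT instead of handing a NULL entry to the name
comparison. The table layout and every name here are hypothetical and far
simpler than the on-disk ZAP format.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory entry; real ZAP leaves are laid out differently. */
struct ex_entry {
	uint64_t ex_hash;
	const char *ex_name;
	uint64_t ex_value;
};

/* Return the entry whose full hash and name both match, or NULL. */
static const struct ex_entry *
ex_find(const struct ex_entry *tab, size_t n, uint64_t hash, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tab[i].ex_hash == hash &&
		    strcmp(tab[i].ex_name, name) == 0)
			return (&tab[i]);
	}
	return (NULL);
}

static int
ex_lookup(const struct ex_entry *tab, size_t n, uint64_t hash,
    const char *name, uint64_t *valuep)
{
	const struct ex_entry *e;

	e = ex_find(tab, n, hash, name);
	if (e == NULL)
		return (ENOENT);	/* never dereference a missing entry */
	*valuep = e->ex_value;
	return (0);
}
```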

I observed this problem when loader tried to look up
"features_for_read" in a particular old pool that predates pool
features.

MFC after:	2 weeks
Sponsored by:	Panzura
2019-12-13 22:04:13 +00:00
Warner Losh
9f07ef760a Be consistent about checking return value from bus_delayed_attach_children.
Most places checked, but a couple last minute changes didn't. Make them all use
the return value.

Noticed by: rpokala@
2019-12-13 21:39:20 +00:00
Warner Losh
16db09d8c1 Don't use contractions. Fix the date.
Contractions cause problems for translators, so s/aren't/are not/ in the one
place this slipped through.

While here, noticed I committed with the date I did the work, not today's
date. Fix that too.

Noticed by: bjk@
2019-12-13 21:39:10 +00:00
Rick Macklem
f808cf7294 Silence some "might not be initialized" warnings for riscv64.
None of these cases were actually using the variable(s) uninitialized, but
I figured that silencing the warnings via initializing them made sense.

Some of these predated r355677.
2019-12-13 21:38:08 +00:00