262748 Commits

Author SHA1 Message Date
mjg
6090f91124 vfs: convert struct mount counters to per-cpu
There are 3 counters modified all the time in this structure - one for
keeping the structure alive, one for preventing unmount and one for
tracking active writers. Exact values of these counters are very rarely
needed, which makes them a prime candidate for conversion to a per-cpu
scheme, resulting in much better performance.

Sample benchmark performing fstatfs (modifying 2 out of 3 counters) on
a 104-way 2 socket Skylake system:
before:   852393 ops/s
after:  76682077 ops/s

Reviewed by:	kib, jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21637
2019-09-16 21:37:47 +00:00
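
The per-CPU scheme described in 6090f91124 can be illustrated with a small userspace sketch. This is not the kernel code from the commit: the `sk_` names, the fixed CPU count and the 64-byte cache-line size are assumptions made for illustration. The point is only that updates touch a private slot, while the rarely needed exact value is obtained by summing all slots.

```c
#include <stdint.h>

#define SK_NCPU       4    /* hypothetical CPU count for the sketch */
#define SK_CACHE_LINE 64   /* assumed cache-line size */

/* One slot per CPU, padded so that slots do not share a cache line. */
struct sk_pcpu_slot {
	int64_t val;
	char    pad[SK_CACHE_LINE - sizeof(int64_t)];
};

struct sk_pcpu_counter {
	struct sk_pcpu_slot slot[SK_NCPU];
};

/* Fast path: adjust only the slot that belongs to the calling CPU. */
static void
sk_pcpu_add(struct sk_pcpu_counter *c, int cpu, int64_t delta)
{
	c->slot[cpu].val += delta;
}

/* Slow path: the exact value is the sum over all slots. */
static int64_t
sk_pcpu_fetch(const struct sk_pcpu_counter *c)
{
	int64_t sum = 0;

	for (int i = 0; i < SK_NCPU; i++)
		sum += c->slot[i].val;
	return (sum);
}
```

A single slot may go negative (a reference taken on one CPU can be dropped on another); only the sum is meaningful. A real kernel would also have to keep the thread on its CPU for the duration of the update, which this sketch omits.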
mjg
099eed319c vfs: manage mnt_writeopcount with atomics
See r352424.

Reviewed by:	kib, jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21575
2019-09-16 21:33:16 +00:00
mjg
e19820cd96 vfs: manage mnt_lockref with atomics
See r352424.

Reviewed by:	kib, jeff
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21574
2019-09-16 21:32:21 +00:00
mjg
bec2ffc72a vfs: manage mnt_ref with atomics
A new primitive is introduced to denote sections which can operate locklessly
on aspects of struct mount, but which can also be disabled if necessary.
This provides an opportunity to start scaling common case modifications
while providing stable state of the struct when facing unmount, write
suspension or other events.

mnt_ref is the first counter to start being managed in this manner with
the intent to make it per-cpu.

Reviewed by:	kib, jeff
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21425
2019-09-16 21:31:02 +00:00
dmgk
b221bed0fa Add myself (dmgk) to calendar.freebsd
Approved by:	tz (mentor)
Differential Revision:	https://reviews.freebsd.org/D21675
2019-09-16 20:43:20 +00:00
dmgk
c285745507 Add myself (dmgk) as a ports committer
Approved by:	tz (mentor)
Differential Revision:	https://reviews.freebsd.org/D21672
2019-09-16 20:41:37 +00:00
tsoome
1a18c8df60 loader: Malloc(0) should return NULL.
We really should not allocate anything with size 0.
2019-09-16 20:28:08 +00:00
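
The behaviour described in 1a18c8df60 amounts to an early return for zero-sized requests. A minimal sketch, assuming a hypothetical wrapper `sk_malloc()` layered on the C library `malloc()`; it is not the loader's own `Malloc()` implementation:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: never allocate anything for a size of 0. */
static void *
sk_malloc(size_t size)
{
	if (size == 0)
		return (NULL);
	return (malloc(size));
}
```

Callers then have to treat NULL from a zero-sized request as "nothing allocated" rather than as an out-of-memory condition.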
tsoome
236a8a7a49 loader_4th: scan_buffer can leave empty string on stack
When the file processing is done, we will have a string with length 0 on the stack and we will attempt to
allocate 0 bytes.
2019-09-16 20:26:53 +00:00
asomers
eadbc9bd79 Fix an off-by-one error from r351961
That revision addressed a Coverity CID that could lead to a buffer overflow,
but it had an off-by-one error in the buffer size check.

Reported by:	Coverity
Coverity CID:	1405530
MFC after:	3 days
MFC-With:	351961
Sponsored by:	The FreeBSD Foundation
2019-09-16 16:41:01 +00:00
asomers
0917480bfd fusefs: initialize C++ classes the Coverity way
Coverity complained that I wasn't initializing some class members until the
SetUp method.  Do it in the constructor instead.

Reported by:	Coverity
Coverity CIDs:	1404352, 1404378
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-09-16 15:56:21 +00:00
asomers
b1e999bf46 fusefs: fix some minor Coverity CIDs in the tests
Where open(2) is expected to fail, the tests should assert or expect that
its return value is -1.  These tests all accepted too much but happened to
pass anyway.

Reported by:	Coverity
Coverity CID:	1404512, 1404378, 1404504, 1404483
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2019-09-16 15:44:59 +00:00
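
The check described in b1e999bf46 can be sketched in plain C. The actual fusefs tests are not reproduced here; `sk_open_fails_with()` is a hypothetical helper that treats the call as failed only when open(2) returns exactly -1 and errno matches the expected value.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 only if open(2) returned -1 with the expected errno. */
static int
sk_open_fails_with(const char *path, int flags, int expected_errno)
{
	errno = 0;
	int fd = open(path, flags);

	if (fd != -1) {
		/* The call unexpectedly succeeded; do not leak the descriptor. */
		close(fd);
		return (0);
	}
	return (errno == expected_errno);
}
```

A looser check, for example one that accepts any nonzero return value, would also accept a valid descriptor and so would accept too much.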
markj
64ae9dfb9d Assert that the refcount value is not VPRC_BLOCKED in vm_page_drop().
VPRC_BLOCKED is a refcount flag used to indicate that a thread is
tearing down mappings of a page.  When set, it causes attempts to wire a
page via a pmap lookup to fail.  It should never represent the last
reference to a page, so assert this.

Suggested by:	kib
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:16:48 +00:00
markj
e4864a6262 Fix a race in vm_page_dequeue_deferred_free() after r352110.
This function loaded the page's queue index before setting PGA_DEQUEUE.
In this window the page daemon may have deactivated the page, updating
its queue index.  Make the operation atomic using vm_page_pqstate_cmpset();
the page daemon will not modify the page once it observes that PGA_DEQUEUE
is set.

Reported and tested by:	pho
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:12:49 +00:00
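
The fix in e4864a6262 replaces a separate load and flag update with one atomic compare-and-set. The sketch below shows that general pattern with C11 atomics. It does not reproduce vm_page_pqstate_cmpset(); the packed layout (queue index in bits 8-15, flags in the low byte) and all `sk_`/`SK_` names are assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_DEQUEUE 0x01u   /* hypothetical "dequeue requested" flag */

/*
 * Set SK_DEQUEUE and report the queue index that was current in the
 * same atomic snapshot, so no other thread can change the index in
 * between. Returns false if the flag was already set.
 */
static bool
sk_request_dequeue(_Atomic uint32_t *state, uint8_t *queuep)
{
	uint32_t old = atomic_load(state);
	uint32_t next;

	do {
		if (old & SK_DEQUEUE)
			return (false);
		next = old | SK_DEQUEUE;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(state, &old, next));

	*queuep = (uint8_t)(old >> 8);
	return (true);
}
```

Because the index is taken from the value that the compare-and-set actually replaced, a concurrent update of the index forces a retry instead of being silently missed.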
markj
b17d3089a7 Fix a page leak in vm_page_reclaim_run().
After r352110 the attempt to remove mappings of the page being replaced
may fail if the page is wired.  In this case we must free the replacement
page.

Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:09:31 +00:00
markj
dcb49eef76 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:06:19 +00:00
markj
3616760326 Revert r352406, which contained changes I didn't intend to commit.
2019-09-16 15:04:45 +00:00
markj
543f9366b9 Fix a couple of nits in r352110.
- Remove a dead variable from the amd64 pmap_extract_and_hold().
- Fix grammar in the vm_page_wire man page.

Reported by:	alc
Reviewed by:	alc, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D21639
2019-09-16 15:03:12 +00:00
asomers
395339ca27 fusefs: fix some minor issues with fuse_vnode_setparent
* When unparenting a vnode, actually clear the flag. AFAIK this is basically
  a no-op because we only unparent a vnode when reclaiming it or when
  unlinking.

* There's no need to call fuse_vnode_setparent during reclaim, because we're
  about to free the vnode data anyway.

Reviewed by:	emaste
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21630
2019-09-16 14:51:49 +00:00
kib
607db409c8 nfscl_loadattrcache: fix rest of the cases to not call
vnode_pager_setsize() under the node mutex.

r248567 moved some calls of vnode_pager_setsize() to after the node lock
is unlocked; do the rest now.

Reported and tested by:	peterj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-09-16 13:26:27 +00:00
yuripv
d53a47e89c sbuf(9): fix sbuf_drain_func typedef markup
Reviewed by:	0mp (previous version)
Differential Revision:	https://reviews.freebsd.org/D21569
2019-09-16 13:10:03 +00:00
manu
02ea78cd29 pkgbase: Move cap_mkdb from runtime to utilities POST-INSTALL
Since login and login.conf moved to the utilities package, also move
the related post-install commands.

Reported by:	mj-mailinglist@gmx.de
Reviewed by:	bapt
2019-09-16 12:51:30 +00:00
kevans
265b811972 Fix 20190507 UPDATING entry
The rc mechanism for loading kernel modules is actually called 'kld_list',
not 'kld_load'

Reported by:	yuripv
2019-09-16 12:44:44 +00:00
tuexen
8246248351 Don't write to memory outside of the allocated array for SACK blocks.
Obtained from:		rrs@
MFC after:		3 days
Sponsored by:		Netflix, Inc.
2019-09-16 08:18:05 +00:00
bapt
136394b1da Do not use our custom completion function, it is not needed anymore
2019-09-16 07:31:59 +00:00
kib
f03fdaa7ad Increase the size of the send and receive buffers for YP client rpc
calls to max allowed UDP datagram size.

Since the max allowed sizes for both keys and values were increased, the
old sizes of around 1K cause ypmatch(3) failures, while plain map
fetches work.

The buffers were reduced in r34146 from default UDP rpcclient values
to 1024/2304 due to the key and value size being 1K.

Reviewed by:	slavash
Sponsored by:	Mellanox Technologies
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21586
2019-09-16 06:42:01 +00:00
sjg
141dd35d7f Document logic for __DEFAULT_DEPENDENT_OPTIONS
Reviewed by:	stevek
Differential Revision:	https://reviews.freebsd.org/D21640
2019-09-16 00:32:23 +00:00
mav
25326dab4e Relax TX draining in ns8250_bus_transmit().
Since the TX interrupt is generated when THRE is set, waiting for TEMT to be
set means waiting for a full character transmission time.  At low speeds that
may take a while, burning CPU time while holding the sc_hwmtx lock, which is
itself congested.

This is a partial revert of r317659.

PR:		240121
MFC after:	2 weeks
2019-09-15 23:56:39 +00:00
delphij
643521d9cb Avoid mixing cluster numbers and sector numbers. Makes code more readable.
Obtained from:	NetBSD
MFC after:	2 weeks
2019-09-15 19:41:54 +00:00
ian
b7eca0bfb1 Apply a runtime patch to the FDT data for imx6 to fix iomuxc problems.
The latest imported FDT data defines a node for an iomuxc-gpr device,
which we don't support (or need, right now) in addition to the usual
iomuxc device.  Unfortunately, the dts improperly assigns overlapping
ranges of mmio space to both devices.  The -gpr device is also a syscon
and simple_mfd device.

At runtime the simple_mfd driver attaches for the iomuxc-gpr node, then
when the real iomuxc driver comes along later, it fails to attach because
it tries to allocate its register space, and it's already partially in
use by the bogus instance of simple_mfd.

This change works around the problem by simply disabling the node for
the iomuxc-gpr device, since we don't need it for anything.
2019-09-15 19:38:15 +00:00
tuexen
e7003e64b0 When the IP layer calls back into the SCTP layer to perform the SCTP
checksum computation, do not assume that the IP header chain and the
SCTP common header are in contiguous memory, although SCTP lays
out the mbuf chains that way. If there are IP-level options inserted
by the IP layer, the constraint is not fulfilled anymore.

This issue was found by running syzkaller. Thanks to markj@ who is
running an instance which also provides kernel dumps. This allowed me
to find this issue.

MFC after:		3 days
2019-09-15 18:29:45 +00:00
kevans
0beaa1237c rangelock: add rangelock_cookie_assert
A future change to posixshm to add file sealing (in DIFF_21391[0] and child)
will move locking out of shm_dotruncate as kern_shm_open() will require the
lock to be held across the dotruncate until the seal is actually applied.
For this, the cookie is passed into shm_dotruncate_locked which asserts
RCA_WLOCKED.

[0] Name changed to protect the innocent, hopefully, from getting autoclosed
due to this reference...

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D21628
2019-09-15 02:59:53 +00:00
ian
76cd763d29 Make the ti_sysc device quiet. It's an internal utility pseudo-device
that makes the upstream FDT data work right, so we don't need to see a
couple dozen instances of it spam the dmesg at boot time unless it's a
verbose boot.
2019-09-15 01:02:01 +00:00
dim
a56a9ff1ae Fix arm and aarch64 builds of libedit after r352275
On arm and arm64, where chars are unsigned by default, buildworld dies
with:

--- terminal.o ---
/usr/src/contrib/libedit/terminal.c:569:41: error: comparison of
integers of different signs: 'wint_t' (aka 'int') and 'wchar_t' (aka
'unsigned int') [-Werror,-Wsign-compare]
                                     el->el_cursor.v][where & 0370] !=
                                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
/usr/src/contrib/libedit/terminal.c:659:28: error: comparison of
integers of different signs: 'wint_t' (aka 'int') and 'wchar_t' (aka
'unsigned int') [-Werror,-Wsign-compare]
                                     [el->el_cursor.h] == MB_FILL_CHAR)
                                     ~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~

Fix this by making MB_FILL_CHAR a wint_t, so no casting is needed.

Note that in https://reviews.freebsd.org/D21584 this was also proposed
by Yuichiro Naito <naito.yuichiro_gmail.com>.

Reviewed by:	bapt
Subscribers:	naito.yuichiro_gmail.com, ml_vishwin.info
MFC after:	3 weeks
X-MFC-With:	r352275
Differential Revision: https://reviews.freebsd.org/D21657
2019-09-14 21:49:42 +00:00
bdragon
f51c2e0189 Fix aux_info corruption in rtld direct execution mode.
After the aux vector is moved, it is necessary to re-digest aux_info so the
pointers are updated to the new locations.

This was causing thread creation to fail on powerpc64 when using direct
execution due to a nonsense value being read for aux_info[AT_STACKPROT].

Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21656
2019-09-14 21:18:10 +00:00
ian
614d9507ba Create a mechanism for encoding a system errno into the IIC_Exxxxx space.
Errors are communicated between the i2c controller layer and upper layers
(iicbus and slave device drivers) using a set of IIC_Exxxxxx constants which
effectively define a private number space separate from (and having values
that conflict with) the system errno number space. Sometimes it is necessary
to report a plain old system error (especially EINTR) from the controller or
bus layer and have that value make it back across the syscall interface
intact.

I initially considered replicating a few "crucial" errno values with similar
names and new numbers, e.g., IIC_EINTR, IIC_ERESTART, etc. It seemed like
that had the potential to grow over time until many of the errno names were
duplicated into the IIC_Exxxxx space.

So instead, this defines a mechanism to "encode" an errno into the IIC_Exxxx
space by setting the high bit and putting the errno into the lower-order
bits; a new errno2iic() function does this. The existing iic2errno()
recognizes the encoded values and extracts the original errno out of the
encoded value. An interesting wrinkle occurs with the pseudo-error values
such as ERESTART -- they already have the high bit set, and turning it off
would be the wrong thing to do. Instead, iic2errno() recognizes that lots of
high bits are on (i.e., it's a negative number near to zero) and just
returns that value as-is.

Thus, existing drivers continue to work without needing any changes, and
there is now a way to return errno values from the lower layers. The first
use of that is in iicbus_poll() which does mtx_sleep() with the PCATCH flag,
and needs to return the errno from that up the call chain.

Differential Revision:	https://reviews.freebsd.org/D20975
2019-09-14 19:33:36 +00:00
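
The encoding described in 614d9507ba can be sketched as follows. Only the function names errno2iic() and iic2errno() appear in the commit message; the `sk_` functions here are stand-ins, the cutoff used to detect "a negative number near to zero" is a hypothetical value, and the mapping of ordinary IIC_Exxxxx codes is collapsed to EIO to keep the sketch short.

```c
#include <errno.h>
#include <limits.h>

#define SK_IIC_ERRNO_FLAG INT_MIN  /* high bit: an errno is encoded here */
#define SK_PSEUDO_MIN     (-255)   /* hypothetical "near zero" cutoff */

/* Encode an errno by setting the high bit; 0 stays 0. */
static int
sk_errno2iic(int err)
{
	if (err == 0)
		return (0);
	return (err | SK_IIC_ERRNO_FLAG);
}

static int
sk_iic2errno(int code)
{
	if ((code & SK_IIC_ERRNO_FLAG) == 0) {
		/* Ordinary IIC code; a real table is omitted in this sketch. */
		return (code == 0 ? 0 : EIO);
	}
	if (code >= SK_PSEUDO_MIN) {
		/* Many high bits set: a pseudo-error, passed through as-is. */
		return (code);
	}
	/* Encoded errno: clear the high bit to recover the original. */
	return (code & ~SK_IIC_ERRNO_FLAG);
}
```

With this layout an errno such as EINTR survives a round trip through both functions, while a small negative pseudo-error value is returned unchanged instead of having its high bit cleared.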
trasz
d316b2187b Introduce arb(3), the Array-based Red-Black Tree macros: similar
to the traditional tree(3) RB trees, but using an array (preallocated,
linear chunk of memory) to store the tree.

This avoids allocation overhead, improves memory locality,
and makes it trivially easy to share/transfer/copy the entire tree
without the need for marshalling.  The downside is that the size
is fixed at initialization time; there is no mechanism to resize
it.

This is one of the dependencies for the new stats(3) framework
(https://reviews.freebsd.org/D20477).

Reviewed by:	bcr (man pages), markj
Discussed with:	cem
MFC after:	2 weeks
Sponsored by:	Klara Inc, Netflix
Obtained from:	Netflix
Differential Revision:	https://reviews.freebsd.org/D20324
2019-09-14 19:23:46 +00:00
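
The trade-offs listed in d316b2187b follow from linking nodes by array index rather than by pointer. The sketch below is not the arb(3) macro API and is not a red-black tree: it is a plain unbalanced binary search tree, with all names hypothetical, used only to show that an index-linked tree in a fixed array can be copied as a flat block of memory.

```c
#include <stdint.h>

#define SK_CAP 8      /* capacity is fixed when the tree is declared */
#define SK_NIL (-1)   /* "no child" marker used in place of NULL */

struct sk_node {
	int     key;
	int32_t left, right;   /* indices into nodes[], not pointers */
};

struct sk_tree {
	int32_t        root;
	int32_t        used;
	struct sk_node nodes[SK_CAP];
};

static void
sk_init(struct sk_tree *t)
{
	t->root = SK_NIL;
	t->used = 0;
}

/* Returns 0 on success, -1 if the key exists or the array is full. */
static int
sk_insert(struct sk_tree *t, int key)
{
	int32_t *link = &t->root;

	while (*link != SK_NIL) {
		struct sk_node *n = &t->nodes[*link];
		if (key == n->key)
			return (-1);
		link = key < n->key ? &n->left : &n->right;
	}
	if (t->used == SK_CAP)
		return (-1);
	t->nodes[t->used].key = key;
	t->nodes[t->used].left = t->nodes[t->used].right = SK_NIL;
	*link = t->used++;
	return (0);
}

static int
sk_contains(const struct sk_tree *t, int key)
{
	int32_t i = t->root;

	while (i != SK_NIL) {
		const struct sk_node *n = &t->nodes[i];
		if (key == n->key)
			return (1);
		i = key < n->key ? n->left : n->right;
	}
	return (0);
}
```

Because the links are indices, a structure assignment or memcpy() yields a fully usable copy with no pointer fix-ups, and no allocation happens on insert. The cost is the fixed capacity.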
trasz
ae2a352825 Make pseudofs(9) create directory entries in order, instead
of the reverse.

This fixes the Linux sysctl(8) binary - it assumes the first two
directory entries are always "." and "..". There might be other
Linux apps affected by this.

NB it might be a good idea to rewrite it using queue(3).

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21550
2019-09-14 19:16:07 +00:00
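
One common way for a directory listing to come out reversed is insertion at the head of a singly linked list. The sketch below contrasts head and tail insertion. It is not the pseudofs code, and all names are hypothetical.

```c
#include <stddef.h>

struct sk_ent {
	const char    *name;
	struct sk_ent *next;
};

struct sk_dir {
	struct sk_ent *first;
	struct sk_ent *last;
};

/* Head insertion: the most recently added entry is listed first. */
static void
sk_add_head(struct sk_dir *d, struct sk_ent *e)
{
	e->next = d->first;
	d->first = e;
	if (d->last == NULL)
		d->last = e;
}

/* Tail insertion: entries are listed in creation order. */
static void
sk_add_tail(struct sk_dir *d, struct sk_ent *e)
{
	e->next = NULL;
	if (d->last != NULL)
		d->last->next = e;
	else
		d->first = e;
	d->last = e;
}
```

If "." and ".." are created first, tail insertion keeps them as the first two entries, which is the order the commit message says the Linux sysctl(8) binary assumes.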
ian
77b82b9f0c Include <lock.h>, required to use spinlocks in this code.
2019-09-14 18:20:14 +00:00
sg
b01bd2e8d6 amend r:352320 Fix date for sg@
Approved by:    bcr (mentor)
2019-09-14 14:26:30 +00:00
sg
ccc03a96c6 Set bcr@ mentor for sg@
Approved by:    bcr (mentor)
2019-09-14 12:40:46 +00:00
lwhsu
e7ef37d075 Improve the description of big5(5)
- Fix the statement that big5 is a de facto standard of Traditional Chinese
  text [1]
- Add a BUGS section that describes the problems of big5 and suggests using utf8

PR:		189095
Submitted by:	Brennan Vincent <brennan@umanwizard.com> [1]
Reviewed by:	Ting-Wei Lan <lantw44@gmail.com>
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21622
2019-09-14 08:15:16 +00:00
kevans
c1c015f3eb lualoader: Add reload-conf loader command
This command will trigger a reload of the configuration from disk. This is
useful if you've changed currdev from recovery media to local disk as much
as I have over the past ~2 hours and are tired of the extra keystrokes.

This is really just a glorified shortcut, but reload-conf is likely easier
to remember for other people and does save some keystrokes when reloading
the configuration. It is also resilient to the underlying config method
changing interface, but this is unlikely to happen.

MFC after:	1 week
2019-09-14 03:38:18 +00:00
jhibbits
476a473cab powerpc64/powernv: Add opal NVRAM driver for PowerNV systems
Add a very basic NVRAM driver for OPAL which can be used by the IBM
powerpc-utils nvram utility, not to be confused with the base nvram utility,
which only operates on powermac_nvram.

The IBM utility handles all partitions itself, treating the nvram device as
a plain store.

An alternative would be to manage partitions in the kernel, and augment the
base nvram utility to deal with different backing stores, but that
complicates the driver significantly.  Instead, present the same interface
IBM's utility expects, and we get the usage for free.

Tested by:	bdragon
2019-09-14 03:30:34 +00:00
chs
e8305d1e70 Add a "count_until_fail" option to gnop, which says to start failing
I/O requests after the given number have been allowed through.

Approved by:    imp (mentor)
Reviewed by:    rpokala kib 0mp mckusick
Sponsored by:   Netflix
Differential Revision:  https://reviews.freebsd.org/D21593
2019-09-13 23:03:56 +00:00
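
The option added in e8305d1e70 can be pictured as a request counter with a threshold. The sketch below only illustrates that idea: the structure, the function name, the use of 0 to mean "disabled" and the choice of EIO as the failure code are all assumptions, not the gnop implementation.

```c
#include <errno.h>
#include <stdint.h>

struct sk_nop {
	uint64_t count_until_fail;   /* 0 disables the feature (assumed) */
	uint64_t seen;               /* requests allowed through so far */
};

/* Returns 0 while requests are still allowed, EIO afterwards. */
static int
sk_nop_start(struct sk_nop *sc)
{
	if (sc->count_until_fail != 0 && sc->seen >= sc->count_until_fail)
		return (EIO);
	sc->seen++;
	return (0);
}
```

With a threshold of 2, the first two requests succeed and every later one fails, which gives a reproducible point at which a device starts returning errors.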
glebius
3d08ed450e Drivers may pass runt packets to filter. This is okay.
Reviewed by:	gallatin
2019-09-13 22:36:04 +00:00
dim
40f6431286 Include <stdint.h> in unwind-arm.h, since it uses uint32_t and uint64_t
in various declarations.

Otherwise, depending on how unwind-arm.h is included from other source
files, the compiler may complain that uint32_t and uint64_t are unknown
types.

MFC after:	3 days
2019-09-13 21:00:19 +00:00
cy
24b49a6e5f Add RELNOTES comment for r352304 discussing removing default mlock()
for ntpd.

Differential Revision:	https://reviews.freebsd.org/D21581
2019-09-13 20:21:59 +00:00
cy
745e6c3513 No longer mlock() ntpd pages in memory by default, thus allowing
its pages to be paged out as necessary.

To restore historic BSD behaviour add the following to ntp.conf:
	rlimit memlock 32

Discussed on:	freebsd-current@ between Sept 6-9, 2019
Reported by:	Users using ASLR with stack gap != 0
Reviewed by:	ian, kib, rgrimes (all previous versions)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D21581
2019-09-13 20:20:05 +00:00
kib
350a038c6c riscv trap_pfault: remove unneeded hold of the process around vm_fault() call.
This is a reappearance of the no-op code removed from other arches in r287625.

Reviewed by:	alc (as part of the larger patch), markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21645
2019-09-13 20:17:14 +00:00
br
7ebcf07f42 Add support for Intel Stratix 10 platform.
Intel Stratix 10 SoC includes a quad-core arm64 cluster and FPGA fabric.

This adds support for reconfiguring the FPGA.

Accessing the FPGA core of this SoC requires the EL3 privilege level,
while the kernel runs at the (lower) EL1 privilege level.

This provides an Intel service layer interface that uses SMCCC to pass
queries to the secure-monitor (EL3).

Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D21454
2019-09-13 16:50:57 +00:00