Commit Graph

270115 Commits

Author SHA1 Message Date
cem
b4901e2002 Add unlocked/SMR fast path to getblk()
Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked()
API wrapping this functionality.  Use it for a fast path in getblkx(),
falling back to locked lookup if we raced a thread changing the buf's
identity.

Reported by:	Attilio
Reviewed by:	kib, markj
Testing:	pho (in progress)
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25782
2020-07-24 17:34:04 +00:00
cem
302dcf6e71 Use SMR to provide safe unlocked lookup for pctries from SMR zones
Adapt r358130, for the almost identical vm_radix, to the pctrie subsystem.
Like that change, the tree is kept correct for readers with store barriers
and careful ordering.  Existing locks serialize writers.

Add a PCTRIE_DEFINE_SMR() wrapper that takes an additional smr_t parameter
and instantiates a FOO_PCTRIE_LOOKUP_UNLOCKED() function, in addition to the
usual definitions created by PCTRIE_DEFINE().

Interface consumers will be introduced in later commits.

As future work, it might be nice to add vm_radix algorithms missing from
generic pctrie to the pctrie interface, and then adapt vm_radix to use
pctrie.

Reported by:	Attilio
Reviewed by:	markj
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25781
2020-07-24 17:32:10 +00:00
mjg
1a763adf3f lockmgr: add missing 'continue' to account for spuriously failed fcmpset
PR:		248245
Reported by:	gbe
Noted by:	markj
Fixes by:	r363415 ("lockmgr: add adaptive spinning")
2020-07-24 17:28:24 +00:00
manu
db31c96074 mmccam: Add some aliases for non-mmccam to mmccam transition
A new tunable is present, kern.cam.sdda.mmcsd_compat to enable
this feature or not (default is enabled)
2020-07-24 17:11:14 +00:00
jmallett
3f7465dc18 Remove reference to nlist(3) missed in SCCS revision 5.26 by mckusick
when converting rwhod(8) to using kern.boottime ather than extracting
the boot time from kernel memory directly.

Reviewed by:	imp
2020-07-24 16:58:13 +00:00
0mp
92664b44e9 Fix grammar issues and typos
Reported by:	ian
MFC after:	1 week
2020-07-24 15:04:34 +00:00
0mp
088ab3df51 Document that force_depend() supports only /etc/rc.d scripts
Currently, force_depend() from rc.subr(8) does not support depending on
scripts outside of /etc/rc.d (like /usr/local/etc/rc.d). The /etc/rc.d path
is hard-coded into force_depend().

MFC after:	1 week
2020-07-24 14:17:37 +00:00
mjg
d6d69d175f vm: fix swap reservation leak and clean up surrounding code
The code did not subtract from the global counter if per-uid reservation
failed.

Cleanup highlights:
- load overcommit once
- move per-uid manipulation to dedicated routines
- don't fetch wire count if requested size is below the limit
- convert return type from int to bool
- ifdef the routines with _KERNEL to keep vm.h compilable by userspace

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D25787
2020-07-24 13:23:32 +00:00
arichardson
265ebdf071 Include TMPFS in all the GENERIC kernel configs
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
2020-07-24 08:40:04 +00:00
jmg
7086f1dd4f fix up docs for m_getjcl as well.. 2020-07-24 00:47:14 +00:00
jmg
56012feee9 document that m_get2 only accepts up to MJUMPAGESIZE.. 2020-07-24 00:35:21 +00:00
jhb
fb264c6326 Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received
by a NIC.  At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse.  Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications.  However, there is an
important difference:

- In the transmit case, the socket buffer is always a single "record"
  holding a chain of mbufs.  Not-yet-encrypted mbufs are marked not
  ready (M_NOTREADY) and released to protocols for transmit by marking
  mbufs ready once their data is encrypted.

- In the receive case, incoming (encrypted) data appended to the
  socket buffer is still a single stream of data from the protocol,
  but decrypted TLS records are stored as separate records in the
  socket buffer and read individually via recvmsg().

Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seemed to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message).  Also, such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.

As such, I settled on the following solution:

- Socket buffers now contain an additional chain of mbufs (sb_mtls,
  sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
  the protocol layer.  These mbufs are still marked M_NOTREADY, but
  soreceive*() generally don't know about them (except that they will
  block waiting for data to be decrypted for a blocking read).

- Each time a new mbuf is appended to this TLS mbuf chain, the socket
  buffer peeks at the TLS record header at the head of the chain to
  determine the encrypted record's length.  If enough data is queued
  for the TLS record, the socket is placed on a per-CPU TLS workqueue
  (reusing the existing KTLS workqueues and worker threads).

- The worker thread loops over the TLS mbuf chain decrypting records
  until it runs out of data.  Each record is detached from the TLS
  mbuf chain while it is being decrypted to keep the mbufs "pinned".
  However, a new sb_dtlscc field tracks the character count of the
  detached record and sbcut()/sbdrop() is updated to account for the
  detached record.  After the record is decrypted, the worker thread
  first checks to see if sbcut() dropped the record.  If so, it is
  freed (can happen when a socket is closed with pending data).
  Otherwise, the header and trailer are stripped from the original
  mbufs, a control message is created holding the decrypted TLS
  header, and the decrypted TLS record is appended to the "normal"
  socket buffer chain.

(Side note: the SBCHECK() infrastucture was very useful as I was
 able to add assertions there about the TLS chain that caught several
 bugs during development.)

Tested by:	rmacklem (various versions)
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24628
2020-07-23 23:48:18 +00:00
bdrewery
3d54e55ad8 Limit gmirror failpoint tests to the test worker
This avoids injecting errors into the test system's mirrors.

gnop seems like a good solution here but it injects errors at the wrong
place vs where these tests expect and does not support a 'max global count'
like the failpoints do with 'n*' syntax.

Reviewed by:	cem, vangyzen
Sponsored by:	Dell EMC Isilon
2020-07-23 23:29:50 +00:00
jmg
86be00644a update example to make it active when creating a new boot method...
Clean up some of the sentences and grammar...

make igor happy..
2020-07-23 22:28:35 +00:00
jhb
42040b2aa5 Consolidate duplicated code into a ktls_ocf_dispatch function.
This function manages the loop around crypto_dispatch and coordination
with ktls_ocf_callback.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25757
2020-07-23 21:43:06 +00:00
jhb
a9c79eb484 Set si_trapno to the exception code from esr.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25771
2020-07-23 21:40:03 +00:00
jhb
09483f4a44 Pass the right size to memcpy() when copying the array of FP registers.
The size of the containing structure was passed instead of the size of
the array.  This happened to be harmless as the extra word copied is
one we copy in the next line anyway.

Reported by:	CHERI (bounds check violation)
Reviewed by:	brooks, imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25791
2020-07-23 21:33:10 +00:00
jhb
15cdd693d3 Set si_addr to badvaddr for TLB faults.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25775
2020-07-23 20:08:42 +00:00
emaste
134d7aa0e1 md5: return non-zero if built-in tests (-x) fail
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-23 20:06:24 +00:00
tuexen
ecb61b7599 Clear the pointer to the socket when closing it also in case of
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.

MFC after:		1 week
2020-07-23 19:43:49 +00:00
emaste
b03abee6f4 modules/crypto: disable optimized assembly skein1024 implementation
It is presumably broken in the same way as userland skein1024 (see r363454)

PR:		248221
2020-07-23 19:19:33 +00:00
emaste
1795eeadc2 libmd: temporarily disable optimized assembly skein1024 implementation
It is apparently broken when assembled by contemporary GNU as as well as
Clang IAS (which is used in the default configuration).

PR:		248221
Reported by:	pizzamig
Sponsored by:	The FreeBSD Foundation
2020-07-23 18:55:47 +00:00
cy
84a482ddbb Document the IPFILTER_PREDEFINED environment variable.
PR:		248088
Reported by:	joeb1@a1poweruser.com
MFC after:	1 week
2020-07-23 17:39:49 +00:00
cy
6a19ffdd07 Load ipfilter, ipnat, and ippool rules, and start ipmon in a vnet jail.
PR:		248109
Reported by:	joeb1@a1poweruser.com
MFC after:	2 weeks
2020-07-23 17:39:45 +00:00
mjg
62572cdcfe locks: fix a long standing bug for primitives with kdtrace but without spinning
In such a case the second argument to lock_delay_arg_init was NULL which was
immediately causing a null pointer deref.

Since the sructure is only used for spin count, provide a dedicate routine
initializing it.

Reported by:	andrew
2020-07-23 17:26:53 +00:00
dougm
504ba65c8c Rank balanced (RB) trees are a class of balanced trees that includes
AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a
recently discovered member of that class. This change replaces
red-black rebalancing with weak AVL rebalancing in the RB tree macros.

Wavl trees sit between AVL and red-black trees in terms of how
strictly balance is enforced. They have the stricter balance of AVL
trees as the tree is built - a wavl tree is an AVL tree until the
first deletion. Once removals start, wavl trees are lazier about
rebalancing than AVL trees, so that removals can be fast, but the
balance of the tree can decay to that of a red-black tree. Subsequent
insertions can push balance back toward the stricter AVL conditions.

Removing a node from a wavl tree never requires more than two
rotations, which is better than either red-black or AVL
trees. Inserting a node into a wavl tree never requires more than two
rotations, which matches red-black and AVL trees. The only
disadvantage of wavl trees to red-black trees is that more insertions
are likely to adjust the tree a bit. That's the cost of keeping the
tree more balanced.

Testing has shown that for the cases where red-black trees do worst,
wavl trees better balance leads to faster lookups, so that if lookups
outnumber insertions by a nontrivial amount, lookup time saved exceeds
the extra cost of balancing.

Reviewed by:	alc, gbe, markj
Tested by:	pho
Discussed with:	emaste
Differential Revision:	https://reviews.freebsd.org/D25480
2020-07-23 17:16:20 +00:00
markj
0c33107f0a rc.firewall: Merge two identical conditions into one.
No functional change intended.

PR:		247949
Submitted by:	Jose Luis Duran <jlduran@gmail.com>
MFC after:	1 week
2020-07-23 15:03:28 +00:00
mav
e09cb761ba Add missing newlines.
MFC after:	3 days
2020-07-23 14:33:25 +00:00
markj
6fba629def MFOpenZFS: Fix zpool history unbounded memory usage
In original implementation, zpool history will read the whole history
before printing anything, causing memory usage goes unbounded. We fix
this by breaking it into read-print iterations.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #9516

Note, this change changes the libzfs.so ABI by modifying the prototype
of zpool_get_history().  Since libzfs is effectively private to the base
system it is anticipated that this will not be a problem.

PR:		247557
Obtained from:	OpenZFS
Reported and tested by:	Sam Vaughan <samjvaughan@gmail.com>
Discussed with:	freqlabs
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25745
openzfs/zfs@7125a109dc
2020-07-23 14:21:45 +00:00
markj
8ae7ab7cc8 cuse: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25765
2020-07-23 14:03:37 +00:00
markj
058e99c184 ntb: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	cem, mav
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25768
2020-07-23 14:03:24 +00:00
mjg
eb2bcc3093 vm: annotate swap_reserved with __exclusive_cache_line
The counter keeps being updated all the time and variables read afterwards
share the cacheline. Note this still fundamentally does not scale and needs
to be replaced, in the meantime gets a bandaid.

brk1_processes -t 52 ops/s:
before: 8598298
after:  9098080
2020-07-23 08:42:16 +00:00
tuexen
e9af885f6d Detect and handle an invalid reassembly constellation, which results in
a memory leak.

Thanks to Felix Weinrank for finding this issue using fuzz testing the
userland stack.

MFC after:		1 week
2020-07-23 01:35:24 +00:00
brooks
c0f74adc04 Correct a type-mismatch between xdr_long and the variable "bad".
Way back in r28911 (August 1997, CVS rev 1.22) we imported a NetBSD
information leak fix via OpenBSD.  Unfortunatly we failed to track the
followup commit that fixed the type of the error code.  Apply the change
from int to long now.

Reviewed by:	emaste
Found by:	CHERI
Obtained from:	CheriBSD
MFC after:	3 days
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25779
2020-07-22 23:39:58 +00:00
brooks
cbb69bc11a Use SI_ORDER_(FOURTH|FIFTH) rather than bespoke versions.
No functional change.

When these SYSINITs were added these macros didn't exist.

Reviewed by:	imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25758
2020-07-22 23:35:41 +00:00
rmacklem
b3eb632fd0 Modify writing to mirrored pNFS DSs to prepare for use of ext_pgs mbufs.
This patch modifies writing to mirrored pNFS DSs slightly so that there is
only one m_copym() call for a mirrored pair instead of two of them.
This call replaces the custom nfsm_copym() call, which is no longer needed
and deleted by this patch. The patch does introduce a new nfsm_split()
function that only calls m_split() for the non-ext_pgs case.
The semantics of nfsm_uiombuflist() is changed to include code that nul
pads the generated mbuf list. This was done by nfsm_copym() prior to this patch.

The main reason for this change is that it allows the data to be a list
of ext_pgs mbufs, since the m_copym() is for the entire mbuf list.
This support will be added in a future commit.

This patch only affects writing to mirrored flexible file layout pNFS servers.
2020-07-22 23:33:37 +00:00
jhb
143dbd1964 Add missing space after switch.
Reviewed by:	br, emaste
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25778
2020-07-22 22:51:14 +00:00
brooks
a90c65d635 Avoid reading one byte before the path buffer.
This happens when there's only one component (e.g. "/foo"). This
(mostly-harmless) bug has been present since June 1990 when it was
commited to mountd.c SCCS version 5.9.

Note: the bug is on the second changed line, the first line is changed
for visual consistency.

Reviewed by:	cem, emaste, mckusick, rmacklem
Found with:	CHERI
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25759
2020-07-22 21:44:51 +00:00
mav
0f96b37978 Untie nmi_handle_intr() from DEV_ISA.
The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is
already wrapped.  Entering debugger on NMI does not really depend on ISA.

MFC after:	2 weeks
2020-07-22 20:15:21 +00:00
manu
05581ba980 mmccam: Add support for 1.2V and 1.8V eMMC
If the card reports that it support 1.2V or 1.8V signaling switch to this voltage.

Submitted by:	kibab
2020-07-22 19:08:05 +00:00
manu
4914b590b6 mmccam: Add support for 1.8V sdcard
If the card reports that it support 1.8V signaling switch to this voltage.
While here update the list of mode for mmccam.

Submitted by:	kibab
2020-07-22 19:04:45 +00:00
manu
13cd98bf11 aw_mmc: Start a mmccam discovery when the CD handler is called.
Submitted by:	kibab
2020-07-22 18:33:36 +00:00
manu
2a668104d2 mmccam: Add a generic mmccam_start_discovery function
This is a generic function start a scan request for the given
cam_sim.
Other driver can now just use this function to request a new rescan.

Submitted by:	kibab
2020-07-22 18:30:17 +00:00
manu
54c1a28759 mmccam: Use a sbuf for the mmc ident function
While here fix a typo.
2020-07-22 18:21:37 +00:00
lwhsu
e3c5c008a5 Fix sys.geom.class.eli.onetime_test.onetime after r363402
PR:		247954
X-MFC with:	r363402
Sponsored by:	The FreeBSD Foundation
2020-07-22 17:37:11 +00:00
manu
af4440e10e mmc_xpt: Fix debug messages
PROBE_RESET was printed for PROBE_IDENTIFY, fix this.
While here add one for the PROBE_RESET.

Submitted by:	kibab
2020-07-22 17:36:28 +00:00
kevans
ea470e8f40 pkg-bootstrap: complain on improper pkg bootstrap usage
Right now, the bootstrap will gloss over things like pkg bootstrap -x or
pkg bootstrap -f pkg. Make it more clear that this is incorrect, and hint
at the correct formatting.

Reported by:	jhb (IIRC via IRC)
Approved by:	bapt, jhb, manu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D24750
2020-07-22 17:33:35 +00:00
markj
59b94fa393 usb(4): Stop checking for failures from malloc(M_WAITOK).
Handle the fact that parts of usb(4) can be compiled into the boot
loader, where M_WAITOK does not guarantee a successful allocation.

PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org> (original version)
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25706
2020-07-22 14:32:47 +00:00
thj
36ae2ebd15 Add tests for "add", "change" and "delete" functionality of /sbin/route.
Add tests to cover "add", "change" and "delete" functionality of /sbin/route
for ipv4 and ipv6. These tests for the existing route tool are the first step
towards creating libroute.

Submitted by:   Ahsan Barkati
Sponsored by:   Google, Inc. (GSoC 2020)
Reviewed by:    kp, thj
Approved by:    bz (mentor)
MFC after:      1 month
Differential Revision:  https://reviews.freebsd.org/D25220
2020-07-22 13:49:54 +00:00
gbe
ab6c7929bb geli(8): Add missing commands in the EXAMPLES section
- Add a missing 'geli attach' command
- Fix the passphrase prompt for a 'geli attach' command

Reported by:	Fabian Keil <freebsd-listen at fabiankeil dot de>
Reviewed by:	bcr (mentor)
Approved by:	bcr (mentor)
Differential Revision:	https://reviews.freebsd.org/D25761
2020-07-22 13:00:56 +00:00