251690 Commits

Author SHA1 Message Date
Ilya Bakulin
badc50c270 Make it possible to get/set MMC frequency from camcontrol
Enhance camcontrol(8) so that it's possible to manually set frequency for SD/MMC cards.
While here, display more information about the current controller, such as
supported operating modes and VCCQ voltages, as well as current VCCQ voltage.

Reviewed by:	manu
Approved by:	imp (mentor)
Differential Revision:	https://reviews.freebsd.org/D25795
2020-07-24 21:14:59 +00:00
Alexander Motin
9977c593a7 Introduce ipi_self_from_nmi().
It allows safe IPI sending to current CPU from NMI context.

Unlike other ipi_*() functions this waits for delivery to leave LAPIC in
a state safe for interrupted code.

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-07-24 20:52:09 +00:00
Alexander Motin
279cd05b7e Use APIC_IPI_DEST_OTHERS for bitmapped IPIs too.
It should save a bunch of LAPIC register accesses.

MFC after:	2 weeks
2020-07-24 20:44:50 +00:00
Alexander Motin
23ce462092 Make lapic_ipi_vectored(APIC_IPI_DEST_SELF) NMI safe.
Sending IPI to self or all CPUs does not require write into upper part of
the ICR, prone to races.  Previously the code disabled interrupts, but it
was not enough for NMIs.  Instead of that when possible write only lower
part of the register, or use special SELF IPI register in x2APIC mode.

This also removes ICR reads used to preserve reserved bits on write.
It was there from the beginning, but I failed to find an explanation why,
nor do I see Linux doing it.  The specification even says that ICR content
may be lost in deep C-states, so if hardware does not bother to preserve
it, why should we?

MFC after:	2 weeks
Sponsored by:	iXsystems, Inc.
2020-07-24 19:54:15 +00:00
Emmanuel Vadot
41c653be98 dwmmc: Add MMCCAM part
Add support for MMCCAM for dwmmc

Submitted by:	kibab
Tested On:	Rock64, RockPro64
2020-07-24 19:52:52 +00:00
Emmanuel Vadot
a6d9c9257c mmccam: aw_mmc: Only print the new ios value under bootverbose
2020-07-24 18:44:50 +00:00
Emmanuel Vadot
f1ed7b6563 mmccam: Make non bootverbose more readable
Remove some debug printfs.
Convert some to CAM_DEBUG
Only print some when bootverbose is set.
2020-07-24 18:43:46 +00:00
Conrad Meyer
81dc6c2c61 Use gbincore_unlocked for unprotected incore()
Reviewed by:	markj
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25790
2020-07-24 17:34:44 +00:00
Conrad Meyer
68ee1dda06 Add unlocked/SMR fast path to getblk()
Convert the bufobj tries to an SMR zone/PCTRIE and add a gbincore_unlocked()
API wrapping this functionality.  Use it for a fast path in getblkx(),
falling back to locked lookup if we raced a thread changing the buf's
identity.

Reported by:	Attilio
Reviewed by:	kib, markj
Testing:	pho (in progress)
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25782
2020-07-24 17:34:04 +00:00
Conrad Meyer
3c30b23519 Use SMR to provide safe unlocked lookup for pctries from SMR zones
Adapt r358130, for the almost identical vm_radix, to the pctrie subsystem.
Like that change, the tree is kept correct for readers with store barriers
and careful ordering.  Existing locks serialize writers.

Add a PCTRIE_DEFINE_SMR() wrapper that takes an additional smr_t parameter
and instantiates a FOO_PCTRIE_LOOKUP_UNLOCKED() function, in addition to the
usual definitions created by PCTRIE_DEFINE().

Interface consumers will be introduced in later commits.

As future work, it might be nice to add vm_radix algorithms missing from
generic pctrie to the pctrie interface, and then adapt vm_radix to use
pctrie.

Reported by:	Attilio
Reviewed by:	markj
Sponsored by:	Isilon
Differential Revision:	https://reviews.freebsd.org/D25781
2020-07-24 17:32:10 +00:00
Mateusz Guzik
138698898f lockmgr: add missing 'continue' to account for spuriously failed fcmpset
PR:		248245
Reported by:	gbe
Noted by:	markj
Fixes by:	r363415 ("lockmgr: add adaptive spinning")
2020-07-24 17:28:24 +00:00
Emmanuel Vadot
bf2868538e mmccam: Add some aliases for non-mmccam to mmccam transition
A new tunable is present, kern.cam.sdda.mmcsd_compat to enable
this feature or not (default is enabled)
2020-07-24 17:11:14 +00:00
Juli Mallett
ce219ecd93 Remove reference to nlist(3) missed in SCCS revision 5.26 by mckusick
when converting rwhod(8) to using kern.boottime rather than extracting
the boot time from kernel memory directly.

Reviewed by:	imp
2020-07-24 16:58:13 +00:00
Mateusz Piotrowski
d6dade0002 Fix grammar issues and typos
Reported by:	ian
MFC after:	1 week
2020-07-24 15:04:34 +00:00
Mateusz Piotrowski
5ccb7079f8 Document that force_depend() supports only /etc/rc.d scripts
Currently, force_depend() from rc.subr(8) does not support depending on
scripts outside of /etc/rc.d (like /usr/local/etc/rc.d). The /etc/rc.d path
is hard-coded into force_depend().

MFC after:	1 week
2020-07-24 14:17:37 +00:00
Mateusz Guzik
ee74412269 vm: fix swap reservation leak and clean up surrounding code
The code did not subtract from the global counter if per-uid reservation
failed.

Cleanup highlights:
- load overcommit once
- move per-uid manipulation to dedicated routines
- don't fetch wire count if requested size is below the limit
- convert return type from int to bool
- ifdef the routines with _KERNEL to keep vm.h compilable by userspace

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D25787
2020-07-24 13:23:32 +00:00
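The leak fixed in ee74412269 follows a common two-step reservation shape: charge a global counter, then a per-uid counter, and undo the first charge if the second fails. The sketch below is a single-threaded simplification and not the vm code; struct reserve_ctx, its fields and reserve() are hypothetical names, and it assumes each "used" value never exceeds its limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one user: global and per-uid usage. */
struct reserve_ctx {
	size_t	global_used;
	size_t	global_limit;
	size_t	uid_used;
	size_t	uid_limit;
};

/* Reserve n units against both limits, or reserve nothing at all. */
static bool
reserve(struct reserve_ctx *c, size_t n)
{
	assert(c != NULL);
	if (n > c->global_limit - c->global_used)
		return (false);
	c->global_used += n;

	if (n > c->uid_limit - c->uid_used) {
		/* The leak: forgetting to subtract from the global counter. */
		c->global_used -= n;
		return (false);
	}
	c->uid_used += n;
	return (true);
}
```

Returning bool rather than int matches the cleanup item listed in the commit message.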
Alex Richardson
b798ef6490 Include TMPFS in all the GENERIC kernel configs
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
2020-07-24 08:40:04 +00:00
John-Mark Gurney
b6dd8b71d1 fix up docs for m_getjcl as well..
2020-07-24 00:47:14 +00:00
John-Mark Gurney
92b56ebaf7 document that m_get2 only accepts up to MJUMPAGESIZE..
2020-07-24 00:35:21 +00:00
John Baldwin
3c0e568505 Add support for KTLS RX via software decryption.
Allow TLS records to be decrypted in the kernel after being received
by a NIC.  At a high level this is somewhat similar to software KTLS
for the transmit path except in reverse.  Protocols enqueue mbufs
containing encrypted TLS records (or portions of records) into the
tail of a socket buffer and the KTLS layer decrypts those records
before returning them to userland applications.  However, there is an
important difference:

- In the transmit case, the socket buffer is always a single "record"
  holding a chain of mbufs.  Not-yet-encrypted mbufs are marked not
  ready (M_NOTREADY) and released to protocols for transmit by marking
  mbufs ready once their data is encrypted.

- In the receive case, incoming (encrypted) data appended to the
  socket buffer is still a single stream of data from the protocol,
  but decrypted TLS records are stored as separate records in the
  socket buffer and read individually via recvmsg().

Initially I tried to make this work by marking incoming mbufs as
M_NOTREADY, but there didn't seem to be a non-gross way to deal with
picking a portion of the mbuf chain and turning it into a new record
in the socket buffer after decrypting the TLS record it contained
(along with prepending a control message).  Also, such mbufs would
also need to be "pinned" in some way while they are being decrypted
such that a concurrent sbcut() wouldn't free them out from under the
thread performing decryption.

As such, I settled on the following solution:

- Socket buffers now contain an additional chain of mbufs (sb_mtls,
  sb_mtlstail, and sb_tlscc) containing encrypted mbufs appended by
  the protocol layer.  These mbufs are still marked M_NOTREADY, but
  soreceive*() generally don't know about them (except that they will
  block waiting for data to be decrypted for a blocking read).

- Each time a new mbuf is appended to this TLS mbuf chain, the socket
  buffer peeks at the TLS record header at the head of the chain to
  determine the encrypted record's length.  If enough data is queued
  for the TLS record, the socket is placed on a per-CPU TLS workqueue
  (reusing the existing KTLS workqueues and worker threads).

- The worker thread loops over the TLS mbuf chain decrypting records
  until it runs out of data.  Each record is detached from the TLS
  mbuf chain while it is being decrypted to keep the mbufs "pinned".
  However, a new sb_dtlscc field tracks the character count of the
  detached record and sbcut()/sbdrop() is updated to account for the
  detached record.  After the record is decrypted, the worker thread
  first checks to see if sbcut() dropped the record.  If so, it is
  freed (can happen when a socket is closed with pending data).
  Otherwise, the header and trailer are stripped from the original
  mbufs, a control message is created holding the decrypted TLS
  header, and the decrypted TLS record is appended to the "normal"
  socket buffer chain.

(Side note: the SBCHECK() infrastructure was very useful as I was
 able to add assertions there about the TLS chain that caught several
 bugs during development.)

Tested by:	rmacklem (various versions)
Relnotes:	yes
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D24628
2020-07-23 23:48:18 +00:00
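The step in 3c0e568505 where the socket buffer peeks at the TLS record header to learn the encrypted record's length can be sketched as follows. This is not the KTLS code: it assumes the queued bytes sit in one contiguous buffer (the kernel walks an mbuf chain instead), and the function names are hypothetical. The header layout itself (1 byte type, 2 bytes version, 2 bytes big-endian length) comes from the TLS record protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	TLS_HDR_LEN	5	/* type (1) + version (2) + length (2) */

/*
 * Return the full size of the record at the head of buf (header plus
 * payload), or 0 if the header itself is not complete yet.
 */
static size_t
tls_record_size(const uint8_t *buf, size_t avail)
{
	assert(buf != NULL);
	if (avail < TLS_HDR_LEN)
		return (0);
	/* The payload length is stored big-endian in bytes 3 and 4. */
	return (TLS_HDR_LEN + (((size_t)buf[3] << 8) | buf[4]));
}

/* Nonzero once enough data is queued to decrypt one whole record. */
static int
tls_record_complete(const uint8_t *buf, size_t avail)
{
	size_t need;

	need = tls_record_size(buf, avail);
	return (need != 0 && avail >= need);
}
```

In the design described above, a check like tls_record_complete() is the point at which the socket would be handed to the worker thread.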
Bryan Drewery
3cee7cb269 Limit gmirror failpoint tests to the test worker
This avoids injecting errors into the test system's mirrors.

gnop seems like a good solution here but it injects errors at the wrong
place vs where these tests expect and does not support a 'max global count'
like the failpoints do with 'n*' syntax.

Reviewed by:	cem, vangyzen
Sponsored by:	Dell EMC Isilon
2020-07-23 23:29:50 +00:00
John-Mark Gurney
98b765e5c2 update example to make it active when creating a new boot method...
Clean up some of the sentences and grammar...

make igor happy..
2020-07-23 22:28:35 +00:00
John Baldwin
70d1a4351a Consolidate duplicated code into a ktls_ocf_dispatch function.
This function manages the loop around crypto_dispatch and coordination
with ktls_ocf_callback.

Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D25757
2020-07-23 21:43:06 +00:00
John Baldwin
d7d14db9c5 Set si_trapno to the exception code from esr.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25771
2020-07-23 21:40:03 +00:00
John Baldwin
e7aaabe15e Pass the right size to memcpy() when copying the array of FP registers.
The size of the containing structure was passed instead of the size of
the array.  This happened to be harmless as the extra word copied is
one we copy in the next line anyway.

Reported by:	CHERI (bounds check violation)
Reviewed by:	brooks, imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25791
2020-07-23 21:33:10 +00:00
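The size mix-up fixed in e7aaabe15e is easy to reproduce in a few lines. The sketch below uses a hypothetical struct fpstate (a register array followed by one extra word) and is not the actual kernel structure; it only shows why sizeof the containing structure copies one word too many while sizeof the array is correct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	NFPREGS	4	/* hypothetical register count */

/* Hypothetical layout: FP register array followed by a status word. */
struct fpstate {
	uint64_t	regs[NFPREGS];
	uint64_t	status;
};

/* Copy the register array, then the status word, as two separate steps. */
static void
copy_fpstate(struct fpstate *dst, const struct fpstate *src)
{
	assert(dst != NULL && src != NULL);
	/*
	 * Buggy form: memcpy(dst->regs, src->regs, sizeof(*src));
	 * That also copies 'status', which is harmless only because the
	 * next line copies it again, but it overruns the array bounds.
	 */
	memcpy(dst->regs, src->regs, sizeof(src->regs));
	dst->status = src->status;
}
```

A bounds-checking architecture such as CHERI traps on the buggy form because the copy leaves the bounds of the regs array, which is how the issue was reported.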
John Baldwin
6273c7420d Set si_addr to badvaddr for TLB faults.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25775
2020-07-23 20:08:42 +00:00
Ed Maste
af9de844c4 md5: return non-zero if built-in tests (-x) fail
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2020-07-23 20:06:24 +00:00
Michael Tuexen
205f3e1597 Clear the pointer to the socket when closing it also in case of
an ungraceful operation.
This fixes a use-after-free bug found and reported by Taylor
Brandstetter of Google by testing the userland stack.

MFC after:		1 week
2020-07-23 19:43:49 +00:00
Ed Maste
e32e868528 modules/crypto: disable optimized assembly skein1024 implementation
It is presumably broken in the same way as userland skein1024 (see r363454)

PR:		248221
2020-07-23 19:19:33 +00:00
Ed Maste
0d2c19d05b libmd: temporarily disable optimized assembly skein1024 implementation
It is apparently broken when assembled by contemporary GNU as as well as
Clang IAS (which is used in the default configuration).

PR:		248221
Reported by:	pizzamig
Sponsored by:	The FreeBSD Foundation
2020-07-23 18:55:47 +00:00
Cy Schubert
f0276e8c38 Document the IPFILTER_PREDEFINED environment variable.
PR:		248088
Reported by:	joeb1@a1poweruser.com
MFC after:	1 week
2020-07-23 17:39:49 +00:00
Cy Schubert
795be686d8 Load ipfilter, ipnat, and ippool rules, and start ipmon in a vnet jail.
PR:		248109
Reported by:	joeb1@a1poweruser.com
MFC after:	2 weeks
2020-07-23 17:39:45 +00:00
Mateusz Guzik
c795344ff7 locks: fix a long standing bug for primitives with kdtrace but without spinning
In such a case the second argument to lock_delay_arg_init was NULL which was
immediately causing a null pointer deref.

Since the structure is only used for spin count, provide a dedicated routine
initializing it.

Reported by:	andrew
2020-07-23 17:26:53 +00:00
Doug Moore
e605dcc939 Rank balanced (RB) trees are a class of balanced trees that includes
AVL trees, red-black trees, and others. Weak AVL (wavl) trees are a
recently discovered member of that class. This change replaces
red-black rebalancing with weak AVL rebalancing in the RB tree macros.

Wavl trees sit between AVL and red-black trees in terms of how
strictly balance is enforced. They have the stricter balance of AVL
trees as the tree is built - a wavl tree is an AVL tree until the
first deletion. Once removals start, wavl trees are lazier about
rebalancing than AVL trees, so that removals can be fast, but the
balance of the tree can decay to that of a red-black tree. Subsequent
insertions can push balance back toward the stricter AVL conditions.

Removing a node from a wavl tree never requires more than two
rotations, which is better than either red-black or AVL
trees. Inserting a node into a wavl tree never requires more than two
rotations, which matches red-black and AVL trees. The only
disadvantage of wavl trees to red-black trees is that more insertions
are likely to adjust the tree a bit. That's the cost of keeping the
tree more balanced.

Testing has shown that for the cases where red-black trees do worst,
wavl trees' better balance leads to faster lookups, so that if lookups
outnumber insertions by a nontrivial amount, lookup time saved exceeds
the extra cost of balancing.

Reviewed by:	alc, gbe, markj
Tested by:	pho
Discussed with:	emaste
Differential Revision:	https://reviews.freebsd.org/D25480
2020-07-23 17:16:20 +00:00
Mark Johnston
7df88b9ddd rc.firewall: Merge two identical conditions into one.
No functional change intended.

PR:		247949
Submitted by:	Jose Luis Duran <jlduran@gmail.com>
MFC after:	1 week
2020-07-23 15:03:28 +00:00
Alexander Motin
81614d236f Add missing newlines.
MFC after:	3 days
2020-07-23 14:33:25 +00:00
Mark Johnston
4cbba6ae24 MFOpenZFS: Fix zpool history unbounded memory usage
In original implementation, zpool history will read the whole history
before printing anything, causing memory usage to grow unbounded. We fix
this by breaking it into read-print iterations.

Reviewed-by: Tom Caputi <tcaputi@datto.com>
Reviewed-by: Matt Ahrens <matt@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Chunwei Chen <david.chen@nutanix.com>
Closes #9516

Note, this change changes the libzfs.so ABI by modifying the prototype
of zpool_get_history().  Since libzfs is effectively private to the base
system it is anticipated that this will not be a problem.

PR:		247557
Obtained from:	OpenZFS
Reported and tested by:	Sam Vaughan <samjvaughan@gmail.com>
Discussed with:	freqlabs
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D25745
openzfs/zfs@7125a109dc
2020-07-23 14:21:45 +00:00
Mark Johnston
cbef26ed16 cuse: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	hselasky
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25765
2020-07-23 14:03:37 +00:00
Mark Johnston
dace381270 ntb: Stop checking for failures from malloc(M_WAITOK).
PR:		240545
Submitted by:	Andrew Reiter <arr@watson.org>
Reviewed by:	cem, mav
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25768
2020-07-23 14:03:24 +00:00
Mateusz Guzik
126a2470b9 vm: annotate swap_reserved with __exclusive_cache_line
The counter keeps being updated all the time and variables read afterwards
share the cacheline. Note this still fundamentally does not scale and needs
to be replaced; in the meantime it gets a bandaid.

brk1_processes -t 52 ops/s:
before: 8598298
after:  9098080
2020-07-23 08:42:16 +00:00
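The effect of the annotation in 126a2470b9 can be approximated in portable C11 by giving a hot counter a cache line of its own. This is a sketch only: the kernel macro is not reproduced here, the 64-byte line size is an assumption that varies by CPU, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

#define	CACHE_LINE	64	/* assumed line size */

/* A counter aligned and padded so nothing else shares its cache line. */
struct padded_counter {
	alignas(CACHE_LINE) uint64_t	value;
	char				pad[CACHE_LINE - sizeof(uint64_t)];
};

static_assert(sizeof(struct padded_counter) == CACHE_LINE,
    "counter must fill exactly one line");
static_assert(alignof(struct padded_counter) == CACHE_LINE,
    "counter must start on a line boundary");

static struct padded_counter hot_counter;	/* written constantly */
static uint64_t cold_limit = 100;		/* read-mostly neighbour */

static void
counter_add(uint64_t n)
{
	hot_counter.value += n;
}

static bool
counter_over_limit(void)
{
	return (hot_counter.value > cold_limit);
}
```

With this layout, writes to hot_counter do not invalidate the line holding cold_limit, which is the false sharing the commit message describes.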
Michael Tuexen
91e04f9e7a Detect and handle an invalid reassembly constellation, which results in
a memory leak.

Thanks to Felix Weinrank for finding this issue by fuzz testing the
userland stack.

MFC after:		1 week
2020-07-23 01:35:24 +00:00
Brooks Davis
e9b21d432b Correct a type-mismatch between xdr_long and the variable "bad".
Way back in r28911 (August 1997, CVS rev 1.22) we imported a NetBSD
information leak fix via OpenBSD.  Unfortunately we failed to track the
followup commit that fixed the type of the error code.  Apply the change
from int to long now.

Reviewed by:	emaste
Found by:	CHERI
Obtained from:	CheriBSD
MFC after:	3 days
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25779
2020-07-22 23:39:58 +00:00
Brooks Davis
5a01eca698 Use SI_ORDER_(FOURTH|FIFTH) rather than bespoke versions.
No functional change.

When these SYSINITs were added these macros didn't exist.

Reviewed by:	imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25758
2020-07-22 23:35:41 +00:00
Rick Macklem
9516bcdfb4 Modify writing to mirrored pNFS DSs to prepare for use of ext_pgs mbufs.
This patch modifies writing to mirrored pNFS DSs slightly so that there is
only one m_copym() call for a mirrored pair instead of two of them.
This call replaces the custom nfsm_copym() call, which is no longer needed
and deleted by this patch. The patch does introduce a new nfsm_split()
function that only calls m_split() for the non-ext_pgs case.
The semantics of nfsm_uiombuflist() is changed to include code that nul
pads the generated mbuf list. This was done by nfsm_copym() prior to this patch.

The main reason for this change is that it allows the data to be a list
of ext_pgs mbufs, since the m_copym() is for the entire mbuf list.
This support will be added in a future commit.

This patch only affects writing to mirrored flexible file layout pNFS servers.
2020-07-22 23:33:37 +00:00
John Baldwin
a1119d08b9 Add missing space after switch.
Reviewed by:	br, emaste
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25778
2020-07-22 22:51:14 +00:00
Brooks Davis
d90b364147 Avoid reading one byte before the path buffer.
This happens when there's only one component (e.g. "/foo"). This
(mostly-harmless) bug has been present since June 1990 when it was
committed to mountd.c SCCS version 5.9.

Note: the bug is on the second changed line, the first line is changed
for visual consistency.

Reviewed by:	cem, emaste, mckusick, rmacklem
Found with:	CHERI
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25759
2020-07-22 21:44:51 +00:00
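The class of bug described in d90b364147 usually comes from dereferencing a pointer before checking it against the start of the buffer. The sketch below shows the general pattern with the bounds check first; it is not the mountd code, and last_component() is a hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/*
 * Return a pointer to the start of the last component of path, ignoring
 * trailing slashes when searching.  Every backwards read is guarded by
 * 'cp > path', so path[-1] is never touched.
 */
static const char *
last_component(const char *path)
{
	const char *cp;

	assert(path != NULL);
	cp = path + strlen(path);
	/* Step back over any trailing slashes. */
	while (cp > path && cp[-1] == '/')
		cp--;
	/* Step back to just after the previous slash, if there is one. */
	while (cp > path && cp[-1] != '/')
		cp--;
	return (cp);
}
```

Writing the condition the other way round (cp[-1] != '/' && cp > path) performs the read before the check, which is the kind of ordering that produces a one-byte read before the buffer.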
Alexander Motin
ce53f590ca Untie nmi_handle_intr() from DEV_ISA.
The only part of nmi_handle_intr() depending on ISA is isa_nmi(), which is
already wrapped.  Entering debugger on NMI does not really depend on ISA.

MFC after:	2 weeks
2020-07-22 20:15:21 +00:00
Emmanuel Vadot
fd7371f7e2 mmccam: Add support for 1.2V and 1.8V eMMC
If the card reports that it supports 1.2V or 1.8V signaling, switch to this voltage.

Submitted by:	kibab
2020-07-22 19:08:05 +00:00
Emmanuel Vadot
2657d8e33e mmccam: Add support for 1.8V sdcard
If the card reports that it supports 1.8V signaling, switch to this voltage.
While here, update the list of modes for mmccam.

Submitted by:	kibab
2020-07-22 19:04:45 +00:00
Emmanuel Vadot
9bca466745 aw_mmc: Start a mmccam discovery when the CD handler is called.
Submitted by:	kibab
2020-07-22 18:33:36 +00:00