Commit Graph

304 Commits

Author SHA1 Message Date
Alexander Motin
67f508db84 Mark some sysctls as CTLFLAG_MPSAFE.
MFC after:	2 weeks
2021-08-10 22:18:26 -04:00
Mitchell Horne
c8a96cdcd9 Add an option for entering KDB on recursive panics
There are many cases where one would choose avoid entering the debugger
on a normal panic, opting instead to reboot and possibly save a kernel
dump. However, recursive kernel panics are an unusual case that might
warrant attention from a human, so provide a secondary tunable,
debug.debugger_on_recursive_panic, to allow entering the debugger only
when this occurs.

For for simplicity in maintaining existing behaviour, the tunable
defaults to zero.

Reviewed by:	cem, markj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D27271
2020-11-19 18:03:40 +00:00
Mitchell Horne
6debfd4b13 Remove unused function cpu_boot()
The prototype was added with the creation of kern_shutdown.c in r17658,
but it appears to have never been implemented. Remove it now.

Reviewed by:	cem, kib
Differential Revision:	https://reviews.freebsd.org/D26702
2020-10-06 23:16:56 +00:00
Mark Johnston
6255e8c8e2 Fix writing of the final block of encrypted, compressed kernel dumps.
Previously any residual data in the final block of a compressed kernel
dump would be written unencrypted.  Note, such a configuration already
does not work properly when using AES-CBC since the compressed data is
typically not a multiple of the AES block length in size and EKCD does
not implement any padding scheme.  However, EKCD more recently gained
support for using the ChaCha20 cipher, which being a stream cipher does
not have this problem.

Submitted by:	sigsys@gmail.com
Reviewed by:	cem
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D26188
2020-08-27 17:36:06 +00:00
John Baldwin
4a711b8d04 Use zfree() instead of explicit_bzero() and free().
In addition to reducing lines of code, this also ensures that the full
allocation is always zeroed avoiding possible bugs with incorrect
lengths passed to explicit_bzero().

Suggested by:	cem
Reviewed by:	cem, delphij
Approved by:	csprng (cem)
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D25435
2020-06-25 20:17:34 +00:00
Eric van Gyzen
ba0ced82ea Fix handling of NMIs from unknown sources (BMC, hypervisor)
Release kernels have no KDB backends enabled, so they discard an NMI
if it is not due to a hardware failure.  This includes NMIs from
IPMI BMCs and hypervisors.

Furthermore, the interaction of panic_on_nmi, kdb_on_nmi, and
debugger_on_panic is confusing.

Respond to all NMIs according to panic_on_nmi and debugger_on_panic.
Remove kdb_on_nmi.  Expand the meaning of panic_on_nmi by making
it a bitfield.  There are currently two bits: one for NMIs due to
hardware failure, and one for all others.  Leave room for more.

If panic_on_nmi and debugger_on_panic are both true, don't actually panic,
but directly enter the debugger, to allow someone to leave the debugger
and [hopefully] resume normal execution.

Reviewed by:	kib
MFC after:	2 weeks
Relnotes:	yes: machdep.kdb_on_nmi is gone; machdep.panic_on_nmi changed
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D24558
2020-04-26 00:41:29 +00:00
Conrad Meyer
7a119578a4 kern_shutdown: Add missing EKCD ifdef
Submitted by:	Puneeth Jothaiah <puneethkumar.jothaia AT dell.com>
Reviewed by:	bdrewery
Sponsored by:	Dell EMC Isilon
2020-03-12 21:26:36 +00:00
Pawel Biernacki
b05ca4290c sys/: Document few more sysctls.
Submitted by:	Antranig Vartanian <antranigv@freebsd.am>
Reviewed by:	kaktus
Commented by:	jhb
Approved by:	kib (mentor)
Sponsored by:	illuria security
Differential Revision:	https://reviews.freebsd.org/D23759
2020-03-02 15:30:52 +00:00
Pawel Biernacki
7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Ryan Libby
fe20aaec0a sys/kern: quiet -Wwrite-strings
Quiet a variety of Wwrite-strings warnings in sys/kern at low-impact
sites.  This patch avoids addressing certain others which would need to
plumb const through structure definitions.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D23798
2020-02-23 03:32:16 +00:00
Mateusz Guzik
d199ad3b44 Add "panicked" boolean which can be tested instead of panicstr
The test is performed all the time and reading entire panicstr to do it
wastes space.
2020-01-12 06:09:10 +00:00
Mateusz Guzik
b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Mateusz Guzik
abd80ddb94 vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.

v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.

Reviewed by:	kib, jeff
Differential Revision:	https://reviews.freebsd.org/D22715
2019-12-08 21:30:04 +00:00
Alexander Motin
61322a0a8a Mark some more hot global variables with __read_mostly.
MFC after:	1 week
2019-12-04 21:26:03 +00:00
Andriy Gapon
3ad1ce46d3 debug,kassert.warnings is a statistic, not a tunable
MFC after:	1 week
2019-10-21 12:21:56 +00:00
Conrad Meyer
addccb8c51 Add a very limited DDB dumpon(8)-alike to MI dumper code
This allows ddb(4) commands to construct a static dumperinfo during
panic/debug and invoke doadump(false) using the provided dumper
configuration (always inserted first in the list).

The intended usecase is a ddb(4)-time netdump(4) command.

Reviewed by:	markj (earlier version)
Differential Revision:	https://reviews.freebsd.org/D21448
2019-10-17 18:29:44 +00:00
Andriy Gapon
387df3b805 shutdown_halt: make sure that watchdog timer is stopped
The point of halt is to keep the machine in limbo.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D21222
2019-09-04 13:26:59 +00:00
Conrad Meyer
8298529226 EKCD: Add Chacha20 encryption mode
Add Chacha20 mode to Encrypted Kernel Crash Dumps.

Chacha20 does not require messages to be multiples of block size, so it is
valid to use the cipher on non-block-sized messages without the explicit
padding AES-CBC would require.  Therefore, allow use with simultaneous dump
compression.  (Continue to disallow use of AES-CBC EKCD with compression.)

dumpon(8) gains a -C cipher flag to select between chacha and aes-cbc.
It defaults to chacha if no -C option is provided.  The man page documents this
behavior.

Relnotes:	sure
Sponsored by:	Dell EMC Isilon
2019-05-23 20:12:24 +00:00
Conrad Meyer
6b6e2954dd List-ify kernel dump device configuration
Allow users to specify multiple dump configurations in a prioritized list.
This enables fallback to secondary device(s) if primary dump fails.  E.g.,
one might configure a preference for netdump, but fallback to disk dump as a
second choice if netdump is unavailable.

This change does not list-ify netdump configuration, which is tracked
separately from ordinary disk dumps internally; only one netdump
configuration can be made at a time, for now.  It also does not implement
IPv6 netdump.

savecore(8) is already capable of scanning and iterating multiple devices
from /etc/fstab or passed on the command line.

This change doesn't update the rc or loader variables 'dumpdev' in any way;
it can still be set to configure a single dump device, and rc.d/savecore
still uses it as a single device.  Only dumpon(8) is updated to be able to
configure the more complicated configurations for now.

As part of revving the ABI, unify netdump and disk dump configuration ioctl
/ structure, and leave room for ipv6 netdump as a future possibility.
Backwards-compatibility ioctls are added to smooth ABI transition,
especially for developers who may not keep kernel and userspace perfectly
synced.

Reviewed by:	markj, scottl (earlier version)
Relnotes:	maybe
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D19996
2019-05-06 18:24:07 +00:00
John Baldwin
b317cfd4c0 Don't enter DDB for fatal traps before panic by default.
Add a new 'debugger_on_trap' knob separate from 'debugger_on_panic'
and make the calls to kdb_trap() in MD fatal trap handlers prior to
calling panic() conditional on this new knob instead of
'debugger_on_panic'.  Disable the new knob by default.  Developers who
wish to recover from a fatal fault by adjusting saved register state
and retrying the faulting instruction can still do so by enabling the
new knob.  However, for the more common case this makes the user
experience for panics due to a fatal fault match the user experience
for other panics, e.g. 'c' in DDB will generate a crash dump and
reboot the system rather than being stuck in an infinite loop of fatal
fault messages and DDB prompts.

Reviewed by:	kib, avg
MFC after:	2 months
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D17768
2018-11-01 21:34:17 +00:00
Conrad Meyer
4ca8c1efe4 KASSERT: Make runtime optionality optional
Add an option, KASSERT_PANIC_OPTIONAL, that allows runtime KASSERT()
behavior changes.  When this option is not enabled, code that allows
KASSERTs to become optional is not enabled, and all violated assertions
cause termination.

The runtime KASSERT behavior was added in r243980.

One important distinction here is that panic has __dead2
("attribute((noreturn))"), while kassert_panic does not.  Static analyzers
like Coverity understand __dead2.  Without it, KASSERTs go misunderstood,
resulting in many false positives that result from violation of program
invariants.

Reviewed by:	jhb, jtl, np, vangyzen
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D16835
2018-08-22 22:19:42 +00:00
Matt Macy
11d4f748d7 remove unused variable 2018-05-19 03:55:42 +00:00
Mark Johnston
bd92e6b6f5 Refactor some of the MI kernel dump code in preparation for netdump.
- Add clear_dumper() to complement set_dumper().
- Drain netdump's preallocated mbuf pool when clearing the dumper.
- Don't do bounds checking for dumpers with mediasize 0.
- Add dumper callbacks for initialization for writing out headers.

Reviewed by:	sbruno
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D15252
2018-05-06 00:22:38 +00:00
Conrad Meyer
65df124845 Do not totally silence suppressed secondary kasserts unless debug.kassert.do_log is disabled
To totally silence and ignore secondary kassert violations after a primary
panic, set debug.kassert.do_log=0 and debug.kassert.suppress_in_panic=1.

Additional assertion warnings shouldn't block core dump and may alert the
developer to another erroneous condition.  Secondary stack traces may be
printed, identically to the unsuppressed case where panic() is reentered --
controlled via debug.trace_all_panics.

Sponsored by:	Dell EMC Isilon
2018-04-24 19:10:51 +00:00
Conrad Meyer
07aa6ea677 Fix debug.kassert.do_log description text
This has been an (incorrect) copy-paste duplicate of debug.kassert.warn_only
since it was originally committed in r243980.

Sponsored by:	Dell EMC Isilon
2018-04-24 18:59:40 +00:00
Conrad Meyer
ad1fc31570 panic: Optionally, trace secondary panics
To diagnose and fix secondary panics, it is useful to have a stack trace.
When panic tracing is enabled, optionally trace secondary panics as well.

The option is configured with the tunable/sysctl debug.trace_all_panics.

(The original concern that inspired only tracing the primary panic was
likely that the secondary trace may scroll the original panic message or trace
off the screen.  This is less of a concern for serial consoles with logging.
Not everything has a serial console, though, so the behavior is optional.)

Discussed with:	jhb
Sponsored by:	Dell EMC Isilon
2018-04-24 18:54:20 +00:00
Jonathan T. Looney
18959b695d Update r332860 by changing the default from suppressing post-panic
assertions to not suppressing post-panic assertions.

There are some post-panic assertions that are valuable and we shouldn't
default to disabling them.  However, when a user trips over them, the
user can still adjust the tunable/sysctl to suppress them temporarily to
get conduct troubleshooting (e.g. get a core dump).

Reported by:	cem, markj
2018-04-24 18:47:35 +00:00
Jonathan T. Looney
44b71282b5 When running with INVARIANTS, the kernel contains extra checks. However,
these assumptions may not hold true once we've panic'd. Therefore, the
checks hold less value after a panic.  Additionally, if one of the checks
fails while we are already panic'd, this creates a double-panic which can
interfere with debugging the original panic.

Therefore, this commit allows an administrator to suppress a response to
KASSERT checks after a panic by setting a tunable/sysctl.  The
tunable/sysctl (debug.kassert.suppress_in_panic) defaults to being
enabled.

Reviewed by:	kib
Sponsored by:	Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D12920
2018-04-21 17:05:00 +00:00
Konstantin Belousov
c398200721 Do not send signals to init directly from shutdown_nice(9), do it from
the task context.

shutdown_nice() is used from the fast interrupt handlers, mostly for
console drivers, where we cannot lock blockable locks.  Schedule the
task in the fast queue to send the signal from the proper context.

Reviewed by:	imp
Discussed with:	bde
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2018-03-22 20:47:25 +00:00
Warner Losh
f0d847af61 Drop any recursed taking of Giant once and for all at the top of
kern_reboot(). The shutdown path is now safe to run without Giant.

Discussed with: kib@
Sponsored by: Netflix
2018-03-22 15:34:37 +00:00
Warner Losh
d5292812f8 Remove Giant from init creation and vfs_mountroot.
Sponsored by: Netflix
Discussed with: kib@, mckusick@
Differential Review: https://reviews.freebsd.org/D14712
2018-03-21 14:46:54 +00:00
Mark Johnston
bde3b1e1a5 Return E2BIG if we run out of space writing a compressed kernel dump.
ENOSPC causes the MD kernel dump code to retry the dump, but this is
undesirable in the case where we legitimately ran out of space.
2018-03-08 17:04:36 +00:00
Mark Johnston
6026dcd7ca Add support for zstd-compressed user and kernel core dumps.
This works similarly to the existing gzip compression support, but
zstd is typically faster and gives better compression ratios.

Support for this functionality must be configured by adding ZSTDIO to
one's kernel configuration file. dumpon(8)'s new -Z option is used to
configure zstd compression for kernel dumps. savecore(8) now recognizes
and saves zstd-compressed kernel dumps with a .zst extension.

Submitted by:	cem (original version)
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D13101,
			https://reviews.freebsd.org/D13633
2018-02-13 19:28:02 +00:00
Mark Johnston
78f57a9cde Generalize the gzio API.
We currently use a set of subroutines in kern_gzio.c to perform
compression of user and kernel core dumps. In the interest of adding
support for other compression algorithms (zstd) in this role without
complicating the API consumers, add a simple compressor API which can be
used to select an algorithm.

Also change the (non-default) GZIO kernel option to not enable
compressed user cores by default. It's not clear that such a default
would be desirable with support for multiple algorithms implemented,
and it's inconsistent in that it isn't applied to kernel dumps.

Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D13632
2018-01-08 21:27:41 +00:00
Alexander Kabaev
4daa09f343 Remove dead store to local variable. 2017-12-23 16:49:57 +00:00
Nathan Whitehorn
efe67753cc Remove some, but not all, assumptions that the BSP is CPU 0 and that CPUs
are numbered densely from there to n_cpus.

MFC after:	1 month
2017-11-25 23:41:05 +00:00
Pedro F. Giffuni
51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Warner Losh
48f1a4921e Add two new tunables / sysctls to controll reboot after panic:
kern.poweroff_on_panic which, when enabled, instructs a system to
power off on a panic instead of a reboot.

kern.powercyle_on_panic which, when enabled, instructs a system to
power cycle, if possible, on a panic instead of a reboot.

Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D13042
2017-11-14 00:29:14 +00:00
Warner Losh
7d41b6f078 Handle RB_POWERCYCLE in the MI part of the kernel
Signal init with SIGWINCH in shutdown_nice for RB_POWERCYCLE.

Sponsored by: Netflix
2017-10-25 15:30:44 +00:00
Mark Johnston
64a16434d8 Add support for compressed kernel dumps.
When using a kernel built with the GZIO config option, dumpon -z can be
used to configure gzip compression using the in-kernel copy of zlib.
This is useful on systems with large amounts of RAM, which require a
correspondingly large dump device. Recovery of compressed dumps is also
faster since fewer bytes need to be copied from the dump device.

Because we have no way of knowing the final size of a compressed dump
until it is written, the kernel will always attempt to dump when
compression is configured, regardless of the dump device size. If the
dump is aborted because we run out of space, an error is reported on
the console.

savecore(8) is modified to handle compressed dumps and save them to
vmcore.<index>.gz, as it does when given the -z option.

A new rc.conf variable, dumpon_flags, is added. Its value is added to
the boot-time dumpon(8) invocation that occurs when a dump device is
configured in rc.conf.

Reviewed by:	cem (earlier version)
Discussed with:	def, rgrimes
Relnotes:	yes
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11723
2017-10-25 00:51:00 +00:00
Mark Johnston
46fcd1af63 Move kernel dump offset tracking into MI code.
All of the kernel dump implementations keep track of the current offset
("dumplo") within the dump device. However, except for textdumps, they
all write the dump sequentially, so we can reduce code duplication by
having the MI code keep track of the current offset. The new
dump_append() API can be used to write at the current offset.

This is needed to implement support for kernel dump compression in the
MI kernel dump code.

Also simplify dump_encrypted_write() somewhat: use dump_write() instead
of duplicating its bounds checks, and get rid of the redundant offset
tracking.

Reviewed by:	cem
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11722
2017-10-18 15:38:05 +00:00
Mark Johnston
e9666bf645 Remove some unneeded subroutines for padding writes to dump devices.
Right now we only need to pad when writing kernel dump headers, so
flatten three related subroutines into one. The encrypted kernel dump
code already writes out its key in a dumper.blocksize-sized block.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11647
2017-08-18 04:07:25 +00:00
Mark Johnston
01938d3666 Rename mkdumpheader() and group EKCD functions in kern_shutdown.c.
This helps simplify the code in kern_shutdown.c and reduces the number
of globally visible functions.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11603
2017-08-18 04:04:09 +00:00
Mark Johnston
50ef60dabe Factor out duplicated kernel dump code into dump_{start,finish}().
dump_start() and dump_finish() are responsible for writing kernel dump
headers, optionally writing the key when encryption is enabled, and
initializing the initial offset into the dump device.

Also remove the unused dump_pad(), and make some functions static now that
they're only called from kern_shutdown.c.

No functional change intended.

Reviewed by:	cem, def
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D11584
2017-08-18 03:52:35 +00:00
Mark Johnston
ab384d75db Revert r320918 and have mkdumpheader() handle version string truncation.
Reported by:	jhb
MFC after:	1 week
2017-07-15 20:53:08 +00:00
Gleb Smirnoff
6cf0c1db55 Fix compilation of r314784 on 32 bit. 2017-03-06 22:32:56 +00:00
Gleb Smirnoff
f2498877c9 In panic() print current timestamp, which matches timestamp in the dump
header.  This will help to correlate console server logs with dump files,
no matter how precise is clock on a console server appliance, and how
buggy the appliance is.
2017-03-06 19:14:08 +00:00
Baptiste Daroussin
b4b4b5304b Revert crap accidentally committed 2017-01-28 16:31:23 +00:00
Baptiste Daroussin
814aaaa7da Revert r312923 a better approach will be taken later 2017-01-28 16:30:14 +00:00
Mark Johnston
42d33c1f4d Stop the scheduler upon panic even in non-SMP kernels.
This is needed for kernel dumps to work, as the panicking thread will call
into code that makes use of kernel locks.

Reported and tested by:	Eugene Grosbein
MFC after:	1 week
2017-01-14 22:16:03 +00:00