272f833cde
illumos/illumos-gate@94ddd0900a https://www.illumos.org/issues/8909: There's a race condition that exists if `zil_free_lwb` races with either `zil_commit_waiter_timeout` and/or `zil_lwb_flush_vdevs_done`. Here's an example panic due to this bug: > ::status debugging crash dump vmcore.0 (64-bit) from ip-10-110-205-40 operating system: 5.11 dlpx-5.2.2.0_2017-12-04-17-28-32b6ba51fb (i86pc) image uuid: 4af0edfb-e58e-6ed8-cafc-d3e9167c7513 panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0010555970 addr=60 occurred in mo dule "zfs" due to a NULL pointer dereference dump content: kernel pages only > $c zio_shrink+0x12() zil_lwb_write_issue+0x30d(ffffff03dcd15cc0, ffffff03e0730e20) zil_commit_waiter_timeout+0xa2(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit_waiter+0xf3(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit+0x80(ffffff03dcd15cc0, 9a9) zfs_write+0xc34(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) fop_write+0x5b(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) write+0x250(42, fffffd7ff4832000, 2000) sys_syscall+0x177() If there's an outstanding lwb that's in `zil_commit_waiter_timeout` waiting to timeout, waiting on it's waiter's CV, we must be sure not to call `zil_free_lwb`. If we end up calling `zil_free_lwb`, then that LWB may be freed and can result in a use-after-free situation where the stale lwb pointer stored in the `zil_commit_waiter_t` structure of the thread waiting on the waiter's CV is used. A similar situation can occur if an lwb is issued to disk, and thus in the `LWB_STATE_ISSUED` state, and `zil_free_lwb` is called while the disk is servicing that lwb. In this situation, the lwb will be freed by `zil_free_lwb`, which will result in a use-after-free situation when the lwb's zio completes, and `zil_lwb_flush_vdevs_done` is called. This race condition is prevented in `zil_close` by calling `zil_commit` before `zil_free_lwb` is called, which will ensure all outstanding (i.e. all lwb's in the `LWB_STATE_OPEN` and/or `LWB_STATE_ISSUED` states) reach the `LWB_STATE_DONE` state before the lwb's are freed (`zil_commit` will not return untill all the lwb's are `LWB_STATE_DONE`). Further, this race condition is prevented in `zil_sync` by only calling `zil_free_lwb` for lwb's that do not have their `lwb_buf` pointer set. All lwb's not in the `LWB_STATE_DONE` state will have a non-null value for this pointer; the pointer is only cleared in `zil_lwb_flush_vdevs_done`, at which point the lwb's state will be changed to `LWB_STATE_DONE`. This race is present in `zil_suspend`, leading to this bug. At first glance, it would appear as though this would not be true because `zil_suspend` will call `zil_commit`, just like `zil_close`, but the problem is that `zil_suspend` will set the zilog's `zl_suspend` field prior to calling `zil_commit`. Further, in `zil_commit`, if `zl_suspend` is set, `zil_commit` will take a special branch of logic and use `txg_wait_synced` instead of performing the normal `zil_commit` logic. This call to `txg_wait_synced` might be good enough for the data to reach disk safely before it returns, but it does not ensure that all outstanding lwb's reach the `LWB_STATE_DONE` state before it returns. This is because, if there's an lwb "stuck" in `zil_commit_waiter_timeout`, waiting for it's lwb to timeout, it will maintain a non-null value for it's `lwb_buf` field and thus `zil_sync` will not free that lwb. Thus, even though the lwb's data is already on disk, the lwb will be left lingering, waiting on the CV, and will eventually timeout and be issued to disk even though the write is unnesseary. So, after `zil_commit` is called from `zil_suspend`, we incorrectly assume that there are not outstanding lwb's, and proceed to free all lwb's found on the zilog's lwb list. As a result, we free the lwb that will later be used `zil_commit_waiter_timeout`. Reviewed by: John Kennedy <jwk404@gmail.com> Reviewed by: Matthew Ahrens <mahrens@delphix.com> Reviewed by: George Wilson <george.wilson@delphix.com> Reviewed by: Brad Lewis <brad.lewis@delphix.com> Reviewed by: Igor Kozhukhov <igor@dilos.org> Approved by: Robert Mustacchi <rm@joyent.com> Author: Prakash Surya <prakash.surya@delphix.com> |
||
---|---|---|
bin | ||
cddl | ||
contrib | ||
crypto | ||
etc | ||
gnu | ||
include | ||
kerberos5 | ||
lib | ||
libexec | ||
release | ||
rescue | ||
sbin | ||
secure | ||
share | ||
stand | ||
sys | ||
targets | ||
tests | ||
tools | ||
usr.bin | ||
usr.sbin | ||
.arcconfig | ||
.arclint | ||
.gitattributes | ||
.gitignore | ||
COPYRIGHT | ||
LOCKS | ||
MAINTAINERS | ||
Makefile | ||
Makefile.inc1 | ||
Makefile.libcompat | ||
Makefile.sys.inc | ||
ObsoleteFiles.inc | ||
README | ||
README.md | ||
UPDATING |
FreeBSD Source:
This is the top level of the FreeBSD source directory. This file
was last revised on:
FreeBSD
For copyright information, please see the file COPYRIGHT in this directory (additional copyright information also exists for some sources in this tree - please see the specific source directories for more information).
The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree. See build(7) and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html for more information, including setting make(1) variables.
The buildkernel
and installkernel
targets build and install
the kernel and the modules (see below). Please see the top of
the Makefile in this directory for more information on the
standard build targets and compile-time flags.
Building a kernel is a somewhat more involved process. See build(7), config(8), and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html for more information.
Note: If you want to build and install the kernel with the
buildkernel
and installkernel
targets, you might need to build
world before. More information is available in the handbook.
The kernel configuration files reside in the sys/<arch>/conf
sub-directory. GENERIC is the default configuration used in release builds.
NOTES contains entries and documentation for all possible
devices, not just those commonly used.
Source Roadmap:
bin System/user commands.
cddl Various commands and libraries under the Common Development
and Distribution License.
contrib Packages contributed by 3rd parties.
crypto Cryptography stuff (see crypto/README).
etc Template files for /etc.
gnu Various commands and libraries under the GNU Public License.
Please see gnu/COPYING* for more information.
include System include files.
kerberos5 Kerberos5 (Heimdal) package.
lib System libraries.
libexec System daemons.
release Release building Makefile & associated tools.
rescue Build system for statically linked /rescue utilities.
sbin System commands.
secure Cryptographic libraries and commands.
share Shared resources.
stand Boot loader sources.
sys Kernel sources.
tests Regression tests which can be run by Kyua. See tests/README
for additional information.
tools Utilities for regression testing and miscellaneous tasks.
usr.bin User commands.
usr.sbin System administration commands.
For information on synchronizing your source tree with one or more of the FreeBSD Project's development branches, please see:
https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/current-stable.html