illumos/illumos-gate@873c4903a5
873c4903a5
https://www.illumos.org/issues/7336
We can run into a problem where we call into zfs_mount, which in turn calls
is_dir_empty, which opens the directory to try and make sure it's empty. The
issue with the current approach is that it holds the directory open while it
traverses it with readdir, which, due to subtle interaction with the Java JVM,
vfork, and exec can cause a tricky race condition resulting in zfs_mount
failures.
The approach to resolving the issue in this patch is to drop the usage of
readdir altogether, and instead rely on the fact that ZFS stores the number of
entries contained in a directory using the st_size field of the stat structure.
Thus, if the directory in question is a ZFS directory, we can check to see if
it's empty by calling stat() and inspecting the st_size field of structure
returned.
===============================================================================
The root cause appears to be an interesting race between vfork, exec, and
zfs_mount's usage of O_CLOEXEC when calling openat. Here's what is going on:
1. We call zfs_mount, and this in turn calls openat to check if the directory
is empty, which results in opening the directory we're trying to mount onto,
and increment v_count.
2. As we're in the middle of reading the directory, vfork is called by the JVM
and proceeds to exec the jspawnhelper utility. As a result of the vfork, we
take an additional hold on the directory, which increments v_count a second
time. The semantics of vfork mean the parent process will wait for the child
process to exit or exec before the parent can continue; at this point the
parent is in the middle of zfs_mount, reading the directory to determine if
it's empty or not.
3. The child process exec-ing jspawnhelper gets to the relvm call within
exec_args (which is called by exec_common). relvm is the function that releases
the parent process, allowing the parent to proceed. The problem is, at this
point of calling relvm, the child hasn't yet called close_exec which is
responsible for closing the file descriptors inherited from the parent process
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: Paul Dagnelie <pcd@delphix.com>
Reviewed by: Robert Mustacchi <rm@joyent.com>
Approved by: Dan McDonald <danmcd@omniti.com>
Author: Prakash Surya <prakash.surya@delphix.com>
FreeBSD Source:
This is the top level of the FreeBSD source directory. This file
was last revised on:
FreeBSD
For copyright information, please see the file COPYRIGHT in this directory (additional copyright information also exists for some sources in this tree - please see the specific source directories for more information).
The Makefile in this directory supports a number of targets for building components (or all) of the FreeBSD source tree. See build(7) and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/makeworld.html for more information, including setting make(1) variables.
The buildkernel
and installkernel
targets build and install
the kernel and the modules (see below). Please see the top of
the Makefile in this directory for more information on the
standard build targets and compile-time flags.
Building a kernel is a somewhat more involved process. See build(7), config(8), and https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/kernelconfig.html for more information.
Note: If you want to build and install the kernel with the
buildkernel
and installkernel
targets, you might need to build
world before. More information is available in the handbook.
The kernel configuration files reside in the sys/<arch>/conf
sub-directory. GENERIC is the default configuration used in release builds.
NOTES contains entries and documentation for all possible
devices, not just those commonly used.
Source Roadmap:
bin System/user commands.
cddl Various commands and libraries under the Common Development
and Distribution License.
contrib Packages contributed by 3rd parties.
crypto Cryptography stuff (see crypto/README).
etc Template files for /etc.
gnu Various commands and libraries under the GNU Public License.
Please see gnu/COPYING* for more information.
include System include files.
kerberos5 Kerberos5 (Heimdal) package.
lib System libraries.
libexec System daemons.
release Release building Makefile & associated tools.
rescue Build system for statically linked /rescue utilities.
sbin System commands.
secure Cryptographic libraries and commands.
share Shared resources.
stand Boot loader sources.
sys Kernel sources.
tests Regression tests which can be run by Kyua. See tests/README
for additional information.
tools Utilities for regression testing and miscellaneous tasks.
usr.bin User commands.
usr.sbin System administration commands.
For information on synchronizing your source tree with one or more of the FreeBSD Project's development branches, please see:
https://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/current-stable.html