Brian Behlendorf 3ec3bc2167 OpenZFS 7793 - ztest fails assertion in dmu_tx_willuse_space
Reviewed by: Steve Gonczi <steve.gonczi@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Pavel Zakharov <pavel.zakharov@delphix.com>
Ported-by: Brian Behlendorf <behlendorf1@llnl.gov>

Background information: This assertion about tx_space_* verifies that we
are not dirtying more stuff than we thought we would. We “need” to know
how much we will dirty so that we can check if we should fail this
transaction with ENOSPC/EDQUOT, in dmu_tx_assign(). While the
transaction is open (i.e. between dmu_tx_assign() and dmu_tx_commit() —
typically less than a millisecond), we call dbuf_dirty() on the exact
blocks that will be modified. Once this happens, the temporary
accounting in tx_space_* is unnecessary, because we know exactly what
blocks are newly dirtied; we call dnode_willuse_space() to track this
more exact accounting.

The fundamental problem causing this bug is that dmu_tx_hold_*() relies
on the current state in the DMU (e.g. dn_nlevels) to predict how much
will be dirtied by this transaction, but this state can change before we
actually perform the transaction (i.e. call dbuf_dirty()).

This bug will be fixed by removing the assertion that the tx_space_*
accounting is perfectly accurate (i.e. we never dirty more than was
predicted by dmu_tx_hold_*()). By removing the requirement that this
accounting be perfectly accurate, we can also vastly simplify it, e.g.
removing most of the logic in dmu_tx_count_*().

The new tx space accounting will be very approximate, and may be more or
less than what is actually dirtied. It will still be used to determine
if this transaction will put us over quota. Transactions that are marked
by dmu_tx_mark_netfree() will be excepted from this check. We won’t make
an attempt to determine how much space will be freed by the transaction
— this was rarely accurate enough to determine if a transaction should
be permitted when we are over quota, which is why dmu_tx_mark_netfree()
was introduced in 2014.

We also won’t attempt to give “credit” when overwriting existing blocks,
if those blocks may be freed. This allows us to remove the
do_free_accounting logic in dbuf_dirty(), and associated routines. This
logic attempted to predict what will be on disk when this txg syncs, to
know if the overwritten block will be freed (i.e. exists, and has no
snapshots).

OpenZFS-issue: https://www.illumos.org/issues/7793
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3704e0a
Upstream bugs: DLPX-32883a
Closes #5804 

Porting notes:
- DNODE_SIZE replaced with DNODE_MIN_SIZE in dmu_tx_count_dnode(),
  Using the default dnode size would be slightly better.
- DEBUG_DMU_TX wrappers and configure option removed.
- Resolved _by_dnode() conflicts these changes have not yet been
  applied to OpenZFS.
2017-03-07 09:51:59 -08:00

129 lines
5.3 KiB
Plaintext

# ZoL userland configuration.
# To enable a boolean setting, set it to yes, on, true, or 1.
# Anything else will be interpreted as unset.
# Run `zfs mount -a` during system start?
ZFS_MOUNT='yes'
# Run `zfs unmount -a` during system stop?
ZFS_UNMOUNT='yes'
# Run `zfs share -a` during system start?
# nb: The shareiscsi, sharenfs, and sharesmb dataset properties.
ZFS_SHARE='yes'
# Run `zfs unshare -a` during system stop?
ZFS_UNSHARE='yes'
# By default, a verbatim import of all pools is performed at boot based on the
# contents of the default zpool cache file. The contents of the cache are
# managed automatically by the 'zpool import' and 'zpool export' commands.
#
# By setting this to 'yes', the system will instead search all devices for
# pools and attempt to import them all at boot, even those that have been
# exported. Under this mode, the search path can be controlled by the
# ZPOOL_IMPORT_PATH variable and a list of pools that should not be imported
# can be listed in the ZFS_POOL_EXCEPTIONS variable.
#
# Note that importing all visible pools may include pools that you don't
# expect, such as those on removable devices and SANs, and those pools may
# proceed to mount themselves in places you do not want them to. The results
# can be unpredictable and possibly dangerous. Only enable this option if you
# understand this risk and have complete physical control over your system and
# SAN to prevent the insertion of malicious pools.
ZPOOL_IMPORT_ALL_VISIBLE='no'
# Specify specific path(s) to look for device nodes and/or links for the
# pool import(s). See zpool(8) for more information about this variable.
# It supersedes the old USE_DISK_BY_ID which indicated that it would only
# try '/dev/disk/by-id'.
# The old variable will still work in the code, but is deprecated.
#ZPOOL_IMPORT_PATH="/dev/disk/by-vdev:/dev/disk/by-id"
# List of pools that should NOT be imported at boot
# when ZPOOL_IMPORT_ALL_VISIBLE is 'yes'.
# This is a space separated list.
#ZFS_POOL_EXCEPTIONS="test2"
# List of pools that SHOULD be imported at boot by the initramfs
# instead of trying to import all available pools. If this is set
# then ZFS_POOL_EXCEPTIONS is ignored.
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
# This is a semi-colon separated list.
#ZFS_POOL_IMPORT="pool1;pool2"
# Should the datasets be mounted verbosely?
# A mount counter will be used when mounting if set to 'yes'.
VERBOSE_MOUNT='no'
# Should we allow overlay mounts?
# This is standard in Linux, but not ZFS which comes from Solaris where this
# is not allowed).
DO_OVERLAY_MOUNTS='no'
# Any additional option to the 'zfs import' commandline?
# Include '-o' for each option wanted.
# You don't need to put '-f' in here, unless you want it ALL the time.
# Using the option 'zfsforce=1' on the grub/kernel command line will
# do the same, but on a case-to-case basis.
ZPOOL_IMPORT_OPTS=""
# Full path to the ZFS cache file?
# See "cachefile" in zpool(8).
# The default is "@sysconfdir@/zfs/zpool.cache".
#ZPOOL_CACHE="@sysconfdir@/zfs/zpool.cache"
#
# Setting ZPOOL_CACHE to an empty string ('') AND setting ZPOOL_IMPORT_OPTS to
# "-c @sysconfdir@/zfs/zpool.cache" will _enforce_ the use of a cache file.
# This is needed in some cases (extreme amounts of VDEVs, multipath etc).
# Generally, the use of a cache file is usually not recommended on Linux
# because it sometimes is more trouble than it's worth (laptops with external
# devices or when/if device nodes changes names).
#ZPOOL_IMPORT_OPTS="-c @sysconfdir@/zfs/zpool.cache"
#ZPOOL_CACHE=""
# Any additional option to the 'zfs mount' command line?
# Include '-o' for each option wanted.
MOUNT_EXTRA_OPTIONS=""
# Build kernel modules with the --enable-debug switch?
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_DKMS_ENABLE_DEBUG='no'
# Keep debugging symbols in kernel modules?
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_DKMS_DISABLE_STRIP='no'
# Wait for this many seconds in the initrd pre_mountroot?
# This delays startup and should be '0' on most systems.
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_PRE_MOUNTROOT_SLEEP='0'
# Wait for this many seconds in the initrd mountroot?
# This delays startup and should be '0' on most systems. This might help on
# systems which have their ZFS root on a USB disk that takes just a little
# longer to be available
# Only applicable for Debian GNU/Linux {dkms,initramfs}.
ZFS_INITRD_POST_MODPROBE_SLEEP='0'
# List of additional datasets to mount after the root dataset is mounted?
#
# The init script will use the mountpoint specified in the 'mountpoint'
# property value in the dataset to determine where it should be mounted.
#
# This is a space separated list, and will be mounted in the order specified,
# so if one filesystem depends on a previous mountpoint, make sure to put
# them in the right order.
#
# It is not necessary to add filesystems below the root fs here. It is
# taken care of by the initrd script automatically. These are only for
# additional filesystems needed. Such as /opt, /usr/local which is not
# located under the root fs.
# Example: If root FS is 'rpool/ROOT/rootfs', this would make sense.
#ZFS_INITRD_ADDITIONAL_DATASETS="rpool/ROOT/usr rpool/ROOT/var"
# Optional arguments for the ZFS Event Daemon (ZED).
# See zed(8) for more information on available options.
#ZED_ARGS="-M"