FreeBSD src
Etienne Dechamps bb3250d07e Allocate disk space fairly in the presence of vdevs of unequal size.
The metaslab allocator device selection algorithm contains a bias
mechanism whose goal is to achieve roughly equal disk space usage across
all top-level vdevs.

It seems that the initial rationale for this code was to allow newly
added (empty) vdevs to "come up to speed" faster in an attempt to make
the pool quickly converge to a steady state where all vdevs are equally
utilized.

While the code seems to work reasonably well for this use case, there
is another scenario in which this algorithm fails miserably: the case
where top-level vdevs don't have the same sizes (capacities). ZFS
allows this, and it is a good feature to have, so that users who simply
want to build a pool with the disks they happen to have lying around can
do so even if the disks have heterogeneous sizes.

Here's a script that simulates a pool with two vdevs, with one 4X larger
than the other:

    dd if=/dev/zero of=/tmp/d1 bs=1 count=1 seek=134217728
    dd if=/dev/zero of=/tmp/d2 bs=1 count=1 seek=536870912
    zpool create testspace /tmp/d1 /tmp/d2
    dd if=/dev/zero of=/testspace/foobar bs=1M count=256
    zpool iostat -v testspace

Before this commit, the script would output the following:

                   capacity
    pool        alloc   free
    ----------  -----  -----
    testspace    252M   375M
      /tmp/d1    104M  18.5M
      /tmp/d2    148M   356M
    ----------  -----  -----

This demonstrates that the existing code handles this situation very
poorly: d1 is 85% full even though the pool as a whole is only 40% full.
At that point d1 is nearly saturated, and the resulting fragmentation
and the like slow down the entire pool.
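
The percentages quoted above follow directly from the alloc and free columns. As a minimal sketch (the helper name `utilization` is made up for illustration and is not part of ZFS), the arithmetic can be reproduced like this:

```python
def utilization(alloc_mib, free_mib):
    """Percentage of capacity in use, given allocated and free space."""
    return 100.0 * alloc_mib / (alloc_mib + free_mib)


# (alloc, free) values copied from the "before" zpool iostat output above.
# Units are treated as MiB here; only the ratios matter.
before = {
    "testspace": (252.0, 375.0),
    "/tmp/d1": (104.0, 18.5),
    "/tmp/d2": (148.0, 356.0),
}

for name, (alloc, free) in before.items():
    print(f"{name:10s} {utilization(alloc, free):5.1f}% used")
```

The small vdev ends up far fuller than the pool average while the large vdev stays well below it, which is the imbalance the commit addresses.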

In contrast, here's the result with the code in this commit:

                   capacity
    pool        alloc   free
    ----------  -----  -----
    testspace    252M   375M
      /tmp/d1   56.7M  66.3M
      /tmp/d2    195M   309M
    ----------  -----  -----

This looks much better. d1 is 46% used, which is close to the overall
pool utilization (40%). The code still doesn't result in perfectly
balanced allocation, probably because of the way mg_bias is applied,
which does not guarantee perfect accuracy, but this is still much better
than before.
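
To put a number on "not perfectly balanced", the following sketch compares the observed allocation with a strictly capacity-proportional split. It is only an illustration built from the iostat figures above: `fair_shares` is a hypothetical helper, and it does not model how mg_bias is actually computed or applied.

```python
def fair_shares(vdevs):
    """Split the total allocated space across vdevs in proportion to capacity.

    `vdevs` maps a vdev name to an (alloc, free) pair; capacity is alloc + free.
    """
    total_alloc = sum(alloc for alloc, _ in vdevs.values())
    total_capacity = sum(alloc + free for alloc, free in vdevs.values())
    return {
        name: total_alloc * (alloc + free) / total_capacity
        for name, (alloc, free) in vdevs.items()
    }


# (alloc, free) values copied from the "after" zpool iostat output above.
after = {
    "/tmp/d1": (56.7, 66.3),
    "/tmp/d2": (195.0, 309.0),
}

ideal = fair_shares(after)
for name, (alloc, _) in after.items():
    print(f"{name}: actual {alloc:6.1f}M, proportional {ideal[name]:6.1f}M")
```

With these figures a perfectly proportional split would place roughly 49M on d1, so the observed 56.7M is a few megabytes above the ideal, consistent with the remaining gap between 46% and 40%.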

Signed-off-by: Etienne Dechamps <etienne@edechamps.fr>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #3389
2015-06-22 14:18:29 -07:00
cmd Unify mount and share for 'zfs create/clone' 2015-06-17 11:01:16 -07:00
config Add zfs_sb_prune_aliases() function 2015-06-22 10:22:49 -07:00
contrib Add bash completions by Aneurin Price. 2014-08-06 15:03:28 -07:00
dracut Update dracut README 2015-06-18 11:58:53 -07:00
etc Additional SYSV init script fixes. 2015-06-17 13:30:03 -07:00
include Rename cv_wait_interruptible() to cv_wait_sig() 2015-06-11 10:50:47 -07:00
lib Add taskq_wait_outstanding() function 2015-06-11 10:27:25 -07:00
man Add -y option to zpool iostat 2015-06-17 10:39:20 -07:00
module Allocate disk space fairly in the presence of vdevs of unequal size. 2015-06-22 14:18:29 -07:00
rpm Base init scripts for SYSV systems 2015-05-28 14:14:53 -07:00
scripts Set zfs_autoimport_disable default value to 1 2015-02-17 16:09:41 -08:00
udev Open pools asynchronously after module load 2013-07-03 09:24:38 -07:00
.gitignore Ignore *.{deb,rpm,tar.gz} files in the top directory. 2013-04-24 16:18:59 -07:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
AUTHORS Add a missing > to AUTHORS 2014-09-02 14:18:53 -07:00
autogen.sh build: do not call boilerplate ourself 2013-04-02 10:55:20 -07:00
configure.ac Add RHEL style kmod packages 2015-03-27 14:41:48 -07:00
copy-builtin Consistent menuconfig name 2012-08-26 13:49:37 -07:00
COPYRIGHT Update ZED copyright boilerplate 2015-05-11 15:07:00 -07:00
DISCLAIMER Fix minor typos and update marketing copy. 2013-03-21 12:51:06 -07:00
Makefile.am Style check shell scripts 2015-05-20 14:10:03 -07:00
META Tag zfs-0.6.4 2015-04-08 20:16:45 -07:00
OPENSOLARIS.LICENSE Add CDDL license file 2008-12-01 14:49:34 -08:00
README.markdown Fix minor typos and update marketing copy. 2013-03-21 12:51:06 -07:00
zfs-script-config.sh.in Initial implementation of zed (ZFS Event Daemon) 2014-04-02 13:10:03 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

Native ZFS for Linux!

ZFS is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the Illumos community.

ZFS on Linux, which is also known as ZoL, is currently feature complete. It includes fully functional and stable SPA, DMU, ZVOL, and ZPL layers.

Full documentation for installing ZoL on your favorite Linux distribution can be found at: http://zfsonlinux.org