freebsd-dev/module/zfs
Brian Behlendorf 556011dbec Improve N-way mirror performance
The read bandwidth of an N-way mirror can by increased by 50%,
and the IOPs by 10%, by more carefully selecting the preferred
leaf vdev.

The existing algorthm selects a perferred leaf vdev based on
offset of the zio request modulo the number of members in the
mirror.  It assumes the drives are of equal performance and
that spreading the requests randomly over both drives will be
sufficient to saturate them.  In practice this results in the
leaf vdevs being under utilized.

Utilization can be improved by preferentially selecting the leaf
vdev with the least pending IO.  This prevents leaf vdevs from
being starved and compensates for performance differences between
disks in the mirror.  Faster vdevs will be sent more work and
the mirror performance will not be limitted by the slowest drive.

In the common case where all the pending queues are full and there
is no single least busy leaf vdev a batching stratagy is employed.
Of the N least busy vdevs one is selected with equal probability
to be the preferred vdev for T microseconds.  Compared to randomly
selecting a vdev to break the tie batching the requests greatly
improves the odds of merging the requests in the Linux elevator.

The testing results show a significant performance improvement
for all four workloads tested.  The workloads were generated
using the fio benchmark and are as follows.

1) 1MB sequential reads from 16 threads to 16 files (MB/s).
2) 4KB sequential reads from 16 threads to 16 files (MB/s).
3) 1MB random reads from 16 threads to 16 files (IOP/s).
4) 4KB random reads from 16 threads to 16 files (IOP/s).

               | Pristine              |  With 1461             |
               | Sequential  Random    |  Sequential  Random    |
               | 1MB  4KB    1MB  4KB  |  1MB  4KB    1MB  4KB  |
               | MB/s MB/s   IO/s IO/s |  MB/s MB/s   IO/s IO/s |
---------------+-----------------------+------------------------+
2 Striped      | 226  243     11  304  |  222  255     11  299  |
2 2-Way Mirror | 302  324     16  534  |  433  448     23  571  |
2 3-Way Mirror | 429  458     24  714  |  648  648     41  808  |
2 4-Way Mirror | 562  601     36  849  |  816  828     82  926  |

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #1461
2013-07-11 13:53:50 -07:00
..
arc.c Improve code in arc_buf_remove_ref 2013-07-09 11:53:28 -07:00
bplist.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
bpobj.c Illumos #3006 2013-06-19 15:14:10 -07:00
bptree.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dbuf.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
ddt_zap.c Add ddt_object_count() error handling 2012-10-29 08:57:45 -07:00
ddt.c Fix incorrect assertions in ddt_phys_decref and ddt_sync_entry 2013-05-06 14:10:55 -07:00
dmu_diff.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dmu_object.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
dmu_objset.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dmu_send.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dmu_traverse.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dmu_tx.c Add new kstat for monitoring time in dmu_tx_assign 2013-07-11 13:53:44 -07:00
dmu_zfetch.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
dmu.c Illumos #3086: unnecessarily setting DS_FLAG_INCONSISTENT on async 2013-01-08 10:35:43 -08:00
dnode_sync.c Illumos #3006 2013-06-19 15:14:10 -07:00
dnode.c Illumos #3006 2013-06-19 15:14:10 -07:00
dsl_dataset.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dsl_deadlist.c Illumos #3104: eliminate empty bpobjs 2013-01-08 10:35:43 -08:00
dsl_deleg.c Illumos #2619 and #2747 2013-01-08 10:35:35 -08:00
dsl_dir.c Illumos #3006 2013-06-19 15:14:10 -07:00
dsl_pool.c Add new kstat for monitoring time in dmu_tx_assign 2013-07-11 13:53:44 -07:00
dsl_prop.c Switch KM_SLEEP to KM_PUSHPAGE 2012-09-17 11:22:23 -07:00
dsl_scan.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
dsl_synctask.c Illumos #3006 2013-06-19 15:14:10 -07:00
fm.c Condition variable usage, zevent_cv 2012-10-15 16:01:54 -07:00
gzip.c
lz4.c Linux 3.9 compat: Undefine GCC_VERSION 2013-03-06 15:48:48 -08:00
lzjb.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
Makefile.in Illumos #3035 LZ4 compression support in ZFS and GRUB 2013-01-29 09:28:20 -08:00
metaslab.c Illumos #3552, #3564 2013-06-19 16:22:39 -07:00
refcount.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
rrwlock.c Enable rrwlock.c compilation 2010-12-07 16:05:25 -08:00
sa.c Constify structures containing function pointers 2013-03-04 08:49:32 -08:00
sha256.c Add linux sha2 support 2010-08-31 13:41:59 -07:00
spa_boot.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_config.c Add zfs_autoimport_disable tunable 2013-07-09 10:11:19 -07:00
spa_errlog.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
spa_history.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
spa_misc.c Illumos #3329, #3330, #3331, #3335 2013-05-06 12:39:34 -07:00
spa.c Call zvol_create_minors() in spa_open_common() when initializing pool 2013-07-03 09:22:44 -07:00
space_map.c Illumos #3552, #3564 2013-06-19 16:22:39 -07:00
txg.c Fix txg_quiesce thread deadlock 2013-04-26 14:42:36 -07:00
uberblock.c
unique.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
vdev_cache.c Switch KM_SLEEP to KM_PUSHPAGE 2012-08-27 12:01:37 -07:00
vdev_disk.c Use GFP_NOIO in vdev_disk_io_flush() 2013-07-10 14:12:21 -07:00
vdev_file.c Illumos #3581 spa_zio_taskq[ZIO_TYPE_FREE][ZIO_TASKQ_ISSUE]->tq_lock contention 2013-05-06 14:05:37 -07:00
vdev_label.c Illumos #3090 and #3102 2013-01-08 10:35:42 -08:00
vdev_mirror.c Improve N-way mirror performance 2013-07-11 13:53:50 -07:00
vdev_missing.c Illumos #1948: zpool list should show more detailed pool info 2012-09-19 13:39:05 -07:00
vdev_queue.c 3246 ZFS I/O deadman thread 2013-05-01 17:05:52 -07:00
vdev_raidz.c Illumos #3006 2013-06-19 15:14:10 -07:00
vdev_root.c Illumos #1948: zpool list should show more detailed pool info 2012-09-19 13:39:05 -07:00
vdev.c Illumos #3552, #3564 2013-06-19 16:22:39 -07:00
zap_leaf.c Switch KM_SLEEP to KM_PUSHPAGE 2012-09-05 08:44:58 -07:00
zap_micro.c Illumos #3006 2013-06-19 15:14:10 -07:00
zap.c Illumos #3006 2013-06-19 15:14:10 -07:00
zfeature_common.c Illumos #3035 LZ4 compression support in ZFS and GRUB 2013-01-29 09:28:20 -08:00
zfeature.c Illumos #3104: eliminate empty bpobjs 2013-01-08 10:35:43 -08:00
zfs_acl.c Avoid gcc -Werror=maybe-uninitialized warnings 2013-01-28 09:10:29 -08:00
zfs_byteswap.c Add linux kernel module support 2010-08-31 13:41:58 -07:00
zfs_ctldir.c Use dsl_dataset_snap_lookup() 2013-01-25 15:07:40 -08:00
zfs_debug.c Illumos #3006 2013-06-19 15:14:10 -07:00
zfs_dir.c Trivial spelling fix 2013-04-19 15:43:16 -07:00
zfs_fm.c 3246 ZFS I/O deadman thread 2013-05-01 17:05:52 -07:00
zfs_fuid.c Drop HAVE_XVATTR macros 2011-03-02 11:44:34 -08:00
zfs_ioctl.c Call zvol_create_minors() in spa_open_common() when initializing pool 2013-07-03 09:22:44 -07:00
zfs_log.c Revert "Remove TSD zfs_fsyncer_key" 2012-12-20 09:56:28 -08:00
zfs_onexit.c
zfs_replay.c Constify structures containing function pointers 2013-03-04 08:49:32 -08:00
zfs_rlock.c Illumos #3006 2013-06-19 15:14:10 -07:00
zfs_sa.c Revert "Use SA_HDL_PRIVATE for SA xattrs" 2012-08-25 09:25:56 -07:00
zfs_vfsops.c Fix zfs_sb_teardown/zfs_resume_fs NULL dereference 2013-07-01 14:51:45 -07:00
zfs_vnops.c Add SEEK_DATA/SEEK_HOLE to lseek()/llseek() 2013-07-02 09:24:43 -07:00
zfs_znode.c Illumos #3006 2013-06-19 15:14:10 -07:00
zil.c Illumos #3498 panic in arc_read() 2013-07-02 13:34:31 -07:00
zio_checksum.c
zio_compress.c Illumos #3035 LZ4 compression support in ZFS and GRUB 2013-01-29 09:28:20 -08:00
zio_inject.c 3246 ZFS I/O deadman thread 2013-05-01 17:05:52 -07:00
zio.c Log pool suspension warnings to the console 2013-07-10 15:15:52 -07:00
zle.c
zpl_ctldir.c Add d_clear_d_op() compatibility 2013-01-23 16:33:29 -08:00
zpl_export.c Implement .commit_metadata hook for NFS export 2012-10-03 10:49:45 -07:00
zpl_file.c Add SEEK_DATA/SEEK_HOLE to lseek()/llseek() 2013-07-02 09:24:43 -07:00
zpl_inode.c Add explicit MAXNAMELEN check 2013-02-12 10:27:39 -08:00
zpl_super.c Update SAs when an inode is dirtied 2012-12-14 12:18:54 -08:00
zpl_xattr.c Only check directory xattr on ENOENT 2013-05-10 12:24:56 -07:00
zrlock.c Export ZFS symbols needed by Lustre. 2010-09-17 16:24:15 -07:00
zvol.c 3.10 API change: block_device_operations->release() returns void 2013-07-08 15:41:57 -07:00