dde9380a1b
Currently, the `zvol_threads` variable, which controls the number of worker threads that process items from the ZVOL queues, is set to the number of available CPUs.

This choice seems to be based on the assumption that ZVOL threads are CPU-bound. This is not necessarily true, especially for synchronous writes. Consider the situation described in the comments for `zil_commit()`, which is called inside `zvol_write()` for synchronous writes:

> itxs are committed in batches. In a heavily stressed zil there will be a
> commit writer thread who is writing out a bunch of itxs to the log for a
> set of committing threads (cthreads) in the same batch as the writer.
> Those cthreads are all waiting on the same cv for that batch.
>
> There will also be a different and growing batch of threads that are
> waiting to commit (qthreads). When the committing batch completes a
> transition occurs such that the cthreads exit and the qthreads become
> cthreads. One of the new cthreads becomes the writer thread for the batch.
> Any new threads arriving become new qthreads.

We can easily deduce that, in the case of ZVOLs, there can be at most `zvol_threads` cthreads and qthreads. The default value for `zvol_threads` is typically between 1 and 8, which is far too low in this case. This means there will be many small commits to the ZIL, which is very inefficient compared to a few big commits, especially since we have to wait for the data to be on stable storage. Increasing the number of threads increases the amount of data waiting to be committed and thus the size of the individual commits.

On my system, in the context of VM disk image storage (lots of small synchronous writes), increasing `zvol_threads` from 8 to 32 results in a 50% increase in sequential synchronous write performance.

We should choose a more sensible default for `zvol_threads`. Unfortunately, the optimal value is difficult to determine automatically, since it depends on the synchronous write latency of the underlying storage devices. In any case, a hardcoded value of 32 would probably be better than the current situation. Having a lot of ZVOL threads doesn't seem to have any real downside anyway.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Fixes #392
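
The batching argument in the commit message can be illustrated with a deliberately simplified cost model. This is only a sketch and is not code from `zvol.c` or `zil.c`: the function names, the assumption that each worker thread contributes exactly one write per commit, and the fixed per-commit flush latency are all hypothetical choices made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical model: every ZIL commit carries at most `threads` writes,
 * because each worker thread has at most one synchronous write in flight.
 * Returns the number of commits needed for `writes` writes.
 */
static uint64_t
commits_needed(uint64_t writes, uint64_t threads)
{
	return ((writes + threads - 1) / threads);	/* ceiling division */
}

/*
 * Hypothetical model: each commit pays a fixed stable-storage latency
 * (flush_latency_us), and each write adds a small transfer cost
 * (per_write_us). Both values are assumptions, not measurements.
 */
static uint64_t
total_time_us(uint64_t writes, uint64_t threads,
    uint64_t flush_latency_us, uint64_t per_write_us)
{
	return (commits_needed(writes, threads) * flush_latency_us +
	    writes * per_write_us);
}
```

Under these assumptions, `total_time_us(1024, 8, 1000, 10)` evaluates to 138240 microseconds, while `total_time_us(1024, 32, 1000, 10)` evaluates to 42240. The model ignores queueing effects and CPU cost, so it only shows the direction of the effect (fewer, larger commits amortize the flush latency), not the 50% figure reported above.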
||
---|---|---|
arc.c | ||
bplist.c | ||
bpobj.c | ||
dbuf.c | ||
ddt_zap.c | ||
ddt.c | ||
dmu_diff.c | ||
dmu_object.c | ||
dmu_objset.c | ||
dmu_send.c | ||
dmu_traverse.c | ||
dmu_tx.c | ||
dmu_zfetch.c | ||
dmu.c | ||
dnode_sync.c | ||
dnode.c | ||
dsl_dataset.c | ||
dsl_deadlist.c | ||
dsl_deleg.c | ||
dsl_dir.c | ||
dsl_pool.c | ||
dsl_prop.c | ||
dsl_scan.c | ||
dsl_synctask.c | ||
fm.c | ||
gzip.c | ||
lzjb.c | ||
Makefile.in | ||
metaslab.c | ||
refcount.c | ||
rrwlock.c | ||
sa.c | ||
sha256.c | ||
spa_boot.c | ||
spa_config.c | ||
spa_errlog.c | ||
spa_history.c | ||
spa_misc.c | ||
spa.c | ||
space_map.c | ||
txg.c | ||
uberblock.c | ||
unique.c | ||
vdev_cache.c | ||
vdev_disk.c | ||
vdev_file.c | ||
vdev_label.c | ||
vdev_mirror.c | ||
vdev_missing.c | ||
vdev_queue.c | ||
vdev_raidz.c | ||
vdev_root.c | ||
vdev.c | ||
zap_leaf.c | ||
zap_micro.c | ||
zap.c | ||
zfs_acl.c | ||
zfs_byteswap.c | ||
zfs_debug.c | ||
zfs_dir.c | ||
zfs_fm.c | ||
zfs_fuid.c | ||
zfs_ioctl.c | ||
zfs_log.c | ||
zfs_onexit.c | ||
zfs_replay.c | ||
zfs_rlock.c | ||
zfs_sa.c | ||
zfs_vfsops.c | ||
zfs_vnops.c | ||
zfs_znode.c | ||
zil.c | ||
zio_checksum.c | ||
zio_compress.c | ||
zio_inject.c | ||
zio.c | ||
zle.c | ||
zpl_export.c | ||
zpl_file.c | ||
zpl_inode.c | ||
zpl_super.c | ||
zpl_xattr.c | ||
zrlock.c | ||
zvol.c |