08d08ebba2
As described in Issues #458 and #258, unlinking large amounts of data can cause the threads in the zio free wait queue to start spinning. Reducing the number of z_fr_iss threads from a fixed value of 100 to 1 per CPU significantly reduces contention on the taskq spinlock and improves throughput. Instrumenting the taskq code showed that __taskq_dispatch() can spend a long time holding tq->tq_lock if there are a large number of threads in the queue. It turns out the time spent in wake_up() scales linearly with the number of threads in the queue. When a large number of short work items are dispatched, as seems to be the case with unlink, the worker threads drain the queue faster than the dispatcher can fill it. They then all pile into the work wait queue to wait for new work items. So if 100 threads are in the queue, wake_up() takes about 100 times as long, and the woken threads have to spin until the dispatcher releases the lock. Reducing the number of threads helps with the symptoms, but doesn't get to the root of the problem. It would seem that wake_up() shouldn't scale linearly in time with queue depth, particularly if we are only trying to wake up one thread. In that vein, I tried making all of the waiting processes exclusive to prevent the scheduler from iterating over the entire list, but I still saw the linear time scaling. So further investigation is needed, but in the meantime reducing the thread count is an easy workaround. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue #258 Issue #458 |
||
---|---|---|
arc.c | ||
bplist.c | ||
bpobj.c | ||
dbuf.c | ||
ddt_zap.c | ||
ddt.c | ||
dmu_diff.c | ||
dmu_object.c | ||
dmu_objset.c | ||
dmu_send.c | ||
dmu_traverse.c | ||
dmu_tx.c | ||
dmu_zfetch.c | ||
dmu.c | ||
dnode_sync.c | ||
dnode.c | ||
dsl_dataset.c | ||
dsl_deadlist.c | ||
dsl_deleg.c | ||
dsl_dir.c | ||
dsl_pool.c | ||
dsl_prop.c | ||
dsl_scan.c | ||
dsl_synctask.c | ||
fm.c | ||
gzip.c | ||
lzjb.c | ||
Makefile.in | ||
metaslab.c | ||
refcount.c | ||
rrwlock.c | ||
sa.c | ||
sha256.c | ||
spa_boot.c | ||
spa_config.c | ||
spa_errlog.c | ||
spa_history.c | ||
spa_misc.c | ||
spa.c | ||
space_map.c | ||
txg.c | ||
uberblock.c | ||
unique.c | ||
vdev_cache.c | ||
vdev_disk.c | ||
vdev_file.c | ||
vdev_label.c | ||
vdev_mirror.c | ||
vdev_missing.c | ||
vdev_queue.c | ||
vdev_raidz.c | ||
vdev_root.c | ||
vdev.c | ||
zap_leaf.c | ||
zap_micro.c | ||
zap.c | ||
zfs_acl.c | ||
zfs_byteswap.c | ||
zfs_debug.c | ||
zfs_dir.c | ||
zfs_fm.c | ||
zfs_fuid.c | ||
zfs_ioctl.c | ||
zfs_log.c | ||
zfs_onexit.c | ||
zfs_replay.c | ||
zfs_rlock.c | ||
zfs_sa.c | ||
zfs_vfsops.c | ||
zfs_vnops.c | ||
zfs_znode.c | ||
zil.c | ||
zio_checksum.c | ||
zio_compress.c | ||
zio_inject.c | ||
zio.c | ||
zle.c | ||
zpl_export.c | ||
zpl_file.c | ||
zpl_inode.c | ||
zpl_super.c | ||
zpl_xattr.c | ||
zrlock.c | ||
zvol.c |
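
The sizing change described in the commit message at the top of this listing (a fixed 100 z_fr_iss threads replaced by one per CPU) can be sketched in user space as follows. This is only an illustration under stated assumptions, not the code from the commit: `threads_per_cpu()` and `wakeup_cost_ratio()` are hypothetical helpers, `sysconf(_SC_NPROCESSORS_ONLN)` stands in for whatever CPU count the kernel code actually uses, and the ratio relies on the linear wake_up() scaling reported in the message.

```c
#include <unistd.h>

/* Fixed z_fr_iss thread count named in the commit message. */
#define OLD_FREE_ISSUE_THREADS 100L

/*
 * Hypothetical helper (not from the ZFS source): size a worker pool at
 * one thread per online CPU, falling back to 1 if the count is unknown.
 * _SC_NPROCESSORS_ONLN is a common extension (Linux, BSD, macOS).
 */
static long
threads_per_cpu(void)
{
	long ncpus = sysconf(_SC_NPROCESSORS_ONLN);

	return (ncpus > 0 ? ncpus : 1);
}

/*
 * Hypothetical helper: ratio of idle waiters under the old fixed sizing
 * to idle waiters under the new sizing. If wake-up time is proportional
 * to the number of waiters, as the message reports, this is roughly the
 * factor by which the worst-case wake-up cost per dispatch shrinks.
 */
static double
wakeup_cost_ratio(long new_threads)
{
	return ((double)OLD_FREE_ISSUE_THREADS / (double)new_threads);
}
```

For example, with 8 online CPUs `threads_per_cpu()` returns 8 and `wakeup_cost_ratio(8)` is 12.5. As the message notes, this only shortens the time the dispatcher holds the lock; it does not explain why waking a single thread scales with queue depth.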