illumos/illumos-gate@20a95fb2c420a95fb2c4https://www.illumos.org/issues/5768
zfsctl_snapshot_inactive() leaks a hold on the dvp (directory vnode) if v_count > 1.
reproduce by:
create a fs with 100 snapshots.
have a thread do:
while true; do ls -l /test/snaps/.zfs/snapshot >/dev/null; done
have another thread do:
while true; do zfs promote test/clone; zfs promote test/snaps; done
use dtrace to delay & observe:
dtrace -w -xd \\
-n 'vn_rele:entry/args0 == (void*)0xffffff01dd42ce80ULL/{[stack()]=count();
chill(100000);}' \\
-n 'zfsctl_snapshot_inactive:entry{ if (args[0]->v_count > 1) trace(args[0]-
>v_count); self->vp=args[0];}' \\
-n 'gfs_vop_inactive:entry/callers["zfsctl_snapshot_inactive"]/{self->good=1;
[stack()]=count()}' \\
-n 'zfsctl_snapshot_inactive:return{if (self->good) self->good=0; else printf
("bad return");}' \\
-n 'gfs_dir_lookup:return/callers["zfsctl_snapshot_inactive"] && self->vp-
>v_count > 1/{trace(self->vp->v_count)}'
the address is found by selecting one of the output of this at random:
dtrace -n 'zfsctl_snapshot_inactive:entry{print(args[0]);'
when you see "bad return", we have hit the bug. Then doing "zfs umount test/
snaps" will fail with EBUSY.
When we hit this case, we also leak the hold on the target vnode (vn). When the
inactive callback is called on a vnode with v_count > 1, it needs to be
decremented.
Reviewed by: George Wilson <george@delphix.com>
Reviewed by: Prakash Surya <prakash.surya@delphix.com>
Reviewed by: Adam Leventhal <adam.leventhal@delphix.com>
Reviewed by: Bayard Bell <buffer.g.overflow@gmail.com>
Approved by: Rich Lowe <richlowe@richlowe.net>
Author: Matthew Ahrens <mahrens@delphix.com>