freebsd-nq/sys
Kirk McKusick 35338e6091 This change avoids a kernel deadlock on "snaplk" when using
snapshots on UFS filesystems running with journaled soft updates.
This is the first of several bugs that need to be fixed before
removing the restriction added in -r230250 to prevent the use
of snapshots on filesystems running with journaled soft updates.

The deadlock occurs when holding the snapshot lock (snaplk)
and then trying to flush an inode via ffs_update(). We become
blocked by another process that is trying to flush a different
inode contained in the same inode block that we need. That process
holds the lock on the inode block for which we are waiting. When it
tries to write the inode block, it blocks waiting for our snaplk
when it calls ffs_copyonwrite() to see if the inode block needs
to be copied into our snapshot.
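
The cycle described above can be sketched as a small wait-for model.
This is a userland illustration only, not kernel code: the resource
names, struct proc_model, owner_of() and has_deadlock() are all
hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical resource identifiers for this model; not kernel symbols. */
enum resource { RES_NONE = -1, RES_SNAPLK = 0, RES_INODE_BLOCK = 1 };

/* One modeled process: the lock it holds and the lock it is blocked on. */
struct proc_model {
	enum resource holds;
	enum resource waits_for;	/* RES_NONE if not blocked */
};

/* Return the index of the process holding res, or -1 if nobody holds it. */
static int
owner_of(const struct proc_model *procs, size_t n, enum resource res)
{
	for (size_t i = 0; i < n; i++)
		if (procs[i].holds == res)
			return ((int)i);
	return (-1);
}

/*
 * Follow the wait-for chain starting at process "start".
 * Return true if the chain leads back to "start", i.e. a deadlock.
 */
static bool
has_deadlock(const struct proc_model *procs, size_t n, size_t start)
{
	size_t cur = start;

	assert(start < n);
	for (size_t steps = 0; steps < n; steps++) {
		enum resource want = procs[cur].waits_for;
		int next;

		if (want == RES_NONE)
			return (false);
		next = owner_of(procs, n, want);
		if (next < 0)
			return (false);
		if ((size_t)next == start)
			return (true);
		cur = (size_t)next;
	}
	return (false);
}
```

With one process holding RES_SNAPLK and waiting for RES_INODE_BLOCK,
and a second holding RES_INODE_BLOCK and waiting for RES_SNAPLK,
has_deadlock() reports the cycle. If the first process does not wait
for the inode block, the cycle disappears.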

The most obvious place that this deadlock arises is in the
ffs_copyonwrite() routine when it updates critical metadata
in a snapshot and tries to write it out before proceeding.
The fix here is to write the data and indirect block pointer
for the snapshot, but to skip the call to ffs_update() to
write the snapshot inode. To ensure that we will never have
to update a pointer in the inode itself, the ffs_snapshot()
routine that creates the snapshot has to ensure that all the
direct blocks are allocated as part of the creation of the
snapshot.
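
A minimal sketch of why preallocating the direct blocks helps,
assuming a toy inode with MODEL_NDADDR direct pointers. The names
snap_inode_model, snap_init_model(), snap_create_model() and
snap_cow_model() are hypothetical and greatly simplified; they are
not the real ffs_snapshot() or ffs_copyonwrite() code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NDADDR 12	/* direct block pointers in this toy inode */

struct snap_inode_model {
	long direct[MODEL_NDADDR];	/* 0 means "not yet allocated" */
	long next_free;			/* toy allocator cursor */
	bool inode_dirty;		/* inode itself would need writing */
};

static void
snap_init_model(struct snap_inode_model *sp)
{
	for (int i = 0; i < MODEL_NDADDR; i++)
		sp->direct[i] = 0;
	sp->next_free = 100;
	sp->inode_dirty = false;
}

/* Creation step: allocate every direct block up front. */
static void
snap_create_model(struct snap_inode_model *sp)
{
	for (int i = 0; i < MODEL_NDADDR; i++)
		if (sp->direct[i] == 0)
			sp->direct[i] = sp->next_free++;
	/* The inode is written once here, as part of creation. */
	sp->inode_dirty = false;
}

/*
 * Copy-on-write into direct block lbn.
 * Returns true if only data had to be written; false if a pointer in
 * the inode changed, which would require an inode update.
 */
static bool
snap_cow_model(struct snap_inode_model *sp, int lbn)
{
	assert(lbn >= 0 && lbn < MODEL_NDADDR);
	if (sp->direct[lbn] == 0) {
		sp->direct[lbn] = sp->next_free++;
		sp->inode_dirty = true;
		return (false);
	}
	return (true);
}
```

In this model, once snap_create_model() has run, snap_cow_model()
never sets inode_dirty for a direct block, so the copy-on-write path
has no reason to write the snapshot inode.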

A less obvious place that this deadlock occurs is when we hold
the snaplk because we are deleting a snapshot. In the course of
doing the deletion, we need to allocate various soft update
dependency structures and allocate some journal space. If we
hit a resource limit while doing this we decrease the resources
in use by flushing out an existing dirty file to get it to give
up the soft dependency resources that it holds. The flush can
cause an ffs_update() to be done on the inode for the file that
we have selected to flush. This results in the same deadlock as
described above when the inode that we have chosen to flush
resides in the same inode block as the snapshot inode that we hold.
The fix is to defer cleaning up any time that the inode on which
we are operating is a snapshot.
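
The deferral rule can be sketched as follows. This is a simplified
model under stated assumptions: softdep_model, inode_model and
request_cleanup_model() are hypothetical names for this example and
do not reproduce the actual soft updates code.

```c
#include <assert.h>
#include <stdbool.h>

struct inode_model {
	bool is_snapshot;	/* inode we are operating on is a snapshot */
};

struct softdep_model {
	int deps_in_use;	/* dependency structures currently allocated */
	int deps_max;		/* modeled resource limit */
	int deferred_cleanups;	/* cleanups postponed for later */
};

enum cleanup_result { CLEANUP_NOT_NEEDED, CLEANUP_DEFERRED, CLEANUP_FLUSHED };

/*
 * Called when more dependency structures are needed on behalf of ip.
 * If the limit is reached, either flush another file to reclaim
 * "reclaimable" structures, or defer when ip is a snapshot, since the
 * flush could block on an inode block shared with the snapshot inode.
 */
static enum cleanup_result
request_cleanup_model(struct softdep_model *sd, const struct inode_model *ip,
    int reclaimable)
{
	assert(reclaimable >= 0);
	if (sd->deps_in_use < sd->deps_max)
		return (CLEANUP_NOT_NEEDED);
	if (ip->is_snapshot) {
		sd->deferred_cleanups++;
		return (CLEANUP_DEFERRED);
	}
	sd->deps_in_use -= reclaimable;
	if (sd->deps_in_use < 0)
		sd->deps_in_use = 0;
	return (CLEANUP_FLUSHED);
}
```

The model temporarily tolerates being at the limit while a snapshot
is being operated on, and leaves the actual reclaim to a later call
made on behalf of an ordinary file.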

Help and review by:    Jeff Roberson
Tested by:             Peter Holm
MFC (to 9 only) after: 2 weeks
2012-03-01 18:45:25 +00:00
amd64 Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs. 2012-02-28 22:30:58 +00:00
arm Make sure we do not provide the page 0 to the VM. It can't handle it properly, 2012-02-29 12:44:34 +00:00
boot Fix a long standing bug. The caller expects a non-zero value for success. 2012-02-29 18:11:33 +00:00
bsm
cam Use a better way to silence unneeded internal declaration warnings in 2012-02-23 21:34:14 +00:00
cddl Analogous to r232059, add a parameter for the ZFS file system: 2012-02-26 16:30:39 +00:00
compat Add procfs to jail-mountable filesystems. 2012-02-29 00:30:18 +00:00
conf Add driver for the RME HDSPe AIO/RayDAT sound cards -- snd_hdspe(4). 2012-03-01 13:10:18 +00:00
contrib IFC @231845 2012-02-17 00:27:48 +00:00
crypto
ddb
dev Add driver for the RME HDSPe AIO/RayDAT sound cards -- snd_hdspe(4). 2012-03-01 13:10:18 +00:00
fs Fix the NFS clients so that they use copyin() instead of bcopy(), 2012-03-01 03:53:07 +00:00
gdb
geom If nested scheme allows dump kernel to its partition, we may allow 2012-02-20 06:35:52 +00:00
gnu/fs Use new OSS-based BSD-licensed header for cs sound driver. 2012-02-01 21:38:01 +00:00
i386 Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs. 2012-02-28 22:30:58 +00:00
ia64 Correct capitalization of "Hz" in user-visible text (manpages, printf(), 2012-02-28 13:19:34 +00:00
isa
kern This change avoids a kernel deadlock on "snaplk" when using 2012-03-01 18:45:25 +00:00
kgssapi
libkern
mips Revert part of old logic of assigning MAC addresses: 2012-02-29 05:48:29 +00:00
modules Add driver for the RME HDSPe AIO/RayDAT sound cards -- snd_hdspe(4). 2012-03-01 13:10:18 +00:00
net Use a more appropriate default for the maximum number of addresses in the 2012-02-29 20:58:21 +00:00
net80211 Only increment is_beacon_bad if we're not scanning. 2012-02-28 21:43:29 +00:00
netatalk Fix typos 2012-02-28 15:07:05 +00:00
netgraph Revert r231829, that was my braino. 2012-02-22 09:08:51 +00:00
netinet - Refresh dynamic tcp rule only if both sides answered keepalive packets. 2012-02-28 22:00:41 +00:00
netinet6 In selectroute() add a missing fibnum argument to an in6_rtalloc() 2012-02-24 20:06:04 +00:00
netipsec Add multi-FIB IPv6 support to the core network stack supplementing 2012-02-03 13:08:44 +00:00
netipx
netnatm
netncp
netsmb
nfs Add multi-FIB IPv6 support to the core network stack supplementing 2012-02-03 13:08:44 +00:00
nfsclient Fix the NFS clients so that they use copyin() instead of bcopy(), 2012-03-01 03:53:07 +00:00
nfsserver
nlm jwd@ reported a problem via email to freebsd-fs@ on Aug 25, 2011 2012-01-31 02:11:05 +00:00
ofed
opencrypto
pc98 Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs. 2012-02-28 22:30:58 +00:00
pci Use correct Config registers for RTL8139 family. Unlike RTL8168 and 2012-02-25 04:54:51 +00:00
powerpc Add backlight control to ATI-graphics PowerBooks and iBooks. 2012-02-26 13:45:25 +00:00
rpc
security Remove direct access to si_name. 2012-02-10 12:35:57 +00:00
sparc64
sys This change avoids a kernel deadlock on "snaplk" when using 2012-03-01 18:45:25 +00:00
teken
tools Make vnode_if.awk parse vnode operations with underscores, like VOP_FOO_BAR. 2012-02-21 19:35:59 +00:00
ufs This change avoids a kernel deadlock on "snaplk" when using 2012-03-01 18:45:25 +00:00
vm Simplify kmem_alloc() by eliminating code that existed on account of 2012-02-29 05:41:29 +00:00
x86 Copy amd64 stdarg.h to x86 and replace amd64/i386/pc98 stdarg.h with stubs. 2012-02-28 22:30:58 +00:00
xdr
xen blkif interface comment cleanups. No functional changes 2012-02-29 17:47:01 +00:00
Makefile