Kirk McKusick 3a7053cb60
Prevent large files from monopolizing the system buffers. Keep
track of the number of dirty buffers held by a vnode. When a
bdwrite is done on a buffer, check the existing number of dirty
buffers associated with its vnode. If the number rises above
vfs.dirtybufthresh (currently 90% of vfs.hidirtybuffers), one
of the other (hopefully older) dirty buffers associated with
the vnode is written (using bawrite). In the event that this
approach fails to curb the growth in the vnode's number of
dirty buffers (due to soft updates rollback dependencies),
the more drastic approach of doing a VOP_FSYNC on the vnode
is used. This code primarily affects very large and actively
written files such as snapshots. This change should eliminate
hanging when taking snapshots or doing background fsck on
very large filesystems.
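
The policy described above can be illustrated with a small user-space
model. This is only a sketch, not the kernel code: struct model_vnode,
the model_* functions, the flush_blocked flag, and the slack parameter
(how far the count may exceed the threshold before falling back to a
full sync) are all hypothetical names and assumptions introduced for
illustration. Only the general shape (count dirty buffers per vnode,
write one other buffer when over the threshold, fall back to a full
sync if that does not help) comes from the description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a vnode; not the kernel's struct vnode. */
struct model_vnode {
	int dirty_count;   /* dirty buffers currently held by this vnode */
	int flush_blocked; /* nonzero: single-buffer writes make no progress
	                      (stands in for soft updates rollback dependencies) */
	int bawrite_calls; /* how often the single-buffer write was attempted */
	int fsync_calls;   /* how often the full sync fallback was used */
};

/* Models pushing out one other dirty buffer (the bawrite step). */
static void
model_bawrite_one(struct model_vnode *vp)
{
	vp->bawrite_calls++;
	if (!vp->flush_blocked && vp->dirty_count > 0)
		vp->dirty_count--;
}

/* Models the drastic fallback (the VOP_FSYNC step): everything is written. */
static void
model_fsync(struct model_vnode *vp)
{
	vp->fsync_calls++;
	vp->dirty_count = 0;
}

/*
 * Models a delayed write on one buffer of vp. The slack parameter is an
 * assumption of this sketch: it decides how far past the threshold the
 * count may grow before the full sync is used.
 */
static void
model_bdwrite(struct model_vnode *vp, int dirtybufthresh, int slack)
{
	assert(vp != NULL && dirtybufthresh > 0 && slack >= 0);
	vp->dirty_count++;
	if (vp->dirty_count > dirtybufthresh + slack)
		model_fsync(vp);        /* single writes did not curb growth */
	else if (vp->dirty_count > dirtybufthresh)
		model_bawrite_one(vp);  /* try to retire one other dirty buffer */
}
```

In this model, a vnode whose single-buffer writes succeed stays at the
threshold indefinitely, while a vnode whose writes are blocked climbs
through the slack and is then reset by one full sync.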

Hopefully, one day it will be possible to cache filesystem
metadata in the VM cache as is done with file data. As it
stands, only the buffer cache can be used, which limits total
metadata storage to about 20MB no matter how much memory is
available on the system. This rather small cache gets badly
thrashed, causing a lot of extra I/O. For example, taking a
snapshot of a 1TB filesystem minimally requires about 35,000
write operations, but because of the cache thrashing (we only
have about 350 buffers at our disposal) it ends up doing about
237,540 I/Os, thus taking twenty-five minutes instead of the
four it would take if it could run entirely in the cache.
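
To put the figures above side by side, the following sketch computes the
two ratios they imply: roughly 6.8 times the minimal number of I/O
operations and 6.25 times the run time. The amplification helper is a
hypothetical name used only for this illustration; the inputs are the
numbers quoted in the paragraph above.

```c
#include <assert.h>

/*
 * Ratio of an observed cost to the minimal cost.
 * Both arguments must be positive.
 */
static double
amplification(double observed, double minimal)
{
	assert(observed > 0.0 && minimal > 0.0);
	return observed / minimal;
}

/* I/O count: about 237,540 observed versus about 35,000 minimally required. */
static double
snapshot_io_amplification(void)
{
	return amplification(237540.0, 35000.0);
}

/* Run time: twenty-five minutes observed versus four minutes in cache. */
static double
snapshot_time_amplification(void)
{
	return amplification(25.0, 4.0);
}
```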

Reported by:	Attila Nagy <bra@fsn.hu>
Sponsored by:   DARPA & NAI Labs.
2003-02-25 06:44:42 +00:00