Don't use almost perfectly pessimal cluster allocation. Allocation

of the first cluster in a file (and, if the allocation cannot be
continued contiguously, for subsequent clusters in a file) was randomized
in an attempt to leave space for contiguous allocation of subsequent
clusters in each file when there are multiple writers.  This reduced
internal fragmentation by a few percent, but it increased external
fragmentation by up to a few thousand percent.
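
To illustrate the difference, here is a standalone sketch, not the driver
code; "rover", "nclusters", start_old() and start_new() are names invented
for the example:

	#include <stdlib.h>

	static unsigned long rover = 2;		/* i.e., CLUST_FIRST */

	/*
	 * Old policy: pick a (pseudo) random starting cluster per file.
	 * Each writer gets room to extend its own file contiguously, but
	 * the files land all over the FAT, so the remaining free space is
	 * scattered too (external fragmentation).
	 */
	static unsigned long
	start_old(unsigned long nclusters)
	{

		return (random() % nclusters);
	}

	/*
	 * New policy: start where the previous allocation stopped, so
	 * consecutive files pack together near the front of the volume.
	 */
	static unsigned long
	start_new(void)
	{

		return (rover);	/* the allocator advances this past each run */
	}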

Use simple sequential allocation instead.  Actually maintain the fsinfo
sequence index for this.  The read and write of this index from/to
disk still have many non-critical bugs, but we now write an index that
has something to do with our allocations instead of being modified
garbage.  If there is no fsinfo on the disk, then we maintain the index
internally and don't go near the bugs for writing it.
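
For reference, the index in question is the FAT32 "next free cluster"
hint kept in the FSInfo sector.  A sketch of that on-disk layout (field
names follow the FAT specification, not the driver's struct fsinfo):

	#include <stdint.h>

	/* FAT32 FSInfo sector: 512 bytes, little-endian on disk. */
	struct fsinfo_sketch {
		uint32_t lead_sig;	/* 0x41615252 */
		uint8_t	 reserved1[480];
		uint32_t struc_sig;	/* 0x61417272 */
		uint32_t free_count;	/* free clusters; 0xffffffff if unknown */
		uint32_t next_free;	/* hint for where to start searching */
		uint8_t	 reserved2[12];
		uint32_t trail_sig;	/* 0xaa550000 */
	};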

Allocating the first free cluster gives a layout that is almost as good
(better in some cases), but takes too much CPU if the FAT is large and
the first free cluster is not near the beginning.
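
A sketch of why that scan is expensive (the names here are invented for
the example; the driver keeps a similar in-core map of used clusters):

	#include <stddef.h>

	/* Return the first free cluster recorded in an in-use bitmap, or 0. */
	static size_t
	first_free(const unsigned char *inusemap, size_t nclusters)
	{
		size_t cn;

		/* Linear in the FAT size whenever its front is fully used. */
		for (cn = 2; cn < nclusters; cn++)	/* 2 == CLUST_FIRST */
			if ((inusemap[cn / 8] & (1 << (cn % 8))) == 0)
				return (cn);
		return (0);
	}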

The effect of this change for untar and tar of a slightly reduced copy
of /usr/src on a new file system was:

Before (msdosfs 4K-clusters):
untar:  459.57 real              untar from cached file (actually a pipe)
tar:    342.50 real              tar from uncached tree to /dev/zero
Before (ffs2 soft updates 4K-blocks 4K-frags):
untar:   39.18 real
tar:     29.94 real
Before (ffs2 soft updates 16K-blocks 2K-frags):
untar:   31.35 real
tar:     18.30 real

After (msdosfs 4K-clusters):
untar:   54.83 real
tar:     16.18 real

All of these times can be improved further.

With multiple concurrent writers or readers (especially readers), the
improvement is smaller, but I couldn't find any case where it is
negative.  342 seconds for tarring up about 342 MB on a ~47 MB/s partition
is just hard to unimprove on.  (This operation would take about 7.3
seconds with reasonably localized allocation and perfect read-ahead.)
However, for active file systems, 342 seconds is closer to normal than
the 16+ seconds above or the 11 seconds with other changes (best I've
measured -- won easily by msdosfs!).  E.g., my active /usr/src on ffs1
is quite old and fragmented, so reading to prepare for the above
benchmark takes about 6 times longer than reading back the fresh copies
of it.

Approved by:	re (kensmith)
Bruce Evans 2007-07-10 13:20:24 +00:00
parent 920d0c826e
commit 8e55bfaf4b
2 changed files with 5 additions and 6 deletions


@@ -769,6 +769,9 @@ chainalloc(pmp, start, count, fillwith, retcluster, got)
 		*retcluster = start;
 	if (got)
 		*got = count;
+	pmp->pm_nxtfree = start + count;
+	if (pmp->pm_nxtfree > pmp->pm_maxcluster)
+		pmp->pm_nxtfree = CLUST_FIRST;
 	return (0);
 }
@@ -806,11 +809,7 @@ clusteralloc(pmp, start, count, fillwith, retcluster, got)
 	} else
 		len = 0;
-	/*
-	 * Start at a (pseudo) random place to maximize cluster runs
-	 * under multiple writers.
-	 */
-	newst = random() % (pmp->pm_maxcluster + 1);
+	newst = pmp->pm_nxtfree;
 	foundl = 0;
 	for (cn = newst; cn <= pmp->pm_maxcluster;) {


@@ -94,7 +94,7 @@ struct msdosfsmount {
 	u_long pm_fatsize;	/* size of fat in bytes */
 	u_int32_t pm_fatmask;	/* mask to use for fat numbers */
 	u_long pm_fsinfo;	/* fsinfo block number */
-	u_long pm_nxtfree;	/* next free cluster in fsinfo block */
+	u_long pm_nxtfree;	/* next place to search for a free cluster */
 	u_int pm_fatmult;	/* these 2 values are used in fat */
 	u_int pm_fatdiv;	/* offset computation */
 	u_int pm_curfat;	/* current fat for FAT32 (0 otherwise) */