Don't use almost perfectly pessimal cluster allocation. Allocation

of the first cluster in a file (and, if the allocation cannot be
continued contiguously, for subsequent clusters in a file) was randomized
in an attempt to leave space for contiguous allocation of subsequent
clusters in each file when there are multiple writers.  This reduced
internal fragmentation by a few percent, but it increased external
fragmentation by up to a few thousand percent.
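
To illustrate the difference, here is a standalone sketch, not the driver
code; "rover", "nclusters", start_old() and start_new() are names invented
for the example:

	#include <stdlib.h>

	static unsigned long rover = 2;		/* i.e., CLUST_FIRST */

	/*
	 * Old policy: pick a (pseudo) random starting cluster per file.
	 * Each writer gets room to extend its own file contiguously, but
	 * the files land all over the FAT, so the remaining free space is
	 * scattered too (external fragmentation).
	 */
	static unsigned long
	start_old(unsigned long nclusters)
	{

		return (random() % nclusters);
	}

	/*
	 * New policy: start where the previous allocation stopped, so
	 * consecutive files pack together near the front of the volume.
	 */
	static unsigned long
	start_new(void)
	{

		return (rover);	/* the allocator advances this past each run */
	}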

Use simple sequential allocation instead.  Actually maintain the fsinfo
sequence index for this.  The read and write of this index from/to
disk still have many non-critical bugs, but we now write an index that
has something to do with our allocations instead of being modified
garbage.  If there is no fsinfo on the disk, then we maintain the index
internally and don't go near the bugs for writing it.
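
For reference, the index in question is the FAT32 "next free cluster"
hint kept in the FSInfo sector.  A sketch of that on-disk layout (field
names follow the FAT specification, not the driver's struct fsinfo):

	#include <stdint.h>

	/* FAT32 FSInfo sector: 512 bytes, little-endian on disk. */
	struct fsinfo_sketch {
		uint32_t lead_sig;	/* 0x41615252 */
		uint8_t	 reserved1[480];
		uint32_t struc_sig;	/* 0x61417272 */
		uint32_t free_count;	/* free clusters; 0xffffffff if unknown */
		uint32_t next_free;	/* hint for where to start searching */
		uint8_t	 reserved2[12];
		uint32_t trail_sig;	/* 0xaa550000 */
	};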

Allocating the first free cluster gives a layout that is almost as good
(better in some cases), but takes too much CPU if the FAT is large and
the first free cluster is not near the beginning.
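
A sketch of why that scan is expensive (the names here are invented for
the example; the driver keeps a similar in-core map of used clusters):

	#include <stddef.h>

	/* Return the first free cluster recorded in an in-use bitmap, or 0. */
	static size_t
	first_free(const unsigned char *inusemap, size_t nclusters)
	{
		size_t cn;

		/* Linear in the FAT size whenever its front is fully used. */
		for (cn = 2; cn < nclusters; cn++)	/* 2 == CLUST_FIRST */
			if ((inusemap[cn / 8] & (1 << (cn % 8))) == 0)
				return (cn);
		return (0);
	}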

The effect of this change for untar and tar of a slightly reduced copy
of /usr/src on a new file system was:

Before (msdosfs 4K-clusters):
untar:  459.57 real              untar from cached file (actually a pipe)
tar:    342.50 real              tar from uncached tree to /dev/zero
Before (ffs2 soft updates 4K-blocks 4K-frags):
untar:   39.18 real
tar:     29.94 real
Before (ffs2 soft updates 16K-blocks 2K-frags):
untar:   31.35 real
tar:     18.30 real

After (msdosfs 4K-clusters):
untar:   54.83 real
tar:     16.18 real

All of these times can be improved further.

With multiple concurrent writers or readers (especially readers), the
improvement is smaller, but I couldn't find any case where it is
negative.  342 seconds for tarring up about 342 MB on a ~47 MB/s partition
is just hard to unimprove on.  (This operation would take about 7.3
seconds with reasonably localized allocation and perfect read-ahead.)
However, for active file systems, 342 seconds is closer to normal than
the 16+ seconds above or the 11 seconds with other changes (best I've
measured -- won easily by msdosfs!).  E.g., my active /usr/src on ffs1
is quite old and fragmented, so reading to prepare for the above
benchmark takes about 6 times longer than reading back the fresh copies
of it.

Approved by:	re (kensmith)
Bruce Evans 2007-07-10 13:20:24 +00:00
parent 920d0c826e
commit 8e55bfaf4b
2 changed files with 5 additions and 6 deletions


@@ -769,6 +769,9 @@ chainalloc(pmp, start, count, fillwith, retcluster, got)
 		*retcluster = start;
 	if (got)
 		*got = count;
+	pmp->pm_nxtfree = start + count;
+	if (pmp->pm_nxtfree > pmp->pm_maxcluster)
+		pmp->pm_nxtfree = CLUST_FIRST;
 	return (0);
 }
@@ -806,11 +809,7 @@ clusteralloc(pmp, start, count, fillwith, retcluster, got)
 	} else
 		len = 0;
-	/*
-	 * Start at a (pseudo) random place to maximize cluster runs
-	 * under multiple writers.
-	 */
-	newst = random() % (pmp->pm_maxcluster + 1);
+	newst = pmp->pm_nxtfree;
 	foundl = 0;
 	for (cn = newst; cn <= pmp->pm_maxcluster;) {


@@ -94,7 +94,7 @@ struct msdosfsmount {
 	u_long pm_fatsize;	/* size of fat in bytes */
 	u_int32_t pm_fatmask;	/* mask to use for fat numbers */
 	u_long pm_fsinfo;	/* fsinfo block number */
-	u_long pm_nxtfree;	/* next free cluster in fsinfo block */
+	u_long pm_nxtfree;	/* next place to search for a free cluster */
 	u_int pm_fatmult;	/* these 2 values are used in fat */
 	u_int pm_fatdiv;	/* offset computation */
 	u_int pm_curfat;	/* current fat for FAT32 (0 otherwise) */