freebsd-skq

Author	SHA1	Message	Date
bde	7fe18219e6	In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG). In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check. In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory. In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs. In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal. Approved by: re (kensmith) (blanket)	2007-08-07 10:35:27 +00:00
bde	c2333909d4	Fix and update the comments about the effect of the read-only flag on writing. They are still too verbose. Remove nearby unreachable code for handling symlinks. Approved by: re (kensmith) (blanket)	2007-08-07 05:42:10 +00:00
bde	bc5f57144e	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix some whitespace errors; fix only one case of a boolean comparison of a non-boolean). Improve an error message by quoting ".", and by not printing large positive values as negative ones. Approved by: re (kensmith) (blanket)	2007-08-07 03:59:49 +00:00
bde	fa70acb379	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix only a couple of whtespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:43:28 +00:00
bde	e46ce9b810	Fix some style bugs (mainly some whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:38:36 +00:00
bde	23aced0f9b	Fix some style bugs (some whitespace errors only). Approved by: re (kensmith) (blanket)	2007-08-07 03:22:10 +00:00
bde	a5b7135230	Sort includes. Remove rotted banal comment attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:28:33 +00:00
bde	17de976386	Sort includes. Remove banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:27:35 +00:00
bde	2a03f71880	Sort includes. Remove banal comments before includes. Remove rotted banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:20:37 +00:00
bde	874f990600	Remove unused include(s). Remove banal comments before includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:11:16 +00:00
bde	8bf123bea8	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 02:08:06 +00:00
bde	cc09c09c22	Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h> Approved by: re (kensmith) (blanket)	2007-08-07 01:40:27 +00:00
bde	c7403014c7	Include <sys/mutex.h>'s prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h>. Sort the include of <sys/mutex.h> instead of unsorting it after <sys/vnode.h> and depending on the pollution there. Approved by: re (kensmith) (blanket)	2007-08-07 01:37:59 +00:00
bde	602091cba8	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 01:07:16 +00:00
bde	2e613b8127	Silently fix up the estimated next free cluster number from the fsinfo sector, instead of failing the whole mount if it is garbage. Fields in the fsinfo sector are only advisory, so there are better sanity checks than this, and we already silently fix up the only other advisory field in the fsinfo (the free cluster count). This wasn't handled quite right in rev.1.92, 1.117, or in NetBSD. 1.92 also failed the whole mount for the non-garbage magic value 0xffffffff 1.117 fixed this well enough in practice since garbage values shouldn't occur in practice, but left the error handling larger and more convoluted than necessary. Now we handle the magic value as a special case of fixing up all out of bounds values. Also fix up the estimated next free cluster number when there is no fsinfo sector. We were using 0, but CLUST_FIRST is safer. Approved by: re (kensmith)	2007-08-05 12:58:34 +00:00
bde	e8377e738c	Oops, fix the fix for the i/o size of the fsinfo block. Its log message explained why the size is 1 sector, but the code used a size of 1 cluster. I/o sizes larger than necessary may cause serious coherency problems in the buffer cache. Here I think there were only minor efficiency problems, since a too-large fsinfo buffer could only get far enough to overlap buffers for the same vnode (the device vnode), so mappings are coherent at the page level although not at the buffer level, and the former is probably enough due to our limited use of the fsinfo buffer. Approved by: re (kensmith)	2007-08-03 23:13:50 +00:00
delphij	8f39689f22	MFp4 - Refine locking to eliminate some potential race/panics: - Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking. Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)	2007-08-03 06:24:31 +00:00
pjd	fe74e944d1	When we do open, we should lock the vnode exclusively. This fixes few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more... Discussed with: kib, ups Approved by: re (rwatson)	2007-07-26 16:58:09 +00:00
delphij	41fd4384d1	MFp4: Force 64-bit arithmatic when caculating the maximum file size. This fixes tmpfs caculations on 32-bit systems equipped with more than 4GB swap. Reported by: Craig Boston <craig xfoil gank org> PR: kern/114870 Approved by: re (tmpfs blanket)	2007-07-24 17:14:53 +00:00
bde	3a5cb59d83	Make using msdosfs as the root file system sort of work: o Initialize ownerships and permissions. They were garbage (0) for root mounts since vfs_mountroot_try() doesn't ask for them to be set and msdosfs's old incomplete code to set them was removed. The garbage happened to give the correct ownerships root:wheel, but it gave permissions 000 so init could not be execed. Use the macros for root: wheel and 0755. (The removed code gave 0:0 and 0777. 0755 is more normal and secure, thought wrong for /tmp.) o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the correct place, as in ffs. For root mounts, it is only passed in mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag and nothing translates the flag to the "ro" option string. msdosfs only looked for it in the string, so it gave a rw mount for root mounts without even clearing the flag in mp->mnt_flags, so the final state was inconsistent. Checking the flag only in mp->mnt_flags works for initial userland mounts too. The MNT_UPDATE case is messier. The main point that should work but doesn't is fsck of msdosfs root while it is mounted ro. This needs mainly MNT_RELOAD support to work. It should be possible to run fsck -p and succeed provided the fs is consistent, not just for msdosfs, but this fails because fsck -p always tries to open the device rw. The hack that allows open for writing in ffs is not implemented in msdosfs, since without MNT_RELOAD support writing could only be harmful. So fsck must be turned off to use msdosfs as root. This is quite dangerous, since msdosfs is still missing actually using its fs-dirty flag internally, so it is happy to mount dirty fileystems rw. Unrelated changes: - Fix missing error handling for MNT_UPDATE from rw to ro. - Catch up with renaming msdos to msdosfs in a string. Approved by: re (kensmith)	2007-07-23 07:10:17 +00:00
delphij	0321a712a7	MFp4: When swapping is not enabled, allow creating files by taking physical memory pages into account for tm_maxfilesize. Reported by: Dominique Goncalves <dominique.goncalves gmail.com> Submitted by: Howard Su Approved by: re (tmpfs blanket)	2007-07-23 06:54:58 +00:00
bde	8fdd1a79d0	Implement vfs clustering for msdosfs. This gives a very large speedup for small block sizes (in my tests, about 5 times for write and 3 times for read with a block size of 512, if clustering is possible) and a moderate speedup for the moderatatly large block sizes that should be used on non-small media (4K is the best size in most cases, and the speedup for that is about 1.3 times for write and 1.2 times for read). mmap() should benefit from clustering like read()/write(), but the current implementation of vm only supports clustering (at least for getpages) if the fs block size is >= PAGE SIZE. msdosfs is now only slightly slower than ffs with soft updates for writing and slightly faster for reading when both use their best block sizes. Writing is slower for msdosfs because of more sync writes. Reading is faster for msdosfs because indirect blocks interfere with clustering in ffs. The changes in msdosfs_read() and msdosfs_write() are simpler merges of corresponding code in ffs (after fixing some style bugs in ffs). msdosfs_bmap() needs fs-specific code. This implementation loops calling a lower level bmap function to do the hard parts. This is a bit inefficient, but is efficient enough since msdsfs_bmap() is only called when there is physical i/o to do. Approved by: re (hrs)	2007-07-20 17:06:57 +00:00
bde	b2bdcce9e1	Clean up before implementing vfs clustering for msdosfs: In msdosfs_read(), mainly reorder the main loop to the same order as in ffs_read(). In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of clrbuf(). I think this just just a bogus optimization, but ffs always does it and msdosfs already did it in one place, and it is what I've tested. In msdosfs_write(), merge good bits from a comment in ffs_write(), and fix 1 style bug. In the main comment for msdosfs_pcbmap(), improve wording and catch up with 13 years of changes in the function. This comment belongs in VOP_BMAP.9 but that doesn't exist. In msdosfs_bmap(), return EFBIG if the requested cluster number is out of bounds instead of blindly truncating it, and fix many style bugs. Approved by: re (hrs)	2007-07-20 16:21:47 +00:00
rwatson	f1da927af0	Make sure we release the control vnode in Coda: We allocate coda_ctlvp when /coda is mounted, but never release it. During the unmount this vnode was marked as UNMOUNTING and when venus is started a second time the system would hang, possibly waiting for the old vnode to disappear. So now we call vrele on the control vnode when file system is unmounted to drop the reference we got during the mount. I'm pretty sure it is also necessary to not skip the handling in coda_inactive for the control vnode, it seems like that is the place we actually get rid of the vnode once the refcount has dropped to 0. Submitted by: Jan Harkes <jaharkes at cs dot cmu dot edu> Approved by: re (kensmith)	2007-07-20 11:14:51 +00:00
delphij	d6ea8c65f3	MFp4: Rework on tmpfs's mapped read/write procedures. This should finally fix fsx test case. The printf's added here would be eventually turned into assertions. Submitted by: Mingyan Guo (mostly) Approved by: re (tmpfs blanket)	2007-07-19 03:34:50 +00:00
rwatson	25869d84fa	Complete repo-copy and move of Coda from src/sys/coda to src/sys/fs/coda by removing files from src/sys/coda, and updating include paths in the new location, kernel configuration, and Makefiles. In one case add $FreeBSD$. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 21:04:58 +00:00
rwatson	fd977ace64	Forced commit to recognize repo-copy of Coda files from src/sys/coda to src/sys/fs/coda. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 20:40:38 +00:00
bde	dba17f21d3	Round up the FAT block size to a multiple of the sector size so that i/o to the FAT is possible. Make the FAT block size less arbitrary before it is rounded up: - for FAT12, default to 3*512 instead of to 3 sectors. The magic 3 is the default number of 512-byte FAT sectors on a floppy drive. That many sectors is too many if the sector size is larger. - for !FAT12, default to PAGE_SIZE instead of to 4096. Remove MSDOSFS_DFLTBSIZE since it only obfuscated this 4096. For reading the BPB, use a block size of 8192 instead of 2048 so that sector sizes up to 8192 can work. We should try several sizes, or just try the maximum supported size (MAXBSIZE = 64K). I use 8192 because that is enough for DVD-RW's (even 2048 is enough) and 8192 has been tested a lot in use by ffs. This completes fixing msdosfs for some large sector sizes (up to 8K for read and 64K for write). Microsoft documents support for sector sizes up to 4K in mdosfs. ffs is currently limited to 8K for both read and write. Approved by: re (kensmith) Approved by: nyan (several years ago)	2007-07-12 17:17:47 +00:00
bde	01c1ec9b1a	Fix some bugs involving the fsinfo block (many remain unfixed). This is part of fixing msdosfs for large sector sizes. One of the fixed bugs was fatal for large sector sizes. 1. The fsinfo block has size 512, but it was misunderstood and declared as having size 1024, with nothing in the second 512 bytes except a signature at the end. The second 512 bytes actually normally (if the file system was created by Windows) consist of a second boot sector which is normally (in WinXP) empty except for a signature -- the normal layout is one boot sector, one fsinfo sector, another boot sector, then these 3 sectors duplicated. However, other layouts are valid. newfs_msdos produces a valid layout with one boot sector, one fsinfo sector, then these 2 sectors duplicated. The signature check for the extra part of the fsinfo was thus normally checking the signature in either the second boot sector or the first boot sector in the copy, and thus accidentally succeeding. The extra signature check would just fail for weirder layouts with 512-byte sectors, and for normal layouts with any other sector size. Remove the extra bytes and the extra signature check. 2. Old versions did i/o to the fsinfo block using size 1024, with the second half only used for the extra signature check on read. This was harmless for sector size 512, and worked accidentally for sector size 1024. The i/o just failed for larger sector sizes. The version being fixed did i/o to the fsinfo block using size fsi_size(pmp) = (1024 << ((pmp)->pm_BlkPerSec >> 2)). This expression makes no sense. It happens to work for sector small sector sizes, but for sector size 32K it gives the preposterous value of 64M and thus causes panics. A sector size of 32768 is necessary for at least some DVD-RW's (where the minimum write size is 32768 although the minimum read size is 2048). Now that the size of the fsinfo block is 512, it always fits in one sector so there is no need for a macro to express it. Just use the sector size where the old code uses 1024. Approved by: re (kensmith) Approved by: nyan (several years ago for a different version of (2))	2007-07-12 16:09:07 +00:00
rwatson	e9fe8191d7	Fix ioctls on the control vnode: ioctls on a character device fail with ENOTTY. Make the control vnode a regular file so that ioctls are passed through to our kernel module. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:34:41 +00:00
rwatson	1c2785e3fe	Avoid a panic in insmntque when we pass a NULL mount: this reenables some previously disabled code which according to the comment caused a problem during shutdown. But even that is still better than triggering a kernel panic whenever venus is started. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:33:46 +00:00
rwatson	4f942d92aa	Replace CODA_OPEN with CODA_OPEN_BY_FD: coda_open was disabled because we can't open container files by device/inode number pair anymore. Replace the CODA_OPEN upcall with CODA_OPEN_BY_FD, where venus returns an open file descriptor for the container file. We can then grab a reference on the vnode coda_psdev.c:vc_nb_write and use this vnode for further accesses to the container file. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:32:08 +00:00
rwatson	f30a555ada	Resolve Coda mount failing because Coda failed to match the device operations. But we don't have to, if we find the coda_mntinfo structure for this device in our linked list, we know the device is good. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:21:55 +00:00
rwatson	3287a2c53f	Avoid crash when opening Coda device: when allocating coda_mntinfo, we need to initialize dev so that we can actually find the allocated coda_mntinfo structure later on. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 20:39:53 +00:00
delphij	ff9bf4b792	MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes. Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)	2007-07-11 14:26:27 +00:00
bde	fb1dc96e72	Don't use almost perfectly pessimal cluster allocation. Allocation of the the first cluster in a file (and, if the allocation cannot be continued contiguously, for subsequent clusters in a file) was randomized in an attempt to leave space for contiguous allocation of subsequent clusters in each file when there are multiple writers. This reduced internal fragmentation by a few percent, but it increased external fragmentation by up to a few thousand percent. Use simple sequential allocation instead. Actually maintain the fsinfo sequence index for this. The read and write of this index from/to disk still have many non-critical bugs, but we now write an index that has something to do with our allocations instead of being modified garbage. If there is no fsinfo on the disk, then we maintain the index internally and don't go near the bugs for writing it. Allocating the first free cluster gives a layout that is almost as good (better in some cases), but takes too much CPU if the FAT is large and the first free cluster is not near the beginning. The effect of this change for untar and tar of a slightly reduced copy of /usr/src on a new file system was: Before (msdosfs 4K-clusters): untar: 459.57 real untar from cached file (actually a pipe) tar: 342.50 real tar from uncached tree to /dev/zero Before (ffs2 soft updates 4K-blocks 4K-frags) untar: 39.18 real tar: 29.94 real Before (ffs2 soft updates 16K-blocks 2K-frags) untar: 31.35 real tar: 18.30 real After (msdosfs 4K-clusters): untar 54.83 real tar 16.18 real All of these times can be improved further. With multiple concurrent writers or readers (especially readers), the improvement is smaller, but I couldn't find any case where it is negative. 342 seconds for tarring up about 342 MB on a ~47MB/S partition is just hard to unimprove on. (This operation would take about 7.3 seconds with reasonably localized allocation and perfect read-ahead.) However, for active file systems, 342 seconds is closer to normal than the 16+ seconds above or the 11 seconds with other changes (best I've measured -- won easily by msdosfs!). E.g., my active /usr/src on ffs1 is quite old and fragmented, so reading to prepare for the above benchmark takes about 6 times longer than reading back the fresh copies of it. Approved by: re (kensmith)	2007-07-10 13:20:24 +00:00
delphij	b4ff595242	MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user want root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein order to prevent data corruption. - Fix several bugs reported by Coverity. Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)	2007-07-08 15:56:12 +00:00
kib	0ae42a4095	Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bump device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method. destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the events clone to make sure no lingering devices are left after dev_clone event handler deregistered. make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that created device has reference counter bumped before cdev mutex is dropped inside make_dev(). Reviewed by: tegge (early versions), njl (programming interface) Debugging help and testing by: Peter Holm Approved by: re (kensmith)	2007-07-03 17:42:37 +00:00
delphij	f5523801a3	MFp4: - Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already act that way. [2] Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)	2007-06-29 05:23:15 +00:00
delphij	66a8b537d6	Space/style cleanups after last set of commits. Approved by: re (tmpfs blanket)	2007-06-28 02:39:31 +00:00
delphij	17bb881f75	Staticify most of fifo/vn operations, they should not be directly exposed outside. Approved by: re (tmpfs blanket)	2007-06-28 02:36:41 +00:00
delphij	ee97250932	Use vfs_timestamp instead of nanotime when obtaining a timestamp for use with timekeeping. Approved by: re (tmpfs blanket)	2007-06-28 02:34:32 +00:00
delphij	eedf8bf53d	Reorder tf_gen and tf_id in struct tmpfs_fid. This saves 8 bytes on amd64 architecture. Obtained from: NetBSD Approved by: re (tmpfs blanket)	2007-06-28 02:32:44 +00:00
delphij	a404044c90	Remove two function prototypes that are no longer used. Approved by: re (tmpfs blanket)	2007-06-26 02:08:29 +00:00
delphij	73fffc81bc	- Sync with NetBSD's RCSID (HEAD preferred). - Correct a typo. Approved by: re (tmpfs blanket)	2007-06-26 02:07:08 +00:00
delphij	a78e2646a2	MFp4: Several clean-ups and improvements over tmpfs: - Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since they does not bring any value now. - Use \|= instead of = when applying VV_ROOT flag. - Remove tm_avariable_nodes list. Use uma to hold the released nodes. - init/destory interlock mutex of node when init/fini instead of ctor/dtor. - Change memory computing using u_int to fix negative value in 2G mem machine. - Remove unnecessary bzero's - Rely uma logic to make file id allocation harder to guess. - Fix some unsigned/signed related things. Make sure we respect -o size=xxxx - Use wire instead of hold a page. - Pass allocate_zero to obtain zeroed pages upon first use. Submitted by: Howard Su Approved by: re (tmpfs blanket, kensmith)	2007-06-25 18:46:13 +00:00
rafan	ff392b04b7	- Remove UMAP filesystem. It was disconnected from build three years ago, and it is seriously broken. Discussed on: freebsd-arch@ Approved by: re (mux)	2007-06-25 05:06:57 +00:00
delphij	1da67b5003	Use vfs_timestamp() instead of nanotime() - make it up to the user to make decisions about how detail they wanted timestamps to have.	2007-06-18 14:40:19 +00:00
delphij	d50b261fe6	MFp4: fix two locking problems: - Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove. This will hopefully fix all known panics. Submitted by: Howard Su	2007-06-18 01:43:13 +00:00
delphij	ef518e4122	MFp4: Add tmpfs, an efficient memory file system. Please note that, this is currently considered as an experimental feature so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information. For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms. This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder are worked on it to continue this effort. Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)	2007-06-16 01:56:05 +00:00

1 2 3 4 5 ...

1982 Commits