freebsd-skq

Author	SHA1	Message	Date
bde	d2c2b5f35c	Add noclusterr and noclusterw options to the options list. I forgot these when I implemented clustering.	2007-10-18 16:25:47 +00:00
bde	adbeba35f8	Fix some style bugs in the mount options list. Mainly, sort the list, leaving space for adding missing options. Negative options are sorted after removing their "no" prefix, and generic options are sorted before msdosfs-specific ones.	2007-10-18 15:48:10 +00:00
bde	896bab2157	In msdosfs_settattr(), don't do synchronous updates of the denode (except indirectly for the size pseudo-attribute). If anything deserves a sync update, then it is ids and immutable flags, since these are related to security, but ffs never synced these and msdosfs doesn't support them. (ufs_setattr() only does an update in one case where it is least needed (for timestamps); it did pessimal sync updates for timestamps until 1998/03/08 but was changed for unlogged reasons related to soft updates.) Now msdosfs calls deupdat() with waitfor == 0, which normally gives a delayed update to disk but always gives a sync update of timestamps in core, while for ffs everything is delayed until the syncer daemon or other activity causes an update (except for timestamps). This gives a large optimization mainly for things like cp -p, where attribute adjustment could easily triple the number of physical I/O's if it is done synchronously (but cp -p to msdosfs is not as bad as that, since msdosfs doesn't support many attributes so null adjustments are more common, and msdosfs doesn't support ctimes so even if cp doesn't weed out null adjustments they don't become non-null after clobbering the ctime).	2007-10-18 07:26:21 +00:00
alfred	3a60df401c	Get rid of qaddr_t. Requested by: bde	2007-10-16 10:54:55 +00:00
daichi	87bd60ac74	This change makes nullfs work correctly with the latest unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:57:11 +00:00
daichi	b4e293afdf	Added whiteout behavior option. ``-o whiteout=always'' is default mode (it is established practice) and ``-o whiteout=whenneeded'' is less disk-space using mode especially for resource restricted environments like embedded environments. (Contributed by Ed Schouten. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:55:38 +00:00
daichi	7759a8a0eb	Default copy mode has been changed from traditional-mode to transparent-mode. Some folks who reported issues have solved them with transparent mode. We guess it is time to change the default copy mode. The transparent-mode is the best in most situations. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:53:38 +00:00
daichi	1b42caf41d	Fixed un-vrele issue of upper layer root vnode of unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:52:01 +00:00
daichi	1f6ec6407c	Added NULL check code pointed out by Coverity. (via Stanislav Sedov. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:50:58 +00:00
daichi	bf7aeca620	- It has become MPSAFE. - Fixed lock panic issue under MPSAFE. - Fixed panic issue whenever it locks vnode with reclaim. - Fixed lock implementations not conforming to vnode_if.src style. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:49:30 +00:00
daichi	f3fd8ae96c	Fixed vnode unlock/vrele untreated issues whenever errors have occurred during some treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:47:44 +00:00
daichi	a009cf6b3c	- Added support for vfs_cache on unionfs. As a result, you can use applications that use procfs on unionfs. - Removed unionfs internal cache mechanism because it has vfs_cache support instead. As a result, it just simplified code of unionfs. - Fixed kern/111262 issue. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:46:11 +00:00
daichi	4aad1608ad	Added handling to prevent an infinite readdir loop when used with the Linux binary compatibility feature. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:44:06 +00:00
daichi	a763e0d0a2	Changed it to free unneeded memory ASAP. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:42:05 +00:00
daichi	dc348d6e70	Log: Improved access permission check treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:37:52 +00:00
jhb	3739d97391	Use the correct pid when checking to see whether or not the /proc/<pid> directory itself (rather than any of its contents) is visible to the current thread. MFC after: 1 week PR: kern/90063 Submitted by: john of 8192.net Approved by: re (kensmith)	2007-10-05 17:37:25 +00:00
delphij	0c91cfd26b	MFp4: Provide a dummy verb "export" to shut up the message showed up at start when NFS is enabled. Reported by: rafan Approved by: re (tmpfs blanket)	2007-10-04 17:11:48 +00:00
delphij	679cdcf0e4	Additional work is still needed before we can claim that tmpfs is stable enough for production usage. Warn user upon mount. Approved by: re (tmpfs blanket)	2007-10-04 17:08:46 +00:00
bde	5cdc06872e	Remove some of the pessimizations involving writing the fsi sector. All active fields in fsi are advisory/optional, so we shouldn't do extra work to make them valid at all times, but instead we write to the fsi too often (we still do), and we searched for a free cluster for fsinxtfree too often. This commit just removes the whole search and its results, so that we write out our in-core copy of fsinxtfree instead of writing a "fixed" copy and clobbering our in-core copy. This saves fixing 3 bugs: - off-by-1 error for the end of the search, resulting in fsinxtfree not actually being adjusted iff only the last cluster is free. - missing adjustment when no clusters are free. - off-by-many error for the start of the search. Starting the search at 0 instead of at (the in-core copy of) fsinxtfree did more than defeat the reasons for existence of fsinxtfree. fsinxtfree exists mainly to avoid having to start at 0 for just the first search per mount, but has the side effect of reducing bias towards allocating near cluster 0. The bias would normally only be generated by the first search per mount (if fsinxtfree is not supported), but since we also adjusted the in-core copy of fsinxtfree here, we were doing extra work to maximize the bias. Approved by: re (kensmith)	2007-09-23 14:49:32 +00:00
rodrigc	b2b7d089f7	Disable multiple ntfs mounts to the same mountpoint. Eliminates panics due to locking issues. Idea taken from src/sys/gnu/fs/xfs/FreeBSD/xfs_super.c. PR: 89966, 92000, 104393 Reported by: H. Matsuo <hiroshi50000 yahoo co jp>, Chris <m2chrischou gmail.com>, Andrey V. Elsukov <bu7cher yandex ru>, Jan Henrik Sylvester <me janh de> Approved by: re (kensmith)	2007-09-21 23:50:15 +00:00
jeff	3fc0f8b973	- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM. Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)	2007-09-17 05:31:39 +00:00
bde	8e0e951bed	Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions can easily block in bread(), and then there was nothing to prevent the static buffer (nambuf_{ptr,len,last_id}) being clobbered by another thread. The effects of the bug seem to have been limited to failed lookups and mangled names in readdir(), since Giant locking provides enough serialization to prevent concurrent calls to the functions that access the buffer. They were very obvious for multiple concurrent tree walks, especially with a small cluster size. The bug was introduced in msdosfs_conv.c 1.34 and associated changes, and is in all releases starting with 5.2. The fix is to allocate the buffer as a local variable and pass around pointers to it like "_r" functions in libc do. Stack use from this is large but not too large. This also fixes a memory leak on module unload. Reviewed by: kib Approved by: re (kensmith)	2007-08-31 22:29:55 +00:00
delphij	e83de305a6	MFp4: rework tmpfs_readdir() logic in terms of correctness. Approved by: re (tmpfs blanket) Tested with: fstest, fsx	2007-08-16 11:00:07 +00:00
jhb	7fdc86bfe3	On 6.x this works: % mount | grep home /dev/ad4s1e on /home (ufs, local, noatime, soft-updates) % mount -u -o atime /home % mount | grep home /dev/ad4s1e on /home (ufs, local, soft-updates) Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow In addition, on 7.x, the following are equivalent: mount -u -o atime /home mount -u -o nonoatime /home Ideally, when we introduce new mount options, we should avoid options starting with "no". :) Requested by: jhb Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com> Approved by: re (bmah) Proxy commit for: rodrigc	2007-08-15 17:40:09 +00:00
delphij	5496743409	MFp4: - LK_RETRY prohibits vget() and vn_lock() to return error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions, we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic with the same way used in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code. Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)	2007-08-10 11:00:30 +00:00
delphij	1e2d5f7f4a	MFp4: - Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL dereference - Be more careful handling vnodes () () This is a WIP [1] by pjd via howardsu Thanks kib@ for his valuable VFS related comments. Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)	2007-08-10 05:24:49 +00:00
bde	7fe18219e6	In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG). In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check. In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory. In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs. In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal. Approved by: re (kensmith) (blanket)	2007-08-07 10:35:27 +00:00
bde	c2333909d4	Fix and update the comments about the effect of the read-only flag on writing. They are still too verbose. Remove nearby unreachable code for handling symlinks. Approved by: re (kensmith) (blanket)	2007-08-07 05:42:10 +00:00
bde	bc5f57144e	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix some whitespace errors; fix only one case of a boolean comparison of a non-boolean). Improve an error message by quoting ".", and by not printing large positive values as negative ones. Approved by: re (kensmith) (blanket)	2007-08-07 03:59:49 +00:00
bde	fa70acb379	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix only a couple of whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:43:28 +00:00
bde	e46ce9b810	Fix some style bugs (mainly some whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:38:36 +00:00
bde	23aced0f9b	Fix some style bugs (some whitespace errors only). Approved by: re (kensmith) (blanket)	2007-08-07 03:22:10 +00:00
bde	a5b7135230	Sort includes. Remove rotted banal comment attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:28:33 +00:00
bde	17de976386	Sort includes. Remove banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:27:35 +00:00
bde	2a03f71880	Sort includes. Remove banal comments before includes. Remove rotted banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:20:37 +00:00
bde	874f990600	Remove unused include(s). Remove banal comments before includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:11:16 +00:00
bde	8bf123bea8	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 02:08:06 +00:00
bde	cc09c09c22	Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h> Approved by: re (kensmith) (blanket)	2007-08-07 01:40:27 +00:00
bde	c7403014c7	Include <sys/mutex.h>'s prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h>. Sort the include of <sys/mutex.h> instead of unsorting it after <sys/vnode.h> and depending on the pollution there. Approved by: re (kensmith) (blanket)	2007-08-07 01:37:59 +00:00
bde	602091cba8	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 01:07:16 +00:00
bde	2e613b8127	Silently fix up the estimated next free cluster number from the fsinfo sector, instead of failing the whole mount if it is garbage. Fields in the fsinfo sector are only advisory, so there are better sanity checks than this, and we already silently fix up the only other advisory field in the fsinfo (the free cluster count). This wasn't handled quite right in rev.1.92, 1.117, or in NetBSD. 1.92 also failed the whole mount for the non-garbage magic value 0xffffffff 1.117 fixed this well enough in practice since garbage values shouldn't occur in practice, but left the error handling larger and more convoluted than necessary. Now we handle the magic value as a special case of fixing up all out of bounds values. Also fix up the estimated next free cluster number when there is no fsinfo sector. We were using 0, but CLUST_FIRST is safer. Approved by: re (kensmith)	2007-08-05 12:58:34 +00:00
bde	e8377e738c	Oops, fix the fix for the i/o size of the fsinfo block. Its log message explained why the size is 1 sector, but the code used a size of 1 cluster. I/o sizes larger than necessary may cause serious coherency problems in the buffer cache. Here I think there were only minor efficiency problems, since a too-large fsinfo buffer could only get far enough to overlap buffers for the same vnode (the device vnode), so mappings are coherent at the page level although not at the buffer level, and the former is probably enough due to our limited use of the fsinfo buffer. Approved by: re (kensmith)	2007-08-03 23:13:50 +00:00
delphij	8f39689f22	MFp4 - Refine locking to eliminate some potential race/panics: - Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking. Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)	2007-08-03 06:24:31 +00:00
pjd	fe74e944d1	When we do open, we should lock the vnode exclusively. This fixes few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more... Discussed with: kib, ups Approved by: re (rwatson)	2007-07-26 16:58:09 +00:00
delphij	41fd4384d1	MFp4: Force 64-bit arithmetic when calculating the maximum file size. This fixes tmpfs calculations on 32-bit systems equipped with more than 4GB swap. Reported by: Craig Boston <craig xfoil gank org> PR: kern/114870 Approved by: re (tmpfs blanket)	2007-07-24 17:14:53 +00:00
bde	3a5cb59d83	Make using msdosfs as the root file system sort of work: o Initialize ownerships and permissions. They were garbage (0) for root mounts since vfs_mountroot_try() doesn't ask for them to be set and msdosfs's old incomplete code to set them was removed. The garbage happened to give the correct ownerships root:wheel, but it gave permissions 000 so init could not be execed. Use the macros for root:wheel and 0755. (The removed code gave 0:0 and 0777. 0755 is more normal and secure, though wrong for /tmp.) o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the correct place, as in ffs. For root mounts, it is only passed in mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag and nothing translates the flag to the "ro" option string. msdosfs only looked for it in the string, so it gave a rw mount for root mounts without even clearing the flag in mp->mnt_flags, so the final state was inconsistent. Checking the flag only in mp->mnt_flags works for initial userland mounts too. The MNT_UPDATE case is messier. The main point that should work but doesn't is fsck of msdosfs root while it is mounted ro. This needs mainly MNT_RELOAD support to work. It should be possible to run fsck -p and succeed provided the fs is consistent, not just for msdosfs, but this fails because fsck -p always tries to open the device rw. The hack that allows open for writing in ffs is not implemented in msdosfs, since without MNT_RELOAD support writing could only be harmful. So fsck must be turned off to use msdosfs as root. This is quite dangerous, since msdosfs is still missing actually using its fs-dirty flag internally, so it is happy to mount dirty filesystems rw. Unrelated changes: - Fix missing error handling for MNT_UPDATE from rw to ro. - Catch up with renaming msdos to msdosfs in a string. Approved by: re (kensmith)	2007-07-23 07:10:17 +00:00
delphij	0321a712a7	MFp4: When swapping is not enabled, allow creating files by taking physical memory pages into account for tm_maxfilesize. Reported by: Dominique Goncalves <dominique.goncalves gmail.com> Submitted by: Howard Su Approved by: re (tmpfs blanket)	2007-07-23 06:54:58 +00:00
bde	8fdd1a79d0	Implement vfs clustering for msdosfs. This gives a very large speedup for small block sizes (in my tests, about 5 times for write and 3 times for read with a block size of 512, if clustering is possible) and a moderate speedup for the moderately large block sizes that should be used on non-small media (4K is the best size in most cases, and the speedup for that is about 1.3 times for write and 1.2 times for read). mmap() should benefit from clustering like read()/write(), but the current implementation of vm only supports clustering (at least for getpages) if the fs block size is >= PAGE_SIZE. msdosfs is now only slightly slower than ffs with soft updates for writing and slightly faster for reading when both use their best block sizes. Writing is slower for msdosfs because of more sync writes. Reading is faster for msdosfs because indirect blocks interfere with clustering in ffs. The changes in msdosfs_read() and msdosfs_write() are simpler merges of corresponding code in ffs (after fixing some style bugs in ffs). msdosfs_bmap() needs fs-specific code. This implementation loops calling a lower level bmap function to do the hard parts. This is a bit inefficient, but is efficient enough since msdosfs_bmap() is only called when there is physical i/o to do. Approved by: re (hrs)	2007-07-20 17:06:57 +00:00
bde	b2bdcce9e1	Clean up before implementing vfs clustering for msdosfs: In msdosfs_read(), mainly reorder the main loop to the same order as in ffs_read(). In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of clrbuf(). I think this is just a bogus optimization, but ffs always does it and msdosfs already did it in one place, and it is what I've tested. In msdosfs_write(), merge good bits from a comment in ffs_write(), and fix 1 style bug. In the main comment for msdosfs_pcbmap(), improve wording and catch up with 13 years of changes in the function. This comment belongs in VOP_BMAP.9 but that doesn't exist. In msdosfs_bmap(), return EFBIG if the requested cluster number is out of bounds instead of blindly truncating it, and fix many style bugs. Approved by: re (hrs)	2007-07-20 16:21:47 +00:00
rwatson	f1da927af0	Make sure we release the control vnode in Coda: We allocate coda_ctlvp when /coda is mounted, but never release it. During the unmount this vnode was marked as UNMOUNTING and when venus is started a second time the system would hang, possibly waiting for the old vnode to disappear. So now we call vrele on the control vnode when file system is unmounted to drop the reference we got during the mount. I'm pretty sure it is also necessary to not skip the handling in coda_inactive for the control vnode, it seems like that is the place we actually get rid of the vnode once the refcount has dropped to 0. Submitted by: Jan Harkes <jaharkes at cs dot cmu dot edu> Approved by: re (kensmith)	2007-07-20 11:14:51 +00:00
delphij	d6ea8c65f3	MFp4: Rework on tmpfs's mapped read/write procedures. This should finally fix fsx test case. The printf's added here would be eventually turned into assertions. Submitted by: Mingyan Guo (mostly) Approved by: re (tmpfs blanket)	2007-07-19 03:34:50 +00:00
rwatson	25869d84fa	Complete repo-copy and move of Coda from src/sys/coda to src/sys/fs/coda by removing files from src/sys/coda, and updating include paths in the new location, kernel configuration, and Makefiles. In one case add $FreeBSD$. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 21:04:58 +00:00
rwatson	fd977ace64	Forced commit to recognize repo-copy of Coda files from src/sys/coda to src/sys/fs/coda. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 20:40:38 +00:00
bde	dba17f21d3	Round up the FAT block size to a multiple of the sector size so that i/o to the FAT is possible. Make the FAT block size less arbitrary before it is rounded up: - for FAT12, default to 3*512 instead of to 3 sectors. The magic 3 is the default number of 512-byte FAT sectors on a floppy drive. That many sectors is too many if the sector size is larger. - for !FAT12, default to PAGE_SIZE instead of to 4096. Remove MSDOSFS_DFLTBSIZE since it only obfuscated this 4096. For reading the BPB, use a block size of 8192 instead of 2048 so that sector sizes up to 8192 can work. We should try several sizes, or just try the maximum supported size (MAXBSIZE = 64K). I use 8192 because that is enough for DVD-RW's (even 2048 is enough) and 8192 has been tested a lot in use by ffs. This completes fixing msdosfs for some large sector sizes (up to 8K for read and 64K for write). Microsoft documents support for sector sizes up to 4K in msdosfs. ffs is currently limited to 8K for both read and write. Approved by: re (kensmith) Approved by: nyan (several years ago)	2007-07-12 17:17:47 +00:00
bde	01c1ec9b1a	Fix some bugs involving the fsinfo block (many remain unfixed). This is part of fixing msdosfs for large sector sizes. One of the fixed bugs was fatal for large sector sizes. 1. The fsinfo block has size 512, but it was misunderstood and declared as having size 1024, with nothing in the second 512 bytes except a signature at the end. The second 512 bytes actually normally (if the file system was created by Windows) consist of a second boot sector which is normally (in WinXP) empty except for a signature -- the normal layout is one boot sector, one fsinfo sector, another boot sector, then these 3 sectors duplicated. However, other layouts are valid. newfs_msdos produces a valid layout with one boot sector, one fsinfo sector, then these 2 sectors duplicated. The signature check for the extra part of the fsinfo was thus normally checking the signature in either the second boot sector or the first boot sector in the copy, and thus accidentally succeeding. The extra signature check would just fail for weirder layouts with 512-byte sectors, and for normal layouts with any other sector size. Remove the extra bytes and the extra signature check. 2. Old versions did i/o to the fsinfo block using size 1024, with the second half only used for the extra signature check on read. This was harmless for sector size 512, and worked accidentally for sector size 1024. The i/o just failed for larger sector sizes. The version being fixed did i/o to the fsinfo block using size fsi_size(pmp) = (1024 << ((pmp)->pm_BlkPerSec >> 2)). This expression makes no sense. It happens to work for small sector sizes, but for sector size 32K it gives the preposterous value of 64M and thus causes panics. A sector size of 32768 is necessary for at least some DVD-RW's (where the minimum write size is 32768 although the minimum read size is 2048). Now that the size of the fsinfo block is 512, it always fits in one sector so there is no need for a macro to express it. Just use the sector size where the old code uses 1024. Approved by: re (kensmith) Approved by: nyan (several years ago for a different version of (2))	2007-07-12 16:09:07 +00:00
rwatson	e9fe8191d7	Fix ioctls on the control vnode: ioctls on a character device fail with ENOTTY. Make the control vnode a regular file so that ioctls are passed through to our kernel module. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:34:41 +00:00
rwatson	1c2785e3fe	Avoid a panic in insmntque when we pass a NULL mount: this reenables some previously disabled code which according to the comment caused a problem during shutdown. But even that is still better than triggering a kernel panic whenever venus is started. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:33:46 +00:00
rwatson	4f942d92aa	Replace CODA_OPEN with CODA_OPEN_BY_FD: coda_open was disabled because we can't open container files by device/inode number pair anymore. Replace the CODA_OPEN upcall with CODA_OPEN_BY_FD, where venus returns an open file descriptor for the container file. We can then grab a reference on the vnode coda_psdev.c:vc_nb_write and use this vnode for further accesses to the container file. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:32:08 +00:00
rwatson	f30a555ada	Resolve Coda mount failing because Coda failed to match the device operations. But we don't have to, if we find the coda_mntinfo structure for this device in our linked list, we know the device is good. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:21:55 +00:00
rwatson	3287a2c53f	Avoid crash when opening Coda device: when allocating coda_mntinfo, we need to initialize dev so that we can actually find the allocated coda_mntinfo structure later on. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 20:39:53 +00:00
delphij	ff9bf4b792	MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes. Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)	2007-07-11 14:26:27 +00:00
bde	fb1dc96e72	Don't use almost perfectly pessimal cluster allocation. Allocation of the first cluster in a file (and, if the allocation cannot be continued contiguously, for subsequent clusters in a file) was randomized in an attempt to leave space for contiguous allocation of subsequent clusters in each file when there are multiple writers. This reduced internal fragmentation by a few percent, but it increased external fragmentation by up to a few thousand percent. Use simple sequential allocation instead. Actually maintain the fsinfo sequence index for this. The read and write of this index from/to disk still have many non-critical bugs, but we now write an index that has something to do with our allocations instead of being modified garbage. If there is no fsinfo on the disk, then we maintain the index internally and don't go near the bugs for writing it. Allocating the first free cluster gives a layout that is almost as good (better in some cases), but takes too much CPU if the FAT is large and the first free cluster is not near the beginning. The effect of this change for untar and tar of a slightly reduced copy of /usr/src on a new file system was: Before (msdosfs 4K-clusters): untar: 459.57 real untar from cached file (actually a pipe) tar: 342.50 real tar from uncached tree to /dev/zero Before (ffs2 soft updates 4K-blocks 4K-frags) untar: 39.18 real tar: 29.94 real Before (ffs2 soft updates 16K-blocks 2K-frags) untar: 31.35 real tar: 18.30 real After (msdosfs 4K-clusters): untar 54.83 real tar 16.18 real All of these times can be improved further. With multiple concurrent writers or readers (especially readers), the improvement is smaller, but I couldn't find any case where it is negative. 342 seconds for tarring up about 342 MB on a ~47MB/S partition is just hard to unimprove on. (This operation would take about 7.3 seconds with reasonably localized allocation and perfect read-ahead.) However, for active file systems, 342 seconds is closer to normal than the 16+ seconds above or the 11 seconds with other changes (best I've measured -- won easily by msdosfs!). E.g., my active /usr/src on ffs1 is quite old and fragmented, so reading to prepare for the above benchmark takes about 6 times longer than reading back the fresh copies of it. Approved by: re (kensmith)	2007-07-10 13:20:24 +00:00
delphij	b4ff595242	MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user wants root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein in order to prevent data corruption. - Fix several bugs reported by Coverity. Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)	2007-07-08 15:56:12 +00:00
kib	0ae42a4095	Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bump device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method. destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the events clone to make sure no lingering devices are left after dev_clone event handler deregistered. make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that created device has reference counter bumped before cdev mutex is dropped inside make_dev(). Reviewed by: tegge (early versions), njl (programming interface) Debugging help and testing by: Peter Holm Approved by: re (kensmith)	2007-07-03 17:42:37 +00:00
delphij	f5523801a3	MFp4: - Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already act that way. [2] Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)	2007-06-29 05:23:15 +00:00
delphij	66a8b537d6	Space/style cleanups after last set of commits. Approved by: re (tmpfs blanket)	2007-06-28 02:39:31 +00:00
delphij	17bb881f75	Staticify most of fifo/vn operations, they should not be directly exposed outside. Approved by: re (tmpfs blanket)	2007-06-28 02:36:41 +00:00
delphij	ee97250932	Use vfs_timestamp instead of nanotime when obtaining a timestamp for use with timekeeping. Approved by: re (tmpfs blanket)	2007-06-28 02:34:32 +00:00
delphij	eedf8bf53d	Reorder tf_gen and tf_id in struct tmpfs_fid. This saves 8 bytes on amd64 architecture. Obtained from: NetBSD Approved by: re (tmpfs blanket)	2007-06-28 02:32:44 +00:00
delphij	a404044c90	Remove two function prototypes that are no longer used. Approved by: re (tmpfs blanket)	2007-06-26 02:08:29 +00:00
delphij	73fffc81bc	- Sync with NetBSD's RCSID (HEAD preferred). - Correct a typo. Approved by: re (tmpfs blanket)	2007-06-26 02:07:08 +00:00
delphij	a78e2646a2	MFp4: Several clean-ups and improvements over tmpfs: - Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since they do not bring any value now. - Use |= instead of = when applying VV_ROOT flag. - Remove tm_avariable_nodes list. Use uma to hold the released nodes. - init/destroy interlock mutex of node when init/fini instead of ctor/dtor. - Change memory computing using u_int to fix negative value in 2G mem machine. - Remove unnecessary bzero's - Rely on uma logic to make file id allocation harder to guess. - Fix some unsigned/signed related things. Make sure we respect -o size=xxxx - Use wire instead of holding a page. - Pass allocate_zero to obtain zeroed pages upon first use. Submitted by: Howard Su Approved by: re (tmpfs blanket, kensmith)	2007-06-25 18:46:13 +00:00
rafan	ff392b04b7	- Remove UMAP filesystem. It was disconnected from build three years ago, and it is seriously broken. Discussed on: freebsd-arch@ Approved by: re (mux)	2007-06-25 05:06:57 +00:00
delphij	1da67b5003	Use vfs_timestamp() instead of nanotime() - make it up to the user to make decisions about how detail they wanted timestamps to have.	2007-06-18 14:40:19 +00:00
delphij	d50b261fe6	MFp4: fix two locking problems: - Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove. This will hopefully fix all known panics. Submitted by: Howard Su	2007-06-18 01:43:13 +00:00
delphij	ef518e4122	MFp4: Add tmpfs, an efficient memory file system. Please note that this is currently considered an experimental feature, so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information. For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms. This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder have worked on it to continue this effort. Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)	2007-06-16 01:56:05 +00:00
rwatson	00b02345d4	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
remko	5983912c3d	Correct corrupt read when the read starts at a non-aligned offset. PR: kern/77234 MFC After: 1 week Approved by: imp (mentor) Requested by: many many people Submitted by: Andriy Gapon <avg at icyb dot net dot ua>	2007-06-11 20:14:44 +00:00
attilio	12d804e413	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc which wrappers both calls to be performed in atomic way. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
bmah	229a009b9d	Fix off-by-one error (introduced in r1.60) that had the effect of disallowing a read of exactly MAXPHYS bytes. Reviewed by: des, rdivacky MFC after: 1 week Sponsored by: nCircle Network Security	2007-06-07 15:04:30 +00:00
jeff	91d1501790	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
attilio	9bd4fdf7ce	Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:45:18 +00:00
trhodes	4173ae4155	Revert previous, part of NFS that I didn't know about.	2007-06-01 17:06:46 +00:00
trhodes	aae93b87b9	Garbage collect msdosfs_fhtovp; it appears unused and I have been using MSDOSFS without this function and problems for the last month.	2007-06-01 14:57:19 +00:00
kib	17260ba6f1	Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file: part 2. Convert calls missed in the first big commit. Noted by: rwatson Pointy hat to: kib	2007-06-01 14:33:11 +00:00
attilio	7dd8ed88a9	Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
kib	f13486a222	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
rwatson	be966bcc03	Where I previously removed calls to kdb_enter(), now remove include of kdb.h. Pointed out by: bde	2007-05-29 11:28:28 +00:00
rwatson	d162983163	Rather than entering the debugger via kdb_enter() when detecting memory corruption under SMBUFS_NAME_DEBUG, panic() with the same error message.	2007-05-27 13:12:36 +00:00
rwatson	b3193c8a43	Rather than entering the debugger via kdb_enter() in the event the root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.	2007-05-27 13:10:16 +00:00
kib	162fa8dc6d	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no more generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
jeff	e1996cb960	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
des	cde2655812	The process lock is held when procfs_ioctl() is called. Assert that this is so, and PHOLD the process while sleeping since msleep() will release the lock.	2007-05-01 12:59:20 +00:00
des	d5a4cf1cbe	Fix old locking bugs which were revealed when pseudofs was made MPSAFE. Submitted by: tegge	2007-04-23 19:17:01 +00:00
rwatson	62d2d15116	Rename macdevfsdirent() to macdevfs() to synchronize with SEDarwin, where similar data structures exist to support devfs and the MAC Framework, but are named differently. Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.	2007-04-23 13:36:54 +00:00
alc	11f5869ec4	Add synchronization. Eliminate the acquisition and release of Giant. Reviewed by: tegge	2007-04-23 06:12:24 +00:00
trhodes	a559ca6419	In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish workaround, but it does work and I have not experienced any negatives in my testing. Discussed with: bde "Ok with me": phk	2007-04-20 01:47:05 +00:00
des	6a33fe5e57	Avoid "unused variable" warning when building without PSEUDOFS_TRACE.	2007-04-15 20:35:18 +00:00
des	4632956a5b	Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.	2007-04-15 17:10:01 +00:00
des	29fb9c20c7	Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.	2007-04-15 17:06:09 +00:00

2058 Commits