freebsd-skq

Author	SHA1	Message	Date
Bruce Evans	9e916c3163	Add noclusterr and noclusterw options to the options list. I forgot these when I implemented clustering.	2007-10-18 16:25:47 +00:00
Bruce Evans	7c3fc9de5c	Fix some style bugs in the mount options list. Mainly, sort the list, leaving space for adding missing options. Negative options are sorted after removing their "no" prefix, and generic options are sorted before msdosfs-specific ones.	2007-10-18 15:48:10 +00:00
Bruce Evans	cefb55828f	In msdosfs_setattr(), don't do synchronous updates of the denode (except indirectly for the size pseudo-attribute). If anything deserves a sync update, then it is ids and immutable flags, since these are related to security, but ffs never synced these and msdosfs doesn't support them. (ufs_setattr() only does an update in one case where it is least needed (for timestamps); it did pessimal sync updates for timestamps until 1998/03/08 but was changed for unlogged reasons related to soft updates.) Now msdosfs calls deupdat() with waitfor == 0, which normally gives a delayed update to disk but always gives a sync update of timestamps in core, while for ffs everything is delayed until the syncer daemon or other activity causes an update (except for timestamps). This gives a large optimization mainly for things like cp -p, where attribute adjustment could easily triple the number of physical I/O's if it is done synchronously (but cp -p to msdosfs is not as bad as that, since msdosfs doesn't support many attributes so null adjustments are more common, and msdosfs doesn't support ctimes so even if cp doesn't weed out null adjustments they don't become non-null after clobbering the ctime).	2007-10-18 07:26:21 +00:00
Alfred Perlstein	77465d9390	Get rid of qaddr_t. Requested by: bde	2007-10-16 10:54:55 +00:00
Daichi GOTO	1016626062	This change makes nullfs work correctly with the latest unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:57:11 +00:00
Daichi GOTO	20885def58	Added whiteout behavior option. ``-o whiteout=always'' is default mode (it is established practice) and ``-o whiteout=whenneeded'' is less disk-space using mode especially for resource restricted environments like embedded environments. (Contributed by Ed Schouten. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:55:38 +00:00
Daichi GOTO	524f3f285d	Default copy mode has been changed from traditional-mode to transparent-mode. Some folks who have reported some issues have solved with transparent mode. We guess it is time to change the default copy mode. The transparent-mode is the best in most situations. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:53:38 +00:00
Daichi GOTO	7d72c5e67d	Fixed un-vrele issue of upper layer root vnode of unionfs. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:52:01 +00:00
Daichi GOTO	6c98d0e9db	Added NULL check code pointed out by Coverity. (via Stanislav Sedov. Thanks) Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:50:58 +00:00
Daichi GOTO	57821163d3	- It has become MPSAFE. - Fixed lock panic issue under MPSAFE. - Fixed panic issue whenever it locks vnode with reclaim. - Fixed lock implementations not conforming to vnode_if.src style. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:49:30 +00:00
Daichi GOTO	7e0c899579	Fixed vnode unlock/vrele untreated issues whenever errors have occurred during some treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:47:44 +00:00
Daichi GOTO	dc2dd18518	- Added support for vfs_cache on unionfs. As a result, you can use applications that use procfs on unionfs. - Removed unionfs internal cache mechanism because it has vfs_cache support instead. As a result, it just simplified code of unionfs. - Fixed kern/111262 issue. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:46:11 +00:00
Daichi GOTO	5adc408078	Added treatments to prevent a readdir infinite loop when used with the Linux binary compatibility feature. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:44:06 +00:00
Daichi GOTO	b2b0db08c5	Changed it to free unneeded memory ASAP. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:42:05 +00:00
Daichi GOTO	3282e2c406	Improved access permission check treatments. Submitted by: Masanori Ozawa <ozawa@ongs.co.jp> (unionfs developer) Reviewed by: jeff, kensmith Approved by: re (kensmith) MFC after: 1 week	2007-10-14 13:37:52 +00:00
John Baldwin	c1f7cf23b1	Use the correct pid when checking to see whether or not the /proc/<pid> directory itself (rather than any of its contents) is visible to the current thread. MFC after: 1 week PR: kern/90063 Submitted by: john of 8192.net Approved by: re (kensmith)	2007-10-05 17:37:25 +00:00
Xin LI	3543c1b429	MFp4: Provide a dummy verb "export" to shut up the message showed up at start when NFS is enabled. Reported by: rafan Approved by: re (tmpfs blanket)	2007-10-04 17:11:48 +00:00
Xin LI	386c969205	Additional work is still needed before we can claim that tmpfs is stable enough for production usage. Warn user upon mount. Approved by: re (tmpfs blanket)	2007-10-04 17:08:46 +00:00
Bruce Evans	ed316d339f	Remove some of the pessimizations involving writing the fsi sector. All active fields in fsi are advisory/optional, so we shouldn't do extra work to make them valid at all times, but instead we write to the fsi too often (we still do), and we searched for a free cluster for fsinxtfree too often. This commit just removes the whole search and its results, so that we write out our in-core copy of fsinxtfree instead of writing a "fixed" copy and clobbering our in-core copy. This saves fixing 3 bugs: - off-by-1 error for the end of the search, resulting in fsinxtfree not actually being adjusted iff only the last cluster is free. - missing adjustment when no clusters are free. - off-by-many error for the start of the search. Starting the search at 0 instead of at (the in-core copy of) fsinxtfree did more than defeat the reasons for existence of fsinxtfree. fsinxtfree exists mainly to avoid having to start at 0 for just the first search per mount, but has the side effect of reducing bias towards allocating near cluster 0. The bias would normally only be generated by the first search per mount (if fsinxtfree is not supported), but since we also adjusted the in-core copy of fsinxtfree here, we were doing extra work to maximize the bias. Approved by: re (kensmith)	2007-09-23 14:49:32 +00:00
Craig Rodrigues	00cedf971b	Disable multiple ntfs mounts to the same mountpoint. Eliminates panics due to locking issues. Idea taken from src/sys/gnu/fs/xfs/FreeBSD/xfs_super.c. PR: 89966, 92000, 104393 Reported by: H. Matsuo <hiroshi50000 yahoo co jp>, Chris <m2chrischou gmail.com>, Andrey V. Elsukov <bu7cher yandex ru>, Jan Henrik Sylvester <me janh de> Approved by: re (kensmith)	2007-09-21 23:50:15 +00:00
Jeff Roberson	b61ce5b0e6	- Move all of the PS_ flags into either p_flag or td_flags. - p_sflag was mostly protected by PROC_LOCK rather than the PROC_SLOCK or previously the sched_lock. These bugs have existed for some time. - Allow swapout to try each thread in a process individually and then swapin the whole process if any of these fail. This allows us to move most scheduler related swap flags into td_flags. - Keep ki_sflag for backwards compat but change all in source tools to use the new and more correct location of P_INMEM. Reported by: pho Reviewed by: attilio, kib Approved by: re (kensmith)	2007-09-17 05:31:39 +00:00
Bruce Evans	c2819440b3	Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions can easily block in bread(), and then there was nothing to prevent the static buffer (nambuf_{ptr,len,last_id}) being clobbered by another thread. The effects of the bug seem to have been limited to failed lookups and mangled names in readdir(), since Giant locking provides enough serialization to prevent concurrent calls to the functions that access the buffer. They were very obvious for multiple concurrent tree walks, especially with a small cluster size. The bug was introduced in msdosfs_conv.c 1.34 and associated changes, and is in all releases starting with 5.2. The fix is to allocate the buffer as a local variable and pass around pointers to it like "_r" functions in libc do. Stack use from this is large but not too large. This also fixes a memory leak on module unload. Reviewed by: kib Approved by: re (kensmith)	2007-08-31 22:29:55 +00:00
Xin LI	1f32d0127b	MFp4: rework tmpfs_readdir() logic in terms of correctness. Approved by: re (tmpfs blanket) Tested with: fstest, fsx	2007-08-16 11:00:07 +00:00
John Baldwin	1dc5b1cc56	On 6.x this works: % mount | grep home /dev/ad4s1e on /home (ufs, local, noatime, soft-updates) % mount -u -o atime /home % mount | grep home /dev/ad4s1e on /home (ufs, local, soft-updates) Restore this behavior on 7.x for the following mount options: noatime, noclusterr, noclusterw, noexec, nosuid, nosymfollow In addition, on 7.x, the following are equivalent: mount -u -o atime /home mount -u -o nonoatime /home Ideally, when we introduce new mount options, we should avoid options starting with "no". :) Requested by: jhb Reported by: Karol Kwiat <karol.kwiat gmail com>, Scott Hetzel <swhetzel gmail com> Approved by: re (bmah) Proxy commit for: rodrigc	2007-08-15 17:40:09 +00:00
Xin LI	ad3638ee08	MFp4: - LK_RETRY prohibits vget() and vn_lock() to return error. Remove associated code. [1] - Properly use vhold() and vdrop() instead of their unlocked versions, we are guaranteed to have the vnode's interlock unheld. [1] - Fix a pseudo-infinite loop caused by 64/32-bit arithmetic with the same way used in modern NetBSD versions. [2] - Reorganize tmpfs_readdir to reduce duplicated code. Submitted by: kib [1] Obtained from: NetBSD [2] Approved by: re (tmpfs blanket)	2007-08-10 11:00:30 +00:00
Xin LI	0ae6383d39	MFp4: - Respect cnflag and don't lock vnode always as LK_EXCLUSIVE [1] - Properly lock around tn_vnode to avoid NULL dereference - Be more careful handling vnodes () () This is a WIP [1] by pjd via howardsu Thanks kib@ for his valuable VFS related comments. Tested with: fsx, fstest, tmpfs regression test set Found by: pho's stress2 suite Approved by: re (tmpfs blanket)	2007-08-10 05:24:49 +00:00
Bruce Evans	a4e6807c49	In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG). In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check. In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory. In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs. In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal. Approved by: re (kensmith) (blanket)	2007-08-07 10:35:27 +00:00
Bruce Evans	b7837a91c9	Fix and update the comments about the effect of the read-only flag on writing. They are still too verbose. Remove nearby unreachable code for handling symlinks. Approved by: re (kensmith) (blanket)	2007-08-07 05:42:10 +00:00
Bruce Evans	e3117f852e	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix some whitespace errors; fix only one case of a boolean comparison of a non-boolean). Improve an error message by quoting ".", and by not printing large positive values as negative ones. Approved by: re (kensmith) (blanket)	2007-08-07 03:59:49 +00:00
Bruce Evans	c0f5121cac	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix only a couple of whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:43:28 +00:00
Bruce Evans	2d7c6b2724	Fix some style bugs (mainly some whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:38:36 +00:00
Bruce Evans	b6d0381e7e	Fix some style bugs (some whitespace errors only). Approved by: re (kensmith) (blanket)	2007-08-07 03:22:10 +00:00
Bruce Evans	d2bb66bacd	Sort includes. Remove rotted banal comment attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:28:33 +00:00
Bruce Evans	6becd1c855	Sort includes. Remove banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:27:35 +00:00
Bruce Evans	5696c6e0b2	Sort includes. Remove banal comments before includes. Remove rotted banal comments attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:20:37 +00:00
Bruce Evans	9b0802c90b	Remove unused include(s). Remove banal comments before includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:11:16 +00:00
Bruce Evans	a878a31c13	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 02:08:06 +00:00
Bruce Evans	eba34270fa	Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h> Approved by: re (kensmith) (blanket)	2007-08-07 01:40:27 +00:00
Bruce Evans	1103771d95	Include <sys/mutex.h>'s prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/vnode.h>. Sort the include of <sys/mutex.h> instead of unsorting it after <sys/vnode.h> and depending on the pollution there. Approved by: re (kensmith) (blanket)	2007-08-07 01:37:59 +00:00
Bruce Evans	6fd81fc7a6	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 01:07:16 +00:00
Bruce Evans	8d61a735c6	Silently fix up the estimated next free cluster number from the fsinfo sector, instead of failing the whole mount if it is garbage. Fields in the fsinfo sector are only advisory, so there are better sanity checks than this, and we already silently fix up the only other advisory field in the fsinfo (the free cluster count). This wasn't handled quite right in rev.1.92, 1.117, or in NetBSD. 1.92 also failed the whole mount for the non-garbage magic value 0xffffffff. 1.117 fixed this well enough in practice since garbage values shouldn't occur in practice, but left the error handling larger and more convoluted than necessary. Now we handle the magic value as a special case of fixing up all out of bounds values. Also fix up the estimated next free cluster number when there is no fsinfo sector. We were using 0, but CLUST_FIRST is safer. Approved by: re (kensmith)	2007-08-05 12:58:34 +00:00
Bruce Evans	3726942956	Oops, fix the fix for the i/o size of the fsinfo block. Its log message explained why the size is 1 sector, but the code used a size of 1 cluster. I/o sizes larger than necessary may cause serious coherency problems in the buffer cache. Here I think there were only minor efficiency problems, since a too-large fsinfo buffer could only get far enough to overlap buffers for the same vnode (the device vnode), so mappings are coherent at the page level although not at the buffer level, and the former is probably enough due to our limited use of the fsinfo buffer. Approved by: re (kensmith)	2007-08-03 23:13:50 +00:00
Xin LI	fb7557140e	MFp4 - Refine locking to eliminate some potential race/panics: - Copy before testing a pointer. This closes a race window. - Use msleep with the node interlock instead of tsleep. - Do proper locking around access to tn_vpstate. - Assert vnode VOP lock for dir_{atta,de}tach to capture inconsistent locking. Suggested by: kib Submitted by: delphij Reviewed by: Howard Su Approved by: re (tmpfs blanket)	2007-08-03 06:24:31 +00:00
Pawel Jakub Dawidek	57fd3d5572	When we do open, we should lock the vnode exclusively. This fixes a few races: - fifo race, where two threads assign v_fifoinfo, - v_writecount modifications, - v_object modifications, - and probably more... Discussed with: kib, ups Approved by: re (rwatson)	2007-07-26 16:58:09 +00:00
Xin LI	f62e5595fd	MFp4: Force 64-bit arithmetic when calculating the maximum file size. This fixes tmpfs calculations on 32-bit systems equipped with more than 4GB swap. Reported by: Craig Boston <craig xfoil gank org> PR: kern/114870 Approved by: re (tmpfs blanket)	2007-07-24 17:14:53 +00:00
Bruce Evans	4eb3abf0a5	Make using msdosfs as the root file system sort of work: o Initialize ownerships and permissions. They were garbage (0) for root mounts since vfs_mountroot_try() doesn't ask for them to be set and msdosfs's old incomplete code to set them was removed. The garbage happened to give the correct ownerships root:wheel, but it gave permissions 000 so init could not be execed. Use the macros for root:wheel and 0755. (The removed code gave 0:0 and 0777. 0755 is more normal and secure, though wrong for /tmp.) o Check the readonly flag for initial (non-MNT_UPDATE) mounts in the correct place, as in ffs. For root mounts, it is only passed in mp->mnt_flags, since vfs_mountroot_try() only passes it as a flag and nothing translates the flag to the "ro" option string. msdosfs only looked for it in the string, so it gave a rw mount for root mounts without even clearing the flag in mp->mnt_flags, so the final state was inconsistent. Checking the flag only in mp->mnt_flags works for initial userland mounts too. The MNT_UPDATE case is messier. The main point that should work but doesn't is fsck of msdosfs root while it is mounted ro. This needs mainly MNT_RELOAD support to work. It should be possible to run fsck -p and succeed provided the fs is consistent, not just for msdosfs, but this fails because fsck -p always tries to open the device rw. The hack that allows open for writing in ffs is not implemented in msdosfs, since without MNT_RELOAD support writing could only be harmful. So fsck must be turned off to use msdosfs as root. This is quite dangerous, since msdosfs is still missing actually using its fs-dirty flag internally, so it is happy to mount dirty filesystems rw. Unrelated changes: - Fix missing error handling for MNT_UPDATE from rw to ro. - Catch up with renaming msdos to msdosfs in a string. Approved by: re (kensmith)	2007-07-23 07:10:17 +00:00
Xin LI	7280082944	MFp4: When swapping is not enabled, allow creating files by taking physical memory pages into account for tm_maxfilesize. Reported by: Dominique Goncalves <dominique.goncalves gmail.com> Submitted by: Howard Su Approved by: re (tmpfs blanket)	2007-07-23 06:54:58 +00:00
Bruce Evans	6b6c5f5ef9	Implement vfs clustering for msdosfs. This gives a very large speedup for small block sizes (in my tests, about 5 times for write and 3 times for read with a block size of 512, if clustering is possible) and a moderate speedup for the moderately large block sizes that should be used on non-small media (4K is the best size in most cases, and the speedup for that is about 1.3 times for write and 1.2 times for read). mmap() should benefit from clustering like read()/write(), but the current implementation of vm only supports clustering (at least for getpages) if the fs block size is >= PAGE_SIZE. msdosfs is now only slightly slower than ffs with soft updates for writing and slightly faster for reading when both use their best block sizes. Writing is slower for msdosfs because of more sync writes. Reading is faster for msdosfs because indirect blocks interfere with clustering in ffs. The changes in msdosfs_read() and msdosfs_write() are simpler merges of corresponding code in ffs (after fixing some style bugs in ffs). msdosfs_bmap() needs fs-specific code. This implementation loops calling a lower level bmap function to do the hard parts. This is a bit inefficient, but is efficient enough since msdosfs_bmap() is only called when there is physical i/o to do. Approved by: re (hrs)	2007-07-20 17:06:57 +00:00
Bruce Evans	d34b0a1bac	Clean up before implementing vfs clustering for msdosfs: In msdosfs_read(), mainly reorder the main loop to the same order as in ffs_read(). In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of clrbuf(). I think this is just a bogus optimization, but ffs always does it and msdosfs already did it in one place, and it is what I've tested. In msdosfs_write(), merge good bits from a comment in ffs_write(), and fix 1 style bug. In the main comment for msdosfs_pcbmap(), improve wording and catch up with 13 years of changes in the function. This comment belongs in VOP_BMAP.9 but that doesn't exist. In msdosfs_bmap(), return EFBIG if the requested cluster number is out of bounds instead of blindly truncating it, and fix many style bugs. Approved by: re (hrs)	2007-07-20 16:21:47 +00:00
Robert Watson	825eaf3470	Make sure we release the control vnode in Coda: We allocate coda_ctlvp when /coda is mounted, but never release it. During the unmount this vnode was marked as UNMOUNTING and when venus is started a second time the system would hang, possibly waiting for the old vnode to disappear. So now we call vrele on the control vnode when file system is unmounted to drop the reference we got during the mount. I'm pretty sure it is also necessary to not skip the handling in coda_inactive for the control vnode, it seems like that is the place we actually get rid of the vnode once the refcount has dropped to 0. Submitted by: Jan Harkes <jaharkes at cs dot cmu dot edu> Approved by: re (kensmith)	2007-07-20 11:14:51 +00:00
Xin LI	c5be778305	MFp4: Rework on tmpfs's mapped read/write procedures. This should finally fix fsx test case. The printf's added here would be eventually turned into assertions. Submitted by: Mingyan Guo (mostly) Approved by: re (tmpfs blanket)	2007-07-19 03:34:50 +00:00
Robert Watson	00f05dc847	Complete repo-copy and move of Coda from src/sys/coda to src/sys/fs/coda by removing files from src/sys/coda, and updating include paths in the new location, kernel configuration, and Makefiles. In one case add $FreeBSD$. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 21:04:58 +00:00
Robert Watson	d21e51d059	Forced commit to recognize repo-copy of Coda files from src/sys/coda to src/sys/fs/coda. Discussed with: anderson, Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith) Repo-copy madness: simon	2007-07-12 20:40:38 +00:00
Bruce Evans	93fe42b62f	Round up the FAT block size to a multiple of the sector size so that i/o to the FAT is possible. Make the FAT block size less arbitrary before it is rounded up: - for FAT12, default to 3*512 instead of to 3 sectors. The magic 3 is the default number of 512-byte FAT sectors on a floppy drive. That many sectors is too many if the sector size is larger. - for !FAT12, default to PAGE_SIZE instead of to 4096. Remove MSDOSFS_DFLTBSIZE since it only obfuscated this 4096. For reading the BPB, use a block size of 8192 instead of 2048 so that sector sizes up to 8192 can work. We should try several sizes, or just try the maximum supported size (MAXBSIZE = 64K). I use 8192 because that is enough for DVD-RW's (even 2048 is enough) and 8192 has been tested a lot in use by ffs. This completes fixing msdosfs for some large sector sizes (up to 8K for read and 64K for write). Microsoft documents support for sector sizes up to 4K in msdosfs. ffs is currently limited to 8K for both read and write. Approved by: re (kensmith) Approved by: nyan (several years ago)	2007-07-12 17:17:47 +00:00
Bruce Evans	fd7c4230b2	Fix some bugs involving the fsinfo block (many remain unfixed). This is part of fixing msdosfs for large sector sizes. One of the fixed bugs was fatal for large sector sizes. 1. The fsinfo block has size 512, but it was misunderstood and declared as having size 1024, with nothing in the second 512 bytes except a signature at the end. The second 512 bytes actually normally (if the file system was created by Windows) consist of a second boot sector which is normally (in WinXP) empty except for a signature -- the normal layout is one boot sector, one fsinfo sector, another boot sector, then these 3 sectors duplicated. However, other layouts are valid. newfs_msdos produces a valid layout with one boot sector, one fsinfo sector, then these 2 sectors duplicated. The signature check for the extra part of the fsinfo was thus normally checking the signature in either the second boot sector or the first boot sector in the copy, and thus accidentally succeeding. The extra signature check would just fail for weirder layouts with 512-byte sectors, and for normal layouts with any other sector size. Remove the extra bytes and the extra signature check. 2. Old versions did i/o to the fsinfo block using size 1024, with the second half only used for the extra signature check on read. This was harmless for sector size 512, and worked accidentally for sector size 1024. The i/o just failed for larger sector sizes. The version being fixed did i/o to the fsinfo block using size fsi_size(pmp) = (1024 << ((pmp)->pm_BlkPerSec >> 2)). This expression makes no sense. It happens to work for small sector sizes, but for sector size 32K it gives the preposterous value of 64M and thus causes panics. A sector size of 32768 is necessary for at least some DVD-RW's (where the minimum write size is 32768 although the minimum read size is 2048). Now that the size of the fsinfo block is 512, it always fits in one sector so there is no need for a macro to express it. Just use the sector size where the old code uses 1024. Approved by: re (kensmith) Approved by: nyan (several years ago for a different version of (2))	2007-07-12 16:09:07 +00:00
Robert Watson	26e3bc3a96	Fix ioctls on the control vnode: ioctls on a character device fail with ENOTTY. Make the control vnode a regular file so that ioctls are passed through to our kernel module. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:34:41 +00:00
Robert Watson	0e3ce855cc	Avoid a panic in insmntque when we pass a NULL mount: this reenables some previously disabled code which according to the comment caused a problem during shutdown. But even that is still better than triggering a kernel panic whenever venus is started. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:33:46 +00:00
Robert Watson	74d326ada8	Replace CODA_OPEN with CODA_OPEN_BY_FD: coda_open was disabled because we can't open container files by device/inode number pair anymore. Replace the CODA_OPEN upcall with CODA_OPEN_BY_FD, where venus returns an open file descriptor for the container file. We can then grab a reference on the vnode coda_psdev.c:vc_nb_write and use this vnode for further accesses to the container file. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:32:08 +00:00
Robert Watson	934030b2c9	Resolve Coda mount failing because Coda failed to match the device operations. But we don't have to, if we find the coda_mntinfo structure for this device in our linked list, we know the device is good. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 21:21:55 +00:00
Robert Watson	7263babb85	Avoid crash when opening Coda device: when allocating coda_mntinfo, we need to initialize dev so that we can actually find the allocated coda_mntinfo structure later on. Submitted by: Jan Harkes <jaharkes@cs.cmu.edu> Approved by: re (kensmith)	2007-07-11 20:39:53 +00:00
Xin LI	8d9a89a3a0	MFp4: Make use of the kernel unit number allocation facility for tmpfs nodes. Submitted by: Mingyan Guo <guomingyan gmail com> Approved by: re (tmpfs blanket)	2007-07-11 14:26:27 +00:00
Bruce Evans	8e55bfaf4b	Don't use almost perfectly pessimal cluster allocation. Allocation of the first cluster in a file (and, if the allocation cannot be continued contiguously, for subsequent clusters in a file) was randomized in an attempt to leave space for contiguous allocation of subsequent clusters in each file when there are multiple writers. This reduced internal fragmentation by a few percent, but it increased external fragmentation by up to a few thousand percent. Use simple sequential allocation instead. Actually maintain the fsinfo sequence index for this. The read and write of this index from/to disk still have many non-critical bugs, but we now write an index that has something to do with our allocations instead of being modified garbage. If there is no fsinfo on the disk, then we maintain the index internally and don't go near the bugs for writing it. Allocating the first free cluster gives a layout that is almost as good (better in some cases), but takes too much CPU if the FAT is large and the first free cluster is not near the beginning. The effect of this change for untar and tar of a slightly reduced copy of /usr/src on a new file system was: Before (msdosfs 4K-clusters): untar: 459.57 real untar from cached file (actually a pipe) tar: 342.50 real tar from uncached tree to /dev/zero Before (ffs2 soft updates 4K-blocks 4K-frags) untar: 39.18 real tar: 29.94 real Before (ffs2 soft updates 16K-blocks 2K-frags) untar: 31.35 real tar: 18.30 real After (msdosfs 4K-clusters): untar 54.83 real tar 16.18 real All of these times can be improved further. With multiple concurrent writers or readers (especially readers), the improvement is smaller, but I couldn't find any case where it is negative. 342 seconds for tarring up about 342 MB on a ~47MB/S partition is just hard to unimprove on. (This operation would take about 7.3 seconds with reasonably localized allocation and perfect read-ahead.) However, for active file systems, 342 seconds is closer to normal than the 16+ seconds above or the 11 seconds with other changes (best I've measured -- won easily by msdosfs!). E.g., my active /usr/src on ffs1 is quite old and fragmented, so reading to prepare for the above benchmark takes about 6 times longer than reading back the fresh copies of it. Approved by: re (kensmith)	2007-07-10 13:20:24 +00:00
Xin LI	1df86a323d	MFp4: - Plug memory leak. - Respect underlying vnode's properties rather than assuming that the user want root:wheel + 0755. Useful for using tmpfs(5) for /tmp. - Use roundup2 and howmany macros instead of rolling our own version. - Try to fix fsx -W -R foo case. - Instead of blindly zeroing a page, determine whether we need a pagein order to prevent data corruption. - Fix several bugs reported by Coverity. Submitted by: Mingyan Guo <guomingyan gmail com>, Howard Su, delphij Coverity ID: CID 2550, 2551, 2552, 2557 Approved by: re (tmpfs blanket)	2007-07-08 15:56:12 +00:00
Konstantin Belousov	de10ffa527	Since rev. 1.199 of sys/kern/kern_conf.c, the thread that calls destroy_dev() from d_close() cdev method would self-deadlock. devfs_close() bump device thread reference counter, and destroy_dev() sleeps, waiting for si_threadcount to reach zero for cdev without d_purge method. destroy_dev_sched() could be used instead from d_close(), to schedule execution of destroy_dev() in another context. The destroy_dev_sched_drain() function can be used to drain the scheduled calls to destroy_dev_sched(). Similarly, drain_dev_clone_events() drains the events clone to make sure no lingering devices are left after dev_clone event handler deregistered. make_dev_credf(MAKEDEV_REF) function should be used from dev_clone event handlers instead of make_dev()/make_dev_cred() to ensure that created device has reference counter bumped before cdev mutex is dropped inside make_dev(). Reviewed by: tegge (early versions), njl (programming interface) Debugging help and testing by: Peter Holm Approved by: re (kensmith)	2007-07-03 17:42:37 +00:00
Xin LI	9b258fca27	MFp4: - Remove unnecessary NULL checks after M_WAITOK allocations. - Use VOP_ACCESS instead of hand-rolled suser_cred() calls. [1] - Use malloc(9) KPI to allocate memory for string. The optimization taken from NetBSD is not valid for FreeBSD because our malloc(9) already acts that way. [2] Requested by: rwatson [1] Submitted by: Howard Su [2] Approved by: re (tmpfs blanket)	2007-06-29 05:23:15 +00:00
Xin LI	a321f489a5	Space/style cleanups after last set of commits. Approved by: re (tmpfs blanket)	2007-06-28 02:39:31 +00:00
Xin LI	a96539bf8f	Staticify most of fifo/vn operations, they should not be directly exposed outside. Approved by: re (tmpfs blanket)	2007-06-28 02:36:41 +00:00
Xin LI	8d5892eeab	Use vfs_timestamp instead of nanotime when obtaining a timestamp for use with timekeeping. Approved by: re (tmpfs blanket)	2007-06-28 02:34:32 +00:00
Xin LI	5ff9b9158f	Reorder tf_gen and tf_id in struct tmpfs_fid. This saves 8 bytes on amd64 architecture. Obtained from: NetBSD Approved by: re (tmpfs blanket)	2007-06-28 02:32:44 +00:00
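The saving described in the commit above comes from alignment padding. A minimal sketch of the effect, using hypothetical structs (`fid_before`, `fid_after`) whose field names and types are assumed for illustration and are not copied from the tmpfs sources:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the 8-byte field sits between smaller ones. */
struct fid_before {
	uint16_t len;
	uint16_t pad;
	uint64_t gen;	/* usually 8-byte aligned on amd64: padding precedes it */
	uint32_t id;	/* usually followed by tail padding on amd64 */
};

/* Same fields, with the last two members swapped. */
struct fid_after {
	uint16_t len;
	uint16_t pad;
	uint32_t id;	/* fills the hole that padding occupied before */
	uint64_t gen;
};

/* Bytes saved by the reordering on the platform this is compiled for. */
static size_t
fid_bytes_saved(void)
{
	return (sizeof(struct fid_before) - sizeof(struct fid_after));
}
```

The exact saving depends on the ABI's alignment rules; on a platform where 64-bit integers only need 4-byte alignment, both layouts may end up the same size.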
Xin LI	6ca4416347	Remove two function prototypes that are no longer used. Approved by: re (tmpfs blanket)	2007-06-26 02:08:29 +00:00
Xin LI	974fd8c650	- Sync with NetBSD's RCSID (HEAD preferred). - Correct a typo. Approved by: re (tmpfs blanket)	2007-06-26 02:07:08 +00:00
Xin LI	7adb177693	MFp4: Several clean-ups and improvements over tmpfs: - Remove tmpfs_zone_xxx KPI, the uma(9) wrapper, since it does not bring any value now. - Use |= instead of = when applying VV_ROOT flag. - Remove tm_avariable_nodes list. Use uma to hold the released nodes. - init/destroy interlock mutex of node when init/fini instead of ctor/dtor. - Change memory computing using u_int to fix negative value in 2G mem machine. - Remove unnecessary bzero's - Rely on uma logic to make file id allocation harder to guess. - Fix some unsigned/signed related things. Make sure we respect -o size=xxxx - Use wire instead of hold a page. - Pass allocate_zero to obtain zeroed pages upon first use. Submitted by: Howard Su Approved by: re (tmpfs blanket, kensmith)	2007-06-25 18:46:13 +00:00
Rong-En Fan	534046e301	- Remove UMAP filesystem. It was disconnected from build three years ago, and it is seriously broken. Discussed on: freebsd-arch@ Approved by: re (mux)	2007-06-25 05:06:57 +00:00
Xin LI	b746bf0820	Use vfs_timestamp() instead of nanotime() - leave it up to the user to decide how detailed timestamps should be.	2007-06-18 14:40:19 +00:00
Xin LI	21cf0e3907	MFp4: fix two locking problems: - Hold TMPFS_LOCK while updating tm_pages_used. - Hold vm page while doing uiomove. This will hopefully fix all known panics. Submitted by: Howard Su	2007-06-18 01:43:13 +00:00
Xin LI	d1fa59e9e1	MFp4: Add tmpfs, an efficient memory file system. Please note that this is currently considered an experimental feature, so there could be some rough edges. Consult http://wiki.freebsd.org/TMPFS for more information. For now, connect tmpfs to build on i386 and amd64 architectures only. Please let us know if you have success with other platforms. This work was developed by Julio M. Merino Vidal for NetBSD as a SoC project; Rohit Jalan ported it from NetBSD to FreeBSD. Howard Su and Glen Leeder have worked on it to continue this effort. Obtained from: NetBSD via p4 Submitted by: Howard Su (with some minor changes) Approved by: re (kensmith)	2007-06-16 01:56:05 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Remko Lodder	5df29e0ce9	Correct corrupt read when the read starts at a non-aligned offset. PR: kern/77234 MFC After: 1 week Approved by: imp (mentor) Requested by: many many people Submitted by: Andriy Gapon <avg at icyb dot net dot ua>	2007-06-11 20:14:44 +00:00
Attilio Rao	a1fe14bc33	rufetch and calcru sometimes should be called atomically together. This patch fixes places where they should be called atomically, changing their locking requirements (both assume per-proc spinlock held) and introducing rufetchcalc, which wraps both calls so that they are performed atomically. Reviewed by: jeff Approved by: jeff (mentor)	2007-06-09 21:48:44 +00:00
Bruce A. Mah	5cca41595d	Fix off-by-one error (introduced in r1.60) that had the effect of disallowing a read of exactly MAXPHYS bytes. Reviewed by: des, rdivacky MFC after: 1 week Sponsored by: nCircle Network Security	2007-06-07 15:04:30 +00:00
Jeff Roberson	982d11f836	Commit 14/14 of sched_lock decomposition. - Use thread_lock() rather than sched_lock for per-thread scheduling synchronization. - Use the per-process spinlock rather than the sched_lock for per-process scheduling synchronization. Tested by: kris, current@ Tested on: i386, amd64, ULE, 4BSD, libthr, libkse, PREEMPTION, etc. Discussed with: kris, attilio, kmacy, jhb, julian, bde (small parts each)	2007-06-05 00:00:57 +00:00
Attilio Rao	b4b7081961	Do proper "locking" for missing vmmeters part. Now, we assume no more sched_lock protection for some of them and use the distributed loads method for vmmeter (distributed through CPUs). Reviewed by: alc, bde Approved by: jeff (mentor)	2007-06-04 21:45:18 +00:00
Tom Rhodes	1be5bc7459	Revert previous, part of NFS that I didn't know about.	2007-06-01 17:06:46 +00:00
Tom Rhodes	a33ebaecf6	Garbage collect msdosfs_fhtovp; it appears unused and I have been using MSDOSFS without this function and without problems for the last month.	2007-06-01 14:57:19 +00:00
Konstantin Belousov	7a31868ed0	Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file: part 2. Convert calls missed in the first big commit. Noted by: rwatson Pointy hat to: kib	2007-06-01 14:33:11 +00:00
Attilio Rao	2feb50bf7d	Revert VMCNT_* operations introduction. Probably, a general approach is not the best solution here, so we should solve the sched_lock protection problems separately. Requested by: alc Approved by: jeff (mentor)	2007-05-31 22:52:15 +00:00
Konstantin Belousov	9e223287c0	Revert UF_OPENING workaround for CURRENT. Change the VOP_OPEN(), vn_open() vnode operation and d_fdopen() cdev operation argument from being file descriptor index into the pointer to struct file. Proposed and reviewed by: jhb Reviewed by: daichi (unionfs) Approved by: re (kensmith)	2007-05-31 11:51:53 +00:00
Robert Watson	97cd541437	Where I previously removed calls to kdb_enter(), now remove include of kdb.h. Pointed out by: bde	2007-05-29 11:28:28 +00:00
Robert Watson	86fc5557a6	Rather than entering the debugger via kdb_enter() when detecting memory corruption under SMBUFS_NAME_DEBUG, panic() with the same error message.	2007-05-27 13:12:36 +00:00
Robert Watson	cf29f18a25	Rather than entering the debugger via kdb_enter() in the event the root vnode is unexpectedly locked under NULLFS_DEBUG in nullfs and then returning EDEADLK, panic.	2007-05-27 13:10:16 +00:00
Konstantin Belousov	d413d21071	Since renaming of vop_lock to _vop_lock, pre- and post-condition function calls are no longer generated for vop_lock. Rename _vop_lock to vop_lock1 to satisfy tools/vnode_if.awk assumption about vop naming conventions. This restores pre/post-condition calls.	2007-05-18 13:02:13 +00:00
Jeff Roberson	222d01951f	- define and use VMCNT_{GET,SET,ADD,SUB,PTR} macros for manipulating vmcnts. This can be used to abstract away pcpu details but also changes to use atomics for all counters now. This means sched lock is no longer responsible for protecting counts in the switch routines. Contributed by: Attilio Rao <attilio@FreeBSD.org>	2007-05-18 07:10:50 +00:00
Dag-Erling Smørgrav	1d776018d4	The process lock is held when procfs_ioctl() is called. Assert that this is so, and PHOLD the process while sleeping since msleep() will release the lock.	2007-05-01 12:59:20 +00:00
Dag-Erling Smørgrav	b77d604841	Fix old locking bugs which were revealed when pseudofs was made MPSAFE. Submitted by: tegge	2007-04-23 19:17:01 +00:00
Robert Watson	305759909e	Rename macdevfsdirent() to macdevfs() to synchronize with SEDarwin, where similar data structures exist to support devfs and the MAC Framework, but are named differently. Obtained from: TrustedBSD Project Sponsored by: SPARTA, Inc.	2007-04-23 13:36:54 +00:00
Alan Cox	cf75c506db	Add synchronization. Eliminate the acquisition and release of Giant. Reviewed by: tegge	2007-04-23 06:12:24 +00:00
Tom Rhodes	164554dec4	In some cases, like whenever devfs file times are zero, the fix(aa) will not be applied to dev entries. This leaves us with file times like "Jan 1 1970." Work around this problem by replacing the tv_sec == 0 check with a <= 3600 check. It's doubtful anyone will be booting within an hour of the Epoch, let alone care about a few seconds worth of nonzero timestamps. It's a hackish workaround, but it does work and I have not experienced any negatives in my testing. Discussed with: bde "Ok with me": phk	2007-04-20 01:47:05 +00:00
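The threshold check described in the commit above can be sketched as follows. This is a minimal illustration, not the devfs code: the helper name `sketch_fix_time` and the macro `SKETCH_EPOCH_SLOP` are hypothetical, and the sketch assumes the replacement value is the system boot time.

```c
#include <time.h>

/* Threshold from the commit message: one hour after the Epoch. */
#define SKETCH_EPOCH_SLOP 3600

/*
 * Hypothetical helper: a timestamp within an hour of the Epoch is
 * treated as unset and replaced with the supplied boot time.
 * Any later timestamp is left untouched.
 */
static void
sketch_fix_time(struct timespec *ts, const struct timespec *boottime)
{
	if (ts->tv_sec <= SKETCH_EPOCH_SLOP)
		*ts = *boottime;
}
```

With the older `tv_sec == 0` test, a timestamp of a few seconds past the Epoch would have slipped through and shown up as "Jan 1 1970".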
Dag-Erling Smørgrav	8edf8ae133	Avoid "unused variable" warning when building without PSEUDOFS_TRACE.	2007-04-15 20:35:18 +00:00
Dag-Erling Smørgrav	388596dffc	Make pseudofs (and consequently procfs, linprocfs and linsysfs) MPSAFE.	2007-04-15 17:10:01 +00:00
Dag-Erling Smørgrav	b1f9e8cec9	Instead of stating GIANT_REQUIRED, just acquire and release Giant where needed. This does not make a difference now, but will when procfs is marked MPSAFE.	2007-04-15 17:06:09 +00:00
Dag-Erling Smørgrav	302762c344	Fix the same bug as in procfs_doproc{,db}regs(): check that uio_offset is 0 upon entry, and don't reset it before returning. MFC after: 3 weeks	2007-04-15 13:29:36 +00:00
Dag-Erling Smørgrav	66cd74a611	Don't reset uio_offset to 0 before returning. Instead, refuse to service requests where uio_offset is not 0 to begin with. This fixes a long-standing bug where e.g. 'cat /proc/$$/regs' would loop forever. MFC after: 3 weeks	2007-04-15 13:24:03 +00:00
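To see why resetting the offset made a reader loop forever, consider this simplified userland sketch. `struct sketch_uio` and `sketch_read_regs` are hypothetical stand-ins for the kernel's `struct uio` and the procfs handler, and the `-1` return value is an assumption, since the commit message does not say which error is returned.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct uio. */
struct sketch_uio {
	size_t offset;	/* current file offset */
	char  *buf;	/* destination buffer */
	size_t resid;	/* bytes of space left in buf */
};

/*
 * Sketch of a handler that only serves reads starting at offset 0.
 * Returns the number of bytes copied, or -1 when the request is refused.
 */
static long
sketch_read_regs(struct sketch_uio *uio, const void *regs, size_t len)
{
	size_t n;

	if (uio->offset != 0)
		return (-1);		/* refuse instead of rewinding */
	n = len < uio->resid ? len : uio->resid;
	memcpy(uio->buf, regs, n);
	uio->buf += n;
	uio->resid -= n;
	uio->offset += n;		/* left advanced, never reset to 0 */
	return ((long)n);
}
```

A reader such as cat keeps calling read until it sees end-of-file or an error. If the handler rewound the offset to 0 after every call, each read would return the same data again and the loop would never end; here the second call is refused because the offset has moved on.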
Dag-Erling Smørgrav	f61bc4ea5e	Further pseudofs improvements: The pfs_info mutex is only needed to lock pi_unrhdr. Everything else in struct pfs_info is modified only while Giant is held (during vfs_init() / vfs_uninit()); add assertions to that effect. Simplify pfs_destroy somewhat. Remove superfluous arguments from pfs_fileno_{alloc,free}(), and the assertions which were added in the previous commit to ensure they were consistent. Assert that Giant is held while the vnode cache is initialized and destroyed. Also assert that the cache is empty when it is destroyed. Rename the vnode cache mutex for consistency. Fix a long-standing bug in pfs_getattr(): it would uncritically return the node's pn_fileno as st_ino. This would result in st_ino being 0 if the node had not previously been visited by readdir(), and also in an incorrect st_ino for process directories and any files contained therein. Correct this by abstracting the fileno manipulations previously done in pfs_readdir() into a new function, pfs_fileno(), which is used by both pfs_getattr() and pfs_readdir().	2007-04-14 14:08:30 +00:00
Dag-Erling Smørgrav	15bad11fdb	Add a flag to struct pfs_vdata to mark the vnode as dead (e.g. process-specific nodes when the process exits) Move the vnode-cache-walking loop which was duplicated in pfs_exit() and pfs_disable() into its own function, pfs_purge(), which looks for vnodes marked as dead and / or belonging to the specified pfs_node and reclaims them. Note that this loop is still extremely inefficient. Add a comment in pfs_vncache_alloc() explaining why we have to purge the vnode from the vnode cache before returning, in case anyone should be tempted to remove the call to cache_purge(). Move the special handling for pfstype_root nodes into pfs_fileno_alloc() and pfs_fileno_free() (the root node's fileno must always be 2). This also fixes a bug where pfs_fileno_free() would reclaim the root node's fileno, triggering a panic in the unr code, as that fileno was never allocated from unr to begin with. When destroying a pfs_node, release its fileno and purge it from the vnode cache. I wish we could put off the call to pfs_purge() until after the entire tree had been destroyed, but then we'd have vnodes referencing freed pfs nodes. This probably doesn't matter while we're still under Giant, but might become an issue later. When destroying a pseudofs instance, destroy the tree before tearing down the fileno allocator. In pfs_mount(), acquire the mountpoint interlock when required. MFC after: 3 weeks	2007-04-11 22:40:57 +00:00
Dag-Erling Smørgrav	56c62ab69c	Whitespace nits.	2007-04-05 13:43:00 +00:00
Robert Watson	5e3f7694b1	Replace custom file descriptor array sleep lock constructed using a mutex and flags with an sxlock. This leads to a significant and measurable performance improvement as a result of access to shared locking for frequent lookup operations, reduced general overhead, and reduced overhead in the event of contention. All of these are important for threaded applications where simultaneous access to a shared file descriptor array occurs frequently. Kris has reported 2x-4x transaction rate improvements on 8-core MySQL benchmarks; smaller improvements can be expected for many workloads as a result of reduced overhead. - Generally eliminate the distinction between "fast" and regular acquisition of the filedesc lock; the plan is that they will now all be fast. Change all locking instances to either shared or exclusive locks. - Correct a bug (pointed out by kib) in fdfree() where previously msleep() was called without the mutex held; sx_sleep() is now always called with the sxlock held exclusively. - Universally hold the struct file lock over changes to struct file, rather than the filedesc lock or no lock. Always update the f_ops field last. A further memory barrier is required here in the future (discussed with jhb). - Improve locking and reference management in linux_at(), which fails to properly acquire vnode references before using vnode pointers. Annotate improper use of vn_fullpath(), which will be replaced at a future date. In fcntl(), we conservatively acquire an exclusive lock, even though in some cases a shared lock may be sufficient, which should be revisited. The dropping of the filedesc lock in fdgrowtable() is no longer required as the sxlock can be held over the sleep operation; we should consider removing that (pointed out by attilio). Tested by: kris Discussed with: jhb, kris, attilio, jeff	2007-04-04 09:11:34 +00:00
Kris Kennaway	6455de0029	Annotate that this giant acquisition is dependent on tty locking.	2007-03-26 21:56:46 +00:00
Maxim Konovalov	4b12bb048f	o cd9660 code repo-copied, update a comment.	2007-03-24 22:40:16 +00:00
Tor Egge	61b9d89ff0	Make insmntque() externally visible and allow it to fail (e.g. during late stages of unmount). On failure, the vnode is recycled. Add insmntque1(), to allow for file system specific cleanup when recycling vnode on failure. Change getnewvnode() to no longer call insmntque(). Previously, embryonic vnodes were put onto the list of vnodes belonging to a file system, which is unsafe for a file system marked MPSAFE. Change vfs_hash_insert() to no longer lock the vnode. The caller now has that responsibility. Change most file systems to lock the vnode and call insmntque() or insmntque1() after a new vnode has been sufficiently setup. Handle failed insmntque*() calls by propagating errors to callers, possibly after some file system specific cleanup. Approved by: re (kensmith) Reviewed by: kib In collaboration with: kib	2007-03-13 01:50:27 +00:00
Dag-Erling Smørgrav	771709eb78	Add a pn_destroy field to pfs_node. This field points to a destructor function which is called from pfs_destroy() before the node is reclaimed. Modify pfs_create_{dir,file,link}() to accept a pointer to a destructor function in addition to the usual attr / fill / vis pointers. This breaks both the programming and binary interfaces between pseudofs and its consumers. It is believed that there are no pseudofs consumers outside the source tree, so that the impact of this change is minimal. Submitted by: Aniruddha Bohra <bohra@cs.rutgers.edu>	2007-03-12 12:16:52 +00:00
Mike Pritchard	45cdcb7aab	Change fifo_printinfo to check if the vnode v_fifoinfo pointer is NULL and print a message to that effect to prevent a panic.	2007-03-02 00:10:11 +00:00
John Baldwin	4d70511ac3	Use pause() rather than tsleep() on stack variables and function pointers.	2007-02-27 17:23:29 +00:00
Olivier Houchard	9bf1500921	Check that the error returned by vfs_getopts() is not ENOENT before assuming there's actually an error. This is just in order to unbreak ntfs on current, before a proper solution is committed.	2007-02-21 00:30:09 +00:00
Robert Watson	969e5bdcd0	Do allow PIOCSFL in jail for setguid processes; this is more consistent with other debugging checks elsewhere. XXX comment on the fact that p_candebug() is not being used here remains.	2007-02-19 13:04:25 +00:00
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Craig Rodrigues	a8d36d0d9a	Forced commit and #include changes for repo copy from sys/isofs/cd9660 to sys/fs/cd9660. Discussed on freebsd-current.	2007-02-11 13:54:25 +00:00
Craig Rodrigues	d6140aaa69	Add noatime to the list of mount options that msdosfs accepts. PR: 108896 Submitted by: Eugene Grosbein <eugen grosbein pp ru>	2007-02-08 02:30:55 +00:00
Craig Rodrigues	dc9a617afb	Style fixes: use ANSI C function declarations.	2007-02-08 02:25:35 +00:00
Konstantin Belousov	a257337698	Fix the race of dereferencing /proc/<pid>/file with execve(2) by caching the value of p_textvp. This way, we always unlock the locked vnode. While there, vhold() the vnode around the vn_lock(). Reported and tested by: Guy Helmer (ghelmer palisadesys com) Approved by: des (procfs maintainer) MFC after: 1 week	2007-02-07 10:30:49 +00:00
Craig Rodrigues	8a4cab026b	Eliminate some dead code which was introduced in 1.23, yet was always commented out.	2007-02-06 03:30:58 +00:00
Pawel Jakub Dawidek	5ab5525469	coda_vptofh is never defined nor used.	2007-02-02 15:47:28 +00:00
Tai-hwa Liang	61ad2e26ef	Fixing compilation bustage by removing references to opt_msdosfs.h. This auto-generated header file no longer exists since the removal of MSDOSFS_LARGE in sys/conf/options:1.574.	2007-01-30 08:05:04 +00:00
Tom Rhodes	bade0e00f3	Fix spacing from my previous commit to this file: Noticed by: fjoe	2007-01-30 04:41:38 +00:00
Craig Rodrigues	f458f2a553	Add a "-o large" mount option for msdosfs. Convert compile-time checks for #ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified. Test case provided by Oliver Fromme: truncate -s 200G test.img mdconfig -a -t vnode -f test.img -u 9 newfs_msdos -s 419430400 -n 1 /dev/md9 zip250 mount -t msdosfs /dev/md9 /mnt # should fail mount -t msdosfs -o large /dev/md9 /mnt # should succeed PR: 105964 Requested by: Oliver Fromme <olli lurza secnetix de> Tested by: trhodes MFC after: 2 weeks	2007-01-30 03:11:45 +00:00
Konstantin Belousov	7f92c4ee02	Below is slightly edited description of the LOR by Tor Egge: -------------------------- [Deadlock] is caused by a lock order reversal in vfs_lookup(), where [some] process is trying to lock a directory vnode, that is the parent directory of covered vnode) while holding an exclusive vnode lock on covering vnode. A simplified scenario: root fs var fs / A / (/var) D /var B /log (/var/log) E vfs lock C vfs lock F Within each file system, the lock order is clear: C->A->B and F->D->E When traversing across mounts, the system can choose between two lock orders, but everything must then follow that lock order: L1: C->A->B \| +->F->D->E L2: F->D->E \| +->C->A->B The lookup() process for namei("/var") mixes those two lock orders: VOP_LOOKUP() obtains B while A is held vfs_busy() obtains a shared lock on F while A and B are held (follows L1, violates L2) vput() releases lock on B VOP_UNLOCK() releases lock on A VFS_ROOT() obtains lock on D while shared lock on F is held vfs_unbusy() releases shared lock on F vn_lock() obtains lock on A while D is held (violates L1, follows L2) dounmount() follows L1 (B is locked while F is drained). Without unmount activity, vfs_busy() will always succeed without blocking and the deadlock isn't triggered (the system behaves as if L2 is followed). With unmount, you can get 4 processes in a deadlock: p1: holds D, want A (in lookup()) p2: holds shared lock on F, want D (in VFS_ROOT()) p3: holds B, want drain lock on F (in dounmount()) p4: holds A, want B (in VOP_LOOKUP()) You can have more than one instance of p2. The reversal was introduced in revision 1.81 of src/sys/kern/vfs_lookup.c and MFCed to revision 1.80.2.1, probably to avoid a cascade of vnode locks when nfs servers are dead (VFS_ROOT() just hangs) spreading to the root fs root vnode. - Tor Egge To fix the LOR, ups@ noted that when crossing the mount point, ni_dvp is actually not used by the callers of namei. 
Thus, placeholder deadfs vnode vp_crossmp is introduced that is filled into ni_dvp. Idea by: ups Reviewed by: tegge, ups, jeff, rwatson (mac interaction) Tested by: Peter Holm MFC after: 2 weeks	2007-01-22 11:25:22 +00:00
Tom Rhodes	752945d6c0	Add a 3rd entry in the cache, which keeps the end position from just before extending a file. This has the desired effect of keeping the write speed constant. And yes, that helps a lot copying large files always at full speed now, and I have seen improvements using benchmarks/bonnie. Stolen from: NetBSD Reviewed by: bde	2007-01-16 23:43:14 +00:00
Pav Lucistnik	0c09ac0d57	Rewrite the udf_read() routine to use a file vnode instead of the devvp vnode. The code is modelled after cd9660, including support for simple read-ahead courtesy of clustered read. Fix udf_strategy to DTRT. This change fixes sendfile(2) not to send out garbage. Reviewed by: scottl MFC after: 1 month	2007-01-15 18:45:36 +00:00
Pav Lucistnik	9f3eef13ca	Tell backing v_object the filesize right on its creation. MFC after: 1 week	2007-01-07 23:53:16 +00:00
Craig Rodrigues	82c59ec651	When performing a mount update to change a mount from read-only to read-write, do not call markvoldirty() until the mount has been flagged as read-write. Due to the nature of the msdosfs code, this bug only seemed to appear for FAT-16 and FAT-32. This fixes the testcase: #!/bin/sh dd if=/dev/zero bs=1m count=1 oseek=119 of=image.msdos mdconfig -a -t vnode -f image.msdos newfs_msdos -F 16 /dev/md0 fd120m mount_msdosfs -o ro /dev/md0 /mnt mount | grep md0 mount -u -o rw /dev/md0; echo $? mount | grep md0 umount /mnt mdconfig -d -u 0 PR: 105412 Tested by: Eugene Grosbein <eugen grosbein pp ru>	2007-01-06 20:46:02 +00:00
Craig Rodrigues	dda4f444de	Simplify code in union_hashins() and union_hashget() functions. These functions now more closely resemble similar functions in nullfs. This also eliminates some errors. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 14:06:42 +00:00
Craig Rodrigues	9170c87faa	Eliminate obsolete comment, now that getushort() is implemented in terms of functions in <sys/endian.h>.	2007-01-05 05:28:57 +00:00
Craig Rodrigues	98155f1f51	Eliminate ASSERT_VOP_ELOCKED panics when doing mkdir or symlink when sysctl vfs.lookup_shared=1. Submitted by: daichi, Masanori OZAWA <ozawa ongs co jp>	2007-01-05 02:25:44 +00:00
John Baldwin	b082761327	Use the vnode interlock to close a race where pfs_vncache_alloc() could attempt to vn_lock() a destroyed vnode resulting in a hang. MFC after: 1 week Submitted by: ups Reviewed by: des	2007-01-02 17:27:52 +00:00
Pav Lucistnik	35e0662415	Call vnode_create_vobject() in VOP_OPEN. Makes mmap work on UDF filesystem. PR: kern/92040 Approved by: scottl MFC after: 1 week	2006-12-23 18:53:22 +00:00
Marcel Moolenaar	94632b9fe1	Unbreak 64-bit little-endian systems that do require alignment. The fix involves using le16dec(), le32dec(), le16enc() and le32enc(). This eliminates invalid casts and duplicated logic.	2006-12-21 05:40:46 +00:00
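The byte-wise approach that the commit above refers to avoids both misaligned loads and any dependence on the width of `unsigned long`. A minimal portable sketch of 32-bit little-endian decode and encode follows; the `sketch_` names are hypothetical stand-ins, not the `<sys/endian.h>` implementations.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from a possibly unaligned buffer. */
static uint32_t
sketch_le32dec(const void *pp)
{
	const uint8_t *p = pp;

	/* Each byte is widened to uint32_t before shifting. */
	return ((uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	    ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24));
}

/* Encode a 32-bit value into a buffer in little-endian byte order. */
static void
sketch_le32enc(void *pp, uint32_t v)
{
	uint8_t *p = pp;

	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Because the buffer is only ever accessed one byte at a time, the same code works on strict-alignment machines, and the fixed-width return type keeps the result at 32 bits regardless of the platform's `long`.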
Craig Rodrigues	3244bb8a12	For big-endian version of getulong() macro, cast result to u_int32_t. This macro was written expecting a 32-bit unsigned long, and doesn't work properly on 64-bit systems. This bug caused vn_stat() to return incorrect values for files larger than 2gb on msdosfs filesystems on 64-bit systems. PR: 106703 Submitted by: Axel Gonzalez <loox e-shell net> MFC after: 3 days	2006-12-19 02:31:58 +00:00
Craig Rodrigues	d01e83878b	Fix get_ulong() macro on AMD64 (or any little-endian 64-bit platform). This bug caused vn_stat() to fail on files larger than 2gb on msdosfs filesystems on AMD64. PR: 106703 Tested by: Axel Gonzalez <loox e-shell net> MFC after: 3 days	2006-12-19 01:55:45 +00:00
Craig Rodrigues	b05872f29b	Remove unused variable in unionfs_root(). Submitted by: daichi, Masanori OZAWA	2006-12-09 17:24:18 +00:00
Craig Rodrigues	1e370dbbdc	Use vfs_mount_error() in a few places to give more descriptive mount error messages.	2006-12-09 17:21:25 +00:00
Craig Rodrigues	30d471e654	Add locking around calls to unionfs_get_node_status() in unionfs_ioctl() and unionfs_poll(). Submitted by: daichi, Masanori OZAWA <ozawa@ongs.co.jp> Prompted by: kris	2006-12-09 16:51:09 +00:00
Craig Rodrigues	b16f4eec16	In unionfs_readdir(), prevent a possible NULL dereference. CID: 1667 Found by: Coverity Prevent (tm)	2006-12-09 16:34:37 +00:00
Craig Rodrigues	acc4bab11b	In unionfs_hashrem(), use LIST_FOREACH_SAFE when iterating over the list of nodes to free them. CID: 1668 Found by: Coverity Prevent (tm)	2006-12-09 16:27:50 +00:00
Craig Rodrigues	e9022ef898	Minor cleanup. If we are doing a mount update, and we pass in an "export" flag indicating that we are trying to NFS export the filesystem, and the MSDOSFS_LARGEFS flag is set on the filesystem, then deny the mount update and export request. Otherwise, let the full mount update proceed normally. MSDOSFS_LARGEFS and NFS don't mix because of the way inodes are calculated for MSDOSFS_LARGEFS. MFC after: 3 days	2006-12-09 01:49:19 +00:00
Tim Kientzle	8d3027e203	The ISO9660 spec does allow files up to 4G. Change the i_size field to "unsigned long" so that it actually works. Thanks to Robert Sciuk for sending me a DVD that demonstrated ISO9660-formatted media with a file >2G. I've now fixed this both in libarchive and in the cd9660 filesystem. MFC after: 14 days	2006-12-08 07:43:53 +00:00
Julian Elischer	ad1e7d285a	Threading cleanup.. part 2 of several. Make part of John Birrell's KSE patch permanent.. Specifically, remove: Any reference of the ksegrp structure. This feature was never fully utilised and made things overly complicated. All code in the scheduler that tried to make threaded programs fair to unthreaded programs. Libpthread processes will already do this to some extent and libthr processes already disable it. Also: Since this makes such a big change to the scheduler(s), take the opportunity to rename some structures and elements that had to be moved anyhow. This makes the code a lot more readable. The ULE scheduler compiles again but I have no idea if it works. The 4bsd scheduler still requires a little cleaning and some functions that now do ALMOST nothing will go away, but I thought I'd do that as a separate commit. Tested by David Xu, and Dan Eischen using libthr and libpthread.	2006-12-06 06:34:57 +00:00
Maxim Konovalov	1c5cf521ae	o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME set birthtime to FAT CTime (creation time) and in the other cases set birthtime to -1. o Set ctime to mtime instead of FAT CTime which has completely different meaning. PR: kern/106018 Submitted by: Oliver Fromme MFC after: 1 month	2006-12-03 19:04:26 +00:00
Craig Rodrigues	3d253c11cf	Add missing includes for <sys/buf.h> and <sys/bio.h>.	2006-12-02 22:30:30 +00:00
Craig Rodrigues	d00947d83a	Many, many thanks to Masanori OZAWA <ozawa@ongs.co.jp> and Daichi GOTO <daichi@FreeBSD.org> for submitting this major rewrite of unionfs. This rewrite was done to try to solve many of the longstanding crashing and locking issues in the existing unionfs implementation. This implementation also adds a 'MASQUERADE mode', which allows the user to set different user, group, and file permission modes in the upper layer. Submitted by: daichi, Masanori OZAWA Reviewed by: rodrigc (modified for minor style issues)	2006-12-02 19:35:56 +00:00
Maxim Konovalov	cc005bb92c	o From the submitter: dos2unixchr will convert to lower case if LCASE_BASE or LCASE_EXT or both are set. But dos2unixfn uses dos2unixchr separately for the basename and the extension. So if either LCASE_BASE or LCASE_EXT is set, dos2unixfn will convert both the basename and extension to lowercase because it is blindly passing in the state of both flags to dos2unixchr. The bit masks I used ensure that only the state of LCASE_BASE gets passed to dos2unixchr when the basename is converted, and only the state of LCASE_EXT is passed in when the extension is converted. PR: kern/86655 Submitted by: Micah Lieske MFC after: 3 weeks	2006-11-26 18:49:44 +00:00
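The masking fix described in the commit above can be sketched as follows. This is an illustration only: the `sk_` names are hypothetical, the flag values are assumptions, and the real msdosfs routines have different signatures and handle more cases.

```c
#include <ctype.h>

/* Flag values are assumptions made for this illustration. */
#define SK_LCASE_BASE 0x08
#define SK_LCASE_EXT  0x10

/* Lowercases c when either case flag is set, as the PR describes. */
static int
sk_dos2unixchr(int c, int lower)
{
	if (lower & (SK_LCASE_BASE | SK_LCASE_EXT))
		return (tolower(c));
	return (c);
}

/*
 * Converts a space-padded 8.3 name (base[8], ext[3]) to "base.ext".
 * Only the flag relevant to each part is passed down, so one part's
 * flag can no longer lowercase the other part.
 */
static void
sk_dos2unixfn(const char *base, const char *ext, int lower, char *out)
{
	int i;

	for (i = 0; i < 8 && base[i] != ' '; i++)
		*out++ = (char)sk_dos2unixchr((unsigned char)base[i],
		    lower & SK_LCASE_BASE);
	if (ext[0] != ' ') {
		*out++ = '.';
		for (i = 0; i < 3 && ext[i] != ' '; i++)
			*out++ = (char)sk_dos2unixchr((unsigned char)ext[i],
			    lower & SK_LCASE_EXT);
	}
	*out = '\0';
}
```

Passing `lower` unmasked to both loops reproduces the reported bug: with only `SK_LCASE_EXT` set, the basename would be lowercased as well.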
Lukas Ertl	9df1370eab	Fix an integer overflow and allow access to files larger than 4GB on NTFS.	2006-11-20 19:28:36 +00:00

2108 Commits