freebsd-skq

Author	SHA1	Message	Date
Konstantin Belousov	d6da640860	Fix r193923 by noting that type of a_fp is struct file *, not int. It was assumed that r193923 was trivial change that cannot be done wrong. MFC after: 2 weeks	2009-06-10 14:24:31 +00:00
Konstantin Belousov	e4d9bdc105	s/a_fdidx/a_fp/ for VOP_OPEN comments that inline struct vop_open_args definition. Discussed with: bde MFC after: 2 weeks	2009-06-10 14:09:05 +00:00
John Baldwin	c72ae1423b	- Hold a reference on the cdev a filesystem is mounted from in the mount. - Remove the cdev pointers from the denode and instead use the mountpoint's reference to call dev2udev() in getattr(). Reviewed by: kib, julian	2009-02-27 20:00:15 +00:00
Edward Tomasz Napierala	0da50f6ef8	According to phk@, VOP_STRATEGY should never, _ever_, return anything other than 0. Make it so. This fixes "panic: VOP_STRATEGY failed bp=0xc320dd90 vp=0xc3b9f648", encountered when writing to an orphaned filesystem. Reason for the panic was the following assert: KASSERT(i == 0, ("VOP_STRATEGY failed bp=%p vp=%p", bp, bp->b_vp)); at vfs_bio:bufstrategy(). Reviewed by: scottl, phk Approved by: rwatson (mentor) Sponsored by: FreeBSD Foundation	2008-12-16 21:13:11 +00:00
Edward Tomasz Napierala	15bc6b2bd8	Introduce accmode_t. This is required for NFSv4 ACLs - it will be necessary to add more V* constants, and the variables changed by this patch were often being assigned to mode_t variables, which is 16 bit. Approved by: rwatson (mentor)	2008-10-28 13:44:11 +00:00
Dag-Erling Smørgrav	1ede983cc9	Retire the MALLOC and FREE macros. They are an abomination unto style(9). MFC after: 3 months	2008-10-23 15:53:51 +00:00
Konstantin Belousov	4c5a20e3da	Initialize va_rdev to NODEV instead of 0 or VNOVAL in VOP_GETATTR(). NODEV is more appropriate when va_rdev doesn't have a meaningful value. Submitted by: Jaakko Heinonen <jh saunalahti fi> Suggested by: bde Discussed on: freebsd-fs MFC after: 1 month	2008-09-20 19:49:15 +00:00
Attilio Rao	0359a12ead	Decontextualize the couplet VOP_GETATTR / VOP_SETATTR as the passed thread was always curthread and totally unuseful. Tested by: Giovanni Trematerra <giovanni dot trematerra at gmail dot com>	2008-08-28 15:23:18 +00:00
Konstantin Belousov	813d71de08	The uniqdosname() function takes char[12] as its third argument. Found by: -fstack-protector Reported by: dougb Tested by: dougb, Rainer Hurling <rhurlin gwdg de> MFC after: 3 days	2008-07-04 09:40:52 +00:00
Konstantin Belousov	eab626f110	Move the head of byte-level advisory lock list from the filesystem-specific vnode data to the struct vnode. Provide the default implementation for the vop_advlock and vop_advlockasync. Purge the locks on the vnode reclaim by using the lf_purgelocks(). The default implementation is augmented for the nfs and smbfs. In the nfs_advlock, push the Giant inside the nfs_dolock. Before the change, the vop_advlock and vop_advlockasync have taken the unlocked vnode and dereferenced the fs-private inode data, racing with the vnode reclamation due to forced unmount. Now, the vop_getattr under the shared vnode lock is used to obtain the inode size, and later, in the lf_advlockasync, after locking the vnode interlock, the VI_DOOMED flag is checked to prevent an operation on the doomed vnode. The implementation of the lf_purgelocks() is submitted by dfr. Reported by: kris Tested by: kris, pho Discussed with: jeff, dfr MFC after: 2 weeks	2008-04-16 11:33:32 +00:00
Doug Rabson	dfdcada31e	Add the new kernel-mode NFS Lock Manager. To use it instead of the user-mode lock manager, build a kernel with the NFSLOCKD option and add '-k' to 'rpc_lockd_flags' in rc.conf. Highlights include: * Thread-safe kernel RPC client - many threads can use the same RPC client handle safely with replies being de-multiplexed at the socket upcall (typically driven directly by the NIC interrupt) and handed off to whichever thread matches the reply. For UDP sockets, many RPC clients can share the same socket. This allows the use of a single privileged UDP port number to talk to an arbitrary number of remote hosts. * Single-threaded kernel RPC server. Adding support for multi-threaded server would be relatively straightforward and would follow approximately the Solaris KPI. A single thread should be sufficient for the NLM since it should rarely block in normal operation. * Kernel mode NLM server supporting cancel requests and granted callbacks. I've tested the NLM server reasonably extensively - it passes both my own tests and the NFS Connectathon locking tests running on Solaris, Mac OS X and Ubuntu Linux. * Userland NLM client supported. While the NLM server doesn't have support for the local NFS client's locking needs, it does have to field async replies and granted callbacks from remote NLMs that the local client has contacted. We relay these replies to the userland rpc.lockd over a local domain RPC socket. * Robust deadlock detection for the local lock manager. In particular it will detect deadlocks caused by a lock request that covers more than one blocking request. As required by the NLM protocol, all deadlock detection happens synchronously - a user is guaranteed that if a lock request isn't rejected immediately, the lock will eventually be granted. The old system allowed for a 'deferred deadlock' condition where a blocked lock request could wake up and find that some other deadlock-causing lock owner had beaten them to the lock. * Since both local and remote locks are managed by the same kernel locking code, local and remote processes can safely use file locks for mutual exclusion. Local processes have no fairness advantage compared to remote processes when contending to lock a region that has just been unlocked - the local lock manager enforces a strict first-come first-served model for both local and remote lockers. Sponsored by: Isilon Systems PR: 95247 107555 115524 116679 MFC after: 2 weeks	2008-03-26 15:23:12 +00:00
Attilio Rao	22db15c06f	VOP_LOCK1() (and so VOP_LOCK()) and VOP_UNLOCK() are only used in conjunction with 'thread' argument passing which is always curthread. Remove the unuseful extra-argument and pass explicitly curthread to lower layer functions, when necessary. KPI results broken by this change, which should affect several ports, so version bumping and manpage update will be further committed. Tested by: kris, pho, Diego Sardina <siarodx at gmail dot com>	2008-01-13 14:44:15 +00:00
Attilio Rao	cb05b60a89	vn_lock() is currently only used with the 'curthread' passed as argument. Remove this argument and pass curthread directly to underlying VOP_LOCK1() VFS method. This modify makes the code cleaner and in particular remove an annoying dependence helping next lockmgr() cleanup. KPI results, obviously, changed. Manpage and FreeBSD_version will be updated through further commits. As a side note, would be valuable to say that next commits will address a similar cleanup about VFS methods, in particular vop_lock1 and vop_unlock. Tested by: Diego Sardina <siarodx at gmail dot com>, Andrea Di Pasquale <whyx dot it at gmail dot com>	2008-01-10 01:10:58 +00:00
Bruce Evans	cb65c1ee29	Implement the async (really, delayed-write) mount option for msdosfs. This is much simpler than for ffs since there are many fewer places where we need to choose between a delayed write and a sync write -- just 5 in msdosfs and more than 30 in ffs. This is more complete and correct than in ffs. Several places in ffs are still missing the choice. ffs_update() has a layering violation that breaks callers which want to force a sync update (mainly fsync(2) and O_SYNC write(2)). However, fsync(2) and O_SYNC write(2) are still more broken than in ffs, since they are broken for default (non-sync non-async) mounts too. Both fail to sync the FAT in all cases, and both fail to sync the directory entry in some cases after losing a race. Async everything is probably safer than the half-baked sync of metadata given by default mounts.	2007-10-19 12:23:25 +00:00
Bruce Evans	cefb55828f	In msdosfs_settattr(), don't do synchronous updates of the denode (except indirectly for the size pseudo-attribute). If anything deserves a sync update, then it is ids and immutable flags, since these are related to security, but ffs never synced these and msdosfs doesn't support them. (ufs_setattr() only does an update in one case where it is least needed (for timestamps); it did pessimal sync updates for timestamps until 1998/03/08 but was changed for unlogged reasons related to soft updates.) Now msdosfs calls deupdat() with waitfor == 0, which normally gives a delayed update to disk but always gives a sync update of timestamps in core, while for ffs everything is delayed until the syncer daemon or other activity causes an update (except for timestamps). This gives a large optimization mainly for things like cp -p, where attribute adjustment could easily triple the number of physical I/O's if it is done synchronously (but cp -p to msdosfs is not as bad as that, since msdosfs doesn't support many attributes so null adjustments are more common, and msdosfs doesn't support ctimes so even if cp doesn't weed out null adjustments they don't become non-null after clobbering the ctime).	2007-10-18 07:26:21 +00:00
Bruce Evans	c2819440b3	Fix races in msdosfs_lookup() and msdosfs_readdir(). These functions can easily block in bread(), and then there was nothing to prevent the static buffer (nambuf_{ptr,len,last_id}) being clobbered by another thread. The effects of the bug seem to have been limited to failed lookups and mangled names in readdir(), since Giant locking provides enough serialization to prevent concurrent calls to the functions that access the buffer. They were very obvious for multiple concurrent tree walks, especially with a small cluster size. The bug was introduced in msdosfs_conv.c 1.34 and associated changes, and is in all releases starting with 5.2. The fix is to allocate the buffer as a local variable and pass around pointers to it like "_r" functions in libc do. Stack use from this is large but not too large. This also fixes a memory leak on module unload. Reviewed by: kib Approved by: re (kensmith)	2007-08-31 22:29:55 +00:00
Bruce Evans	a4e6807c49	In msdosfs_read() and msdosfs_write(), don't check explicitly for (uio_offset < 0) since this can't happen. If this happens, then the general code handles the problem safely (better than before for reading, returning 0 (EOF) instead of the bogus errno EINVAL, and the same as before for writing, returning EFBIG). In msdosfs_read(), don't check for (uio_resid < 0). msdosfs_write() already didn't check. In msdosfs_read(), document in a comment our assumptions that the caller passed a valid uio_offset and uio_resid. ffs checks using KASSERT(), and that is enough sanity checking. In the same comment, partly document there is no need to check for the EOVERFLOW case, unlike in ffs where this case can happen at least in theory. In msdosfs_write(), add a comment about why the checking of (uio_resid == 0) is explicit, unlike in ffs. In msdosfs_write(), check for impossibly large final offsets before checking if the file size rlimit would be exceeded, so that we don't have an overflow bug in the rlimit check and are consistent with ffs. We now return EFBIG instead of EFBIG plus a SIGXFSZ signal if the final offset would be impossibly large but not so large as to cause overflow. Overflow normally gave the benign behaviour of no signal. Approved by: re (kensmith) (blanket)	2007-08-07 10:35:27 +00:00
Bruce Evans	b7837a91c9	Fix and update the comments about the effect of the read-only flag on writing. They are still too verbose. Remove nearby unreachable code for handling symlinks. Approved by: re (kensmith) (blanket)	2007-08-07 05:42:10 +00:00
Bruce Evans	c0f5121cac	Fix some style bugs (don't assume that off_t == int64_t; fix some comments; remove some parentheses; fix only a couple of whitespace errors). Approved by: re (kensmith) (blanket)	2007-08-07 03:43:28 +00:00
Bruce Evans	d2bb66bacd	Sort includes. Remove rotted banal comment attached to includes. Approved by: re (kensmith) (blanket)	2007-08-07 02:28:33 +00:00
Bruce Evans	eba34270fa	Include <sys/mutex.h> and its prerequisite <sys/lock.h> instead of depending on namespace pollution in <sys/buf.h> and/or <sys/vnode.h> Approved by: re (kensmith) (blanket)	2007-08-07 01:40:27 +00:00
Bruce Evans	6fd81fc7a6	Remove unused include(s). Approved by: re (kensmith) (blanket)	2007-08-07 01:07:16 +00:00
Bruce Evans	6b6c5f5ef9	Implement vfs clustering for msdosfs. This gives a very large speedup for small block sizes (in my tests, about 5 times for write and 3 times for read with a block size of 512, if clustering is possible) and a moderate speedup for the moderately large block sizes that should be used on non-small media (4K is the best size in most cases, and the speedup for that is about 1.3 times for write and 1.2 times for read). mmap() should benefit from clustering like read()/write(), but the current implementation of vm only supports clustering (at least for getpages) if the fs block size is >= PAGE_SIZE. msdosfs is now only slightly slower than ffs with soft updates for writing and slightly faster for reading when both use their best block sizes. Writing is slower for msdosfs because of more sync writes. Reading is faster for msdosfs because indirect blocks interfere with clustering in ffs. The changes in msdosfs_read() and msdosfs_write() are simpler merges of corresponding code in ffs (after fixing some style bugs in ffs). msdosfs_bmap() needs fs-specific code. This implementation loops calling a lower level bmap function to do the hard parts. This is a bit inefficient, but is efficient enough since msdosfs_bmap() is only called when there is physical i/o to do. Approved by: re (hrs)	2007-07-20 17:06:57 +00:00
Bruce Evans	d34b0a1bac	Clean up before implementing vfs clustering for msdosfs: In msdosfs_read(), mainly reorder the main loop to the same order as in ffs_read(). In msdosfs_write() and extendfile(), use vfs_bio_clrbuf() instead of clrbuf(). I think this is just a bogus optimization, but ffs always does it and msdosfs already did it in one place, and it is what I've tested. In msdosfs_write(), merge good bits from a comment in ffs_write(), and fix 1 style bug. In the main comment for msdosfs_pcbmap(), improve wording and catch up with 13 years of changes in the function. This comment belongs in VOP_BMAP.9 but that doesn't exist. In msdosfs_bmap(), return EFBIG if the requested cluster number is out of bounds instead of blindly truncating it, and fix many style bugs. Approved by: re (hrs)	2007-07-20 16:21:47 +00:00
Robert Watson	32f9753cfb	Eliminate now-unused SUSER_ALLOWJAIL arguments to priv_check_cred(); in some cases, move to priv_check() if it was an operation on a thread and no other flags were present. Eliminate caller-side jail exception checking (also now-unused); jail privilege exception code now goes solely in kern_jail.c. We can't yet eliminate suser() due to some cases in the KAME code where a privilege check is performed and then used in many different deferred paths. Do, however, move those prototypes to priv.h. Reviewed by: csjp Obtained from: TrustedBSD Project	2007-06-12 00:12:01 +00:00
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Tai-hwa Liang	61ad2e26ef	Fixing compilation bustage by removing references to opt_msdosfs.h. This auto-generated header file no longer exists since the removal of MSDOSFS_LARGE in sys/conf/options:1.574.	2007-01-30 08:05:04 +00:00
Craig Rodrigues	f458f2a553	Add a "-o large" mount option for msdosfs. Convert compile-time checks for #ifdef MSDOSFS_LARGE to run-time checks to see if "-o large" was specified. Test case provided by Oliver Fromme: truncate -s 200G test.img mdconfig -a -t vnode -f test.img -u 9 newfs_msdos -s 419430400 -n 1 /dev/md9 zip250 mount -t msdosfs /dev/md9 /mnt # should fail mount -t msdosfs -o large /dev/md9 /mnt # should succeed PR: 105964 Requested by: Oliver Fromme <olli lurza secnetix de> Tested by: trhodes MFC after: 2 weeks	2007-01-30 03:11:45 +00:00
Maxim Konovalov	1c5cf521ae	o Do not leave uninitialized birthtime: in MSDOSFSMNT_LONGNAME set birthtime to FAT CTime (creation time) and in the other cases set birthtime to -1. o Set ctime to mtime instead of FAT CTime which has completely different meaning. PR: kern/106018 Submitted by: Oliver Fromme MFC after: 1 month	2006-12-03 19:04:26 +00:00
Robert Watson	acd3428b7d	Sweep kernel replacing suser(9) calls with priv(9) calls, assigning specific privilege names to a broad range of privileges. These may require some future tweaking. Sponsored by: nCircle Network Security, Inc. Obtained from: TrustedBSD Project Discussed on: arch@ Reviewed (at least in part) by: mlaier, jmg, pjd, bde, ceri, Alex Lyashkov <umka at sevcity dot net>, Skip Ford <skip dot ford at verizon dot net>, Antoine Brodin <antoine dot brodin at laposte dot net>	2006-11-06 13:42:10 +00:00
Poul-Henning Kamp	3c960d9379	Replace slightly crummy fattime<->timespec conversion functions.	2006-10-24 11:14:05 +00:00
Jeff Roberson	89b0e10910	- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately. MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn	2006-02-01 00:25:26 +00:00
Tom Rhodes	9fc31f8a5f	Update incorrect comments here, there should not be a call to panic() over fs corruption. Discussed with: alfred, phk	2006-01-23 17:45:57 +00:00
Max Khon	710a9accfe	Do not assume that `char direntry::deExtension[3]' starts right after `char direntry::deName[8]' and access deExtension[] explicitly. Found by: Coverity Prevent(tm) CID: 350, 351, 352	2006-01-22 21:09:38 +00:00
Poul-Henning Kamp	7ce296cf04	Remove debug printout of major/minor numbers, print name instead.	2005-02-27 21:16:26 +00:00
Peter Edwards	72b3e305af	Unbreak a few filesystems for which vnode_create_vobject() wasn't being called in "open", causing mmap() to fail. Where possible, pass size of file to vnode_create_vobject() rather than having it find it out the hard way via VOP_LOOKUP Reviewed by: phk	2005-01-29 16:23:39 +00:00
Poul-Henning Kamp	83c6439714	Whitespace in vop_vector{} initializations.	2005-01-13 18:59:48 +00:00
Poul-Henning Kamp	0391e5a151	Wrap the bufobj operations in macros: BO_STRATEGY() and BO_WRITE()	2005-01-11 09:10:46 +00:00
Warner Losh	d167cf6f3a	/* -> /*- for copyright notices, minor format tweaks as necessary	2005-01-06 18:10:42 +00:00
Poul-Henning Kamp	4b44037433	Remove the de_devvp and stop VREF'ing it for every vnode we create.	2004-12-02 10:09:33 +00:00
Poul-Henning Kamp	aec0fb7b40	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to believe this would ever be feasible in the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and prototype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
Poul-Henning Kamp	6fde64c778	Mechanically change prototypes for vnode operations to use the new typedefs.	2004-12-01 12:24:41 +00:00
Poul-Henning Kamp	9c83534dd8	Make VOP_BMAP return a struct bufobj for the underlying storage device instead of a vnode for it. The vnode_pager does not and should not have any interest in what the filesystem uses for backend. (vfs_cluster doesn't use the backing store argument.)	2004-11-15 09:18:27 +00:00
Poul-Henning Kamp	9a135592e2	Move MSDOSFS to GEOM backing instead of DEVFS. For details, please see src/sys/ufs/ffs/ffs_vfsops.c 1.250.	2004-10-29 10:40:14 +00:00
Poul-Henning Kamp	d83b7498a4	Eliminate unnecessary KASSERTs. Don't use bp->b_vp in VOP_STRATEGY: the vnode is passed in as an argument.	2004-10-27 06:48:21 +00:00
Colin Percival	56f21b9d74	Rename suser_cred()'s PRISON_ROOT flag to SUSER_ALLOWJAIL. This is somewhat clearer, but more importantly allows for a consistent naming scheme for suser_cred flags. The old name is still defined, but will be removed in a few days (unless I hear any complaints...) Discussed with: rwatson, scottl Requested by: jhb	2004-07-26 07:24:04 +00:00
Tim J. Robbins	3bc482ec1c	By popular request, add a workaround that allows large (>128GB or so) FAT32 filesystems to be mounted, subject to some fairly serious limitations. This works by extending the internal pseudo-inode-numbers generated from the file's starting cluster number to 64-bits, then creating a table mapping these into arbitrary 32-bit inode numbers, which can fit in struct dirent's d_fileno and struct vattr's va_fileid fields. The mappings do not persist across unmounts or reboots, so it's not possible to export these filesystems through NFS. The mapping table may grow to be rather large, and may grow large enough to exhaust kernel memory on filesystems with millions of files. Don't enable this option unless you understand the consequences.	2004-07-03 13:22:38 +00:00
John Baldwin	91d5354a2c	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
Bruce Evans	be039c568f	Fixed some minor style bugs in rev.1.144. All related to msdosfs_advlock() (mainly unsorting). There were no changes related to the dirty flag here. The reference NetBSD implementation put msdosfs_advlock() in a different place. This commit only moves its declarations and changes some of the function body to be like the NetBSD version.	2003-12-29 10:12:02 +00:00
Tom Rhodes	cede1f563c	Make msdosfs support the dirty flag in FAT16 and FAT32. Enable lockf support. PR: 55861 Submitted by: Jun Su <junsu@m-net.arbornet.org> (original version) Reviewed by: make universe	2003-12-26 17:19:19 +00:00


193 Commits