freebsd-skq

Author	SHA1	Message	Date
Pawel Jakub Dawidek	10bcafe9ab	Move vnode-to-file-handle translation from vfs_vptofh to vop_vptofh method. This way we may support multiple structures in v_data vnode field within one file system without using black magic. Vnode-to-file-handle should be VOP in the first place, but was made VFS operation to keep interface as compatible as possible with SUN's VFS. BTW. Now Solaris also implements vnode-to-file-handle as VOP operation. VFS_VPTOFH() was left for API backward compatibility, but is marked for removal before 8.0-RELEASE. Approved by: mckusick Discussed with: many (on IRC) Tested with: ufs, msdosfs, cd9660, nullfs and zfs	2007-02-15 22:08:35 +00:00
Mike Pritchard	5cb6b1effb	Get the vfs giant lock before calling nfs_access. Reviewed by: mohan	2007-02-13 03:27:45 +00:00
Robert Watson	c2a9c542a9	Push Giant a bit further off the NFS server in a number of straightforward cases by converting from unconditional acquisition of Giant around vnode operations to conditional acquisition: - Remove nfsrv_access_withgiant(), and cause nfsrv_access() to now assert that Giant will be held if it is required for the vnode. - Add nfsrv_fhtovp_locked(), which will drop the NFS server lock if required, and modify nfsrv_fhtovp() to conditionally acquire Giant if required. - In the VOPs not dealing with more than one vnode at a time (i.e., not involving a lookup), conditionally acquire Giant. This removes Giant use for MPSAFE file systems for a number of quite important RPCs, including getattr, read, write. It leaves unconditional Giant acquisitions in vnode operations that interact with the name space or more than one vnode at a time as these require further work. Tested by: kris Reviewed by: kib	2006-11-24 11:53:16 +00:00
Pawel Jakub Dawidek	270626bc42	Protect nfsm_srvpathsiz() call with the nfsd_mtx lock. Reviewed by: mohans	2006-11-20 07:32:52 +00:00
Konstantin Belousov	1c99939f08	Fix leak in NAMEI zone caused by nfs server when VOP_RENAME fails. Submitted by: Padma Bhooma <pbhooma at panasas com> Reviewed by: bde Approved by: pjd (mentor) MFC after: 1 week	2006-10-26 12:41:53 +00:00
Konstantin Belousov	273147358f	Temporary workaround to prevent leak of Giant from nfsd when calling lookup(). Reviewed by: tegge Tested by: "Arno J. Klaassen" <arno at heho snv jussieu fr>, "Rong-en Fan" <grafan at gmail com>, Dmitriy Kirhlarov <dimma at higis ru>, Dmitry Pryanishnikov <dmitry at atlantis dp ua> MFC after: 1 week Approved by: kan, pjd (mentors)	2006-06-05 14:48:02 +00:00
Jeff Roberson	3bbd6d8ae6	- Release the references acquired by VOP_GETWRITEMOUNT and vfs_getvfs(). Discussed with: tegge Tested by: kris Sponsored by: Isilon Systems, Inc.	2006-03-31 03:54:20 +00:00
Jeff Roberson	e64df05c33	- Reorder vrele calls after vput calls to prevent lock order reversals between leaf and directory locks. Found by: kris Sponsored by: Isilon Systems, Inc.	2006-03-12 04:59:04 +00:00
Jeff Roberson	89b0e10910	- Reorder calls to vrele() after calls to vput() when the vrele is a directory. vrele() may lock the passed vnode, which in these cases would give an invalid lock order of child -> parent. These situations are deadlock prone although do not typically deadlock because the vrele is typically not releasing the last reference to the vnode. Users of vrele must consider it as a call to vn_lock() and order it appropriately. MFC After: 1 week Sponsored by: Isilon Systems, Inc. Tested by: kkenn	2006-02-01 00:25:26 +00:00
Christian S.J. Peron	7a3e891951	Manage the ucred for the NFS server using the crget/crfree API defined in kern_prot.c. This API handles reference counting among many other things. Notably, if MAC is compiled into the kernel, it will properly initialize the MAC labels when the ucred is allocated. This work is in preparation for a new MAC entry point which will be responsible for properly initializing policy specific labels for the NFS server credential. Utilization of the crfree/crget APIs reduces the complexity associated with this label's management. Submitted by: green (with changes) [1] Obtained from: TrustedBSD Project Discussed with: rwatson, alfred [1] I moved the ucred allocation outside the scope of the NFS server lock to prevent M_WAITOK allocations from occurring with non-sleep-able locks held. Additionally, to reduce complexity, the ucred persists as long as the NFS server descriptor.	2006-01-28 19:24:40 +00:00
Tom Rhodes	129518ec2c	Revert my previous commit. Proved I'm not that bright at times: jhb	2006-01-23 21:06:22 +00:00
Tom Rhodes	cb5c1ec725	Fix indentation. Prodded by: stefanf, ru, njl (in that order)	2006-01-23 17:41:43 +00:00
Tom Rhodes	9c013503bb	Remove some dead code. Found with: Coverity Prevent(tm)	2006-01-21 12:10:33 +00:00
Gleb Smirnoff	19bf94288f	Keep locks consistent before goto. Reported by: pho Reviewed by: mohans	2005-10-27 19:02:34 +00:00
Robert Watson	ad77d81512	NFS write gathering defers execution of NFS server write requests to wait to see if additional write requests will arrive that can be coalesced and clustered with earlier ones. When doing so, it must determine whether the two requests are made by credentials with the same access rights, so as not to coalesce improperly. NFSW_SAMECRED() implements a test of two credentials using a binary compare. Replace NFSW_SAMECRED() macro with nfsrv_samecred() function, which is aware of the contents and layout of a struct ucred, rather than a simple binary compare. While the binary compare works when ucred is simply a zero'd and embedded 'struct ucred' in the NFS descriptor, it will work less well when the ucred associated with an NFS descriptor is "real", so has defined and populated reference count, mutex, etc. MFC after: 1 week Obtained from: TrustedBSD Project	2005-04-17 16:25:36 +00:00
Poul-Henning Kamp	c62801a7f8	Don't try to create vnode_pager objects on other filesystems vnodes, either they did it themselves or it won't happen.	2005-01-24 22:09:13 +00:00
Paul Saab	f1b3bfb348	Now that we have a non blocking version of nfsm_dissect(), change all the nfsm_dissect() calls (done under the NFSD lock) to nfsm_dissect_nonblock(). Submitted by: Mohan Srinivasan	2005-01-19 22:53:40 +00:00
Poul-Henning Kamp	8df6bac4c7	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to be the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
Warner Losh	c398230b64	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
Robert Watson	29af382686	Correct a bug in nfsrv_create() where a call to nfsrv_access() might be made holding the NFS server mutex. To clean this up, introduce a version of the function, nfsrv_access_withgiant(), that expects the NFS server mutex to already have been dropped and Giant acquired. Wrap nfsrv_access() around this. This permits callers to more efficiently check access if they're in a code block performing VFS operations, and can be substituted for the nfsrv_access() call that triggered this bug. PR: 73807, 73208 MFC after: 1 week	2004-11-11 21:30:52 +00:00
Poul-Henning Kamp	494eb176e7	Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj) also VI_MTX() to BO_MTX(), Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.	2004-10-22 08:47:20 +00:00
Robert Watson	ae8c2fa228	Correct several instances where calls to vfs_getvfs() resulting in failure in the NFS server would result in a leaked instance of the NFS server subsystem lock. Liberally sprinkle assertions in all target labels for error unwinding to assert the desired locking state. RELENG_5_3 candidate. MFC after: 3 days Reported by: Wilkinson, Alex <alex dot wilkinson at dsto dot defence dot gov dot au>	2004-10-18 11:23:11 +00:00
Robert Watson	e2d2098653	Convert a mtx_lock(&Giant) to a mtx_unlock(&Giant) in nfsrv_link() to prevent leakage of Giant. With INVARIANTS, this results in an assertion failure following execution of the RPC. Without INVARIANTS, it could result in problems if the NFS server is killed causing nfsd to return to user space holding Giant. Feet provided by: brueffer	2004-08-25 16:52:59 +00:00
Poul-Henning Kamp	f3732fd15b	Second half of the dev_t cleanup. The big lines are: NODEV -> NULL NOUDEV -> NODEV udev_t -> dev_t udev2dev() -> findcdev() Various minor adjustments including handling of userland access to kernel space struct cdev etc.	2004-06-17 17:16:53 +00:00
Robert Watson	69af1dccdc	Release NFS subsystem lock and acquire Giant when calling into vn_start_write().	2004-05-31 19:08:22 +00:00
Robert Watson	73a4c21f28	One more case where we want to drop the NFS server lock and acquire Giant when entering VFS. Discovered by code inspection; still not hit without debug.mpsafenet=1. Reported by: bmilekic	2004-05-30 22:59:54 +00:00
Robert Watson	53f137e9d3	Acquire Giant around two more cases when calling into VFS to vput() a vnode. Not bumped into with asserts in the main tree because we run the NFS server with Giant by default. Discovered by inspection. Complete annotations of Giant acquisition/release to note that it's only because of VFS that we acquire Giant in most places in the NFS server.	2004-05-30 22:41:43 +00:00
Robert Watson	e95fb8576b	Don't release Giant until after the call to vput() in nfsrv_setattr(). Unless running with debug.mpsafenet=1, this was not actually a problem.	2004-05-29 15:52:39 +00:00
Robert Watson	9a7563cf2d	Call nfsm_clget_nolock() instead of nfsm_clget() when holding the NFS subsystem lock to avoid tripping over an assertion regarding whether the lock is held or not. This is likely to be the cause of a panic tripped over by Andrea Campi.	2004-05-27 20:34:04 +00:00
Robert Watson	1ee624b31d	The socket code upcalls into the NFS server using the so_upcall mechanism so that early processing on mbufs can be performed before a context switch to the NFS server threads. Because of this, if the socket code is running without Giant, the NFS server also needs to be able to run the upcall code without relying on the presence of Giant. This change modifies the NFS server to run using a "giant code lock" covering operation of the whole subsystem. Work is in progress to move to data-based locking as part of the NFSv4 server changes. Introduce an NFS server subsystem lock, 'nfsd_mtx', and a set of macros to operate on the lock: NFSD_LOCK_ASSERT() Assert nfsd_mtx owned by current thread NFSD_UNLOCK_ASSERT() Assert nfsd_mtx not owned by current thread NFSD_LOCK_DONTCARE() Advisory: this function doesn't care NFSD_LOCK() Lock nfsd_mtx NFSD_UNLOCK() Unlock nfsd_mtx Constify a number of global variables/structures in the NFS server code, as they are not modified and contain constants only: nfsrvv2_procid nfsrv_nfsv3_procid nonidempotent nfsv2_repstat nfsv2_type nfsrv_nfsv3_procid nfsrvv2_procid nfsrv_v2errmap nfsv3err_null nfsv3err_getattr nfsv3err_setattr nfsv3err_lookup nfsv3err_access nfsv3err_readlink nfsv3err_read nfsv3err_write nfsv3err_create nfsv3err_mkdir nfsv3err_symlink nfsv3err_mknod nfsv3err_remove nfsv3err_rmdir nfsv3err_rename nfsv3err_link nfsv3err_readdir nfsv3err_readdirplus nfsv3err_fsstat nfsv3err_fsinfo nfsv3err_pathconf nfsv3err_commit nfsrv_v3errmap There are additional structures that should be constified but due to their being passed into general purpose functions without const arguments, I have not yet converted. In general, acquire nfsd_mtx when accessing any of the global NFS structures, including struct nfssvc_sock, struct nfsd, struct nfsrv_descript. Release nfsd_mtx whenever calling into VFS, and acquire Giant for calls into VFS.
Giant is not required for any part of the operation of the NFS server with the exception of calls into VFS. Giant will never be acquired in the upcall code path. However, it may operate entirely covered by Giant, or not. If debug.mpsafenet is set to 0, the system calls will acquire Giant across all operations, and the upcall will assert Giant. As such, by default, this enables locking and allows us to test assertions, but should not cause any substantial new amount of code to be run without Giant. Bugs should manifest in the form of lock assertion failures for now. This approach is similar (but not identical) to modifications to the BSD/OS NFS server code snapshot provided by BSDi as part of their SMPng snapshot. The strategy is almost the same (single lock over the NFS server), but differs in the following ways: - Our NFS client and server code bases don't overlap, which means both fewer bugs and easier locking (thanks Peter!). Also means NFSD_() as opposed to NFS_(). - We make broad use of assertions, whereas the BSD/OS code does not. - Made slightly different choices about how to handle macros building packets but operating with side effects. - We acquire Giant only when entering VFS from the NFS server daemon threads. - Serious bugs in BSD/OS implementation corrected -- the snapshot we received was clearly a work in progress. Based on ideas from: BSDi SMPng Snapshot Reviewed by: rick@snowhite.cis.uoguelph.ca Extensive testing by: kris	2004-05-24 04:06:14 +00:00
Maxime Henrion	7cc35e41e7	Don't send the available space as is in the FSSTAT call. Under FreeBSD, we can have a negative available space value, but the corresponding fields in the NFS protocol are unsigned. So truncate the value to 0 if it's negative, so that the client doesn't receive absurdly high values. Tested by: cognet	2004-04-12 13:02:21 +00:00
Warner Losh	2fcbca0d85	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 05:00:01 +00:00
Poul-Henning Kamp	4d453ef101	Properly vector all bwrite() and BUF_WRITE() calls through the same path and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().	2004-03-11 18:02:36 +00:00
Poul-Henning Kamp	63b92d134e	When grabbing vnodes to service NFS requests, make sure to call vn_start_write() early to avoid snapshot deadlocks. By: mckusick	2003-10-24 18:36:49 +00:00
Ian Dowse	92daf89227	Fix a bug in nfsrv_read() that caused the replies to certain NFSv3 short read operations at the end of a file to not have the "eof" flag set as they should. The problem is that the requested read count was compared against the rounded-up reply data length instead of the actual reply data length. This bug appears to have been introduced in revision 1.78 (June 1999). It causes first-time reads of certain file sizes (e.g. 4094 bytes) to fail with EIO on a RedHat 9.0 NFSv3 client. MFC after: 1 week	2003-06-24 19:04:26 +00:00
Kirk McKusick	98530110a2	Increase the size of the NFS server hash table to improve performance when serving up more than about 32 active files. For details see section 6.3 (pg 111) of Daniel Ellard and Margo Seltzer, ``NFS Tricks and Benchmarking Traps'' in the Proceedings of the Usenix 2003 Freenix Track, June 9-14, 2003 pg 101-114. Obtained from: Daniel Ellard <ellard@eecs.harvard.edu> Sponsored by: DARPA & NAI Labs.	2003-06-21 21:01:44 +00:00
Don Lewis	263c8abeb9	Beat vnode locking in the NFS server code into submission. This change is not pretty, but it fixes the code so that it no longer violates the vnode locking rules in the VFS API and doesn't trip any of the locking assertions enabled by the DEBUG_VFS_LOCKS kernel configuration option. There is one report that this patch fixed a "locking against myself" panic on an NFS server that was tripped by a diskless client. Approved by: re (scottl)	2003-05-25 06:17:33 +00:00
Alan Cox	b6e48e0372	- Acquire the vm_object's lock when performing vm_object_page_clean(). - Add a parameter to vm_pageout_flush() that tells vm_pageout_flush() whether its caller has locked the vm_object. (This is a temporary measure to bootstrap vm_object locking.)	2003-04-24 04:31:25 +00:00
Jeff Roberson	c033bdc013	- Lock bufs before inspecting their flags.	2003-03-13 07:05:22 +00:00
Jeff Roberson	17661e5ac4	- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock. Reviewed by: arch, mckusick	2003-02-25 03:37:48 +00:00
Warner Losh	a163d034fa	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
Alfred Perlstein	44956c9863	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
Jens Schweikhardt	9d5abbddbf	Correct typos, mostly s/ a / an / where appropriate. Some whitespace cleanup, especially in troff files.	2003-01-01 18:49:04 +00:00
Matthew Dillon	45587e2514	Abstract-out the constants for the sequential heuristic. No operational changes. MFC after: 1 day	2002-12-28 20:28:10 +00:00
Ian Dowse	2f07688e82	In the NFSv3 `fsinfo' procedure reply, don't claim that we support 32k read and write operations on datagram sockets when in fact we reject requests larger than 16k. It must be the case that virtually all clients use data sizes of 16k or less for UDP transport (FreeBSD's client defaults to 8k and never exceeds 16k), as this bug has been present ever since NFSv3 support was added. Reported by: Senthil <lihtnes78@netscape.net> Reviewed by: dillon Approved by: re MFC-after: 1 week	2002-12-05 16:58:11 +00:00
Jeff Roberson	24b50116ed	- Introduce a new macro, since that's what nfs loves, called nfsm_srvpathsiz. This macro plucks a length out of an rpc request and verifies that its size does not exceed NFS_MAXPATHLEN. If it does it generates an ENAMETOOLONG response. - Use this macro, and the existing nfsm_srvnamsiz macro in two places where we deal with paths passed in by the client. This fixes a Linux interoperability bug. Linux was sending oversized path components which would cause us to ignore the request altogether. This causes Linux to hang indefinitely while it waits for a response. This could still happen in other cases where we error out with EBADRPC. Sponsored by: Isilon Systems, Inc. Reviewed by: alfred, fabbri@isilon.com, neal@isilon.com	2002-10-31 22:35:03 +00:00
Robert Watson	60cfb7c64a	Correct a problem wherein NFS servers running NFSv2 would not return certain classes of failure responses to the client during a failed remove operation. Submitted by: Ian Dowse <iedowse@maths.tcd.ie>	2002-10-03 21:50:37 +00:00
Jeff Roberson	d3b85e1c8b	- Use incore() instead of gbincore() so we don't have to acquire the vnode interlock.	2002-09-25 02:39:39 +00:00
Jeff Roberson	e6e370a7fe	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
Matthew Dillon	3d8f797ac1	Convert old style (type foo *)0 casts to NULLs PR: kern/40360 Requested by: Hiten Pandya via direct email	2002-07-11 17:54:58 +00:00
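
Illustration for commit 7cc35e41e7 (FSSTAT available space): the message says a negative available-space value must be truncated to 0 because the NFS protocol fields are unsigned. The sketch below is only a minimal stand-alone rendering of that idea, not the code from the commit. The function name clamp_avail_bytes and its parameters are hypothetical and do not appear in the FreeBSD source.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the FreeBSD tree): convert a signed
 * count of available blocks into the unsigned byte count that an
 * FSSTAT-style reply would carry.  A negative count is reported as 0
 * instead of being reinterpreted as a very large unsigned number.
 */
static uint64_t
clamp_avail_bytes(int64_t avail_blocks, uint64_t block_size)
{
	if (avail_blocks < 0)
		return (0);
	return ((uint64_t)avail_blocks * block_size);
}
```

Without the check, casting a value such as -5 directly to an unsigned 64-bit type would wrap around to a number close to 2^64, which matches the "absurdly high values" the commit message describes.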


170 Commits