freebsd-skq

Author	SHA1	Message	Date
jeff	e4eab9fb69	- cache_lookup() relocks the parent in the DOTDOT case for us. Spotted by: phk Sponsored by: Isilon Systems, Inc.	2005-04-14 07:08:34 +00:00
jeff	afab3762a0	- Change all filesystems and vfs_cache to relock the dvp once the child is locked in the ISDOTDOT case. See vfs_lookup.c r1.79 for details. Sponsored by: Isilon Systems, Inc.	2005-04-13 10:59:09 +00:00
jeff	97c40ebd49	- LK_NOPAUSE is a nop now. Sponsored by: Isilon Systems, Inc.	2005-03-31 04:37:09 +00:00
jeff	ca1e4c2fe0	- Remove wantparent, it is no longer necessary. An assert in vfs_lookup.c prevents any callers from doing a modifying op without LOCKPARENT or WANTPARENT.	2005-03-29 13:09:42 +00:00
jeff	141aba2c7b	- cache_lookup() now locks the new vnode for us to prevent some races. Remove redundant code. Sponsored by: Isilon Systems, Inc.	2005-03-29 13:00:37 +00:00
jeff	5f8bc80203	- We no longer have to bother with PDIRUNLOCK, lookup() handles it for us. - Network filesystems are written with a special idiom that checks the cache first, and may even unlock dvp before discovering that a network round-trip is required to resolve the name. I believe dvp is prevented from being recycled even in the forced unmount case by the shared lock on the mount point. If not, this code should grow checks for VI_DOOMED after it relocks dvp or it will access NULL v_data fields. Sponsored by: Isilon Systems, Inc.	2005-03-28 09:29:58 +00:00
jeff	56f1fc7189	- Update vfs_root implementations to match the new prototype. None of these filesystems will support shared locks until they are explicitly modified to do so. Careful review must be done to ensure that this is safe for each individual filesystem. Sponsored by: Isilon Systems, Inc.	2005-03-24 07:39:03 +00:00
ps	114057c633	- The NFS client was incorrectly masking SIGSTOP (which is non-maskable). - The NFS client needs to guard against spurious wakeups while waiting for the response. ltrace causes the process under question to wakeup (possibly from ptrace()), which causes NFS to wakeup from tsleep without the response being delivered. Submitted by: Mohan Srinivasan	2005-03-23 22:10:10 +00:00
das	89bc04ad2d	Don't brelse(bp) if bp is null. Also, eliminate some redundancy and dead code. Found by: Coverity Prevent analysis tool	2005-03-18 21:23:32 +00:00
phk	172eba2632	Use vfs_hash.	2005-03-16 11:28:19 +00:00
jmg	64c69bfb4e	MFp4: use the function to fix the packet header length instead of rolling our own...	2005-03-16 08:13:08 +00:00
jeff	29a4f75b9b	- VOP_INACTIVE should no longer drop the vnode lock. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:15:36 +00:00
jeff	5bd51ec6e6	- The VI_DOOMED flag now signals the end of a vnode's relationship with the filesystem. Check that rather than VI_XLOCK. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:14:56 +00:00
jeff	5f59e0cd19	- It is no longer necessary to lock and unlock the vnode in nfs_close() as the top level does this for us now. Sponsored by: Isilon Systems, Inc.	2005-03-13 12:11:23 +00:00
ps	d4a5a3bc89	Minor cleanup in nfs_request() and removal of a comment that doesn't reflect reality. Submitted by: Mohan Srinivasan	2005-02-26 18:55:36 +00:00
phk	33d6741fda	vp->v_id is a private field for the vfs namecache and it is a big mistake that NFS ever started using it. Long time ago I added the necessary vhold()/vdrop() calls to replace it, but forgot to remove the v_id code. Do it now.	2005-02-22 14:52:00 +00:00
phk	66dfd63961	Try to unbreak the vnode locking around vop_reclaim() (based mostly on patch from kan@). Pull bufobj_invalbuf() out of vinvalbuf() and make g_vfs call it on close. This is not yet a generally safe function, but for this very specific use it is safe. This solves the problem with buffers not being flushed by unmount or after failed mount attempts.	2005-02-19 11:44:57 +00:00
ps	f6b334da2c	Fix for a potential NFS client race where shared data is updated from base context as well as the socket callback. Submitted by: Mohan Srinivasan	2005-02-18 23:41:39 +00:00
jhb	685dd13b54	Drop Giant before calling kthread_exit().	2005-02-07 18:21:50 +00:00
rwatson	39c4afac56	Style cleanup for O_DIRECT sysctl comment introduced in nfs_vnops.c:1.242.	2005-01-29 23:19:08 +00:00
phk	1b21636022	Make filesystems get rid of their own vnodes vnode_pager object in VOP_RECLAIM().	2005-01-28 14:42:17 +00:00
phk	d0599e9c31	Create a vnode_pager object when a file is opened.	2005-01-24 23:03:29 +00:00
phk	8dba90be16	Remove unused cred arg from nfs_vinvalbuf() and many bogus arguments passed for it.	2005-01-24 12:31:06 +00:00
peter	e4129c1fb1	Mostly back out rev 1.33 from quite some time ago, and the followup fixes and tweaks. The code was actually quite broken because it discarded the upper bits of the 64 bit division. We only had a 50% chance of scaling up the blocksize for large NFS client mounts when it was needed. For 5.x and beyond, this was harmless because we could represent the result in either case. For 4.x this was a big problem though. (4.x also has a df(1) bug to compound the problem)	2005-01-18 21:59:44 +00:00
phk	cc0cbc6b34	Eliminate unused and unnecessary "cred" argument from vinvalbuf()	2005-01-14 07:33:51 +00:00
brian	2b05c4cf78	Include opt_bootp.h for BOOTP_NFSROOT PR: 73183 Submitted by: Darrin Smith sdar at salseast dot org MFC after: 7 days	2005-01-12 12:42:46 +00:00
phk	5a497775d6	Add BO_SYNC() and add a default which uses the secret vnode pointer and VOP_FSYNC() for now.	2005-01-11 10:43:08 +00:00
phk	da2718f1af	Remove the unused credential argument from VOP_FSYNC() and VFS_SYNC(). I'm not sure why a credential was added to these in the first place, it is not used anywhere and it doesn't make much sense: The credentials for syncing a file (ability to write to the file) should be checked at the system call level. Credentials for syncing one or more filesystems ("none") should be checked at the system call level as well. If the filesystem implementation needs a particular credential to carry out the syncing it would logically have to be the cached mount credential, or a credential cached along with any delayed write data. Discussed with: rwatson	2005-01-11 07:36:22 +00:00
imp	a50ffc2912	/* -> /*- for license, minor formatting changes	2005-01-07 01:45:51 +00:00
ps	86f9a5d44a	If the NFS/TCP stream is out of sync between the client and server, and if the client (erroneously) reads the RPC length as 0 bytes, the client can loop around in the socket callback. Explicitly check for the length being 0 case and teardown/re-connect. Submitted by: Mohan Srinivasan	2005-01-05 23:21:13 +00:00
ps	ad001884ff	Turn NFS directio off until the stability issues are resolved.	2004-12-23 21:30:30 +00:00
ps	0a2e8227c4	Change the NFS sillyrename convention so that we won't run out of sillyrenames (which were limited to 58 per pid per directory, for no good reason). The new format of sillyrenames looks like .nfs.0000b31a.00d24.4, where 0000b31a is the ticks field and 00d24 is the pid field. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-16 19:28:37 +00:00
ps	7c0944d56c	First cut of NFS direct IO support. - NFS direct IO completely bypasses the buffer and page caches. If a file is open for direct IO all caching is disabled. - Direct IO for Directories will be addressed later. - 2 new NFS directio related sysctls are added. One is a knob to disable NFS direct IO completely (direct IO is enabled by default). The other is to disallow mmaped IO on a file that has at least one O_DIRECT open (see the comment in nfs_vnops.c for more details). The default is to allow mmaps on a file that has O_DIRECT opens. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-15 22:20:22 +00:00
marcel	4b90107750	Revert rev 1.233. The null-pointer function call (a dereference on ia64) was not the result of a change in the vector operations. It was caused by the NFS locking code using a FIFO and those bypassing the vnode. This indirectly caused the panic. The NFS locking code has been changed. Requested by: phk	2004-12-11 21:36:29 +00:00
ps	b4a200824a	In nfs_rename(), skip the otw rename operation if the fsync (to either src or dst) fails. This closes a potential data loss case (where the fsync failed with ENOSPC, for example). Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-10 03:29:02 +00:00
ps	f46c52047f	Store a hint in the nfsnode to detect sequential access of the file. Kick off a readahead only when sequential access is detected. This eliminates wasteful readaheads in random file access. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Obtained from: Yahoo!	2004-12-10 03:27:12 +00:00
ps	81f484b21d	Fix for a Lock Order Reversal in the nfs_flush() path, between the vnode interlock and the proc lock. Reported by: marcel Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-07 21:16:32 +00:00
phk	849729c60b	Don't clobber mnt_stat.f_mntonname	2004-12-07 14:26:39 +00:00
phk	4a639d6164	The remaining part of nmount/omount/rootfs mount changes. I cannot sensibly split the conversion of the remaining three filesystems out from the root mounting changes, so in one go: cd9660: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() nfs(client): Convert to nmount (the simple way, mount_nfs(8) is still necessary). Add omount compat shims. Drop COMPAT_PRELITE2 mount arg compatibility. ffs: Convert to nmount. Add omount compat shims. Remove dedicated rootfs mounting code. Use vfs_mountedfrom() Rely on vfs_mount.c calling VFS_STATFS() Remove vfs_omount() method, all filesystems are now converted. Remove MNTK_WANTRDWR, handling RO/RW conversions is a filesystem task, and they all do it now. Change rootmounting to use DEVFS trampoline: vfs_mount.c: Mount devfs on /. Devfs needs no 'from' so this is clean. symlink /dev to /. This makes it possible to lookup /dev/foo. Mount "real" root filesystem on /. Surgically move the devfs mountpoint from under the real root filesystem onto /dev in the real root filesystem. Remove now unnecessary getdiskbyname(). kern_init.c: Don't do devfs mounting and rootvnode assignment here, it was already handled by vfs_mount.c. Remove now unused bdevvp(), addaliasu() and addalias(). Put the few necessary lines in devfs where they belong. This eliminates the second-last source of bogo vnodes, leaving only the lemming-syncer. Remove rootdev variable, it doesn't give meaning in a global context and was not trustworthy anyway. Correct information is provided by statfs(/).	2004-12-07 08:15:41 +00:00
ps	1d9d717d90	Always issue wakeups() to the NFS requestors under the mutex to close all potential cases of missed wakeups. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-07 03:39:52 +00:00
ps	eeccf3813d	Rewrite of the NFS client's reply handling. We now have NFS socket upcalls which do RPC header parsing and match up the reply with the request. NFS calls now sleep on the nfsreq structure. This enables us to eliminate the NFS recvlock. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 21:11:15 +00:00
ps	8eaa4f53e4	2 fixes that improve on the consistency of the NFS client cache. - Change the cached mtime to a 'struct timespec' from a time_t. Improving the precision of the cached mtime tightens up NFS' "close-to-open" consistency considerably. - Always force an over-the-wire consistency check from nfs_open() (unless the file is marked modified). This further improves NFS' "close-to-open" consistency. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 19:18:00 +00:00
ps	aa4aa62af0	Serialize NFS vinvalbuf operations by acquiring/upgrading to the vnode EXCLUSIVE lock. This prevents threads from adding pages to the vnode while an invalidation is in progress, closing potential races. In the bioread() path, callers acquire the SHARED vnode lock - so while an invalidate was in progress, it was possible to fault in new pages onto the vnode causing the invalidation to take a while or fail. We saw these races at Yahoo! with very large files+heavy concurrent access. Forcing an upgrade to EXCLUSIVE lock before doing the invalidation closes all these races. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 18:52:28 +00:00
ps	5feadd3eba	Add non-blocking versions of nfsm_dissect() and friends, for use from socket callbacks or similar callers, from both the NFS client and the server. Instituted nfsm_dissect_nonblock(), nfsm_dissect_xx_nonblock(). And nfsm_disct() now takes an extra M_TRYWAIT/M_DONTWAIT argument. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 17:33:52 +00:00
ps	ebd6438ae1	- If all data has been committed to stable storage on the server, it is safe to turn off the nfsnode's NMODIFIED flag. - Move the check for signals to the top of the loop where we loop around the dirty buffers on the vnode, scheduling writes. This ensures that we'll break out of the flush operation on reception of a signal. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com	2004-12-06 16:35:58 +00:00
rwatson	a98750ac1b	Correct a typo in a comment.	2004-12-06 16:11:25 +00:00
phk	753d615ec0	For reasons unknown, the nfs locking code used a fifo to send requests to userland and a dedicated system call to get replies. The vnode-bypass of fifos broke this into a panic. Ditch all the magic and create a device /dev/nfslock instead, and use that for both directions. Apart from the shorter path, this is also faster because the device driver runs Giant free using the vnode bypass. Noticed by: marcel	2004-12-06 08:31:32 +00:00
rwatson	6b017b90b9	Convert GIANT_REQUIRED; in nfs_mountroot() to NET_ASSERT_GIANT(), and annotate that nfs_mountroot assumes it is OK to step on the values in the global NFSv3 diskless structure as the mountroot function is called during a serialized part of the boot, before any other NFS client activity occurs. MFC after: 2 weeks	2004-12-05 22:53:17 +00:00
rwatson	22be685755	Convert a GIANT_REQUIRED; into a NET_ASSERT_GIANT();, as sockets are now only conditionally protected by Giant based on debug.mpsafenet.	2004-12-05 22:50:09 +00:00
phk	6c14f71ef7	VFS_STATFS(mp, ...) is mostly called with &mp->mnt_stat, but a few cases don't. Most of the implementations have grown weeds for this so they copy some fields from mnt_stat if the passed argument isn't that. Fix this the cleaner way: Always call the implementation on mnt_stat and copy that in toto to the VFS_STATFS argument if different.	2004-12-05 22:41:02 +00:00
marcel	8b42e21d12	Fix null-pointer indirect function calls introduced in the previous commit. In the new world order, the transitive closure on the vector operations is not precomputed. As such, it's unsafe to actually use any of the function pointers in an indirect function call. They can be null, and we need to use the default vector in that case. This is mostly a quick fix for the four function pointers that are ed explicitly. A more generic or scalable solution is likely to see the light of day. No pathos on: current@	2004-12-05 22:30:28 +00:00
phk	59f305606c	Back when VOP_* was introduced, we did not have new-style struct initializations but we did have lofty goals and big ideals. Adjust to more contemporary circumstances and gain type checking. Replace the entire vop_t frobbing thing with properly typed structures. The only casualty is that we can not add a new VOP_ method with a loadable module. History has not given us reason to believe this would ever be feasible in the first place. Eliminate in toto VOCALL(), vop_t, VNODEOP_SET() etc. Give coda correct prototypes and function definitions for all vop_()s. Generate a bit more data from the vnode_if.src file: a struct vop_vector and prototype typedefs for all vop methods. Add a new vop_bypass() and make vop_default be a pointer to another struct vop_vector. Remove a lot of vfs_init since vop_vector is ready to use from the compiler. Cast various vop_mumble() to void * with uppercase name, for instance VOP_PANIC, VOP_NULL etc. Implement VCALL() by making vdesc_offset the offsetof() the relevant function pointer in vop_vector. This is disgusting but since the code is generated by a script comparatively safe. The alternative for nullfs etc. would be much worse. Fix up all vnode method vectors to remove casts so they become typesafe. (The bulk of this is generated by scripts)	2004-12-01 23:16:38 +00:00
phk	cb64ed501e	Remove redundant functions (repo-copied from nfsclient) for dealing with fifos.	2004-12-01 20:18:56 +00:00
phk	ab549174e2	Scripted modification of vop_* prototypes to use typedefs.	2004-12-01 19:08:40 +00:00
phk	4eaab0b383	Add missing #include	2004-12-01 07:34:08 +00:00
ps	3601987765	Fix for a race between lookup and readdirplus, that causes a deadlock (with NFS exclusive vnode locks enabled). Lookup grabs the parent's lock and wants to lock child. Readdirplus locks the child and wants to lock parent (for loading the attrs for ".."). The fix is to not load the attrs for ".." in readdirplus. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-12-01 06:51:07 +00:00
ps	531cb416ae	Clean all dirty pages (dirtied by mmap'ed writes) in nfs_close(). This closes a major hole in close-to-open consistency support. Added a new sysctl so that this can be disabled for single NFS client applications with very large amounts of mmap'ed IO (for performance). Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-12-01 06:48:54 +00:00
ps	69d7e65011	Fix for a (blocks) underrun bug where negative values were being returned to df from a statfs call, causing df to print negative values. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-12-01 06:42:21 +00:00
ps	2b85447398	Fix for a bug in nfs_mkdir() that called vrele() instead of vput() in the error cases, causing panics. Submitted by: Mohan Srinivasan mohans at yahoo-inc dot com Reviewed by: rwatson	2004-11-29 23:05:30 +00:00
jeff	9caab2e843	- Eliminate the acquisition and release of the bqlock in bremfree() by setting the B_REMFREE flag in the buf. This is done to prevent lock order reversals with code that must call bremfree() with a local lock held. This also reduces overhead by removing two lock operations per buf for fsync() and similar. - Check for the B_REMFREE flag in brelse() and bqrelse() after the bqlock has been acquired so that we may remove ourself from the free-list. - Provide a bremfreef() function to immediately remove a buf from a free-list for use only by NFS. This is done because the nfsclient code overloads the b_freelist queue for its own async. io queue. - Simplify the numfreebuffers accounting by removing a switch statement that executed the same code in every possible case. - getnewbuf() can encounter locked bufs on free-lists once Giant is removed. Remove a panic associated with this condition and delay asserts that inspect the buf until after it is locked. Reviewed by: phk Sponsored by: Isilon Systems, Inc.	2004-11-18 08:44:09 +00:00
phk	5eae02ee76	Detect root mount attempts on the flag, not on the NULL path.	2004-11-09 22:21:52 +00:00
phk	e5715b2cc1	Retire b_magic now, we have the bufobj containing the same hint.	2004-11-04 09:48:18 +00:00
phk	1b25a59886	Move the buffer method vector (buf->b_op) to the bufobj. Extend it with a strategy method. Add bufstrategy() which does the usual VOP_SPECSTRATEGY/VOP_STRATEGY song and dance. Rename ibwrite to bufwrite(). Move the two NFS buf_ops to more sensible places, add bufstrategy to them. Add inlines for bwrite() and bstrategy() which call through buf->b_bufobj->b_ops->b_{write,strategy}(). Replace almost all VOP_STRATEGY()/VOP_SPECSTRATEGY() calls with bstrategy().	2004-10-24 20:03:41 +00:00
phk	52a089c526	Add b_bufobj to struct buf which eventually will eliminate the need for b_vp. Initialize b_bufobj for all buffers. Make incore() and gbincore() take a bufobj instead of a vnode. Make inmem() local to vfs_bio.c Change a lot of VI_[UN]LOCK(bp->b_vp) to BO_[UN]LOCK(bp->b_bufobj) also VI_MTX() to BO_MTX(), Make buf_vlist_add() take a bufobj instead of a vnode. Eliminate other uses of bp->b_vp where bp->b_bufobj will do. Various minor polishing: remove "register", turn panic into KASSERT, use new function declarations, TAILQ_FOREACH_SAFE() etc.	2004-10-22 08:47:20 +00:00
phk	3833976d12	Move the VI_BWAIT flag into a bo_flag element of bufobj and call it BO_WWAIT. Add bufobj_wref(), bufobj_wdrop() and bufobj_wwait() to handle the write count on a bufobj. Bufobj_wdrop() replaces vwakeup(). Use these functions all relevant places except in ffs_softdep.c where the use of interlocked_sleep() makes this impossible. Rename b_vnbufs to b_bobufs now that we touch all the relevant files anyway.	2004-10-21 15:53:54 +00:00
pjd	5515efcf61	Add a missing newline character.	2004-10-14 19:00:44 +00:00
das	c32ecae436	nfsclient/nfs_bio.c has a PHOLD() without a PRELE(). Neither should be necessary here. Also, use killproc() instead of psignal().	2004-10-01 05:01:41 +00:00
phk	d3ceec948f	Remove support for using NFS device nodes.	2004-09-28 08:50:01 +00:00
phk	5c67a82c63	Remove NFS4 vop method vector for devices: we are desupporting device nodes on anything but DEVFS and in this case it was not even used (see below). Put the NFS4 vop method for fifo's behind "#if 0" because it is unused. Add a XXX comment to say that I think the unusedness is a bug.	2004-09-27 20:02:50 +00:00
phk	46bdd46105	style consistency.	2004-09-27 19:44:39 +00:00
phk	02df7323ee	Remove unused B_WRITEINPROG flag	2004-09-15 21:49:22 +00:00
phk	9f1a2f23b2	Explicitly pass vnode to nfs_doio() and mountpoint to nfs_asyncio().	2004-09-07 08:56:43 +00:00
rwatson	68779f8b5e	In nfs_timer(), pass curthread rather than &thread0 into the protocol send routine. In IPv6 UDP, the thread will be passed to suser(), which asserts that if a thread is used for a super user check, it be curthread. Many of these protocol entry points probably need to accept credentials instead of threads. MT5 candidate. Noticed/tested by: kuriyama	2004-08-25 01:23:38 +00:00
phk	2d868d02cf	Put a version element in the VFS filesystem configuration structure and refuse initializing filesystems with a wrong version. This will aid maintenance activities on the 5-stable branch. s/vfs_mount/vfs_omount/ s/vfs_nmount/vfs_mount/ Name our filesystems mount function consistently. Eliminate the nameidata argument to both vfs_mount and vfs_omount. It was originally there to save stack space. A few places abused it to get hold of some credentials to pass around. Effectively it is unused. Reorganize the root filesystem selection code.	2004-07-30 22:08:52 +00:00
phk	98d8f3741c	Move a relic to its correct location(s): Put nfs diskless initialization calls with the code they call. (Yet another example of mindless copy&paste).	2004-07-28 21:54:57 +00:00
phk	075684f5fd	Remove global variable rootdevs and rootvp, they are unused as such. Add local rootvp variables as needed. Remove checks for miniroot's in the swappartition. We never did that and most of the filesystems could never be used for that, but it had still been copy&pasted all over the place.	2004-07-28 20:21:04 +00:00
phk	5297516e02	Eliminate unused second argument to reassignbuf() and simplify it accordingly.	2004-07-25 21:24:23 +00:00
alfred	b4f778e20c	Turn off SO_REUSEADDR and SO_REUSEPORT, they were causing EADDRINUSE to be returned from the protocol stack. Pointy hat to me for not grokking what those options _really_ mean.	2004-07-13 05:42:59 +00:00
dwmalone	6ff1185c1d	Rename Alfred's kern_setsockopt to so_setsockopt, as this seems a better name. I have a kern_[sg]etsockopt which I plan to commit shortly, but the arguments to these functions will be quite different from so_setsockopt. Approved by: alfred	2004-07-12 21:42:33 +00:00
alfred	8a1713aada	Make VFS_ROOT() and vflush() take a thread argument. This is to allow filesystems to decide based on the passed thread which vnode to return. Several filesystems used curthread, they now use the passed thread.	2004-07-12 08:14:09 +00:00
alfred	031e087d2c	Use SO_REUSEADDR and SO_REUSEPORT when reconnecting NFS mounts. Tune the timeout from 5 seconds to 12 seconds. Provide a sysctl to show how many reconnects the NFS client has done. Seems to fix IPv6 from: kuriyama	2004-07-12 06:22:42 +00:00
brian	aae31dbf32	Change the following environment variables to kernel options: bootp -> BOOTP, bootp.nfsroot -> BOOTP_NFSROOT, bootp.nfsv3 -> BOOTP_NFSV3, bootp.compat -> BOOTP_COMPAT, bootp.wired_to -> BOOTP_WIRED_TO - i.e. back out the previous commit. It's already possible to pxeboot(8) with a GENERIC kernel. Pointed out by: dwmalone	2004-07-08 22:35:36 +00:00
brian	2821a50eaa	Change the following kernel options to environment variables: BOOTP -> bootp, BOOTP_NFSROOT -> bootp.nfsroot, BOOTP_NFSV3 -> bootp.nfsv3, BOOTP_COMPAT -> bootp.compat, BOOTP_WIRED_TO -> bootp.wired_to. This lets you PXE boot with a GENERIC kernel by putting this sort of thing in loader.conf: bootp="YES" bootp.nfsroot="YES" bootp.nfsv3="YES" bootp.wired_to="bge1" or even setting the variables manually from the OK prompt.	2004-07-08 13:40:33 +00:00
rwatson	7e85c099fc	Acquire socket lock in nfs_connect() connection/sleep loop to protect socket state and avoid missed wakeups.	2004-07-06 16:55:41 +00:00
alfred	864fa13b59	use vfs_suser() to restrict access to the nfs mount's timeout.	2004-07-06 09:40:44 +00:00
alfred	8fd8b8c57f	NFS mobility Phase VI: Export NFS mount state via sysctl. Export timeout via sysctl.	2004-07-06 09:23:17 +00:00
alfred	97a6f04270	NFS mobility PHASE I, II & III (phase VI, and V pending): Rebind the client socket when we experience a timeout. This fixes the case where our IP changes for some reason. Signal a VFS event when NFS transitions from up to down and vice versa. Add a placeholder vfs_sysctl where we will put status reporting shortly. Also: Make down NFS mounts return EIO instead of EINTR when there is a soft timeout or force unmount in progress.	2004-07-06 09:12:03 +00:00
phk	070a613a48	When we traverse the vnodes on a mountpoint we need to look out for our cached 'next vnode' being removed from this mountpoint. If we find that it was recycled, we restart our traversal from the start of the list. Code to do that is in all local disk filesystems (and a few other places) and looks roughly like this: MNT_ILOCK(mp); loop: for (vp = TAILQ_FIRST(&mp...); (vp = nvp) != NULL; nvp = TAILQ_NEXT(vp,...)) { if (vp->v_mount != mp) goto loop; MNT_IUNLOCK(mp); ... MNT_ILOCK(mp); } MNT_IUNLOCK(mp); The code which takes vnodes off a mountpoint looks like this: MNT_ILOCK(vp->v_mount); ... TAILQ_REMOVE(&vp->v_mount->mnt_nvnodelist, vp, v_nmntvnodes); ... MNT_IUNLOCK(vp->v_mount); ... vp->v_mount = something; (Take a moment and try to spot the locking error before you read on.) On a SMP system, one CPU could have removed nvp from our mountlist but not yet gotten to assign a new value to vp->v_mount while another CPU simultaneously gets to the top of the traversal loop where it finds that (vp->v_mount != mp) is not true despite the fact that the vnode has indeed been removed from our mountpoint. Fix: Introduce the macro MNT_VNODE_FOREACH() to traverse the list of vnodes on a mountpoint while taking into account that vnodes may be removed from the list as we go. This saves approx 65 lines of duplicated code. Split the insmntque() which potentially moves a vnode from one mount point to another into delmntque() and insmntque() which do just what the names say. Fix delmntque() to set vp->v_mount to NULL while holding the mountpoint lock.	2004-07-04 08:52:35 +00:00
rwatson	6b9af88e9d	When updating sb_flags, acquire the socket buffer lock to prevent races.	2004-06-24 03:12:13 +00:00
phk	40dd98a3bd	Second half of the dev_t cleanup. The big lines are: NODEV -> NULL, NOUDEV -> NODEV, udev_t -> dev_t, udev2dev() -> findcdev(). Various minor adjustments including handling of userland access to kernel space struct cdev etc.	2004-06-17 17:16:53 +00:00
rwatson	10cdb7ab20	Remove bad cookie vp kernel printf; while it does notify about an interesting event, there's little or nothing the user can do about it.	2004-06-17 00:15:37 +00:00
rwatson	65f0bd9a10	Convert GIANT_REQUIRED to NET_ASSERT_GIANT where Giant is used to protect socket operations. Leave one "as-is" as it also frobs rootvp.	2004-06-16 03:12:50 +00:00
alc	b57e5e03fd	Make vm_page's PG_ZERO flag immutable between the time of the page's allocation and deallocation. This flag's principal use is shortly after allocation. For such cases, clearing the flag is pointless. The only unusual use of PG_ZERO is in vfs_bio_clrbuf(). However, allocbuf() never requests a prezeroed page. So, vfs_bio_clrbuf() never sees a prezeroed page. Reviewed by: tegge@	2004-05-06 05:03:23 +00:00
peadar	5617193e91	Let the NFS client notice a file's size changing as a modification. This avoids presenting invalid data to the client's applications when the file is modified, and then extended within the window of the resolution of the modification timestamp. Reviewed By: iedowse PR: kern/64091	2004-04-14 23:23:55 +00:00
marcel	6dbee1d482	Unbreak build: s/TAILQ_ISEMPTY/TAILQ_EMPTY/g	2004-04-11 17:15:36 +00:00
peadar	9bb40b73ee	Clean up properly when unloading NFS client module. This includes a modified form of some code from Thomas Moestl (tmm@) to properly clean up the UMA zone and the "nfsnodehashtbl" hash table. Reviewed By: iedowse PR: 16299	2004-04-11 13:30:20 +00:00
imp	ebf059d1df	Remove advertising clause from University of California Regent's license, per letter dated July 22, 1999 and email from Peter Wemm, Alan Cox and Robert Watson. Approved by: core, peter, alc, rwatson	2004-04-07 05:00:01 +00:00
rwatson	18b25cc43d	Spell 2 as SHUT_RDWR when used as an argument to soshutdown().	2004-04-04 19:24:08 +00:00
peadar	461541ae31	Flush cached access mode after modifying a file's attributes for NFSv3. It's likely that modifying the attributes will affect the file's accessibility. This version of the patch is one suggested by Ian Dowse after reviewing my original attempt in the PR. Reviewed By: iedowse PR: kern/44336 MFC after: 3 days	2004-04-03 17:23:46 +00:00
kan	97b7fb767e	Reset callout in nfs_timeout and rpcclnt_timeout functions. Timers are supposed to continue firing as long as there is work to do, not stop after the first invocation. This is damage control after a patch that has been committed prematurely. Tested by: kris	2004-03-28 05:55:27 +00:00
rees	4bf96c35a5	only do nfs rpc callouts if there is work to do. Submitted by: kan Approved by: alfred	2004-03-25 21:48:09 +00:00
pjd	257769d6cc	Add a comment with an explanation why we don't report EPIPE errors on nfs sockets. Requested by: ru	2004-03-17 21:10:20 +00:00
pjd	f8bf3c9231	Don't report EPIPE errors on nfs sockets. These can be due to idle tcp mounts which will be closed by netapp, solaris, etc. if left idle too long. Obtained from: NetBSD	2004-03-17 18:10:38 +00:00
peter	36be86fb0a	Calculate NFS timeouts in units of 10ms, not 5ms. This matches the default clock precision on i386. This is a NOP change on i386. But this stops the mount_nfs units from suddenly changing to units of 1/20 of a second (vs the normal 1/10 of a second) if HZ is increased.	2004-03-14 06:21:56 +00:00
brooks	7e688e2cec	Allow kernel with the BOOTP option to boot when DHCP/BOOTP sets the root path to an absolute path without a host name. Previously, there was a nasty POLA violation where a system would PXE boot until you added the BOOTP option and then it would panic instead. Reviewed by: tegge, Dirk-Willem van Gulik <dirkx at webweaving.org> (a previous version) Submitted by: tegge (getip function)	2004-03-12 20:37:40 +00:00
phk	2a5e157787	Properly vector all bwrite() and BUF_WRITE() calls through the same path and s/BUF_WRITE()/bwrite()/ since it now does the same as bwrite().	2004-03-11 18:02:36 +00:00
phk	eeb7579130	Remove unused second arg to vfinddev(). Don't call addaliasu() on VBLK nodes.	2004-03-11 16:33:11 +00:00
rwatson	b0b5f961bd	Rename dup_sockaddr() to sodupsockaddr() for consistency with other functions in kern_socket.c. Rename the "canwait" field to "mflags" and pass M_WAITOK and M_NOWAIT in from the caller context rather than "1" or "0". Correct mflags pass into mac_init_socket() from previous commit to not include M_ZERO. Submitted by: sam	2004-03-01 03:14:23 +00:00
rees	108fca056b	NFSv4 fixes from Connectathon 2004: remove unused pid field of file context struct; map nfs4 error codes to errnos; eliminate redundant code from nfs4_request; use zero stateid on setattr that doesn't set file size; use same clientid on all mounts until reboot; invalidate dirty bufs in nfs4_close, to play it safe; open file for writing if truncating and it's not already open. Approved by: alfred	2004-02-27 19:37:43 +00:00
cperciva	9576df9a82	If mountnfs returns an error, it will have already freed nam; no need to free it again. Reported by: "Ted Unangst" <tedu@coverity.com> Approved by: rwatson (mentor)	2004-02-22 01:17:47 +00:00
jhb	279b2b8278	Locking for the per-process resource limits structure. - struct plimit includes a mutex to protect a reference count. The plimit structure is treated similarly to struct ucred in that it is always copy on write, so having a reference to a structure is sufficient to read from it without needing a further lock. - The proc lock protects the p_limit pointer and must be held while reading limits from a process to keep the limit structure from changing out from under you while reading from it. - Various global limits that are ints are not protected by a lock since int writes are atomic on all the archs we support and thus a lock wouldn't buy us anything. - All accesses to individual resource limits from a process are abstracted behind a simple lim_rlimit(), lim_max(), and lim_cur() API that return either an rlimit, or the current or max individual limit of the specified resource from a process. - dosetrlimit() was renamed to kern_setrlimit() to match existing style of other similar syscall helper functions. - The alpha OSF/1 compat layer no longer calls getrlimit() and setrlimit() (it didn't use the stackgap when it should have) but uses lim_rlimit() and kern_setrlimit() instead. - The svr4 compat no longer uses the stackgap for resource limits calls, but uses lim_rlimit() and kern_setrlimit() instead. - The ibcs2 compat no longer uses the stackgap for resource limits. It also no longer uses the stackgap for accessing sysctl's for the ibcs2_sysconf() syscall but uses kernel_sysctl() instead. As a result, ibcs2_sysconf() no longer needs Giant. - The p_rlimit macro no longer exists. Submitted by: mtm (mostly, I only did a few cleanups and catchups) Tested on: i386 Compiled on: alpha, amd64	2004-02-04 21:52:57 +00:00
obrien	ac460e8a52	Bump the NFSv3/TCP defaults for rsize and wsize from 8K to 32K to match Solaris and HP-UX. This increases read performance for large files across NFS. PR: 62024 & 26324 Submitted by: Bjoern Groenvall <bg@sics.se>	2004-01-31 10:40:15 +00:00
alfred	a5dc4dbeb8	Use function pointers to remove the cross dependency between nfs4 and the nfs3 client. Also fix some bugs that happen to be causing crashes in both v3 and v4 introduced by the v4 import. Submitted by: Jim Rees <rees@umich.edu> Approved by: re	2003-11-22 02:21:49 +00:00
alfred	490e2fe2e2	Move the declaration for "struct nfs4_fctx" out from under #ifdef KERNEL for fstat(1).	2003-11-15 05:03:15 +00:00
alfred	302841f20d	unbreak LINT.	2003-11-15 00:26:42 +00:00
alfred	5b076fe9da	University of Michigan's Citi NFSv4 kernel client code. Submitted by: Jim Rees <rees@umich.edu>	2003-11-14 20:54:10 +00:00
kan	9352a05d40	1. Consolidate mount struct allocation/destruction into a common code in vfs_mount_alloc/vfs_mount_destroy functions and take care to completely destroy the mount point along with its locks. Mount struct has grown in complexity recently and depending on each failure path to destroy it completely isn't working anymore. 2. Eliminate largely identical vfs_mount and vfs_unmount question by moving the code to handle both cases into a newly introduced vfs_domount function. 3. Simplify nfs_mount_diskless to always expect an allocated mount struct and never attempt an allocation/destruction itself. The vfs_allocroot allocation was there to support 'magic' swap space configuration for diskless clients that was already removed by PHK some time ago. 4. Include a vfs_buildopts cleanups by Peter Edwards to validate the sanity of nmount parameters passed from userland. Submitted by: (4) Peter Edwards <peter.edwards@openet-telecom.com> Reviewed by: rwatson	2003-11-12 02:54:47 +00:00
alfred	b1d1754bf2	Stop using shared locks for nfs vop locks. The reason this was done was to avoid a race to the root when an NFS server went down. However a semi-recent change to the way that the kernel's lookup() routine traverses mount points prevents this. Rev 1.39 of vfs_lookup.c changed the ordering of locks such that we acquire a shared lock on the mount point being accessed and then drop the directory vnode lock before requesting the target lock. With that in place we no longer need shared locks for NFS to prevent race to the root lockups.	2003-11-11 00:32:46 +00:00
sam	3eac15aaa3	Assert GIANT_REQUIRED where sockets are manipulated. This is preparatory for MPSAFE network commits and ongoing socket locking work. Supported by: FreeBSD Foundation	2003-11-07 22:57:09 +00:00
kan	36d60f3bb7	Remove mntvnode_mtx and replace it with per-mountpoint mutex. Introduce two new macros MNT_ILOCK(mp)/MNT_IUNLOCK(mp) to operate on this mutex transparently. Eventually new mutex will be protecting more fields in struct mount, not only vnode list. Discussed with: jeff	2003-11-05 04:30:08 +00:00
kan	618baf4714	Take care not to call vput if thread used in corresponding vget wasn't curthread, i.e. when we receive a thread pointer to use as a function argument. Use VOP_UNLOCK/vrele in these cases. The only case there td != curthread known at the moment is boot() calling sync with thread0 pointer. This fixes the panic on shutdown people have reported.	2003-11-02 04:52:53 +00:00
brooks	f1e94c6f29	Replace the if_name and if_unit members of struct ifnet with new members if_xname, if_dname, and if_dunit. if_xname is the name of the interface and if_dname/unit are the driver name and instance. This change paves the way for interface renaming and enhanced pseudo device creation and configuration semantics. Approved By: re (in principle) Reviewed By: njl, imp Tested On: i386, amd64, sparc64 Obtained From: NetBSD (if_xname)	2003-10-31 18:32:15 +00:00
phk	4c2cb3f397	DuH! bp->b_iooffset (the spot on the disk), not bp->b_offset (the offset in the file)	2003-10-18 14:10:28 +00:00
phk	1e371cc970	Initialize bp->b_offset before calling VOP_STRATEGY(). Remove KASSERTS and panics with B_PHYS checks which no longer apply.	2003-10-18 11:14:29 +00:00
phk	a347a9d216	We do not get B_PHYS buffers here anymore. /dev/drum is long gone.	2003-10-18 09:33:13 +00:00
iedowse	4357db17b4	Since the addition of the VI_DOINGINACT flag some time ago, VOP_INACTIVE routines need not worry about their vnode getting recycled if they block. Remove the code from nfs_inactive() that used vget() to get an extra vnode reference that was held during the nfs_vinvalbuf() call.	2003-10-05 12:41:35 +00:00
jeff	f61f6f6aa8	- Remove an incorrect XXX comment. This code does respect the XLOCK since it uses vget() which will fail if the identity changes.	2003-10-05 06:47:56 +00:00
jeff	2b4a2d9fbe	- Check the XLOCK before we inspect the vnode.	2003-10-05 06:46:45 +00:00
jeff	5b01a09002	- We don't need to cache_purge() in nfs_reclaim(), vclean() does it for us.	2003-10-05 06:46:02 +00:00
jeff	daf0443857	- Consistently set sopt_dir. Pointed out by: pete@isilon.com	2003-10-04 17:41:59 +00:00
jeff	46f6642c5b	- Acquire the vnode interlock prior to dropping the mntvnode_mtx. - Make a note of the lack of XLOCK protection in this code. We would access a vnode while it is changing identities without Giant.	2003-10-04 13:44:51 +00:00
jeff	849854f240	- Remove the backtrace() call from the *_vinvalbuf() functions. Thanks to a stack trace supplied by phk, I now understand what's going on here. The check for VI_XLOCK stops us from calling vinvalbuf once the vnode has been partially torn down in vclean(). It is not clear that this would cause a problem. Document this in nfs_bio.c, which is where the other two filesystems copied this code from.	2003-10-04 08:51:50 +00:00
jeff	4d0b3883a4	- Remove interlock protection around VI_XLOCK. The interlock is not sufficient to guarantee that this race is not hit. The XLOCK will likely have to be redesigned due to the way reference counting and mutexes work in FreeBSD. We currently can not be guaranteed that xlock was not set and cleared while we were blocked on the interlock while waiting to check for XLOCK. This would lead us to reference a vnode which was not the vnode we requested. - Add a backtrace() call inside of INVARIANTS in the hopes of finding out if this condition is ever hit. It should not, since we should be retaining a reference to the vnode in these cases. The reference would be sufficient to block recycling.	2003-09-19 23:37:49 +00:00
phk	0e80d17900	Name the vnode method vectors consistently with the rest of the filesystems. This improves the output of src/tools/tools/vop_table	2003-09-12 16:44:40 +00:00
phk	af43a08ef8	Remove now unused BOOTP tags related to NFS swap device.	2003-09-05 11:12:55 +00:00
dds	a07778264c	KNF: parentheses around return values. Suggested by: bde Approved by: schweikh (mentor - blanket) MFC after: 6 weeks	2003-09-04 11:27:13 +00:00
dds	c5e451a8b7	Fix errno return values to better represent failure reasons for read and open. Approved by: schweikh (mentor) Agreed: bde MFC after: 6 weeks	2003-09-02 16:46:31 +00:00
phk	8eb928cd77	Remove the magic way of configuring NFS backed swap. This code dates back to the very first diskless support on FreeBSD, back when swapon(8) couldn't simply be run on a NFS backed file. Suggested replacement command sequence on the client: dd if=/dev/zero of=/swapfile bs=1k count=1 oseek=100000 swapon /swapfile rm -f /swapfile For whatever value of 100000 you want.	2003-08-15 12:04:02 +00:00
billf	08d78e9b49	0) preallocate per-interface context structures without the ifnet lock held 1) avoid immediately calling bzero() after malloc() by passing M_ZERO 2) do not initialize individual members of the global context to zero 3) remove an unused assignment of ifctx in bootpc_init() Reviewed by: tegge	2003-08-07 21:27:17 +00:00
tjr	f4b299adc0	Fix a problem that occurs when truncating files on NFSv3 mounts: we need to set np->n_size back to the desired size again after calling nfs_meta_setsize(), since it could end up in nfs_loadattrcache() getting called, which would change n_size back to the value it had before the truncate request was issued. The result of this bug is that the size info cached in the nfsnode becomes incorrect, lseek(fd, ofs, SEEK_END) seeks past the end of the file, stat() returns the wrong size, etc. PR: 41792 MFC after: 2 weeks	2003-07-29 00:17:29 +00:00
phk	d4d7ca154a	Add fdidx argument to vn_open() and vn_open_cred() and pass -1 throughout.	2003-07-27 17:04:56 +00:00
phk	b99b564c20	Change idle sleep identifier to "-" for nfsiod	2003-07-02 08:09:20 +00:00
alc	636a482b8d	Lock the vm object when freeing a page.	2003-06-17 05:17:00 +00:00
phk	24cc9156fe	Add the same KASSERT to all VOP_STRATEGY and VOP_SPECSTRATEGY implementations to check that the buffer points to the correct vnode.	2003-06-15 18:53:00 +00:00
phk	fd139fd7d0	Initialize struct vfsops C99-sparsely. Submitted by: hmp Reviewed by: phk	2003-06-12 20:48:38 +00:00
iedowse	2c04f19896	When removing a sillyrename file, make sure that the directory vnode has not been cleaned in the meantime, since this can happen during a forced unmount. Also add a comment that nfs_removeit() should really be locking the directory vnode before calling nfs_removerpc(). Reported by: mbr Tested by: mbr MFC after: 1 week	2003-06-12 15:41:20 +00:00
obrien	8b64eb1925	Use __FBSDID().	2003-06-11 05:37:42 +00:00
rwatson	decffe6132	Add the comment I meant to add about not passing in PCATCH to the tsleep(). Note the XXX.	2003-06-11 03:32:42 +00:00
hsu	d5ee1a976b	On a socket creation error, don't close the socket.	2003-06-09 03:44:34 +00:00
phk	174a772296	Remove unused variables. Add explicit breaks to switch. Found by: FlexeLint	2003-05-31 20:05:25 +00:00
phk	0129a20107	The IO_NOWDRAIN and B_NOWDRAIN hacks are no longer needed to prevent deadlocks with vnode backed md(4) devices because md now uses a kthread to run the bio requests instead of doing it directly from the bio down path.	2003-05-31 16:42:45 +00:00
rwatson	c264d8171d	rpc.lockd stability workaround: remove PCATCH from the tsleep() in nfs_lock.c. Right now, if we permit a signal to interrupt the sleep, we will slip the lock and no process on that client, the server, or any other client will be able to acquire the lock. This can happen, for example, if a user hits Ctrl-C or Ctrl-T while a process is waiting for the lock. By removing PCATCH, we prevent that from happening, at the cost of not permitting a user-requested lock abort: also nasty. However, a user interface bug might be preferable to a serious semantic bug, so we go with that for now. We need to teach the rpc.lockd/kernel protocol how to abort lock requests, and rpc.lockd how to handle aborted lock requests; patches for the kernel bit are floating around, but no rpc.lockd bit yet. Approved by: re (scottl)	2003-05-30 17:15:56 +00:00
peter	da1b9f9f88	Deal with the possibility of negative available space from the file server to avoid Bad Things(TM) happening (eg: df crashing with a floating point exception). Submitted by: Harold Gutch <logix@foobar.franken.de> Approved by: re (scottl)	2003-05-19 22:35:00 +00:00
rwatson	94ff93f449	This change grabs the vnode lock for NFS client vnodes when calling VOP_SETATTR() or VOP_GETATTR(); without these locks (a) VFS_DEBUG_LOCKS will panic, and (b) it may be possible to corrupt entries in the cached vnode attributes in the nfsnode, since nfsnode attribute cache data is also protected by the vnode lock. Approved by: re (jhb) Pointed out by: VFS_DEBUG_LOCKS	2003-05-15 21:12:08 +00:00
jhb	89a4eb17de	- Merge struct procsig with struct sigacts. - Move struct sigacts out of the u-area and malloc() it using the M_SUBPROC malloc bucket. - Add a small sigacts_*() API for managing sigacts structures: sigacts_alloc(), sigacts_free(), sigacts_copy(), sigacts_share(), and sigacts_shared(). - Remove the p_sigignore, p_sigacts, and p_sigcatch macros. - Add a mutex to struct sigacts that protects all the members of the struct. - Add sigacts locking. - Remove Giant from nosys(), kill(), killpg(), and kern_sigaction() now that sigacts is locked. - Several in-kernel functions such as psignal(), tdsignal(), trapsignal(), and thread_stopped() are now MP safe. Reviewed by: arch@ Approved by: re (rwatson)	2003-05-13 20:36:02 +00:00
des	8ed712ead1	Instead of recording the Unix time in a process when it starts, record the uptime. Where necessary, convert it back to Unix time by adding boottime to it. This fixes a potential problem in the accounting code, which would compute the elapsed time incorrectly if the Unix time was stepped during the lifetime of the process.	2003-05-01 16:59:23 +00:00
kan	9468fdaf14	Deprecate machine/limits.h in favor of new sys/limits.h. Change all in-tree consumers to include <sys/limits.h> Discussed on: standards@ Partially submitted by: Craig Rodrigues <rodrigc@attbi.com>	2003-04-29 13:36:06 +00:00
truckman	79b9eea6e8	VOP_FSYNC() expects to be called with the vnode locked, so lock fvp in nfs_rename() before calling VOP_FSYNC() and unlock fvp immediately after. Reviewed by: bde	2003-04-24 20:39:40 +00:00
peter	88151c4f8c	Fix a bug with df on large (>1TB) nfsv3 file servers on 32 bit client machines where the 'long' number of blocks in struct statfs won't fit. Instead of choosing an artificial 512 byte block size, simply scale it up until we avoid an overflow. NFSv3 reports the sizes in bytes, and the blocksize is a figment of nfsclient's imagination.	2003-04-24 20:36:32 +00:00
truckman	b8272feca3	Release the vnode interlock in nfs_flush() before calling nfs_sigintr(), and grab it again later if necessary. This prevents a lock order reversal because nfs_sigintr() calls PROC_LOCK().	2003-04-23 02:58:26 +00:00
thomas	7e134f95f3	Revert change 1.201 (removing mapping of VAPPEND to VWRITE). Instead, use the generic vaccess() operation to determine whether an operation is permitted. This avoids embedding knowledge on vnode permission bits such as VAPPEND in the NFS client. PR: kern/46515 vaccess() patch submitted by: "Peter Edwards" <pmedwards@eircom.net> Approved by: tjr, roberto (mentor)	2003-03-31 23:26:10 +00:00
jeff	46e6ba39f1	- Move p->p_sigmask to td->td_sigmask. Signal masks will be per thread with a follow on commit to kern_sig.c - signotify() now operates on a thread since unmasked pending signals are stored in the thread. - PS_NEEDSIGCHK moves to TDF_NEEDSIGCHK.	2003-03-31 22:49:17 +00:00
rwatson	109543a3e5	Add O_NONBLOCK to the vn_open_cred() flags for NFS client locking when opening the POSIX fifo; convert ENXIO error returns to EOPNOTSUPP. This improves handling of the case where the /var/run/lock fifo exists but there is no listener: we immediately return EOPNOTSUPP rather than blocking until a listener turns up. This could occur during a diskless boot before rpc.lockd is loaded, or if the lock file persists across a reboot following the disabling of rpc.lockd. This may have suddenly started to occur due to fifo blocking fixes--previously it looks like attempts to read on a fifo with no listener would time out due to insufficient resources. Reviewed by: alfred	2003-03-26 19:21:34 +00:00
alfred	5fb77f7c70	req can not be NULL or we'd die. Sponsored by: RED	2003-03-26 01:46:11 +00:00
tjr	874c219fad	Map VAPPEND to VWRITE in nfsspec_access() - VAPPEND is never set in the mode returned by VOP_GETATTR. This fixes incorrect "Permission denied" errors when trying to append to a file on an NFSv2 mount.	2003-03-21 05:13:23 +00:00
jeff	f500ebe3c4	- Add a forgotten BUF_LOCK() Most sincere apologies to: jake	2003-03-14 05:13:19 +00:00
jeff	4b8b33db8a	- Lock the buf before inspecting its contents.	2003-03-13 07:04:11 +00:00
jeff	4de0ae322c	- Add a new 'flags' parameter to getblk(). - Define one flag GB_LOCK_NOWAIT that tells getblk() to pass the LK_NOWAIT flag to the initial BUF_LOCK(). This will eventually be used in cases where we want to use a buffer only if it is not currently in use. - Convert all consumers of the getblk() api to use this extra parameter. Reviewed by: arch Not objected to by: mckusick	2003-03-04 00:04:44 +00:00
njl	5a225ad933	Finish cleanup of vprint() which was begun with changing v_tag to a string. Remove extraneous uses of vop_null, instead defering to the default op. Rename vnode type "vfs" to the more descriptive "syncer". Fix formatting for various filesystems that use vop_print.	2003-03-03 19:15:40 +00:00
des	2756b6c964	More low-hanging fruit: kill caddr_t in calls to wakeup(9) / [mt]sleep(9).	2003-03-02 16:54:40 +00:00
jeff	3c4fe935b7	- The interlock was not being dropped in nfs_flush() if the first part of an if clause was true. Break the two clauses out into separate statements since they require different actions. Reported/Tested by: jake Spotted by: jhb	2003-02-26 00:24:19 +00:00
jeff	e28e3bf81c	- Properly handle the vnode interlock in nfs_fsync. Reported by: phk	2003-02-25 08:50:21 +00:00
jeff	9e4c9a6ce9	- Add an interlock argument to BUF_LOCK and BUF_TIMELOCK. - Remove the buftimelock mutex and acquire the buf's interlock to protect these fields instead. - Hold the vnode interlock while locking bufs on the clean/dirty queues. This reduces some cases from one BUF_LOCK with a LK_NOWAIT and another BUF_LOCK with a LK_TIMEFAIL to a single lock. Reviewed by: arch, mckusick	2003-02-25 03:37:48 +00:00
imp	cf874b345d	Back out M_* changes, per decision of the TRB. Approved by: trb	2003-02-19 05:47:46 +00:00
peter	20bddb2776	Get rid of a silly message I added back in Sept 2001 (1.68).	2003-02-18 23:45:01 +00:00
tjr	a7cd813d68	Lock proc while accessing p_siglist, p_sigmask and p_sigignore in nfs_sigintr().	2003-02-15 08:25:57 +00:00
dillon	0a8a44e0b5	Provide a sysctl to allow defaulting of the connectionless (-c) feature to mount_nfs. The sysctl defaults to 1 (paranoid mode). Setting it to 0 will allow an NFS client to receive replies on a different IP than they were sent to by default. Submitted by: Sean Eric Fagan <sef@kithrup.com>	2003-01-22 19:57:31 +00:00
alfred	bf8e8a6e8f	Remove M_TRYWAIT/M_WAITOK/M_WAIT. Callers should use 0. Merge M_NOWAIT/M_DONTWAIT into a single flag M_NOWAIT.	2003-01-21 08:56:16 +00:00
phk	157437ec08	Since Jeffr made the std* functions the default in rev 1.63 of kern/vfs_defaults.c it is wrong for the individual filesystems to use the std* functions as that prevents override of the default. Found by: src/tools/tools/vop_table	2003-01-04 08:47:19 +00:00
phk	daf6948653	Convert calls to BUF_STRATEGY to VOP_STRATEGY calls. This is a no-op since all BUF_STRATEGY did in the first place was call VOP_STRATEGY.	2003-01-03 06:32:15 +00:00
schweikh	86f7487fb6	Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/ Add FreeBSD Id tag where missing.	2002-12-30 21:18:15 +00:00
dillon	fd92c8a195	Abstract-out the constants for the sequential heuristic. No operational changes. MFC after: 1 day	2002-12-28 20:37:50 +00:00
hsu	32436a25c0	SMP locking for radix nodes.	2002-12-24 03:03:39 +00:00
alc	1b398f04d9	Avoid holding the vnode interlock around malloc() or free() to prevent a lock order reversal. Reviewed by: jeff	2002-12-23 06:20:41 +00:00
hsu	82e1e3bab0	SMP locking for ifnet list.	2002-12-22 05:35:03 +00:00
dillon	63da09d1e6	do not try to free a mountpoint that we did not allocate. X-MFC after: immediately	2002-12-21 20:55:34 +00:00
alfred	8f7431caeb	reapply 1.26 through 1.28. Approved by: re	2002-11-20 15:21:06 +00:00
alfred	e398b8022d	forgot about 5.x freeze, backout 1.26 through 1.28 pending re@ approval.	2002-11-20 10:53:06 +00:00
alfred	8f8a40cefe	remove useless casts, unused macros and cleanup a line wrap.	2002-11-20 10:13:04 +00:00
alfred	22ecc18d19	comment and untwist error return logic	2002-11-20 10:06:51 +00:00
alfred	f4e72b4767	Remove an outdated comment complaining about exporting struct ucred to userspace, I fixed it a while ago.	2002-11-20 10:00:04 +00:00
phk	b9ad3f37bf	Don't examine an un-initialized variable. Spotted by: FlexeLint.	2002-10-20 21:52:05 +00:00
phk	9b7c9e2c4d	Remove extern declarations of stuff which is static in nfs_node.c Move related macro to nfs_node.c Spotted by: FlexeLint	2002-10-20 21:40:55 +00:00
mckusick	25230d4c6a	Regularize the vop_stdlock'ing protocol across all the filesystems that use it. Specifically, vop_stdlock uses the lock pointed to by vp->v_vnlock. By default, getnewvnode sets up vp->v_vnlock to reference vp->v_lock. Filesystems that wish to use the default do not need to allocate a lock at the front of their node structure (as some still did) or do a lockinit. They can simply start using vn_lock/VOP_UNLOCK. Filesystems that wish to manage their own locks, but still use the vop_stdlock functions (such as nullfs) can simply replace vp->v_vnlock with a pointer to the lock that they wish to have used for the vnode. Such filesystems are responsible for setting the vp->v_vnlock back to the default in their vop_reclaim routine (e.g., vp->v_vnlock = &vp->v_lock). In theory, this set of changes cleans up the existing filesystem lock interface and should have no function change to the existing locking scheme. Sponsored by: DARPA & NAI Labs.	2002-10-14 03:20:36 +00:00
mike	8630abe45f	Change iov_base's type from `char *' to the standard `void *'. All uses of iov_base which assume its type is `char *' (in order to do pointer arithmetic) have been updated to cast iov_base to `char *'.	2002-10-11 14:58:34 +00:00
scottl	3a150bca9c	Some kernel threads try to do significant work, and the default KSTACK_PAGES doesn't give them enough stack to do much before blowing away the pcb. This adds MI and MD code to allow the allocation of an alternate kstack whose size can be specified when calling kthread_create. Passing the value 0 prevents the alternate kstack from being created. Note that the ia64 MD code is missing for now, and PowerPC was only partially written due to the pmap.c being incomplete there. Though this patch does not modify anything to make use of the alternate kstack, acpi and usb are good candidates. Reviewed by: jake, peter, jhb	2002-10-02 07:44:29 +00:00
jmallett	7a693db242	Back out kernel support for reliable signal queues. Requested by: rwatson, phk, and many others	2002-10-01 17:15:53 +00:00
jmallett	068343413c	Lock access to the signal queue, and related structures, with PROC_LOCK. Submitted by: jhb	2002-09-30 21:15:33 +00:00
jmallett	7bf6052470	Convert use of p_siglist and old SIG*() macros to use <sys/ksiginfo.h> prototyped functions to get a sigset_t, and further to check for any queued signals, rather than an empty signal set, to go with the move to signal queues rather than signal sets.	2002-09-30 20:48:29 +00:00
phk	1dfc2c167f	Be consistent about "static" functions: if the function is marked static in its prototype, mark it static at the definition too. Inspired by: FlexeLint warning #512	2002-09-28 17:15:38 +00:00
rwatson	c583effb16	Remove an errant debugging printf that got left in during my last commit. Pointed out by: guido	2002-09-27 00:25:54 +00:00
rwatson	2fed42cd45	Apparently pxeboot passes in a mygateway of non-zero sin length from DHCP in the event that no gateway is returned from DHCP, breaking the assumption that we skip the routing insertion of the gateway if the sin length is zero. Check also for s_addr of 0 to avoid the "Oh no, adding my default route failed" panic, making it possible to pxeboot machines on segments without default routes. Arguably this could be a bug in pxeboot, or in the TUNABLE code, but this makes my boxes boot.	2002-09-26 19:56:43 +00:00
jeff	5c7f8a426d	- Lock access to the buf lists. - Use vrefcnt() where appropriate. - Add some locking asserts.	2002-09-25 02:38:43 +00:00
jake	be3bee9396	Moved nfs_diskless setup code from autoconf.c to nfsclient/nfs_diskless.c so that it is MI. Allow nfs_mountroot to return an error if the nfs_diskless struct is not valid, rather than panicking later on. Call nfs_setup_diskless() from nfs_mountroot if NFS_ROOT is defined, like bootpc_init(). Removed legacy root mount support for sparc64, and enabled NFS_ROOT by default.	2002-09-22 00:59:02 +00:00
phk	63d87674c8	Use m_length() instead of home-rolled versions.	2002-09-18 19:44:14 +00:00
njl	0590c43070	Remove all use of vnode->v_tag, replacing with appropriate substitutes. v_tag is now const char * and should only be used for debugging. Additionally: 1. All users of VT_NTS now check vfsconf->vf_type VFCF_NETWORK 2. The user of VT_PROCFS now checks for the new flag VV_PROCDEP, which is propagated by pseudofs to all child vnodes if the fs sets PFS_PROCDEP. Suggested by: phk Reviewed by: bde, rwatson (earlier version)	2002-09-14 09:02:28 +00:00
phk	549c71a099	Now that we have a cached mount credential in struct mount, use it instead of a private cached copy.	2002-09-08 15:11:18 +00:00
bde	c1c3f72703	Use `struct uma_zone *' instead of uma_zone_t, so that <sys/uma.h> isn't a prerequisite.	2002-09-05 14:04:34 +00:00
sobomax	f6cebc0606	Increase size of ifnet.if_flags from 16 bits (short) to 32 bits (int). To avoid breaking application ABI use unused ifreq.ifru_flags[1] for upper 16 bits in SIOCSIFFLAGS and SIOCGIFFLAGS ioctl's. Reviewed by: -hackers, -net	2002-08-18 07:05:00 +00:00
alfred	6d7e27aceb	Remove a case of exposing 'struct ucred' to userspace. Use a struct xucred for LOCKD_MSG instead. Requested by: rwatson	2002-08-15 21:52:22 +00:00
rwatson	44404e4547	In order to better support flexible and extensible access control, make a series of modifications to the credential arguments relating to file read and write operations to clarify which credential is used for what: - Change fo_read() and fo_write() to accept "active_cred" instead of "cred", and change the semantics of consumers of fo_read() and fo_write() to pass the active credential of the thread requesting an operation rather than the cached file cred. The cached file cred is still available in fo_read() and fo_write() consumers via fp->f_cred. These changes largely in sys_generic.c. For each implementation of fo_read() and fo_write(), update cred usage to reflect this change and maintain current semantics: - badfo_readwrite() unchanged - kqueue_read/write() unchanged - pipe_read/write() now authorize MAC using active_cred rather than td->td_ucred - soo_read/write() unchanged - vn_read/write() now authorize MAC using active_cred but VOP_READ/WRITE() with fp->f_cred Modify vn_rdwr() to accept two credential arguments instead of a single credential: active_cred and file_cred. Use active_cred for MAC authorization, and select a credential for use in VOP_READ/WRITE() based on whether file_cred is NULL or not. If file_cred is provided, authorize the VOP using that cred, otherwise the active credential, matching current semantics. Modify current vn_rdwr() consumers to pass a file_cred if used in the context of a struct file, and to always pass active_cred. When vn_rdwr() is used without a file_cred, pass NOCRED. These changes should maintain current semantics for read/write, but avoid a redundant passing of fp->f_cred, as well as making it more clear what the origin of each credential is in file descriptor read/write operations. Follow-up commits will make similar changes to other file descriptor operations, and modify the MAC framework to pass both credentials to MAC policy modules so they can implement either semantic for revocation. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-15 20:55:08 +00:00
phk	e4f487f25e	Introduce typedefs for the member functions of struct vfsops and employ these in the main filesystems. This does not change the resulting code but makes the source a little bit more grepable. Sponsored by: DARPA and NAI Labs.	2002-08-13 10:05:50 +00:00
rwatson	b0388fc24a	Pass IO_NOMACCHECK to vn_rdwr() in the following checks to prevent enforcement of MAC policy on the read or write operations: - In ext2fs, don't enforce MAC on loop-back reads and writes supporting directory read operations in lookup(), directory modifications in rename(), directory write operations in mkdir(), symlink write operations in symlink(). - In the NFS client locking code, perform vn_rdwr() on the NFS locking socket without enforcing MAC, since the write is done on behalf of the kernel NFS implementation rather than the user process. - In UFS, don't enforce MAC on loop-back reads and writes supporting directory read operations in lookup(), and symlink write operations in symlink(). Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2002-08-12 16:43:04 +00:00
jeff	fcdac052f8	- Add a missing VI_UNLOCK to an error case in nfs_flush.	2002-08-05 08:54:29 +00:00
jeff	02517b6731	- Replace v_flag with v_iflag and v_vflag - v_vflag is protected by the vnode lock and is used when synchronization with VOP calls is needed. - v_iflag is protected by interlock and is used for dealing with vnode management issues. These flags include X/O LOCK, FREE, DOOMED, etc. - All accesses to v_iflag and v_vflag have either been locked or marked with mp_fixme's. - Many ASSERT_VOP_LOCKED calls have been added where the locking was not clear. - Many functions in vfs_subr.c were restructured to provide for stronger locking. Idea stolen from: BSD/OS	2002-08-04 10:29:36 +00:00
alc	2e0c2c9a48	o Lock page queue accesses in nfs_getpages().	2002-07-21 20:01:32 +00:00
dillon	9d3af4cbd8	Fix a bug in nfs_write() related to ^C'ing during a file write on an interruptible mount. We were returning from inside the loop without releasing the rslock. Submitted by: Mike Junk <junk@isilon.com> MFC after: 3 days	2002-07-16 19:43:59 +00:00
jhb	9618cc94df	If we get a receive error in nfs_receive() and then get an error trying to obtain the send lock, we would bogusly try to unlock the send lock before returning resulting in a panic. Instead, only unlock the send lock if nfs_sndlock() succeeds and nfs_reconnect() fails. MFC after: 3 days Sponsored by: The Weather Channel	2002-07-16 15:12:07 +00:00
alfred	df766765ba	Add IPv6 support. Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr>	2002-07-15 19:40:23 +00:00
dillon	0b74a2da00	Convert old style (type foo *)0 casts to NULLs PR: kern/40360 Requested by: Hiten PAndya via direct email	2002-07-11 17:54:58 +00:00
dillon	da4e111a55	Replace the global buffer hash table with per-vnode splay trees using a methodology similar to the vm_map_entry splay and the VM splay that Alan Cox is working on. Extensive testing has appeared to have shown no increase in overhead. Disadvantages Dirties more cache lines during lookups. Not as fast as a hash table lookup (but still N log N and optimal when there is locality of reference). Advantages vnode->v_dirtyblkhd is now perfectly sorted, making fsync/sync/filesystem syncer operate more efficiently. I get to rip out all the old hacks (some of which were mine) that tried to keep the v_dirtyblkhd tailq sorted. The per-vnode splay tree should be easier to lock / SMPng pushdown on vnodes will be easier. This commit along with another that Alan is working on for the VM page global hash table will allow me to implement ranged fsync(), optimize server-side nfs commit rpcs, and implement partial syncs by the filesystem syncer (aka filesystem syncer would detect that someone is trying to get the vnode lock, remembers its place, and skip to the next vnode). Note that the buffer cache splay is somewhat more complex than other splays due to special handling of background bitmap writes (multiple buffers with the same lblkno in the same vnode), and B_INVAL discontinuities between the old hash table and the existence of the buffer on the v_cleanblkhd list. Suggested by: alc	2002-07-10 17:02:32 +00:00
jhb	8969d48c6a	In namei(), we use a NULL thread for uio_td when doing a VOP_READLINK(). nfs_readlink() calls nfs_bioread() which passes in uio_td as the thread argument to nfs_getcacheblk(). In nfs_getcacheblk() we dereference the thread pointer to get a process pointer to pass to nfs_sigintr(). This obviously results in a panic. :) Rather than change nfs_getcacheblk() to check if the thread pointer is NULL when calling nfs_sigintr() like other callers do, change nfs_sigintr() to take a thread as the last argument instead of a process so none of the callers have to care if the thread is NULL or not.	2002-06-28 21:53:08 +00:00
tanimura	e6fa9b9e92	Back out my last commit of locking down a socket, it conflicts with hsu's work. Requested by: hsu	2002-05-31 11:52:35 +00:00
dd	90158e3b68	Don't tsleep() with an sb_mtx held.	2002-05-27 05:20:15 +00:00
peter	6fd2a8cc3f	Fix warning; deprecated use of label at end of compound statement	2002-05-24 05:50:28 +00:00
tanimura	92d8381dd5	Lock down a socket, milestone 1. o Add a mutex (sb_mtx) to struct sockbuf. This protects the data in a socket buffer. The mutex in the receive buffer also protects the data in struct socket. o Determine the lock strategy for each members in struct socket. o Lock down the following members: - so_count - so_options - so_linger - so_state o Remove *_locked() socket APIs. Make the following socket APIs touching the members above now require a locked socket: - sodisconnect() - soisconnected() - soisconnecting() - soisdisconnected() - soisdisconnecting() - sofree() - soref() - sorele() - sorwakeup() - sotryfree() - sowakeup() - sowwakeup() Reviewed by: alfred	2002-05-20 05:41:09 +00:00
ambrisko	0f6f0cccbe	Add TAG_VENDOR_INDENTIFIER (option 60) to our DHCP request done by the kernel BOOTP option. The format will be: FreeBSD:<MACHINE>:<osrelease> this way people can tune their DHCP server to serve up root file systems via the OS, machine type and version. Obtained from: NetBSD MFC after: 3 weeks	2002-05-17 20:18:48 +00:00
trhodes	28d42899b7	More s/file system/filesystem/g	2002-05-16 21:28:32 +00:00
phk	5549556e4f	We don't need the arp kludge any more.	2002-04-28 18:29:44 +00:00
iedowse	bae478cc81	Remove the nfs_{lock,unlock,islocked} functions and the associated definitions; they have been unused and #if 0'd out since the Lite/2 merge and we are unlikely to want them in the future.	2002-04-27 22:10:16 +00:00
iedowse	64322dabea	The recent NFS forced unmount improvements introduced a side-effect where some client operations might be unexpectedly cancelled during an unsuccessful non-forced unmount attempt. This causes problems for amd(8), because it periodically attempts a non-forced unmount to check if the filesystem is still in use. Fix this by adding a new mountpoint flag MNTK_UNMOUNTF that is set only during the operation of a forced unmount. Use this instead of MNTK_UNMOUNT to trigger the cancellation of hung NFS operations. Also correct a problem where dounmount() might inadvertently clear the MNTK_UNMOUNT flag. Reported by: simokawa MFC after: 1 week	2002-04-17 01:07:29 +00:00
jhb	dc2e474f79	Change the suser() API to take advantage of td_ucred as well as do a general cleanup of the API. The entire API now consists of two functions similar to the pre-KSE API. The suser() function takes a thread pointer as its only argument. The td_ucred member of this thread must be valid so the only valid thread pointers are curthread and a few kernel threads such as thread0. The suser_cred() function takes a pointer to a struct ucred as its first argument and an integer flag as its second argument. The flag is currently only used for the PRISON_ROOT flag. Discussed on: smp@	2002-04-01 21:31:13 +00:00
jeff	5cc8ffe0d4	Remove references to vm_zone.h and switch over to the new uma API.	2002-03-20 10:07:52 +00:00
luigi	d4a3339ee0	Add a readonly sysctl variable of type string, kern.bootp_cookie, which is initialized with whatever string a dhcp/bootp server passes as vendor tag 134. There is no standard tag that I know with this information, and no vendor-defined tag that applies to FreeBSD that I could find doing the same thing. The intended use is to pass information to userland for run-time configuration of a diskless client without having to run a bootp/dhcp client for the third time (after the one in pxeboot/etherboot, and the one in the kernel bootp), also because these clients generally screwup the interface configuration, which is not exactly what you want when you have your disks nfs-mounted. Manpage update to follow soon. MFC-after: 3 days	2002-03-13 09:23:11 +00:00
phk	0b8d3eb375	vhold() our vnode while checking the remote side. This is believed to be the only place where a soft reference to a vnode is held with no sort of hard reference, consequently this change should allow us to free(9) vnodes from the freelist after properly cleaning them up. Reviewed by: dillon	2002-03-08 13:43:43 +00:00
peter	0535cd31ee	Fix warnings.. bootpc_init() and related.	2002-02-28 03:07:35 +00:00
jhb	b8b3ac8816	Use thread0.td_ucred instead of proc0.p_ucred. This change is cosmetic and isn't strictly required. However, it lowers the number of false positives found when grep'ing the kernel sources for p_ucred to ensure proper locking.	2002-02-27 19:18:10 +00:00
jhb	3706cd3509	Simple p_ucred -> td_ucred changes to start using the per-thread ucred reference.	2002-02-27 18:32:23 +00:00
peter	5830d22764	Fix a long line touched in previous commit (but not caused by previous commit)	2002-02-07 23:03:41 +00:00
julian	b5eb64d6f0	Pre-KSE/M3 commit. This is a low-functionality change that changes the kernel to access the main thread of a process via the linked list of threads rather than assuming that it is embedded in the process. It IS still embedded there but remove all the code that assumes that in preparation for the next commit which will actually move it out. Reviewed by: peter@freebsd.org, gallatin@cs.duke.edu, benno rice	2002-02-07 20:58:47 +00:00
peter	f71468f39b	Revise the nfsiod auto tuning code. Now both the upper and lower limits are specifiable by sysctl and are respected. Submitted by: Maxime Henrion <mux@sneakerz.org>	2002-01-15 20:57:21 +00:00
peter	08d32da0a5	Implement vfs.nfs.iodmin (minimum number of nfsiod's) and vfs.nfs.iodmaxidle (idle time before nfsiod's exit). Make it adaptive so that we create nfsiod's on demand and they go away after not being used for a while. The upper limit is NFS_MAXASYNCDAEMON (currently 20). More will be done here, but this is a useful checkpoint. Submitted by: Maxime Henrion <mux@qualys.com>	2002-01-14 02:13:46 +00:00
iedowse	2d507f0adf	Terminate requests in nfs_sigintr() if the filesystem is in the process of being unmounted. This allows forced NFS unmounts to complete even if there are processes stuck holding the mnt_lock while the server is down. The mechanism is not ideal in that there is a small chance we might accidentally cancel requests during a failed non-forced unmount attempt on that filesystem, but this is not really a big problem. Also, move the tsleep() in nfs_nmcancelreqs() so that we do not sleep in the case where there are no requests to be cancelled.	2002-01-10 02:15:35 +00:00
iedowse	e90d2d4ddf	Permit NFS filesystems to be forcibly unmounted when the server is down, even if there are hung processes and the mount is non- interruptible. This works by having nfs_unmount call a new function nfs_nmcancelreqs() in the FORCECLOSE case. It scans the list of outstanding requests and marks as interrupted any requests belonging to the specified mount. Then it waits up to 30 seconds for all requests to terminate. A few other changes are necessary to support this: - Unconditionally set a socket timeout so that even hard mounts are guaranteed to occasionally check the R_SOFTTERM flag on requests. For hard mounts this flag can only be set by nfs_nmcancelreqs(). - Reject requests on a mount that is currently being unmounted. - Never grant the receive lock to a request that has been cancelled. This should also avoid an old problem where a forced NFS unmount could cause a crash; it occurred when a VOP on an unlocked vnode (usually VOP_GETATTR) was in progress at the time of the forced unmount.	2002-01-02 00:41:26 +00:00
alc	9da90558a5	o Remove an errant ';' introduced in the last revision. o Remove an unused variable.	2002-01-01 19:44:01 +00:00
rwatson	4f087f57b5	o Remove premature use of nmp->nm_cred, it hasn't been initialized yet.	2002-01-01 16:17:55 +00:00
rwatson	85fc04400d	o Pass td into nfs_mountroot() to eliminate an XXX'd curthread use. Since it's in the parent function anyway, might as well pass it another layer down. Obtained from: TrustedBSD Project	2001-12-31 21:00:00 +00:00
rwatson	9348f9cada	o Remove premature leakage of use of td_ucred from base source tree: instead, use td->td_proc->p_ucred.	2001-12-31 20:56:59 +00:00
rwatson	70a29b1e5a	o Add missing #include's of sys/proc.h, missed in merge, required to dereference td->td_proc->p_ucred.	2001-12-31 20:05:26 +00:00
rwatson	5eea21ccca	o Make the credential used by socreate() an explicit argument to socreate(), rather than getting it implicitly from the thread argument. o Make NFS cache the credential provided at mount-time, and use the cached credential (nfsmount->nm_cred) when making calls to socreate() on initially connecting, or reconnecting the socket. This fixes bugs involving NFS over TCP and ipfw uid/gid rules, as well as bugs involving NFS and mandatory access control implementations. Reviewed by: freebsd-arch	2001-12-31 17:45:16 +00:00
iedowse	fb3ea25673	Add a #define for the size of the nfs_backoff[] array, and use this instead of magic constants in the code.	2001-12-30 18:41:52 +00:00
ambrisko	c79d7cebd4	Increase the buffer size to hold a bootp/DHCP reply from 256 bytes to 1222 bytes (derived as the maximum that isc-dhcpd uses). This solves the problem if a bootp/DHCP reply is over 256 bytes in which the end of the bootp/DHCP reply will not be found and then the reply will be ignored. This happens when swap and root paths are longish or many parameters are set. Reviewed by: imp Approved by: imp	2001-12-30 02:35:09 +00:00
dillon	2cc743e124	nfs_nget() does no locking whatsoever when looking up a vnode. If the vget() sleeps we have to retry the operation to avoid racing against a deletion. MFC maybe: submitted to re's	2001-12-27 19:40:34 +00:00
iedowse	6e9f1df98f	Avoid passing the variable `tl' to functions that just use it for temporary storage. In the old NFS code it wasn't at all clear if the value of `tl' was used across or after macro calls, but I'm fairly confident that the convention was to keep its use local. Each ex-macro function now uses a local version of this variable, so all of the double-indirection goes away. The only exception to the `local use' rule for `tl' is nfsm_clget(), which is left unchanged by this commit. Reviewed by: peter	2001-12-18 01:22:09 +00:00
dillon	cd4d323ad3	This fixes a large number of bugs in our NFS client side code. A recent commit by Kirk also fixed a softupdates bug that could easily be triggered by server side NFS. * An edge case with shared R+W mmap()'s and truncate whereby the system would inappropriately clear the dirty bits on still-dirty data. (applicable to all filesystems) THIS FIX TEMPORARILY DISABLED PENDING FURTHER TESTING. see vm/vm_page.c line 1641 * The straddle case for VM pages and buffer cache buffers when truncating. (applicable to NFS client side) * Possible SMP database corruption due to vm_pager_unmap_page() not clearing the TLB for the other cpu's. (applicable to NFS client side but could affect all filesystems). Note: not considered serious since the corruption occurs beyond the file EOF. * When flushing a dirty buffer due to B_CACHE getting cleared, we were accidentally setting B_CACHE again (that is, bwrite() sets B_CACHE), when we really want it to stay clear after the write is complete. This resulted in a corrupt buffer. (applicable to all filesystems but probably only triggered by NFS) * We have to call vtruncbuf() when ftruncate()ing to remove any buffer cache buffers. This is still tentative, I may be able to remove it due to the second bug fix. (applicable to NFS client side) * vnode_pager_setsize() race against nfs_vinvalbuf()... we have to set n_size before calling nfs_vinvalbuf or the NFS code may recursively vnode_pager_setsize() to the original value before the truncate. This is what was causing the user mmap bus faults in the nfs tester program. (applicable to NFS client side) * Fix to softupdates (see ufs/ffs/ffs_inode.c 1.73, commit made by Kirk). Testing program written by: Avadis Tevanian, Jr. Testing program supplied by: jkh / Apple (see Dec2001 posting to freebsd-hackers with Subject 'NFS: How to make FreeBS fall on its face in one easy step') MFC after: 1 week	2001-12-14 01:16:57 +00:00
rwatson	08704afd44	o Modify nfslockdans() to accept a thread reference instead of a proc reference: with td->td_ucred, it will be desirable to authorize based on td->td_ucred, rather than p->p_ucred. o Since the same variable 'p' was later used with pfind() on the target process for the wakeup, introduce a new local variable 'targetp' to use instead. Obtained from: TrustedBSD Project Sponsored by: DARPA, NAI Labs	2001-11-14 18:20:45 +00:00
alfred	fa9d19d5b5	Allow users to use the 'nolockd' or -L options with mount_nfs in order to avoid the need for rpc.lockd to perform client locks. Using this option a user can revert back to using local locks for NFS mounts like we did before we had rpc.lockd.	2001-11-12 02:33:52 +00:00
alfred	015f13094a	Turn vn_open() into a wrapper around vn_open_cred() which allows one to perform a vn_open using temporary/other/fake credentials. Modify the nfs client side locking code to use vn_open_cred() passing proc0's ucred instead of the old way which was to temporarily raise privs while running vn_open(). This should close the race hopefully.	2001-11-11 22:39:07 +00:00
dillon	1147eaf58a	Implement IO_NOWDRAIN and B_NOWDRAIN - prevents the buffer cache from blocking in wdrain during a write. This flag needs to be used in devices whose strategy routines turn-around and issue another high level I/O, such as when MD turns around and issues a VOP_WRITE to vnode backing store, in order to avoid deadlocking the dirty buffer draining code. Remove a vprintf() warning from MD when the backing vnode is found to be in-use. The syncer of buf_daemon could be flushing the backing vnode at the time of an MD operation so the warning is not correct. MFC after: 1 week	2001-11-05 18:48:54 +00:00
rwatson	1704b54dc9	o Note an additional potential problem here: LOCKD_MSG directly exports struct ucred to userland. In 5.0-CURRENT, it is desirable to instead export struct xucred, as ucred contains mutexes, pointers, and other kernel evil. I'll add it to my work queue.	2001-10-24 02:48:38 +00:00
rwatson	337c917faf	o Add two comments identifying problems with the current nfs_lock.c implementation, so that the information doesn't get lost. (1) /var/run/lock is looked up relative to the current thread's root directory, but it's not clear that's desirable. (2) A race condition associated with live credential modification on a shared credential is present when privilege is granted for the purposes of talking to /var/run/lock.	2001-10-23 19:11:31 +00:00
dillon	45a6fabe87	Change the vnode list under the mount point from a LIST to a TAILQ in preparation for an implementation of limiting code for kern.maxvnodes. MFC after: 3 days	2001-10-23 01:21:29 +00:00
jhb	4806d88677	Change the kernel's ucred API as follows: - crhold() returns a reference to the ucred whose refcount it bumps. - crcopy() now simply copies the credentials from one credential to another and has no return value. - a new crshared() primitive is added which returns true if a ucred's refcount is > 1 and false (0) otherwise.	2001-10-11 23:38:17 +00:00
jhb	daacd5aa55	Use crhold() instead of crdup() since we aren't modifying the cred but just need to ensure it remains immutable.	2001-10-09 16:48:57 +00:00
peter	fd12502e1d	Make this compile after last commit. It should be: "td ? td->td_proc : NULL", not "td ? td->td_proc, NULL"	2001-10-09 02:40:45 +00:00
julian	5596973a17	Don't dereference td if it's NULL. Submitted by: Alexander N. Kabaev <ak03@gte.com>	2001-10-08 23:47:44 +00:00
peter	562ebdfbed	Unwind some more macros. NFSMADV() was kinda silly since it was right next to equivalent m_len adjustments. Move the nfsm_subs.h macros into groups depending on which phase they are used in, since that affects the error recovery requirements. Collect some of the common error checking into a single macro as preparation for unwinding some more. Have nfs_rephead return a value instead of secretly modifying args. Remove some unused function arguments that were being passed around. Clarify nfsm_reply()'s error handling (I hope).	2001-09-28 04:37:08 +00:00
peter	2854bb2840	Make nfsm_dissect() have an obvious return value.	2001-09-27 22:40:38 +00:00
peter	bc122022f9	Tidy up nfsm_build usage. This is only partially finished.	2001-09-27 02:33:36 +00:00
iedowse	879c2b08b5	Add a missing dereference level. This caused nfsm_postop_attr_xx() to try and extract node attributes from an RPC reply even if none were present. Reviewed by: peter	2001-09-25 00:00:33 +00:00
peter	f6cc549f2c	Add the magic marker so that loader and kldload(2) can find this in module form automagically.	2001-09-20 04:57:34 +00:00
peter	afb77dde2c	Oops. Fix a missing indirection level. gcc didn't complain about it on x86, but did complain about it on alpha (since int and pointer are different sizes)	2001-09-20 03:45:51 +00:00
peter	09d2b9e4f7	Sigh, Last minute pre-merge typo. (missing quotes)	2001-09-18 23:49:33 +00:00
peter	85182a8d78	Cleanup and split of nfs client and server code. This builds on the top of several repo-copies.	2001-09-18 23:32:09 +00:00
imp	26847c44d7	nfs_strategy calls nfs_asyncio with td as NULL. So add a bandaid that will pass NULL as the struct proc when td is NULL. This has stopped crashing on my machine. Note: The passing of NULL may be bogus, but I'll let others fix that problem. Reviewed by: jhb	2001-09-18 18:37:52 +00:00
peter	2392b3448b	Sync some differences that were different between the copies of the files that were in nfs/nfs.h and nfsserver/nfs.h in the p4 tree.	2001-09-15 04:41:56 +00:00
julian	5596676e6c	KSE Milestone 2 Note ALL MODULES MUST BE RECOMPILED make the kernel aware that there are smaller units of scheduling than the process. (but only allow one thread per process at this time). This is functionally equivalent to the previous -current except that there is a thread associated with each process. Sorry john! (your next MFC will be a doosie!) Reviewed by: peter@freebsd.org, dillon@freebsd.org X-MFC after: ha ha ha ha	2001-09-12 08:38:13 +00:00
kris	bd6f9cb9b6	Fix some signed/unsigned integer confusion, and add bounds checking of arguments to some functions. Obtained from: NetBSD Reviewed by: peter MFC after: 2 weeks	2001-09-10 11:28:07 +00:00
dillon	6b8714e0aa	Pushdown Giant for nfs syscalls (nfssvc())	2001-08-31 22:39:36 +00:00
ache	6e290545e1	Stupid error from my side in prev. commit: || -> &&	2001-08-23 18:02:29 +00:00
ache	86b9c46400	Implement l_len<0 per POSIX check. Check for valid l_whence too.	2001-08-23 16:13:59 +00:00
ache	34f1fd94b4	Even better move: suppose that server is able to handle SEEK_END, so check arguments for all but not SEEK_END case, leaving SEEK_END handling for server	2001-08-23 14:21:26 +00:00
ache	aafa17c550	Apparently SEEK_END locking not supported by NFS. Previous variant returns EINVAL in that case, change it to EOPNOTSUPP.	2001-08-23 14:09:16 +00:00
ache	2879f02ee4	Move <machine/> after <sys/> Pointed by: bde	2001-08-23 13:27:58 +00:00
ache	e955b0b735	adv. lock: detect off_t overflow _before_ it occurs and return EOVERFLOW instead of EINVAL	2001-08-23 08:20:21 +00:00
iedowse	a39dd4a8a2	Fix a client-side memory leak in nfs_flush(). The code allocates a temporary array to store struct buf pointers if the list doesn't fit in a local array. Usually it frees the array when finished, but if it jumps to the 'again' label and the new list does fit in the local array then it can forget to free a previously malloc'd M_TEMP memory. Move the free() up a line so that it frees any previously allocated memory whether or not it needs to malloc a new array. Reviewed by: dillon	2001-08-01 10:25:13 +00:00
peter	4763bc528e	Check the filehandle size when mounting. Obtained from: Constantine Sapuntzakis <csapuntz@openbsd.org>	2001-07-30 20:01:59 +00:00
jhb	0b11844c1a	- Sort includes. - Update vmmeter statistics for vnode pagein/pageouts in getpages/putpages.	2001-07-04 20:14:59 +00:00
dillon	e028603b7e	With Alfred's permission, remove vm_mtx in favor of a fine-grained approach (this commit is just the first stage). Also add various GIANT_ macros to formalize the removal of Giant, making it easy to test in a more piecemeal fashion. These macros will allow us to test fine-grained locks to a degree before removing Giant, and also after, and to remove Giant in a piecemeal fashion via sysctl's on those subsystems which the authors believe can operate without Giant.	2001-07-04 16:20:28 +00:00
jhb	ab91beada7	- Protect the mnt_vnode list with the mntvnode lock. - Use queue(9) macros.	2001-06-28 04:10:07 +00:00
jake	d729aaf555	Unlock the process returned from pfind() if it does not return NULL. This fixes a witness lock violation for nfssvc returning with locks held. Submitted by: Jean-Luc Richier <Jean-Luc.Richier@imag.fr> PR: kern/27776	2001-06-01 01:30:51 +00:00
rwatson	f504530d9f	o Merge contents of struct pcred into struct ucred. Specifically, add the real uid, saved uid, real gid, and saved gid to ucred, as well as the pcred->pc_uidinfo, which was associated with the real uid, only rename it to cr_ruidinfo so as not to conflict with cr_uidinfo, which corresponds to the effective uid. o Remove p_cred from struct proc; add p_ucred to struct proc, replacing original macro that pointed p->p_ucred to p->p_cred->pc_ucred. o Universally update code so that it makes use of ucred instead of pcred, p->p_ucred instead of p->p_pcred, cr_ruidinfo instead of p_uidinfo, cr_{r,sv}{u,g}id instead of p_*, etc. o Remove pcred0 and its initialization from init_main.c; initialize cr_ruidinfo there. o Restructure many credential modification chunks to always crdup while we figure out locking and optimizations; generally speaking, this means moving to a structure like this: newcred = crdup(oldcred); ... p->p_ucred = newcred; crfree(oldcred); It's not race-free, but better than nothing. There are also races in sys_process.c, all inter-process authorization, fork, exec, and exit. o Remove sigio->sio_ruid since sigio->sio_ucred now contains the ruid; remove comments indicating that the old arrangement was a problem. o Restructure exec1() a little to use newcred/oldcred arrangement, and use improved uid management primitives. o Clean up exit1() so as to do less work in credential cleanup due to pcred removal. o Clean up fork1() so as to do less work in credential cleanup and allocation. o Clean up ktrcanset() to take into account changes, and move to using suser_xxx() instead of performing a direct uid==0 comparison. o Improve commenting in various kern_prot.c credential modification calls to better document current behavior. In a couple of places, current behavior is a little questionable and we need to check POSIX.1 to make sure it's "right". More commenting work still remains to be done. o Update credential management calls, such as crfree(), to take into account new ruidinfo reference. o Modify or add the following uid and gid helper routines: change_euid() change_egid() change_ruid() change_rgid() change_svuid() change_svgid() In each case, the call now acts on a credential not a process, and as such no longer requires more complicated process locking/etc. They now assume the caller will do any necessary allocation of an exclusive credential reference. Each is commented to document its reference requirements. o CANSIGIO() is simplified to require only credentials, not processes and pcreds. o Remove lots of (p_pcred==NULL) checks. o Add an XXX to authorization code in nfs_lock.c, since it's questionable, and needs to be considered carefully. o Simplify posix4 authorization code to require only credentials, not processes and pcreds. Note that this authorization, as well as CANSIGIO(), needs to be updated to use the p_cansignal() and p_cansched() centralized authorization routines, as they currently do not take into account some desirable restrictions that are handled by the centralized routines, as well as being inconsistent with other similar authorization instances. o Update libkvm to take these changes into account. Obtained from: TrustedBSD Project Reviewed by: green, bde, jhb, freebsd-arch, freebsd-audit	2001-05-25 16:59:11 +00:00
jhb	c1ce7745c1	Assert Giant is held by the caller rather than getting it and releasing it in getpages/putpages.	2001-05-23 22:26:05 +00:00
ru	35437d86aa	- FDESC, FIFO, NULL, PORTAL, PROC, UMAP and UNION file systems were repo-copied from sys/miscfs to sys/fs. - Renamed the following file systems and their modules: fdesc -> fdescfs, portal -> portalfs, union -> unionfs. - Renamed corresponding kernel options: FDESC -> FDESCFS, PORTAL -> PORTALFS, UNION -> UNIONFS. - Install header files for the above file systems. - Removed bogus -I${.CURDIR}/../../sys CFLAGS from userland Makefiles.	2001-05-23 09:42:29 +00:00
alfred	a3f0842419	Introduce a global lock for the vm subsystem (vm_mtx). vm_mtx does not recurse and is required for most low level vm operations. faults can not be taken without holding Giant. Memory subsystems can now call the base page allocators safely. Almost all atomic ops were removed as they are covered under the vm mutex. Alpha and ia64 now need to catch up to i386's trap handlers. FFS and NFS have been tested, other filesystems will need minor changes (grabbing the vm lock when twiddling page properties). Reviewed (partially) by: jake, jhb	2001-05-19 01:28:09 +00:00
iedowse	dafd513732	Change the second argument of vflush() to an integer that specifies the number of references on the filesystem root vnode to be both expected and released. Many filesystems hold an extra reference on the filesystem root vnode, which must be accounted for when determining if the filesystem is busy and then released if it isn't busy. The old `skipvp' approach required individual filesystem xxx_unmount functions to re-implement much of vflush()'s logic to deal with the root vnode. All 9 filesystems that hold an extra reference on the root vnode got the logic wrong in the case of forced unmounts, so `umount -f' would always fail if there were any extra root vnode references. Fix this issue centrally in vflush(), now that we can. This commit also fixes a vnode reference leak in devfs, which could result in idle devfs filesystems that refuse to unmount. Reviewed by: phk, bp	2001-05-16 18:04:37 +00:00
markm	bcca5847d5	Undo part of the tangle of having sys/lock.h and sys/mutex.h included in other "system" header files. Also help the deprecation of lockmgr.h by making it a sub-include of sys/lock.h and removing sys/lockmgr.h from kernel .c files. Sort sys/*.h includes where possible in affected files. OK'ed by: bde (with reservations)	2001-05-01 08:13:21 +00:00
phk	608c1caf3b	Add a vop_stdbmap(), and make it part of the default vop vector. Make 7 filesystems which don't really know about VOP_BMAP rely on the default vector, rather than more or less complete local vop_nopbmap() implementations.	2001-04-29 11:48:41 +00:00
alfred	6aad15a674	Remove incorrect comment. Submitted by: quinot@inf.enst.fr <quinot@inf.enst.fr> PR: kern/26893	2001-04-29 03:10:24 +00:00
grog	4b9d9cbaac	Revert consequences of changes to mount.h, part 2. Requested by: bde	2001-04-29 02:45:39 +00:00


975 Commits